I could see how that would be helpful, but at least for my use case I'm more interested in seeing how LLMs integrated with computer vision can speed up transcription. Since a thorough proofread by a human is already baked into the SE production process (and is indeed one of its major selling points), having more automated tools to aid proofreading is nice but doesn't do anything fundamentally different, from my point of view. Whereas if LLMs can be leveraged for transcription, SE producers would no longer need to depend on external projects like Project Gutenberg or Wikisource to produce texts (which can take months), or to transcribe texts from OCR results by hand (very tedious and error-prone; believe me, I'm speaking from experience!). That would drastically open up the range of books someone could reasonably produce for SE in a timely fashion.