dogline | 13 days ago
"mistral-ocr-latest" did a really good job transcribing the handwriting, considering how tight and small some of it is. Then it was back to Claude API calls to summarize by month and collect people and places from all of the entries.
Claude then created static html pages from what started as a Flask app. Published on Dreamhost.
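For readers curious what the "static pages by month" step might look like, here is a minimal sketch using only the Python standard library. It is not the author's code: the function names (`group_by_month`, `render_month`, `write_site`), the `(date, text)` entry shape, and the page layout are all assumptions made for illustration, and the OCR and summarization calls are left out entirely.

```python
from collections import defaultdict
from html import escape
from pathlib import Path


def group_by_month(entries):
    """Group (date, text) pairs under 'YYYY-MM' keys, oldest first."""
    months = defaultdict(list)
    for day, text in sorted(entries, key=lambda e: e[0]):
        months[f"{day.year:04d}-{day.month:02d}"].append((day, text))
    return dict(months)


def render_month(month, items):
    """Return one standalone HTML page for a month of entries."""
    rows = "\n".join(
        f"<h2>{day.isoformat()}</h2>\n<p>{escape(text)}</p>"
        for day, text in items
    )
    return (
        "<!doctype html>\n"
        f'<html><head><meta charset="utf-8"><title>{month}</title></head>\n'
        f"<body>\n<h1>{month}</h1>\n{rows}\n</body></html>\n"
    )


def write_site(entries, out_dir):
    """Write one HTML file per month plus an index page linking to them."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    months = group_by_month(entries)
    for month, items in months.items():
        page = render_month(month, items)
        (out / f"{month}.html").write_text(page, encoding="utf-8")
    links = "\n".join(
        f'<li><a href="{m}.html">{m}</a></li>' for m in months
    )
    index = f"<!doctype html>\n<html><body><ul>\n{links}\n</ul></body></html>\n"
    (out / "index.html").write_text(index, encoding="utf-8")
    return sorted(months)
```

Because the output is plain files, any static host can serve it without running application code, which is presumably part of why a traffic spike is easy to absorb.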
dogline|13 days ago
I've never had one of my sites get this much traffic. With everything served as static files, the website is still holding up. Thank you all.
NoiseBert69|12 days ago
pstuart|13 days ago
zzleeper|13 days ago
I'm working on a kinda similar project (documenting bank runs from historical newspapers) and also opted for Claude to build a static website. Crazy that the two sites have a very similar look and feel (mine is at https://www.finhist.com/bank-runs/index.html). The only big difference is that mine lacks a map, which I should hopefully fix soon (I already have lat and lon and am linking to Google Maps).
PS: Do you know if Mistral works better at OCRing handwritten text than Gemini 3? I was planning on going with Gemini 3 for another project.
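As a small illustration of the "lat and lon linked to Google Maps" approach mentioned above, the sketch below builds a search link in the documented Google Maps URLs format (`/maps/search/?api=1&query=...`). The helper name `google_maps_link` and the six-decimal rounding are assumptions, not anything taken from either site.

```python
from urllib.parse import urlencode


def google_maps_link(lat, lon):
    """Build a Google Maps search URL for a latitude/longitude pair."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: {lat}, {lon}")
    # urlencode percent-encodes the comma between the two values.
    query = urlencode({"api": 1, "query": f"{lat:.6f},{lon:.6f}"})
    return f"https://www.google.com/maps/search/?{query}"
```

Calling `google_maps_link(40.0, -121.5)` (arbitrary example coordinates) yields a URL that can be dropped straight into an `<a href>` on a static page.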
dogline|12 days ago
Digitizing history in different ways, with different resources that are unique or only known to small groups, might be a new development area, and that's exciting. As I've shown, and as other people have shared, using AI tools to digitize things that haven't been digitized before is now possible. Are there ways to make this easier for everybody? New techniques to discuss? I don't know, and I'd love to talk about it.
Concerning OCR: I used Mistral because of a posting here a month or so ago describing advancements in handwriting recognition. I didn't actually compare them. And I've got my setup arranged so that I can rerun everything later if there are advancements in the area. Again, another area to keep track of and discuss.
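One way a rerunnable setup like this could be arranged (purely a sketch, not a description of the author's pipeline) is to cache each transcription under a key derived from both the page image and the model name. Rerunning with the same model then costs nothing, while switching to a newer model reprocesses every page. The `transcribe` argument is a hypothetical callable standing in for whatever OCR client is actually used.

```python
import hashlib
import json
from pathlib import Path


def cached_transcribe(image_bytes, model, transcribe, cache_dir):
    """Return OCR text for a page, reusing a cached result when one exists.

    `transcribe(image_bytes, model)` is a placeholder for the real OCR call.
    """
    # The key covers both model name and image, so a new model misses the cache.
    digest = hashlib.sha256(model.encode("utf-8") + b"\0" + image_bytes)
    path = Path(cache_dir) / f"{digest.hexdigest()}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))["text"]
    text = transcribe(image_bytes, model)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"model": model, "text": text}), encoding="utf-8")
    return text
```

Keeping the model name in the stored record also makes it easy to compare outputs from two OCR models side by side later.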
beej71|13 days ago
dogline|13 days ago
These diary pages come largely from Stirling City, just north of Chico, and later from the Hat Creek district, on Hwy 89 north of Mt. Lassen. Nearby, many historical records were lost in the Paradise Camp Fire, and this project is a test run for digitizing some of the records held in local museums.