Show HN: Scanned 1927-1945 Daily USFS Work Diary
121 points| dogline | 13 days ago |forestrydiary.com
This is one of those projects I've sat on for years, but Claude and Mistral helped with the handwriting recognition, and even helped me write a custom scanning app that would auto-scan each page and put it into a database as I assembled everything.
As far as I know, this is the only US Forestry Diary that has been fully scanned in and published. I understand that there are other diaries in some collections, but none have been scanned in. I hope this helps somebody. Please let me know if it does.
This is the sort of project Claude and AI can help with - a personal project that would otherwise sit on the shelf forever, but is now reasonable to publish in my spare time. I'm not trying to earn money on this, just to improve our knowledge of history a little bit.
dogline|13 days ago
"mistral-ocr-latest" did really good handwriting transcription, considering how tight and small some of the handwriting is. Then back to Claude API calls to summarize by month and collect people and places from all of the entires.
Claude then created static html pages from what started as a Flask app. Published on Dreamhost.
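A minimal sketch of how the "summarize by month" batching might look before any API call is made, assuming each transcribed entry is already stored as an ISO date string plus its text. The function names and the prompt wording are hypothetical, and the model call itself is deliberately left out.

```python
from collections import defaultdict


def group_by_month(entries):
    """Group (iso_date, text) pairs into {'YYYY-MM': [text, ...]} in date order."""
    months = defaultdict(list)
    for iso_date, text in sorted(entries):
        # The first seven characters of an ISO date are 'YYYY-MM'.
        months[iso_date[:7]].append(text)
    return dict(months)


def build_prompt(month, texts):
    """Assemble one summarization prompt per month (wording is hypothetical)."""
    body = "\n\n".join(texts)
    return (
        f"Summarize these diary entries for {month} "
        f"and list the people and places mentioned:\n\n{body}"
    )
```

Each month's prompt could then be sent as a single request, so one bad transcription only affects that month's summary.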
dogline|13 days ago
I've never had one of my sites with this much traffic. With everything as static files, website is still holding. Thank you all.
zzleeper|13 days ago
I'm working on a kinda similar project (documenting bank runs from historical newspapers) and also opted for Claude to build a static website. Crazy that the two sites have a very similar look and feel: https://www.finhist.com/bank-runs/index.html . The only big difference is that mine lacks a map, which I should hopefully fix soon (I already have lat and lon and am linking to google maps).
PS: Do you know if Mistral works better at OCRing handwritten text than Gemini 3? I was planning on going the Gemini 3 route for another project.
beej71|13 days ago
jlpk|13 days ago
kmoser|13 days ago
anonymous908213|13 days ago
- I think it would be a very large improvement if the actual diary pages/transcriptions were more accessible. I found the LLM summaries completely uncompelling, and did not particularly appreciate having to scroll through 5+ pages of LLM summary to get to the part where I could actually read the diary entries for a given month.
- The dates of the diary entries for many months are broken. For example, in the final month, all of the entries are labelled 1945-03-19. From a cursory examination, I believe the dating broke on 24th July 1941 and stayed broken for every month from there to the end.
- The page for Nov 1941 seems entirely broken. For some reason, the dates labelling the pages are described in a different format that included the name of the month rather than a numeric representation, the pages are out of order, and then all manner of months are mixed in. The first pages are "November 1941", "April 1941", "October 2 1941", "October 3 1941", "November 4 1941", "November 12 1941", "November 7 1941" ... and so on. The LLM summary notes an "Event", a construction project that took place from 1931 to 1934, despite this being the entry for Nov 1941.
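A rough sketch of a check that could surface the date problems listed above, assuming page labels come in either the numeric form ("1945-03-19") or the month-name form ("October 2 1941"). The two formats and the function names are assumptions for illustration; the site's real data may use others.

```python
from datetime import date, datetime
from typing import List, Optional

# Label formats assumed for illustration only.
_FORMATS = ("%Y-%m-%d", "%B %d %Y")


def parse_label(label: str) -> Optional[date]:
    """Return a date for a page label, or None if no full date can be recovered."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(label.strip(), fmt).date()
        except ValueError:
            continue
    return None


def out_of_place(labels: List[str], year: int, month: int) -> List[str]:
    """Return labels that do not parse or that fall outside the expected month."""
    flagged = []
    for label in labels:
        parsed = parse_label(label)
        if parsed is None or (parsed.year, parsed.month) != (year, month):
            flagged.append(label)
    return flagged
```

Running `out_of_place` over each month's page labels would list the pages to re-check by hand. A label such as "November 1941" is flagged because it carries no day.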
anonymous908213|12 days ago
Low effort, minimal change suggestion: a link or table of contents header at the top of each month's page to jump to the diary entries.
Higher effort, bigger change suggestion: I think it would make for a significantly better reading experience if all of the diary pages and their transcriptions for a month were listed sequentially, such that you could seamlessly read them without clicking previous/next page.
I think it's a bit of a waste to have put so much effort into preserving this, but the actual ability to read it is de-prioritised relative to the ability to read an LLM summary.
toomuchtodo|13 days ago
https://help.archive.org/help/uploading-a-basic-guide/
https://help.archive.org/help/managing-and-editing-your-item...
Trail Crew Stories and Mountain Gazette might also be interested in this.
https://www.trailcrewstories.com/
https://mountaingazette.com/
dogline|13 days ago
ricksunny|13 days ago
macintux|13 days ago
There are so many interesting stories out there, from the attorney general who summed up the evidence presented at a trial as “the victim lynched himself and his fellow thieves in jail” to the couple with 6 male children who named each successive pair of boys using the same initials (e.g. Carl Ervin, Carwin Earl; Truman William, Tresman Walter; Llewellyn Purcell, Lealyn Percy).
The resources that are already available are amazing (one woman, Violet Toph, assembled thousands of pages of memories and genealogy records for her county in the mid-20th century) but obviously very incomplete. Your idea would be a terrific way to help fill in some of those gaps and encourage people to keep their own memoirs somewhere outside Facebook.
reaperducer|13 days ago
dogline|13 days ago
"Fix up my packs. Load the 2 mules with 225# each. Take the 2 loads to trail camp at Lake Everett, Unload. Have lunch with the Trail cook. Haze mules & ride to 7 1/2 PM."
Horses are mentioned 2586 times. That'd be a whole study on how they're used in the back country. (Edit: horse number is inflated since part of the diary form at one point asks for "Horse Mileage". Will have to refine search).
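One way the refined search might look, as a sketch: count "horse" or "horses" as whole words but skip the printed "Horse Mileage" form field. It assumes the transcriptions are available as plain text, and the function name is made up for the example.

```python
import re

# Match "horse"/"horses" as whole words, unless followed by "Mileage"
# (the form field that inflates the raw count).
HORSE = re.compile(r"\bhorses?\b(?!\s+mileage)", re.IGNORECASE)


def count_horse_mentions(text: str) -> int:
    """Count horse mentions, ignoring the 'Horse Mileage' form label."""
    return len(HORSE.findall(text))
```

This only filters the one known form label; other printed fields would need their own exclusions.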
khuey|12 days ago
canada_dry|13 days ago
It inspires me to tackle a project I've been holding off on for many years: OCR my grandmother's/great-grandmother's cookbook. It's about 100 pages of collected and annotated recipes from the 1930s-1980s.
OCR and AI have become sufficiently capable (as you've demonstrated) to properly scan, index, and classify the recipes into something I can share with relatives online or as an ebook.
jaffa2|12 days ago
whattheheckheck|12 days ago
unit149|13 days ago
[deleted]