

crashabr | 10 days ago

Would that book be useful as a reference to introduce data journalism students to AI? I'm less interested in the basics of using the API or Claude Code etc. than in best practices for workflows dealing with unstructured data, entity extraction, and automated pipelines (with evals). Although I do have some decent workflows around this, I'd be interested in reading from someone who lives and breathes this kind of work. Pure data analysis, to me, is also an area where I haven't found a good bridge between the current "generate a Python script for me that I'll double check" paradigm and the spreadsheet-centric world of most data journalists.


apwheele | 10 days ago

The book is likely a good fit for this type of work. The chapter on structured outputs shows how to extract data from text, walking through prompt engineering and k-shot examples to generate JSON, then Pydantic, then batch processing with the different providers.
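
To make the k-shot-to-JSON step concrete, here is a rough sketch (not taken from the book). It assumes a made-up extraction task, uses a standard-library dataclass as a stand-in for a Pydantic model so it runs without extra installs, and leaves out the actual provider call. The names `K_SHOT`, `Extraction`, `build_prompt`, and `parse_response` are all hypothetical.

```python
import json
from dataclasses import dataclass

# Hypothetical k-shot examples: (input text, expected JSON output).
K_SHOT = [
    ("Mayor Jane Doe announced a $2 million grant on Tuesday.",
     {"person": "Jane Doe", "role": "Mayor", "amount_usd": 2000000}),
    ("Sheriff Bob Lee requested $150,000 for new vehicles.",
     {"person": "Bob Lee", "role": "Sheriff", "amount_usd": 150000}),
]


@dataclass
class Extraction:
    """Target schema; a Pydantic model would play this role in practice."""
    person: str
    role: str
    amount_usd: int


def build_prompt(text: str) -> str:
    """Assemble an instruction, the k-shot examples, and the new text."""
    parts = ["Extract person, role, and amount_usd as JSON."]
    for example_text, example_json in K_SHOT:
        parts.append(f"Text: {example_text}\nJSON: {json.dumps(example_json)}")
    parts.append(f"Text: {text}\nJSON:")
    return "\n\n".join(parts)


def parse_response(raw: str) -> Extraction:
    """Parse a model's raw JSON reply and check it against the schema."""
    data = json.loads(raw)        # raises json.JSONDecodeError on bad JSON
    record = Extraction(**data)   # raises TypeError on missing or extra keys
    if not isinstance(record.person, str) or not isinstance(record.role, str):
        raise TypeError("person and role must be strings")
    if isinstance(record.amount_usd, bool) or not isinstance(record.amount_usd, int):
        raise TypeError("amount_usd must be an integer")
    return record
```

The string from `build_prompt` would go to whichever provider you use, and the reply would go through `parse_response`, so malformed output fails loudly instead of slipping into the dataset.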

It also shows how to set up evals in different parts of the book. (Depending on what you want to do: the structured outputs chapter has evals comparing models/prompt changes to ground truth, and the agent chapter has evals using LLM as a judge.)
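
For the ground-truth style of eval, a minimal sketch (again not from the book) could score each extracted field against hand-labeled records and compare two prompt variants. The data below is invented for illustration, and `field_accuracy`, `TRUTH`, `PROMPT_A`, and `PROMPT_B` are hypothetical names.

```python
def field_accuracy(predictions: list[dict], truth: list[dict]) -> dict[str, float]:
    """Share of records where each ground-truth field was extracted exactly."""
    if len(predictions) != len(truth):
        raise ValueError("predictions and truth must align row by row")
    if not truth:
        return {}
    scores = {}
    for field in truth[0]:
        # A missing field in a prediction counts as a miss.
        hits = sum(p.get(field) == t[field] for p, t in zip(predictions, truth))
        scores[field] = hits / len(truth)
    return scores


# Hand-labeled records (invented for this example).
TRUTH = [
    {"person": "Jane Doe", "amount_usd": 2000000},
    {"person": "Bob Lee", "amount_usd": 150000},
]
# Outputs from two hypothetical prompt variants on the same texts.
PROMPT_A = [
    {"person": "Jane Doe", "amount_usd": 2000000},
    {"person": "Bob Lee", "amount_usd": 150},
]
PROMPT_B = [
    {"person": "Jane Doe", "amount_usd": 2000000},
    {"person": "Bob Lee", "amount_usd": 150000},
]

print("prompt A:", field_accuracy(PROMPT_A, TRUTH))
print("prompt B:", field_accuracy(PROMPT_B, TRUTH))
```

Per-field scores make it easier to see where a prompt or model change helps or hurts, which is harder to tell from a single overall number.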