stefanwebb's comments

stefanwebb | 4 months ago | on: Is more training data always better?

First couple of paragraphs:

"There are many things one needs to live a rich and fulfilled life (according to AI researchers). A good initialization [Mishkin and Matas, 2015], attention-based neural networks [Vaswani et al., 2017], and a good title for your research paper [Myself, just now], to name a few.

In this post, we discuss another piece of eternal wisdom from AI researchers: “less is more.” Specifically, how foundation models can be fine-tuned for new capabilities with small data, in many cases less than one-thousand samples, and often outperform the same model fine-tuned on larger datasets. Meditate on that for a moment (suggested pose in figure above)."

stefanwebb | 4 months ago | on: Small Fine-Tuned Models Are All You Need

Seems topical given some recent front-page HN articles on fine-tuning. I discuss a large-scale empirical study from 2024 on fine-tuning 7B models to outperform GPT-4 and GPT-3.5-Turbo, as well as arguments for why fine-tuning is coming back into favor.

stefanwebb | 5 months ago | on: Custom AI models in hours not months with auto Data Synth and LLM-as-a-Judge

Hello Fellow Hackers, I wanted to share what my team is building. We released our open-source library for foundation model development in February and we're about to release our first Enterprise offering.

In brief, we've developed an easy-to-use platform for fine-tuning custom models. We automate data synthesis for judging and training, as well as automating the judge prompt itself. The end result is that model development times and costs are drastically cut!
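To make the LLM-as-a-judge idea concrete, here is a minimal sketch; the judge prompt, `judge` function, and the toy stand-in for a real LLM endpoint are all illustrative assumptions, not the platform's actual implementation.

```python
# Hypothetical LLM-as-a-judge sketch. JUDGE_PROMPT, judge(), and toy_llm()
# are illustrative only; a real system would call a hosted model here.
JUDGE_PROMPT = """Rate the answer from 1 to 5 for correctness and clarity.
Question: {question}
Answer: {answer}
Reply with a single integer."""

def judge(question, answer, call_llm):
    # Fill the judge prompt, ask the model, and parse its integer score.
    reply = call_llm(JUDGE_PROMPT.format(question=question, answer=answer))
    return int(reply.strip())

# Toy model standing in for a real LLM endpoint, so the sketch runs offline.
def toy_llm(prompt):
    return "4" if "Answer:" in prompt else "1"

score = judge("What is 2+2?", "4, because 2+2=4.", toy_llm)
print(score)  # 4
```

In practice the judge prompt itself would also be generated and refined automatically, which is the part that cuts the manual iteration time.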

Check out our Substack article above if you're interested in learning more or signing up for early access :)

stefanwebb | 5 months ago | on: Voronoi map generation in Civilization VII

This is a really powerful technique in general because it gives us some control over traditional PCG techniques! All you need is the right prompt and an evaluation metric - it could definitely apply to Voronoi maps.
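The metric-guided PCG loop can be sketched in a few lines; everything below (the discrete Voronoi approximation, the "evenness" metric, and best-of-N sampling) is my own illustrative choice, not anything from the game.

```python
# Hypothetical sketch: steering Voronoi-based map generation with an
# evaluation metric. The metric and sampling scheme are illustrative.
import numpy as np

def cell_sizes(seeds, grid=64):
    # Assign every point on a grid to its nearest seed: a discrete Voronoi
    # diagram. Returns the number of grid cells owned by each seed.
    xs, ys = np.meshgrid(np.linspace(0, 1, grid), np.linspace(0, 1, grid))
    pts = np.stack([xs.ravel(), ys.ravel()], axis=1)
    dists = np.linalg.norm(pts[:, None, :] - seeds[None, :, :], axis=2)
    owner = dists.argmin(axis=1)
    return np.bincount(owner, minlength=len(seeds))

def evenness(seeds):
    # Metric: negative variance of region sizes (higher = more uniform map).
    return -cell_sizes(seeds).var()

def best_of_n(n_candidates=20, n_seeds=12, seed=0):
    # Generate candidate seed layouts and keep the one the metric prefers.
    rng = np.random.default_rng(seed)
    candidates = [rng.random((n_seeds, 2)) for _ in range(n_candidates)]
    return max(candidates, key=evenness)

best = best_of_n()
```

Swap the random candidate generator for an LLM prompted to propose seed layouts and you get the controllability described above: the prompt shapes the proposals, the metric picks the winner.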

stefanwebb | 1 year ago | on: DeepSearcher: A local open-source Deep Research

There are quite a few differences between HuggingFace's Open Deep-Research and Zilliz's DeepSearcher.

I think the biggest one is the goal: HF's is to replicate the performance of Deep Research on the GAIA benchmark, whereas ours is to teach agentic concepts and show how to build research agents with open-source tools.

Also, we go into the design in a lot more detail than HF's blog post does. On the design side, HF uses code writing and execution as a tool, whereas we use prompt writing and calling as a tool. We do an explicit breakdown of the query into sub-queries, sub-sub-queries, and so on, whereas HF uses a chain of reasoning to decide what to do next.

I think ours is a better approach for producing a detailed report on an open-ended question, whereas HF's is better for answering a specific, challenging question in short form.
