top | item 40549435

sdfgtr | 1 year ago

> While some of this is for annotation and ratings on data that came from the web or LLMs, they also create new training data whole-hog:

The article states that this human data comes from PhDs, poets, and other experts, but my recollection from some info about programming LLM training is that there was a small army of low-paid Indian programmers feeding it with data.

Even if it's actually experts now, I have to wonder when that will switch to 3rd worlders making $1/hour.

stefan_|1 year ago

I love the marketing upstart attitude, but indeed, the reality of "PhDs, poets and subject matter experts expanding the frontiers of AI" is much more likely to be the "Amazon cashierless supermarket" experience.

The problem with hiring that group of people is presumably that they are not poor enough to lack career ambition, and every dummy can spot from miles away that feeding some LLM is an utter dead end.

XorNot|1 year ago

Isn't it just curating an encyclopaedia though? The point is that LLM training is moving from "suck down the internet" to "consume an annotated and contextualised reference of the Library of Congress".

It's the difference between trusting 5 random people to tell you how they think quantum mechanics works and asking 5 presently publishing physicists.