jhoetter | 3 years ago

My guess (some of this we already have, some we don't):

- automation: integration of heuristics (multiple columns that you can program via formulas and such)
- exploration: finding outliers or the most similar records given some reference (e.g. "I want to label more rows that are about business news to some extent")
- monitoring
- label management [which we don't yet offer to the extent we'd like]: merging and splitting labels etc.

Generally, anything that scales and "somewhat" guarantees that users enter valid labels.
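To make the heuristics and valid-label points concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the label set, the keyword rules, and the function names are hypothetical and do not come from any particular tool.

```python
# Hypothetical label set and keyword rules, for illustration only.
ALLOWED_LABELS = {"business", "sports", "wine"}

KEYWORD_RULES = {
    "business": ("earnings", "merger", "stock"),
    "sports": ("match", "league", "goal"),
}


def heuristic_label(text):
    """Return the first label whose keywords appear in the text, else None."""
    lowered = text.lower()
    for label, keywords in KEYWORD_RULES.items():
        # Plain substring matching: crude, but similar in spirit to a formula column.
        if any(keyword in lowered for keyword in keywords):
            return label
    return None


def validate_label(label):
    """Reject anything outside the allowed set, much like a dropdown would."""
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return label
```

A heuristic like this only proposes labels; the validation step is what keeps typos such as "wnie" out of the dataset.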

But it definitely offers something that new tools don't: users are super familiar with it.

localhost | 3 years ago

Do NLP users use Excel naturally already?

jhoetter | 3 years ago

Annotation platforms use Excel. I once received 25 separate Excel files from a labeling service for 10k texts (short texts about product titles, e.g. "Sauvignon blanc" -> "wine"). Had to merge them, which wasn't as easy as you might expect.
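A rough Python sketch of why the merge is fiddly, assuming each file has already been read into a list of (text, label) rows. The function name and the normalization rules are hypothetical, and reading the spreadsheets themselves (e.g. with pandas) is left out.

```python
def merge_batches(batches):
    """Merge (text, label) rows from several files and collect conflicting labels."""
    merged = {}      # text -> first label seen for that text
    conflicts = {}   # text -> set of all differing labels seen
    for batch in batches:
        for text, label in batch:
            key = text.strip()             # stray whitespace differs between files
            clean = label.strip().lower()  # "Wine " and "wine" should count as equal
            previous = merged.setdefault(key, clean)
            if previous != clean:
                conflicts.setdefault(key, {previous}).add(clean)
    return merged, conflicts
```

The first label seen wins in `merged`; anything that disagrees ends up in `conflicts` for manual review rather than being silently overwritten.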

Also, I once labeled 5,000 texts during my master's degree via Excel. Was painful as hell.