(no title)
ted537 | 9 months ago
If human response is "That's BS", "fuck off", or something similar, mark as bad assistant message.
If human response is "huh" or "cool", mark as good assistant message.
If on ChatGPT, watch how much scrolling the user does. If there's a lot, it's somewhat likely that the LLM output something useful.
That strategy would have holes, of course, but as long as it's better than guessing, something like that would be a useful heuristic.
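A minimal sketch of the rules above, in Python. The function name, the `scroll_px` parameter, the 2000-pixel threshold, and the label strings are all made up for illustration; only the example phrases come from the comment itself.

```python
from typing import Optional

# Phrases taken from the comment; a real list would be much longer.
NEGATIVE_PHRASES = ("that's bs", "fuck off")
POSITIVE_REPLIES = {"huh", "cool"}


def label_assistant_message(
    reply: str,
    scroll_px: int = 0,
    scroll_threshold: int = 2000,  # hypothetical cutoff for "a lot" of scrolling
) -> Optional[str]:
    """Guess a label for the previous assistant message from the user's reaction.

    Returns "bad", "good", "likely_good", or None when no rule fires.
    """
    # Normalize case, curly apostrophes, and trailing punctuation.
    text = reply.strip().lower().replace("\u2019", "'").rstrip(".!?")

    if any(phrase in text for phrase in NEGATIVE_PHRASES):
        return "bad"
    if text in POSITIVE_REPLIES:
        return "good"
    # Scrolling is only a weak hint, so it gets a softer label.
    if scroll_px >= scroll_threshold:
        return "likely_good"
    return None
```

Anything that matches no rule stays unlabeled (`None`) rather than being forced into a bucket, which keeps the holes in the heuristic from turning into wrong labels.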
londons_explore|9 months ago
Even very weak human signals can be immensely valuable over large enough datasets.
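A rough way to see this, assuming you are comparing two model variants by how often users react positively. The rates (51% vs 50%) and sample sizes are invented for illustration, and the formula is the standard two-proportion z-score with equal sample sizes.

```python
import math


def z_score(rate_a: float, rate_b: float, n: int) -> float:
    """z-score for the difference between two observed rates, n samples each."""
    standard_error = math.sqrt(
        rate_a * (1 - rate_a) / n + rate_b * (1 - rate_b) / n
    )
    return (rate_a - rate_b) / standard_error


# Hypothetical: variant A gets a positive reaction 51% of the time, B 50%.
for n in (1_000, 1_000_000):
    print(n, round(z_score(0.51, 0.50, n), 2))
```

With these assumed rates, the one-point gap is lost in the noise at a thousand replies per variant and becomes unmistakable at a million, since the standard error shrinks with the square root of the sample size.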
DeepYogurt|9 months ago
Marking is not a trivial task though. Use some AI system to mark it and you get maybe a 99.something% filter, but whatever the remainder is leaks through. Over time your filter may get worse as a result.
ehecatl42|9 months ago
Grok is the only one that swore back at me. I kinda liked that. The others are way too polite. "Artificial Intelligence? Artificial Canadians, more like," my uni-going kid joked.