semitones | 7 months ago
In this situation there very often won't be _any_ answer; plenty of difficult questions go unanswered on the internet. Yet the model probably does not interpret the scenario that way.
philipswood | 7 months ago
Run a series of pretraining sessions on data from which specific information is absent, and also train on question/answer pairs that answer "I don't know" for that missing information.
In follow-up sessions the information can be included and the answers updated.
Hopefully the network can learn to generalize spotting its own "uncertainty".
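A minimal sketch of the two-phase curriculum described above. The facts, questions, and helper names are all invented for illustration; a real setup would draw these from an actual corpus.

```python
# Hypothetical sketch of the two-phase curriculum: in phase 1 the facts
# are withheld and the trained answer is "I don't know"; in phase 2 the
# facts are included and the answers updated. All names are illustrative.
FACTS = {
    "capital_of_x": "Foo City",
    "boiling_point_y": "451 K",
}
QUESTIONS = {
    "capital_of_x": "What is the capital of X?",
    "boiling_point_y": "What is the boiling point of Y?",
}

def build_phase(known_keys):
    """Emit (question, answer) pairs; withheld facts map to 'I don't know'."""
    return [
        (QUESTIONS[k], FACTS[k] if k in known_keys else "I don't know")
        for k in QUESTIONS
    ]

phase1 = build_phase(known_keys=set())      # information absent
phase2 = build_phase(known_keys=set(FACTS)) # follow-up: answers updated
```

The hope, as stated above, is that pairing the same questions with both targets across phases teaches the model a general "answer only when the fact was in the data" behavior rather than memorizing either answer.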
tdido | 7 months ago
https://m.youtube.com/watch?v=7xTGNNLPyMI&t=5400s
taneq | 7 months ago
I’d try adding an output (or some special tokens or whatever) and then training it to track the training loss for the current sample. Hopefully during inference this output would indicate how out-of-distribution the current inputs are.
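A toy version of this idea, assuming a 1-D linear "model" trained with plain SGD; everything here is an illustrative stand-in for a real network, where the auxiliary head would be a learned function of the hidden state:

```python
import random

random.seed(0)

# Alongside the main output, train an auxiliary head to predict the
# model's own per-sample training loss (a sketch of the idea above).
w_main, b_main = 0.0, 0.0   # main head: predicts y from x
w_aux, b_aux = 0.0, 0.0     # auxiliary head: predicts the squared error
lr = 0.05

# In-distribution data: y = 2x + small noise, x in [0, 1]
data = []
for _ in range(2000):
    x = random.random()
    data.append((x, 2.0 * x + random.gauss(0.0, 0.05)))

for _ in range(50):
    for x, y in data:
        # main head update (squared-error loss)
        err = (w_main * x + b_main) - y
        loss = err * err
        w_main -= lr * 2.0 * err * x
        b_main -= lr * 2.0 * err
        # auxiliary head regresses the loss just observed on this sample
        aux_err = (w_aux * x + b_aux) - loss
        w_aux -= lr * 2.0 * aux_err * x
        b_aux -= lr * 2.0 * aux_err

def expected_loss(x):
    """The auxiliary head's estimate of the loss the model would incur."""
    return w_aux * x + b_aux
```

After training, the main head has learned the mapping and the auxiliary head predicts a small loss for in-distribution inputs. This linear toy only demonstrates the training mechanics; whether a high predicted loss at inference reliably flags out-of-distribution inputs in a real network is exactly the open question in the comment above.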
devmor | 7 months ago
I would think focusing on the “homonym problem” could be a good place to start.
delusional | 7 months ago
You could decide that the text is "too unlikely"; the problem there is that you'll quickly discover that most human sentences are actually pretty unlikely.
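A toy illustration of that pitfall, using a tiny unigram model as a stand-in for a real language model; the corpus and sentences are invented for the example:

```python
import math
from collections import Counter

# Tiny unigram "language model": estimate word probabilities from a toy
# corpus, then score sentences by total log-probability.
corpus = ("the cat sat on the mat the dog sat on the rug "
          "the cat saw the dog").split()
counts = Counter(corpus)
total = sum(counts.values())
vocab = len(counts)

def word_prob(w):
    # add-one smoothing so unseen words still get nonzero probability
    return (counts[w] + 1) / (total + vocab)

def sentence_logprob(sentence):
    return sum(math.log(word_prob(w)) for w in sentence.split())

# Even a sentence built entirely from common words scores low, and a
# perfectly ordinary novel sentence scores far lower still.
common = sentence_logprob("the cat sat on the mat")
novel = sentence_logprob("the quantum dog debated epistemology")
```

Any "too unlikely" cutoff loose enough to accept the second sentence accepts almost everything, which is the problem the comment above points out.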
littlestymaar | 7 months ago
"I don't know" must be derived from the model's knowledge as a whole, not from individual question/answer pairs in training.