gibsonf1 | 8 months ago

Does it hallucinate with the LLM being used?

michaelt|8 months ago

Sometimes. I just fed the Hugging Face demo an image containing some rather improbable details [1] and it OCRed "Page 1000000000000" with one extra trailing zero.

Honestly I was expecting the opposite - a repetition penalty kicking in after too many repeated zeros, resulting in too few zeros - but apparently not. So you might want to steer clear of this model if your document has a trillion pages.

Other than that, it did a solid job - I've certainly seen worse attempts to OCR a table.

[1] https://imgur.com/a/8rJeHf8
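
For anyone unfamiliar with the repetition penalty mentioned above, here is a minimal sketch of one common form: logits of already-emitted tokens are divided by the penalty if positive, multiplied if negative. The three-token vocabulary, the logit values, and the function name are made up for illustration, and nothing here says whether this model or the demo actually uses such a penalty.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Return a copy of logits with already-generated tokens made less likely."""
    adjusted = list(logits)
    # Each previously seen token is penalized once, however often it appeared.
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Dividing a positive logit or multiplying a negative one both lower it.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


# Hypothetical vocabulary: 0 = "0", 1 = "1", 2 = end of number.
logits = [2.0, 0.5, 1.8]
history = [1, 0, 0, 0]  # "1000" emitted so far
adjusted = apply_repetition_penalty(logits, history, penalty=1.2)
```

With these made-up numbers, "0" is the top choice before the penalty and "end of number" is the top choice after it, which is the "too few zeros" failure rather than the extra zero seen here.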

nattaylor|8 months ago

The base model is Qwen2.5-VL-3B, and the announcement lists as a limitation: "Model can suffer from hallucination".

gibsonf1|8 months ago

Seems a bit scary that the "source" text from the PDFs could actually be hallucinated.