httpteapot | 1 month ago

What do you think of the DeepSeek OCR approach where they say that vision tokens might better compress a document than its pure text representation?

https://news.ycombinator.com/item?id=45640594

I've spent some time feeding LLMs scraped web pages, and I've found that retaining some style information (text size, visibility, decoration image content) is non-trivial.

fbouvier|1 month ago

Keeping some kind of style information is definitely important to understand the semantics of the webpage.