vikp | 1 year ago
I recently integrated Gemini to improve accuracy on certain blocks, like tables: Marker gets the initial text, then passes it to Gemini to refine. Marker alone works about as well as Gemini alone, but together they benchmark much better.
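A rough sketch of that two-stage flow, for illustration only. This is not Marker's actual code: `Block`, `refine_blocks`, and the `refine` callable are hypothetical names, and the LLM call is left as a plug-in function so that no particular Gemini API is assumed.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass
class Block:
    kind: str  # hypothetical block label, e.g. "text" or "table"
    text: str  # initial text from the first-pass extractor


def refine_blocks(
    blocks: List[Block],
    refine: Callable[[str], str],
    kinds: FrozenSet[str] = frozenset({"table"}),
) -> List[Block]:
    """Send only the selected block kinds through the refiner.

    `refine` stands in for the LLM call: it takes the initial text
    and returns the corrected text. All other blocks pass through
    unchanged.
    """
    refined = []
    for block in blocks:
        if block.kind in kinds:
            refined.append(Block(block.kind, refine(block.text)))
        else:
            refined.append(block)
    return refined
```

Only the hard blocks go to the model, so the cost scales with the number of tables rather than with the size of the document.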
llm_trw|1 year ago
The rendering step for reports that humans got to see was a call to pandoc after the SXML was rendered to Markdown - look ma, we support PowerPoint! - but it also allowed us to easily convert, on the fly, to whatever insane markup a given large (or small) language model worked best with.
[0] https://en.wikipedia.org/wiki/SXML
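A minimal sketch of what a pandoc rendering step like that could look like, assuming pandoc is installed and the Markdown has already been written to disk. The function names and file paths are made up for illustration; the original pipeline is not shown in this thread.

```python
import shutil
import subprocess
from typing import List


def pandoc_command(src: str, dst: str, to_format: str) -> List[str]:
    """Build the pandoc invocation for a Markdown source file."""
    return [
        "pandoc",
        "--from", "markdown",
        "--to", to_format,  # e.g. "pptx", "docx", "html"
        "--output", dst,
        src,
    ]


def convert(src: str, dst: str, to_format: str) -> None:
    """Run pandoc, failing loudly if it is missing or exits non-zero."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(pandoc_command(src, dst, to_format), check=True)
```

For example, `convert("report.md", "report.pptx", "pptx")` would produce slides, and swapping the last argument targets a different markup.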
cma|1 year ago
hackernewds|1 year ago
alemos|1 year ago
vikp|1 year ago
Marker output will be higher quality than docling output across most doc types, especially with the --use_llm flag. A few specific things we do differently: