(no title)
bartread | 18 days ago
...
> Option 2: Self-host with vLLM / SGLang
So, first off, this looks really cool and, given I'm looking for OCR at the moment, I'm pretty interested in this and other OCR models.
With that said, the README implies that option 2 requires a GPU. That's fine, but it would be incredibly helpful if the README were explicit about the requirements, especially how much GPU memory the model needs.
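For context, I'd guess the vLLM path boils down to something like the sketch below. This is untested and the model name, prompt, and memory settings are all placeholders on my part; I'm only including it to show where the GPU memory knobs live, which is the kind of thing I'd like the README to spell out.

```python
from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(
    model="some-org/some-ocr-model",   # placeholder: whatever the README actually names
    gpu_memory_utilization=0.85,       # cap the fraction of VRAM vLLM pre-allocates
    max_model_len=8192,                # a shorter context keeps the KV cache small
)

params = SamplingParams(temperature=0.0, max_tokens=2048)

# Vision/OCR models in vLLM take the image via multi_modal_data; the prompt
# template is model-specific, so treat this prompt as illustrative only.
outputs = llm.generate(
    {
        "prompt": "Extract all text from this page.",
        "multi_modal_data": {"image": Image.open("page.png")},
    },
    params,
)
print(outputs[0].outputs[0].text)
```

The `gpu_memory_utilization` and `max_model_len` arguments are the main levers for squeezing onto a smaller card, but without a stated baseline in the README you're left guessing at the starting point.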
EDIT: Looking at the links under option 3, the docs for macOS setup suggest 8GB of unified memory is enough to run the model, which is pretty modest, so I'd imagine option 2 is similar. Ollama also offers a CPU-only option (no idea how that will perform; not amazingly, I'm guessing). That suggests to me that if your volume requirements are low, and you can't shell out for or source a beefy enough GPU and don't want to pay the sometimes exorbitant hire costs, you should be able to punt the workload onto a machine with enough memory to run the model without too much difficulty.
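If the Ollama/CPU route does pan out, I'd expect the calling code on that low-volume box to be roughly the following. Again, this is a guess on my part: the model tag is a placeholder, and the request is just Ollama's standard `/api/generate` endpoint with a base64-encoded image.

```python
import base64
import json
import urllib.request

# Read the page image and base64-encode it, since that's how Ollama expects images.
with open("page.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

payload = {
    "model": "some-ocr-model",   # placeholder tag; use whatever `ollama pull` fetched
    "prompt": "Extract all text from this page.",
    "images": [image_b64],
    "stream": False,             # return one JSON object instead of a token stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```

Throughput will obviously be the question mark on CPU, but for occasional OCR jobs that's probably tolerable.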