
Non sucking, easy tool to convert any website to LLM ready data, Mojo

2 points| malvads | 28 days ago |github.com

3 comments


malvads|28 days ago

After running into only paid tools or overly complicated setups for turning web pages into structured data for LLMs, I got tired of it. I wanted a free, open-source way to convert websites to Markdown (for NotebookLM or any RAG-like solution), so I built Mojo.

Mojo is extremely fast, supports proxy rotation, and is MIT licensed -> https://github.com/malvads/mojo

firefoxd|28 days ago

It should start by looking at robots.txt.
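
For anyone wondering what such a check could look like, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It is not Mojo's code: the user agent name `mojo-example`, the sample rules, and the helper names are all made up for illustration, and the rules are parsed from an in-memory string so the example runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it from
# https://<host>/robots.txt before requesting any other page on that host.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


# "mojo-example" is an assumed user agent name, not one Mojo actually sends.
parser = build_parser(SAMPLE_ROBOTS_TXT)
for url in ("https://example.com/docs/", "https://example.com/private/page"):
    verdict = "allowed" if is_allowed(parser, "mojo-example", url) else "blocked"
    print(url, verdict)
```

A crawler would run a check like this once per host and skip any URL it reports as blocked.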

malvads|27 days ago

Hi, thanks for your comment (it's on the roadmap). Mojo is early-stage software, so there are still things that need to be integrated. However, Mojo is not a mass crawler (you have to specify directly what to crawl), so even if I add robots.txt support (which is planned), evil users can still just bypass it. I expect Mojo to be used by technical (non-evil) folks.

But thanks for your suggestion :)