
Show HN: Smolmodels – open-source tool to build ML models using natural language

37 points | imaginaryspaces | 1 year ago | github.com

Hi HN - Marcello and Vaibhav here. We built smolmodels to experiment with using LLMs for ML development. It's a fully open-source library that generates complete model training and inference code from natural language descriptions. It combines graph search with LLM code generation to find a model whose predictions are as good as possible.
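To make "graph search plus code generation" concrete, here is a minimal sketch of a best-first search over candidate solutions. It is not the smolmodels implementation: `propose_variants` and `evaluate` are hypothetical stand-ins for "ask an LLM for new training code" and "train and validate that code", and the toy score is invented purely so the example runs.

```python
import heapq
import random


def propose_variants(candidate, rng):
    """Hypothetical stand-in for the LLM step: return tweaked copies of a candidate."""
    variants = []
    for _ in range(3):
        variant = dict(candidate)
        variant["depth"] = max(1, candidate["depth"] + rng.choice([-1, 1]))
        variant["lr"] = candidate["lr"] * rng.choice([0.5, 2.0])
        variants.append(variant)
    return variants


def evaluate(candidate):
    """Hypothetical stand-in for train-and-validate: toy score, best at depth=4, lr=0.1."""
    return -abs(candidate["depth"] - 4) - abs(candidate["lr"] - 0.1) * 10


def search(start, budget=30, seed=0):
    """Best-first search: always expand the highest-scoring candidate seen so far."""
    rng = random.Random(seed)
    best, best_score = start, evaluate(start)
    counter = 0  # unique tie-breaker so the heap never has to compare dicts
    # heapq is a min-heap, so scores are stored negated
    frontier = [(-best_score, counter, start)]
    evaluated = 1
    # The budget is checked between expansions, so it can be overshot by a few evaluations
    while frontier and evaluated < budget:
        _, _, node = heapq.heappop(frontier)
        for child in propose_variants(node, rng):
            score = evaluate(child)
            evaluated += 1
            counter += 1
            heapq.heappush(frontier, (-score, counter, child))
            if score > best_score:
                best, best_score = child, score
    return best, best_score


start = {"depth": 1, "lr": 0.8}
best, best_score = search(start)
```

The search never returns anything worse than the starting point, and the evaluation budget caps how much training you pay for.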

The core idea is that LLMs are overkill for a lot of predictive tasks. Smolmodels automates the trial-and-error process of finding the right model architecture and training approach, letting you build small, specialised models. You can either provide your own training data or have the library generate synthetic data based on your input/output schema requirements. This lets you quickly experiment with different model designs before investing in data collection.
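As a rough illustration of generating synthetic data from a schema, the sketch below fills rows with random values of the declared types. The schema format (field name mapped to a Python type) and the `synth_rows` helper are assumptions made for this example, and a random generator stands in for whatever the library actually does to produce data.

```python
import random


def synth_rows(schema, n, seed=0):
    """Generate n rows matching a {field_name: python_type} schema (assumed format)."""
    rng = random.Random(seed)
    makers = {
        int: lambda: rng.randint(0, 100),
        float: lambda: round(rng.uniform(0.0, 1.0), 3),
        bool: lambda: rng.random() < 0.5,
        str: lambda: rng.choice(["a", "b", "c"]),
    }
    return [{name: makers[typ]() for name, typ in schema.items()} for _ in range(n)]


# Made-up churn-prediction schema: two inputs and one output column
schema = {"age": int, "income": float, "churned": bool}
rows = synth_rows(schema, n=5)
```

Rows like these are enough to check that a training pipeline runs end to end before any real data is collected.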

The library handles the full pipeline - from data prep/generation through training to inference code. Everything can be self-hosted and works with major LLM providers.

We would love any thoughts/feedback on the project!

Repo link: https://github.com/plexe-ai/smolmodels

8 comments


documentparser|1 year ago

This is an interesting approach. What kind of data can I use with it? Can't wait to try it out.

imaginaryspaces|1 year ago

As of now (v0.4.0), the library expects a structured/tabular input (think pandas DataFrame). So you could use it for things like transaction data, insurance records, anything that fits in a "table".

However, the concept generalises to other data types very naturally, and we plan to add support for things like images, audio, etc. very soon :)

binarymuffin|1 year ago

Love the idea. It can take away the complexity that comes with such an effort.

pfrpt|1 year ago

Looks cool. So can I fine-tune an LLM with this tool?

imaginaryspaces|1 year ago

Not yet, though we plan to add that feature soon-ish! As of 0.4.0, with this tool you can use LLMs to build non-LLM models for your ML use cases.

For example: you have an ecommerce site and want to rank relevant products for your users. You want to launch a prototype quickly. You could use ChatGPT as your ranker ("rank products for this user ..."), or you could use smolmodels to generate a more lightweight ranking model like a smaller neural net, etc.
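For a sense of how small such a ranker can be, here is a sketch in plain Python. It uses logistic regression rather than a neural net, the feature names and click history are made up, and none of it is code that smolmodels generates.

```python
import math


def train_scorer(examples, epochs=200, lr=0.1):
    """Fit logistic-regression weights on (features, clicked) pairs with plain SGD."""
    n_features = len(examples[0][0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, y in examples:
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted click probability
            err = p - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def rank(products, w, b):
    """Sort products by model score, highest first."""
    def score(item):
        return b + sum(wi * xi for wi, xi in zip(w, item["features"]))
    return sorted(products, key=score, reverse=True)


# Made-up features per (user, product) pair: [price_match, category_match]
history = [([1.0, 1.0], 1), ([0.9, 0.8], 1), ([0.2, 0.1], 0), ([0.1, 0.3], 0)]
w, b = train_scorer(history)

catalog = [
    {"name": "socks", "features": [0.1, 0.2]},
    {"name": "sneakers", "features": [0.9, 0.9]},
    {"name": "hat", "features": [0.5, 0.4]},
]
ranked = [p["name"] for p in rank(catalog, w, b)]
```

Scoring a product here is a handful of multiplications, which is the kind of cost difference that makes a small model attractive compared with calling an LLM for every ranking request.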
