marcellodb | 4 months ago | on: Launch HN: Plexe (YC X25) – Build production-grade ML models from prompts
1. Tabular data only, for now. Text/images also work if they're in a table, but unfortunately not unstructured text or folders of loose image files. Full support for images, video, audio, etc. is coming in the near future.
2. Input pre-processing is deployed in the model endpoint to ensure feature engineering is applied consistently across training and inference. Once a model is built, you can see the inference code in the UI and you'll notice the pre-processing code mirrors the feature engineering code. If you meant something like deploying scheduled batch jobs for feature processing, we don't support that yet, but it's in our plans!
3. The agent isn't explicitly instructed to "push back" on using ML, but it is instructed to develop a predictor that is as simple and lightweight as possible, including simple baseline heuristics (average, most popular class, etc). Whatever performs best on the test set is selected as the final predictor, and this could just be the baseline heuristic, if none of the models outperform it. I like the idea of explicitly pushing back on developing a model if the use case clearly doesn't call for it!
4. Yes, we have a model evaluator agent that runs an extensive battery of tests on the final model to understand things like robustness to missing data, feature importance, biases, etc. You can find all the info in the "Evaluations" tab of a built model. I'm guessing this is close to what you meant by "model interpretation"?
5. A mix of generic and fine-tuned, and we're actively experimenting with the best models to power each of the agents in the workflow. Unsurprisingly, our experience has been that Anthropic's models (Sonnet 4.5 and Haiku 4.5) are best at the "coding-heavy" tasks like writing a model's training code, while OpenAI's models seem to work better at more "analytical" tasks like reviewing results for logical correctness and writing concise data analysis scripts. Fine-tuning for our specific tasks is, however, an important part of our implementation strategy.
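To make point 2 above concrete, here's a minimal sketch (hypothetical code, not Plexe's actual implementation) of the idea that a single preprocessing function is shared between the training path and the serving endpoint, so feature engineering can't drift between the two:

```python
# Hypothetical sketch: one preprocessing function reused in both the
# training script and the inference endpoint, so features stay consistent.
import math

def preprocess(row):
    # Example feature engineering (field names are made up for illustration):
    # log-scale a skewed numeric field and binary-encode a categorical one.
    return [
        math.log1p(row["amount"]),
        1.0 if row["channel"] == "web" else 0.0,
    ]

def build_training_features(rows):
    # Training path: apply preprocessing to every row of the dataset.
    return [preprocess(r) for r in rows]

def handle_inference_request(payload):
    # Serving path: the endpoint applies the exact same function, so the
    # deployed pre-processing mirrors the training-time feature engineering.
    return preprocess(payload)
```

The key property is that there is only one `preprocess` definition to keep correct, rather than two copies that can silently diverge.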
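The selection logic described in point 3 (baseline heuristics compete with trained models, and whichever scores best on the test set wins) could be sketched roughly like this, assuming a regression task scored by mean squared error; all names here are illustrative, not Plexe's internals:

```python
# Illustrative sketch of "simplest predictor wins": candidate predictors,
# including a trivial mean baseline, are scored on a held-out test set and
# the best one is selected, even if it's just the baseline heuristic.

def mse(preds, y_true):
    return sum((p - t) ** 2 for p, t in zip(preds, y_true)) / len(y_true)

def select_predictor(candidates, X_test, y_test):
    # candidates: dict of name -> callable(X) -> list of predictions
    scores = {name: mse(fn(X_test), y_test) for name, fn in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

train_mean = 10.0  # computed from the training labels
candidates = {
    "baseline_mean": lambda X: [train_mean] * len(X),
    "model_a": lambda X: [x * 2 for x in X],  # stand-in for a trained model
}
best, scores = select_predictor(candidates, [4.0, 5.0, 6.0], [10.0, 10.0, 10.0])
```

In this toy run the mean baseline scores a lower error than the "model", so `best` comes back as `"baseline_mean"` — exactly the case where no model outperforms the heuristic.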
Hope this covers all your questions!
marcellodb | 4 months ago | on: Launch HN: Plexe (YC X25) – Build production-grade ML models from prompts
1. Depending on your dataset, training could take from 45 minutes to a few hours. We do need to add an ETA for the build in the UI.
2. The input schema is inferred towards the end of the model building process, not right at the start, because the final schema depends on decisions made about input features, model architecture, etc. during the building process. You should see the sample curl update soon, with actual input fields.
3. Great point about rejecting builds upfront for model types we don't yet support. We'll be sure to add this soon!
We're in London at the moment, but we'd love to connect with you and/or meet in person next time we're in SF - drop us a note on LinkedIn or something :)
marcellodb | 4 months ago | on: Launch HN: Plexe (YC X25) – Build production-grade ML models from prompts
Caveat: as a more technical user, you can currently "hack" around this limitation by storing your images as byte arrays in a parquet file, in which case the platform can ingest your data and train a CV model for you. We haven't tested the performance extensively though, so your mileage may vary here.
marcellodb | 4 months ago | on: Launch HN: Plexe (YC X25) – Build production-grade ML models from prompts
This also highlights the important role of the user as a (potentially non-technical) domain expert. Hope that makes sense!
marcellodb | 4 months ago | on: Launch HN: Plexe (YC X25) – Build production-grade ML models from prompts
1. AutoML tools work on clean data. Data preparation requires understanding the business context, reasoning about the data in that context, and then producing code for the required transformations. Because this process cannot be automated with "templated" pipelines, teams using AutoML still have to do the hardest - and arguably most important - part of the data science job themselves.
2. AutoML tools use "templated" models for regression, classification, etc, which may not result in as good a "task-data-model fit" as the sort of purpose-written ML code a data scientist or ML engineer might produce.
3. AutoML tools still require a working understanding of data science technicalities. They automate the running of ML training experiments, but not the task of deciding what to do in the first place, or the task of understanding whether what was done actually fits the task.
With this in mind, we've seen that most ML teams don't find traditional AutoML tools useful (they only automate the "easy" part), while software teams don't find them accessible (data science knowledge is still required).
Plexe addresses both of these issues: the agents' reasoning capabilities enable Plexe to work with messy data (as long as you provide business context), and to ENTIRELY abstract away the deeper technicalities of building custom models that fit the task and the data. We believe this makes Plexe both useful to ML teams and accessible to non-ML teams.
Does this line up with your experience of AutoML tools?