(no title)
aazo11 | 9 months ago
Some observations
- Gemma-3 is the best model for on-device inference.
- 1B models look fine at first but break under benchmarking.
- 4B can handle simple rewriting and PII redaction. It also did math reasoning surprisingly well.
- General knowledge Q&A does not work with a local model. This might work with a RAG pipeline or additional tools.
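For the rewriting/PII-redaction use case, here is a minimal sketch of how you might drive a local 4B model. It assumes an Ollama server running on the default port with a `gemma3:4b` model already pulled; the endpoint, model tag, and prompt wording are all my assumptions, not something specific to the setup above.

```python
import json
import urllib.request

# Assumed local Ollama endpoint and model tag -- adjust to your setup.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "gemma3:4b"


def build_redaction_prompt(text: str) -> str:
    """Build an instruction prompt asking the model to redact PII."""
    return (
        "Rewrite the following text, replacing any names, email addresses, "
        "phone numbers, or street addresses with [REDACTED]. "
        "Return only the rewritten text.\n\n" + text
    )


def redact_via_ollama(text: str) -> str:
    """Send the redaction prompt to the (assumed running) local server."""
    payload = json.dumps(
        {"model": MODEL, "prompt": build_redaction_prompt(text), "stream": False}
    ).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    sample = "Contact Jane Doe at jane@example.com or 555-0199."
    print(redact_via_ollama(sample))
```

Keeping `stream` off returns one JSON object with a `response` field, which is simpler for batch-style redaction than handling streamed chunks.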
I plan on training and fine-tuning 1B models to see if I can build high-accuracy, task-specific models under 1GB in the future.
billconan | 9 months ago