ikhatri | 2 years ago
Honestly, while ggml is super cool, it started as a hobby project and you probably shouldn't use it in production. ONNX has been the de facto standard for ML inference for years. What it's missing (compared to ggml) is 2–6-bit inference, which is helpful for large-scale transformers on edge devices (and is what helped ggml gain adoption so fast).
touisteur|2 years ago
ikhatri|2 years ago
ONNX really is the universal format. If you can get your model exported to ONNX, running it on various platforms becomes much easier.*
*as long as every hardware platform supports the ops you use in your network and you're not doing anything too fancy/custom :P