item 45216384


MediaSquirrel | 5 months ago

Yes, MLX is for research, but MLX-Swift is for production and it works quite well for supported models! Unlike CoreML, the developer community is vibrant and growing.

https://github.com/ml-explore/mlx-swift

Maybe I am working on a different set of problems than you are. But why would you use CoreML if not to access ANE? There are so many other, better newer options like llama.cpp, MLX-Swift, etc.
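For context, here is roughly what the MLX-Swift surface looks like (a minimal sketch, assuming the `MLX` package from the repo above is added as a dependency; the array values are illustrative):

```swift
// Minimal MLX-Swift sketch -- assumes the MLX package from
// https://github.com/ml-explore/mlx-swift is a project dependency.
import MLX

// Build two small arrays and combine them with an elementwise op.
// MLX computation is lazy; eval() forces it to actually run.
let a = MLXArray([1.0, 2.0, 3.0] as [Float])
let b = MLXArray([4.0, 5.0, 6.0] as [Float])
let sum = a + b
eval(sum)
print(sum)
```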

What are you seeing here that I am missing? What kind of models do you work with?


llm_nerd | 5 months ago

I know what MLX is. MLX-Swift is just a more accessible facade, but it's still MLX. The entire raison d'ĂȘtre for MLX is training and research. It is not a deployment library, and it has no intention of being one. Saying MLX replaces CoreML is simply nonsensical.

> But why would you use CoreML if not to access ANE?

The whole point of CoreML is hardware-agnostic execution, not to mention higher-level operations for most model touchpoints. If you went into this thinking CoreML = ANE, that was fundamentally wrong from the start. The ANE is one extremely limited path for CoreML models. Aside from some hyper-optimized models for core system functions, the vast majority of CoreML models end up running on the GPU (via Metal, it should be noted), but if/when Apple improves the ANE, existing models will simply use it as well. Similarly, when you run a CoreML model on an A19-equipped device, it will use the new matmul instructions where appropriate.

That's the point of CoreML.
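To make the hardware-agnostic point concrete: a CoreML caller only states a *preference* via `MLModelConfiguration.computeUnits`, and the framework decides per-op whether to run on CPU, GPU (Metal), or ANE. A sketch ("MyModel" is a placeholder name, not from the thread):

```swift
import CoreML

// The caller expresses a preference, not a device binding --
// CoreML partitions the model across CPU, GPU (Metal), and ANE.
let config = MLModelConfiguration()
config.computeUnits = .all  // also: .cpuOnly, .cpuAndGPU, .cpuAndNeuralEngine

// "MyModel" is a placeholder for whatever compiled model you ship.
let url = Bundle.main.url(forResource: "MyModel", withExtension: "mlmodelc")!
let model = try MLModel(contentsOf: url, configuration: config)
```

The same compiled model runs unchanged on newer silicon; only the framework's internal dispatch changes.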

Saying other options are "better, newer" is just weird and meaningless. Not only is CoreML rapidly evolving to support just about every modern model feature; in most benchmarks of CoreML vs people's hand-crafted Metal, CoreML smokes them. And then you run it on an A19 or the next M# and it leaves them crying for mercy. That's the point of it.

Can someone hand-craft some Metal and implement their own model runtime? Of course they can, and some have. That is the extreme exception, and no one in here should think that has replaced anything.

MediaSquirrel | 5 months ago

It sounds like your experience differs from mine. I oversaw teams trying to use CoreML in the 2020–2024 era who found it very buggy, as the screenshots I provided show.

More recently, I personally tried to convert Kokoro TTS to run on the ANE. After performing surgery on the model so it could run on the ANE via CoreML, I ended up with a recurring Xcode crash and reported the bug to Apple (as described in the post and copied in part below).

What actually worked for me was using MLX-audio, which has been great as there is a whole enthusiastic developer community around the project, in a way that I haven't seen with CoreML. It also seems to be improving rapidly.

In contrast, I have talked to exactly one developer who has used CoreML since ChatGPT launched, and all that person did was complain about the experience and explain how it inspired them to abandon on-device AI for the cloud.

___ Crash report:

A Core ML model exported as an `mlprogram` with an LSTM layer consistently causes a hard crash (`EXC_BAD_ACCESS` code=2) inside the BNNS framework when `MLModel.prediction()` is called. The crash occurs on M2 Ultra hardware and appears to be a bug in the underlying BNNS kernel for the LSTM or a related operation, as all input tensors have been validated and match the model's expected shape contract. The crash happens regardless of whether the compute unit is set to CPU-only, GPU, or Neural Engine.

*Steps to Reproduce:*

1. Download the attached Core ML models (`kokoro_duration.mlpackage` and `kokoro_synthesizer_3s.mlpackage`).
2. Create a new macOS App project in Xcode and add the two `.mlpackage` files to the project's "Copy Bundle Resources" build phase.
3. Replace the contents of `ContentView.swift` with the code from `repro.swift`.
4. Build and run the app on an Apple Silicon Mac (tested on M2 Ultra, macOS 15.6.1).
5. Click the "Run Prediction" button in the app.

*Expected Results:* The `MLModel.prediction()` call should complete successfully, returning an `MLFeatureProvider` containing the output waveform. No crash should occur.

*Actual Results:* The application crashes immediately upon calling `model.prediction(from: inputs, options: options)`. The crash is an `EXC_BAD_ACCESS` (code=2) that occurs deep within the Core ML and BNNS frameworks. The backtrace consistently points to `libBNNS.dylib`, indicating a failure in a low-level BNNS kernel during model execution. The crash log is below.
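For reference, the call site that triggers the crash looks roughly like this (a sketch only; the feature name `"tokens"` and the input shape are illustrative placeholders, and the actual code is in the attached `repro.swift`):

```swift
import CoreML

// Sketch of the crashing call path. Feature names and shapes here
// are illustrative; the real code is in the attached repro.swift.
let config = MLModelConfiguration()
config.computeUnits = .cpuOnly  // crash reproduces on CPU, GPU, and ANE

let url = Bundle.main.url(forResource: "kokoro_synthesizer_3s",
                          withExtension: "mlmodelc")!
let model = try MLModel(contentsOf: url, configuration: config)

// Inputs are validated against the model's shape contract beforehand.
let tokens = try MLMultiArray(shape: [1, 64], dataType: .int32)
let inputs = try MLDictionaryFeatureProvider(dictionary: ["tokens": tokens])
let options = MLPredictionOptions()

// EXC_BAD_ACCESS (code=2) inside libBNNS.dylib occurs on this call:
let output = try model.prediction(from: inputs, options: options)
```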