l-m-z
|
8 months ago
|
on: High-fidelity simultaneous speech-to-speech translation
Hibiki is an auto-regressive model with temperature-based sampling, so it's very similar to an LLM: generations are "random", and you can make them deterministic by fixing the RNG seed.
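As an illustration of the point, here is a minimal sketch in Rust of temperature-based categorical sampling made deterministic by fixing the seed. This is not Hibiki's actual code; the toy xorshift64 PRNG stands in for whatever RNG the model really uses.

```rust
// Toy deterministic PRNG: the same seed always yields the same stream.
struct Rng(u64);
impl Rng {
    fn next_f64(&mut self) -> f64 {
        // xorshift64 step, then map to [0, 1).
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

// Softmax with temperature over the logits, then inverse-CDF sampling.
fn sample(logits: &[f64], temperature: f64, rng: &mut Rng) -> usize {
    let scaled: Vec<f64> = logits.iter().map(|l| l / temperature).collect();
    let max = scaled.iter().cloned().fold(f64::MIN, f64::max);
    let exps: Vec<f64> = scaled.iter().map(|l| (l - max).exp()).collect();
    let total: f64 = exps.iter().sum();
    let mut u = rng.next_f64() * total;
    for (i, e) in exps.iter().enumerate() {
        u -= e;
        if u <= 0.0 {
            return i;
        }
    }
    exps.len() - 1
}

fn main() {
    let logits = [2.0, 1.0, 0.5, 0.1];
    // Same seed -> identical "random" generations.
    let mut r1 = Rng(42);
    let mut r2 = Rng(42);
    let a: Vec<usize> = (0..8).map(|_| sample(&logits, 0.8, &mut r1)).collect();
    let b: Vec<usize> = (0..8).map(|_| sample(&logits, 0.8, &mut r2)).collect();
    assert_eq!(a, b);
    println!("{:?}", a);
}
```

Running it twice with the same seed produces the same token sequence; changing the seed (or the temperature) changes the samples.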
l-m-z
|
1 year ago
|
on: Moshi: A speech-text foundation model for real time dialogue
Hi swyx, laurent from kyutai here. We actually used the online demo at moshi.chat for the live event (the original demo), so the same quantization. We have updated the weights on the online version since then to add support for more emotions, but we haven't noticed it getting worse.
One thing to point out is that it takes time to get used to interacting with the model: what tends to work, how to make it speak. The live event was far from perfect, but we certainly drew on that experience. I would encourage you to try the same kind of interactions we had during the live event and you should get similar results (though the model is very unpredictable so it's hard to be sure; you can see that some parts of the live event definitely didn't work as expected).
l-m-z
|
1 year ago
|
on: Llm.c – LLM training in simple, pure C/CUDA
Candle dev here, we also support training/backprop! We certainly focus on optimizing inference performance, but hopefully that should improve training efficiency too.
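For readers unfamiliar with what backprop support involves, here is a minimal tape-based reverse-mode autodiff sketch. This is just the general technique that ML frameworks implement, not Candle's actual internals.

```rust
// A tape records each operation's parents and local gradients as values are
// computed; the backward pass replays it in reverse, applying the chain rule.
#[derive(Default)]
struct Tape {
    parents: Vec<[usize; 2]>,
    local_grads: Vec<[f64; 2]>, // d(node)/d(parent)
    vals: Vec<f64>,
}

impl Tape {
    fn push(&mut self, val: f64, parents: [usize; 2], grads: [f64; 2]) -> usize {
        self.vals.push(val);
        self.parents.push(parents);
        self.local_grads.push(grads);
        self.vals.len() - 1
    }
    // Leaves carry zero local gradients, so backward stops at them.
    fn var(&mut self, v: f64) -> usize {
        self.push(v, [0, 0], [0.0, 0.0])
    }
    fn add(&mut self, a: usize, b: usize) -> usize {
        self.push(self.vals[a] + self.vals[b], [a, b], [1.0, 1.0])
    }
    fn mul(&mut self, a: usize, b: usize) -> usize {
        self.push(self.vals[a] * self.vals[b], [a, b], [self.vals[b], self.vals[a]])
    }
    // Walk the tape in reverse, accumulating chain-rule contributions.
    fn backward(&self, out: usize) -> Vec<f64> {
        let mut grads = vec![0.0; self.vals.len()];
        grads[out] = 1.0;
        for i in (0..=out).rev() {
            for j in 0..2 {
                grads[self.parents[i][j]] += self.local_grads[i][j] * grads[i];
            }
        }
        grads
    }
}

fn main() {
    // z = x * y + x with x = 3, y = 4  =>  dz/dx = y + 1 = 5, dz/dy = x = 3.
    let mut t = Tape::default();
    let x = t.var(3.0);
    let y = t.var(4.0);
    let xy = t.mul(x, y);
    let z = t.add(xy, x);
    let g = t.backward(z);
    assert_eq!(t.vals[z], 15.0);
    assert_eq!(g[x], 5.0);
    assert_eq!(g[y], 3.0);
}
```

Real frameworks track the same information per tensor op (on GPU, with broadcasting, etc.), which is why a fast forward pass tends to translate into a fast backward pass as well.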
l-m-z
|
2 years ago
|
on: Candle: Torch Replacement in Rust
The tensors should be Send and Sync, so they can be manipulated from multiple threads; the underlying data is protected by an RwLock to guard against data races. Heavy operations such as matrix multiplication will run on multiple cores even without any explicit threading.
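A minimal sketch of that pattern with plain std types (this is just the RwLock-guarded-shared-data idea, not candle's actual storage type):

```rust
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    // Stand-in for a tensor's shared storage: many threads can read
    // concurrently via read(), while write() takes exclusive access,
    // so data races are impossible by construction.
    let data = Arc::new(RwLock::new(vec![0.0f64; 4]));

    let mut handles = Vec::new();
    for t in 0..4 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            // Exclusive write access for this slot.
            data.write().unwrap()[t] = t as f64;
        }));
    }
    for h in handles {
        h.join().unwrap();
    }

    // Shared read access from the main thread.
    let sum: f64 = data.read().unwrap().iter().sum();
    assert_eq!(sum, 6.0);
    println!("sum = {sum}");
}
```

Because `Arc<RwLock<_>>` is Send and Sync, the compiler itself rejects any attempt to share the data in a way that could race.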
l-m-z
|
2 years ago
|
on: Llama2.c: Inference llama 2 in one file of pure C
Another random (self) plug for a Rust version: this uses the candle ML library we've been working on for the last month and can run in the browser.
https://laurentmazare.github.io/candle-llama2/index.html
The non-web version has full GPU support but is not at all minimalist :)
l-m-z
|
6 years ago
|
on: Self-Supervised Learning [pdf]
l-m-z
|
7 years ago
|
on: OCaml bindings for PyTorch
(I'm obviously biased as being the author of ocaml-torch)
Using functional programming is not of much help for size mismatches; the current bindings don't even use the type system to check the number of dimensions, although that should be reasonably easy to add. Maybe a better approach to help with this would be tensors with named dimensions:
http://nlp.seas.harvard.edu/NamedTensor It's possible that a strong type system would help here, but I don't think there have been many attempts.
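As a toy sketch of the type-system direction (hypothetical code, not ocaml-torch or any real library), dimensions can be encoded in the type, here with Rust const generics, so a shape mismatch in matmul becomes a compile error rather than a runtime one:

```rust
// A row-major R x C matrix whose shape lives in the type.
struct Matrix<const R: usize, const C: usize>(Vec<f64>);

impl<const R: usize, const C: usize> Matrix<R, C> {
    fn from_fn(f: impl Fn(usize, usize) -> f64) -> Self {
        Matrix((0..R * C).map(|i| f(i / C, i % C)).collect())
    }

    // The inner dimension C must match the other operand's row count:
    // calling matmul with incompatible shapes simply does not type-check.
    fn matmul<const K: usize>(&self, other: &Matrix<C, K>) -> Matrix<R, K> {
        Matrix::<R, K>::from_fn(|r, k| {
            (0..C).map(|c| self.0[r * C + c] * other.0[c * K + k]).sum()
        })
    }
}

fn main() {
    let a = Matrix::<2, 3>::from_fn(|r, c| (r * 3 + c) as f64); // [[0,1,2],[3,4,5]]
    let b = Matrix::<3, 1>::from_fn(|r, _| r as f64);           // [[0],[1],[2]]
    let y = a.matmul(&b); // 2x1; `b.matmul(&b)` would fail to compile
    assert_eq!(y.0, vec![5.0, 14.0]);
    println!("{:?}", y.0);
}
```

This only checks dimensions known at compile time; shapes that depend on runtime data are exactly where the named-dimension approach above becomes more attractive.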
However, when using the Python API I often have errors because of:
- Unused variables when refactoring my code, which are just me forgetting to use some parameters.
- Comparing things of different types (Python does not report any error and just returns that they are different).
- Making changes to some helper functions without adapting all the projects where I'm using them.
Using a good Python linter probably helps with these, but that's a place where languages like OCaml naturally shine.
l-m-z
|
7 years ago
|
on: OCaml bindings for PyTorch
Bindings author here. This should support most of the PyTorch ops; the binding code is generated automatically, as there are more than a thousand of them. Most of the models available in torchvision should also be there (with pre-trained weights), see
https://github.com/LaurentMazare/ocaml-torch/tree/master/src... Finally, it's also possible to export some Python-defined models and run them from OCaml.
That being said, there are some rough edges compared to the PyTorch API, e.g. no parallel data loader, not much tooling, and only a couple of tutorials...