Why is it so difficult to write a short description of what the project does? With too many open source projects, people who are not familiar with them have to play detective to figure out what they actually do. "Wait, a package manager based on numpy? That doesn't make any sense. Oh, they mention LLM? So it must have something to do with AI."
The doc comment at the top of the .py file is sufficiently descriptive:
"""Simple, minimal implementation of Mamba in one file of Numpy adapted from (1) and inspired from (2).
Suggest reading the following before/while reading the code:
[1] Mamba: Linear-Time Sequence Modeling with Selective State Spaces (Albert Gu and Tri Dao)
https://arxiv.org/abs/2312.00752
[2] The Annotated S4 (Sasha Rush and Sidd Karamcheti)
https://srush.github.io/annotated-s4
I'm not normally one to gripe about name conflicts, but I knew this was going to get confusing. You could install an implementation of the Mamba LLM using the Mamba package manager!
ALL of the machine learning frameworks have incredible churn. I have code from two years ago which I can't make work reliably anymore -- not for lack of trying -- due to all the breaking changes and dependency issues. There are systems where each model runs in its own docker, with its own set of pinned library versions (many with security issues now). It's a complete and utter trainwreck. Don't even get me started on CUDA versions (or Intel/AMD compatibility, or older / deprecated GPUs).
For comparison, virtually all of my non-machine-learning Python code from 2010 still works in 2024.
There are good reasons for this. Those breaking changes aren't just for fun; they reflect the very rapid rate of progress in the field. In contrast, Python and numpy are mature systems. Still, it makes many machine learning models insanely expensive to maintain in production environments.
If you're a machine learning researcher, it's fine, but if you have a system like an ecommerce web site or a compiler or whatever, where you'd like to be able to plug in a task-specific ML model, your down payment is a weekend of hacking to make it work, but your ongoing rent of maintenance costs might be a few weeks each year for each model you use. I have a million places I'd love to plug in a little bit of ML. However, I'm very judicious with it, not because it's hard to do, but because it's expensive to maintain.
A pure Python + numpy implementation would mean that you can avoid all of that.
I was struggling with the fairness of your comment because the libraries are not used as a replacement for NumPy, but to make the data easier to deal with. This made me check, and it turns out that:
"Yes, the comment you mentioned is fair and reflects a common perspective in the programming and data science communities regarding the usage of "pure" implementations. When someone refers to a "pure X implementation," the typical expectation is that the implementation will rely solely on the functionalities of library X, without introducing dependencies from other libraries or frameworks."
I don’t see a PyTorch import, and the transformers import is just for the tokenizer which I don’t really consider a nontrivial part of mamba
So it’s just numpy and einops, which is pretty cool. I guess you could probably rewrite all the einops stuff in pure numpy if you want to trade readable code for eliminating the einops dependency
Edit: found the torch import, but it’s just for a single torch.load to deserialize some data
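To make the einops-versus-numpy trade-off concrete, here is a rough sketch of how a few common einops patterns map onto plain numpy calls. The patterns and shapes are illustrative assumptions, not taken from the repository, so treat this as a starting point rather than a drop-in patch.

```python
import numpy as np

# Hypothetical shapes, chosen only for illustration.
batch, seq_len, d_inner = 2, 5, 4
x = np.arange(batch * seq_len * d_inner, dtype=np.float64)
x = x.reshape(batch, seq_len, d_inner)

# einops: rearrange(x, "b l d -> b d l")
x_bdl = np.transpose(x, (0, 2, 1))

# einops: repeat(v, "n -> d n", d=d_inner)
v = np.arange(1, 4, dtype=np.float64)
v_dn = np.tile(v, (d_inner, 1))  # shape (d_inner, 3), each row is a copy of v

# einops: rearrange(x, "b l (h p) -> b l h p", h=2)
x_split = x.reshape(batch, seq_len, 2, d_inner // 2)
```

The numpy versions work, but the axis meaning now lives in comments instead of in the call itself, which is the readability cost mentioned above.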
It totally mentions what it does. It takes the sentence "I have a dream that" and extends it to: "I have a dream that I will be able to see the sunrise in the morning."
thoronton|1 year ago
Tomte|1 year ago
Why are you entitled to have every single GitHub repo explained, tailored to your individual knowledge?
Many other people understood exactly what this is.
Maybe the submitter could add a comment on HN with an explanation, but the author owes you nothing.
edflsafoiewq|1 year ago
grandma_tea|1 year ago
arthurcolle|1 year ago
at this moment, in this time, if you see Mamba, either you know or you don't
mint2|1 year ago
I’m familiar with mamba, the conda-like thing in Python, but a numpy implementation of that makes no sense.
nerdponx|1 year ago
exe34|1 year ago
Hugsun|1 year ago
blagie|1 year ago
A numpy program will work tomorrow.
ptspts|1 year ago
For me pure X means: to use this, all you have to install is X.
qwertox|1 year ago
TIL.
rsfern|1 year ago
nobodywillobsrv|1 year ago
rowanG077|1 year ago
sva_|1 year ago
Proponents of it usually highlight its inference performance, in particular linear scaling with the number of input tokens.
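For anyone wondering where the linear scaling comes from, here is a minimal sketch of the state space recurrence described in the Mamba paper, simplified to a single channel with a diagonal state matrix. The function name `selective_scan_1d` and the argument shapes are my own assumptions for illustration, not the repository's code.

```python
import numpy as np

def selective_scan_1d(x, delta, A, B, C):
    """Sequential scan for one channel (a simplified sketch).

    x:     (L,)   input sequence
    delta: (L,)   per-step step sizes
    A:     (N,)   diagonal of the state matrix
    B, C:  (L, N) per-step input and output projections
    Returns y: (L,)
    """
    L, N = B.shape
    h = np.zeros(N)  # fixed-size hidden state, independent of L
    y = np.empty(L)
    for t in range(L):  # one constant-cost update per token
        A_bar = np.exp(delta[t] * A)  # discretized diagonal A
        B_bar = delta[t] * B[t]       # simplified discretization of B
        h = A_bar * h + B_bar * x[t]
        y[t] = C[t] @ h
    return y

# Tiny usage example with a one-dimensional state.
y_demo = selective_scan_1d(
    x=np.array([1.0, 2.0]),
    delta=np.array([1.0, 1.0]),
    A=np.array([-np.log(2.0)]),
    B=np.ones((2, 1)),
    C=np.ones((2, 1)),
)
```

Each token costs one update of a fixed-size state, so the total work grows linearly with sequence length, unlike attention, which compares every token against every earlier one.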
wodenokoto|1 year ago
It's an LLM.
piqufoh|1 year ago
I also assumed that "a pure NumPy implementation" meant that it was built purely with numpy, which it isn't smh