swimwiththebeat | 2 years ago
Does anyone know if this is using the Mamba architecture [1] instead of transformers? It looks like it uses a state space model (SSM) layer.

[1]: https://arxiv.org/abs/2312.00752
milliondreams | 2 years ago
We covered state space models in a blog post here: https://blog.dragonscale.ai/state-space-models/
It gives an overview of Mamba and StripedHyena.
sal9000 | 2 years ago
It came earlier than Mamba. It uses Hyena hierarchy blocks, which are considered SSMs but are not the same as Mamba.