zackangelo | 4 months ago
> Instead of one brittle giant, we orchestrate a Mixture of Experts…
“Mixture of experts” is a specific term of art that describes an architectural detail of a type of transformer model. It’s definitely not using smaller specialized models for individual tasks. Experts in an MoE model are actually routed to on a per-token basis, not on a per-task or per-generation basis.
I know it’s tempting to co-opt this term because it would fit nicely for what you’re trying to do, but it just adds confusion.
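To make the distinction concrete, here’s a minimal sketch (pure Python, with made-up dimensions; `route`, `NUM_EXPERTS`, etc. are hypothetical names, not from any real MoE implementation) of the per-token routing described above: each token’s hidden state is scored against every expert, and the token is sent to its own top-k experts, so two tokens in the same generation can land on different experts.

```python
import math
import random

random.seed(0)

# Hypothetical, tiny dimensions for illustration only.
NUM_TOKENS, D_MODEL, NUM_EXPERTS, TOP_K = 4, 8, 4, 2

def randn_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Token hidden states and a (randomly initialized) router weight matrix.
tokens = randn_matrix(NUM_TOKENS, D_MODEL)
router_w = randn_matrix(D_MODEL, NUM_EXPERTS)

def route(token):
    """Score every expert for ONE token and return its top-k experts."""
    logits = [sum(x * router_w[i][e] for i, x in enumerate(token))
              for e in range(NUM_EXPERTS)]
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    total = sum(weights)
    probs = [w / total for w in weights]
    # The routing decision happens here, per token --
    # not per task, per request, or per generation.
    top = sorted(range(NUM_EXPERTS), key=lambda e: -probs[e])[:TOP_K]
    return top, probs

assignments = [route(tok)[0] for tok in tokens]
for t, experts in enumerate(assignments):
    print(f"token {t} -> experts {sorted(experts)}")
```

In a real MoE transformer the router is a learned layer inside each MoE block and the selected experts’ outputs are combined with the routing weights; the point of the sketch is only that the choice of experts varies token by token.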
rgthelen | 4 months ago