(no title)
hedgehog | 7 days ago
Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models (https://arxiv.org/abs/2505.16056)
Breaking the MoE LLM Trilemma: Dynamic Expert Clustering with Structured Compression (https://arxiv.org/abs/2510.02345)
No comments yet.