item 37935058 (no title)

axiom92 | 2 years ago
Right, but having no separate image encoder and being half the size could be very helpful for many applications.

GaggiX|2 years ago
The 7B LLaVA model is smaller, even when you count the image encoder (CLIP-L).