top | item 37935058

axiom92 | 2 years ago

Right, but having no separate image encoder and being half the size could be very helpful for many applications.


GaggiX|2 years ago

The 7B LLaVA model is smaller, even when you count the image encoder (CLIP-L).