be7a | 5 months ago

The biggest takeaway is that they claim SOTA for multi-modal stuff, even ahead of proprietary models, and still released it as open weights. My first tests suggest this might actually be true; I'll continue testing. Wow.

ACCount37|5 months ago

Most multi-modal input implementations suck, and a lot of them suck big time.

This one doesn't seem to be far ahead of existing proprietary implementations. But it's still good that someone's willing to push that far and release the results. Getting multimodal input to work even this well is not at all easy.

Computer0|5 months ago

I feel like most open-source releases, regardless of size, claim to be similar in output quality to SOTA closed-source stuff.