grey8 | 2 years ago
To answer your question, it's a model that you can give images and videos, which you can then interact with via an LLM (ask questions, describe, process further, etc.). It can "see" them, basically.
It's the same capability as GPT-4V (ChatGPT's "upload image" feature), except that ChatGPT only accepts images.