andblac | 1 year ago

Skimming through the source, it seems to run 'car' and 'person' objects through llava with the following prompts:

- "person": "get gender and age of this person in 5 words or less",

- "car": "get body type and color of this car in 5 words or less".

So YOLO gives the bounding box and rough category, while llava describes the object in more detail.
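
A rough sketch of that two-stage flow, not the project's actual code: only the two prompt strings are taken from the source. `Detection`, `crop`, `describe_detections` and the `describe` callback are hypothetical names, the image is modeled as a plain list of pixel rows, and passing a crop of the bounding box to llava is my assumption about how the objects get handed over.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional, Tuple

# Prompts as quoted above, keyed by the detector's class label.
PROMPTS = {
    "person": "get gender and age of this person in 5 words or less",
    "car": "get body type and color of this car in 5 words or less",
}


@dataclass(frozen=True)
class Detection:
    """Hypothetical container for one detector result."""
    label: str
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2), end-exclusive
    description: Optional[str] = None


def crop(image: List[list], box: Tuple[int, int, int, int]) -> List[list]:
    """Cut the bounding box out of an image stored as a list of pixel rows."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def describe_detections(
    image: List[list],
    detections: List[Detection],
    describe: Callable[[List[list], str], str],
) -> List[Detection]:
    """Attach a description to every detection whose label has a prompt.

    `describe` stands in for the vision-language model call (assumed
    signature: cropped image and prompt in, short text out).
    """
    results = []
    for det in detections:
        prompt = PROMPTS.get(det.label)
        if prompt is None:
            # Labels without a prompt keep only box and category.
            results.append(det)
            continue
        text = describe(crop(image, det.box), prompt)
        results.append(replace(det, description=text))
    return results
```

In a real setup the detections would come from YOLO and `describe` would wrap a call to a llava model; both are left as plug-in points here because I only skimmed that part.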
