iiJDSii | 1 year ago
Such as? Are they able to recognize arbitrary GUI elements from various desktop programs, web browsers, etc.?
mountainriver | 1 year ago
Qwen2.5-VL seems to be the best right now by our tests.
UI-TARS by ByteDance also has a good amount of pretraining.
Molmo is also very good at coordinates.