Using Multimodal LLMs to Understand UI Elements on Websites

11 points | daniel_mp | 1 year ago | qa.tech

5 comments

while1 | 1 year ago
Loving this! It's very surprising that today's LLMs are so bad at understanding interfaces, but that also makes this a very interesting case for fine-tuning!