
Can frontier LLMs solve CAD tasks?

15 points | KerrickStaley | 7 days ago | kerrickstaley.com

4 comments


__atx__|6 days ago

Pretty interesting that simulator-only binary feedback (unless I am reading it wrong) was enough here to build some pretty robust models!

I maintain [1], which provides the models with the ability to render a screenshot from any angle, and as far as I can tell, visually driven feedback does not work that well at this point. The models probably don't get enough of "lovecraftian garbled 3D model mess" in the training data or something...

[1] https://atx.github.io/OpenSCAD-Bench/

KerrickStaley|6 days ago

Cool project, thanks for sharing!

The simulator lets the LLM request renders from different angles/times, so the LLM can get visual feedback. For failures, the simulator also returns status codes like `object_fell` or `mount_initially_collided_with_object` depending on what happened. You can see what the tool call looks like by looking at the Transcript tab, e.g. here https://kerrickstaley.com/ai-cad-design-mount-viz/gso__mug__...
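To make the shape of that feedback concrete, here is a minimal sketch of how a result could be turned into text for the model. Only the two status strings come from the description above; `FailureStatus`, `SimResult`, and `feedback_message` are hypothetical names for illustration and are not the project's actual schema or tool-call format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailureStatus(Enum):
    # The two codes quoted above; the real simulator may define others.
    OBJECT_FELL = "object_fell"
    MOUNT_INITIALLY_COLLIDED_WITH_OBJECT = "mount_initially_collided_with_object"


@dataclass
class SimResult:
    # Hypothetical result shape (an assumption, not the real schema).
    success: bool
    status: Optional[FailureStatus] = None  # only expected on failures


def feedback_message(result: SimResult) -> str:
    """Turn a simulation result into text to hand back to the model."""
    if result.success:
        # Success carries no further detail in this sketch.
        return "success"
    if result.status is None:
        return "failure: unknown"
    return f"failure: {result.status.value}"


print(feedback_message(SimResult(success=False, status=FailureStatus.OBJECT_FELL)))
```

In this sketch a success carries nothing beyond the pass signal, while a failure names what went wrong, which is the part the model can act on when it iterates.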

I agree it's not clear how much benefit models get from iteration. Many of the successful runs are one-shots. You can see some examples of basic spatial reasoning e.g. here https://kerrickstaley.com/ai-cad-design-mount-viz/gso__mug__... :

> The initial collision is because the mount was positioned at the same height as the mug's body center (z=-22), causing overlap. I need to lower the mount significantly so the mug starts above it and drops into the cradle.

8note|6 days ago

It's binary only in the success case. It looks like the failures have details returned.