Teaching LLMs how to solid model

319 points | wgpatrick | 11 months ago | willpatrick.xyz

108 comments

[+] alnwlsn|11 months ago|reply
The future: "and I want a 3mm hole in one side of the plate. No the other side. No, not like that, at the bottom. Now make it 10mm from the other hole. No the other hole. No, up not sideways. Wait, which way is up? Never mind, I'll do it myself."

I'm having trouble understanding why you would want to do this. A good interface between what I want and the model I will make is to draw a picture, not write an essay. This is already (more or less) how Solidworks operates. AI might be able to turn my napkin sketch into a model, but I would still need to draw something, and I'm not good at drawing.

The bottleneck continues to be having a good enough description to make what you want. I have serious doubts that even a skilled person will be able to do it efficiently with text alone. Some combo of drawing and point+click would be much better.

This would be useful for short enough tasks like "change all the #6-32 threads to M3" though. To do so without breaking the feature tree would be quite impressive.

[+] abe_m|11 months ago|reply
I think this is along the lines of the AI horseless carriage[1] topic that is also on the front page right now. You seem to be describing the current method as operated through an AI intermediary. I think the power in AI for CAD will be at a higher level than lines, faces and holes. It will be more along the lines of "make a bracket between these two parts". "Make this part bolt to that other part". "Attach this pump to this gear train" (where the AI determines the pump uses a SAE 4 bolt flange of a particular size and a splined connection, then adds the required features to the housing and shafts). I think it will operate on higher structures than current CAD typically works with, and I don't think it will be history tree and sketch based like Solidworks or Inventor. I suspect it will be more of a direct modelling approach. I also think integrating FEA to allow the AI to check its work will be part of it. When you tell it to make a bracket between two parts, it can check the weight of the two parts, and some environmental specification from a project definition, then auto-configure FEA to check the correct number of bolts, material thickness, etc. If it made the bracket from folded sheet steel, you could then tell it you want a cast aluminum bracket, and it could redo the work.

[1]https://news.ycombinator.com/item?id=43773813

[+] seveibar|11 months ago|reply
Most likely you won’t be asking for specific things like “3mm hole 3in from the side”, you’ll say things like “Create a plastic enclosure sized to go under a desk, ok add a usb receptacle opening, ok add flanges with standard screw holes”

In the text to CAD ecosystem we talk about matching our language/framework to “design intent” a lot. The ideal interface is usually higher level than people expect it to be.

[+] eurekin|11 months ago|reply
I have come across a significant number of non-engineers wanting to do something that ultimately involves some basic CAD modelling. Some can stall on such tasks for years (home renovation) or just don't do it at all. After some brief research, the main cause is not wanting to sink over 30 hours into learning the basics of a CAD package of choice.

For some reason they imagine it as a daunting, complicated, impenetrable task with many pitfalls, which aren't surmountable. Be it interface, general idea how it operates, fear of unknown details (tolerances, clearances).

It's easy to underestimate the knowledge required to use CAD productively.

One such piece of anecdata near me: high schools that buy 3D printers and think pupils will naturally want to print models. After the initial days of fascination, the printers stopped being used at all. I've heard from a person close to education that it's a country-wide phenomenon.

Back to the point though - maybe there's a group of users that want to create, but just can't do CAD at all, and such text descriptions seem perfect for them.

[+] itissid|11 months ago|reply
> and I want a 3mm hole in one side of the plate. No the other side. No, not like that, at the bottom. Now make it 10mm from the other hole. No the other hole. No, up not sideways.

One thing that is interesting here is you can read faster than TTS to absorb info. But you can speak much faster than you can type. So is it all that typing that's the problem, or could it be just an interface problem? And in your example, you could also just draw with your hand (wrist sensor) + talk.

I've been using agents to code this way. It's way faster.

[+] michaelt|11 months ago|reply
> I'm having trouble understanding why you would want to do this.

You would be amazed at how much time CAD users spend using Proprietary CAD Package A to redraw things from PDFs generated by Proprietary CAD Package B.

[+] whatshisface|11 months ago|reply
Here's how it might work, by analogy to the workflow for image generation:

"An aerodynamically curved plastic enclosure for a form-over-function guitar amp."

Then you get something with the basic shapes and bevels in place, and adjust it in CAD to fit your actual design goals. Then,

"Given this shape, make it easy to injection mold."

Then it would smooth out some things a little too much, and you'd fix it in CAD. Then, finally,

"Making only very small changes and no changes at all to the surfaces I've marked as mounting-related in CAD, unify my additions visually with the overall design of the curved shell."

Then you'd have to fix a couple other things, and you'd be finished.

[+] ssl-3|11 months ago|reply
So maybe the future is to draw a picture, and go from there?

For instance: My modelling abilities are limited. I can draw what I want, with measurements, but I am not a draftsman. I can also explain the concept, in conversational English, to a person who uses CAD regularly and they can hammer out a model in no time. This is a thing that I've done successfully in the past.

Could I just do it myself? Sure, eventually! But my modelling needs are very few and far between. It isn't something I need to do every day, or even every year. It would take me longer to learn the workflow and toolsets of [insert CAD system here] than to just earn some money doing something that I'm already good at and pay someone else to do the CAD work.

Except maybe in the future, perhaps I will be able to use the bot to help bridge the gap between a napkin sketch of a widget and a digital model of that same widget. (Maybe like Scotty tried to do with the mouse in Star Trek IV.)

(And before anyone says it: I'm not really particularly interested in becoming proficient at CAD. I know I can learn it, but I just don't want to. It has never been my goal to become proficient at every trade under the sun and there are other skills that I'd rather focus on learning and maintaining instead. And that's OK -- there's lots of other things in life that I will probably also never seek to be proficient at, too.)

[+] wgpatrick|11 months ago|reply
Yeah - I fully agree with this POV. From a UX/UI POV, I think this is where things are headed. I talk about this a bit at the end of the piece.
[+] chpatrick|11 months ago|reply
If the napkin sketch generation is 95% right and only needs minor corrections then it's still a massive time saver.
[+] wilg|11 months ago|reply
Think about it this way, if the richest person in the world wanted something done, they would probably just shoot off a text to someone, maybe answer a few questions, and then some time later it would be done. That's your interface.
[+] bboygravity|11 months ago|reply
Text (specs + conversations) is the starting point of 100 percent of all CAD drawings made by more than 1 human though (so essentially everything you see around you).

I don't get your point (and yes I use CAD programs myself).

[+] fragmede|11 months ago|reply
Talking to my computer and having it create things is pretty danged cool. Voice input takes out so much friction that, yeah, maybe it would be faster with a mouse and keyboard, but if I can just talk with my computer? I can do it while I'm walking around and thinking.
[+] oofbaroomf|11 months ago|reply
If you used LLM-assisted CAD for real industrial design, you would end up having to specify exactly where everything has to go and what size it has to be. But if you are doing that, then you may as well make an automated program to convert those specific requirements into a 3D model.

Oh wait, that's CAD.

Cynical take aside, I think this could be quite useful for normal people making simple stuff, and could really help consumer 3D printing have a much larger impact.

[+] lud_lite|11 months ago|reply
If there is an I in AI, then it would tell me where to put the hole and why.
[+] spmcl|11 months ago|reply
I did this a few months ago to make a Christmas ornament. There are some rough edges with the process, but for hobby 3D printing, current LLMs with OpenSCAD are a game-changer. I hadn't touched my 3D printer for years until this project.

https://seanmcloughl.in/3d-modeling-with-llms-as-a-cad-luddi...

[+] dgacmu|11 months ago|reply
This matches my experience having Claude 3.5 and Gemini 2.0-flash generate OpenSCAD, but I would call it interesting instead of a game changer.

It gets pretty confused about the rotation of some things and generally needs manual fixing. But it kind of gets the big picture sort of right. It mmmmayybe saved me time the last time I used it but I'm not sure. Fun experiment though.

[+] 0_____0|11 months ago|reply
As an MCAD user this makes me feel more confident that my skills are safe for a bit longer. The geometry you were trying to generate (minus bayonet lock, which is actually a tricky thing to make because it relies on elastic properties of the material) takes maybe a few minutes to build in Solidworks or any modern CAD package.
[+] adamweld|11 months ago|reply
A recent Ezra Klein Interview[0] mentioned some "AI-Enabled" CAD tools used in China. Does anyone know what tools they might be talking about? I haven't been able to find any open-source tools with similar claims.

>I went with my colleague Keith Bradsher to Zeekr, one of China’s new car companies. We went into the design lab and watched the designer doing a 3D model of one of their new cars, putting it in different contexts — desert, rainforest, beach, different weather conditions.

>And we asked him what software he was using. We thought it was just some traditional CAD design. He said: It’s an open-source A.I. 3D design tool. He said what used to take him three months he now does in three hours.

[0] https://www.nytimes.com/2025/04/15/opinion/ezra-klein-podcas...

[+] sota_pop|11 months ago|reply
Sounds like he could have been using an implementation of stable-diffusion+control-net. I’ve used Automatic1111, but I understand comfyUI and somethingsomethingforge are more modern versions.
[+] throwaway314155|11 months ago|reply
Happy to be corrected but this sounds like the kind of bullshit that crops up from time to time confusing "old" AI with generative AI.

Not that I don't believe it's possible. I just think the alternative (that it's bullshit) is more likely.

[+] ariwilson|11 months ago|reply
I'm a great user for this problem as I just got a 3D printer and I'm no good at modeling. I'm doing tutorials and printing a few things with TinkerCAD now, but my historic visualization sense is not great. I used SketchUp when I had a working Oculus Quest which was very cool but not sure how practical it is.

Unfortunately I tried to generate OpenSCAD a few times to make more complex things and it hasn't been a great experience. I just tried o3 with the prompt "create a cool case for a Pixel 6 Pro in openscad" and, even after a few attempts at fixing, still had a bunch of non-working parts with e.g. the USB-C port in the wrong place, missing or incorrect speaker holes, a design motif for the case not connected to the case, etc.

It reminds me of ChatGPT in late 2022 when it could generate code that worked for simple cases but anything mildly subtle it would randomly mess up. Maybe someone needs to finetune one of the more advanced models on some data / screenshots from Thingiverse or MakerWorld?

[+] _mattb|11 months ago|reply
Really cool, I'd love to try something like this for quick and simple enclosures. Right now I have some prototype electronics hot glued to a piece of plywood. It would be awesome to give a GenCAD workflow the existing part STLs (if they exist) and have it roughly arrange everything and then create the 3D model for a case.

Maybe there could be a mating/assembly eval in the future that would work towards that?

[+] rowanG077|11 months ago|reply
About a year ago I had a 2D drawing of a relatively simple part. I uploaded it to ChatGPT and asked it to model it in CadQuery. It required some coaching and manual post-processing, but it was able to do it. I have since moved to SolveSpace, since even after using CadQuery for years I was spending 50% of the time finding some weird structure to continue my drawing from. SolveSpace is simply much more productive for me.
[+] niemandhier|11 months ago|reply
This reminds me of using LLMs for LaTeX.

They will get you to 80% fast. The last 20%, to match what is in your head, is hard.

If you never walked the long path you probably won’t manage to go the last few steps.

[+] fxtentacle|11 months ago|reply
Wow, this entire thing reads like a huge "stay away" sign to me.

The call to action at the end is: "Try out Text-to-CAD in our Modeling App" But that's like the last thing I want to do. Even when I'm working with very experienced professionals, it's really hard to tell them what exactly I want to see changed in their 3D CAD design. That's why they usually export lots of 2D drawings and then I will use a pencil to draw on top of it and then they will manually update the 3D shape to match my drawn request. The improvement that I would like to see in affordable CAD software is that they make it easier to generate section views and ideally the software would be able to back-propagate changes from 2D into the 3D shape. Maybe one day that will be possible with multimodal AI models, but even then the true improvement is going to be in the methods that the AI model uses internally to update the data. But trying to use text? That's like bringing a knife to a gunfight. It's obviously the wrong modality for humans to reason about shapes.

Also, as a more general comment, I am not sure that it is possible to introduce a new CAD tool with only subscription pricing. Typically, an enclosure design will go through multiple variations over multiple production runs in multiple years. That means it's obvious to everyone that you need your CAD software to continue working for years into the future. For a behemoth like Autodesk, that is believable. For a startup with a six month burn rate, it's not. That's why people treat startups with subscription pricing like vaporware.

[+] acyou|11 months ago|reply
I think if you could directly tokenize 3D geometry and train an LLM on 3D models directly, you might get somewhere. In order to prompt it, you would need to feed it one or more 3D models plus a prompt, and it could give you back a different 3D model. This has been done to some extent with generative modeling pre-LLM, but I don't know of any work that takes LLM techniques applied to language and applies them to "tokenizing" 3D geometry. I suspect NVIDIA is probably working very hard on this exact problem for graphics applications.

For mechanical design, 3D modeling is highly integrative, inputs are from a vast array of poorly specified inputs with a high amount of unspecified and fluid contextual knowledge, and outputs are not well defined either. I'm not convinced that mechanical design is particularly well suited to pairing with LLM workflow. Certain aspects, sure. But 3D models and drawings that we consider "well-defined" are still usually quite poorly defined, and from necessity rely heavily on implicit assumptions.

The geometry of machine threads, for example. Are you going to have a big computer specify the position of each of the atoms in the machine thread? Even the most detailed CAD/CAM packages have thread geometry extremely loosely defined, to the point of listing the callout, and not modeling any geometry at all in many cases.

It would just be very difficult to feed enough contextual knowledge into an LLM to have the knowledge it needs to do mechanical design. Therein lies the main problem. And I will stress that it's not a training problem, it's a prompt problem, if that makes sense.

[+] alexose|11 months ago|reply
As a huge OpenSCAD fan and everyday Cursor user, it seems obvious to me that there's a huge opportunity _if_ we can improve the baseline OpenSCAD code quality.

If the model could plan ahead well, set up good functions, pull from standard libraries, etc., it would be instantly better than most humans.

If it had a sense of real-world applications, physics, etc., well, it would be superhuman.

Is anyone working on this right now? If so I'd love to contribute.

[+] switchbak|11 months ago|reply
OpenSCAD has some fundamental issues of which folks are well aware. Build123d is a Python alternative that shows promise and seems more capable, and there are others around.

Hard to beat the mindshare of OpenSCAD at the moment though.

[+] conorbergin|11 months ago|reply
Your prompts are very long for how simple the models are; using a CAD package would be far more productive.

I can see AI being used to generate geometry, but not a text-based one; it would have to be able to reason about 3D forms and do differential geometry.

You might be able to get somewhere by training an LLM to make models with a DSL for Open Cascade, or any other sufficiently powerful modelling kernel. Then you could train the AI to make query based commands, such as:

  // places a threaded hole at every corner of the top surface (maybe this is an enclosure)
  CUT hole(10mm,m3,threaded) LOCATIONS surfaces().parallel(Z).first().inset(10).outside_corners()
This has a better chance of being robust as the LLM would just have to remember common patterns, rather than manually placing holes in 3d space, which is much harder.
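
A rough Python sketch of that query-then-place idea, not tied to any real kernel. `Box`, `Face`, `top_face` and `inset_corners` are made-up names for illustration, and the box is a stand-in for whatever solid the kernel would actually hold:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Face:
    """Axis-aligned rectangular face: its XY extent and its Z height."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    z: float


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for a solid: a box with one corner at the origin."""
    length: float
    width: float
    height: float

    def top_face(self) -> Face:
        # Query step: pick geometry by intent ("the top surface"),
        # instead of the LLM writing absolute coordinates itself.
        return Face(0.0, self.length, 0.0, self.width, self.height)


def inset_corners(face: Face, inset: float) -> list[tuple[float, float, float]]:
    """Return the four corners of the face, each moved inward by `inset`."""
    if 2 * inset >= min(face.x_max - face.x_min, face.y_max - face.y_min):
        raise ValueError("inset is too large for this face")
    xs = (face.x_min + inset, face.x_max - inset)
    ys = (face.y_min + inset, face.y_max - inset)
    return [(x, y, face.z) for x in xs for y in ys]


# The hole locations follow the enclosure if its dimensions change later.
enclosure = Box(length=100.0, width=60.0, height=30.0)
hole_centers = inset_corners(enclosure.top_face(), inset=10.0)
```

The model only has to emit the pattern (top face, inset 10, corners); the coordinates come out of the query.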
[+] wgpatrick|11 months ago|reply
I definitely agree with your point about the long prompts.

The long prompts are primarily an artifact of trying to make an eval where there is a "correct" STL.
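
For a sense of what checking against a reference mesh can involve, here is a minimal sketch that compares enclosed volumes. It is only an illustration and not the eval's actual scoring code; `signed_volume`, `volumes_match` and the 2% tolerance are assumptions, and it expects closed meshes with consistently outward-facing triangles.

```python
Vec = tuple[float, float, float]
Tri = tuple[Vec, Vec, Vec]


def signed_volume(triangles: list[Tri]) -> float:
    """Enclosed volume of a closed mesh, via signed tetrahedra to the origin."""
    total = 0.0
    for a, b, c in triangles:
        # Cross product b x c, then dot with a: six times the tetra volume.
        cx = b[1] * c[2] - b[2] * c[1]
        cy = b[2] * c[0] - b[0] * c[2]
        cz = b[0] * c[1] - b[1] * c[0]
        total += a[0] * cx + a[1] * cy + a[2] * cz
    return total / 6.0


def volumes_match(candidate: list[Tri], reference: list[Tri],
                  rel_tol: float = 0.02) -> bool:
    """True if the candidate's volume is within rel_tol of the reference's."""
    v_ref = abs(signed_volume(reference))
    v_cand = abs(signed_volume(candidate))
    return abs(v_cand - v_ref) <= rel_tol * v_ref


def scaled(triangles: list[Tri], factor: float) -> list[Tri]:
    """Uniformly scale a mesh about the origin (used here to fake a candidate)."""
    return [tuple(tuple(factor * x for x in v) for v in tri) for tri in triangles]


# Reference: a tetrahedron on the three axes, faces wound outward.
O, X, Y, Z = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
reference = [(O, Y, X), (O, X, Z), (O, Z, Y), (X, Y, Z)]

close_enough = volumes_match(scaled(reference, 1.005), reference)
too_big = volumes_match(scaled(reference, 1.1), reference)
```

Matching volume is necessary but not sufficient: two different shapes can enclose the same volume, so a real check would also need something like a bounding-box or surface-distance comparison.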

I think your broader point, text input is bad for CAD, is also correct. Some combo of voice/text input + using a cursor to click on geometry makes sense. For example, clicking on the surface in question and then asking for "m6 threaded holes at the corners". I think a drawing input also makes sense, as it's quite quick to do.

[+] Legend2440|11 months ago|reply
There are diffusion models for 3D generation. They make pretty good decorative or ornamental models, like figurines. They are less good for CAD.
[+] dave1010uk|11 months ago|reply
I 3D printed a replacement screw cap for something that GPT-4o designed for me with OpenSCAD a few months ago. It worked very well and the resulting code was easy to tweak.

Good to hear that newer models are getting better at this. With evals and RL feedback loops, I suspect it's the kind of thing that LLMs will get very good at.

Vision language models can also improve their 3D model generation if you give them renders of the output: "Generating CAD Code with Vision-Language Models for 3D Designs" https://arxiv.org/html/2410.05340v2
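
The general shape of that render-and-revise loop, as a hedged Python sketch. This is not the paper's implementation; `generate`, `render` and `critique` are placeholder callables you would back with an LLM call, an OpenSCAD render, and a vision-model check respectively, and the stubs at the bottom exist only to make the example run.

```python
from typing import Callable, Optional


def refine_with_renders(
    prompt: str,
    generate: Callable[[str, Optional[str], Optional[str]], str],
    render: Callable[[str], bytes],
    critique: Callable[[str, bytes], Optional[str]],
    max_rounds: int = 3,
) -> str:
    """Generate CAD code, then revise it using feedback on rendered output."""
    code = generate(prompt, None, None)
    for _ in range(max_rounds):
        image = render(code)
        feedback = critique(prompt, image)
        if feedback is None:
            # The critic found nothing to fix, so stop early.
            return code
        code = generate(prompt, code, feedback)
    return code


# Toy stubs standing in for the model and the renderer (assumptions).
def fake_generate(prompt: str, previous: Optional[str], feedback: Optional[str]) -> str:
    return "cube(10);" if feedback is None else "cube(20);"


def fake_render(code: str) -> bytes:
    return code.encode("utf-8")


def fake_critique(prompt: str, image: bytes) -> Optional[str]:
    return None if b"20" in image else "cube is too small, make it 20 mm"


result = refine_with_renders("a 20 mm cube", fake_generate, fake_render, fake_critique)
```

Capping the rounds matters in practice, since a model can keep producing slightly different output with the same flaw.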

OpenSCAD is primitive. There are many libraries that may give LLMs a boost. https://openscad.org/libraries.html

[+] geor9e|11 months ago|reply
I get that CAD interfaces are terrible - but if I imagine the technological utopia of the future - using the English language as the interface sounds terrible no matter how well you do it. Unless you are paraplegic and speaking is your only means of manipulating the world.

I much prefer the direction of sculpting with my hands in VR, pulling the dimensions out with a pinch, snapping things parallel with my fine motor control. Or sketching on an iPad, just dragging a sketch to extrude it along its normal, etc etc. These UIs could be vastly improved.

I get that LLMs are amazing lately, but perhaps keep them somewhere under the hood where I never need to speak to them. My hands are bored and capable of a very high bandwidth of precise communication.

[+] klysm|11 months ago|reply
I’m not sure that CAD interfaces are terrible, it’s just hard work
[+] emorning3|11 months ago|reply
Wow! As someone that's written openscad scripts manually I can get real excited about this.
[+] ein0p|11 months ago|reply
I've done this, and printed actual models AIs generated. In my experience Grok does the best job with this - it one shots even the more elaborate designs (with thinking). Gemini often screws up, but it sometimes can (get this!) figure things out if you show it what the errors are, as a screenshot. This in particular gives me hope that some kind of RL loop can be built around this. OpenAI models screw up and can't fix the errors (common symptom: generate slightly different model with the same exact flaws). DeepSeek is about at the same level at OpenSCAD as OpenAI. I have not tried Claude.
[+] derac|11 months ago|reply
You've got to be a bit more specific, those words can all refer to many models.
[+] karolist|11 months ago|reply
I wanted to use this process (LLM -> OpenSCAD) a few months ago to create custom server rack brackets (ears) for externally mounting the water-cooling radiator of the server I am building. I ended up learning about 3D printing, using SolidWorks (it has great built-in tutorials) and did this the old fashioned way. This process may work for refining parts against very well known objects, e.g. an iPhone, but given the amount of refinement, back and forth and verbosity needed, and the low acceptance rate, I do not believe we're close to using these tools for CAD.
[+] jmcpheron|11 months ago|reply
It's so cool to see this post, and so many other commenters with similar projects.

I had the same thought recently and designed a flexible bracelet for Pi Day using OpenSCAD and a mix of some of the major AI providers. It's cool to see other people are doing similar projects. I'm surprised how well I can do basic shapes in OpenSCAD with these AI assistants.

https://github.com/jmcpheron/counted-out-pi

[+] bdcravens|11 months ago|reply
Most of the 3D printing model repositories offer financial incentives for model creators, as they are usually owned by manufacturers who want to own as much of the ecosystem as possible. (Makerworld, Printables, etc)

Widespread AI generation obviously enables abuse of those incentives, so it'll be interesting to see how they adjust to this. (It's already a small problem, with modelers using AI renderings that are deceptive in terms of quality)

[+] rcarmo|11 months ago|reply
As someone who enjoys doing CAD and spends a fair amount of time doing contortions to get OpenSCAD to do relatively simple things like bevels and chamfers, I’d say this is interesting because of the model ranking, but ultimately pointless because LLMs do not really have a mental model of things like CSG and actual relative positioning - they can do simple boxes and cylinders with mounting holes, but that’s about it.
[+] isoprophlex|11 months ago|reply
Makes you wonder if there is a place in the pipeline for generating G-code (motion commands that run CNC mills, 3d printers etc.)

Being just a domestic 3D printer enthusiast, I have no idea what the real-world issues are in manufacturing with CNC mills; I'd personally enjoy an AI telling me which of the 1000 possible combinations of line width, infill %, temperatures, speeds, wall generation params etc. to use for a given print.

[+] rowanG077|11 months ago|reply
There is some industry usage of AI in G-code generation. But it often requires at least some post-processing. In general, if you just want a few parts without hard tolerances it can be pretty good. But when you need to churn out thousands, it's worth it to go in and manually optimize to squeeze out those precious machine hours.