top | item 46425918


joefourier | 2 months ago

Have you used an LLM specifically trained for tool calling, in Claude Code, Cursor or Aider?

They’re capable of looking up documentation and correcting their errors by compiling and running tests, and when coupled with a linter, hallucinations are a non-issue.

I don’t really think it’s possible to dismiss a model that’s been trained with reinforcement learning for both reasoning and tool usage as only doing pattern matching. They’re not at all the same beasts as the old style of LLMs based purely on next-token prediction over massive scrapes of web data (with some fine-tuning on Q&A pairs and RLHF to pick the best answers).


treespace8|2 months ago

I'm using Claude Code to help me learn Godot game programming.

One interesting thing is that Claude will not tell me if I'm following the wrong path. It will just make the requested change to the best of its ability.

For example, in a Tower Defence game I'm making, I wanted to keep turret position state in an AStarGrid2D. Claude produced code to do this, but the code became harder and harder to follow as I went on. It was only after watching more tutorials that I figured out I was asking for the wrong thing. (TileMapLayer is a much better choice.)

LLMs still suffer from garbage in, garbage out.

jennyholzer3|2 months ago

don't use LLMs for Godot game programming.

edit: Major engine changes have occurred after the models were trained, so you will often be given code that refers to nonexistent constants and functions and which is not aware of useful new features.

memoriuaysj|2 months ago

before coding I just ask the model "what are the best practices in this industry to solve this problem? what tools/libraries/approaches do people use?"

after coding I ask it "review the code, do you see any parts for which common libraries already exist? are there ways to make it more idiomatic?"

you can also ask it "this is an idea on how to solve it that somebody told me, what do you think about it, are there better ways?"

belter|2 months ago

Ask a model to

"Write a chess engine where pawns move backward and kings can jump like knights"

It will keep slipping back into real chess rules. It learned chess; it did not understand the concept of "rules".

Or

Ask it to reverse a made up word like

"Reverse the string 'glorbix'"

It will get it wrong on the first try. You would not fail.

Or even better ask it to...

"Use the dxastgraphx library to build a DAG scheduler."

dxastgraphx is a nonexistent library...

Marvel at the results...tried in both Claude and ChatGPT....

manmal|2 months ago

I've just tried the dxastgraphx one in pi with Opus 4.5. This was its response:

    I couldn't find a library called dxastgraphx in either pip (Python) or npm (JavaScript) package registries. This library doesn't appear to exist.

    Did you perhaps mean one of these popular DAG/graph libraries?

    Python:
    - networkx - comprehensive graph library with DAG support
    - graphlib - Python standard library (3.9+) with TopologicalSorter
    - dask - parallel computing with DAG task scheduling

    JavaScript/TypeScript:
    - graphlib - graph data structures
    - dagre - DAG layout

    Would you like me to build a DAG scheduler using one of these existing libraries, or would you like me to implement one from scratch? Let me know which language and approach you prefer.
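For what it's worth, the `graphlib` option from that list is enough for a toy scheduler. A minimal sketch, assuming tasks are plain zero-argument callables run one at a time; `run_dag`, `tasks` and `deps` are names I made up for illustration, and only `TopologicalSorter` and `static_order()` come from the standard library:

```python
from graphlib import TopologicalSorter


def run_dag(tasks, deps):
    """Run every task after its prerequisites and return the execution order.

    tasks: dict mapping a task name to a zero-argument callable (hypothetical shape).
    deps:  dict mapping a task name to the set of task names it depends on.
    """
    # Include tasks that have no entry in deps so they are scheduled too.
    graph = {name: deps.get(name, ()) for name in tasks}
    sorter = TopologicalSorter(graph)
    order = []
    # static_order() yields prerequisites first and raises CycleError on a cycle.
    for name in sorter.static_order():
        tasks[name]()
        order.append(name)
    return order


if __name__ == "__main__":
    jobs = {
        "fetch": lambda: print("fetching"),
        "build": lambda: print("building"),
        "test": lambda: print("testing"),
    }
    run_dag(jobs, {"build": {"fetch"}, "test": {"build"}})
```

This runs everything sequentially; a real scheduler would probably use the sorter's `prepare()` / `get_ready()` / `done()` methods to dispatch independent tasks in parallel.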

bossyTeacher|2 months ago

Tried ChatGPT (free version). It answered the first question correctly.

Answer to second question:

"I can do that, but there’s a catch: dxastgraphx is not a known or standard Python (or JS) library as of now. I don’t have any public API or documentation for it to target directly.

To avoid guessing wrong, here are the two sensible options:"

somebodythere|2 months ago

Why would I ask the model to reverse the string 'glorbix', especially in the context of software engineering?

criticalfault|2 months ago

just tried to reverse the string you provided using Gemini. it worked fine on the first try
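For reference, the deterministic version is a one-liner in Python. A minimal sketch; the function name `reverse_string` is only illustrative:

```python
def reverse_string(text: str) -> str:
    """Return text with its characters in reverse order."""
    # A slice with step -1 walks the string from the last character to the first.
    return text[::-1]


if __name__ == "__main__":
    print(reverse_string("glorbix"))  # xibrolg
```

A model with tool access can presumably just run something like this instead of reversing the characters itself.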

baq|2 months ago

You’re trying to interrogate a machine as you would a human and presenting this as evidence that machines aren’t humans. Yes, you’re absolutely right! And also completely missing the point.