_bin_ | 9 months ago
3.5 is better for this, ime. I hooked Claude Desktop up to an MCP server to fake claude-code minus the extortionate pricing, and it works decently. I've been trying to apply it to Rust work; it's not great yet (it still doesn't really seem to "understand" Rust's concepts), but it can do some stuff if you make it run `cargo check` after each change and stop it if it doesn't.
I expect something like o3-high is the best out there (aider leaderboards support this) either alone or in combination with 4.1, but tbh that's out of my price range. And frankly, I can't mentally get past paying a very high price for an LLM response that may or may not be useful; it leaves me incredibly resentful as a customer that your model can fail the task, requiring multiple "re-rolls", and you're passing that marginal cost to me.
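For anyone wanting to script the "`cargo check` after each change" gate mentioned above, here is a minimal sketch in Python. It takes one reading of that workflow: apply one change, run the check, and stop at the first failure. The names `check_passes` and `apply_with_gate` are made up for illustration, and how a change actually gets applied (the `apply_change` callback) is left to whatever agent or MCP setup you use.

```python
import subprocess
from typing import Callable, Iterable, Optional, Sequence


def check_passes(cmd: Sequence[str] = ("cargo", "check"),
                 cwd: Optional[str] = None) -> bool:
    """Run the check command and report whether it exited with status 0."""
    result = subprocess.run(list(cmd), cwd=cwd, capture_output=True, text=True)
    return result.returncode == 0


def apply_with_gate(changes: Iterable,
                    apply_change: Callable[[object], None],
                    check: Callable[[], bool] = check_passes) -> int:
    """Apply changes one at a time and stop at the first failing check.

    Returns the number of changes that passed the check.
    """
    passed = 0
    for change in changes:
        apply_change(change)
        if not check():
            # Hand control back to the human instead of letting the
            # model pile more edits on top of a broken build.
            break
        passed += 1
    return passed
```

The check is passed in as a callable so the gate can be swapped for `cargo test`, `cargo clippy`, or a fake in tests.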
agilebyte|9 months ago
actsasbuffoon|9 months ago
I’m finding it useful for really tedious stuff like doing complex, multi-step terminal operations. For the coding… it’s not been great.
nico|9 months ago
I’ve been looking for something that can take “bare diffs” (unified diffs without line numbers) from the clipboard and apply them directly to a buffer (an open file in vscode).
None of the paste-diff extensions for vscode work, as they expect a full unified diff/patch.
I also tried a google-developed patch tool, but it also wasn’t very good at taking in the bare diffs, and it def couldn’t read from the clipboard.
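A rough sketch of what applying a "bare diff" could look like, in Python. Since there are no line numbers, it locates each hunk by searching the buffer for the hunk's context and removed lines, and refuses to guess when that block matches zero or several places. The function names are hypothetical, this is not how any existing extension works, and reading the clipboard or the editor buffer is out of scope here (both are plain strings).

```python
from typing import List, Tuple

Hunk = Tuple[List[str], List[str]]  # (lines expected in buffer, replacement lines)


def parse_bare_diff(diff_text: str) -> List[Hunk]:
    """Split a diff into hunks without relying on @@ line numbers."""
    hunks: List[Hunk] = []
    old: List[str] = []
    new: List[str] = []
    for line in diff_text.splitlines():
        if line.startswith("\\"):
            continue  # "\ No newline at end of file" marker
        if not hunks and not old and not new and line.startswith(("--- ", "+++ ")):
            continue  # file headers, only recognised before any hunk content
        if line.startswith("@@"):
            # A hunk separator; any numbers it carries are ignored.
            if old or new:
                hunks.append((old, new))
                old, new = [], []
            continue
        if line.startswith("-"):
            old.append(line[1:])
        elif line.startswith("+"):
            new.append(line[1:])
        else:
            # Context line; tolerate a leading space lost in copy/paste.
            text = line[1:] if line.startswith(" ") else line
            old.append(text)
            new.append(text)
    if old or new:
        hunks.append((old, new))
    return hunks


def apply_bare_diff(buffer_text: str, diff_text: str) -> str:
    """Apply each hunk at the single place where its old lines occur."""
    lines = buffer_text.splitlines()
    for old, new in parse_bare_diff(diff_text):
        if not old:
            raise ValueError("hunk has no context or removed lines to locate it")
        matches = [i for i in range(len(lines) - len(old) + 1)
                   if lines[i:i + len(old)] == old]
        if len(matches) != 1:
            raise ValueError(f"hunk matched {len(matches)} locations, expected exactly 1")
        start = matches[0]
        lines[start:start + len(old)] = new
    return "\n".join(lines) + ("\n" if buffer_text.endswith("\n") else "")
```

Matching is exact, so whitespace drift between the diff and the buffer makes a hunk fail rather than land in the wrong spot; a fuzzier matcher would trade that safety for convenience.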
never_inline|9 months ago
harvey9|9 months ago
johnsmith1840|9 months ago
o1-pro and o1-preview are the only models I've ever used that can reliably update and work with 1000 LOC without error.
I don't let o3 write any code unless it's very small. Any "cheap" model will hallucinate or fail massively when pushed.
One good tip I've been using lately: remove all comments from your code before passing it to an LLM, and don't let LLM-generated comments persist under any circumstance.
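For what it's worth, stripping comments is easy to automate. The sketch below handles Python source only, using the standard `tokenize` module so that a `#` inside a string literal is left alone; other languages would need their own lexer. The function name `strip_comments` is made up, docstrings are not touched (they are strings, not comments), and a shebang on the first line is kept.

```python
import io
import tokenize


def strip_comments(source: str) -> str:
    """Return Python source with # comments removed."""
    lines = source.split("\n")
    drop = set()  # indexes of lines that held nothing but a comment
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT:
            continue
        row, col = tok.start  # row is 1-based, col is 0-based
        if row == 1 and tok.string.startswith("#!"):
            continue  # keep a shebang line
        kept = lines[row - 1][:col].rstrip()
        if kept:
            lines[row - 1] = kept   # trailing comment: keep the code before it
        else:
            drop.add(row - 1)       # comment-only line: remove it entirely
    return "\n".join(line for i, line in enumerate(lines) if i not in drop)
```

Running the same function over model output before committing it is one way to enforce the second half of the tip.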
_bin_|9 months ago
I wouldn't be shocked if huge, expensive-to-run models performed better and if all the "optimized" versions were actually labs trying to ram cheaper bullshit down everyone's throat. Basically chinesium for LLMs; you can afford them but it's not worth it. I remember someone saying o1 was, what, 200B dense? I might be misremembering.
doug_durham|9 months ago
layoric|9 months ago
nico|9 months ago