top | item 41526115


deisteve | 1 year ago

So I've had a chance to watch different people try out o1, and here's my take:

It's largely hype, with some interesting moments.

First, they compared it to GPT-4o, not GPT-4. Second, the coding benchmark they used is iffy, since we already saw scores around 1400 from other LLMs last year. Third, I can't help but feel this is a marketing gimmick to get more ChatGPT Plus subscribers.

I almost subbed, thinking I'd get full access to the coding version of o1, but it seems they released a nerfed version.

It also seems like Claude has long applied the same RL techniques to its CoT.

I rarely use ChatGPT, and Claude (using ClaudeDev) has replaced even my need for Cursor.

I'm going to wait until the full o1 is released to lower-tier ChatGPT users. I also feel like they are keeping the CoT internal to raise the cost of prompts.

So my early excitement this evening has largely subsided, and I'm back on Claude again after logging into ChatGPT for the first time in months.


batperson|1 year ago

I wasn't aware of "Claude Dev". I just tried it out, and it's pretty cool. I had a simple Node chatbot with a handful of features, but it was all one big script, so I asked it to split the logic into separate files to make further dev work and maintenance easier.

5 minutes and 30 cents later, it came up with a decent folder structure and split everything up nicely. It even cleaned up a bunch of messy code into some helper functions. The vanilla Claude API could probably have done all this as well, but it would have been much more of a hassle. I'll definitely be playing around with it some more!