karpour | 10 months ago

I recently tried looking up something about local tax law in ChatGPT. It confidently told me a completely wrong rule. There are lots of sources for this, but since some probably unknowingly spread misinformation, ChatGPT just treated it as correct. Since I always verify what ChatGPT spits out, it wasn't a big deal for me, just a reminder that it's garbage in, garbage out.

freehorse|10 months ago

Yeah, I also find that LLMs very often say something wrong just because they found it on the internet. The problem is that we know not to trust a random website, but LLMs make wrong info more believable. So the problem, in some sense, is not exactly the LLMs, since they pick up on wrong stuff people or "people" have written, but they are really bad at figuring these errors out and particularly good at covering them up or backing them up.

mediaman|10 months ago

Out of curiosity, did you try this in o3?

o3's web research seems to have gotten much, much better than their earlier attempts at using the web, which I didn't like. It seems to browse in a much more human way (trying multiple searches, noticing inconsistencies, following up with more refined searches, etc.).

But I wonder how it would do in a case like yours, where there is conflicting information, and whether it picks up on the variance in the information it finds.

SpicyLemonZest|10 months ago

I just asked o3 how to fill out a Form 8949 for a sale with an incorrect 1099-B basis not reported to the IRS. It said (with no caveats or hedging, and with explicit acknowledgement that it understood the basis was not reported) that you should put the incorrect basis in column (e) with adjustments in (f) and (g), while the IRS instructions are clear (as much as IRS instructions can be...) that in this scenario you should put the correct basis directly in column (e).

vjvjvjvjghv|10 months ago

I think this will be fixed by having LLMs trained not on the whole internet but on well-curated content. To me this feels like the internet in maybe 1993. You see the potential and it’s useful. But a lot of work and experimentation has to be done to work out use cases.

I think it’s weird to reject AI based on its current form.

throwaway743|10 months ago

ChatGPT isn't any good these days. Try switching to Claude or Gemini 2.5 Pro.

calmoo|10 months ago

ChatGPT is still good. Try o3.