Well, that means the AI is garbage. They'll eventually train it to answer this specific question, and then it will perform worse in some other aspect. Wash, rinse, repeat, and eventually they'll claim the new frontier model is the best yet on carwash tests.
keeda|5 days ago
Not necessarily. Simply asking models to "check your assumptions" -- note, without specifying which assumptions! -- overcomes a lot of these gotcha questions. The reason it's not in their system prompts by default is, I think, just cost optimization: https://news.ycombinator.com/item?id=47040530
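As an illustration of what that technique amounts to, here is a minimal sketch: prepend a generic self-check instruction to the user's prompt before sending it to a model. The instruction wording and the `with_assumption_check` helper are my own, not from the linked thread, and the actual model call is left out.

```python
# Sketch of the "check your assumptions" technique: a generic instruction
# (no specific assumptions named) is prepended to whatever the user asks.

ASSUMPTION_CHECK = (
    "Before answering, list the assumptions the question invites you to make, "
    "and check each one against the text of the question itself."
)

def with_assumption_check(user_prompt: str) -> str:
    """Return the user prompt with the generic self-check instruction prepended."""
    return f"{ASSUMPTION_CHECK}\n\n{user_prompt}"

# Example: the classic gotcha question, wrapped with the self-check.
prompt = with_assumption_check(
    "My car is 10 meters from the car wash. How long will it take to walk there?"
)
```

The point of the generic wording is that it costs the same few tokens for every request, which is why (per the comment above) leaving it out of default system prompts looks like a cost optimization.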
davorak|5 days ago
> there are people out there who think it's trash because we can trick it if we ask questions in weird ways.
Some of this sentiment comes from wanting AI to be predictable, and for me, stumbling into questions that current models interpret oddly is not uncommon. There are a bunch of rules of thumb that can help when you run into cases like this, but no guarantee that they will work, that the problem will stay solved after a model update, or that the fix carries across models.
steveBK123|5 days ago
This issue is compounded by the lack of probabilities in the answers, despite the machines ultimately being probabilistic.
Notice a human in a real conversation will politely ignore extra info (the distance to car wash) or ask clarifying questions (where is the car?).
Even non-STEM people answer using probabilistic terms casually (almost certainly / most likely / probably / possibly / unlikely).
I suspect some of this is to minimize token usage in the fixed-monthly-price chat plans, because back-and-forth clarification would cost more tokens... but maybe I'm too cynical.
bigbuppo|5 days ago
We are the ones fooling ourselves into believing there's more intelligence in these systems than they really have. At the end of the day, it's just an impressive parlor trick.