This is a very tiring criticism. Yes, it's true. But it's an implementation detail (tokenization) that has very little bearing on the practical utility of these tools. How often are you relying on LLMs to count letters in words?
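For what it's worth, here's a minimal sketch of what the model actually receives, assuming OpenAI's tiktoken package (the exact splits depend on the encoding and are illustrative):

    # The model operates on integer token ids, not characters, so "count
    # the r's in strawberry" probes units the model never directly sees.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    ids = enc.encode("strawberry")
    pieces = [enc.decode([i]) for i in ids]
    print(ids)     # a short list of integer token ids
    print(pieces)  # e.g. ['str', 'aw', 'berry'] -- no per-letter view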
The problem with these "implementation details" is that we keep finding them! After this one, it couldn't locate a seahorse emoji without freaking out. At some point we need a test: there are two drinks before you. One is water; the other is whatever the LLM thought you might like to drink after it finished refactoring the codebase. Choose wisely.
A useful analogy is asking someone who is colorblind how many colors are on a sheet of paper. What you're probing isn't reasoning; it's perception. If you can't see the input, you can't reason about the input.
No, it's an example showing that LLMs still use a tokenizer, which in practice is rarely an impediment for any task (even many where you would expect it to be, like searching a codebase for variants of a variable name in different cases).
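As a rough illustration of that last task (a hypothetical helper, not anything from the thread): generating the common case variants of an identifier is pure character-level string work, yet models usually manage it despite never seeing the underlying characters:

    # Hypothetical sketch: derive snake_case, camelCase, PascalCase and
    # kebab-case variants of an identifier, the task described above.
    import re

    def case_variants(name: str) -> set[str]:
        # split on underscores/hyphens and on lower->upper camelCase boundaries
        words = [w.lower() for w in re.split(r"[_\-]|(?<=[a-z])(?=[A-Z])", name) if w]
        snake = "_".join(words)
        pascal = "".join(w.capitalize() for w in words)
        camel = words[0] + pascal[len(words[0]):]
        kebab = "-".join(words)
        return {snake, camel, pascal, kebab}

    print(case_variants("maxRetryCount"))
    # {'max_retry_count', 'maxRetryCount', 'MaxRetryCount', 'max-retry-count'}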
It's an example of a simple task. How often are you relying on LLMs to complete simple tasks?
https://chatgpt.com/share/6941df90-789c-8005-8783-6e1c76cdfc...