top | item 37715469


vvrm | 2 years ago

The fine-grained results look like:

- 1444x faster for single character prefixes

- 252x faster for two character prefixes

- 55x faster for three character prefixes

- ~20x faster for 4 and 5 character prefixes

- <= 5x faster for longer prefixes

I used to work on a production auto-complete system operating at over 100k peak QPS. For prefixes of length one and two we would not even bother hitting the server, purely for quality reasons rather than latency or throughput considerations. Btw, for prefixes of up to 3 characters, you could store everything in an in-memory hash map. A 20x speedup on length 4 and 5 prefixes is still very impressive, but it is not quite a 1000x speedup either.
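To make the hash map point concrete, here is a rough sketch of precomputing the top suggestions for every prefix of up to 3 characters. The names (`build_prefix_table`, `suggest`), the `TOP_K` cutoff, and the sample counts are all made up for illustration; this is not the system described above, just one way the idea could look.

```python
from collections import defaultdict

MAX_PREFIX_LEN = 3  # assumption: precompute only for prefixes up to 3 chars
TOP_K = 5           # assumption: number of suggestions kept per prefix


def build_prefix_table(query_counts, max_len=MAX_PREFIX_LEN, top_k=TOP_K):
    """Map every prefix of length <= max_len to its top_k most frequent queries."""
    buckets = defaultdict(list)
    for query, count in query_counts.items():
        for n in range(1, min(max_len, len(query)) + 1):
            buckets[query[:n]].append((count, query))
    # Most frequent first; ties are broken alphabetically for stable output.
    return {
        prefix: [q for _, q in sorted(items, key=lambda cq: (-cq[0], cq[1]))[:top_k]]
        for prefix, items in buckets.items()
    }


def suggest(table, prefix, max_len=MAX_PREFIX_LEN):
    """Return precomputed suggestions, or None when the prefix is too long."""
    if len(prefix) > max_len:
        return None  # caller would fall back to the full index or server
    return table.get(prefix, [])


# Hypothetical query-log counts, for illustration only.
counts = {"the": 50, "then": 20, "this": 30, "tree": 10, "apple": 5}
table = build_prefix_table(counts)
```

With this table, `suggest(table, "th")` is a single dictionary lookup, while a 4-character prefix such as `"thes"` returns `None` to signal that the real index should handle it.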


jjice|2 years ago

I also worked on a production autocomplete feature for a web app a while ago, and I couldn't agree more with the quality sentiment. One or two characters is almost never enough to give a meaningful result. Using the user's history or searches from similar users is much more effective than trying to guess what someone meant by "th".

ComputerGuru|2 years ago

> One or two characters is almost never enough to give a meaningful result.

TFA acknowledges that and mentions the exception:

> While a prefix of length=1 is not very useful for the Latin alphabet, it does make sense for CJK languages

Kalabasa|2 years ago

How does that fare for non-English queries? E.g. are two chars still not enough for Chinese languages?