top | item 38109929

AdrenalinMd | 2 years ago

But a lot of the question/responses could be trivially cached. No need to run an expensive LLM every time for the same basic "how are you today?" prompts; it only has to be cached once.
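A minimal sketch of the idea, assuming responses are keyed on a normalized form of the prompt. `expensive_llm_call` here is a hypothetical stand-in for the real model call, not any particular API:

```python
import re

_cache = {}

def normalize(prompt: str) -> str:
    # Lowercase and strip punctuation so trivial variations
    # ("How are you today?" vs "how are you today") share one entry.
    return re.sub(r"[^a-z0-9 ]+", "", prompt.lower()).strip()

def cached_answer(prompt: str, expensive_llm_call):
    key = normalize(prompt)
    if key not in _cache:
        # Only pay for the LLM on a cache miss.
        _cache[key] = expensive_llm_call(prompt)
    return _cache[key]
```

This only catches near-identical phrasings, which is the limitation the reply below gets at.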

poisonborz | 2 years ago

Caching static requests alone is hard enough. With all the ways you can phrase this question, welcome to the most complicated caching backend ever. Caching exact matches wouldn't help much for the same reason.
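The paraphrase problem is why "semantic" caches match on similarity rather than exact text. A hedged sketch of the approach, using a toy bag-of-words cosine similarity so it stays self-contained (real systems would use an embedding model; the threshold value is an illustrative assumption):

```python
import math
from collections import Counter

def _vec(text: str) -> Counter:
    # Toy stand-in for an embedding: word-count vector.
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries = []  # list of (vector, cached response)

    def get(self, prompt: str):
        qv = _vec(prompt)
        best = max(self.entries, key=lambda e: _cosine(qv, e[0]), default=None)
        if best and _cosine(qv, best[0]) >= self.threshold:
            return best[1]
        return None  # cache miss: caller falls back to the LLM

    def put(self, prompt: str, response: str):
        self.entries.append((_vec(prompt), response))
```

Tuning the threshold is exactly the hard part: too loose and users get a cached answer to a different question, too strict and almost nothing hits.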

dartos | 2 years ago

Then you’re kind of defeating the purpose of an LLM.

Fixed responses for common queries are what we have now.

Not to mention that LLMs tend to be very wordy right now. I’d hate to wait 20 seconds to hear my phone say “As a voice assistant I’m not aware of the exact menu of the Thai restaurant on 2nd, but I have opened a Google search for it and found the following results.

…”

tpmx | 2 years ago

"You are a Siri-style voice assistant. Be succinct and terse, but polite and helpful." seems to work okay with ChatGPT.