It's using the Llama 2 13B model currently, running on GPUs in the basement. Llama 2 is weak in languages other than English, so we use machine translation, which is mostly OK but occasionally gets weird when the model isn't aware that it's not 'speaking' English. The reason for not using ChatGPT was that when we started it had a very stuffy character and didn't like to role-play (I think it's better now; text-davinci-003 was fine but much more expensive). Running models yourself can potentially get costs way down. ChatGPT is used for the correction feature.
I check the logs occasionally; the feature doesn't have a lot of users. There are two issues I can see: it's hard to get the model to speak in simple language with a prompt alone (a fine-tune is probably needed), and I haven't found good TTS for many languages (comparable to the Microsoft APIs) that we can run on our own hardware. TTS is much better in Edge than in Chrome, btw.
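The translate-in, translate-out loop described above can be sketched roughly like this. This is a hedged illustration, not the app's actual code: `translate` and `chat_completion` are hypothetical stand-ins for a real MT engine and a local Llama 2 endpoint, and the system prompt is one guess at how to reduce the "doesn't know it isn't speaking English" weirdness.

```python
# Sketch of a translation-wrapped chat turn for an English-centric model.
# All helper names here are placeholders, not the app's real API.

def translate(text: str, source: str, target: str) -> str:
    # Placeholder MT call; a real system would call an MT engine here.
    if source == target:
        return text
    return f"[{target}] {text}"  # stub: just tag text with the target language

def chat_completion(system: str, user: str) -> str:
    # Placeholder LLM call; stands in for a self-hosted Llama 2 13B endpoint.
    return f"Reply to: {user}"

def chat_turn(user_text: str, user_lang: str) -> str:
    """Translate user input to English, query the model, translate back."""
    english_in = translate(user_text, source=user_lang, target="en")
    # Telling the model its output will be machine-translated may help it
    # avoid English-only wordplay that breaks after translation.
    system = ("You are a friendly conversation partner. Your replies are "
              "machine-translated for the user, so avoid English-specific "
              "idioms and wordplay.")
    english_out = chat_completion(system, english_in)
    return translate(english_out, source="en", target=user_lang)
```

The stubs keep the control flow visible: the model only ever sees and produces English, and the user only ever sees their own language.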
cwbuilds|2 years ago
I've heard https://www.loora.ai/ is good, haven't tried it yet.