jc4p | 9 months ago
It uses OpenAI's realtime API to simulate either a tutoring session (the speaker will revert to English to help you) or a first date or business meeting (the speaker will always speak the target language).
You can see the AI's transcriptions but not your own. That's a limitation of the current OpenAI API, but definitely something I can fix.
The prompts are like this: https://gist.github.com/jc4p/d8b9d121425ec191d62602d8720eeed... and the rest of it is a Next.js app wrapped around the WebRTC connection.
I'm not fully in love with the app, so I'd love any feedback, or to hear whether it works well for you. It doesn't have a lot of features yet (saving context, for one, is missing), and if you bump into the time limit, just open it up in incognito to keep going.
internet_points|9 months ago
The "next level" feature would be to get it to speak even simpler, with some hints about how to reply, for the beginners. I don't know how that would ideally look, but maybe a button to pop up some "key words" or phrases that one could use? (Even so, I found myself using the little I know, so it's obviously somehow working even though my knowledge is extremely basic.)
This is one of the places where I feel LLMs can do something good for the world: giving people a safe playground for getting experience with speaking new languages without the anxiety of performing badly in front of other people – and hopefully making it easier to connect with real people in that language later.
rowborg|9 months ago
One small piece of feedback… There were a couple of times where I asked to learn something, and it asked me to repeat a phrase back, which was great. But when I repeated it back, I knew I hadn't quite nailed it (e.g. perhaps said “un” instead of “una”), and rather than correcting me, it told me I did it perfectly. Maybe some tuning of the prompts would help turn down the natural sycophancy of the model and make it a little more strict.
Keep up the great work!
sampleuser58|9 months ago
"write as if you are a person from {{REGION}}. Modify your language to proficiency level {{PROFICIENCY_LEVEL}}"
That way I could, for example, have it speak as if it's someone using Mexican Spanish vs Madrid Spanish vs Chilean Spanish, etc.
Secondly, you could include the user's transcribed speech as part of the conversation window.
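A rough sketch of how the placeholder substitution could work, in TypeScript since the app is Next.js. The `fillPrompt` helper, the example region, and the "A2" level label are all hypothetical and not taken from the app's actual prompts:

```typescript
// Hypothetical helper: fills {{NAME}} placeholders in a prompt template.
type PromptVars = Record<string, string>;

function fillPrompt(template: string, vars: PromptVars): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, name: string) => {
    const value = vars[name];
    if (value === undefined) {
      // Fail loudly rather than sending a half-filled prompt to the model.
      throw new Error(`Missing prompt variable: ${name}`);
    }
    return value;
  });
}

const template =
  "Write as if you are a person from {{REGION}}. " +
  "Modify your language to proficiency level {{PROFICIENCY_LEVEL}}.";

// Example values are assumptions chosen for illustration only.
const mexicanBeginner = fillPrompt(template, {
  REGION: "Mexico City",
  PROFICIENCY_LEVEL: "A2",
});

console.log(mexicanBeginner);
```

Throwing on a missing variable is a design choice here: a prompt that still contains a literal `{{REGION}}` would probably confuse the model more than a visible error would confuse the developer.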
jc4p|9 months ago
d--b|9 months ago
I've learned Japanese a while back but haven't practised in a long time.
1. It would be awesome if this could transcribe what I just said in Japanese, so I can be sure that it understood me.
2. I don't know kanji that well, so reading is hard. Having a button to make the AI repeat the sentence would be quite useful.
Other than that, I could definitely use something like that for practice
jeffwass|9 months ago
Curious because I’m trying to learn Romanian, and since it’s a less common language there are fewer resources available. So I wasn’t sure if you added Dutch with a minimal amount of effort following the poster’s request.
That said, I gave your app a try with Spanish and it looks pretty good! But I didn’t see a Help page to clarify how I’m “supposed” to interact. E.g. I tried saying “I don’t understand” in English (even though I know how to say that in Spanish) and it responded in Spanish, which may be hard for absolute beginners, although full immersion is a much better way to learn.
I can try playing around more with it to give you some feedback.
iLoveOncall|9 months ago
I tried to use ChatGPT as a "live" translator with my in-laws and I noticed it is extremely bad at language "consistency" and at understanding your intent when multiple languages are involved.
It will sometimes respond in English when you talk to it in the foreign language, it will sometimes assume that a clear instruction like "repeat the last sentence" needs to be translated, etc.
I don't know how the person above is approaching the problem but your experience is consistent with mine and I don't think GenAI models (at least OpenAI ones) are suitable for the task.
jc4p|9 months ago
Please let me know if it works, and I'll definitely work on adding in instructions for the expected interactivity, thank you!
gield|9 months ago
I tried practicing some verb conjugations. The trainer displayed some fill-in-the-blank sentences like "she ... home after class", asking me to conjugate "to walk" in that sentence. However, the audio actually pronounced the full sentence "she walks home after class", giving away the answer.
valleyer|9 months ago
I've used the realtime API for something similar (also related to practicing speaking, though not for foreign languages). I just wanted to comment that the realtime API will definitely give you the user's transcriptions -- they come back as a `conversation.item.input_audio_transcription.completed` server event. I use it in my app for exactly that purpose.
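A minimal sketch of picking that event out of the JSON messages the realtime API sends, assuming the event type is `conversation.item.input_audio_transcription.completed` and that the payload carries `item_id` and `transcript` fields (worth checking against the current API reference). The `parseUserTranscript` helper and the `UserTranscript` type are hypothetical names, and as far as I know input transcription also has to be enabled in the session configuration before these events arrive:

```typescript
// Hypothetical shape for the part of the event we care about.
interface UserTranscript {
  itemId: string;
  text: string;
}

// Assumed event type for completed user-audio transcriptions.
const USER_TRANSCRIPT_EVENT =
  "conversation.item.input_audio_transcription.completed";

// Returns the user's transcript if the raw message is that event, else null.
function parseUserTranscript(raw: string): UserTranscript | null {
  let event: unknown;
  try {
    event = JSON.parse(raw);
  } catch (_err) {
    return null; // Not JSON, so not an event we can use.
  }
  if (typeof event !== "object" || event === null) {
    return null;
  }
  const e = event as Record<string, unknown>;
  const itemId = e["item_id"];
  const transcript = e["transcript"];
  if (e["type"] !== USER_TRANSCRIPT_EVENT) {
    return null;
  }
  if (typeof itemId !== "string" || typeof transcript !== "string") {
    return null;
  }
  return { itemId, text: transcript.trim() };
}

// In a browser this would typically be wired to the WebRTC data channel:
//   dataChannel.addEventListener("message", (msg) => {
//     const t = parseUserTranscript(msg.data);
//     if (t) showUserLine(t.text); // showUserLine is a hypothetical UI hook
//   });
```

Keeping the parsing in a pure function makes it easy to test without a live connection, and any other event type simply falls through as `null`.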
jc4p|9 months ago
Even when the language is correct, a lot of the time the exact text isn't 100% accurate, and even when it is 100% accurate, it comes in slower than the audio output and not in real time. All in all, it's not something I would consider ready to release as a feature in my app.
What I've been thinking about is switching to a full audio in --> transcribe --> send to LLM --> TTS pipeline, in which case I would be able to show the exact input to the model, but that's way more work than just one single OpenAI API call.
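Roughly, that pipeline could be shaped like the sketch below. Every name here (`PipelineStages`, `runTurn`, the stub stages) is hypothetical, and the stubs stand in for real speech-to-text, LLM, and TTS calls, so this only shows the data flow, not a working integration:

```typescript
// Hypothetical stage interface: each step would wrap a real service call.
interface PipelineStages {
  transcribe(audio: Uint8Array): Promise<string>;
  reply(userText: string): Promise<string>;
  synthesize(text: string): Promise<Uint8Array>;
}

interface TurnResult {
  userText: string; // exactly what the model was given, safe to display
  replyText: string;
  replyAudio: Uint8Array;
}

// Runs one conversational turn: audio in -> transcribe -> LLM -> TTS.
async function runTurn(
  audio: Uint8Array,
  stages: PipelineStages
): Promise<TurnResult> {
  const userText = await stages.transcribe(audio);
  const replyText = await stages.reply(userText);
  const replyAudio = await stages.synthesize(replyText);
  return { userText, replyText, replyAudio };
}

// Stub stages for illustration only; no real services are called.
const fakeStages: PipelineStages = {
  transcribe: async () => "hola",
  reply: async (text) => `Dijiste: ${text}`,
  synthesize: async (text) => new Uint8Array(text.length),
};

runTurn(new Uint8Array(0), fakeStages).then((turn) => {
  console.log(turn.userText, "->", turn.replyText);
});
```

The upside is that `userText` is the exact input the model saw, so it can be shown in the UI as-is. The cost is that the three steps run one after another, so each turn will likely feel slower than the single realtime connection.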
sampleuser58|9 months ago
I will probably use something like this for language practice.
ciaovietnam|9 months ago
jc4p|9 months ago
fhatfield|9 months ago
altern8|9 months ago