gforce_de | 1 month ago
What seems to be missing: I cannot see the answers from the different models, so one has to rely on the "correctness" score.
Another minor thing: the scoring seems to be hardcoded to 50% correctness, 30% cost, 20% latency. That is OK, but in my case I care more about correctness and don't care about latency.
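To make the weighting point concrete, here is a minimal sketch of what configurable weights could look like. I don't know how the tool actually computes its score; the function name `combined_score` and the assumption that each sub-score is already normalized to the range 0..1 (higher is better) are mine.

```python
def combined_score(correctness, cost, latency, weights=(0.5, 0.3, 0.2)):
    """Weighted average of three sub-scores.

    Assumption: every sub-score is already normalized to 0..1 with
    higher meaning better (so a cheap or fast model has a HIGH cost
    or latency score). The default weights mirror the 50/30/20 split
    that the tool appears to use.
    """
    w_correctness, w_cost, w_latency = weights
    total = w_correctness + w_cost + w_latency
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = (
        w_correctness * correctness
        + w_cost * cost
        + w_latency * latency
    )
    # Dividing by the total lets callers pass weights that do not sum to 1.
    return weighted / total
```

With something like this, `combined_score(c, k, l, weights=(0.8, 0.2, 0.0))` would ignore latency completely, which is what I would want.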
Wow! This was my test prompt:
You are an expert linguist and translator engine.
Task: Translate the input text from English into the languages listed below.
Output Format: Return ONLY a valid, raw JSON object.
Do not use Markdown formatting (no ```json code blocks).
Do not add any conversational text.
Keys: Use the specified ISO 639-1 codes as keys.
Target Languages and Codes:
- English: "en" (Keep original or refine slightly)
- Mandarin Chinese (Simplified): "zh"
- Hindi: "hi"
- Spanish: "es"
- French: "fr"
- Arabic: "ar"
- Bengali: "bn"
- Portuguese: "pt"
- Russian: "ru"
- German: "de"
- Urdu: "ur"
Input text to translate:
"A smiling boy holds a cup as three colorful lorikeets perch on his arms and shoulder in an outdoor aviary."