I love the user experience of your product. You give a free demo with results within 5 minutes and then encourage the customer to "sign in" for more than 10 prompts.
Presumably that'll be some sort of funnel for a paid upload of prompts.
What seems missing:
I cannot see the answers from the different models.
One has to rely on the "correctness" score.
Another minor thing: the scoring seems hardcoded to:
50% correctness, 30% cost, 20% latency - which is OK,
but in my case I care more about correctness and don't care about latency.
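To make the request concrete, here is a minimal sketch of what user-adjustable weights could look like. It is not how the product actually computes its score: the function name `weighted_score` is hypothetical, and it assumes each metric is already normalized to the range 0 to 1 with higher meaning better (so a cheap, fast model gets high cost and latency scores).

```python
def weighted_score(correctness, cost, latency, weights=(0.5, 0.3, 0.2)):
    """Combine three normalized scores (0..1, higher is better) into one.

    The default weights mirror the 50/30/20 split mentioned above.
    Weights are renormalized, so they do not have to sum to 1.
    """
    w_correct, w_cost, w_latency = weights
    total = w_correct + w_cost + w_latency
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    combined = w_correct * correctness + w_cost * cost + w_latency * latency
    return combined / total


# Default split: correctness 50%, cost 30%, latency 20%.
default = weighted_score(0.8, 0.5, 0.1)

# "I don't care about latency": put its weight on correctness instead.
no_latency = weighted_score(0.8, 0.5, 0.1, weights=(0.8, 0.2, 0.0))
```

With a setting like this, a slow but accurate model would no longer be penalized for its latency.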
Wow! This was my test prompt:
You are an expert linguist and translator engine.
Task: Translate the input text from English into the languages listed below.
Output Format: Return ONLY a valid, raw JSON object.
Do not use Markdown formatting (no ```json code blocks).
Do not add any conversational text.
Keys: Use the specified ISO 639-1 codes as keys.
Target Languages and Codes:
- English: "en" (Keep original or refine slightly)
- Mandarin Chinese (Simplified): "zh"
- Hindi: "hi"
- Spanish: "es"
- French: "fr"
- Arabic: "ar"
- Bengali: "bn"
- Portuguese: "pt"
- Russian: "ru"
- German: "de"
- Urdu: "ur"
Input text to translate:
"A smiling boy holds a cup as three colorful lorikeets perch on his arms and shoulder in an outdoor aviary."
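Since the individual answers are not visible, one way to check a response against this prompt yourself is a small format validator. This is only a sketch under my own assumptions, not the tool's "correctness" metric: `check_translation_output` and `EXPECTED_CODES` are hypothetical names, and it only checks the output format (raw JSON, the right keys, non-empty strings), not translation quality.

```python
import json

# The ISO 639-1 codes requested in the prompt above.
EXPECTED_CODES = {"en", "zh", "hi", "es", "fr", "ar", "bn", "pt", "ru", "de", "ur"}


def check_translation_output(raw):
    """Return a list of format problems; an empty list means the format is OK."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Markdown fences or conversational text end up here.
        return [f"not valid raw JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    keys = set(data)
    missing = sorted(EXPECTED_CODES - keys)
    extra = sorted(keys - EXPECTED_CODES)
    if missing:
        problems.append(f"missing keys: {missing}")
    if extra:
        problems.append(f"unexpected keys: {extra}")
    for code in sorted(keys & EXPECTED_CODES):
        value = data[code]
        if not isinstance(value, str) or not value.strip():
            problems.append(f"value for '{code}' is not a non-empty string")
    return problems
```

Calling `check_translation_output(response_text)` on each model's raw reply would at least show which models ignored the "ONLY a valid, raw JSON object" instruction.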
gforce_de|1 month ago
iFire|1 month ago
Here's a bug report: when switching the model group, the API hangs in private mode.
iFire|1 month ago
lorey|1 month ago