Please stop spreading this "AI evals" terminology. "evals" is what providers like OpenAI and Anthropic do with their models. If you wrote a test for a feature that uses an LLM, it's just a test, there's no need to say "evals." Having a separate term only further confuses people who already have no idea what that actually means.
alexhans|16 hours ago
Words win when they're used. Just because Agent Skills is just a pattern for standarization and saving context doesn't mean it wasn't incredibly useful.
Think beyond software developers by trade. Think beyond people those who realized they needed tests instead of those who thought "the models will just get smarter" and "they told me there's guardrails".