vikasnair | 2 years ago
Some common ones are fraud and churn detection for financial institutions or e-commerce sites (both tabular classification examples). For these types of tasks in particular it's very important to guard against bias and false negatives, so they use us to set up wide test nets that give them assurance their models are working properly before they hit production (and to monitor them post-deployment).
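To make that concrete, here's a rough sketch of the kind of pre-production check I mean for a binary fraud classifier. This is plain Python for illustration, not our actual API: the function names, the toy data, and the thresholds (`max_fnr`, `max_gap`) are all made up for the example.

```python
from collections import defaultdict


def false_negative_rate(y_true, y_pred):
    """Share of actual positives (label 1) that the model predicted as 0."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not predictions_for_positives:
        return 0.0  # no positives in this slice, so nothing can be missed
    misses = sum(1 for p in predictions_for_positives if p == 0)
    return misses / len(predictions_for_positives)


def fnr_by_group(y_true, y_pred, groups):
    """False negative rate computed separately for each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: false_negative_rate(ts, ps) for g, (ts, ps) in buckets.items()}


def check_model(y_true, y_pred, groups, max_fnr=0.25, max_gap=0.2):
    """Return a list of failed checks; an empty list means all checks passed.

    max_fnr and max_gap are hypothetical thresholds chosen for this example.
    """
    failures = []
    overall = false_negative_rate(y_true, y_pred)
    if overall > max_fnr:
        failures.append(f"overall FNR {overall:.2f} exceeds {max_fnr:.2f}")
    per_group = fnr_by_group(y_true, y_pred, groups)
    gap = max(per_group.values()) - min(per_group.values())
    if gap > max_gap:
        failures.append(f"FNR gap across groups {gap:.2f} exceeds {max_gap:.2f}")
    return failures


# Toy validation set: 1 = fraud, 0 = legitimate; groups are a made-up segment.
y_true = [1, 1, 1, 1, 0, 0, 1, 1]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

failures = check_model(y_true, y_pred, groups)
for message in failures:
    print("FAILED:", message)
```

The first check catches a model that misses too much fraud overall, and the second catches one that misses much more of it for one segment than another. In practice you'd run checks like these on a held-out set before every deploy and again on production data.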
Another example is Zuma (https://www.getzuma.com), a startup building an AI-driven chatbot that uses us to track their experiments and improve the accuracy of their NLP intent classification model.
Of course, we're also building out support for evaluating LLMs. Because this is an open problem, we've been spending a lot of time interviewing people in the space who are building these models (please reach out if this is you!).