alandu|1 year ago
We have not come across any benchmark dataset that's actually worth evaluating on, because the questions are not representative of real-world enterprise problems. They don't reflect the amount of context needed to answer domain- or business-specific questions accurately.
HanClinto|1 year ago
https://github.com/defog-ai/sql-eval
alandu|1 year ago