ainesh93
|
2 years ago
|
on: Show HN: Natural-SQL-7B, a strong text-to-SQL model
Hard to claim success with "complex" questions if you don't account for business context and organizational nuances. For example, "active" listings on Redfin may be defined by a combination of days on market, last open house, last update, etc., rather than by a Boolean flag called "is_active". How can we expect models to generate correct SQL at the enterprise level without a support structure of business context? The model can only be so good.
ainesh93
|
2 years ago
|
on: Fine-tuning GPT-3.5-turbo for natural language to SQL
> You also need a bunch of information about the real business that the data is describing.
While the article focuses on fine-tuning GPT-3.5-turbo, how you use the text-to-SQL engine within the architecture of your overall solution is for you to decide. Pulling this business context from vectorized context stores into the actual prompt would be a step in the right direction.
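A rough sketch of what that could look like, in Python. This is not how the article or any particular product does it: `embed` is a toy bag-of-words stand-in for a real embedding model, `CONTEXT_STORE` is a hypothetical in-memory list standing in for a vector store, and the prompt layout is just one plausible choice.

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in for a real embedding model: bag-of-words token counts.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical business-context snippets; a real system would keep these
# in a vector store and index them by embedding.
CONTEXT_STORE = [
    "An active listing has days_on_market under 90 and an update in the last 14 days.",
    "Revenue is reported net of refunds.",
]


def build_prompt(question, k=1):
    """Retrieve the k most similar context snippets and prepend them to the question."""
    q = embed(question)
    ranked = sorted(CONTEXT_STORE, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    context = "\n".join(ranked[:k])
    return f"Business context:\n{context}\n\nQuestion: {question}\nSQL:"


print(build_prompt("How many active listings are there?"))
```

The retrieved snippet travels with the question, so the engine sees the business definition at generation time instead of having to guess it.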
ainesh93
|
2 years ago
|
on: Fine-tuning GPT-3.5-turbo for natural language to SQL
I think the focus here isn't necessarily on compute cost. When companies hire data scientists or analysts, they're niche-skilled and expensive. If those people spend 50-60% of their time fielding ad-hoc questions from various people in the org, the cost of that employee's time (and the money spent on them doing menial tasks that are a waste of their skillset) is the biggest factor.
ainesh93
|
2 years ago
|
on: Fine-tuning GPT-3.5-turbo for natural language to SQL
Although Spider is better known in the text-to-SQL world, you're right that BIRD may provide a better testing ground. Comparing against the current leaderboard on that benchmark is on the docket!
ainesh93
|
2 years ago
|
on: Show HN: Dataherald AI – Natural Language to SQL Engine
Yes, opening the connection to the DB read-only would also work. That's what we're planning on doing.
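As an illustration of the idea only, here is a minimal sketch using Python's built-in `sqlite3` module, where a read-only connection is requested with the `mode=ro` URI parameter. SQLite is an assumption made to keep the example self-contained; other databases have their own mechanisms (for example, a role with only SELECT privileges), and the table and file names below are made up.

```python
import os
import sqlite3
import tempfile
from pathlib import Path

# Create a small throwaway database with a normal read-write connection.
path = Path(tempfile.mkdtemp()) / "demo.db"
rw = sqlite3.connect(path)
rw.execute("CREATE TABLE t (x INTEGER)")
rw.execute("INSERT INTO t VALUES (1)")
rw.commit()
rw.close()

# Reopen the same file read-only via a SQLite URI.
ro = sqlite3.connect(f"{path.as_uri()}?mode=ro", uri=True)
rows = ro.execute("SELECT x FROM t").fetchall()

# Any generated statement that tries to write is rejected by the database itself.
try:
    ro.execute("DELETE FROM t")
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True

print(rows, write_blocked)
ro.close()
```

The guard sits at the connection level, so it holds even if the generated SQL is never inspected before it runs.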
ainesh93
|
2 years ago
|
on: Show HN: Dataherald AI – Natural Language to SQL Engine
Hit the nail on the head! Not only is the context length a limitation, but the speed of response gets impacted as well.
With a human in the loop, even providing a "mostly" correct SQL query that takes a swing at the correct joins between relevant tables reduces the data practitioner's work significantly. Of course, as more questions are asked, the tool gets better at writing the SQL. Almost like a human in a Database Management and SQL class...
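One way a tool could get better with use is to keep the question/SQL pairs a human has verified and reuse the closest ones as examples for new questions. The sketch below is an assumption about how that might work, not a description of any specific product: `VerifiedExamples` is a hypothetical helper, string similarity from the standard library `difflib` stands in for embedding search, and the stored queries are invented.

```python
from difflib import SequenceMatcher


class VerifiedExamples:
    """Hypothetical store of human-approved (question, SQL) pairs."""

    def __init__(self):
        self.pairs = []

    def add(self, question, sql):
        # Called once a data practitioner has reviewed and approved the SQL.
        self.pairs.append((question, sql))

    def closest(self, question, k=1):
        # Rank stored pairs by plain string similarity to the new question.
        def score(pair):
            return SequenceMatcher(None, question.lower(), pair[0].lower()).ratio()

        return sorted(self.pairs, key=score, reverse=True)[:k]


store = VerifiedExamples()
store.add(
    "How many listings are active?",
    "SELECT COUNT(*) FROM listings WHERE days_on_market <= 90",
)
store.add(
    "What is the average sale price by city?",
    "SELECT city, AVG(price) FROM sales GROUP BY city",
)

# The closest verified pair can be placed in the prompt as a worked example.
print(store.closest("How many listings are active in Austin?"))
```

Each approved answer adds another worked example, which is roughly the classroom analogy: more solved problems to draw on.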
ainesh93
|
2 years ago
|
on: Text-to-SQL Benchmarks and the Current State-of-the-Art
Medium article presenting background on the most popular text-to-SQL benchmark datasets and the current performance of text-to-SQL algorithms.
ainesh93
|
3 years ago
|
on: Show HN: Dataherald (YC W21) Live Data, Ready to Visualize in 3 Clicks
What a quick way to visualize and get the most telling insights from cool public data sources! I've always known this data is out there, but now I have a medium to understand what it's meant to show. Super cool tool!