ainesh93's comments

ainesh93 | 2 years ago | on: Show HN: Natural-SQL-7B, a strong text-to-SQL model

Hard to claim success with "complex" questions if you don't account for business context and organizational nuances. For example, "active" listings on Redfin may be defined by a combination of days on market, last open house, last update, etc., rather than by a Boolean flag called "is_active". How can we expect models to generate correct SQL at the enterprise level without providing a support structure of business context? The model can only be so good.
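To make the "active" example concrete, here is a minimal sketch of such a definition written as SQL and run against an in-memory SQLite table. Everything in it is an assumption for illustration: the table and column names, the thresholds (90, 30, and 45 days), and the sample rows are made up, and this is not how Redfin defines an active listing.

```python
import sqlite3

# Hypothetical business rule: "active" is derived from several columns,
# not read from an is_active flag. All thresholds are invented.
ACTIVE_LISTING_SQL = """
SELECT listing_id
FROM listings
WHERE days_on_market <= 90
  AND julianday(:today) - julianday(last_update) <= 30
  AND julianday(:today) - julianday(last_open_house) <= 45
ORDER BY listing_id
"""

def demo_connection():
    """Build a throwaway in-memory database with made-up listings."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE listings ("
        "listing_id INTEGER, days_on_market INTEGER, "
        "last_open_house TEXT, last_update TEXT)"
    )
    conn.executemany(
        "INSERT INTO listings VALUES (?, ?, ?, ?)",
        [
            (1, 12, "2024-01-20", "2024-01-28"),   # recent on every signal
            (2, 200, "2024-01-25", "2024-01-29"),  # too long on market
            (3, 30, "2023-06-01", "2023-06-15"),   # stale update and open house
        ],
    )
    return conn

def active_listings(conn, today):
    """Return IDs of listings that satisfy the hypothetical rule as of `today`."""
    rows = conn.execute(ACTIVE_LISTING_SQL, {"today": today})
    return [row[0] for row in rows]
```

A model that only sees the schema has no way to guess these thresholds, which is why the definition has to be supplied from outside.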

ainesh93 | 2 years ago | on: Fine-tuning GPT-3.5-turbo for natural language to SQL

> You also need a bunch of information about the real business that the data is describing.

While the article focuses on fine-tuning GPT-3.5-turbo, how you use the text-to-SQL engine within the architecture of your overall solution is for you to decide. Retrieving this business context from a vectorized context store and including it in the actual prompt would be a step in the right direction.
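A rough sketch of that idea, assuming a tiny in-memory store of business definitions. The `embed` function is a toy bag-of-words stand-in for a real embedding model, and the store contents, function names, and prompt layout are all hypothetical, not taken from the article.

```python
import math
import re
from collections import Counter

# Hypothetical business definitions that a context store might hold.
CONTEXT_STORE = [
    "An active listing has been updated in the last 30 days and is under 90 days on market.",
    "Revenue is recognized when an order ships, not when it is placed.",
    "A churned customer has had no login for 60 days.",
]

def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, store, k=1):
    """Return the k stored definitions most similar to the question."""
    query = embed(question)
    ranked = sorted(store, key=lambda doc: cosine(query, embed(doc)), reverse=True)
    return ranked[:k]

def build_prompt(question, store, k=1):
    """Place the retrieved business context ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, store, k))
    return f"Business context:\n{context}\n\nQuestion: {question}\nSQL:"
```

The resulting string would then be sent to whichever text-to-SQL model you use, fine-tuned or not.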

ainesh93 | 2 years ago | on: Fine-tuning GPT-3.5-turbo for natural language to SQL

I think the focus here isn't necessarily on compute cost. Data scientists and analysts have niche skills and are expensive to hire. If they spend 50-60% of their time fielding ad-hoc questions from various people in the org, the cost of that time (money spent on menial tasks that waste their skillset) is the biggest factor.

ainesh93 | 2 years ago | on: Fine-tuning GPT-3.5-turbo for natural language to SQL

Although Spider is better known in the text-to-SQL world, you're right that BIRD may provide a better testing ground. Comparing against the current leaderboard on that benchmark is on the docket!

ainesh93 | 2 years ago | on: Show HN: Dataherald AI – Natural Language to SQL Engine

Hi, you can find updated documentation on connecting to BigQuery here: https://dataherald.readthedocs.io/en/latest/api.database.htm.... We have also updated the README.

ainesh93 | 2 years ago | on: Show HN: Dataherald AI – Natural Language to SQL Engine

Yes, opening the connection to the DB read-only would also work. That's what we're planning on doing.
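As an illustration of what a read-only connection buys you, here is a small sketch using SQLite from the Python standard library. SQLite is only a stand-in chosen for the example; the helper names and the demo table are hypothetical, and this is not a description of how Dataherald implements it.

```python
import os
import sqlite3
import tempfile

def open_read_only(path):
    """Open an existing SQLite file read-only via a URI filename (mode=ro)."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)

def demo():
    """Create a scratch database, then show reads pass and writes fail."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # Set up the data with a normal read-write connection.
    rw = sqlite3.connect(path)
    rw.execute("CREATE TABLE t (x INTEGER)")
    rw.execute("INSERT INTO t VALUES (1)")
    rw.commit()
    rw.close()

    ro = open_read_only(path)
    rows = ro.execute("SELECT x FROM t").fetchall()
    try:
        # A generated query that tries to modify data is rejected by the engine.
        ro.execute("DROP TABLE t")
        write_blocked = False
    except sqlite3.OperationalError:
        write_blocked = True
    ro.close()
    return rows, write_blocked
```

With this approach the database itself refuses destructive statements, so the safeguard does not depend on inspecting the generated SQL. For a server database, the usual equivalent would be connecting as a user that has only read privileges.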

ainesh93 | 2 years ago | on: Show HN: Dataherald AI – Natural Language to SQL Engine

Hit the nail on the head! Not only is the context length a limitation, but the speed of response gets impacted as well.

With a human in the loop, even providing "mostly" correct SQL that takes a swing at the correct joins between relevant tables reduces the data practitioner's work significantly. Of course, as more questions are asked, the tool gets better at writing the SQL. Almost like a human in a Database Management and SQL class...

ainesh93 | 2 years ago | on: Text-to-SQL Benchmarks and the Current State-of-the-Art

A Medium article presenting background on the most popular text-to-SQL benchmark datasets and the current performance of text-to-SQL algorithms.

ainesh93 | 3 years ago | on: Show HN: Dataherald (YC W21) Live Data, Ready to Visualize in 3 Clicks

What a quick way to visualize and get the most telling insights from cool public data sources! I've always known this data is out there, but now I have a medium to understand what it's meant to show. Super cool tool!