Photon – a live demo of a natural language interface to databases

48 points | atrudeau | 5 years ago | naturalsql.com

26 comments

[+] gagege|5 years ago|reply
"What is the price of Ginger Beer?"

It couldn't translate that into SQL.

"What is the price in dollars of Ginger Beer?"

> SELECT Catalog_Contents.price_in_dollars FROM Catalog_Contents WHERE Catalog_Contents.price_in_dollars = "Ginger Beer"

Nope.

"What is the price in dollars of catalog entry name Ginger Beer?"

> SELECT Catalog_Contents.price_in_dollars FROM Catalog_Contents WHERE Catalog_Contents.catalog_entry_name = "Ginger Beer"

Cool! You have to be more specific than I was hoping, but this is still pretty neat.

[+] vanusa|5 years ago|reply
"What is the price in dollars of catalog entry name Ginger Beer?"

OK - but that's not "natural language".

[+] thom|5 years ago|reply
There’s a lot missing here. During the brief and unhappy period of my life when I worked in this area, we had quite a lot of luck just generating semantics based on WordNet in the domain in question. So here you can’t successfully ask for “French wines” even though we know what a country is and that French is a correct adjectival form. Same with things like “oldest wine”: that’s an easy-to-derive superlative based on info you already have. We got some mileage out of this old-fashioned tree-based system at the core, with fuzzier machine learning stuff at the edges.
[+] deadfa11|5 years ago|reply
> What singer sang in the most stadiums?

    SELECT singer.Name FROM singer JOIN singer_in_concert ON singer.Singer_ID = singer_in_concert.Singer_ID GROUP BY singer.Singer_ID ORDER BY COUNT(*) DESC LIMIT 1
It is close... sort of? It figured out it needed to join, group, and order, but it only drew the relation to the concert, not the venue. Correctness seems a huge challenge here. Even knowing SQL, I feel I'm double checking my results at times. But I can see how this might be incredibly useful someday for Salesforce if there's confidence in the results.
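
To make the gap concrete, here is a minimal sketch of what a stadium-aware version could look like. The `concert` table and its `Stadium_ID` column are my assumption about the demo schema (they are not shown in the generated query), and the rows are made-up toy data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed, simplified schema; only singer and singer_in_concert
-- appear in the generated query above.
CREATE TABLE singer (Singer_ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE concert (concert_ID INTEGER PRIMARY KEY, Stadium_ID INTEGER);
CREATE TABLE singer_in_concert (concert_ID INTEGER, Singer_ID INTEGER);
INSERT INTO singer VALUES (1, 'Ann'), (2, 'Bob');
-- Ann: three concerts, all in stadium 100. Bob: two concerts, two stadiums.
INSERT INTO concert VALUES (10, 100), (11, 100), (12, 100), (13, 200), (14, 300);
INSERT INTO singer_in_concert VALUES (10, 1), (11, 1), (12, 1), (13, 2), (14, 2);
""")

# The query the demo generated: ranks singers by number of concerts.
generated = """
SELECT singer.Name FROM singer
JOIN singer_in_concert ON singer.Singer_ID = singer_in_concert.Singer_ID
GROUP BY singer.Singer_ID ORDER BY COUNT(*) DESC LIMIT 1
"""

# One possible fix: join through concert and count distinct stadiums.
by_stadium = """
SELECT singer.Name FROM singer
JOIN singer_in_concert ON singer.Singer_ID = singer_in_concert.Singer_ID
JOIN concert ON singer_in_concert.concert_ID = concert.concert_ID
GROUP BY singer.Singer_ID
ORDER BY COUNT(DISTINCT concert.Stadium_ID) DESC LIMIT 1
"""

most_concerts = conn.execute(generated).fetchone()[0]
most_stadiums = conn.execute(by_stadium).fetchone()[0]
print(most_concerts, most_stadiums)
```

On this toy data the two queries disagree, which is exactly the kind of silent error that is hard to spot without reading the SQL.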
[+] samatman|5 years ago|reply
> how many teachers older than thirty?

> SELECT COUNT(*) FROM teacher WHERE teacher.Age > "thirty"

Not a bad idea. A good idea, maybe. Implementation needs some work.
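
A quick sketch of why that predicate is wrong rather than merely ugly, using SQLite as a stand-in (I don't know what engine the demo runs on). The `teacher` schema and rows are hypothetical, and I use single quotes for the string literal, since SQLite treats double quotes as identifiers first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teacher (Name TEXT, Age INTEGER);  -- hypothetical schema
INSERT INTO teacher VALUES ('Ann', 45), ('Bob', 29), ('Cy', 31);
""")

# The generated predicate, with the number word kept as a string literal.
as_text = conn.execute(
    "SELECT COUNT(*) FROM teacher WHERE teacher.Age > 'thirty'"
).fetchone()[0]

# The intended predicate, with the number word normalized to 30.
as_number = conn.execute(
    "SELECT COUNT(*) FROM teacher WHERE teacher.Age > 30"
).fetchone()[0]

print(as_text, as_number)
```

In SQLite an integer always compares as less than a text value, so the first query runs without error and simply matches no rows.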

[+] moonchild|5 years ago|reply
IMO better would be a database interface that acts as a normal programming language, treating tables as arrays of records. Compare:

  how many teachers older than 30
  teachers.filter(*.age > 30).len
Same length, but the second one has a degree of precision that the first lacks. (Though they might diverge somewhat as the complexity of queries grows, I suspect programming languages would do better as they have better facilities for symbolic manipulation.) Note also that the second example (and your SQL example) had to specify the table's name, while the natural language one did not—another mark against the natural language solution.
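
For comparison, a rough Python equivalent of the second line, treating the table as a plain list of records. The `Teacher` class and the sample rows are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Teacher:
    name: str
    age: int


# A "table" is just a list of records.
teachers = [Teacher("Ann", 45), Teacher("Bob", 29), Teacher("Cy", 31)]

# how many teachers older than 30
older_than_30 = len([t for t in teachers if t.age > 30])
print(older_than_30)
```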
[+] yetihehe|5 years ago|reply
Looking at responses to your comment I predict that AI won't replace properly trained programmers too soon...
[+] pmontra|5 years ago|reply
The Covid database contains cumulative figures, so if you ask "How many deaths in ...?" you get the naive query with the sum of the Deaths column for that country, which is wrong. Actually I wonder how to explain it. I cheated and asked "how many deaths in ... on July 14?" but got the wrong query, with July 14 as Province_or_State no matter how I rephrased the date.
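
A minimal sketch of the difference, assuming one row per province per day with a running total. `Country` and `Report_Date` are hypothetical column names (I only know `Deaths` and `Province_or_State` from the generated queries), and the numbers are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE covid_19_july_data (
    Country TEXT, Province_or_State TEXT, Report_Date TEXT, Deaths INTEGER
);
-- Deaths is cumulative: each row is the running total for that province.
INSERT INTO covid_19_july_data VALUES
    ('X', 'A', '2020-07-12', 10),
    ('X', 'A', '2020-07-13', 12),
    ('X', 'A', '2020-07-14', 15),
    ('X', 'B', '2020-07-12', 5),
    ('X', 'B', '2020-07-13', 5),
    ('X', 'B', '2020-07-14', 6);
""")

# The naive query: adds up every daily running total.
naive = conn.execute(
    "SELECT SUM(Deaths) FROM covid_19_july_data WHERE Country = 'X'"
).fetchone()[0]

# One way to respect the cumulative column: only sum the latest day's rows.
latest = conn.execute("""
SELECT SUM(Deaths) FROM covid_19_july_data
WHERE Country = 'X'
  AND Report_Date = (
      SELECT MAX(Report_Date) FROM covid_19_july_data WHERE Country = 'X'
  )
""").fetchone()[0]

print(naive, latest)
```

The naive sum counts every earlier day's total again, so it overshoots; picking a single date (the latest, or an explicit one like July 14) is what the question actually needs.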
[+] macro-b|5 years ago|reply
“Select all elements from catalog” did not work...
[+] rco8786|5 years ago|reply
> All confirmed covid deaths

> SELECT covid_19_july_data.Deaths FROM covid_19_july_data WHERE covid_19_july_data.Confirmed = "covid"

[+] visarga|5 years ago|reply
I've seen a talk about this task; it's supposed to be hard to get enough training data for regular DL approaches.
[+] tgv|5 years ago|reply
Why do you think this is online?

BTW, they didn't make life easy for themselves by having a field called "number of records". I asked something like "what's the number of records" and "what's the sum of the number of records", but it kept replying with `SELECT COUNT(*) FROM wines;`.

[+] aaron695|5 years ago|reply
In my opinion this doesn't make sense.

SQL is a tight, unambiguous language, that's why it exists.

This is like a legal document written in spoken English. It's only all fine when it works.

Part of writing SQL is also understanding the underlying data. This won't address this issue.

This is also not replicable. Language changes in context and time.

[+] legacynl|5 years ago|reply
> In my opinion this doesn't make sense.

> Part of writing SQL is also understanding the underlying data. This won't address this issue.

I guess the end goal for this is to let non-technical people work efficiently with databases too. From a purely business/financial perspective, this would save companies the hours (i.e. money) spent onboarding and teaching employees to use their database, and possibly even remove the need to hire expensive data analysts, because lower-tier employees could suddenly interact with the databases as efficiently as the analysts can.

Edit: I also believe you're putting the cart before the horse with your reasoning. SQL and Legal English NEED to be exact, which makes them very 'complex' because you need to disallow any edge cases. This doesn't mean we WANT them to be complex. It is way more useful if it is easy and intuitive (like natural language). As a matter of fact, this would save a lot of time in both the legal and the SQL case, because in both you'd often start with natural English, like 'I want to write a rent contract that protects me and my renter from legal trouble', or 'I want to know from this database which company had the highest net profit in the last quarter'. It's only then that you put money, effort and time into translating this into Legal English or SQL.

[+] neilalexander|5 years ago|reply
list all company

> Please check the results in the table. Did I get it right?

yes

> Great!

list all designation

> Sorry, 'designation' is confusing to me

[+] subhajeet2107|5 years ago|reply
Meh. "is there any area column in addresses table" did not work. Seems straightforward to me?