English -> SQL is very much a thing, and a good amount of research is going into it. It does have a few weaknesses though:

1. It is ML-based, and the best results I have seen put it at about 90% accurate. This might be "good enough", but it is not perfect. Verification and error correction are needed.

2. Knowledge of the schema needs to be passed in as part of the input features, or the model has to be explicitly trained on the target schema.

3. Going to a different DB requires retraining the model, due to slight differences in SQL dialects.

4. ML takes either a lot of time (speed) or money (GPUs). This is more a general ML problem, but does affect English -> SQL.
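
To make points 1 and 2 concrete, here is a rough sketch of what "pass the schema in" and "verify the output" could look like. Everything in it is my own assumption for illustration: the `customers`/`orders` schema, the prompt layout, and the function names are all hypothetical, and SQLite just stands in for whatever the target DB is. The check only compiles the generated statement against an empty copy of the schema, so it catches syntax errors and unknown tables or columns, not queries that are valid but answer the wrong question.

```python
import sqlite3

# Hypothetical schema, used only for illustration.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    total REAL
);
"""


def build_model_input(question: str, schema: str = SCHEMA) -> str:
    """Point 2: hand the schema to the model together with the question.

    The layout of this text is an assumption, not any particular model's format.
    """
    return f"Schema:\n{schema.strip()}\n\nQuestion: {question}\nSQL:"


def check_sql(sql: str, schema: str = SCHEMA) -> tuple[bool, str]:
    """Point 1: cheap verification of a generated statement.

    Builds an empty in-memory database from the schema and asks SQLite to
    compile the statement with EXPLAIN, which prepares it without running it.
    Returns (ok, error_message).
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        conn.execute("EXPLAIN " + sql)
        return True, ""
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return False, str(exc)
    finally:
        conn.close()


if __name__ == "__main__":
    print(build_model_input("What is the total spent per customer?"))
    print(check_sql("SELECT name FROM customers"))
    print(check_sql("SELECT nme FROM customers"))
```

A failed check could be fed back to the model as an error-correction step, which costs another round of inference (see point 4).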

I am no expert in English -> SQL, or in ML in general, so somebody correct me if I'm wrong on the above points. These are just what I've seen or experienced in my research.

euroderf|3 years ago

I wonder whether it helps if the user understands which relationships are 1-to-N and which are N-to-N, so that they can state commands fairly clearly and respond to requests from the system for clarification and disambiguation. I would think that this kind of info (1-to-N, N-to-N) could be grokked in a straightforward way by a user with domain knowledge.
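
As a sketch of how that info might be surfaced, assuming the system already knows the cardinality of each relationship: the `Relationship` type, the `describe` function, and the example entities below are hypothetical, and the pluralisation is deliberately naive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    left: str   # e.g. "customer"
    right: str  # e.g. "order"
    kind: str   # "1-to-N" or "N-to-N"


def describe(rel: Relationship) -> str:
    """Render a relationship as a sentence a domain user can confirm or correct."""
    if rel.kind == "1-to-N":
        return (f"Each {rel.left} can have many {rel.right}s; "
                f"each {rel.right} belongs to one {rel.left}.")
    if rel.kind == "N-to-N":
        return (f"Each {rel.left} can have many {rel.right}s, "
                f"and each {rel.right} can have many {rel.left}s.")
    raise ValueError(f"unknown relationship kind: {rel.kind}")


if __name__ == "__main__":
    print(describe(Relationship("customer", "order", "1-to-N")))
    print(describe(Relationship("student", "course", "N-to-N")))
```

Sentences like these could be shown back to the user when the system asks for clarification.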