This was a nice read for the history. Honestly, being in a big legacy industry (Insurance), it's as though NoSQL never happened. We're too big, at least my organization, to have made the wholesale change and we've been plugging along mostly in Teradata and DB2 for a long time. Teradata in particular has performed well across a variety of use cases, its only large downside being the cost.
Anyhow, the only point I'd add is that I'm not sure that SQL-like languages added on top of NoSQL solutions were really that bad of a thing. It's not as though there's one flavor of SQL to rule them all on RDBMSs. I get that the bones are largely the same - SELECT, FROM, WHERE, most join types, aggregates, etc. But take a really common analytic use case - persisted transactional data. Something like this -
SELECT
A,
B,
C
FROM SomeTbl
WHERE SomeDate BETWEEN SomeBegin AND SomeEnd
QUALIFY RANK() OVER (PARTITION BY A, B ORDER BY SomeOtherDate DESC) = 1
Pretty sure that SELECT, FROM, WHERE will work just about anywhere. That QUALIFY clause is written for Teradata, and once you get into the ordered analytic functions (as one example), you need to know the implementation relevant to the RDBMS you're dealing with. It's not uncommon for a large enterprise to have MS SQL, DB2, Teradata, Oracle, etc. all under one roof. As a data engineer, you still need to know the differences - first to get the data out, and then to optimize.
Dunno, I could be way off on this, but that's been my experience at least.
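For readers who haven't used Teradata, here is a minimal sketch of how the QUALIFY query above is commonly rewritten for engines that lack QUALIFY: rank inside a derived table, then filter on the rank outside it. The column names follow the placeholders in the query; the sample rows, the date values, and the use of SQLite (via Python's standard sqlite3 module, which needs SQLite 3.25 or later for window functions) are assumptions made purely for illustration.

```python
import sqlite3

# Hypothetical table and rows, named after the placeholders in the query above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SomeTbl (A TEXT, B TEXT, C INTEGER, SomeDate TEXT, SomeOtherDate TEXT)"
)
conn.executemany(
    "INSERT INTO SomeTbl VALUES (?, ?, ?, ?, ?)",
    [
        ("x", "y", 1, "2017-01-10", "2017-01-11"),
        ("x", "y", 2, "2017-01-15", "2017-01-20"),  # latest row for (x, y) in range
        ("x", "z", 3, "2017-01-12", "2017-01-13"),  # only row for (x, z)
        ("x", "y", 4, "2016-06-01", "2017-02-01"),  # SomeDate outside the range
    ],
)

# QUALIFY filters on a window function after WHERE has been applied.
# Without QUALIFY, compute the rank in a derived table and filter outside it.
PORTABLE_QUERY = """
SELECT A, B, C
FROM (
    SELECT A, B, C,
           RANK() OVER (PARTITION BY A, B ORDER BY SomeOtherDate DESC) AS rnk
    FROM SomeTbl
    WHERE SomeDate BETWEEN :start_date AND :end_date
) ranked
WHERE rnk = 1
ORDER BY A, B
"""

rows = conn.execute(
    PORTABLE_QUERY, {"start_date": "2017-01-01", "end_date": "2017-01-31"}
).fetchall()
print(rows)
```

The logic is the same in both forms; what differs per engine is the syntax you are allowed to use to express it, which is the point being made above.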
I think the bigger issue is that you can use an RDBMS to do NoSQL-like things, such as key-value stores, with the flexibility to structure your data if you need that later. So why not start with a relational database?
I asked the same question. The biggest reason I've heard is RDBMSs don't horizontally scale well, meaning you can't easily have 50 replicated nodes across the globe and expect it to perform well, or set up easily, because it's fairly complicated with an RDBMS. There are things like Oracle's grid or SQL's high availability clusters, but they get complicated fast, particularly when you need a bunch of nodes.
Obviously it was a solution for a very large amount of unstructured or semi-structured data that needed to be redundant across networks. With NoSQL, the key/values are a lot simpler to replicate, apparently.
It's a Google/Twitter size problem that most industries wouldn't have, but since it's the new shiny thing, you know how that goes.
>I think the bigger issue is that you can use an RDBMS to do NoSQL-like things, such as key-value stores, with the flexibility to structure your data if you need that later. So why not start with a relational database?
I've tried that in the past and failed miserably.
1) values in a key-value table will end up needing to hold nested data structures, such as a JS object/hash, i.e. mykey={...}
2) turning values to JSON (or some other serialization) makes it impossible to concurrently update or search subkeys.
3) so you convert your complex (sets, hashes) key-value data into several rows that hold one subkey per row, so now you have updatable rows but still no indexable solution and a serious performance problem.
4) so you create a multi-column type-segregated table (one column for each DB type) for indexing your key-values and making them searchable. That also requires a metadata table to go with it so that you know in which column your key or subkey is indexed.
5) say you successfully implemented your key-value store with your RDBMS. You still don't have usable joins (you don't have relational data) or a real migration path out of your key-values.
Trust me, don't just put keys and values in a relational DB. Start with the right tool for the job, either make your schema relational from the beginning or use a proper KV or document store.
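To make points 2 and 3 of the list above concrete, here is a rough sketch using Python's standard sqlite3 and json modules. The table names (kv_blob, kv_rows), the key "user:1", and the helper functions are all hypothetical; this only illustrates the shape of the problem, not anyone's actual schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Point 2: the whole nested value is one serialized blob (hypothetical table).
conn.execute("CREATE TABLE kv_blob (k TEXT PRIMARY KEY, v TEXT)")
conn.execute(
    "INSERT INTO kv_blob VALUES (?, ?)",
    ("user:1", json.dumps({"name": "Ann", "visits": 1})),
)

def bump_visits_blob(key):
    # Read-modify-write of the entire value: two writers doing this at the
    # same time can overwrite each other's changes to unrelated subkeys.
    (raw,) = conn.execute("SELECT v FROM kv_blob WHERE k = ?", (key,)).fetchone()
    value = json.loads(raw)
    value["visits"] += 1
    conn.execute("UPDATE kv_blob SET v = ? WHERE k = ?", (json.dumps(value), key))

# Point 3: one row per subkey (hypothetical table).
conn.execute(
    "CREATE TABLE kv_rows (k TEXT, subkey TEXT, v TEXT, PRIMARY KEY (k, subkey))"
)
conn.executemany(
    "INSERT INTO kv_rows VALUES (?, ?, ?)",
    [("user:1", "name", "Ann"), ("user:1", "visits", "1")],
)

def bump_visits_rows(key):
    # A single-row update now, but every value is text, so numeric work needs
    # casts and a plain index on v cannot order numbers correctly.
    conn.execute(
        "UPDATE kv_rows SET v = CAST(CAST(v AS INTEGER) + 1 AS TEXT) "
        "WHERE k = ? AND subkey = 'visits'",
        (key,),
    )

bump_visits_blob("user:1")
bump_visits_rows("user:1")

blob_visits = json.loads(
    conn.execute("SELECT v FROM kv_blob WHERE k = 'user:1'").fetchone()[0]
)["visits"]
row_visits = conn.execute(
    "SELECT v FROM kv_rows WHERE k = 'user:1' AND subkey = 'visits'"
).fetchone()[0]
print(blob_visits, row_visits)
```

Both versions get the counter to 2, but the second one returns it as the string "2", which is the typing problem that point 4 tries to patch with one column per type.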
The thing about the NoSQL trend is that it's still very valid in some areas. Even big and slow companies have some project or supplier that uses a software with a NoSQL service in it.
You just rarely see those bundled redis or elasticsearch servers that are crucial for some app to do its session management and search engine.
The most disturbing part of the NoSQL trend is that people are treating it like a battle between two competing systems.
I have at least one major system under my own belt that uses relational SQL for backend data, ES for search engine and cassandra for TSDB.
I need and trust all those services to work as one unit.
The reason people don’t start with a relational DB has nothing to do with the data model or query language - people choose databases like Cassandra because scaling to 500 Postgres or MySQL instances holding a combined petabyte of data is horrific, but it’s dirt simple in things like Cassandra.
The query language is a side effect of the underlying storage engine - you don’t choose it because you want a key-value store, you choose it because you want horizontal scalability and cross-WAN HA.
I don't think the article could have said it much better.
SQL is super powerful and makes much sense in so many ways. Nearly all apps have a relational structure to them and SQL is a reasonable way to interact with them.
Some of my favorite conversations from the Postgres community 5-6 years back were when they were talking about a time when Postgres was being disrupted. The gray-bearded DBAs (Hi Berkus and others) were talking about JSON like it's a fad and how it was going to pass. They were saying so because they'd heard about this disruption before... there was this new thing XML and these document databases were so much more powerful. They waited a few years and added an XML datatype [1]... And then XML databases came and passed.
At first they scoffed a bit on JSON as the new hip thing, but came around a little in Postgres 9.2, and then in 9.4 [2] we all really got what we wanted.
SQL has never been a particularly elegant language, but it's always been a powerful one. The lone knock against it has been either usability or scalability [3]. On the usability side that's a bit of a hard one as, well, it's not super pretty. But once you understand its foundations around relational algebra or relational calculus you're so far ahead. On the scalability side there are definitely some options such as Citus, but also core Postgres is massively improving here.
There's really no such thing as NoSQL first of all. Relational, graph, key/value (including wide column), document-store, etc. There are lots of database types but it turns out relational works 95% of the time and we're getting better at recognizing the use-cases for the others.
SQL is also just a query language, that's literally the name. Any database can implement it, not just relational. Is SQL a great interface for both OLTP and OLAP? Yes, it's proven itself over decades. Is it the only valid interface? No. Does it work with other newer/different data systems? Yes, Spark, Hadoop, Kafka, and even these new distributed relational databases are all examples of such.
It would be far better for the industry if we can get past these ideological and faulty "debates" and move on to more important things.
I don't buy it. The makers of the software that this appears to be a carefully written advertisement for, came to the same conclusion as the rest of the IT world (that some of us saw a mile away): NoSQL was, and still is, only good for very specific things in very specific cases; it's generally dreadful for anything that SQL engines could already do well.
The subtitle for this contains: "After years of being left for dead" and the author throws phrases around like: "And boy did the software developer community eat up NoSQL, embracing it arguably much more broadly than the original Google/Amazon authors intended."
Who? Where's the data? This blog post has a lot of links, references, and studies, but where's the data to back up the premise?
A quick search on Google Trends comparing SQL databases to NoSQL as an entire term, or any of the popular flavors of NoSQL, reveals that it is not even a blip in comparison.
But don't take my word for it, the author and their company had the "DUH" moment too (emphasis mine):
> ...we soon realized that we’d have to do a lot more work: e.g., deciding syntax, building various connectors, educating users, etc. We also found ourselves constantly looking up the proper syntax to queries that we could already express in SQL, for a query language we had written ourselves! One day we realized that building our own query language made no sense. That the key was to embrace SQL.
You might have had hubris stemming from discarding or not knowing all of the history that you decided to share with us in this blog post. A great deal of the NoSQL community was completely unpalatable for this reason. The folks who had been doing data for decades, and built stable and powerful systems on top of many prior decades of mistakes, built them for a reason.
And here we see the author trying to proclaim that suddenly SQL is back?
SQL never went anywhere. NoSQL is a neat tool that was developed and continues to be developed and probably isn't going anywhere. But the idea that NoSQL suddenly overtook SQL, and now SQL is seeing some huge resurgence, feels like it comes from the perspective of someone who only saw a window into the last 6-8 years of development.
>Who? Where's the data? This blog post has a lot of links, references, and studies, but where's the data to back up the premise?
If you had been following startup blogs and HN, then you don't need any more data to back their premise.
It's not like total data matters anyway -- what's important is what use cases people regularly encounter in their periphery and the part of the industry they work on, which might not be what some overall data will show.
I don't care for example if NoSQL only caught on with 1% of developers while 99% of Fortune 500 enterprises and companies in rural Idaho and southern Mongolia used trusty old MS-SQL Server.
For most of us here around HN, judging from posts, comments, and discussions, the NoSQL era was very real, in the kind of companies and environments we knew.
There have been a ton of start-ups that I've talked to / been a part of who used MongoDB thinking their company is going to exponentially explode in MAU and they think they'll save themselves the scaling troubles by using NoSQL. What ends up happening is the codebase gets too gnarly when they try to start doing complex analysis. SQL is appropriate for like 95% of companies. A lot of these places I'm referencing end up porting their codebase to Postgres, MySQL or Oracle.
Thanks for writing this. I was about to write almost that exact thing. It's a weird interpretation of database history with a ton of stuff left out. My best guess is that the author of this article had all the data that would be needed to back this up, but it was stored in Mongo and he couldn't find it when he started writing.
About 15 years ago, I worked on a team with a product that used an object-based database. NoSQL didn't exist, but it was kind of in a similar vein. The database was pure object storage and there was a B-tree index to make sense of it. We could store anything at all in the database, as long as it inherited from a specific class. It used the object serialization features of the language (Object Pascal) to read and write the data chunks. It had a query layer that allowed collection interrogation of data with specific values set. But it was also very problematic. The structure was fairly chaotic. Ignoring the specific implementation, it taught me that databases, despite the rigidity of the storage mechanism, were a bloody good way to keep data organised. And SQL, despite the warts and inconsistencies of the major dialects, is actually a very powerful way to query data. When you strip away the integrity and the ability to easily pull data out in a uniform way, it makes things a lot less fluid.
It's almost as if "everyone" forgot that before SQL (relational databases) there were only NoSQL databases of various sorts (hierarchical, document, etc.) and SQL and the relational model arose to address their shortcomings.
Personally, I knew stuff had gotten stupid when I sat through a presentation by a GemFire evangelist who advised everyone present to just "do your joins in code". If you need to join data you should be using SQL.
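As a small illustration of what "do your joins in code" means in practice, here is a sketch with Python's standard sqlite3 module. The customers and orders tables and their rows are invented for the example; the point is only to put the hand-rolled join next to the declarative one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders VALUES (10, 1, 50), (11, 1, 25), (12, 2, 40);
""")

# Join "in code": fetch both sides and stitch them together by hand.
names = dict(conn.execute("SELECT id, name FROM customers"))
totals_in_code = {}
for customer_id, total in conn.execute("SELECT customer_id, total FROM orders"):
    name = names[customer_id]
    totals_in_code[name] = totals_in_code.get(name, 0) + total

# The same result as one declarative statement; the engine picks the strategy.
totals_in_sql = dict(conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
"""))

print(totals_in_code, totals_in_sql)
```

The hand-written version also pulls every row across the wire before combining them, which is roughly the cost the comment above is objecting to.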
I've always found it a little funny that SQL was originally designed for non-programmers, sort of like AppleScript. I used to think neither of those panned out, but in fact there really are a lot of smart not-programmers who can use it. At a company I work with many of the support staff have been learning SQL to help customers pull reports from our data warehousey reporting database. So maybe the article is onto something about the value of the language.
On the other hand, SQL-the-language isn't essential to relational databases. I have often wondered where C.J. Date has been the last few years. I actually love SQL, but it does have its limitations. I wouldn't mind a solid relational database with an alternative query language. It's such a missed opportunity for a great VC pitch: Tutorial D, NoSQL before it was cool. :-)
I really believe that Spark's "more-than-SQL" query interface is how things should be. SQL is of course the gold standard and probably represents north of 90% of analytic workloads, but there are a lot of queries that (especially for us programmers) are much easier to express procedurally/functionally, rather than purely declaratively.
To add onto this, you could theoretically call SQL a logic programming language and we've had good success teaching some of our non-programmers a different logic programming language (sort of like datalog) to do some work in.
Could be something there with logic programming. Or could not be something there.
> So maybe the article is onto something about the value of the language.
The strongest feature of SQL is its lack of specificity about "how it should return results": it primarily deals with the details of what the expected results are.
I work on a SQL engine which honestly breaks so many rules of "How to do things" and basically tries to avoid nearly everything a sane SQL engine engineer would do.
But the advantage of the lack of forcing implementation is that a new idea like this could still implement the "give me expected results" part of the implementation.
Whenever it doesn't - it needs fixing, not documenting in a "vs vs" comparison.
> I actually love SQL, but it does have its limitations. I wouldn't mind a solid relational database with an alternative query language
MDX is a pretty interesting thing to think with.
Mostly because if you're used to spreadsheets, it is a more natural way of expressing what you want, generally in a straightforward lookup order - get me some columns from these rows, where some condition is satisfied.
That makes index & cube lookups so much easier to detangle for an engine than a more free-form SQL tree which has so much more variety in it.
Also, SQL was designed back when most programming was done in assembler or C. Compared to assembler, SQL is easy. Programming itself has gotten a lot easier since.
Tutorial D obviously isn't SQL, but it's not NoSQL, either; despite the name, NoSQL refers to non-relational stores, not relational stores with alternative query languages.
But as much as I think it's better than SQL, I don't think Tutorial D (or D as a concept more broadly) offers enough to really displace SQL.
SQL was designed for mainframe programmers, made to look similar to COBOL and PL/I presumably with the idea that at some point in the future it could be integrated into one of those.
Its syntax has all the drawbacks of COBOL's syntax: too many useful words end up as reserved; it doesn't compose very well, leading to statements with very complex syntax rules; and it lulls users into a false sense of security by looking like natural language while being something very different (see HAVING vs. WHERE).
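For anyone who hasn't been bitten by the HAVING vs. WHERE distinction mentioned above, a short sketch follows, again using Python's standard sqlite3 module. The sales table and its values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, amount INTEGER);
INSERT INTO sales VALUES ('east', 5), ('east', 20), ('west', 8), ('west', 9);
""")

# WHERE filters individual rows before they are grouped.
where_result = conn.execute("""
    SELECT region, SUM(amount)
    FROM sales
    WHERE amount > 10
    GROUP BY region
    ORDER BY region
""").fetchall()

# HAVING filters whole groups after the aggregate has been computed.
having_result = conn.execute("""
    SELECT region, SUM(amount)
    FROM sales
    GROUP BY region
    HAVING SUM(amount) > 10
    ORDER BY region
""").fetchall()

print(where_result)
print(having_result)
```

Both clauses read like "only keep things over 10" in English, yet the first keeps a single east row and the second keeps both regions' totals.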
>I work with many of the support staff have been learning SQL to help customers pull reports from our data warehousey reporting database.
Why wouldn't you use a visual query tool for this? Tableau and similar apps generate pretty decent SQL queries nowadays. Even old-school BusinessObjects does a decent job (although requires way more initial modelling).
I think Tutorial D would be more suitable as an ORM than a layer on top of SQL, I feel there would be less of an impedance mismatch if a Tutorial D implementation is a DSL. There is a Haskell implementation, but I'm not going to be doing work in Haskell any time soon.
NoSQL is a really terrible term. I don't know why so many people seem eager to argue the merits of an incredibly disparate umbrella label that includes databases that have almost nothing in common. What meaningful things can you say about a category that includes Cassandra, Datalog, LMDB and Neo4J?
I think when a lot of people talk about NoSQL they just want to rant against a certain kind of strawman programmer. You know the one. Young, stupid, naive, too arrogant to learn nth normal form or define schemas. This programmer probably uses nodejs or some other such heresy and only wants quick results, integrity be damned!
Don't get me wrong, relational databases are really good, and fit a lot of problems really well. But there do exist legitimate use cases (not necessarily scale!) where an RDBMS will simply be a poor fit, or a lot more work. Don't dredge up the strawman of the programmer too arrogant to learn SQL because you're too arrogant to learn the merits of something that isn't SQL.
I like how it pretty much just glosses over decades of familiarity. If there is anything the last 20 years prove, it is that the majority of developers will stick with what they know over what might be a good fit for the job. It goes even deeper in the SQL world down to the specific database flavor.
From the ops side I actually find RDBMSs more difficult to deal with because the power of relationships is easy to abuse and they are not anti-fragile. Instead of smartly reasoning about the data, it is all too easy to just "JOIN ALL THE THINGS WITH MEGA TEMP TABLES!". I've taken more database outages from bad queries than anything else.
There are bad implementations on both sides. There are reasons to pick either side over the other given a set of circumstances. At the end of the day, the technical facts don't matter to most people's decision making though.
I had the task of replicating data from DynamoDB to Redshift on AWS on two large projects in the last few years. The primary factor in using DynamoDB is that it's cheap and scales like nothing else in the galaxy.
Then the other shoe fell. The data was needed for reports. Reports require a static schema. NoSQL (and the developers who love it) despises static schema. "I can add properties whenever I want!"
This process of analyzing and reporting on production data becomes a very time-consuming, costly, and brittle exercise.
So then you have to determine, did we save enough money on the design side (using NoSQL over SQL) and then piss it away on reporting?
I'd argue AWS and other cloud providers need to create a SQL capable relational database as a service. This would (I hope) solve the problem.
But in the meantime, let's build our micro-services on relational databases so we can actually get aggregate data to stakeholders in real-time.
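The friction described above, schemaless items feeding a report that wants fixed columns, can be sketched in a few lines of plain Python. The column list, the sample documents, and the flatten helper are hypothetical; no real DynamoDB or Redshift API is used here.

```python
# Hypothetical fixed schema that the reporting side expects.
REPORT_COLUMNS = ["id", "name", "plan"]

# Hypothetical items as they might accumulate in a schemaless store.
documents = [
    {"id": 1, "name": "Ann", "plan": "pro"},
    {"id": 2, "name": "Bob"},  # written before "plan" existed
    {"id": 3, "name": "Cy", "plan": "free", "referrer": "ad"},  # new property
]

def flatten(doc):
    """Project one document onto the report columns; missing values become None."""
    row = tuple(doc.get(col) for col in REPORT_COLUMNS)
    extras = sorted(set(doc) - set(REPORT_COLUMNS))
    return row, extras

rows = []
unmapped = set()
for doc in documents:
    row, extras = flatten(doc)
    rows.append(row)
    unmapped.update(extras)  # properties the report schema does not know about

print(rows)
print(unmapped)
```

Every property someone adds "whenever they want" lands in the unmapped set until the report schema and the load job are changed to match, which is where the brittleness tends to come from.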
Way back when I took my first database class by Mike Stonebraker, he talked of some of the old war stories between RDBMS and network-based database, which was hot before RDBMS. He said the relational model by Date and Codd had won out back then, for the simple fact that relational data stood by themselves while the network-based data were tightly coupled with the applications. When NoSQL came around, yep, it looked like the old network-based database again. History has repeated itself.
Relational model, where SQL is merely the querying language, will win again and again, for the simple fact that it enables data to stand by themselves, and data tend to outlive applications, both in longevity and in scope.
Edit: PostgreSQL came from Postgres, the research project started by Stonebraker and others in UCB.
So the long and short of it is SQL would be a good common language for relational and NoSQL style databases.
I would agree with that, but since they are different monsters (RDBMS and NoSQL), it will take a bit of tweaking to find a good dialect (how do joins work, etc). Of course it makes sense to adapt the NoSQL databases to the existing ANSI SQL rather than make existing ANSI SQL users switch to a new type of SQL that accommodates NoSQL, but we'll see what happens.
Google created a "Standard SQL," but I'm not familiar with it.
It's not one versus the other. That's like suggesting JSON is going to 'beat' HTML. They're different tools for different jobs and both are used alongside each other to great effect.
The whole debate always seems very confused to me. SQL refers to the language with which you interact with a database; NoSQL refers to the database system itself and is normally used in opposition to traditional RDBMSs. So really the debate should be either SQL vs non-SQL interfaces or RDBMS vs NoSQL databases. Also, NoSQL meaning "not only SQL" means SQL can be a subset of NoSQL, so the debate is over. Really we should review the terminology of this entire subject...
It only left in terms of attention from startups I suppose. I also disagree with the reason the article suggested - SQL could not handle the loads. My opinion is that startups simply liked the idea of not having a schema as it fit their agile approach. So, they went NoSQL because it allowed them to get going faster and change easier.
No tool is good when abused. NoSQL was a knee-jerk reaction and the truth lies somewhere in the middle. Putting everything in relational databases is as bad as putting none of it there. If everything is treated like a tool the world becomes brighter at the cost of having to learn and understand more.
I do find the amusing "backronym" rather hard to take at face value. I also found it hard to take how oddly the old critics of SQL databases focused on the query language. The problem wasn't the queries, per se. The problem was the power of the queries for interactive uses and how that doesn't scale up.
That is, the problem with most SQL databases is that you need some pretty specialized knowledge in order to construct good queries. Not shockingly, the problem with most modern key/value (or otherwise) datastores is that you need some pretty specialized knowledge in order to construct good queries.
Now, for things getting off the ground, this is probably fine. Most of the places you will go south with queries is in the ad-hoc style query. Of course, that style query is perfect for interactive use. Ideal, even. It falls on its face if it is supposed to support an automated case at high TPS. Unfortunately, automated cases at low TPS often turn into automated cases at high TPS. Worse, interactive cases at ridiculously low tps often turn into automated cases at low TPS. Which, of course, just feeds itself.
How this feeds itself, of course, is you can explore your data much more effectively if there is an ad-hoc query engine. So, we are now seeing the resurgence of ad-hoc queries and the learning that those can lead to some powerful insights.
I am really surprised that while everyone talks about SQL vs NoSQL, nobody has mentioned RDF as a model and SPARQL as a language. Graph-like structures based on triples allow the data relations to be represented properly (relational properties), while not limiting the data structure. This is not a shiny hot thing, but instead something developed over many years by (mostly) academics. Take a look at the tutorial: https://www.w3.org/TR/2014/NOTE-rdf11-primer-20140225/
And just to prove that RDF is a model, but not a format, look at JSON-LD as a serialisation format for RDF and at the SQLGraph [1] paper from Google to see how RDF can be implemented on top of an SQL RDBMS.
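For readers who haven't seen the triple model, here is a toy sketch of the idea in plain Python. It is not real RDF (no IRIs, no literals with datatypes) and not SPARQL; the names alice, bob, carol, acme and the match helper are invented for illustration only.

```python
# A toy triple store: each fact is a (subject, predicate, object) tuple.
triples = {
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
}

def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# A new kind of fact needs no schema change, just another triple.
triples.add(("carol", "worksAt", "acme"))

acme_staff = [s for s, _, _ in match(p="worksAt", o="acme")]
alice_facts = match(s="alice")

print(acme_staff)
print(alice_facts)
```

Relations stay explicit as predicates while the set of predicates is free to grow, which is the combination the comment above is pointing at.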
If SQL is to databases as Javascript is to browsers — ubiquitous and largely standard across the market — are there any languages that transpile to SQL?
People in this thread have commented that SQL is clunky, and JS definitely fit that description for a while. So I'm wondering if there are any alternatives that would prove SQL's clunkiness to me.
The premise that SQL is "beating" NoSQL is pretty dumb. I have never been aware that such a competition between datastore designs exists.
Who is that competition between? Developers? Users? Most of the places I have worked in the last 5 or 6 years have had both relational and non-relational databases. This is not uncommon. In none of those shops was there a competition between the two databases; rather, they were complementary.
This article and title seem to be very self-serving for their own product. It seems to willfully conflate SQL the interface and SQL as a general term for a relational database.
NoSQL has always been something of a misnomer - non-relational would have been a better term but it doesn't sound as buzz-worthy I suppose.
The SQL like interface has been in Cassandra for longer now than it was absent. The gain in SQL like interfaces for non-relational databases is because it's familiar and works really well. Anywhere there is a database there is going to be a need for selection, filtering and projections of tuples.
>"In Amazon’s own words, its PostgreSQL- and MySQL-compatible database Aurora database product has been the “fastest growing service in the history of AWS”.
Is this really surprising that the world's largest cloud provider is selling more databases than anything else? Almost everyone needs a database, given howrelational database, there are more people around that have experience with it.
>"To understand why SQL is making a comeback ..."
No, SQL never went away. Full stop.
>"But don’t take our word for it. Take Google’s"
No, this practice of blindly adopting ideas just because they work for Google needs to stop.
This post sounds as if the author(s) themselves bought into all of the NoSQL hype that buzzword-obsessed tech journalists were spinning and they are just now having an epiphany that much of that hype was just that.
Great article, but I think the author puts too much emphasis on SQL instead of on the fact that the data is being stored in a relational way.
Personally I actually would like an alternative query language that could be integrated with the host programming language in such a way that queries would also fall under type checking.
So for example if I rename a column in a database, the type checker would highlight all places in my code that were broken by this change.
JOOQ[1] seems to do something like that, but it's only for Java.
Also, looks like QUEL[2] would be a bit easier to be integrated with a language, too bad it died.
[+] [-] SmellTheGlove|8 years ago|reply
[+] [-] Clubber|8 years ago|reply
[+] [-] ojosilva|8 years ago|reply
[+] [-] tomnipotent|8 years ago|reply
Cassandra & CQL is a prime example. CQL became the de facto interface for getting data out of Cassandra, even without the benefits of JOINs & sets.
[+] [-] INTPenis|8 years ago|reply
[+] [-] jjirsa|8 years ago|reply
[+] [-] eximius|8 years ago|reply
select t.a, t.b, t.c from sometbl t where t.somedate between somebegin and someend and t.someotherdate = (select max(t2.someotherdate) from sometbl t2 where t2.a = t.a and t2.b = t.b and t2.somedate between somebegin and someend)
which will work in all sqls. My point being, the advanced features make life nice, but are by no means necessary.
(would be more efficient as a join, but not sure if joining on subtables is always supported...)
[+] [-] craigkerstiens|8 years ago|reply
[1] https://www.postgresql.org/docs/9.3/static/datatype-xml.html
[2] https://blog.codeship.com/unleash-the-power-of-storing-json-...
[3] https://www.citusdata.com/blog/2017/02/16/citus61-released/
[+] [-] manigandham|8 years ago|reply
SQL is also just a query language, that's literally the name. Any database can implement it, not just relational. Is SQL a great interface for both OLTP and OLAP? Yes, it's proven itself over decades. Is it the only valid interface? No. Does it work with other newer/different data systems? Yes, Spark, Hadoop, Kafka, and even these new distributed relational databases are all examples of such.
It would be far better for the industry if we can get past these ideological and faulty "debates" and move on to more important things.
[+] [-] jevgeni|8 years ago|reply
[+] [-] randomdrake|8 years ago|reply
The subtitle for this contains: "After years of being left for dead" and the author throws phrases around like: "And boy did the software developer community eat up NoSQL, embracing it arguably much more broadly than the original Google/Amazon authors intended."
Who? Where's the data? This blog post has a lot of links, references, and studies, but where's the data to back up the premise?
A quick search on Google Trends comparing SQL databases to NoSQL as an entire term, or any of the popular flavors of NoSQL, reveals that it is not even a blip in comparison.
But don't take my word for it, the author and their company had the "DUH" moment too (emphasis mine):
> ...we soon realized that we’d have to do a lot more work: e.g., deciding syntax, building various connectors, educating users, etc. We also found ourselves constantly looking up the proper syntax to queries that we could already express in SQL, for a query language we had written ourselves! One day we realized that building our own query language made no sense. That the key was to embrace SQL.
You might have had hubris stemming from discarding or not knowing all of the history that you decided to share with us in this blog post. A great deal of the NoSQL community was completely unpalatable for this reason. The folks who had been doing data for decades, and built stable and powerful systems on top of many prior decades of mistakes, built them for a reason.
And here we see the author trying to proclaim that suddenly SQL is back?
SQL never went anywhere. NoSQL is a neat tool that was developed and continues to be developed and probably isn't going anywhere. But the idea that NoSQL suddenly overtook SQL, and now SQL is seeing some huge resurgence, feels like it comes from the perspective of someone who only saw a window into the last 6-8 years of development.
The king is dead. Long live the king.
[+] [-] coldtea|8 years ago|reply
If you had been following startup blogs and HN, then you don't need any more data to back their premise.
It's not like total data matters anyway -- what's important is what use cases people regularly encounter in their periphery and the part of the industry they work on, which might not be what some overall data will show.
I don't care for example if NoSQL only caught on with 1% of developers while 99% of Fortune 500 enterprises and companies in rural Idaho and southern Mongolia used trusty old MS-SQL Server.
For most of us here around HN, judging from posts, comments, and discussions, the NoSQL era was very real, in the kind of companies and environments we knew.
[+] [-] paulddraper|8 years ago|reply
[+] [-] tn_|8 years ago|reply
[+] [-] ianamartin|8 years ago|reply
[+] [-] Clubber|8 years ago|reply
But to your point SQL the language or RDBMS was never dead, it just wasn't the prettiest girl in the room for all the new GI's that just shipped in.
[+] [-] memsom|8 years ago|reply
[+] [-] le-mark|8 years ago|reply
Personally, I knew stuff had gotten stupid when I sat through a presentation by a GemFire evangelist who advised everyone present to just "do your joins in code". If you need to join data you should be using SQL.
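For anyone who hasn't seen what "joins in code" means in practice, here is a rough sketch of the same inner join done both ways, using Python and its built-in sqlite3. The `users`/`orders` data and both function names are invented for the example, not taken from any real system:

```python
import sqlite3

users = [(1, "ann"), (2, "bob"), (3, "cat")]       # (user_id, name)
orders = [(10, 1, 50), (11, 1, 200), (12, 3, 90)]  # (order_id, user_id, amount)


def join_in_code(users, orders):
    """Hand-rolled hash join: build a lookup on one side, probe with the other."""
    names = {user_id: name for user_id, name in users}
    return sorted(
        (names[user_id], amount)
        for _, user_id, amount in orders
        if user_id in names  # inner-join semantics: drop orders with no user
    )


def join_in_sql(users, orders):
    """The same inner join, expressed declaratively and left to the engine."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER)"
    )
    conn.executemany("INSERT INTO users VALUES (?, ?)", users)
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    rows = conn.execute(
        "SELECT u.name, o.amount FROM orders AS o "
        "JOIN users AS u ON u.id = o.user_id "
        "ORDER BY u.name, o.amount"
    ).fetchall()
    conn.close()
    return rows


print(join_in_code(users, orders))
print(join_in_sql(users, orders))
```

Both return the same rows here. The difference is that in the first version the join strategy, the memory use and any later change (outer join, a third table) are the application's problem, while in the second they are the database's.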
[+] [-] reboog711|8 years ago|reply
SQL has been around since 1974 according to Wikipedia. There was a ton of SQL work during the first dot com era of the mid-90s.
On another note, I did a lot of Lotus Notes work during that time, which was clearly a no-SQL database.
[+] [-] pjungwir|8 years ago|reply
On the other hand, SQL-the-language isn't essential to relational databases. I have often wondered where C.J. Date has been the last few years. I actually love SQL, but it does have its limitations. I wouldn't mind a solid relational database with an alternative query language. It's such a missed opportunity for a great VC pitch: Tutorial D, NoSQL before it was cool. :-)
[+] [-] davidw|8 years ago|reply
SQL sounds like something from the Star Trek original series era to me. Read it with a Shatner voice:
[+] [-] elvinyung|8 years ago|reply
I really believe that Spark's "more-than-SQL" query interface is how things should be. SQL is of course the gold standard and probably represents north of 90% of analytic workloads, but there are a lot of queries that (especially for us programmers) are much easier to express procedurally/functionally, rather than purely declaratively.
[+] [-] Blackthorn|8 years ago|reply
Could be something there with logic programming. Or could not be something there.
[+] [-] gopalv|8 years ago|reply
The strongest feature of SQL is its lack of specificity about "How it should return results"; it primarily deals with the details of what the expected results are.
I work on a SQL engine which honestly breaks so many rules of "How to do things" and basically tries to avoid nearly everything a sane SQL engine engineer would do.
But the advantage of the lack of forcing implementation is that a new idea like this could still implement the "give me expected results" part of the implementation.
Whenever it doesn't - it needs fixing, not documenting in a "vs vs" comparison.
> I actually love SQL, but it does have its limitations. I wouldn't mind a solid relational database with an alternative query language
MDX is a pretty interesting thing to think with.
Mostly because if you're used to spreadsheets, it is a more natural way of expressing what you want, generally in a straightforward lookup order - get me some columns from these rows, where some condition is satisfied.
That makes index & cube lookups so much easier to detangle for an engine than a more free-form SQL tree which has so much more variety in it.
[+] [-] Clubber|8 years ago|reply
[+] [-] dragonwriter|8 years ago|reply
But as much as I think it's better than SQL, I don't think Tutorial D (or D as a concept more broadly) offers enough to really displace SQL.
[+] [-] adamnemecek|8 years ago|reply
[+] [-] EdiX|8 years ago|reply
SQL was designed for mainframe programmers, made to look similar to COBOL and PL/I presumably with the idea that at some point in the future it could be integrated into one of those.
Its syntax has all the drawbacks of COBOL's syntax: too many useful words end up as reserved; it doesn't compose very well, leading to statements with very complex syntax rules; and it lulls users into a false sense of security by looking like natural language while being something very different (see HAVING vs. WHERE).
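To make the HAVING vs. WHERE trap concrete: the two read almost the same in English, but WHERE filters individual rows before grouping while HAVING filters whole groups after aggregation. A small sketch with Python's built-in sqlite3; the `orders` table and its rows are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 50), ("ann", 200), ("bob", 120), ("bob", 30), ("cat", 90)],
)

# WHERE: drop every single order of 100 or less, then sum what is left.
where_rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "WHERE amount > 100 "
    "GROUP BY customer ORDER BY customer"
).fetchall()

# HAVING: sum all orders per customer, then drop customers whose total is 100 or less.
having_rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer "
    "HAVING SUM(amount) > 100 ORDER BY customer"
).fetchall()

print(where_rows)   # [('ann', 200), ('bob', 120)]
print(having_rows)  # [('ann', 250), ('bob', 150)]
```

Both queries read as "customers with more than 100", yet they return different totals, which is the kind of natural-language false friend described above.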
[+] [-] dgudkov|8 years ago|reply
Why wouldn't you use a visual query tool for this? Tableau and similar apps generate pretty decent SQL queries nowadays. Even old-school BusinessObjects does a decent job (although requires way more initial modelling).
[+] [-] kristianp|8 years ago|reply
[+] [-] coldtea|8 years ago|reply
That's not true.
[+] [-] matwood|8 years ago|reply
I've taught many business analysts to be quite functional in SQL.
[+] [-] HumanDrivenDev|8 years ago|reply
I think when a lot of people talk about NoSQL they just want to rant against a certain kind of strawman programmer. You know the one. Young, stupid, naive, too arrogant to learn nth normal form or define schemas. This programmer probably uses nodejs or some other such heresy and only wants quick results, integrity be damned!
Don't get me wrong, Relational Databases are really good, and fit a lot of problems really well. But there do exist legitimate use cases (not necessarily scale!) where an RDBMS will simply be a poor fit, or a lot more work. Don't dredge up the strawman of the programmer too arrogant to learn SQL because you're too arrogant to learn the merits of something that isn't SQL.
[+] [-] lmickh|8 years ago|reply
From the ops side I actually find RDBMS more difficult to deal with because the power of relationships is easy to abuse and they are not anti-fragile. Instead of smartly reasoning about the data, it is all too easy to just "JOIN ALL THE THINGS WITH MEGA TEMP TABLES!". I've taken more database outages from bad queries than anything else.
There are bad implementations on both sides. There are reasons to pick both sides over the other given a set of circumstances. At the end of the day, the technical facts don't matter to most people's decision making though.
[+] [-] ChicagoDave|8 years ago|reply
Then the other shoe fell. The data was needed for reports. Reports require a static schema. NoSQL (and the developers who love it) despises static schema. "I can add properties whenever I want!"
This process of analyzing and reporting on production data becomes a very time-consuming, costly, and brittle exercise.
So then you have to determine, did we save enough money on the design side (using NoSQL over SQL) and then piss it away on reporting?
I'd argue AWS and other cloud providers need to create a SQL capable relational database as a service. This would (I hope) solve the problem.
But in the meantime, let's build our micro-services on relational databases so we can actually get aggregate data to stakeholders in real-time.
[+] [-] ww520|8 years ago|reply
Relational model, where SQL is merely the querying language, will win again and again, for the simple fact that it enables data to stand by themselves, and data tend to outlive applications, both in longevity and in scope.
Edit: PostgreSQL came from Postgres, the research project started by Stonebraker and others in UCB.
[+] [-] Clubber|8 years ago|reply
I would agree with that, but since they are different monsters (RDBMS and NoSQL), it will take a bit of tweaking to find a good dialect (how do joins work, etc). Of course it makes sense to adapt the NoSQL databases to the existing ANSI SQL rather than make existing ANSI SQL users switch to a new type of SQL that accommodates NoSQL, but we'll see what happens.
Google created a "Standard SQL," but I'm not familiar with it.
Here is a link on Google Standard SQL:
https://cloud.google.com/bigquery/docs/reference/standard-sq...
[+] [-] nkozyra|8 years ago|reply
Yes, I'm sure the inevitable "SQL for NoSQL Databases" book will be a good read :)
[+] [-] mmaunder|8 years ago|reply
[+] [-] joeminichino|8 years ago|reply
[+] [-] nextInt|8 years ago|reply
[+] [-] krf|8 years ago|reply
[+] [-] 20years|8 years ago|reply
[+] [-] wernercd|8 years ago|reply
[+] [-] methodin|8 years ago|reply
[+] [-] taeric|8 years ago|reply
That is, the problem with most SQL databases is that you need some pretty specialized knowledge in order to construct good queries. Not shockingly, the problem with most modern key/value (or otherwise) datastores is that you need some pretty specialized knowledge in order to construct good queries.
Now, for things getting off the ground, this is probably fine. Most of the places you will go south with queries is in the ad-hoc style query. Of course, that style query is perfect for interactive use. Ideal, even. It falls on its face if it is supposed to support an automated case at high TPS. Unfortunately, automated cases at low TPS often turn into automated cases at high TPS. Worse, interactive cases at ridiculously low TPS often turn into automated cases at low TPS. Which, of course, just feeds itself.
How this feeds itself, of course, is you can explore your data much more effectively if there is an ad-hoc query engine. So, we are now seeing the resurgence of ad-hoc queries and the learning that those can lead to some powerful insights.
[+] [-] smarx007|8 years ago|reply
And just to prove that RDF is a model, but not a format, look at JSON-LD as a serialisation format for RDF and at the SQLGraph [1] paper from Google to see how RDF can be implemented on top of an SQL RDBMS.
[1]: https://static.googleusercontent.com/media/research.google.c...
[+] [-] labster|8 years ago|reply
People in this thread have commented that SQL is clunky, and JS definitely fit that description for a while. So I'm wondering if there are any alternatives that would prove SQL's clunkiness to me.
[+] [-] bogomipz|8 years ago|reply
Who is that competition between? Developers? Users? Most of the places I have worked in the last 5 or 6 years have had both relational and non-relational databases. This is not uncommon. In none of those shops was there a competition between the two databases; rather, they were complementary.
This article and title seem to be very self-serving for their own product. It seems to willfully conflate SQL the interface with SQL as a general term for a relational database.
NoSQL has always been something of a misnomer - non-relational would have been a better term but it doesn't sound as buzz-worthy I suppose.
The SQL-like interface has been in Cassandra for longer now than it was absent. The gain in SQL-like interfaces for non-relational databases is because it's familiar and works really well. Anywhere there is a database there is going to be a need for selection, filtering and projections of tuples.
>"In Amazon’s own words, its PostgreSQL- and MySQL-compatible database Aurora database product has been the “fastest growing service in the history of AWS”.
Is this really surprising that the world's largest cloud provider is selling more databases than anything else? Almost everyone needs a database, given howrelational database, there are more people around that have experience with it.
>"To understand why SQL is making a comeback ..."
No, SQL never went away. Full stop.
>"But don’t take our word for it. Take Google’s"
No, this practice of blindly adopting ideas just because they work for Google needs to stop.
This post sounds as if the author(s) themselves bought into all of the NoSQL hype that buzzword-obsessed tech journalists were spinning and they are just now having an epiphany that much of that hype was just that.
[+] [-] takeda|8 years ago|reply
Personally I actually would like an alternative query language that could be integrated with the host language in such a way that queries would also fall under type checking.
So for example if I rename a column in a database, the type checker would highlight all places in my code that were broken by this change.
JOOQ[1] seems to do something like that, but it's only for Java.
Also, it looks like QUEL[2] would be a bit easier to integrate with a language; too bad it died.
[1] http://www.jooq.org/
[2] https://en.wikipedia.org/wiki/QUEL_query_languages