I've seen a lot of these new SQL databases, and they all seem to compare themselves to MySQL. That seems like a particularly easy target, since MySQL's main strength is speed, and it's a fundamentally old-school (ie slow) design which has sacrificed everything else to get half-fast. What I'd like to know is how any of these compare in features and reliability to the featureful, reliable databases like Oracle and Postgres.
(For specific examples that I've run into recently: does it have geo-spatial extensions, and does it crash and burn and corrupt all its data if a malloc() fails)
It still uses a transaction log, but seems to make snapshots of the logs at a period of time to create compressed snapshots of the data itself.
You can turn off durability, but then you risk losing everything.
For features... forget about geo-spatial extensions - they can only do read-committed isolation level... but as they only allow for one SQL statement per transaction I suppose this might not be such a big concern.
If you were using this for something like a session management database, or something where durability is not so important then it's probably fine. Not sure I'd use it for anything that relies heavily on transaction management features.
I see other limitations, like it cannot support CHANGE COLUMN, and it can't do joins on more than 2 tables. Actually, it doesn't look like it can do FULL OUTER or RIGHT OUTER joins, either. :(
"and does it crash and burn and corrupt all its data if a malloc() fails"
MemSQL is fully durable (with option of sync and async durability, similar to sync and async replication) with its transactional log. If MemSQL process is killed it will recover all commited transactions on the next startup. Durability has very little performance cost on throughput as MemSQL uses only sequential I/O. And of course, durability can be completely turned off for maximum performance.
Is it just me or does this entire product smell like it was designed for the sole purpose of extracting money from less than tech-savvy investors and clueless institutions?
The bullshit-bingo-lingo on their homepage is mindnumbing.
Meanwhile their actual software seems rather underwhelming, bordering on SnakeOil.
Thanks! We're super-excited to have the opportunity to build this. When we started we asked ourselves - what would a database look like if it were designed today?
We see a future where OLTP databases live in memory, and where you have hundreds of terabytes of memory and hundreds to thousands of CPUs at your disposal.
We built MemSQL to make it easy to go directly into memory and get the speed improvements we all need. What we're releasing aims to fulfills that vision.
p.s. Eric, two annoyances about your site: once I've been dumped on to the developers subsite, it's non-trivial to get back to the main company homepage.
That's a little harsh I think. It still has a transaction log, so it has certain advantages over memory tables. Some interesting enhancements to indexes also. And it is also binary compatible with MySQL, so you can actually add it as a MySQL node.
I'd say it's a database for a niche market, as one of the folks from MemSQL has been saying. It would fit in nicely under a few scenarios.
MEMORY tables have table-level locks and from a concurrency perspective, is just as bad as MyISAM, so really offer no advantage unless there is only a small number of consumers. A well tuned InnoDB table will outperform the MEMORY engine )(after the warm-up period) as soon as you have more than a handful of users performing write operations.
Advantages are clear. Blazing speed, transactional durability, ease of use and flexibility that come from using a well known protocol and SQL dialect. Did I miss something?
On the other hand, if your data fits in RAM and you don't need to join more than two tables and you don't need clustering/sharding whatsoever and you need the full power of SQL, what would you use?
> ..something purpose-built (redis)?
Errr maybe "features" like GROUP BY with HAVING? I think you're trolling, Mr Moe.
How it compares to VoltDB, which used similar approach and already "combines the proven power of relational processing with blazing speed, linear scalability and uncompromising fault tolerance" (according to VoldDB website) ?
There are few differences with VoltDB.
1. It's much easier to use (MySQL compatibility, ad-hoc sql no java dependency)
2. It doesn't have any issues with data skews. We are using lockfree skiplists vs b-trees partitioned by core.
3. I believe we are faster on a single box, however this claim should obviously be verified by a third party.
Xeround looks really interesting (http://xeround.com/).
Seems to do everything in RAM and takes care of any replication / backup & scailing for you. Looks like it would get expensive with large amounts of data though... anyone using this in production?
Most times that the press mention SQL, they actually mean "Relational Database Management System". But in this case, MemSQL seems to be doing two things: reducing locking semantics (the only transaction isolation level is read committed), and doing something or other to speed up query compilation.
Frankly, I can't see how their query compliation is much better than anything Oracle, SQL Server or PG does already. (see here: http://developers.memsql.com/docs/1b/codegen.html) Perhaps there is no plan cache in MySQL, or the plan cache is inadequate?
Incidentally, you don't want to use it if you need to do multi-statement transactions. Can't be done. Check out the documentation here:
Most DBMS products have "Stored Procedures" which, essentially, are compiled chunks of SQL that you can call by name, e.g.
CREATE PROCEDURE foobar
AS
SELECT *
FROM users
On very large queries (many table joins) there can be a noticeable performance hit while the DBMS figures out the optimal query plans. Stored procedure compile and store this plan after first execution (some pre-compile) so the DBMS can just look it up in the cache.
It also removes the need to do runtime parsing (compiled vs. interpreted languages). There are more nuances to Stored Procedures than I can put here, but here's a link to a more in-depth analysis I wrote:
http://stackoverflow.com/questions/59880/are-stored-procedur...
In general, MySQL has limited query plan cache (last I read, it was one cache per database connection and was wiped out when the client disconnected), so this is definitely a win. However, since every other DBMS out there already has robust query cache, I don't think the Postgres/Oracle/MS SQL etc. folks are losing any sleep.
I think its that its an in memory database, optimised for memory as opposed to disk. The SQL to C++ seems a bit of a confusion to me, I guess it means it has a JIT SQL compiler to optimise the query to native opcodes. Meh. But the in memory and mysql wire compatibility are big wins.
I am not sure this solves anything. From their implementation details I could just mount mysql on a memory filesystem and have 100% of MySQL's feature. If I need near 100% assurance that the data will be saved, there is always block replication, transaction logs and regular replication. their transition module adds nearly nothing but overhead, since stored procedures are better, and if you have the need for ultra performance, there is always the ability to access mysql at the command level and bypass sql entirely.
There are many bottlenecks and it really depends on the workload. Sometimes it's IO, sometimes it's CPU. When you have a throughput problem, yes, it's the SQL.
Question - how well does the query parser handle stale queries? SQL Server for the longest time had issues with plan stability in that the plan became too stable. When the data distribution changed dramatically, the queries didn't age out of the cache and the queries would do such things as use the wrong index, or not work out the correct cardinality of a table and then use an index where really a full table scan would have been better... and so on.
Right now you can handle this with query hints. The plans are stable, but they are attached to the query text. When you throw a hint in there memsql generates a new plan since the query text changes.
Sharded/replicated MemSQL is in the works. And yep, the replication will be compatible with MySQL replication to enable grouping MemSQL and MySQL nodes.
Why not just use objects in your favourite shitty OOP language as usual and take regular snapshots of the data in memory and save them to disk as rollback points instead? Bam! Nothing new to implement in the application. Bam! Lightening fast access and write times. Bam! Data persistence.
[+] [-] Shish2k|13 years ago|reply
(For specific examples that I've run into recently: does it have geo-spatial extensions, and does it crash and burn and corrupt all its data if a malloc() fails)
[+] [-] chris_wot|13 years ago|reply
You can turn off durability, but then you risk losing everything.
For features... forget about geo-spatial extensions - they can only do read-committed isolation level... but as they only allow for one SQL statement per transaction I suppose this might not be such a big concern.
If you were using this for something like a session management database, or something where durability is not so important then it's probably fine. Not sure I'd use it for anything that relies heavily on transaction management features.
I see other limitations, like it cannot support CHANGE COLUMN, and it can't do joins on more than 2 tables. Actually, it doesn't look like it can do FULL OUTER or RIGHT OUTER joins, either. :(
[+] [-] tintor|13 years ago|reply
MemSQL is fully durable (with option of sync and async durability, similar to sync and async replication) with its transactional log. If MemSQL process is killed it will recover all commited transactions on the next startup. Durability has very little performance cost on throughput as MemSQL uses only sequential I/O. And of course, durability can be completely turned off for maximum performance.
[+] [-] vessenes|13 years ago|reply
1) Facebookers: Check
2) Data Scaling experience: Check
3) In-Memory with SQL semantics: Check
4) New-York based software that can be sold to quant funds: Check
I'm excited. It's downloading now.
[+] [-] moe|13 years ago|reply
The bullshit-bingo-lingo on their homepage is mindnumbing.
Meanwhile their actual software seems rather underwhelming, bordering on SnakeOil.
[+] [-] ericfrenkiel|13 years ago|reply
We see a future where OLTP databases live in memory, and where you have hundreds of terabytes of memory and hundreds to thousands of CPUs at your disposal.
We built MemSQL to make it easy to go directly into memory and get the speed improvements we all need. What we're releasing aims to fulfills that vision.
[+] [-] vessenes|13 years ago|reply
Also, pricing. Come on, man! Pricing!!
[+] [-] moe|13 years ago|reply
Data must fit in RAM, no joins over >2 tables, no transactions, no builtin support for clustering/sharding/horizontal scaling whatsoever.
What is the advantage over memory-tables, MySQL on a ramdisk or something purpose-built (redis)?
[+] [-] chris_wot|13 years ago|reply
I'd say it's a database for a niche market, as one of the folks from MemSQL has been saying. It would fit in nicely under a few scenarios.
[+] [-] AaronBBrown|13 years ago|reply
[+] [-] tigerBL00D|13 years ago|reply
[+] [-] old-gregg|13 years ago|reply
> ..something purpose-built (redis)?
Errr maybe "features" like GROUP BY with HAVING? I think you're trolling, Mr Moe.
[+] [-] snissn|13 years ago|reply
1. http://developers.memsql.com/docs/1b/sql/join.html
[+] [-] chris_wot|13 years ago|reply
1. http://developers.memsql.com/docs/1b/isolationlevel.html
[+] [-] alexro|13 years ago|reply
[+] [-] nikita|13 years ago|reply
[+] [-] spitfire|13 years ago|reply
Ycombinator seems to have backed a few of these next-gen databases. It will be interesting to see who wins.
EDIT: Rethinkdb. But it looks liked they've dumped their mysql engine and gone to a pure key-value store.
[+] [-] JonM|13 years ago|reply
[+] [-] sargun|13 years ago|reply
[+] [-] zwischenzug|13 years ago|reply
Is there more to this than the article suggests?
[+] [-] chris_wot|13 years ago|reply
Frankly, I can't see how their query compliation is much better than anything Oracle, SQL Server or PG does already. (see here: http://developers.memsql.com/docs/1b/codegen.html) Perhaps there is no plan cache in MySQL, or the plan cache is inadequate?
Incidentally, you don't want to use it if you need to do multi-statement transactions. Can't be done. Check out the documentation here:
http://developers.memsql.com/docs/1b/isolationlevel.html
It has a few extra tricks up its sleeve to enhance indexes in MySQL, but again nothing really particularly spectacular. http://developers.memsql.com/docs/1b/indexes.html
More interesting for me, is that it can be hosted on EC2, which is pretty neat!
[+] [-] MattRogish|13 years ago|reply
CREATE PROCEDURE foobar AS SELECT * FROM users
On very large queries (many table joins) there can be a noticeable performance hit while the DBMS figures out the optimal query plans. Stored procedure compile and store this plan after first execution (some pre-compile) so the DBMS can just look it up in the cache.
It also removes the need to do runtime parsing (compiled vs. interpreted languages). There are more nuances to Stored Procedures than I can put here, but here's a link to a more in-depth analysis I wrote: http://stackoverflow.com/questions/59880/are-stored-procedur...
In general, MySQL has limited query plan cache (last I read, it was one cache per database connection and was wiped out when the client disconnected), so this is definitely a win. However, since every other DBMS out there already has robust query cache, I don't think the Postgres/Oracle/MS SQL etc. folks are losing any sleep.
[+] [-] al_james|13 years ago|reply
[+] [-] imfine|13 years ago|reply
[+] [-] nikita|13 years ago|reply
[+] [-] continuations|13 years ago|reply
MySQL Cluster is lock-free, distributed, HA, and has better performances:
http://mikaelronstrom.blogspot.com/2012/05/mysql-cluster-72-...
http://mikaelronstrom.blogspot.com/2012/05/mysql-cluster-727...
And MySQL Cluster has been used in mission critical apps for years.
What does MemSQL give me that MySQL Cluster doesn't?
[+] [-] chris_wot|13 years ago|reply
How does the database handle this sort of thing?
[+] [-] nikita|13 years ago|reply
[+] [-] acdha|13 years ago|reply
[+] [-] ericfrenkiel|13 years ago|reply
[+] [-] al_james|13 years ago|reply
Does anyone know if there are any plans for having MongoDB style replica sets (e.g. sharded and replicated databases in a cluster)?
Also, it would be great if it supported the native mysql replication, so you could have MemSQL replicas of a master mysql DB.
[+] [-] ankrgyl|13 years ago|reply
[+] [-] factorialboy|13 years ago|reply
I should be able to try it for one of my apps over this weekend!
[+] [-] ozgune|13 years ago|reply
[+] [-] prezjordan|13 years ago|reply
[+] [-] nikita|13 years ago|reply
[+] [-] alexrson|13 years ago|reply
[+] [-] Produce|13 years ago|reply
Bam!
[+] [-] ericmoritz|13 years ago|reply
[+] [-] MrMan|13 years ago|reply
[+] [-] vmalkani|13 years ago|reply
[+] [-] omi|13 years ago|reply
[deleted]
[+] [-] visa|13 years ago|reply
[deleted]