I know this is a small gripe, but the name of the company (even though it makes sense) makes my skin crawl a little. The name does stick in my memory though, so I'm not sure if this is a good thing or a bad thing.
I love hearing about a database that decides to focus on replication and self-healing! It drives me nuts how most databases implement a data store and then leave all the complexities of sharding and replication as an exercise to the reader, who is busy trying to get other things done.
I've been looking for a database which does sharding and replication automatically and without throwing away any focus on consistency and transactions, so I figure I'm likely to use this in the future. I've struggled to try to find any others meeting these criteria.
There are a number of datastores that claim to shard and replicate automatically, with no worries for dev/ops/devops.
They've been lying. Never trust them.
Some datastores can actually do this, but performance per beefy server is less than you'd expect. You can use Riak, but you have to write proper CRDTs. You can use ZooKeeper or etcd, but those are for small amounts of configuration data, not for large amounts of customer data.
For all the datastores that claim to do everything automatically and have great performance, we can thank Aphyr for proving that they don't live up to their promises, where the rest of us could only suspect it.
I'd suggest trying to use a simpler model, and understand and accept its failure modes. Maybe your app has to go into read-only mode for a few hours if there's a server failure, etc.
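For anyone wondering what "write proper CRDTs" involves in practice, here is a minimal sketch of the simplest one, a grow-only counter (G-Counter). It is not Riak's API; the class and method names are hypothetical and chosen only for illustration. Each replica increments its own slot, and replicas converge by taking the per-slot maximum when they merge.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter: one slot per replica, merged by per-slot max."""

    counts: dict = field(default_factory=dict)

    def increment(self, replica_id, amount=1):
        # A G-Counter can only grow; decrements need a different CRDT.
        if amount < 0:
            raise ValueError("G-Counter cannot be decremented")
        self.counts[replica_id] = self.counts.get(replica_id, 0) + amount

    def value(self):
        # The logical value is the sum over all replica slots.
        return sum(self.counts.values())

    def merge(self, other):
        # Per-slot max is commutative, associative and idempotent,
        # so replicas converge regardless of merge order or repeats.
        merged = dict(self.counts)
        for replica_id, n in other.counts.items():
            merged[replica_id] = max(merged.get(replica_id, 0), n)
        return GCounter(merged)


# Two replicas accept writes independently, then exchange state.
a = GCounter()
b = GCounter()
a.increment("a", 2)
b.increment("b", 3)
combined = a.merge(b)
```

The point of the exercise is that the application, not the datastore, has to pick a merge function with these properties for every data type it stores.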
"Today, we’re launching CockroachDB for everyone. Use it. Build on it. Contribute to it!"
Does this mean it's more or less ready? The status on GitHub hasn't been updated in quite a while and lists it as Alpha, with important parts like Raft consensus still missing.
Can someone (preferably from the team) clarify the current situation?
P.S.: CockroachDB is the only distributed DB that I would bet on going forward and being a solid base for a big distributed DB.
Not quite ready yet, but the pace has picked up dramatically. We've begun work on the structured data layer and are whipping up a suite of extensive acceptance tests (load testing, performance metrics, ...) to iron out all of the performance issues/bugs that we don't want to be a part of the beta.
Raft consensus, btw, is already implemented. We'll update the README shortly to give a more concrete estimate of the situation.
I've been following the development of this project from the beginning and it has been very interesting to see how they've productized it. IIRC, they all used to work at Square (and before that at a startup called Viewfinder) and started it during a hack week.
Opening line: Databases are the beating heart of every business in the world
Well that's not remotely true, is it? Not even close. Is it really a good idea to lead with something so obviously untrue? If you're trying to convince me of something (i.e. that this product is good), putting such a jarring, obvious falsehood right at the start is a bad idea. I'm wondering if they're deliberately spoofing their own seriousness, but I see nothing else in there to support that.
This line didn't bug me so much. Would you have been okay with something subtly milder? e.g.
Databases are at the beating heart of every business in the world
or even:
Data is the beating heart of every business in the world.
> Cockroach is a distributed key:value datastore (SQL and structured data layers of cockroach have yet to be defined) - emphasis mine
I guess this is interesting, but distributed, strongly consistent, pure K-V stores have been done before: ZooKeeper, etcd, etc. It seems like the vast majority of the hard work is left to do. I don't want to get into naming arguments, but I wouldn't really call this a 'database' yet. It doesn't sound like you can do anything but a key lookup or range query currently, which is incredibly limiting for most real-world applications.
I somewhat question the approach. Why not figure out the hard part first, i.e. build the `SQL and data layers` on top of ZooKeeper or etcd, then replace the backend to scale better? I would think this would get a lot more early adopters. As is, the alpha fills a very niche use case.
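To make "a key lookup or range query" concrete, here is a minimal single-node, in-memory sketch of those two operations over sorted keys. It is only an illustration and not CockroachDB's interface; `SortedKV`, `put`, `get` and `scan` are hypothetical names, and nothing here is distributed or transactional.

```python
import bisect


class SortedKV:
    """Toy ordered key-value store supporting point lookups and range scans."""

    def __init__(self):
        self._keys = []    # keys kept in sorted order for range scans
        self._values = {}  # key -> value for point lookups

    def put(self, key, value):
        # Insert the key into the sorted list only the first time it is seen.
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        # Point lookup; returns None for a missing key.
        return self._values.get(key)

    def scan(self, start, end):
        # Range query over the half-open interval [start, end).
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


store = SortedKV()
store.put("user/2", "bob")
store.put("order/9", "widget")
store.put("user/1", "alice")

# All keys with the "user/" prefix: "0" is the character right after "/".
users = store.scan("user/", "user0")
```

Anything richer, such as secondary indexes or joins, has to be encoded into key layouts by the application, which is the limitation being described.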
If you look at the documentation (e.g., [1]), the design has been rather carefully thought out; it's just that they're implementing it from the bottom up.
According to their roadmap [2], they're aiming for KV functionality in 1.0 and aren't aiming for SQL until past version 2.0 (it's currently alpha).
Given the backgrounds of the technical people involved (including Google, as this project is inspired by Spanner), they should have a lot of experience with what they're trying to accomplish.
As for "done before", a core feature of Cockroach is true ACID transaction support, including snapshot isolation, something no distributed NoSQL database I know about supports. (ArangoDB does support transactions, but is mostly NoSQL in the sense of implementing a different query language than SQL.)
Great storytelling, accompanied by a call to action at the end... but right there at the end a big bold button (or link) is missing; you have to figure out that the tech details are reachable from the menu. Make it simpler for the reader!
I've been following this project for well over a year now. It's come a long way, has a long way to go still, but it's pretty exciting as an alternative to weak consistency stores available now.
Agreed. I'm in the middle of implementing one of the lesser DBs, and have all of the engineering ahead of me that requires. Unfortunately this doesn't look smart until 2.0, which is probably years away. Too long to wait for.
[+] [-] mackwic|10 years ago|reply
[+] [-] mattikus|10 years ago|reply
[0]: https://www.youtube.com/watch?v=ndKj77VW2eM
[+] [-] sciurus|10 years ago|reply
[+] [-] patorjk|10 years ago|reply
[+] [-] steve918|10 years ago|reply
[+] [-] wamatt|10 years ago|reply
It sounds a little more palatable, whilst still keeping the spirit of the original.
[+] [-] uptown|10 years ago|reply
[+] [-] hakanderyal|10 years ago|reply
[+] [-] vellum|10 years ago|reply
[+] [-] AgentME|10 years ago|reply
[+] [-] ploxiln|10 years ago|reply
[+] [-] eis|10 years ago|reply
[+] [-] tschottdorf|10 years ago|reply
[+] [-] curiousDog|10 years ago|reply
[+] [-] brown9-2|10 years ago|reply
[+] [-] Mahn|10 years ago|reply
[+] [-] justincormack|10 years ago|reply
[1] http://venturebeat.com/2015/06/04/peter-fentons-latest-inves...
[+] [-] smrtinsert|10 years ago|reply
[+] [-] tzury|10 years ago|reply
[+] [-] BasDirks|10 years ago|reply
Ant: http://ant.apache.org Wasp: https://www.waspbarcode.com Hornet: https://www.npmjs.com/package/hornet
[+] [-] exacube|10 years ago|reply
[+] [-] EliRivers|10 years ago|reply
It feels like going back in time.
[+] [-] brazzledazzle|10 years ago|reply
[+] [-] EliRivers|10 years ago|reply
[+] [-] jnpatel|10 years ago|reply
[+] [-] mbell|10 years ago|reply
[+] [-] lobster_johnson|10 years ago|reply
[1] https://github.com/cockroachdb/cockroach/blob/master/docs/de...
[2] https://github.com/cockroachdb/cockroach/wiki/Roadmap
[+] [-] danmaz74|10 years ago|reply
[+] [-] mindstab|10 years ago|reply
"The highest level of abstraction is the SQL layer (currently not implemented)."
[+] [-] ilya-pi|10 years ago|reply
[+] [-] rudiger|10 years ago|reply
[+] [-] mattparlane|10 years ago|reply
[0] https://github.com/cockroachdb/cockroach#architecture
[+] [-] gauravagarwalr|10 years ago|reply
[+] [-] chubs|10 years ago|reply
[+] [-] tim333|10 years ago|reply
[+] [-] paulkonp|10 years ago|reply
[+] [-] andybons|10 years ago|reply
[+] [-] zenogais|10 years ago|reply
[+] [-] themartorana|10 years ago|reply
I can always scan-and-switch when it's ready.