
dongxu | 6 years ago

PingCAP CTO here, thanks for these comments! We highly appreciate all the feedback!

First, it’s true that the current setup/deployment of TiDB is not easy. This is something that we're making serious moves to improve. For example,

A. We provide Ansible playbooks to simplify the deployment and rolling upgrade for on-prem users;

B. We built and open-sourced TiDB Operator (https://github.com/pingcap/tidb-operator) to enable TiDB on Kubernetes. We are working on a fully managed service in the public cloud (coming soon). Whether it is one binary or multiple binaries, it’ll be all transparent at the user level;

C. We are improving the default and self-adaptive parameters and are continuously refining the configuration process;

D. We are also trying to reduce the number of components. For example, the new version of CDC is implemented directly inside TiKV.

E. We are developing TiOps tools in a single binary to improve the experience of operating and maintaining the cluster.

A fair number of customers around the world are using TiDB in their production environments, and we make sure they get our help with setup when needed so that it is not a deal-breaker.

Second, TiDB’s multi-component, highly layered architecture makes deployment challenging, but the benefits are also clear:

A. The separation of the storage and computing layers makes it flexible to scale or upgrade each layer as needed. Different layers need different types or different amounts of hardware resources. If the computing resources become the bottleneck, users can scale the SQL layer by adding more TiDB instances in real time; if the bottleneck is the storage layer, they can easily add more TiKV instances to increase the storage capacity.

B. As many of you know, we donated TiKV to CNCF last year. We are fully committed to the open-source community and would like to see TiKV become a building block and foundation of the next-generation infrastructure. For example, we are happy to see some community users layer Redis on top of TiKV, and we ourselves built the TiSpark (https://github.com/pingcap/tispark) connector to run Apache Spark on TiKV.
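
To make point A a bit more concrete, here is a toy sketch of scaling the two layers independently. It is only an illustration, not how TiDB or TiDB Operator actually makes decisions: the `Cluster` type, the `scale` function, and both thresholds are hypothetical names and values invented for this example.

```python
from dataclasses import dataclass, replace

# Hypothetical thresholds, chosen only for illustration.
CPU_THRESHOLD = 0.80   # SQL-layer CPU utilization above which we add a TiDB instance
DISK_THRESHOLD = 0.75  # storage utilization above which we add a TiKV instance


@dataclass(frozen=True)
class Cluster:
    tidb_instances: int  # SQL (compute) layer
    tikv_instances: int  # storage layer


def scale(cluster: Cluster, sql_cpu_util: float, storage_disk_util: float) -> Cluster:
    """Return a new Cluster with one instance added to each bottlenecked layer."""
    if sql_cpu_util > CPU_THRESHOLD:
        # Compute is the bottleneck: grow only the SQL layer.
        cluster = replace(cluster, tidb_instances=cluster.tidb_instances + 1)
    if storage_disk_util > DISK_THRESHOLD:
        # Storage is the bottleneck: grow only the storage layer.
        cluster = replace(cluster, tikv_instances=cluster.tikv_instances + 1)
    return cluster
```

With a starting cluster of 2 TiDB and 3 TiKV instances, a compute-bound workload changes only the first count and a storage-bound workload changes only the second, which is the flexibility the layered design is meant to buy.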

For more thoughts about this, please take a look at my blog: https://pingcap.com/blog/9-whys-to-ask-when-evaluating-a-dis...

Feel free to give more feedback on https://github.com/pingcap/tidb and our community Slack channel https://pingcap.com/tidbslack. We're glad to discuss more with you on this issue!


_tkzm|6 years ago

all those points sound more like quickfixes for bad architecture design. I've been there, done that.. and i've learnt from it. instead of fixing the initial problem, you are just throwing more code and complexity over it. which costs you money and time. i think you made a BIG mistake by going polyglot and not sticking to a single language, which now prevents you from merging the code into a single binary in a cost-efficient way.

if you decide to go with the single binary approach, i am curious to see if you decide to go with Go or Rust :)

just fyi: cdb went with Go but their storage layer ended up being rewritten with C, so they too are not much different from you, with the exception of being able to run a single binary via Cgo which you cannot do with Rust.

gravypod|6 years ago

> all those points sound more like quickfixes for bad architecture design.

As an onlooker, the architecture of TiDB is anything but bad. It doesn't fit into the "run one thing for everything" bucket of software because it's designed to be horizontally scalable. Every part of the design is scoped well enough to only do the bare minimum for its goal. I have no doubt that internally at Google things like BigTable look very architecturally similar.

> Big mistake going polyglot.

Why? Because you can't make a single binary? That's not really a use case that makes sense for a cloud native DB. Everything is going to be a container anyway, so it doesn't matter what artifacts are included. Also the individual implementation layers of TiDB are individually useful (TiKV).

Making a single binary has no benefits unless you're running your hardware with a pet mentality and manually curated software. In that case a horizontally scalable db will not help you.

Also, providing an operator is a far cry from a "quick fix". I applaud them for doing this because it essentially removes the operational burden of running their DB.

irfansharif|6 years ago

> cdb went with Go but their storage layer ended up being rewritten with C,

The storage layer was not rewritten in C; it's just RocksDB. There's ongoing work to use a custom-built LSM store instead: https://github.com/cockroachdb/pebble

steveklabnik|6 years ago

You can do the "single binary" thing with Rust if you use musl.