Update on InfluxDB Clustering, High Availability and Monetization

124 points| KyleBrandt | 10 years ago |influxdata.com | reply

70 comments

[+] 23david|10 years ago|reply
I've advocated for and implemented several InfluxDB installations in production over the last year+, and one of the considerations was always that non-alpha (prod-ready) clustering was always promised in the 'next version' that was just around the corner.

Several months ago it seemed clear that the team was overly optimistic, and it's just disappointing to see that now the clustering will be available only in a paid (minimum $400!) option or on their hosted service.

I understand the business considerations here, but it feels like a bait n' switch for all the people who evaluated/used InfluxDB in single-node operation as a temporary measure while giving the team ample time to work out the clustering kinks.

Lesson learned I guess... but dang what an expensive lesson.

[+] infinotize|10 years ago|reply
I agree, I've invested a bit of time building a monitoring stack with influx, and one of the main reasons I chose it was development velocity vs some of the competition. And from a while back, clustering was supposed to be a top-level issue, coming just around the corner. At this point I'm going to have to bite the bullet and go with a more cumbersome but mature solution, like opentsdb.
[+] pauldix|10 years ago|reply
I'm sorry for the disappointment and understand your frustration. Our hope is that for users for whom a commercial solution isn't an option, there will be open source methods to do what they need.

The HA page with open source options we put up will be getting more detail over time: https://influxdata.com/high-availability/

We're committed to continued innovation in the open source InfluxDB and see this as one of the avenues for ensuring that we can continue those open source contributions.

[+] m1keil|10 years ago|reply
Same situation here. This is beyond "disappointment". Total bait.
[+] jrv|10 years ago|reply
This makes me think: an open-source project can be better off if it's not controlled by any one company. While in Prometheus (http://prometheus.io), we might still take a while before we have a clustered remote long-term storage, we'd never prevent it because the project is independent of any company and we'd want the open-source project to be as good as it can be.

Also, there were some tentative thoughts about using InfluxDB as the main long-term storage backend for Prometheus, but that has become pretty much uninteresting now that clustering support (needed for LTS and durability) is basically cancelled for the open-source version.

Still, I guess I can understand that when you're a company, you need to focus on making money.

[+] mindcrime|10 years ago|reply
This is concerning to me, not because I use InfluxDB, but because my company is also a "pure play" OSS vendor. So it's always concerning to see a company fail to achieve success with the pure-play model, and wind up having to go "open core".

> Also, there were some tentative thoughts about using InfluxDB as the main long-term storage backend for Prometheus, but that has become pretty much uninteresting now that clustering support (needed for LTS and durability) is basically cancelled for the open-source version.

True, but consider this: Given that InfluxDB (core) is Open Source, there's nothing to stop somebody else from coming along and building a version with clustering support. Whether that will happen or not is, of course, an open question.

[+] bogomipz|10 years ago|reply
+1 to Prometheus. I'm a new user but have found the project to be well thought out and have found the community on the IRC channel to be incredibly helpful and responsive. I'm glad you didn't hitch your wagon to Influx. Is there a reason to not continue with your LevelDB backend long term?
[+] KyleBrandt|10 years ago|reply
I'm interested in what the pricing will be like at various scales. On Twitter the CEO (Paul Dix) indicated $400/month will be the basic offering for limited cores.

I guess time will tell, but a bigger deal than the money to me is: will the clustering actually work well? Clustering databases is an extremely difficult problem. I've seen it get better over the years in things like MsSQL, but even with that sort of resources it took a long time (years) for the newer availability model to become stable.

At Stack Overflow we are still using OpenTSDB behind Bosun. But HBase sucks to manage if you don't have any other reason to be using the technology. So I see a lot of users interested in using InfluxDB as the backend (and some do, Bosun can talk to it) because it is easier to get started. But if you know you will have to scale up eventually, all the options right now are not appealing in TSDB land :-/ So if some $$ really does get a good TSDB that scales and is reasonable to manage then great, but I'm skeptical.

[+] pauldix|10 years ago|reply
We're committed to making the clustering a world class product. You're right, it'll take time. However, we'll get there. We'll put the same focused effort and testing into it over the coming months and years that we put into the TSM storage engine, which is working remarkably well at significant loads.

In the meantime, the stuff we have going into the open source is going to make the standalone server (and thus anyone using open source options for HA or clustering) even more performant, stable, and scalable.

[+] meirelles|10 years ago|reply
Check out KairosDB. It runs on top of Cassandra. It's pretty stable, easy to maintain and update. It comes with some limitations, but is still worth checking out if it fits your usage. The server itself is stateless, which is great for HA and scaling.
[+] kev009|10 years ago|reply
The suck of running HBase is that it is several large components, but it actually works. A product that doesn't work and over promises like InfluxDB sucks infinitely more. Use GCE's BigTable if you want to shed the support of HBase.
[+] noir-york|10 years ago|reply
This basically kills Influxdb for me. We're evaluating influxdb in stand-alone mode to collect limited metrics (grafana) with a view to eventually moving more and more data to Influxdb once clustering became available. $399 per month is ridiculous. Clustering is table stakes.

Lesson learnt: before deploying a new OSS, check if they have a credible plan to support the project. Otherwise skip.

[+] akanet|10 years ago|reply
I mean, honestly, $400/mo is table stakes too, if you're actually using a clustered database in production for anything serious.

I think HN in general is being a bit too cynical here. This is an OSS company trying to find a sustainable way to survive. There are going to be some tricky tradeoffs. "They should just charge $20/mo for us small guys!" is not a reasonable course of action.

Regardless of whether this particular pricing plan works out or not, I think it's a good idea for programmers to support other programmers trying to make money on OSS. We are all not served very well by the general attitude of "I heard about this technology last year, it should be $0 now".

[+] ericb|10 years ago|reply
Ugh, we are a startup, we invested in influxdb on faith that it would "get there", developed around it, and now we can't afford what they want to charge for the scalable version.

Maybe we can avoid clustering with some workaround, (sharding) but I feel tricked.

[+] pauldix|10 years ago|reply
I'm sorry that you feel tricked. I hope that one of the open source options handles your use case if commercial software doesn't work for you.

However, we think that the community will be better off if we can continue to contribute to the open source InfluxDB. We want to invest heavily in open and closed source software. Ultimately the open source community will benefit from an open InfluxDB

[+] redwood|10 years ago|reply
In all seriousness how is it possible that you're running a business and can't afford to pay what is basically a very small amount for some software?

I'm guessing you also have customers and are trying to figure out how it is that you can get them to pay you money so that you yourself can create a long-term business and create better products over time. Right?

[+] gtirloni|10 years ago|reply
People will find increasingly clever ways to work around the lack of clustering, as they always did. This could mean only the top-tier users will be paying, which must be a very small portion of the pie.

Even though this announcement is sold as something that will empower InfluxData to add even more cool features to the open source version, I'm actually worried that it won't be around as an open source product for very long. Let me explain.

InfluxDB is a very fine product but I don't think it has all the momentum it needs to keep going in the long term yet. That means it's still building that critical mass of supporters in an open source ecosystem. With this announcement, InfluxData is eroding the trust that small to medium-sized users had in it so the momentum slows down. If InfluxDB was a no-brainer for anyone starting a time-series project, it is not anymore. All the other alternatives need to be carefully analyzed and, even if InfluxDB is chosen, there is that voice in the back of your head saying you'll be in trouble if you exceed a single node's capacity. Eventually people will jump to the next TSDB solution that offers clustering as soon as it becomes available. Then where are all those paying customers going? It'll then be easier and cheaper for InfluxDB to become a proprietary software company.

While there are some comparisons being made to what Nginx does with its Nginx Plus offering, I think it's the opposite situation. IIRC, since the early days, Nginx has offered a paid product with more features. And recently they have started to add those features to the open source version, so people got happier. InfluxDB has always promised clustering (it was a major selling point, go watch any presentation about it from 2 years ago), shipped the code (even if it's half-working as of now) and now announced it's removing the functionality. Suddenly any InfluxDB node you have (even if it's running just fine without clustering) looks like a lemon.

It's all such bad PR. And it's 2016, haven't hundreds of OSS companies been through this already? Oh well.

[+] damm|10 years ago|reply
I really don't see a point to worry about. So as of 0.12 the OSS product will change; however a new product will come out to bring in replication. Somehow they will produce a private binary that will do the same thing

Single node usage is great for most people; sure it can be useful to cluster and you have that option.

If there are limitations, it's open source; those can be removed. Then people would likely use the fork that doesn't have that limitation.

I don't see evil here. Just trying to work on their product

[+] jrv|10 years ago|reply
I don't think anyone is saying anything about it being evil, and it may even make sense for InfluxData the company. But for users, not having clustering (for scalability, long-term storage, and durability) will limit the value of the open-source version a lot in the long-term. As for the viability of a fork, see the other comments around that.
[+] kev009|10 years ago|reply
"less than ½ of a percent are running active clusters." because it doesn't work. They have been TheatricDB to me for a long time, this is just another nail in the coffin.
[+] sp1982|10 years ago|reply
At Square we use Blueflood, developed by Rackspace, which uses Cassandra, coupled with a query layer called MQE (https://github.com/square/metrics). While it's actively developed, I definitely recommend taking a look if you are interested in a highly scalable metrics system with a decent strategy for rollups (100k+ metrics/sec).
[+] pauldix|10 years ago|reply
The reason we think InfluxDB is compelling as a standalone server is that you can get 300k metrics/sec on a single server. The scale is there for many use cases in the current release
[+] hoov|10 years ago|reply
I'm pretty damn angry right now, but I'm sure that in a few days I'll get over it.

I run the tech side of a decent sized startup right now. Having joined ~4 years into the venture, one of my immediate concerns was the lack of visibility into how our system works. Keeping tabs on the performance of several hundred ETL pipelines is not super easy. I decided to double down on InfluxDB, changing the road-map of my infrastructure team.

Then, InfluxData introduced the TICK stack. I can get monitoring and alerting as well? Let's double down again!

I bought hardware, set up a cluster (painful), we filed bug reports, learned what sorts of queries not to run, and we were in good shape. I purchased tickets for a training session and a flight (out of pocket; we don't have travel budget). And then I saw the blog post last night.

All along, I knew that the promise of all of this great technology for free was too good to be true. There was always a little voice in my head asking me about how they actually made money -- I knew that a paid offering + professional services was not sustainable.

I'm angry, but we'll be fine. We're going to break apart our cluster and shard. We've already got code written to do a backup/restore that actually works (slowly) from one cluster to another. If we hit the point where sharding doesn't work, then it'll also probably be financially viable to pay the $399/month (on top of server costs). I'll still go to the training, but I'm not sure what I'll get out of it. The reason why I was going to the training was the section on "Cluster Administration".

The only thing that I'm raw about is the lack of apology. You have to make money, and you had to do a bait-and-switch as a result. I totally understand, but the tone of that blog post was too defensive and unapologetic.

[+] tlipcon|10 years ago|reply
Hopefully this isn't too "pitch"-y, but: if you're looking for a database that's good at time series, will always be open source, and does support scale-out and HA, you might be interested in Apache Kudu (incubating).

Feel free to drop by our Slack (http://getkudu-slack.herokuapp.com) if you have any questions.

[+] terom|10 years ago|reply
Congrats on the decision. Standalone InfluxDB can be scaled up just fine to meet most usecases, and it's better to have the long-term project sustainability that a hosted/enterprise offering can bring to the standalone offering.

InfluxDB 0.9 still had plenty of bugs, and I'd rather see a high-quality standalone server than any not-quite-there-yet clustered version.

[+] bogomipz|10 years ago|reply
Agreed. I would have liked to see these bugs ironed out before the whole splashy rebranding to Influx Data or whatever they are calling themselves now and their business pivot to a "data platform." I think the timing is unfortunate.
[+] pauldix|10 years ago|reply
Thanks, we do feel that this will lead to a better overall InfluxDB in both standalone and clustered.

The current version is significantly better than 0.9 and it's only getting better.

[+] spotman|10 years ago|reply
> ... "customers eventually drop support as their infrastructures mature and they look to reduce operating costs"

This is part of the game called software. In some worlds this is actually the goal; that the software works so well and is so reliable that your customers eventually don't all need to keep paying you.

So you find new customers and add new, never advertised before features and enterprise clients with SLA support contracts, etc.

It's very understandable that it's hard to monetize, but giving people the impression clustering was going to be included long term and taking it away is not going to score you points.

Furthermore, you want all the folks doing a startup on a budget to hear a story that works for them. Most (especially new or young) cofounder programmer types rarely plan to not need clustering, even if the reality is most won't need or use it. Saying this is for the big kids only is a turn-off at this point in the Influx story.

Wish you the best of luck

[+] troyk|10 years ago|reply
I understand the need for businesses to make money, but I do not understand how a business can understand OSS and implement the OSS core-only model. If your product has demand, at some point, an OSS project will arise to displace you.

Are there many winners with this model to name? Maybe NGINX, MongoDB?

[+] seiji|10 years ago|reply
> I do not understand how a business can understand OSS and implement the OSS core-only model

Whenever you see "open core," you should read it as "VCs said we need to do this to keep them happy."

VCs love the idea of how you can give part of something away for free then lock people in to auto-renewing subscription contracts for the whole banana (with fees increasing 7% to 17% every year for ongoing "maintenance"). Many VCs still actively recoil at the thought of giving away everything for free then just hoping the best will happen.

Nginx is a weird example because I'm not sure how well they are doing or who would buy "nginx pro" instead of haproxy+nginx or haproxy+varnish+nginx or some other combination of already existing stuff.

MongoDB is an interesting example because they went with the "provide open software, but with really serious, game-stopping flaws, then force people to pay for support because nothing works" route. That's almost literally bait-and-switch (bait-and-support?). But, that hasn't ended up so well for them. From a recent WSJ article: Fidelity has cut its valuation of MongoDB in eight of the nine quarters since Fidelity made its investment in December 2013, valuing the shares 58% below what it paid.

[+] eknkc|10 years ago|reply
- Varnish comes to my mind. Has Varnish Plus with paid features.
- There's GitLab with CE and EE editions.
- Elasticsearch has a similar approach I think.
- As you mentioned, there is MongoDB, NGINX etc.

I think this is a reasonable model. It's not easy to build / sustain complex software, and while pure open source works for a lot of projects, maybe the majority of projects get neglected or depend on a few dedicated contributors, who often burn out etc.

It also provides some confidence to users (at least to me). I know the project is open source. If the backing company goes evil, a fork can always emerge (iojs comes to mind). And I also know that some people are making money over this software and they will keep it alive / work on it as long as it sustains the business.

[+] rodionos|10 years ago|reply
Axibase Time Series Database is free for pseudo-cluster installations and ships with SQL and visualization included [0].

We also have a fully functional Grafana driver in case the customer prefers it over programmable visualization that we ship.

By programmable visualization I mean a way of building dashboards using toml-flavored configuration [1] and templating language, as opposed to manual design.

[0] https://axibase.com/products/axibase-time-series-database/
[1] https://apps.axibase.com/chartlab/2ef08f32

[+] fangjin|10 years ago|reply
Druid (http://druid.io/druid-powered.html) is another option for similar workloads. Druid is a community-led open source data store used by many technology companies at very large scale. Comes with multiple visualization/open source applications, SQL interfaces, Grafana extensions, and a community to help with issues.
[+] icbm504|10 years ago|reply
It seems like there are 2 OSS models out there: 1) community supported; 2) company led.

Most of the "community supported" projects that I have used are libraries. Since shifting from developer to the devops team, nearly everything that I rely on is "company led": saltstack, docker, grafana, influxdb, ELK, nginx, sensu, rabbitmq, etc.

That being said, we are now re-evaluating the use of influxdb.

[+] siscia|10 years ago|reply
An interesting way to monetize software could be to have people pay to access official docker images.

You want to pull the latest images an unlimited number of times? Sure, it is just XXX$/month

Nothing will stop people from building open source containers of the same product, but those won't be "official" and there won't be any update guarantees...

[+] reubensutton|10 years ago|reply
I wonder if clustering will be available in the cloud service?

We use a single server on Influx's cloud service for some internal metrics and it has been running without any interruptions for >6mo.

[+] pauldix|10 years ago|reply
It will be available in the cloud service in the next few months. But if you want to continue running a single server, that's ok too.