jtlisi | 4 years ago
It does a good job laying out why TSDBs are used and some of the tricks they leverage to store this type of data. See the requirements for the service laid out in the paper:
• 2 billion unique time series identified by a string key.
• 700 million data points (time stamp and value) added per minute.
• Store data for 26 hours.
• More than 40,000 queries per second at peak.
• Reads succeed in under one millisecond.
• Support time series with 15 second granularity (4 points per minute per time series).
• Two in-memory, not co-located replicas (for disaster recovery capacity).
• Always serve reads even when a single server crashes.
• Ability to quickly scan over all in memory data.
• Support at least 2x growth per year.
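One of the core storage tricks in the paper is delta-of-delta timestamp encoding: since scrapes arrive at a regular cadence (e.g. every 15 seconds), the delta between consecutive deltas is almost always zero, which compresses down to roughly a bit per point. A simplified sketch of the idea, ignoring the paper's variable-length bit packing:

```python
def delta_of_delta_encode(timestamps):
    """Encode timestamps as [first, dod, dod, ...].
    Regular scrape intervals make most delta-of-deltas zero,
    which is what Gorilla exploits for compression."""
    if not timestamps:
        return []
    encoded = [timestamps[0]]
    prev_delta = 0
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        encoded.append(delta - prev_delta)  # usually 0 on a steady cadence
        prev_delta = delta
    return encoded

def delta_of_delta_decode(encoded):
    """Invert the encoding back to the original timestamps."""
    if not encoded:
        return []
    timestamps = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        timestamps.append(timestamps[-1] + delta)
    return timestamps

# A 15-second scrape with a little jitter: the encoded stream is
# dominated by zeros and small integers.
ts = [1000, 1015, 1030, 1046, 1060]
print(delta_of_delta_encode(ts))  # [1000, 15, 0, 1, -2]
```

The paper pairs this with XOR-based compression of the float values themselves; the sketch above only illustrates the timestamp half.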
Lots of organizations want to adopt an SRE/DevOps model and want a similar system. Also worth knowing: accomplishing this with a traditional DBMS is usually possible, but because it doesn't make the same specialized trade-offs, it tends to be more expensive and requires a lot of tuning/expertise.
Lots of organizations (even legacy companies) have a massive need for this kind of service. There are also very cheap options out there that can handle the million-metric use case for under $100 a month in infra costs. The use case is definitely there, and even if it's possible with traditional DBMS systems, it's usually cheaper and more performant to use a dedicated TSDB.