top | item 30435023

(no title)

mfringel | 4 years ago

How does Timescale handle lookup tables?

That is, "I have this seldomly-updated list of ~10000 things, and I'm going to need to join it against my time-series data."

With other time-series databases I've dealt with, it's an afterthought at best and the answer is typically "Enrich the data via flink/benthos/etc. on import and avoid using any kind of join."

Does Timescale's use of PostgreSQL circumvent this issue, both in terms of storage of lookup tables, and performance on join?

discuss

mfreed|4 years ago

Yes, we support the rich set of PostgreSQL's JOIN operations, including against hypertables. It's generally smart enough to only apply these JOINs against the right subset of time-series data if you also have any time predicates (due to the way we perform "constraint exclusion" against our hypertable chunks).

There are other common queries related to what you describe, like a "last point query": Tell me the last record for each distinct object. Here, for example, we've built special query optimizations to significantly accelerate such queries:

https://www.timescale.com/blog/how-we-made-distinct-queries-...

manigandham|4 years ago

There are plenty of distributed relational columnstores that can do joins. Timescale is bringing that to Postgres but you already have options from Clickhouse to Redshift.

mfringel|4 years ago

Sure, but the ability to do so is separate from whether it's a good idea or not.