pg_timeseries: Open-source time-series extension for PostgreSQL

331 points| samaysharma | 1 year ago |tembo.io

82 comments

[+] riedel|1 year ago|reply
>You may already be asking: “why not just power the stack using TimescaleDB?” The Timescale License would restrict our use of features such as compression, incremental materialized views, and bottomless storage. With these missing, we felt that what remained would not provide an adequate basis for our customers’ time-series needs. Therefore, we decided to build our own PostgreSQL-licensed extension.

I have used the free version of TimescaleDB before to shard a 500-million-observation time-series database. It worked as a drop-in without much hassle. I would have expected some benchmarks and comparisons in the post. I will certainly watch this...

[+] osigurdson|1 year ago|reply
500 million is very little however. A regular table with a covering index would probably be fine for many use cases with this number of points.
[+] rapsey|1 year ago|reply
Databases are a tough business. You're just waiting for open source to eat your lunch.
[+] remram|1 year ago|reply
500 million observations, with 4-byte floats, is 2 GB. This is the kind of size that you can store uncompressed, in RAM, on a phone. It is hardly at the point where you require specialized time-series software at all.
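
A quick check of that arithmetic, as a minimal sketch. The 8-byte timestamp per row is an assumption added here for illustration (the estimate above counts only the float values), and `raw_size_bytes` is a hypothetical helper:

```python
def raw_size_bytes(n_rows: int, bytes_per_value: int, bytes_per_timestamp: int = 0) -> int:
    """Uncompressed payload size, ignoring any per-row storage overhead."""
    return n_rows * (bytes_per_value + bytes_per_timestamp)


N_OBSERVATIONS = 500_000_000

# Values only: 4-byte floats, as in the estimate above.
values_only = raw_size_bytes(N_OBSERVATIONS, 4)

# Assumption: each observation also carries an 8-byte timestamp.
with_timestamps = raw_size_bytes(N_OBSERVATIONS, 4, 8)

print(values_only / 1e9)                # 2.0 (GB, decimal)
print(round(values_only / 2**30, 2))    # 1.86 (GiB)
print(with_timestamps / 1e9)            # 6.0 (GB, decimal)
```

Even with the assumed timestamp column the raw data stays in single-digit gigabytes, although a real table adds per-row and index overhead on top of this.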
[+] mnahkies|1 year ago|reply
Looking at their roadmap, the killer feature for me would be incremental materialised views

> Incremental view maintenance — define views which stay up-to-date with incoming data without the performance hit of a REFRESH

I wonder if they plan to incorporate something like https://github.com/sraoss/pg_ivm or write their own implementation.

(Although I'm hopeful that one day we see ivm land in postgres core)

[+] techoffs|1 year ago|reply
Former Timescaler here.

It's about time that Timescale started getting what it deserves.

Sometime in early 2022, just as they raised their Series C, leadership decided that they had gotten what they wanted from the open-source community and TimescaleDB. They decided it was time to focus 100% on Timescale Cloud. Features began to become exclusive to Timescale Cloud, and the self-hosted TimescaleDB was literally treated as competition. At the same time, they managed to spoil their long-time PaaS partnership with Aiven, which was (and still is) a major source of revenue for the company. The reason? Everyone needed to use Timescale Cloud and give their money to Timescale, thus making Aiven a competitor. In short, with the raising of Series C, Timescale stopped being an OSS startup and began transitioning to a money hungry corporation.

In 2023, they conducted two rounds of layoffs, even though the company was highly profitable. Recently, Planetscale also carried out layoffs in a similarly harsh manner as Timescale, but at least Planetscale had the "courtesy" to address this with two sentences in their PR statement about company restructuring. Timescale did not even do that; they kept it all quiet. Out of 160 employees, around 65 were laid off. The first round of layoffs occurred in January, and the second in September. No warnings. No PIPs. Just an email informing you that you no longer work for them. Many of the affected employees were in the middle of ongoing projects. The CEO even mentioned in the in-house memo how they diligently worked on the September layoff throughout the summer. Interestingly, many of these employees were hired by competitors like Supabase and Neon. It’s worth emphasizing that this was not a financial issue—Timescale is far from having such problems. Instead, it was a restructuring effort to present nice financial numbers during ongoing turbulence in the tech market. (And yes, you guessed it! Timescale also hired their first CFO a couple of months before the first layoffs.)

You might say that it's just business, but as an OSS startup, I expect them to live by the values they have advertised over the years and treat their users and employees much better than they currently do. With this in mind, I welcome Tembo as a new player in the time-series market.

Footnotes: Timescale = the company. TimescaleDB = OSS time-series database developed by Timescale. Timescale Cloud = TimescaleDB managed by Timescale on AWS.

[+] redwood|1 year ago|reply
Knowing little about the company, I'd still say it's almost certainly untrue that "the company was highly profitable"; you don't hire 160 people in an OSS-centric business and also turn a profit. You likely have a distorted understanding of how challenging the burn rate is in a changing macro environment.
[+] akulkarni|1 year ago|reply
Ajay, Timescale CEO and co-founder, here.

It saddens me to see that we have generated so much ill will from you. It sounds like you were affected by our layoffs last year. You have every right to be upset. If you ever want to chat about this 1:1, you know how to reach me. I’d be happy to make the time.

To anyone else reading this: Some of what this person has shared is true, but some of it is not true.

I debated whether or not to reply. But one of my personal leadership values is “transparency”, so I thought I’d take the time to respond.

Yes, we conducted two rounds of layoffs in 2023. Like many tech companies, we hired a lot in 2021 and early 2022. Then, as the tech market began to correct mid 2022, we were forced to make tough decisions, including layoffs.

I take responsibility for the over-hiring and the layoffs. It brought me no joy to do them. But I feel a moral obligation to our customers to stay on the path of financial sustainability. I also feel a fiduciary obligation to our investors, some of whom are individuals, some of whom are large funds, who have all trusted us with their money. I feel a similar responsibility to current and former Timescalers who own equity in Timescale.

Sometimes, that means making tough decisions like this. But again, it was my call (not anyone else), and I accept full responsibility.

Yes, we did not publicize this news. Frankly, we thought we were too small for others to care. Maybe we got that wrong. But that decision came from a place of humility.

This is not true: “Just an email informing you that you no longer work for them.” Every affected person – except for a handful who were not working that day – was told the news individually, on a live Zoom call, that included at least one of our executives or a member of our People team. For the few teammates who were not working that day, we made many attempts to connect with them personally. I know the team tried their best to approach these hard conversations with care and empathy.

I was glad to see that a number of the affected individuals quickly found new roles at other companies in the PostgreSQL ecosystem, including at Supabase, Neon, and Tembo. These are good, smart people. The PostgreSQL ecosystem is better off with these people continuing to work to improve PostgreSQL.

The comments questioning our belief in open source are also not true. We still believe in open source. The core of TimescaleDB is still open source. Some of the advanced features are under a free, source-available license. Our latest release – TimescaleDB 2.15 – was just two weeks ago. Unlike most (all?) of our competitors, we have never re-licensed our open source software. This is something that is true for us but not for many others, like MongoDB, Elastic, Redis, Hashicorp, Confluent, etc.

Yes, we are building a self-sustaining open source business. Yes, it is hard and sometimes we get things wrong. But we have never stopped investing in our community. Today the TimescaleDB community (open source and free) is 20x larger than our customer base. And this community has more than doubled in the past 1+ year. We are also planning significant open source contributions for the next few months.

To the author of this post: I hope this response provides some clarification. And again, I’m available to chat one-on-one if you’d like.

To our open source and free community users, and to our customers: thank you for trusting us with your workloads. We are committed to serving you.

Finally, to the Timescale team, both current and former: thank you for all your hard work making developers successful. We are here to serve developers so that they can build the future. The road won’t always be easy or smooth. But we are committed, and we will get there.

[+] MuffinFlavored|1 year ago|reply
Dumb question: why can't I just insert a bunch of rows with a timestamp column and indices? Where does that fall short? At a certain # of rows or something?

What does this let me do that can't be achieved with "regular PostgreSQL without the extension"?

[+] skibbityboop|1 year ago|reply
I'm with you; I need to read up more on where time series could benefit. At work we have a PostgreSQL instance with around 27 billion rows in a single partitioned table, partitioned by week. It goes back to January of 2017 and just contains tons of data coming in from sensors. It's not "fast", but it's also not ridiculously slow to ask, e.g., "Give me everything for sensor 29380 in March of 2019".

I guess it depends on your needs, but I do think I need to investigate time series more to see if it'd help us.
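
To illustrate why a weekly-partitioned table can answer a query like that without scanning everything, here is a minimal sketch of selecting partitions by date range. It assumes partitions start on Mondays and are identified by their start date; both are assumptions for illustration, since the comment only says the table is partitioned by week. `week_start` and `partitions_for_range` are hypothetical helpers, not PostgreSQL APIs:

```python
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Monday of the week containing d (assumed partition boundary)."""
    return d - timedelta(days=d.weekday())


def partitions_for_range(start: date, end_exclusive: date) -> list[date]:
    """Start dates of the weekly partitions overlapping [start, end_exclusive)."""
    partitions = []
    current = week_start(start)
    while current < end_exclusive:
        partitions.append(current)
        current += timedelta(days=7)
    return partitions


# "Everything for sensor 29380 in March of 2019"
march_2019 = partitions_for_range(date(2019, 3, 1), date(2019, 4, 1))
print(len(march_2019))   # 5
print(march_2019[0])     # 2019-02-25
print(march_2019[-1])    # 2019-03-25
```

Under these assumptions a one-month query touches only five weekly partitions out of several years' worth, which is the kind of pruning that keeps such a lookup tolerable on plain partitioned PostgreSQL.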

[+] ishikawa|1 year ago|reply
There are several good articles explaining this, especially on the Timescale blog, but in short: with just an index and no time partitioning, read and write performance degrades exponentially past a certain point.
[+] gonzo41|1 year ago|reply
Time based partitioning.
[+] nitinreddy88|1 year ago|reply
Most time-series queries (almost all of them) are aggregate queries. Why not leverage or build a top-notch columnar store for them?

Everything seems to be there, so why isn't there a first-class product like ClickHouse on PG?

[+] netik|1 year ago|reply
The gold standard for this is Druid at very large scale, or ClickHouse. ClickHouse has a lot of problems with modifying/scaling shards after the fact, while Druid handles this with ease (with the penalty of not being able to update data after the fact).
[+] PeterZaitsev|1 year ago|reply
Great to see this kind of innovation. PostgreSQL is interesting: while the "core" has always been open source under a very permissive open-source license, there have been many proprietary and source-available extensions, ranging from replication to time-series support.

Now we see those Proprietary extensions being disrupted by proper Open Source!

[+] vantiro|1 year ago|reply
PostgreSQL licensed, good move!
[+] plainOldText|1 year ago|reply
Your site is very well designed and easy to read btw, and the app UI looks great from the demo photos. I might try it!
[+] nhourcard|1 year ago|reply
Interesting release, it feels that the time-series database landscape is evolving toward:

a) columnar store & built from scratch, with convergence toward open formats such as parquet & arrow: influxdb 3.0, questdb

b) Adding time-series capabilities on top of Postgres: timescale, pg_timeseries

c) platforms focused on observability around the Prometheus ecosystem: grafana, victoria metrics, chronosphere

[+] wdb|1 year ago|reply
Would this be a good extension when you want to store load balancer log entries (status, response body, headers, etc.)?

I think a columnar store would be more efficient than a normal row-based database? Load balancer log entries could be considered similar to analytics events.

[+] samaysharma|1 year ago|reply
Yes. Columnar is integrated with pg_timeseries already.
[+] gxyt6gfy5t|1 year ago|reply
How’s it different than timescaledb?
[+] logrot|1 year ago|reply
> You may already be asking: “why not just power the stack using TimescaleDB?” The Timescale License would restrict our use of features such as compression, incremental materialized views, and bottomless storage. With these missing, we felt that what remained would not provide an adequate basis for our customers’ time-series needs. Therefore, we decided to build our own PostgreSQL-licensed extension.
[+] hosh|1 year ago|reply
The use of postgresql licensing might mean we can see this available for AWS RDS and other managed PostgreSQL providers.
[+] vantiro|1 year ago|reply
Timescaledb's license is more like Redis' new license?
[+] valenterry|1 year ago|reply
It's about time that Postgres (and other databases) added native append-only tables. That doesn't make it a time-series database, but it would probably help with standardization and all the logic/access around it.
[+] ramoneguru|1 year ago|reply
How does this stack up against something like what QuestDB offers?
[+] suyash|1 year ago|reply
Interesting, how does it compare to proper (open source) time series database like InfluxDB other than being 'Postgres' like ?
[+] tucnak|1 year ago|reply
It's not Postgres-like, it _is_ Postgres
[+] RedShift1|1 year ago|reply
InfluxDB is a "proper" time series database?
[+] matthewmueller|1 year ago|reply
Would love to use this with RDS!
[+] rywalker|1 year ago|reply
Tembo CEO here - we are targeting feature parity for Tembo Cloud w/ RDS as soon as possible, would love to have you give Tembo a try sometime, give us feedback :)

Tembo Cloud is a standard SaaS offering, and our new Tembo Self Hosted (https://tembo.io/docs/product/software/tembo-self-hosted/ove...) allows you to run the same software that powers our SaaS, but in your own K8s cluster.

[+] anentropic|1 year ago|reply
Same here, but not holding my breath since all these neat Postgres extensions compete with other AWS DBs like Redshift, Timestream etc