vulkoingim | 3 months ago

I'm not sure where you saw that VictoriaMetrics uses object storage. It doesn't: it uses block storage and runs completely fine on HDDs; you don't even need SSD/NVMe.

There are multiple ways to deal with ingestion floods. Kafka/distributed log is one of them, but it's not the only one. In cluster mode VM is a distributed set of services that scale out independently and buffer at different levels.
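To make "buffer at different levels" concrete, here is a minimal Python sketch of a single bounded buffer stage that absorbs a burst and sheds load once it is full. It is illustrative only: `BoundedBuffer`, `offer`, and `drain` are hypothetical names, not VictoriaMetrics APIs, and the real components are certainly more involved than this.

```python
from collections import deque


class BoundedBuffer:
    """Hypothetical in-memory buffer stage; not VictoriaMetrics code."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0

    def offer(self, sample):
        # Reject new samples instead of growing without bound during a flood.
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(sample)
        return True

    def drain(self, max_batch):
        # The downstream stage pulls batches at its own pace.
        batch = []
        while self.items and len(batch) < max_batch:
            batch.append(self.items.popleft())
        return batch
```

A rejected `offer` is the signal a sender would use to retry later or buffer on its own side, which is how several such stages in a row could smooth out a flood without a separate log in between.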

Resource usage for ingestion/storage is much lower than with other solutions, and you get more for your money. At $PREVIOUS_JOB, we migrated from a very expensive Thanos setup to a VM cluster backed by HDDs, and saved a lot. Performance was much better as well. It was a while ago, and I don't remember the exact number of time series, but it was meant to handle 10k+ VMs (and a lot of other resources, multiple k8s clusters) and did so with ease (it was easy for everybody involved as well).

I don't think you have really looked into VM - you might be pleasantly surprised by what you find :) Check out this benchmark against Mimir [1] (it is a few years old though), and some case studies [2]. Some of the companies in the case studies run at significantly higher volume than your requirements.

[1] https://victoriametrics.com/blog/mimir-benchmark/

[2] https://docs.victoriametrics.com/victoriametrics/casestudies...

solatic|3 months ago

There were other problems with VictoriaMetrics: a failed migration attempt by previous engineers that made it politically difficult to raise as a possibility, lack of a promise of full PromQL compatibility (too many PromQL dashboards built by too many teams), and seeing features locked behind the Enterprise version (Mimir Enterprise had features added on top, not features locked away).

> HDD

You're right, I'm misremembering here: that particular complaint about a lack of Kafka was a Thanos issue, not a VM one.

That said, HDD is a hard sell to management. It is seen as "not cloud native". People have old trauma from 100% full disks that were not expanded in time. There is also an organizational perception that object storage does not need to be backed up (because redundancy is built into the object storage system) but HDD does (and automated backups are a VM Enterprise feature, and even more important if storing long-term metrics in VM).

> In cluster mode VM is a distributed set of services that scale out independently and buffer at different levels

So are Thanos and Mimir, which suffered from ingest floods causing DoS, at least until Kafka was added. vminsert is billed as stateless, same as Thanos Receiver, same as Mimir Distributor. Not convinced.

valyala|3 months ago

> lack of a promise of full PromQL compatibility (too many PromQL dashboards built by too many teams)

This is classic FUD. VictoriaMetrics is used as a drop-in replacement for Prometheus, Thanos and Mimir. It works perfectly across all the existing dashboards in Grafana, and across all the existing recording and alerting rules. I'm unaware of VictoriaMetrics users who hit PromQL compatibility issues during the migration from Prometheus, Thanos and Mimir to VictoriaMetrics. There are a few deliberate incompatibilities aimed at improving user experience. See https://medium.com/@romanhavronenko/victoriametrics-promql-c...

> seeing features locked behind the Enterprise version (Mimir Enterprise had features added on top, not features locked away)

All the VictoriaMetrics features that are useful across the majority of practical use cases are included in the open-source version. The main Enterprise feature is high-quality technical support from VictoriaMetrics engineers. Other Enterprise features are needed only by large enterprise companies. See https://docs.victoriametrics.com/victoriametrics/enterprise/

I recommend reading real-world case studies from happy users who migrated from other systems (including Prometheus, Thanos and Mimir) to VictoriaMetrics: https://docs.victoriametrics.com/victoriametrics/casestudies...

gopher_space|3 months ago

In the back of my head there’s always the thought of dropping availability once we start discussing mutually exclusive operations.