flaviut | 2 years ago

I've got a small personal project submitting traces/logs/metrics to Clickhouse via SigNoz. Only about 400k-800k spans per day (https://i.imgur.com/s0J6Mzo.png), but running on a single t4g.small with CPU typically at 11% and IOPS at 4%. I also have everything older than a certain number of GB getting pushed to a sc1 cold storage drive.

w/ 1 month retention for traces:

    ┌─parts.table─────────────────┬──────rows─┬─disk_size──┬─engine────┬─compressed_size─┬─uncompressed_size─┬────ratio─┐
    │ signoz_index_v2             │  26902115 │ 17.06 GiB  │ MergeTree │ 6.21 GiB        │ 66.74 GiB         │   0.0930 │
    │ durationSort                │  26901998 │ 5.44 GiB   │ MergeTree │ 5.40 GiB        │ 53.02 GiB         │  0.10190 │
    │ trace_log                   │ 123185362 │ 2.64 GiB   │ MergeTree │ 2.64 GiB        │ 37.96 GiB         │   0.0695 │
    │ trace_log_0                 │ 120052084 │ 2.46 GiB   │ MergeTree │ 2.45 GiB        │ 37.60 GiB         │  0.06528 │
    │ signoz_spans                │  26902115 │ 2.21 GiB   │ MergeTree │ 2.21 GiB        │ 76.73 GiB         │ 0.028784 │
    │ query_log                   │  16384865 │ 1.91 GiB   │ MergeTree │ 1.90 GiB        │ 18.31 GiB         │  0.10398 │
    │ part_log                    │  17906105 │ 846.73 MiB │ MergeTree │ 845.39 MiB      │ 3.84 GiB          │  0.21521 │
    │ metric_log                  │   4713151 │ 820.92 MiB │ MergeTree │ 806.13 MiB      │ 14.56 GiB         │  0.05405 │
    │ part_log_0                  │  15632289 │ 702.82 MiB │ MergeTree │ 701.70 MiB      │ 3.34 GiB          │  0.20490 │
    │ asynchronous_metric_log     │ 795170674 │ 576.24 MiB │ MergeTree │ 562.50 MiB      │ 11.11 GiB         │ 0.049429 │
    │ query_views_log             │   6597156 │ 461.35 MiB │ MergeTree │ 459.75 MiB      │ 6.36 GiB          │  0.07060 │
    │ logs                        │   6448259 │ 408.59 MiB │ MergeTree │ 406.65 MiB      │ 5.99 GiB          │  0.06627 │
    │ samples_v2                  │ 949110122 │ 345.01 MiB │ MergeTree │ 325.31 MiB      │ 22.09 GiB         │ 0.014382 │

If I was less stupid I'd get a machine with the recommended Clickhouse specs and save myself a few hours of tuning, but this works great.
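
For anyone reading the table: the ratio column appears to be compressed size divided by uncompressed size (this is inferred from the numbers, not stated in the comment). A rough Python sketch that recomputes it for a few rows and also estimates compressed bytes per row. The figures are copied from the table, and the helper names are made up for illustration:

```python
# Figures copied from the table above; sizes converted to GiB where needed.
GIB = 1024 ** 3

tables = {
    # name: (rows, compressed_gib, uncompressed_gib)
    "signoz_index_v2": (26_902_115, 6.21, 66.74),
    "signoz_spans": (26_902_115, 2.21, 76.73),
    "logs": (6_448_259, 406.65 / 1024, 5.99),  # 406.65 MiB expressed in GiB
}


def compression_ratio(compressed_gib: float, uncompressed_gib: float) -> float:
    """Compressed size as a fraction of uncompressed size (assumed meaning of 'ratio')."""
    return compressed_gib / uncompressed_gib


def compressed_bytes_per_row(rows: int, compressed_gib: float) -> float:
    """Average on-disk (compressed) bytes per row."""
    return compressed_gib * GIB / rows


for name, (rows, comp, uncomp) in tables.items():
    print(
        f"{name:16} ratio={compression_ratio(comp, uncomp):.4f} "
        f"bytes/row={compressed_bytes_per_row(rows, comp):.0f}"
    )
```

The recomputed ratios land within rounding of the table's values, and signoz_spans works out to roughly 88 compressed bytes per span.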

Downsides:

- clickhouse takes about 5 minutes to start up because my tiny sc1 drive has like 4 IOPS allowed

- signoz's UI isn't amazing. It's totally functional, and they've been improving very quickly, but don't expect datadog-level polish

pranay01|2 years ago

Thanks for mentioning SigNoz, I am one of the maintainers at SigNoz and would love your feedback on how we can improve it further.

If anyone wants to check our project, here’s our GitHub repo - https://github.com/SigNoz/signoz

flaviut|2 years ago

I hope I'm not coming across as negative! Y'all just have a much younger product, and have not had time to do all the polish and tiny tweaks. I'm also much more familiar with Datadog, and sometimes a learning curve feels like missing features.

- I really like your new Logs & Traces Explorers. I spend a lot of time coming up with queries, and having a focused place for that is great. Especially since there's now a way to quickly turn my query into an alert or a dashboard item.

- You've also recently (6mo?) improved the autocomplete dramatically! This is awesome, and one of my annoyances with Datadog

Other feedback, and honestly this is all very minor. I'd be perfectly happy if nothing ever changed.

- where do I go see the metrics? There's no "Metrics" tab the way there's a "Logs" and "Traces" tab. A "Metrics Explorer" would be great.

- when I add a new plot, having to start out with a blank slate is not great. Datadog defaults to a generic system.cpu query just to fill something in, and I find this helpful.

- when I have a plot in a dashboard and I see it is trending in the wrong direction, it would be nice to be able to create an alert directly from the chart rather than have to copy the query over.

- the exceptions tab is very helpful, but I've only recently discovered the LOW_CARDINAL_EXCEPTION_GROUPING flag. It'd be super nice if the variable part of exceptions was automatically detected and they were grouped

- one nice thing in DD is being able to preview a span from a log or logs from a span without opening a new page. Or previewing a span from the global page. Temporarily popping this stuff up in a sidebar would be great.

- I'm not sure if there's a way to view only root spans in the trace viewer.

- This might be a problem with the spring boot instrumentation, but I can't see how to figure out what kind of span it is. Is it a `http.request`, `db.query`, etc?
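
To illustrate the exception-grouping request above: one simple way to detect the "variable part" of a message is to replace things that look like IDs or numbers with placeholders and group on the result. This is only a sketch of the idea, not how SigNoz or the LOW_CARDINAL_EXCEPTION_GROUPING flag works; the patterns and function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical patterns for the variable parts of a message,
# applied in order from most specific to least specific.
_PATTERNS = [
    (
        re.compile(
            r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
            r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
        ),
        "<uuid>",
    ),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<hex>"),
    (re.compile(r"\d+"), "<num>"),
]


def normalize(message: str) -> str:
    """Replace UUIDs, hex literals, and digit runs with placeholders."""
    for pattern, placeholder in _PATTERNS:
        message = pattern.sub(placeholder, message)
    return message


def group_exceptions(messages):
    """Count exception messages by their normalized form."""
    return Counter(normalize(m) for m in messages)


if __name__ == "__main__":
    sample = ["User 1 not found", "User 2 not found", "disk full"]
    for template, count in group_exceptions(sample).most_common():
        print(count, template)
```

With this, "User 1 not found" and "User 2 not found" fall into the same group, while messages that differ in wording stay separate. A real implementation would probably need to consider the exception type and stack trace as well.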