
Show HN: Ayder – HTTP-native durable event log written in C (curl as client)

56 points | Aydarbek | 1 month ago | github.com

Hi HN,

I built Ayder — a single-binary, HTTP-native durable event log written in C. The wedge is simple: curl is the client (no JVM, no ZooKeeper, no thick client libs).
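To make the "curl is the client" wedge concrete, a session might look like the sketch below. The endpoint paths, port, and payload shape here are illustrative assumptions, not Ayder's documented API — see the repo's quick start for the real routes.

```shell
# Illustrative sketch only: paths and payload are assumptions, not Ayder's
# actual API. Assumes a node listening on localhost:8080.
# "|| true" lets the sketch run cleanly even without a server up.
AYDER=http://localhost:8080

# Produce: append one event to a topic
curl -s -X POST "$AYDER/topics/events" \
     -d '{"user":"alice","action":"login"}' || true

# Consume: read 10 events starting at offset 0
curl -s "$AYDER/topics/events/records?offset=0&limit=10" || true
```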

There’s a 2-minute demo that starts with an unclean SIGKILL, then restarts and verifies offsets + data are still there.

Numbers (3-node Raft, real network, sync-majority writes, 64B payload): ~50K msg/s sustained (wrk2 @ 50K req/s), client P99 ~3.46ms. Crash recovery after SIGKILL is ~40–50s with ~8M offsets.

Repo link has the video, benchmarks, and quick start. I’m looking for a few early design partners (any event ingestion/streaming workload).

29 comments


Aydarbek|1 month ago

The demo intentionally starts with SIGKILL to show crash recovery first.

For benchmarks: I used real network (not loopback) and sync-majority writes in a 3-node Raft cluster. Happy to answer questions about tradeoffs vs Kafka / Redis Streams and what’s still missing.

tontinton|1 month ago

Very cool, have you taken a look into what TigerBeetle does with VSR (and why they chose it instead of raft)?

Aydarbek|1 month ago

Yes, I’ve read through TigerBeetle’s VSR design and their rationale for not using Raft.

VSR makes a lot of sense for their problem space: fixed schema, deterministic state machine, and very tight control over replication + execution order.

Ayder has a different set of constraints:

- append-only logs with streaming semantics
- dynamic topics / partitions
- external clients producing arbitrary payloads over HTTP

Raft here is a pragmatic choice: it’s well understood, easier to reason about for operators, and fits the “easy to try, easy to operate” goal of the system.

That said, I think VSR is a great example of what’s possible when you fully own the problem and can specialize aggressively. Definitely a project I’ve learned from.

dagss|1 month ago

Nice to see an HTTP API for consuming events.

I wish there was a standard protocol for consuming event logs, and that all the client side tooling for processing them didn't care what server was there.

I was part of making this:

https://github.com/vippsas/feedapi-spec

https://github.com/vippsas/feedapi-spec/blob/main/SPEC.md

I hope some day there will be a widespread standard that looks something like this.

An ecosystem building on Kafka client libraries with various non-Kafka servers would work fine too, but we didn't figure out how to easily do that.

Aydarbek|1 month ago

This resonates a lot.

I’d love a world where “consume an event log” is a standard protocol and client-side tooling doesn’t care which broker is behind it.

Feed API is very close to the mental model I’d want: stable offsets, paging, resumability, and explicit semantics over HTTP. Ayder’s current wedge is keeping the surface area minimal and obvious (curl-first), but long-term I’d much rather converge toward a shared model than invent yet another bespoke API.

If you’re open to it, I’d be very curious what parts of Feed API were hardest to standardize in practice and where you felt the tradeoffs landed in real systems.

apitman|1 month ago

Love seeing this written in C with an organic, grass-fed Makefile. Any details on why you decided to go that route instead of using something with more hype?

eddd-ddde|1 month ago

That makefile could be made even simpler if it used the implicit rules that compile c files into object files!
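For illustration, a minimal Makefile relying on those built-in rules might look like this. The file names are placeholders, not Ayder's actual sources; the point is that make's built-in `%.o: %.c` rule already compiles with `$(CC) $(CFLAGS) -c`, so per-file compile recipes can be dropped.

```make
# make's built-in rule "%.o: %.c" runs $(CC) $(CFLAGS) -c automatically,
# so no explicit compile recipes are needed.
CC     = cc
CFLAGS = -O2 -Wall -Wextra
OBJS   = main.o log.o raft.o http.o   # placeholder object list

ayder: $(OBJS)
	$(CC) -o $@ $(OBJS)

clean:
	rm -f ayder $(OBJS)
```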

heipei|1 month ago

Thank you for sharing, this looks really cool. The simplicity of setting this up and operating it reminds me a lot of nsq which received a lot less publicity than it should have.

Aydarbek|1 month ago

That’s a great comparison. nsq is a project I have a lot of respect for.

I think there’s a similar philosophy around simplicity and operator experience. Where Ayder diverges is in durability and recovery semantics: nsq intentionally trades some of that off to stay lightweight.

The goal here is to keep the “easy to run” feeling, but with stronger guarantees around crash recovery and replication.

BrouteMinou|1 month ago

That's really interesting. I'm even more eager to get home and check it out.

Thank you for sharing this with us.

Aydarbek|1 month ago

Thanks! If you hit any rough edges getting it running, tell me and I’ll fix the docs/scripts.

ghxst|1 month ago

If you go HTTP-native, could you leverage Range headers for offsets?

Aydarbek|1 month ago

Yes, that maps quite naturally.

Classic HTTP Range is byte-oriented, but custom range units (e.g. `Range: offsets=…`) or using `Link` headers for pagination both fit log semantics well.

I kept the initial API explicit (`offset` / `limit`) to stay obvious for curl users, but offset-range via headers is something I want to experiment with, especially if it helps generic tooling.
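Concretely, the two styles might compare like this. Everything here is a hypothetical sketch — the paths mirror nothing in the actual repo, and the `offsets=` range unit is not implemented; HTTP's range-unit grammar simply permits units other than `bytes`.

```shell
# Hypothetical sketch: paths and the "offsets=" range unit are assumptions.
# "|| true" lets this run cleanly without a live server.
AYDER=http://localhost:8080

# Current explicit style: offset/limit as query parameters
curl -s "$AYDER/topics/events/records?offset=100&limit=100" || true

# Hypothetical custom range unit covering the same 100 records
curl -s -H "Range: offsets=100-199" "$AYDER/topics/events/records" || true
```

Note the mapping: `offset=100&limit=100` corresponds to an inclusive range ending at offset 199.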

roywiggins|1 month ago

> No manual intervention. No partition reassignment. No ISR drama.

> Numbers are real, not marketing.

I'm not questioning the actual benchmarks or anything, but this README is substantially AI generated, yeah?

Aydarbek|1 month ago

Fair question.

The benchmarks, logs, scripts, and recovery scenarios are all real and hand-run; that’s the part I care most about being correct.

For the README text itself: I did iterate on wording and structure (including tooling), but the system, measurements, and tradeoffs are mine.

If any part reads unclear or misleading, I’m very open to tightening it up. Happy to clarify specifics.

mgaunard|1 month ago

Are those performance measurements meant to be impressive? Seems on par with something thrown together with Python in 5 minutes.

Aydarbek|1 month ago

Totally fair. If this were a “single-node HTTP handler on localhost”, then yeah, you can hit big numbers quickly in many stacks.

The point of these numbers is the envelope: 3-node consensus (Raft), real network (not loopback), and sync-majority writes (ACK after 2/3 replicas), plus the crash/recovery semantics (SIGKILL → restart → offsets/data still there).

If you have a quick Python setup that does majority-acked replication + fast crash recovery with similar measurements, I’d honestly love to compare apples to apples. Happy to share exact scripts/config and run under the same test conditions.