
Show HN: Kekkai – a simple, fast file integrity monitoring tool in Go

59 points | catatsuy | 5 months ago | github.com

I built a tool called *Kekkai* for file integrity monitoring in production environments. It records file hashes during deployment and later verifies them to detect unauthorized modifications (e.g. from OS command injection or tampering).

Why it matters:

* Many web apps (PHP, Ruby, Python, etc.) on AWS EC2 need a lightweight way to confirm their code hasn’t been changed.
* Traditional approaches that rely on file metadata often create false positives.
* Kekkai checks only file content, so it reliably detects real changes.
* I’ve deployed it to an EC2 PHP application in production, and it’s working smoothly so far.

Key points:

* *Content-only hashing* (ignores timestamps/metadata; see the sketch after this list)
* *Symlink protection* (detects swaps/changes)
* *Secure S3 storage* (deploy servers write-only, app servers read-only)
* *Single Go binary* with minimal dependencies
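To make the content-only idea concrete, here’s a minimal Go sketch of that style of hashing (an illustration of the general technique, not Kekkai’s actual code): walk a directory and digest file bytes with SHA-256, so timestamps, ownership, and permissions can never cause a mismatch. The function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
)

// hashFile digests file content only: mtime, mode, and owner are
// never read, so touching a file without changing its bytes cannot
// produce a false positive.
func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	root := os.Args[1] // e.g. the deployed application directory
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil || !d.Type().IsRegular() {
			return err // skip directories, symlinks, devices
		}
		sum, hashErr := hashFile(path)
		if hashErr != nil {
			return hashErr
		}
		fmt.Printf("%s  %s\n", sum, path)
		return nil
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```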

Would love feedback from others running apps on EC2 or managing file integrity in production.

16 comments


teraflop|5 months ago

I don't really understand the use case for this. Despite all the details in the README, there are only a couple sentences devoted to describing what it's actually for, and they don't make much sense to me.

You're assuming that an attacker already has access to your system, and you want to detect any changes they make to certain files.

If you are dealing with a relatively unsophisticated attacker, surely it would be easier to just mount the data that shouldn't be changed on a read-only filesystem, or set the immutable bit?

And if the attacker is sophisticated, surely they could just disable the verifier? Or replace it with a no-op that doesn't actually check hashes?

> Many web apps (PHP, Ruby, Python, etc.) on AWS EC2 need a lightweight way to confirm their code hasn’t been changed.

I don't think this is true, any more than the square-root function needs a way to confirm that its argument hasn't been tampered with. You're solving the problem in the wrong place. It seems like security theater.

abhas9|5 months ago

You're right that FIM assumes the possibility of compromise, but that's exactly the point - it's a detection control, not a prevention control. Prevention (read-only mounts, immutable bits, restrictive permissions, etc.) is necessary but not sufficient. In practice, attackers often find ways around those measures - for example, through misconfigured deployments, command injection, supply chain attacks, or overly broad privileges.

File Integrity Monitoring gives you a way to prove whether critical code or configuration has been changed after deployment. That’s valuable not only for security investigations but also for compliance.

For example, PCI DSS (Payment Card Industry Data Security Standard) explicitly requires this. Requirement 11.5.2 states:

"Deploy a change-detection mechanism (for example, file-integrity monitoring tools) to alert personnel to unauthorized modification of critical content files, configuration files, or system binaries."

Sure, a "sufficiently advanced" attacker could try to tamper with the monitoring tool, but (1) defense in depth is about making that harder, and (2) good implementations isolate the baseline and reports (e.g. write-only to S3, read-only on app servers), which raises the bar considerably.

smartmic|5 months ago

I posted about AIDE a few weeks ago. I have not checked how that compares to this submission:

https://news.ycombinator.com/item?id=44688636

catatsuy|5 months ago

AIDE is a solid and mature tool. Kekkai focuses on being lightweight:

* content-only hashing to avoid false positives,
* S3 integration with strict write/read separation,
* a single Go binary with minimal dependencies.

It’s designed to be easy to deploy and run in production.

irq-1|5 months ago

How is this different from sha256sum (and variants)? Create, store and check file hashes?

catatsuy|5 months ago

Conceptually it’s the same as sha256sum, but Kekkai automates the workflow:

* hashes recorded automatically at deploy,
* stored in S3 with write/read separation,
* verification runs regularly.

It saves you from scripting all of that by hand.
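As an illustration of the verify step in that workflow, here’s a minimal Go sketch. The manifest format (a JSON map of path to SHA-256 hex digest) is an assumption for illustration, not Kekkai’s actual format, and the baseline is read from disk here for brevity rather than fetched from S3.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// hashFile is the same content-only hash as in the earlier sketch.
func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	// Load the baseline recorded at deploy time.
	raw, err := os.ReadFile("baseline.json")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	baseline := map[string]string{} // path -> expected hex digest
	if err := json.Unmarshal(raw, &baseline); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	ok := true
	for path, want := range baseline {
		got, err := hashFile(path)
		switch {
		case err != nil:
			fmt.Printf("MISSING  %s (%v)\n", path, err)
			ok = false
		case got != want:
			fmt.Printf("CHANGED  %s\n", path)
			ok = false
		}
	}
	if !ok {
		os.Exit(1) // non-zero exit so cron/systemd can alert
	}
	fmt.Println("all files match baseline")
}
```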

huhtenberg|5 months ago

> a simple, fast ... tool

What does "fast" mean here? Fast compared to what?

catatsuy|5 months ago

By fast I mean two things:

* Files are hashed in parallel, so large sets can be processed quickly.
* On repeated runs, unchanged files skip hashing with a default 90% probability using a cache, so roughly one in ten cached files is still re-hashed on every run. This keeps checks lightweight even at scale.
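Here’s a rough Go sketch of how those two optimizations might fit together; the worker-per-CPU pool, cache shape, and skip logic are modeled on the description above, not taken from Kekkai’s source.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"math/rand"
	"os"
	"runtime"
	"sync"
)

type result struct {
	path string
	sum  string
	err  error
}

const skipProbability = 0.9 // default per the description above

func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// hashAll fans paths out to one worker per CPU. Files present in
// the cache are skipped with probability skipProbability, so a
// stale cache entry cannot hide a change forever: roughly 10% of
// cached files are still re-hashed on every run.
func hashAll(paths []string, cache map[string]string) []result {
	jobs := make(chan string)
	out := make(chan result)

	var wg sync.WaitGroup
	for i := 0; i < runtime.NumCPU(); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range jobs {
				if sum, ok := cache[p]; ok && rand.Float64() < skipProbability {
					out <- result{path: p, sum: sum} // trust the cache this run
					continue
				}
				sum, err := hashFile(p)
				out <- result{path: p, sum: sum, err: err}
			}
		}()
	}

	go func() {
		for _, p := range paths {
			jobs <- p
		}
		close(jobs)
		wg.Wait()
		close(out)
	}()

	var results []result
	for r := range out {
		results = append(results, r)
	}
	return results
}

func main() {
	cache := map[string]string{} // previous run's digests, path -> hex
	for _, r := range hashAll(os.Args[1:], cache) {
		if r.err != nil {
			fmt.Fprintln(os.Stderr, r.err)
			continue
		}
		fmt.Printf("%s  %s\n", r.sum, r.path)
	}
}
```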