item 47042729

Shard – A Distributed P2P AI Network for Shared Inference

1 point | tpierce89 | 13 days ago | github.com

4 comments


tpierce89|13 days ago

Hi HN! I’ve been building Shard, a browser-powered distributed AI inference network designed to let users contribute compute (via WebGPU) while powerful verifier nodes finalize outputs.

What it is right now

Shard is a functioning early-stage system that includes:
• Browser-based Scout nodes that contribute WebGPU compute
• A libp2p mesh for P2P networking
• Verifier nodes that run stronger local models to validate and finalize inference
• A demo web app you can try live today
• Graceful client fallback when WebGPU isn't available
• A Rust daemon, Python API, and web UI, all wired together
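The graceful-fallback point boils down to a capability check before a browser joins as a Scout. A minimal sketch (not Shard's actual code; `detectBackend` and the backend names are illustrative):

```typescript
// Hypothetical capability check for a browser Scout node.
// navigator.gpu is the WebGPU entry point; it is absent in browsers
// without WebGPU support and in non-browser runtimes like Node.
type ComputeBackend = "webgpu" | "cpu-fallback";

function detectBackend(): ComputeBackend {
  const nav = (globalThis as { navigator?: { gpu?: unknown } }).navigator;
  return nav?.gpu ? "webgpu" : "cpu-fallback";
}
```

A client would branch on this once at startup: contribute GPU compute when `"webgpu"` is reported, otherwise stay in a consume-only or CPU role.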

It’s essentially a shared inference fabric — think distributed GPU from volunteers’ browsers + stronger hosts that stitch results into reliable responses. The repo includes tooling and builds for desktop, web, and daemon components.
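One way a verifier can "stitch" redundant scout outputs into a single reliable response is a quorum rule: only finalize when a strict majority of scouts agree. The repo isn't quoted here on its exact finalization rule, so the sketch below is an assumed, illustrative scheme (names like `ScoutResult` and `finalize` are hypothetical):

```typescript
// Hypothetical verifier-side finalization by strict majority vote.
interface ScoutResult {
  scoutId: string;
  output: string;
}

function finalize(results: ScoutResult[]): string | null {
  if (results.length === 0) return null;

  // Tally identical outputs across scouts.
  const counts = new Map<string, number>();
  for (const r of results) {
    counts.set(r.output, (counts.get(r.output) ?? 0) + 1);
  }

  // Find the most common output.
  let best: string | null = null;
  let bestCount = 0;
  for (const [output, count] of counts) {
    if (count > bestCount) {
      best = output;
      bestCount = count;
    }
  }

  // Finalize only on a strict majority; otherwise the verifier
  // would re-run the job itself or request more scouts.
  return bestCount * 2 > results.length ? best : null;
}
```

In a real system the comparison would likely be fuzzier than string equality (token-level or embedding similarity), since sampling makes LLM outputs non-deterministic.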

Why it matters

There’s a growing gap between massive models and accessible compute. Shard aims to:
• Harness idle WebGPU in browsers (scouts)
• Validate and “finish” results on robust verifier nodes
• Enable decentralized inference without centralized cloud costs
• Explore community-driven compute networks for AI tasks

This isn’t just a demo — it’s a full-stack P2P inference system with transport, networking, and workflow management.

Current limitations
• Early stage, not production-hardened
• Needs more tests, documentation, and examples
• Security and incentive layers are future work
• UX around joining the scheduler/mesh could improve

Come build with me

If you’re into decentralized compute, AI infrastructure, web GPU, or mesh networks — I’d love feedback, contributions, and ideas. Let’s talk about where shared inference networks could go next.

Repo: https://github.com/TrentPierce/Shard

verdverm|13 days ago

If you can do this with AI so easily, why do I want to use yours instead of the one my AI generates?

tpierce89|13 days ago

Shard isn't about generating content; it's infrastructure for distributed AI compute. Instead of paying cloud providers for GPU time, it lets you tap into spare WebGPU capacity from browsers plus verification nodes. Think of it as shared computing resources rather than an AI assistant. The assistant is just a way to show that distributed inference works. I also plan to add access to the API.