Hi HN! I’ve been building Shard, a browser-powered distributed AI inference network designed to let users contribute compute (via WebGPU) while powerful verifier nodes finalize outputs.
What it is right now
Shard is a functioning early-stage system with:
• Browsers acting as Scout nodes that contribute WebGPU compute
• A libp2p mesh for P2P networking
• Verifier nodes that run stronger local models to validate and finalize inference
• A demo web app you can try live today
• Graceful client fallback when WebGPU isn't available
• A Rust daemon, Python API, and web UI all wired together
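To make the fallback point concrete, here's a minimal sketch of how a browser client might decide whether it can act as a Scout. Only `navigator.gpu` and `requestAdapter()` come from the actual WebGPU standard; the role names and the fallback policy are illustrative assumptions, not Shard's real API.

```typescript
// Sketch: detect WebGPU and pick a role, falling back gracefully.
// `NodeRole` and the fallback strategy are hypothetical, not Shard's code.
type NodeRole = "scout" | "observer";

interface GpuLike {
  gpu?: { requestAdapter(): Promise<unknown | null> };
}

// Decide what role this client can play, given a navigator-like object.
async function pickRole(nav: GpuLike): Promise<NodeRole> {
  if (!nav.gpu) return "observer"; // no WebGPU support at all
  const adapter = await nav.gpu.requestAdapter();
  return adapter ? "scout" : "observer"; // adapter can still be null
}
```

Passing a navigator-shaped object (rather than touching the global directly) keeps the check testable outside a browser.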
It’s essentially a shared inference fabric — think distributed GPU from volunteers’ browsers + stronger hosts that stitch results into reliable responses. The repo includes tooling and builds for desktop, web, and daemon components.
Why it matters
There’s a growing gap between massive models and accessible compute. Shard aims to:
• Harness idle WebGPU in browsers (scouts)
• Validate and “finish” results on robust verifier nodes
• Enable decentralized inference without centralized cloud costs
• Explore community-driven compute networks for AI tasks
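One way the "validate and finish on verifier nodes" step could work is majority agreement with a local recompute as the backstop. This is a hedged sketch under assumed semantics; the function name, quorum threshold, and tie-breaking are my inventions, not Shard's actual protocol.

```typescript
// Sketch: a verifier collects candidate outputs from scouts, accepts the
// majority answer if it clears a quorum fraction, and otherwise falls back
// to recomputing with its own stronger local model. All names hypothetical.
function finalize(
  candidates: string[],
  recompute: () => string,
  quorum = 0.5,
): string {
  const counts = new Map<string, number>();
  for (const c of candidates) counts.set(c, (counts.get(c) ?? 0) + 1);

  let best = "";
  let bestCount = 0;
  for (const [out, n] of counts) {
    if (n > bestCount) { best = out; bestCount = n; }
  }

  // Accept the scouts' majority only if it clears the quorum fraction;
  // otherwise the verifier's local model produces the final answer.
  if (candidates.length > 0 && bestCount / candidates.length > quorum) {
    return best;
  }
  return recompute();
}
```

For exact-match tasks this is straightforward; for free-form LLM output a real system would need semantic comparison rather than string equality.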
This isn’t just a demo — it’s a full stack P2P inference system with transport, networking, and workflow management.
Current limitations
• Early stage, not production hardened
• Needs more tests, documentation, and examples
• Security and incentive layers are future work
• UX around joining scheduler/mesh could improve
Come build with me
If you’re into decentralized compute, AI infrastructure, WebGPU, or mesh networks — I’d love feedback, contributions, and ideas. Let’s talk about where shared inference networks could go next.
Shard isn't about generating content; it's infrastructure for distributed AI compute. Instead of paying cloud providers for GPU time, it lets you tap into spare WebGPU capacity from browsers plus verifier nodes. Think of it as shared computing resources rather than an AI assistant: the assistant is just a way to show that distributed inference works. I also plan to add access to the API.
Repo: https://github.com/TrentPierce/Shard