
Show HN: Web RAG to generate Perplexity-like answers from your docs [in browser]

How It Works

- Offline Indexing: Docs are processed and embedded using the GTE-small model at build time.

Browser-Based Magic:

- SQLite database (stored in the browser) for vector search.

- Local embedding model for query processing.

- Local LLaMA model for response generation using WebLLM.

- Everything Happens Locally: No data leaves the user’s device.
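
For anyone curious what the in-browser retrieval step might look like, here is a rough TypeScript sketch. It is not the project's actual code: the `DocChunk` shape and the `cosineSimilarity`, `topK`, and `buildPrompt` helpers are hypothetical names made up for illustration. It also assumes the chunk embeddings are already loaded into memory as plain arrays and does a brute-force cosine scan instead of querying SQLite.

```typescript
// Hypothetical shape of one indexed chunk; the real schema is not shown in the post.
interface DocChunk {
  id: number;
  text: string;
  embedding: number[]; // assumed to be precomputed at build time (e.g. by GTE-small)
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions do not match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as having no similarity to anything.
  return denom === 0 ? 0 : dot / denom;
}

// Brute-force search: score every chunk against the query, keep the k best.
function topK(queryEmbedding: number[], chunks: DocChunk[], k: number): DocChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Assemble a prompt for the local LLM from the retrieved chunks.
// The wording of the instruction is an assumption, not the project's prompt.
function buildPrompt(question: string, context: DocChunk[]): string {
  const sources = context.map((c, i) => `[${i + 1}] ${c.text}`).join("\n");
  return [
    "Answer using only the sources below and cite them by number.",
    "",
    "Sources:",
    sources,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

In a real setup the query embedding would come from the local embedding model and the prompt would be handed to the WebLLM-hosted model. Both steps are left out here because the post does not show how they are wired up.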

Key Benefits

- No API Costs: Everything runs in the browser—zero backend expenses.

- Unlimited Chats: No rate limits or usage restrictions.

- Privacy-First: Your data stays on your device, always.

1 comment

Excited to see this shared! Love how everything runs right in the browser - makes the whole experience super smooth for users.