
Show HN: Context Harness – Local first context engine for AI tools

2 points | __parallaxis | 5 days ago | github.com

Context Harness is a single Rust binary that gives AI tools like Cursor and Claude project-specific memory. It ingests docs, code, Jira tickets, Slack threads, and anything else into a local SQLite database, indexes them with FTS5 and optional vector embeddings, and exposes hybrid search via CLI and an MCP-compatible HTTP server.

I built this because I kept hitting the same problem: AI tools are powerful but have no memory of my complex multi-repo project. They can't search our internal docs, past incidents, or architecture decisions. Cloud RAG services exist, but they're complex, expensive, and your data leaves your machine. I wanted something I could point at my sources and just run `ctx sync all`.

Quick start:

    # Install (pre-built binaries available for macOS/Linux/Windows)
    cargo install --git https://github.com/parallax-labs/context-harness.git

    # Create config and initialize
    ctx init

    # Sync your data sources (filesystem, Git, S3, or Lua scripts)
    ctx sync all

    # Search from CLI
    ctx search "how does the auth service validate tokens"

    # Or start the MCP server for Cursor/Claude Desktop
    ctx serve mcp

What it does differently from other RAG tools:

- *Truly local*: SQLite + single binary. No Docker, no Postgres, no cloud. Local embeddings (bundled or pure-Rust) so semantic and hybrid search work with zero API keys. Back up your entire knowledge base with `cp ctx.sqlite ctx.sqlite.bak`.

- *Hybrid search*: FTS5 keyword scoring + cosine vector similarity with configurable blending. Works without embeddings too (keyword-only mode); with local embeddings you get full hybrid search offline.

- *Lua extensibility*: Write custom connectors, tools, and agents in Lua without recompiling anything. The Lua VM has HTTP, JSON, crypto, and filesystem APIs built in.

- *Extension registry*: `ctx registry init` installs a Git-backed community registry with 10 connectors (Jira, Confluence, Slack, Notion, RSS, Stack Overflow, Linear, etc.), 4 MCP tools, and 2 agent personas.

- *MCP protocol*: Cursor, Claude Desktop, Continue.dev, and any MCP-compatible client can connect and search your knowledge base directly.
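
To make the hybrid blending above concrete, here's a rough sketch in Rust. The `alpha` knob and the assumption that both scores are normalized to [0, 1] are mine for illustration; the actual scoring internals may differ:

```rust
// Illustrative sketch of hybrid score blending, not the real internals.
// Assumes the FTS5 keyword score and the vector similarity have both
// been normalized to [0, 1] before blending.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Blend a normalized keyword score with cosine similarity.
/// alpha = 0.0 is keyword-only; alpha = 1.0 is vector-only.
fn hybrid_score(keyword_score: f32, vec_sim: f32, alpha: f32) -> f32 {
    (1.0 - alpha) * keyword_score + alpha * vec_sim
}

fn main() {
    let query = [1.0, 0.0];
    let doc = [1.0, 1.0];
    let sim = cosine(&query, &doc); // about 0.707
    println!("{:.3}", hybrid_score(0.5, sim, 0.5)); // prints "0.604"
}
```

At `alpha = 0.0` this degenerates to pure keyword ranking (the no-embeddings mode); at `alpha = 1.0`, pure vector similarity.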

Embeddings: you can run *fully offline* — the default build uses local embeddings (fastembed with bundled ONNX on most platforms, or a pure-Rust tract path on Linux musl and Intel Mac). No API key required. Optional: Ollama (local LLM stack) or OpenAI if you prefer. Keyword-only mode needs zero deps. There's no built-in auth layer; it's designed for local or trusted network use.

Stack: Rust, SQLite (WAL mode), FTS5, mlua (Lua 5.4), axum, MCP Streamable HTTP. MIT licensed.

GitHub: https://github.com/parallax-labs/context-harness

Docs: https://parallax-labs.github.io/context-harness/

Community Registry: https://github.com/parallax-labs/ctx-registry

If you find it useful, a star on GitHub is always appreciated.

Would love feedback on the search quality tuning (hybrid alpha, candidate counts) and the Lua extension model.
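
To give a flavor of the Lua extension model, here's a rough connector sketch. This is illustrative only: the `http` and `json` globals and the returned table shape are assumptions on my part, not the actual API (see the docs for the real surface).

```lua
-- Hypothetical connector sketch; the real ctx Lua API may differ.
-- Assumed here: built-in `http` and `json` modules, and a sync hook
-- that returns documents as {id, title, body} tables for ingestion.
local function sync(config)
  local resp = http.get(config.base_url .. "/api/items")
  local items = json.decode(resp.body)
  local docs = {}
  for _, item in ipairs(items) do
    docs[#docs + 1] = { id = item.id, title = item.title, body = item.text }
  end
  return docs
end

return { name = "example-feed", sync = sync }
```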

4 comments


fidorka | 5 days ago

Hey Parker, this is really cool! Thanks for sharing. Have you tried using the entire CLI? It might also be a tool which you could compose into your workflow to have better memory of what the agents themselves did in the repo.

Btw, I built something similar to solve the context problem for most of my laptop-based activity.

It's slightly more heavyweight (an Electron app ingesting screenshots). That said, I made many similar design decisions (local embeddings, SQLite with vector search and FTS hybrid, MCP extension to Claude). Feel free to check it out:

https://github.com/deusXmachina-dev/memorylane

__parallaxis | 5 days ago

Hey, thanks for the kind words — and for the CLI/agent-memory idea. I haven’t fully leaned into “index what the agents did in the repo” yet; that’s a great direction. The MCP server gets most of the love so far, but the CLI is the same pipeline (sync → chunk → embed → search), so wiring it in so agents can search their own history is totally doable. I’ll keep that in mind.

I've exercised most of the API surface so far. Check out the usage in this GitHub Actions script: https://github.com/parallax-labs/context-harness/blob/main/s...

That script builds the search index for the website (linked below). The tool is designed not just for local use, but also to be embedded in a CI context.

MemoryLane looks really cool — same problem, different surface. Local embeddings + SQLite + hybrid FTS/vector + MCP into Claude is basically the same stack; the screenshot ingestion and Electron UX are a neat take on "everything I've seen on this machine." I'll definitely poke around the repo. If you want to see how we're using custom agents on top of that pipeline, a couple of blog posts go into it:

- Chat with your blog: https://parallax-labs.github.io/context-harness/blog/enginee... (a persona grounded in your own writing, inline + Lua agents); this is in the same vein. Allowing agents to write into the vector store with an MCP tool is on the roadmap.

- Unified context for your engineering team: https://parallax-labs.github.io/context-harness/blog/enginee... (shared KB, code-reviewer and on-call agents).

jamiecode | 5 days ago

[deleted]

__parallaxis | 5 days ago

Good call on WAL — we do use it. The DB layer explicitly enables WAL on every connection, so we get concurrent readers and a single writer without blocking, which matters when the MCP server is serving search while a sync runs in the same process. Concurrency model: one process (one `ctx serve mcp` or one `ctx sync` at a time) with a small connection pool. Inside that process, reads (search, get) and writes (sync, ingest) can overlap; WAL keeps readers from blocking the writer and the writer from blocking readers. So it's built for single-agent/single-server use in the sense of one Context Harness process, but that process can handle many concurrent MCP clients (all reading) plus the occasional sync (one writer).

If you run multiple processes (e.g. a cron job that runs `ctx sync` while another sync or a long-lived `ctx serve mcp` is also writing), you still get only one writer at a time at the SQLite level. We recommend not overlapping writers across processes — e.g. a cron job that syncs every N hours and doesn't start the next run until the previous one has finished (or uses a lockfile). Our deployment doc says: if you see "database is locked", ensure only one `ctx sync` runs at a time. So: WAL is on and does what you'd expect; multi-process write contention is avoided by design (one sync at a time, no overlapping cron invocations, etc.).
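
For reference, the pieces involved are standard SQLite settings. A rough sketch — the WAL pragma is what we actually set; the timeout and synchronous values here are illustrative, not our exact config:

```sql
PRAGMA journal_mode = WAL;    -- readers don't block the writer, and vice versa
PRAGMA busy_timeout = 5000;   -- wait up to 5s instead of failing with "database is locked"
PRAGMA synchronous = NORMAL;  -- common pairing with WAL: fewer fsyncs, still crash-safe
```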

I might add configurable vector storage later (e.g. plug in something else for the embedding index), but I’m still not sure I need or want it. I like keeping the stack opinionated toward SQLite — one file, one binary, no extra services — so that’s the default for the foreseeable future.