top | item 43930575

(no title)

andrewpareles | 9 months ago

Thanks for the feedback. We'll definitely add a feature list. To answer your question, yes - we support Cursor's features (quick edits, agent mode, chat, inline edits, links to files/folders, fast apply, etc) using open source and openly-available models (for example, we haven't trained our own autocomplete model, but you can bring any autocomplete model or "FIM" model).

We don't have a repomap or codebase summary - right now we're relying on .voidrules and Gather/Agent mode to look around to implement large edits, and we find that works decently well, although we might add something like an auto-summary or Aider's repomap before exiting Beta.

Regarding context - you can customize the context window and reserved amount of token space for each model. You can also use "@ to mention" to include entire files and folders, limited to the context window length. (you can also customize the model's reasoning ability, think tags to parse, tool use format (gemini/openai/anthropic), FIM support, etc).

discuss

order

throwup238|9 months ago

An important Cursor feature that no one else seems to have implemented yet is documentation indexing. You give it a base URL and it crawls and generates embeddings for API documentation, guides, tutorials, specifications, RFCs, etc in a very language agnostic way. That plus an agent tool to do fuzzy or full text search on those same docs would also be nice. Referring to those @docs in the context works really well to ground the LLMs and eliminate API hallucinations

Back in 2023 one of the cursor devs mentioned [1] that they first convert the HTML to markdown then do n-gram deduplication to remove nav, headers, and footers. The state of the art for chunking has probably gotten a lot better though.

[1] https://forum.cursor.com/t/how-does-docs-crawling-work/264/3

mapmap|9 months ago

The continue.dev plugin for Visual Studio Code provides documentation indexing. You provide a base URL and a tag. The plugin then scrapes the documentation and builds a RAG index. This allows you to use the documentation as context within chat. For example, you could ask @godotengine what is a sprite?

lgiordano_notte|9 months ago

Cursor’s doc indexing is acc one of the few AI coding features that feels like it saves time. Embedding full doc sites, deduping nav/header junk, then letting me reference @docs inline actually improves context grounding instead of guessing APIs.

steveharman|9 months ago

Just use the Context7 MCP ? Actually I'm assuming Void supports MCP.

andrewpareles|9 months ago

This is a good point.We've stayed away from documentation assuming that it's more of a browser agent task, and I agree with other commenters that this would make a good MCP integration.

I wonder if the next round of models trained on tool-use will be good at looking at documentation. That might solve the problem completely, although OSS and offline models will need another solution. We're definitely open to trying things out here, and will likely add a browser-using docs scraper before exiting Beta.

RobinL|9 months ago

I agree that on the face of it this is extremely useful. I tried using it for multiple libraries and it was a complete failure though, it failed to crawl fairly standard mkdocs and sphynx sites. I guess it's better for the 'built in' ones that they've pre-indexed