page_index | 5 months ago

The word index originally came from how humans retrieve information: book indexes and tables of contents that guide us to the right place.

Computers later borrowed the term for data structures such as B-trees, hash tables, and more recently, vector indexes. They're highly efficient for machines, but also abstract and unnatural: not something a human, or an LLM, can directly use as a reasoning aid. This creates a gap between how indexes work for computers and how they should work for models that reason like humans.

PageIndex is a new step that looks back to move forward. It revives the original, human-oriented idea of an index and adapts it for LLMs. Now the index itself (PageIndex) lives inside the LLM's context window: the model sees a hierarchical table-of-contents tree and reasons its way down to the right span, much like a person would retrieve information using a book's index.
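
As a rough illustration of the idea (not PageIndex's actual implementation), the sketch below models a table-of-contents tree in Python and descends it one level at a time. The names `TocNode`, `render_children`, `navigate`, and `keyword_chooser` are hypothetical. The `choose` callback stands in for the LLM, which would read the rendered child titles in its context and pick a branch; here a simple word-overlap heuristic plays that role.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TocNode:
    """One entry in a hypothetical table-of-contents tree."""
    title: str
    start_page: int
    end_page: int
    children: List["TocNode"] = field(default_factory=list)


def render_children(node: TocNode) -> str:
    """Format a node's children as numbered lines an LLM could read."""
    return "\n".join(
        f"{i}. {c.title} (pp. {c.start_page}-{c.end_page})"
        for i, c in enumerate(node.children)
    )


def navigate(root: TocNode, choose: Callable[[str, TocNode], int]) -> TocNode:
    """Descend from the root to a leaf, asking `choose` which child to take."""
    node = root
    while node.children:
        index = choose(render_children(node), node)
        node = node.children[index]
    return node


def keyword_chooser(query: str) -> Callable[[str, TocNode], int]:
    """Stand-in for the LLM: pick the child whose title shares the most words with the query."""
    words = set(query.lower().split())

    def choose(_rendered: str, node: TocNode) -> int:
        scores = [len(words & set(c.title.lower().split())) for c in node.children]
        return scores.index(max(scores))

    return choose


# Made-up example document structure.
book = TocNode("Annual Report", 1, 120, [
    TocNode("Business Overview", 1, 40, [
        TocNode("Products", 1, 20),
        TocNode("Markets", 21, 40),
    ]),
    TocNode("Financial Statements", 41, 120, [
        TocNode("Balance Sheet", 41, 70),
        TocNode("Cash Flow Statement", 71, 120),
    ]),
])

leaf = navigate(book, keyword_chooser("financial statements cash flow"))
print(leaf.title, leaf.start_page, leaf.end_page)
```

At each step only the current node's children need to be shown, so the selection works on section titles and page ranges rather than on embedded chunks. In a real setup the `choose` callback would presumably be a model call.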

PageIndex MCP shows how this works in practice: it runs as an MCP server, exposing a document's structure directly to LLMs. This means platforms like Claude, Cursor, or any MCP-enabled agent can navigate the index themselves and reason their way through documents, not with vectors or chunking, but in a human-like, reasoning-based way.
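
To make the shape of such an exchange concrete, here is a minimal sketch of how a server might answer an MCP-style `tools/call` request (MCP messages are JSON-RPC 2.0) by returning a document's outline as text. The tool name `get_outline`, the in-memory `OUTLINE`, and the handler are assumptions for illustration, not PageIndex MCP's actual tools or code; a real server would normally be built on an MCP SDK and a transport such as stdio.

```python
import json

# Hypothetical document outline: (depth, title, page range).
OUTLINE = [
    (0, "Annual Report", "1-120"),
    (1, "Business Overview", "1-40"),
    (1, "Financial Statements", "41-120"),
]


def outline_text() -> str:
    """Render the outline as an indented table of contents."""
    return "\n".join(
        f"{'  ' * depth}{title} (pp. {pages})" for depth, title, pages in OUTLINE
    )


def handle_request(raw: str) -> str:
    """Answer one JSON-RPC 2.0 request; only the hypothetical `get_outline` tool is supported."""
    request = json.loads(raw)
    params = request.get("params", {})
    if request.get("method") == "tools/call" and params.get("name") == "get_outline":
        body = {"result": {"content": [{"type": "text", "text": outline_text()}]}}
    else:
        # Simplification: everything else gets the generic JSON-RPC "Method not found" error.
        body = {"error": {"code": -32601, "message": "Method not found"}}
    return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), **body})


request = json.dumps({
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {"name": "get_outline", "arguments": {}},
})
response = json.loads(handle_request(request))
print(response["result"]["content"][0]["text"])
```

The agent receives the outline as plain text in the tool result and can then decide which section to ask for next, which is the navigation loop described above.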
