top | item 46436723

Mutatis – Database that mutates its schema based on semantic patterns

3 points| Mutatis | 2 months ago |github.com

4 comments

Mutatis|2 months ago

I spent 9 months building Mutatis — a neuroplastic database that physically rewrites its own schema at runtime based on what it learns about you.

The core problem: RAG systems treat all memories equally. "I'm allergic to peanuts" gets the same storage priority as "I like jazz music." Eventually, critical facts get buried under noise.

Mutatis solves this through semantic-triggered schema evolution. When patterns emerge (e.g., "sara" mentioned 7+ times with the family_spouse pattern), the system automatically creates a dedicated indexed table and migrates the relevant records into it. The migration has zero downtime: records are backfilled into a shadow table, which is then swapped in atomically.
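For readers unfamiliar with the shadow-table approach, here is a minimal TypeScript sketch of the SQL such a migration might issue against SQLite. It is not taken from the Mutatis source: the function name `buildShadowSwapSql`, the column layout, and the `_shadow` suffix handling are assumptions for illustration only.

```typescript
// Hypothetical sketch: build the SQL for a shadow-table migration in SQLite.
// Table and column names are illustrative, not taken from the Mutatis source.
function buildShadowSwapSql(target: string, entity: string): string[] {
  // Table names cannot be bound as parameters, so only allow simple identifiers.
  if (!/^[a-z_][a-z0-9_]*$/.test(target)) {
    throw new Error(`unsafe table name: ${target}`);
  }
  const shadow = `${target}_shadow`;
  // Escape single quotes for the SQL literal. LIKE wildcards (% and _) inside
  // the entity are NOT escaped here; a real implementation would need to.
  const needle = entity.replace(/'/g, "''");
  return [
    // 1. Build and backfill the shadow table while readers still use generic_memories.
    `CREATE TABLE ${shadow} (id INTEGER PRIMARY KEY, entity TEXT NOT NULL, content TEXT NOT NULL)`,
    `INSERT INTO ${shadow} (id, entity, content) SELECT id, '${needle}', content FROM generic_memories WHERE content LIKE '%${needle}%'`,
    `CREATE INDEX idx_${shadow}_entity ON ${shadow} (entity)`,
    // 2. Swap inside one transaction so readers see either the old or the new layout.
    `BEGIN IMMEDIATE`,
    `DELETE FROM generic_memories WHERE id IN (SELECT id FROM ${shadow})`,
    `ALTER TABLE ${shadow} RENAME TO ${target}`,
    `COMMIT`,
  ];
}

console.log(buildShadowSwapSql("family_spouse_sara", "sara").join(";\n"));
```

The slow work (create, backfill, index) happens before the transaction opens, so the lock is held only for the delete and rename.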

The performance improvement is superlinear:

• 1,000 records: 18.9× faster (6.04ms → 0.32ms)
• 10,000 records: 62.1× faster (59.03ms → 0.95ms)
• 500,000 records: 213.4× faster (4.26s → 19.97ms)

Query complexity improves from O(N) full table scans to O(log N) index lookups. The mutation literally changes SQL from:

```sql
-- Before: Generic table, full scan
SELECT * FROM generic_memories WHERE content LIKE '%sara%';

-- After: Dedicated table, indexed
SELECT * FROM family_spouse_sara WHERE entity = 'sara';
```
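To make the complexity claim concrete, here is a small TypeScript sketch (not Mutatis code) contrasting the two access paths. In SQLite, a LIKE pattern with a leading wildcard cannot use an ordinary B-tree index, so every row is examined, while an equality match on an indexed column is a tree search. The row shape and function names are assumptions, and binary search over a sorted array stands in for the B-tree.

```typescript
interface MemoryRow {
  entity: string;
  content: string;
}

// O(N): every row's content is checked, like LIKE '%sara%' on the generic table.
function scanByContent(rows: MemoryRow[], needle: string): MemoryRow[] {
  return rows.filter((r) => r.content.toLowerCase().includes(needle));
}

// O(log N) to locate the first match in rows sorted by entity, then a short
// walk over the adjacent matches, like an index lookup on entity = 'sara'.
function lookupByEntity(sorted: MemoryRow[], entity: string): MemoryRow[] {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid].entity < entity) lo = mid + 1;
    else hi = mid;
  }
  const out: MemoryRow[] = [];
  for (let i = lo; i < sorted.length && sorted[i].entity === entity; i++) {
    out.push(sorted[i]);
  }
  return out;
}
```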

The system uses √2 gravity weighting to ensure foundational memories outrank transient ones. Even when episodic memories have higher raw similarity (0.696 vs 0.670), the √2 boost ensures foundational facts rank first (0.947 final score).
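The numbers above are consistent with a simple multiplicative boost (0.670 × √2 ≈ 0.947), so a minimal sketch of the ranking might look like the following. This assumes the boost is a plain multiplication applied only to foundational memories; the type names and the example texts attached to each score are illustrative, not from the repo.

```typescript
type MemoryKind = "foundational" | "episodic";

interface ScoredMemory {
  text: string;
  kind: MemoryKind;
  similarity: number; // raw embedding similarity
}

// Assumed: foundational memories get their similarity multiplied by sqrt(2).
const GRAVITY = Math.SQRT2;

function finalScore(m: ScoredMemory): number {
  return m.kind === "foundational" ? m.similarity * GRAVITY : m.similarity;
}

// Sort a copy, highest final score first.
function rankMemories(memories: ScoredMemory[]): ScoredMemory[] {
  return [...memories].sort((a, b) => finalScore(b) - finalScore(a));
}

const ranked = rankMemories([
  { text: "I like jazz music", kind: "episodic", similarity: 0.696 },
  { text: "I'm allergic to peanuts", kind: "foundational", similarity: 0.67 },
]);
for (const m of ranked) {
  console.log(finalScore(m).toFixed(3), m.kind, m.text);
}
```

With these inputs the foundational memory ranks first even though its raw similarity is lower.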

What triggers evolution:

• Medical conditions ("I'm allergic to penicillin")
• Identity statements ("I am vegetarian")
• Strong preferences ("I hate coffee")
• Pattern matching + confidence scoring + entity tracking
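As a rough illustration of how pattern matching, confidence scoring, and entity tracking could fit together, here is a hedged TypeScript sketch. The regexes, confidence values, `EvolutionTracker` class, and key format are all hypothetical; only the trigger categories and the 7-mention threshold come from the post.

```typescript
interface TriggerPattern {
  label: string;
  regex: RegExp;
  confidence: number;
}

// Hypothetical patterns and weights, loosely modelled on the trigger categories above.
// Order matters: the first matching pattern wins.
const TRIGGER_PATTERNS: TriggerPattern[] = [
  { label: "medical", regex: /\bI(?:'m| am) allergic to (\w+)/i, confidence: 0.95 },
  { label: "identity", regex: /\bI am (?:a |an )?(\w+)/i, confidence: 0.8 },
  { label: "preference", regex: /\bI (?:hate|love) (\w+)/i, confidence: 0.7 },
];

class EvolutionTracker {
  private mentions = new Map<string, number>();
  private threshold: number;
  private minConfidence: number;

  constructor(threshold = 7, minConfidence = 0.6) {
    this.threshold = threshold;
    this.minConfidence = minConfidence;
  }

  // Returns the pattern/entity key exactly when its count reaches the threshold, else null.
  observe(text: string): string | null {
    for (const p of TRIGGER_PATTERNS) {
      const match = p.regex.exec(text);
      if (!match || p.confidence < this.minConfidence) continue;
      const key = `${p.label}_${match[1].toLowerCase()}`;
      const count = (this.mentions.get(key) ?? 0) + 1;
      this.mentions.set(key, count);
      return count === this.threshold ? key : null;
    }
    return null;
  }
}
```

A caller would feed each new memory to `observe` and start a migration whenever it returns a key.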

Built with TypeScript + SQLite. Uses mock embeddings for the POC (no API keys needed—just clone and run). Patent pending (US 63/949,136).

Interactive demo: Clone the repo and run `cd core && npm run dev` to watch schema evolution happen live.

Repo: https://github.com/ScooterMageee/mutatis-public

Looking for feedback on:

1. What other semantic patterns should trigger schema evolution?
2. Edge cases where automatic schema mutation could create inconsistencies?
3. How do you currently handle memory drift in RAG systems?

regnodon|2 months ago

Really interesting approach to the RAG noise problem. The atomic swap via shadow tables is a clever way to handle the migration.

One edge case I’m curious about is how the system handles modal logic or intent vs. fact. If a user says 'I live in Texas' and then 'I wish I lived in Florida,' a regex-heavy approach might struggle to differentiate between current state and aspiration.
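A quick TypeScript sketch of the concern, using a made-up regex rather than anything from the Mutatis repo: a naive location pattern matches both sentences, and only an extra cue-word check separates fact from aspiration. The names `LIVES_IN`, `MODAL_CUES`, and `extractLocation` are hypothetical.

```typescript
// Naive pattern: matches "I live in X" and also the tail of "I wish I lived in X".
const LIVES_IN = /\bI liv(?:e|ed) in (\w+)/i;
// Crude modal cue list; real text needs far more than keyword spotting.
const MODAL_CUES = /\b(?:wish|hope|want|would like|if only)\b/i;

interface LocationClaim {
  place: string;
  kind: "fact" | "aspiration";
}

function extractLocation(text: string): LocationClaim | null {
  const match = LIVES_IN.exec(text);
  if (!match) return null;
  return { place: match[1], kind: MODAL_CUES.test(text) ? "aspiration" : "fact" };
}

console.log(extractLocation("I live in Texas"));
console.log(extractLocation("I wish I lived in Florida"));
```

Without the `MODAL_CUES` check both sentences would be stored as current-state facts, and even with it, a plain past-tense "I lived in Texas" is still treated as current.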

In a 'neuroplastic' database, how do you handle schema deprecation or 'forgetting' when the foundational patterns drift (e.g., a user moves cities or changes a diet)? Do you have a mechanism for the schema to 'de-evolve' or merge back into a generic table if a specific entity's mention-frequency drops below a certain threshold?

Mutatis|2 months ago

Here's what happens when you run the demo:

After mentioning "abel" 4 times with emotional patterns, schema evolution triggers:

════════════════════════════════════════
SCHEMA EVOLUTION TRIGGERED
════════════════════════════════════════
[SHADOW] Creating emotional_love_evolved_shadow...
[BACKFILL] Moving records mentioning 'abel'...
[BACKFILL] Moved 8 records
[SWAP] Executing atomic transaction...
[COMPLETE] Schema evolved successfully

  Before: SELECT * FROM generic_memories WHERE LIKE '%abel%' (O(N) scan)
  After:  SELECT * FROM emotional_love_evolved WHERE entity = 'abel' (O(log N) index)

════════════════════════════════════════

Query performance:

• Before evolution: 1.25ms (vector scan)
• After evolution: 0.57ms (indexed lookup)

Try it yourself:

```bash
cd core && npm install && npm run dev
```

Example session:

```
add I live in Texas
add I love abel
add abel lives with me
add abel loves Texas
add I love abel    # ← Evolution triggers here

query who is abel?
```

Watch the system detect patterns, track entities, and evolve the schema in real time. After evolution, queries for that entity automatically use the O(log N) indexed lookup.