top | item 47170645

guerython | 3 days ago

Love this direction. Memory failures are usually silent until quality drops, so treating memory as an SLO surface makes sense.

One metric that helped us was retrieval precision@k against a small gold set of "must-return" facts from prior sessions. Drift there showed degradation earlier than latency/token metrics.
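
A minimal sketch of that check, not our actual harness. It assumes a hypothetical `retrieve(query)` callable that returns ranked memory IDs, and a hand-built gold set mapping each query to the IDs that must come back. All names here are illustrative.

```python
from typing import Callable, Dict, List, Set


def precision_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved IDs that appear in the gold set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = retrieved[:k]
    hits = sum(1 for memory_id in top_k if memory_id in relevant)
    # Divide by k (not len(top_k)) so short result lists are penalized.
    return hits / k


def mean_precision_at_k(
    gold: Dict[str, Set[str]],
    retrieve: Callable[[str], List[str]],  # hypothetical retrieval hook
    k: int,
) -> float:
    """Average precision@k over every query in the gold set."""
    if not gold:
        return 0.0
    scores = [precision_at_k(retrieve(q), rel, k) for q, rel in gold.items()]
    return sum(scores) / len(scores)
```

Running this on a schedule and alerting when the mean drops below a baseline is one way to turn it into an SLO-style signal. If the gold set has fewer than k must-return facts per query, recall@k may be the more informative number.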

If you haven’t already, adding write-amplification + duplicate-rate tracking is useful too. We found many systems look healthy while gradually filling with near-duplicate notes that poison recall.
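
For the duplicate-rate side, here is a rough sketch under stated assumptions: notes are plain strings, "near-duplicate" means token-set Jaccard similarity at or above a threshold, and the 0.8 default is an arbitrary illustrative value rather than a tuned one.

```python
import re
from typing import List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def duplicate_rate(notes: List[str], threshold: float = 0.8) -> float:
    """Fraction of notes that near-duplicate an earlier, distinct note."""
    if not notes:
        return 0.0
    distinct: List[Set[str]] = []  # token sets of notes kept as unique
    duplicates = 0
    for note in notes:
        toks = _tokens(note)
        if any(jaccard(toks, prev) >= threshold for prev in distinct):
            duplicates += 1
        else:
            distinct.append(toks)
    return duplicates / len(notes)
```

This is quadratic in the number of distinct notes, so it suits a periodic sample more than a full scan of a large store. Embedding similarity or MinHash could replace the Jaccard step without changing the metric's shape.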

sukinai | 3 days ago

This is super useful. I really like the idea of treating memory as an SLO surface rather than just a storage layer.

Retrieval precision@k against a small gold set is a very strong suggestion. That feels like a much better early warning signal than just latency or token usage, because those can look fine while memory quality is quietly degrading.

Write amplification and duplicate-rate tracking also make a lot of sense. Near-duplicate buildup is exactly the kind of thing that makes a memory system look healthy on the outside while slowly poisoning recall underneath.

I have basic duplicate detection in /nemp:health, but I haven’t yet framed it in terms of retrieval quality metrics the way you described. That’s a really good direction. Thank you.