Love this direction. Memory failures are usually silent until quality drops, so treating memory as an SLO surface makes sense.

One metric that helped us was retrieval precision@k against a small gold set of "must-return" facts from prior sessions. Drift there showed degradation earlier than latency/token metrics.
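For anyone who wants to try this, here is a minimal sketch of the check, not our actual harness. It assumes the gold set is a dict mapping a query to the set of memory IDs that must come back, and that `retrieve` is whatever function your memory system exposes for returning ranked IDs. Both names are hypothetical.

```python
from typing import Callable, Dict, List, Set


def precision_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved IDs that appear in the gold set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = retrieved[:k]
    hits = sum(1 for item_id in top_k if item_id in relevant)
    # Divide by k (not len(top_k)) so short result lists are penalized.
    return hits / k


def mean_precision_at_k(
    gold: Dict[str, Set[str]],             # hypothetical: query -> must-return IDs
    retrieve: Callable[[str], List[str]],  # hypothetical: query -> ranked IDs
    k: int,
) -> float:
    """Average precision@k over every query in the gold set."""
    if not gold:
        return 0.0
    scores = [precision_at_k(retrieve(q), rel, k) for q, rel in gold.items()]
    return sum(scores) / len(scores)
```

Run it on a schedule and alert when the mean drops relative to a recent baseline. The absolute number matters less than the trend.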

If you haven’t already, adding write-amplification + duplicate-rate tracking is useful too. We found many systems look healthy while gradually filling with near-duplicate notes that poison recall.
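A rough way to put a number on the duplicate rate, as a sketch only: it assumes notes are plain strings and uses Jaccard similarity over word sets with an arbitrary 0.8 threshold. Both are illustrative choices; embeddings or MinHash would scale better than this pairwise loop.

```python
import re
from typing import FrozenSet, List


def _tokens(text: str) -> FrozenSet[str]:
    """Lowercased alphanumeric word set, ignoring punctuation."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def duplicate_rate(notes: List[str], threshold: float = 0.8) -> float:
    """Fraction of notes that near-duplicate an earlier, distinct note."""
    if not notes:
        return 0.0
    distinct: List[FrozenSet[str]] = []
    duplicates = 0
    for note in notes:
        toks = _tokens(note)
        if any(jaccard(toks, seen) >= threshold for seen in distinct):
            duplicates += 1
        else:
            distinct.append(toks)
    return duplicates / len(notes)
```

Tracking this per day or per write batch shows whether the store is filling with restatements of the same fact, even while latency and token counts look normal.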

sukinai|3 days ago

This is super useful. I really like the idea of treating memory as an SLO surface rather than just a storage layer.

Retrieval precision@k against a small gold set is a very strong suggestion. That feels like a much better early warning signal than just latency or token usage, because those can look fine while memory quality is quietly degrading.

Write amplification and duplicate-rate tracking also make a lot of sense. Near-duplicate buildup is exactly the kind of thing that makes a memory system look healthy on the outside while slowly poisoning recall underneath.

I have basic duplicate detection in /nemp:health, but I haven’t yet framed it in terms of retrieval quality metrics the way you described. That’s a really good direction. Thank you.