One thing that surprised our team: cost isn’t just “more usage” — retries and context creep can multiply spend with the same user behavior. We now track cost/request and cost per user-action per endpoint over time, plus a retry ratio. When either drifts after a change, it’s usually a quick fix (backoff, caps, trimming history).
No comments yet.