top | item 46713992

cr4zy | 1 month ago

For compression and long-running agents, may I suggest https://memtree.dev? We offer a simple API that compresses messages asynchronously, so responses stay instant and the context stays small, which leads to much higher-quality generations. We're about to release a dashboard that will show you what each compressed request looked like, the token distribution across system, memory, and tool messages, memory retrievals, and more. Is this the type of thing you're looking for?

nullbio | 1 month ago

Something like this needs to be open-sourced. You're going to have a hell of a time trying to get enough trust from people to run all of their prompts through your servers.