shaial | 2 years ago
One plugin or feature that I would like to see in an AI gateway: *Cache* per unique request. So if I send the same request (system, messages, temperature, etc.), I will have the option to pull it from the cache (if it was already populated) and skip the LLM generation. This is much faster and cheaper - especially during development and testing.
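The idea above can be sketched in a few lines: hash every parameter that affects the completion into a cache key, and only call the model on a miss. This is a minimal in-memory sketch, not any gateway's actual implementation; the names (`cache_key`, `complete`, `call_llm`) are illustrative.

```python
import hashlib
import json

# Simple in-memory cache; a real gateway would use Redis or similar.
_cache: dict[str, str] = {}

def cache_key(request: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so that logically
    # identical requests always produce the same hash.
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def complete(request: dict, call_llm) -> str:
    key = cache_key(request)
    if key in _cache:
        return _cache[key]          # cache hit: skip LLM generation
    response = call_llm(request)    # cache miss: generate and store
    _cache[key] = response
    return response
```

Note this only makes sense for deterministic settings (e.g. temperature 0); with sampling enabled, identical requests legitimately produce different outputs, so a cached reply changes behavior.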
retrovrv | 2 years ago
The caching part isn't open source yet - it's part of our internal workers. Would be very cool to open source it!
shaial | 2 years ago