Sorry, I was mistaken about having used LiteLLM - this is the one I was actually using: https://github.com/wangxj03/ai-cookbook.git. You can see the hardcoded values here: https://github.com/wangxj03/ai-cookbook/blob/main/anthropic-... - it doesn't actually restrict anything, though, and like I said, I noticed it was getting slow when it was sending 100-200K token requests.

Also, it's pretty easy to find unresolved bugs related to OpenWebUI not handling context length parameters correctly - I believe I actually read something from the author saying that this parameter is effectively disabled (maybe only for non-local LLMs?).
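For what it's worth, the kind of client-side guard that proxy is missing is simple to add yourself. Here's a hypothetical sketch (not from that repo) that drops the oldest turns once a rough token estimate exceeds a budget; the 4-chars-per-token heuristic and the budget numbers are illustrative assumptions, not anything the proxy actually does:

```python
# Hypothetical sketch: cap what gets sent upstream by dropping the oldest
# conversation turns once a crude token estimate exceeds a budget.
# Assumptions: ~4 characters per token (rough for English), and a
# made-up budget; a real client would use the provider's token counter.

def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token.
    return max(1, len(text) // 4)

def trim_history(messages, budget_tokens=8000):
    """Keep the most recent messages whose combined estimate fits the budget."""
    kept, total = [], 0
    for msg in reversed(messages):  # walk newest-first
        cost = estimate_tokens(msg["content"])
        if total + cost > budget_tokens:
            break  # everything older than this is dropped
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order

history = [
    {"role": "user", "content": "x" * 40000},     # ~10K tokens, oldest
    {"role": "assistant", "content": "y" * 4000}, # ~1K tokens
    {"role": "user", "content": "z" * 400},       # ~100 tokens, newest
]
trimmed = trim_history(history, budget_tokens=2000)
# The oversized oldest message is dropped; the recent turns survive.
```

Without something like this, every turn re-sends the whole accumulated history, which is exactly how requests balloon into the 100-200K token range.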