top | item 42820307

emptiestplace | 1 year ago

I think that's probably the shim I was referring to - it has a hardcoded context length, but either it's implemented incorrectly, Anthropic ignores it, or maybe it's on OpenWebUI to manage the window and it just isn't? Not sure. I found it kept getting slow, so I was starting new conversations to work around that. Eventually I got suspicious and checked - I'd burned through almost $100 within a few hours.
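The "$100 in a few hours" figure is easy to reproduce with back-of-the-envelope arithmetic, since each chat turn resends the entire accumulated context as input tokens. The pricing below is an assumption (~$3 per million input tokens, roughly Claude 3.5 Sonnet's input rate at the time); substitute your own numbers:

```python
# Rough cost sketch for a client that never trims the context window.
# PRICE is an assumed rate, not taken from the thread.
PRICE_PER_INPUT_TOKEN = 3.00 / 1_000_000  # assumed $/input token

def request_cost(context_tokens: int) -> float:
    """Input-token cost of one request that resends the whole context."""
    return context_tokens * PRICE_PER_INPUT_TOKEN

# Every message resends everything, so a long session at ~150K tokens
# per request adds up quickly:
per_request = request_cost(150_000)
session = sum(request_cost(150_000) for _ in range(220))
print(f"per request: ${per_request:.2f}, 220 requests: ${session:.2f}")
# → per request: $0.45, 220 requests: $99.00
```

So even a single afternoon of 100-200K-token requests lands in the range the commenter describes.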

LibreChat isn't as nice in some areas, but it's much more efficient in this regard.

wkat4242 | 1 year ago

I do exactly this - I use LiteLLM to bridge it. In fact I use LiteLLM to bridge OpenAI and Groq too. Even though OpenWebUI supports them directly, with LiteLLM I can better control which models I see; otherwise my model list gets cluttered up. I configured this back when OpenWebUI only supported one OpenAI endpoint, but I kept using it because it's just quite handy.

And no, it doesn't cost extra credits, isn't ignored, and doesn't have a hardcoded context length. It works perfectly.
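The setup described above - one LiteLLM proxy in front of several providers, exposing only a curated model list to OpenWebUI - can be sketched with a config along these lines (the model names and key placement here are illustrative, not from the thread):

```yaml
# litellm_config.yaml -- start the proxy with:  litellm --config litellm_config.yaml
# OpenWebUI then points at the proxy's OpenAI-compatible endpoint.
model_list:
  - model_name: claude-sonnet            # the name OpenWebUI will list
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: llama3-70b               # expose only the Groq models you want
    litellm_params:
      model: groq/llama3-70b-8192
      api_key: os.environ/GROQ_API_KEY
```

Only the `model_name` entries show up in the client, which is what keeps the model list from getting cluttered even when a provider offers dozens of variants.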

emptiestplace | 1 year ago

Sorry, I was mistaken about having used LiteLLM - this is the one I was using: https://github.com/wangxj03/ai-cookbook.git, and you can see the hardcoded shit here: https://github.com/wangxj03/ai-cookbook/blob/main/anthropic-... - it does not actually restrict anything though, and like I said, I noticed it was getting slow when it was sending 100-200K token requests.

Also, it's pretty easy to find unresolved bugs related to OpenWebUI not handling context-length parameters correctly - I believe I actually read something from the author saying that this parameter is effectively disabled (for non-local LLMs, maybe?).
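The window management the commenters suggest is missing amounts to trimming old messages before each request so the payload stays under a token budget. A minimal sketch of that idea (this is not OpenWebUI's or LibreChat's actual code; the 4-characters-per-token estimate is a crude stand-in for a real tokenizer):

```python
# Client-side context trimming: keep only the most recent messages that
# fit within a token budget, so each request doesn't resend everything.

def approx_tokens(text: str) -> int:
    """Crude token estimate (~4 chars/token); a real client would
    use the model's tokenizer instead."""
    return max(1, len(text) // 4)

def trim_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep the newest messages whose combined estimate fits `budget`."""
    kept, used = [], 0
    for msg in reversed(messages):          # walk newest-first
        cost = approx_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))             # restore chronological order

history = [
    {"role": "user", "content": "x" * 400},       # ~100 tokens
    {"role": "assistant", "content": "y" * 400},  # ~100 tokens
    {"role": "user", "content": "z" * 40},        # ~10 tokens
]
trimmed = trim_history(history, budget=150)
print(len(trimmed))  # → 2: the oldest message no longer fits
```

Without something like this, every turn ships the full transcript, which is exactly the slow, expensive behavior described upthread.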