top | item 43726020

(no title)

tempthrow | 10 months ago

I think it's the small TPM limits. I'll be way under the 10-30 requests per minute while using Cline, but it appears that the input tokens count towards the rate limit so I'll find myself limited to one message a minute if I let the conversation go on for too long, ironically due to Gemini's long context window. AFAIK Cline doesn't currently offer an option to limit the context explosion to lower than model capacity.

discuss

No comments yet.