Skip to content

[Bug]: /v1/messages max_parallel_requests permanently rate limits keys when Redis cache enabled #17323

@colindonovan-8451

Description

@colindonovan-8451

What happened?

We recently turned on Redis caching to enable Postgres pooling of requests like so:

 cache: true
  cache_params:
    type: redis
    host: os.environ/REDIS-HOST
    password: os.environ/REDIS-PASSWORD
    port: ####
    ssl: true
    supported_call_types: []

Upon making this change, users of /v1/messages began reporting issues with their keys hitting their maximum_parallel_requests limits. Upon further testing, we have discovered that requests to /v1/messages end up treating max_parallel_requests essentially as a maximum number of requests a key is able to make rather than properly functioning only on parallel requests. Once a key hits it's rate limit, it permanently becomes rate limited. That key will never be functional again if it is only used to hit /v1/messages.

Hitting /chat/completions however will set the timer back to 1 minute before the key's rate limit is reset and it is able to make another set of requests up to their MPR limit.

To recreate follow the following steps on a LiteLLM instance with caching enabled the way shown above:

  1. Create a new key and set its maximum parallel requests to 5
  2. Hit /v1/messages with your newly created key 5 times
    You should now observe rate limiting occurring forever on that key when you hit /v1/messages
  3. Hit /chat/completions with that key and then wait 60 seconds
  4. Hit /v1/messages again and it should work again for 5 more requests, upon which is will be rate limited permanently again

Relevant log output

{
  "error": {
    "message": "429: Rate limit exceeded for api_key: xxxxxxxxx. Limit type: max_parallel_requests. Current limit: 5, Remaining: 0. Limit resets at: 2025-12-01 20:28:17 UTC",
    "type": "None",
    "param": "None",
    "code": "429"
  }
}

Are you a ML Ops Team?

No

What LiteLLM version are you on ?

v1.79.1

Twitter / LinkedIn details

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions