Description
What happened?
We recently turned on Redis caching to enable Postgres pooling of requests like so:
cache: true
cache_params:
  type: redis
  host: os.environ/REDIS-HOST
  password: os.environ/REDIS-PASSWORD
  port: ####
  ssl: true
  supported_call_types: []
Upon making this change, users of /v1/messages began reporting that their keys were hitting their max_parallel_requests limits. Further testing showed that requests to /v1/messages effectively treat max_parallel_requests as a lifetime cap on the total number of requests a key can make, rather than limiting only concurrent requests. Once a key hits its rate limit, it stays rate limited permanently: the key will never be functional again if it is only used to hit /v1/messages.
Hitting /chat/completions with that key, however, starts a one-minute timer after which the key's rate limit is reset and it can make another batch of requests up to its max_parallel_requests limit.
To reproduce, follow these steps on a LiteLLM instance with caching enabled as shown above:
- Create a new key and set its maximum parallel requests to 5
- Hit /v1/messages with your newly created key 5 times
- You should now observe rate limiting occurring forever on that key when you hit /v1/messages
- Hit /chat/completions with that key and then wait 60 seconds
- Hit /v1/messages again; it should work for 5 more requests, after which it will be rate limited permanently again
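The behavior above is consistent with the per-key in-flight counter being incremented when a /v1/messages request starts but never decremented when it completes. A toy sketch of that hypothesis (the class and method names here are illustrative, not LiteLLM's actual implementation):

```python
class ParallelLimiter:
    """In-memory stand-in for the Redis-backed parallel-request counter."""

    def __init__(self, max_parallel: int):
        self.max_parallel = max_parallel
        self.in_flight = 0

    def acquire(self) -> bool:
        """Called when a request starts; reject if the limit is reached."""
        if self.in_flight >= self.max_parallel:
            return False  # corresponds to the 429 response
        self.in_flight += 1
        return True

    def release(self) -> None:
        """Called when a request finishes. The reported bug behaves as if
        this step is skipped for /v1/messages, so the counter only ever
        grows and the key is rate limited forever."""
        self.in_flight -= 1


# Correct flow: acquire/release around each request keeps the key usable
# for any number of sequential requests.
ok = ParallelLimiter(max_parallel=5)
for _ in range(10):
    assert ok.acquire()
    ok.release()

# Buggy flow (no release): after 5 requests, every further request is
# rejected permanently, matching the observed behavior.
buggy = ParallelLimiter(max_parallel=5)
results = [buggy.acquire() for _ in range(6)]
print(results)  # [True, True, True, True, True, False]
```

With this model, /chat/completions presumably goes through a code path that does reset or expire the counter, which would explain why hitting it restores the key.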
Relevant log output
{
  "error": {
    "message": "429: Rate limit exceeded for api_key: xxxxxxxxx. Limit type: max_parallel_requests. Current limit: 5, Remaining: 0. Limit resets at: 2025-12-01 20:28:17 UTC",
    "type": "None",
    "param": "None",
    "code": "429"
  }
}
Are you a ML Ops Team?
No
What LiteLLM version are you on?
v1.79.1
Twitter / LinkedIn details
No response