
[Bug]: VertexAI Custom predict endpoint call failed due to extra 'max_retries' param #8254

Closed
suresiva opened this issue Feb 4, 2025 · 3 comments · Fixed by #8477
Labels: bug (Something isn't working), mlops user request

Comments

@suresiva

suresiva commented Feb 4, 2025

What happened?

We are unable to call a VertexAI model through its predict endpoint: the request LiteLLM sends causes an HTTP 500 error on VertexAI because of the unexpected parameter 'max_retries'. The same model works fine through LiteLLM with a completion call when we add the model with '/openai/' in the endpoint name.

However, we want to connect to the VertexAI model through the 'custom' call path and have the predict API invoked, to support a special use case on our side.

So we would like LiteLLM to pop the 'max_retries' parameter from the payload in the relevant code section before the request reaches the predict endpoint.
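
For illustration, a minimal sketch of the requested behaviour; the helper name and payload shape below are hypothetical and not LiteLLM's actual code. The point is simply that 'max_retries' is a client-side retry setting, not a sampling parameter, so it should be dropped before the :predict body is built.

def build_predict_payload(prompt: str, optional_params: dict) -> dict:
    # Hypothetical helper: build the request body for the Vertex AI :predict route.
    params = dict(optional_params)
    # Drop client-side-only settings so they never reach the model server.
    params.pop("max_retries", None)
    return {"instances": [{"prompt": prompt, **params}]}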

Relevant log output

The log below was generated by VertexAI:

2025-02-04 14:04:28.340 INFO:     10.32.0.131:51394 - "POST /generate HTTP/1.1" 500 Internal Server Error
2025-02-04 14:04:28.340 ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 426, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 756, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 776, in app
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 72, in app
    response = await func(request)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/workspace/vllm/vllm/entrypoints/api_server.py", line 127, in generate
    sampling_params = SamplingParams(**request_dict)
TypeError: SamplingParams.__init__() got an unexpected keyword argument 'max_retries'
2025-02-04 14:04:32.297 INFO:     10.32.6.129:40910 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:05:32.298 INFO:     10.32.6.129:59616 - "GET /ping HTTP/1.1" 200 OK
[... /ping health checks continue roughly every 10 seconds through 14:12:42, all returning "200 OK" ...]
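
The last frame explains the 500: the vLLM api_server expands the whole request body into SamplingParams, so any key it does not recognise raises a TypeError before generation starts. A minimal illustration with a simplified stand-in class (not vLLM's actual SamplingParams):

class SamplingParams:
    # Simplified stand-in: accepts only a few known sampling fields.
    def __init__(self, temperature: float = 1.0, top_p: float = 1.0, max_tokens: int = 16):
        self.temperature = temperature
        self.top_p = top_p
        self.max_tokens = max_tokens

request_dict = {"temperature": 0.2, "top_p": 0.9, "max_tokens": 128, "max_retries": 2}
SamplingParams(**request_dict)  # raises TypeError: unexpected keyword argument 'max_retries'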

Are you a ML Ops Team?

Yes

What LiteLLM version are you on ?

v.1.60.2

Twitter / LinkedIn details

No response

suresiva added the bug (Something isn't working) label Feb 4, 2025
@ishaan-jaff
Contributor

Can you share a working curl request to your predict endpoint, @suresiva?

@suresiva
Author

suresiva commented Feb 4, 2025

We didn't access the model's predict endpoint directly with cURL; instead, we use a typical OpenAI completion call like the one below:

import openai

client = openai.OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# llama_input_text, max_tokens, top_p, temperature and top_k are defined elsewhere in our code
response = client.chat.completions.create(
    model="vertex_ai/meta/llama3-1-8b-ft-icd-l4-predict",
    messages=[
        {
            "role": "user",
            "content": llama_input_text,
        }
    ],
    max_tokens=max_tokens,
    timeout=100,
    top_p=top_p,
    temperature=temperature,
    extra_body={"top_k": top_k},
)

The model configuration on LiteLLM is:

{
  "model_name": "vertex_ai/meta/llama3-1-8b-ft-icd-l4-predict",
  "litellm_params": {
    "vertex_project": "***********",
    "vertex_location": "us-central1",
    "use_in_pass_through": false,
    "model": "vertex_ai/1984786713414729728"
  },
  "model_info": {
    "id": "d3bd3c60-2559-48c5-a112-ab68322b7d3e",
    "db_model": true,
    "key": "vertex_ai/1984786713414729728",
    "max_tokens": null,
    "max_input_tokens": null,
    "max_output_tokens": null,
    "input_cost_per_token": 0,
    "cache_creation_input_token_cost": null,
    "cache_read_input_token_cost": null,
    "input_cost_per_character": null,
    "input_cost_per_token_above_128k_tokens": null,
    "input_cost_per_query": null,
    "input_cost_per_second": null,
    "input_cost_per_audio_token": null,
    "output_cost_per_token": 0,
    "output_cost_per_audio_token": null,
    "output_cost_per_character": null,
    "output_cost_per_token_above_128k_tokens": null,
    "output_cost_per_character_above_128k_tokens": null,
    "output_cost_per_second": null,
    "output_cost_per_image": null,
    "output_vector_size": null,
    "litellm_provider": "vertex_ai",
    "mode": null,
    "supports_system_messages": null,
    "supports_response_schema": null,
    "supports_vision": false,
    "supports_function_calling": false,
    "supports_assistant_prefill": false,
    "supports_prompt_caching": false,
    "supports_audio_input": false,
    "supports_audio_output": false,
    "supports_pdf_input": false,
    "supports_embedding_image_input": false,
    "supports_native_streaming": null,
    "tpm": null,
    "rpm": null,
    "supported_openai_params": [
      "temperature",
      "top_p",
      "max_tokens",
      "max_completion_tokens",
      "stream",
      "tools",
      "functions",
      "tool_choice",
      "response_format",
      "n",
      "stop",
      "frequency_penalty",
      "presence_penalty",
      "extra_headers",
      "seed",
      "logprobs"
    ]
  },
  "provider": "vertex_ai",
  "input_cost": 0,
  "output_cost": 0,
  "litellm_model_name": "vertex_ai/1984786713414729728",
  "max_tokens": null,
  "max_input_tokens": null,
  "cleanedLitellmParams": {
    "vertex_project": "*********",
    "vertex_location": "us-central1",
    "use_in_pass_through": false
  }
}

With this configuration, LiteLLM is able to call the vertex_ai model's predict endpoint through the code segment below:

response_obj = await llm_model.predict(

We omitted '/openai/' from the model endpoint name to avoid the OpenAI-like completion route, so that the model is reached through the predict() call.
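
For context, calling such a custom endpoint directly with the Vertex AI SDK looks roughly like the sketch below. The instance schema depends on the deployed serving container (the keys here assume a vLLM api_server container) and the project ID is a placeholder; an extra key such as 'max_retries' in the instance dict would land in that container's request body and trigger the TypeError shown in the log.

from google.cloud import aiplatform

aiplatform.init(project="my-gcp-project", location="us-central1")  # placeholder project ID

endpoint = aiplatform.Endpoint("1984786713414729728")

# Instance keys must match what the serving container accepts; for a vLLM
# api_server container that is the prompt plus the sampling parameters.
response = endpoint.predict(
    instances=[
        {
            "prompt": "Hello",
            "max_tokens": 128,
            "temperature": 0.2,
            "top_p": 0.9,
            # "max_retries": 2,  # an unexpected key like this causes the 500 above
        }
    ]
)
print(response.predictions)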

@krrishdholakia
Contributor

note: expose vertex_ai/custom/<endpoint_id> for simpler optional param translation

krrishdholakia self-assigned this Feb 12, 2025
krrishdholakia added a commit that referenced this issue Feb 12, 2025
don't pass max retries to unsupported route

Fixes #8254
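
A rough sketch of the idea behind this fix; the function below is illustrative only and not the actual LiteLLM implementation. 'max_retries' stays supported on the OpenAI-compatible Vertex route but is filtered out before the raw :predict route is called.

def filter_params_for_route(optional_params: dict, route: str) -> dict:
    # Illustrative only: keep 'max_retries' for the OpenAI-compatible route,
    # drop it for routes (like :predict) whose servers reject unknown keys.
    if route == "predict":
        return {k: v for k, v in optional_params.items() if k != "max_retries"}
    return dict(optional_params)
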
krrishdholakia added a commit that referenced this issue Feb 14, 2025
* fix(utils.py): fix vertex ai optional param handling

don't pass max retries to unsupported route

Fixes #8254

* fix(get_supported_openai_params.py): fix linting error

* fix(get_supported_openai_params.py): default to openai-like spec

* test: fix test

* fix: fix linting error

* Improved wildcard route handling on `/models` and `/model_group/info`  (#8473)

* fix(model_checks.py): update returning known model from wildcard to filter based on given model prefix

ensures wildcard route - `vertex_ai/gemini-*` just returns known vertex_ai/gemini- models

* test(test_proxy_utils.py): add unit testing for new 'get_known_models_from_wildcard' helper

* test(test_models.py): add e2e testing for `/model_group/info` endpoint

* feat(prometheus.py): support tracking total requests by user_email on prometheus

adds initial support for tracking total requests by user_email

* test(test_prometheus.py): add testing to ensure user email is always tracked

* test: update testing for new prometheus metric

* test(test_prometheus_unit_tests.py): add user email to total proxy metric

* test: update tests

* test: fix spend tests

* test: fix test

* fix(pagerduty.py): fix linting error

* (Bug fix) - Using `include_usage` for /completions requests + unit testing (#8484)

* pass stream options (#8419)

* test_completion_streaming_usage_metrics

* test_text_completion_include_usage

---------

Co-authored-by: Kaushik Deka <55996465+Kaushikdkrikhanu@users.noreply.github.com>

* fix naming docker stable release

* build(model_prices_and_context_window.json): handle azure model update

* docs(token_auth.md): clarify scopes can be a list or comma separated string

* docs: fix docs

* add sonar pricings (#8476)

* add sonar pricings

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window_backup.json

* update load testing script

* fix test_async_router_context_window_fallback

* pplx - fix supports tool choice openai param (#8496)

* fix prom check startup (#8492)

* test_async_router_context_window_fallback

* ci(config.yml): mark daily docker builds with `-nightly` (#8499)

Resolves #8495

* (Redis Cluster) - Fixes for using redis cluster + pipeline (#8442)

* update RedisCluster creation

* update RedisClusterCache

* add redis ClusterCache

* update async_set_cache_pipeline

* cleanup redis cluster usage

* fix redis pipeline

* test_init_async_client_returns_same_instance

* fix redis cluster

* update mypy_path

* fix init_redis_cluster

* remove stub

* test redis commit

* ClusterPipeline

* fix import

* RedisCluster import

* fix redis cluster

* Potential fix for code scanning alert no. 2129: Clear-text logging of sensitive information

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* fix naming of redis cluster integration

* test_redis_caching_ttl_pipeline

* fix async_set_cache_pipeline

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Litellm UI stable version 02 12 2025 (#8497)

* fix(key_management_endpoints.py): fix `/key/list` to include `return_full_object` as a top-level query param

Allows user to specify they want the keys as a list of objects

* refactor(key_list.tsx): initial refactor of key table in user dashboard

offloads key filtering logic to backend api

prevents common error of user not being able to see their keys

* fix(key_management_endpoints.py): allow internal user to query `/key/list` to see their keys

* fix(key_management_endpoints.py): add validation checks and filtering to `/key/list` endpoint

allow internal user to see their keys. not anybody else's

* fix(view_key_table.tsx): fix issue where internal user could not see default team keys

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* test_supports_tool_choice

* test_async_router_context_window_fallback

* fix: fix test (#8501)

* Litellm dev 02 12 2025 p1 (#8494)

* Resolves #6625 (#8459)

- enables no auth for SMTP

Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com>

* add sonar pricings (#8476)

* add sonar pricings

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window_backup.json

* test: fix test

---------

Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com>
Co-authored-by: Dani Regli <1daniregli@gmail.com>
Co-authored-by: Lucca Zenóbio <luccazen@gmail.com>

* test: fix test

* UI Fixes p2  (#8502)

* refactor(admin.tsx): cleanup add new admin flow

removes buggy flow. Ensures just 1 simple way to add users / update roles.

* fix(user_search_modal.tsx): ensure 'add member' button is always visible

* fix(edit_membership.tsx): ensure 'save changes' button always visible

* fix(internal_user_endpoints.py): ensure user in org can be deleted

Fixes issue where user couldn't be deleted if they were a member of an org

* fix: fix linting error

* add phoenix docs for observability integration (#8522)

* Add files via upload

* Update arize_integration.md

* Update arize_integration.md

* add Phoenix docs

* Added custom_attributes to additional_keys which can be sent to athina (#8518)

* (UI) fix log details page  (#8524)

* rollback changes to view logs page

* ui new build

* add interface for prefetch

* fix spread operation

* fix max size for request view page

* clean up table

* ui fix column on request logs page

* ui new build

* Add UI Support for Admins to Call /cache/ping and View Cache Analytics (#8475) (#8519)

* [Bug] UI: Newly created key does not display on the View Key Page (#8039)

- Fixed issue where all keys appeared blank for admin users.
- Implemented filtering of data via team settings to ensure all keys are displayed correctly.

* Fix:
- Updated the validator to allow model editing when `keyTeam.team_alias === "Default Team"`.
- Ensured other teams still follow the original validation rules.

* - added some classes in global.css
- added text wrap in output of request,response and metadata in index.tsx
- fixed styles of table in table.tsx

* - added full payload when we open single log entry
- added Combined Info Card in index.tsx

* fix: keys not showing on refresh for internal user

* merge

* main merge

* cache page

* ca remove

* terms change

* fix:places caching inside exp

---------

Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Kaushik Deka <55996465+Kaushikdkrikhanu@users.noreply.github.com>
Co-authored-by: Lucca Zenóbio <luccazen@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Dani Regli <1daniregli@gmail.com>
Co-authored-by: exiao <exiao@users.noreply.github.com>
Co-authored-by: vivek-athina <153479827+vivek-athina@users.noreply.github.com>
Co-authored-by: Taha Ali <123803932+tahaali-dev@users.noreply.github.com>
abhijitherekar pushed a commit to acuvity/litellm that referenced this issue Feb 20, 2025
* fix(utils.py): fix vertex ai optional param handling

don't pass max retries to unsupported route

Fixes BerriAI#8254

(remainder of the commit message is identical to the Feb 14, 2025 commit above)