[Bug]: VertexAI Custom predict endpoint call failed due to extra 'max_retries' param #8254

suresiva · 2025-02-04T21:09:40Z

What happened?

Unable to call VertexAI model through predict endpoint as the request from LiteLLM causes HTTP 500 error on VertexAI due to unexpected parameter 'max_retries'. The same model works fine through LiteLLM with completion call when we add the model with '/openai/' in the endpoint name.

But we want to connect to VertexAI model through the 'custom' call and get the predict api invoked to support a special use case on our side.

So we would like you to pop the 'max_retries' parameter from below code section,

litellm/litellm/llms/vertex_ai/vertex_ai_non_gemini.py

Line 244 in 27e1e22

Relevant log output

Below log is generated by the VertexAI,

2025-02-04 14:04:28.340
INFO:     10.32.0.131:51394 - "POST /generate HTTP/1.1" 500 Internal Server Error

2025-02-04 14:04:28.340
ERROR:    Exception in ASGI application
2025-02-04 14:04:28.340
Traceback (most recent call last):
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 426, in run_asgi
2025-02-04 14:04:28.340
    result = await app(  # type: ignore[func-returns-value]
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
2025-02-04 14:04:28.340
    return await self.app(scope, receive, send)
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 1054, in __call__
2025-02-04 14:04:28.340
    await super().__call__(scope, receive, send)
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 123, in __call__
2025-02-04 14:04:28.340
    await self.middleware_stack(scope, receive, send)
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 186, in __call__
2025-02-04 14:04:28.340
    raise exc
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 164, in __call__
2025-02-04 14:04:28.340
    await self.app(scope, receive, _send)
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 65, in __call__
2025-02-04 14:04:28.340
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
2025-02-04 14:04:28.340
    raise exc
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2025-02-04 14:04:28.340
    await app(scope, receive, sender)
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 756, in __call__
2025-02-04 14:04:28.340
    await self.middleware_stack(scope, receive, send)
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 776, in app
2025-02-04 14:04:28.340
    await route.handle(scope, receive, send)
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 297, in handle
2025-02-04 14:04:28.340
    await self.app(scope, receive, send)
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 77, in app
2025-02-04 14:04:28.340
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
2025-02-04 14:04:28.340
    raise exc
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2025-02-04 14:04:28.340
    await app(scope, receive, sender)
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 72, in app
2025-02-04 14:04:28.340
    response = await func(request)
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 278, in app
2025-02-04 14:04:28.340
    raw_response = await run_endpoint_function(
2025-02-04 14:04:28.340
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 191, in run_endpoint_function
2025-02-04 14:04:28.340
    return await dependant.call(**values)
2025-02-04 14:04:28.340
  File "/workspace/vllm/vllm/entrypoints/api_server.py", line 127, in generate
2025-02-04 14:04:28.340
    sampling_params = SamplingParams(**request_dict)
2025-02-04 14:04:28.340
TypeError: SamplingParams.__init__() got an unexpected keyword argument 'max_retries'
2025-02-04 14:04:32.297
INFO:     10.32.6.129:40910 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:05:32.298
INFO:     10.32.6.129:59616 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:05:42.298
INFO:     10.32.6.129:52952 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:05:52.298
INFO:     10.32.6.129:53406 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:06:02.298
INFO:     10.32.6.129:50722 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:06:12.298
INFO:     10.32.6.129:48576 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:06:22.299
INFO:     10.32.6.129:40260 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:06:32.298
INFO:     10.32.6.129:46956 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:06:42.298
INFO:     10.32.6.129:49630 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:06:52.298
INFO:     10.32.6.129:57512 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:07:02.298
INFO:     10.32.6.129:55322 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:07:12.298
INFO:     10.32.6.129:53218 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:07:22.298
INFO:     10.32.6.129:47046 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:07:32.298
INFO:     10.32.6.129:55946 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:07:42.298
INFO:     10.32.6.129:33968 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:07:52.298
INFO:     10.32.6.129:58840 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:08:02.298
INFO:     10.32.6.129:44204 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:08:12.298
INFO:     10.32.6.129:33926 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:08:22.298
INFO:     10.32.6.129:38192 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:08:32.298
INFO:     10.32.6.129:43078 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:08:42.298
INFO:     10.32.6.129:34442 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:08:52.298
INFO:     10.32.6.129:54588 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:09:02.298
INFO:     10.32.6.129:54150 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:09:12.298
INFO:     10.32.6.129:53338 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:09:22.298
INFO:     10.32.6.129:36672 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:09:32.298
INFO:     10.32.6.129:51096 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:09:42.298
INFO:     10.32.6.129:52294 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:09:52.298
INFO:     10.32.6.129:46824 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:10:02.297
INFO:     10.32.6.129:37056 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:10:12.298
INFO:     10.32.6.129:50234 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:10:22.298
INFO:     10.32.6.129:34928 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:10:32.298
INFO:     10.32.6.129:51224 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:10:42.298
INFO:     10.32.6.129:37978 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:10:52.298
INFO:     10.32.6.129:60312 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:11:02.298
INFO:     10.32.6.129:43100 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:11:12.298
INFO:     10.32.6.129:51560 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:11:22.299
INFO:     10.32.6.129:36654 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:11:32.298
INFO:     10.32.6.129:51162 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:11:42.298
INFO:     10.32.6.129:51154 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:11:52.298
INFO:     10.32.6.129:43598 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:12:02.297
INFO:     10.32.6.129:37300 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:12:12.298
INFO:     10.32.6.129:41134 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:12:22.298
INFO:     10.32.6.129:54268 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:12:32.298
INFO:     10.32.6.129:37562 - "GET /ping HTTP/1.1" 200 OK
2025-02-04 14:12:42.298
INFO:     10.32.6.129:44658 - "GET /ping HTTP/1.1" 200 OK

Are you a ML Ops Team?

Yes

What LiteLLM version are you on ?

v.1.60.2

Twitter / LinkedIn details

No response

ishaan-jaff · 2025-02-04T21:20:29Z

can you share a working curl request to your predict endpoint @suresiva ?

suresiva · 2025-02-04T21:29:00Z

We didn't access the model's predict endpoint directly with cURL, instead we use typical OpenAI completion call like below,

import openai
client = openai.OpenAI(base_url="http://localhost:4000", api_key="sk-1234")       
response = client.chat.completions.create(
    model="vertex_ai/meta/llama3-1-8b-ft-icd-l4-predict",
    messages = [
        {
            "role": "user",
            "content": llama_input_text
        }
    ],
    max_tokens=max_tokens,
    timeout=100,
    top_p= top_p,
    temperature=temperature,
    extra_body={"top_k" : top_k},
    )

The model configured on LiteLLM is,

{
  "model_name": "vertex_ai/meta/llama3-1-8b-ft-icd-l4-predict",
  "litellm_params": {
    "vertex_project": "***********",
    "vertex_location": "us-central1",
    "use_in_pass_through": false,
    "model": "vertex_ai/1984786713414729728"
  },
  "model_info": {
    "id": "d3bd3c60-2559-48c5-a112-ab68322b7d3e",
    "db_model": true,
    "key": "vertex_ai/1984786713414729728",
    "max_tokens": null,
    "max_input_tokens": null,
    "max_output_tokens": null,
    "input_cost_per_token": 0,
    "cache_creation_input_token_cost": null,
    "cache_read_input_token_cost": null,
    "input_cost_per_character": null,
    "input_cost_per_token_above_128k_tokens": null,
    "input_cost_per_query": null,
    "input_cost_per_second": null,
    "input_cost_per_audio_token": null,
    "output_cost_per_token": 0,
    "output_cost_per_audio_token": null,
    "output_cost_per_character": null,
    "output_cost_per_token_above_128k_tokens": null,
    "output_cost_per_character_above_128k_tokens": null,
    "output_cost_per_second": null,
    "output_cost_per_image": null,
    "output_vector_size": null,
    "litellm_provider": "vertex_ai",
    "mode": null,
    "supports_system_messages": null,
    "supports_response_schema": null,
    "supports_vision": false,
    "supports_function_calling": false,
    "supports_assistant_prefill": false,
    "supports_prompt_caching": false,
    "supports_audio_input": false,
    "supports_audio_output": false,
    "supports_pdf_input": false,
    "supports_embedding_image_input": false,
    "supports_native_streaming": null,
    "tpm": null,
    "rpm": null,
    "supported_openai_params": [
      "temperature",
      "top_p",
      "max_tokens",
      "max_completion_tokens",
      "stream",
      "tools",
      "functions",
      "tool_choice",
      "response_format",
      "n",
      "stop",
      "frequency_penalty",
      "presence_penalty",
      "extra_headers",
      "seed",
      "logprobs"
    ]
  },
  "provider": "vertex_ai",
  "input_cost": 0,
  "output_cost": 0,
  "litellm_model_name": "vertex_ai/1984786713414729728",
  "max_tokens": null,
  "max_input_tokens": null,
  "cleanedLitellmParams": {
    "vertex_project": "*********",
    "vertex_location": "us-central1",
    "use_in_pass_through": false
  }
}

This model configured on LiteLLM is able to call vertex_ai model's predict using the below code segment,

litellm/litellm/llms/vertex_ai/vertex_ai_non_gemini.py

Line 571 in 27e1e22

response_obj = await llm_model.predict(

We skipped the /openai/ in the model endpoint name to avoid the openai-like completion route so that we can access the model through predict() call.

krrishdholakia · 2025-02-11T17:53:28Z

note: expose vertex_ai/custom/<endpoint_id for simpler optional param translation

don't pass max retries to unsupported route Fixes #8254

* fix(utils.py): fix vertex ai optional param handling don't pass max retries to unsupported route Fixes #8254 * fix(get_supported_openai_params.py): fix linting error * fix(get_supported_openai_params.py): default to openai-like spec * test: fix test * fix: fix linting error * Improved wildcard route handling on `/models` and `/model_group/info` (#8473) * fix(model_checks.py): update returning known model from wildcard to filter based on given model prefix ensures wildcard route - `vertex_ai/gemini-*` just returns known vertex_ai/gemini- models * test(test_proxy_utils.py): add unit testing for new 'get_known_models_from_wildcard' helper * test(test_models.py): add e2e testing for `/model_group/info` endpoint * feat(prometheus.py): support tracking total requests by user_email on prometheus adds initial support for tracking total requests by user_email * test(test_prometheus.py): add testing to ensure user email is always tracked * test: update testing for new prometheus metric * test(test_prometheus_unit_tests.py): add user email to total proxy metric * test: update tests * test: fix spend tests * test: fix test * fix(pagerduty.py): fix linting error * (Bug fix) - Using `include_usage` for /completions requests + unit testing (#8484) * pass stream options (#8419) * test_completion_streaming_usage_metrics * test_text_completion_include_usage --------- Co-authored-by: Kaushik Deka <55996465+Kaushikdkrikhanu@users.noreply.github.com> * fix naming docker stable release * build(model_prices_and_context_window.json): handle azure model update * docs(token_auth.md): clarify scopes can be a list or comma separated string * docs: fix docs * add sonar pricings (#8476) * add sonar pricings * Update model_prices_and_context_window.json * Update model_prices_and_context_window.json * Update model_prices_and_context_window_backup.json * update load testing script * fix test_async_router_context_window_fallback * pplx - fix supports tool choice openai param (#8496) * fix prom check startup (#8492) * test_async_router_context_window_fallback * ci(config.yml): mark daily docker builds with `-nightly` (#8499) Resolves #8495 * (Redis Cluster) - Fixes for using redis cluster + pipeline (#8442) * update RedisCluster creation * update RedisClusterCache * add redis ClusterCache * update async_set_cache_pipeline * cleanup redis cluster usage * fix redis pipeline * test_init_async_client_returns_same_instance * fix redis cluster * update mypy_path * fix init_redis_cluster * remove stub * test redis commit * ClusterPipeline * fix import * RedisCluster import * fix redis cluster * Potential fix for code scanning alert no. 2129: Clear-text logging of sensitive information Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * fix naming of redis cluster integration * test_redis_caching_ttl_pipeline * fix async_set_cache_pipeline --------- Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * Litellm UI stable version 02 12 2025 (#8497) * fix(key_management_endpoints.py): fix `/key/list` to include `return_full_object` as a top-level query param Allows user to specify they want the keys as a list of objects * refactor(key_list.tsx): initial refactor of key table in user dashboard offloads key filtering logic to backend api prevents common error of user not being able to see their keys * fix(key_management_endpoints.py): allow internal user to query `/key/list` to see their keys * fix(key_management_endpoints.py): add validation checks and filtering to `/key/list` endpoint allow internal user to see their keys. not anybody else's * fix(view_key_table.tsx): fix issue where internal user could not see default team keys * fix: fix linting error * fix: fix linting error * fix: fix linting error * fix: fix linting error * fix: fix linting error * fix: fix linting error * fix: fix linting error * test_supports_tool_choice * test_async_router_context_window_fallback * fix: fix test (#8501) * Litellm dev 02 12 2025 p1 (#8494) * Resolves #6625 (#8459) - enables no auth for SMTP Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com> * add sonar pricings (#8476) * add sonar pricings * Update model_prices_and_context_window.json * Update model_prices_and_context_window.json * Update model_prices_and_context_window_backup.json * test: fix test --------- Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com> Co-authored-by: Dani Regli <1daniregli@gmail.com> Co-authored-by: Lucca Zenóbio <luccazen@gmail.com> * test: fix test * UI Fixes p2 (#8502) * refactor(admin.tsx): cleanup add new admin flow removes buggy flow. Ensures just 1 simple way to add users / update roles. * fix(user_search_modal.tsx): ensure 'add member' button is always visible * fix(edit_membership.tsx): ensure 'save changes' button always visible * fix(internal_user_endpoints.py): ensure user in org can be deleted Fixes issue where user couldn't be deleted if they were a member of an org * fix: fix linting error * add phoenix docs for observability integration (#8522) * Add files via upload * Update arize_integration.md * Update arize_integration.md * add Phoenix docs * Added custom_attributes to additional_keys which can be sent to athina (#8518) * (UI) fix log details page (#8524) * rollback changes to view logs page * ui new build * add interface for prefetch * fix spread operation * fix max size for request view page * clean up table * ui fix column on request logs page * ui new build * Add UI Support for Admins to Call /cache/ping and View Cache Analytics (#8475) (#8519) * [Bug] UI: Newly created key does not display on the View Key Page (#8039) - Fixed issue where all keys appeared blank for admin users. - Implemented filtering of data via team settings to ensure all keys are displayed correctly. * Fix: - Updated the validator to allow model editing when `keyTeam.team_alias === "Default Team"`. - Ensured other teams still follow the original validation rules. * - added some classes in global.css - added text wrap in output of request,response and metadata in index.tsx - fixed styles of table in table.tsx * - added full payload when we open single log entry - added Combined Info Card in index.tsx * fix: keys not showing on refresh for internal user * merge * main merge * cache page * ca remove * terms change * fix:places caching inside exp --------- Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Kaushik Deka <55996465+Kaushikdkrikhanu@users.noreply.github.com> Co-authored-by: Lucca Zenóbio <luccazen@gmail.com> Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> Co-authored-by: Dani Regli <1daniregli@gmail.com> Co-authored-by: exiao <exiao@users.noreply.github.com> Co-authored-by: vivek-athina <153479827+vivek-athina@users.noreply.github.com> Co-authored-by: Taha Ali <123803932+tahaali-dev@users.noreply.github.com>

* fix(utils.py): fix vertex ai optional param handling don't pass max retries to unsupported route Fixes BerriAI#8254 * fix(get_supported_openai_params.py): fix linting error * fix(get_supported_openai_params.py): default to openai-like spec * test: fix test * fix: fix linting error * Improved wildcard route handling on `/models` and `/model_group/info` (BerriAI#8473) * fix(model_checks.py): update returning known model from wildcard to filter based on given model prefix ensures wildcard route - `vertex_ai/gemini-*` just returns known vertex_ai/gemini- models * test(test_proxy_utils.py): add unit testing for new 'get_known_models_from_wildcard' helper * test(test_models.py): add e2e testing for `/model_group/info` endpoint * feat(prometheus.py): support tracking total requests by user_email on prometheus adds initial support for tracking total requests by user_email * test(test_prometheus.py): add testing to ensure user email is always tracked * test: update testing for new prometheus metric * test(test_prometheus_unit_tests.py): add user email to total proxy metric * test: update tests * test: fix spend tests * test: fix test * fix(pagerduty.py): fix linting error * (Bug fix) - Using `include_usage` for /completions requests + unit testing (BerriAI#8484) * pass stream options (BerriAI#8419) * test_completion_streaming_usage_metrics * test_text_completion_include_usage --------- Co-authored-by: Kaushik Deka <55996465+Kaushikdkrikhanu@users.noreply.github.com> * fix naming docker stable release * build(model_prices_and_context_window.json): handle azure model update * docs(token_auth.md): clarify scopes can be a list or comma separated string * docs: fix docs * add sonar pricings (BerriAI#8476) * add sonar pricings * Update model_prices_and_context_window.json * Update model_prices_and_context_window.json * Update model_prices_and_context_window_backup.json * update load testing script * fix test_async_router_context_window_fallback * pplx - fix supports tool choice openai param (BerriAI#8496) * fix prom check startup (BerriAI#8492) * test_async_router_context_window_fallback * ci(config.yml): mark daily docker builds with `-nightly` (BerriAI#8499) Resolves BerriAI#8495 * (Redis Cluster) - Fixes for using redis cluster + pipeline (BerriAI#8442) * update RedisCluster creation * update RedisClusterCache * add redis ClusterCache * update async_set_cache_pipeline * cleanup redis cluster usage * fix redis pipeline * test_init_async_client_returns_same_instance * fix redis cluster * update mypy_path * fix init_redis_cluster * remove stub * test redis commit * ClusterPipeline * fix import * RedisCluster import * fix redis cluster * Potential fix for code scanning alert no. 2129: Clear-text logging of sensitive information Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * fix naming of redis cluster integration * test_redis_caching_ttl_pipeline * fix async_set_cache_pipeline --------- Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * Litellm UI stable version 02 12 2025 (BerriAI#8497) * fix(key_management_endpoints.py): fix `/key/list` to include `return_full_object` as a top-level query param Allows user to specify they want the keys as a list of objects * refactor(key_list.tsx): initial refactor of key table in user dashboard offloads key filtering logic to backend api prevents common error of user not being able to see their keys * fix(key_management_endpoints.py): allow internal user to query `/key/list` to see their keys * fix(key_management_endpoints.py): add validation checks and filtering to `/key/list` endpoint allow internal user to see their keys. not anybody else's * fix(view_key_table.tsx): fix issue where internal user could not see default team keys * fix: fix linting error * fix: fix linting error * fix: fix linting error * fix: fix linting error * fix: fix linting error * fix: fix linting error * fix: fix linting error * test_supports_tool_choice * test_async_router_context_window_fallback * fix: fix test (BerriAI#8501) * Litellm dev 02 12 2025 p1 (BerriAI#8494) * Resolves BerriAI#6625 (BerriAI#8459) - enables no auth for SMTP Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com> * add sonar pricings (BerriAI#8476) * add sonar pricings * Update model_prices_and_context_window.json * Update model_prices_and_context_window.json * Update model_prices_and_context_window_backup.json * test: fix test --------- Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com> Co-authored-by: Dani Regli <1daniregli@gmail.com> Co-authored-by: Lucca Zenóbio <luccazen@gmail.com> * test: fix test * UI Fixes p2 (BerriAI#8502) * refactor(admin.tsx): cleanup add new admin flow removes buggy flow. Ensures just 1 simple way to add users / update roles. * fix(user_search_modal.tsx): ensure 'add member' button is always visible * fix(edit_membership.tsx): ensure 'save changes' button always visible * fix(internal_user_endpoints.py): ensure user in org can be deleted Fixes issue where user couldn't be deleted if they were a member of an org * fix: fix linting error * add phoenix docs for observability integration (BerriAI#8522) * Add files via upload * Update arize_integration.md * Update arize_integration.md * add Phoenix docs * Added custom_attributes to additional_keys which can be sent to athina (BerriAI#8518) * (UI) fix log details page (BerriAI#8524) * rollback changes to view logs page * ui new build * add interface for prefetch * fix spread operation * fix max size for request view page * clean up table * ui fix column on request logs page * ui new build * Add UI Support for Admins to Call /cache/ping and View Cache Analytics (BerriAI#8475) (BerriAI#8519) * [Bug] UI: Newly created key does not display on the View Key Page (BerriAI#8039) - Fixed issue where all keys appeared blank for admin users. - Implemented filtering of data via team settings to ensure all keys are displayed correctly. * Fix: - Updated the validator to allow model editing when `keyTeam.team_alias === "Default Team"`. - Ensured other teams still follow the original validation rules. * - added some classes in global.css - added text wrap in output of request,response and metadata in index.tsx - fixed styles of table in table.tsx * - added full payload when we open single log entry - added Combined Info Card in index.tsx * fix: keys not showing on refresh for internal user * merge * main merge * cache page * ca remove * terms change * fix:places caching inside exp --------- Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Kaushik Deka <55996465+Kaushikdkrikhanu@users.noreply.github.com> Co-authored-by: Lucca Zenóbio <luccazen@gmail.com> Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> Co-authored-by: Dani Regli <1daniregli@gmail.com> Co-authored-by: exiao <exiao@users.noreply.github.com> Co-authored-by: vivek-athina <153479827+vivek-athina@users.noreply.github.com> Co-authored-by: Taha Ali <123803932+tahaali-dev@users.noreply.github.com>

suresiva added the bug Something isn't working label Feb 4, 2025

github-actions bot added the mlops user request label Feb 4, 2025

krrishdholakia self-assigned this Feb 12, 2025

krrishdholakia added a commit that referenced this issue Feb 12, 2025

fix(utils.py): fix vertex ai optional param handling

04d07b0

don't pass max retries to unsupported route Fixes #8254

krrishdholakia mentioned this issue Feb 12, 2025

fix(utils.py): fix vertex ai optional param handling #8477

Merged

krrishdholakia closed this as completed in #8477 Feb 14, 2025

krrishdholakia closed this as completed in 8903bd1 Feb 14, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug]: VertexAI Custom predict endpoint call failed due to extra 'max_retries' param #8254

[Bug]: VertexAI Custom predict endpoint call failed due to extra 'max_retries' param #8254

suresiva commented Feb 4, 2025

ishaan-jaff commented Feb 4, 2025

suresiva commented Feb 4, 2025

krrishdholakia commented Feb 11, 2025

[Bug]: VertexAI Custom predict endpoint call failed due to extra 'max_retries' param #8254

[Bug]: VertexAI Custom predict endpoint call failed due to extra 'max_retries' param #8254

Comments

suresiva commented Feb 4, 2025

What happened?

Relevant log output

Are you a ML Ops Team?

What LiteLLM version are you on ?

Twitter / LinkedIn details

ishaan-jaff commented Feb 4, 2025

suresiva commented Feb 4, 2025

krrishdholakia commented Feb 11, 2025