-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(llms): support OpenAI v1 for Azure OpenAI #755
Conversation
WalkthroughThe recent update enhances the Changes
TipsChat with CodeRabbit Bot (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 5
Configuration used: CodeRabbit UI
Files selected for processing (2)
- pandasai/llm/azure_openai.py (3 hunks)
- tests/llms/test_azure_openai.py (7 hunks)
Additional comments: 10
pandasai/llm/azure_openai.py (3)
12-24: The import of
openai
and theload_dotenv
function call are appropriate for setting up the environment and dependencies. However, ensure that theopenai
library supports the Azure-specific features mentioned in the pull request summary.50-123: The
__init__
method has been updated to accept new parameters for Azure configurations. The logic for raising exceptions when required parameters are not provided is sound. However, theis_openai_v1()
checks seem to imply that there are different behaviors expected based on the OpenAI API version. Ensure that theis_openai_v1()
function accurately reflects the API version being used and that the logic branching is correct.177-177: The
type
property is simple and clear, correctly returning the type of service.tests/llms/test_azure_openai.py (7)
9-11: The
OpenAIObject
class is a simple wrapper to convert a dictionary into an object with attributes. This is a common pattern in Python for creating objects with dynamic attributes. However, it's important to note that this approach bypasses the usual attribute setting mechanisms of Python classes, which can lead to unexpected behavior if not used carefully. For example, it won't call any property setters.14-15: The
TestAzureOpenAILLM
class is correctly named to reflect that it contains tests for theAzureOpenAI
class. It's good practice to have a separate test class for each class in the codebase.* 44-51: The test for setting a proxy is good as it ensures that the class can handle proxy configurations. It's important for classes that make network requests to support proxies due to the common use of proxies in corporate environments.
- 81-86: The
test_completion
method is setting up a mock response for thecompletion
method of theAzureOpenAI
class. This is a good practice as it allows the test to verify the behavior of thecompletion
method without making actual network requests. However, thecompletion
method is being patched on an instance ofAzureOpenAI
rather than theopenai.Completion
class or method. This could be a mistake if the intention was to mock the OpenAI API client's completion method.* 107-113: Similar to the previous hunk, the `test_chat_completion` method is setting up a mock response for the `chat_completion` method. Again, it's important to ensure that the correct method is being mocked. If `chat_completion` is a method of the `AzureOpenAI` class, then this is correct. If it's supposed to be a method of the OpenAI API client, then the mock should be applied to that instead. * 95-100: The `AzureOpenAI` instance is being created with the correct parameters, reflecting the changes in the class's `__init__` method. This is good as it ensures that the test is in sync with the class definition.
- 129-133: The test is correctly mocking the
chat_completion
method and asserting that it is called with the expected arguments. This is good practice for unit tests as it verifies that the method behaves as expected when given a specific input.</blockquote></details></blockquote></details> </details>
dict: A dictionary containing Default Params. | ||
|
||
""" | ||
return {**super()._default_params, "engine": self.engine} | ||
|
||
def call(self, instruction: AbstractPrompt, suffix: str = "") -> str: | ||
""" | ||
Call the Azure OpenAI LLM. | ||
return {**super()._default_params, "model" if is_openai_v1() else "engine": self.deployment_name} | ||
|
||
Args: | ||
instruction (AbstractPrompt): A prompt object with instruction for LLM. | ||
suffix (str): Suffix to pass. | ||
|
||
Returns: | ||
str: LLM response. | ||
|
||
""" | ||
self.last_prompt = instruction.to_string() + suffix | ||
@property | ||
def _invocation_params(self) -> Dict[str, Any]: | ||
"""Get the parameters used to invoke the model.""" | ||
if is_openai_v1(): | ||
return super()._invocation_params | ||
else: | ||
return { | ||
**super()._invocation_params, | ||
"api_type": self.api_type, | ||
"api_version": self.api_version, | ||
} | ||
|
||
return ( | ||
self.chat_completion(self.last_prompt) | ||
if self.is_chat_model | ||
else self.completion(self.last_prompt) | ||
) | ||
@property | ||
def _client_params(self) -> Dict[str, any]: | ||
client_params = { | ||
"api_version": self.api_version, | ||
"azure_endpoint": self.azure_endpoint, | ||
"azure_deployment": self.deployment_name, | ||
"azure_ad_token": self.azure_ad_token, | ||
"azure_ad_token_provider": self.azure_ad_token_provider, | ||
} | ||
return {**client_params, **super()._client_params} | ||
|
||
@property | ||
def type(self) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [142-174]
The properties _default_params
, _invocation_params
, and _client_params
are well-implemented to handle different configurations based on the API version. However, there is a syntax error in the _default_params
method. The ternary operator is used incorrectly. It should be fixed to correctly return the dictionary with the appropriate key based on the API version.
- return {**super()._default_params, "model" if is_openai_v1() else "engine": self.deployment_name}
+ return {**super()._default_params, ("model" if is_openai_v1() else "engine"): self.deployment_name}
def test_type_without_api_version(self): | ||
with pytest.raises(APIKeyNotFoundError): | ||
AzureOpenAI(api_token="test", api_base="test") | ||
AzureOpenAI(api_token="test", azure_endpoint="test") | ||
|
||
def test_type_without_deployment(self): | ||
with pytest.raises(MissingModelError): | ||
AzureOpenAI(api_token="test", api_base="test", api_version="test") | ||
AzureOpenAI(api_token="test", azure_endpoint="test", api_version="test") | ||
|
||
def test_type_with_token(self): | ||
assert ( | ||
AzureOpenAI( | ||
api_token="test", | ||
api_base="test", | ||
api_version="test", | ||
deployment_name="test", | ||
).type | ||
== "azure-openai" | ||
AzureOpenAI( | ||
api_token="test", | ||
azure_endpoint="test", | ||
api_version="test", | ||
deployment_name="test", | ||
).type | ||
== "azure-openai" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests are checking for the correct exceptions when required parameters are not provided. This is good practice as it ensures that the class constructor correctly handles missing arguments. However, the exception APIKeyNotFoundError
seems to be used for different missing parameters, which might be confusing. It might be more appropriate to have different exceptions for different types of configuration errors or to rename the exception to something more generic like ConfigurationError
.
def test_params_setting(self): | ||
llm = AzureOpenAI( | ||
api_token="test", | ||
api_base="test", | ||
azure_endpoint="test", | ||
api_version="test", | ||
deployment_name="Deployed-GPT-3", | ||
is_chat_model=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is setting up an AzureOpenAI
instance with various parameters. However, the is_chat_model
parameter is being set, which does not match the updated attribute name _is_chat_model
as per the summary. This discrepancy should be corrected to ensure the test reflects the current state of the code.
- is_chat_model=True,
+ _is_chat_model=True,
Commitable suggestion
[!IMPORTANT]
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
def test_params_setting(self): | |
llm = AzureOpenAI( | |
api_token="test", | |
api_base="test", | |
azure_endpoint="test", | |
api_version="test", | |
deployment_name="Deployed-GPT-3", | |
is_chat_model=True, | |
def test_params_setting(self): | |
llm = AzureOpenAI( | |
api_token="test", | |
azure_endpoint="test", | |
api_version="test", | |
deployment_name="Deployed-GPT-3", | |
_is_chat_model=True, |
stop=["\n"], | ||
) | ||
|
||
assert llm.engine == "Deployed-GPT-3" | ||
assert llm.is_chat_model | ||
assert llm.deployment_name == "Deployed-GPT-3" | ||
assert llm._is_chat_model | ||
assert llm.temperature == 0.5 | ||
assert llm.max_tokens == 50 | ||
assert llm.top_p == 1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assertions here are checking that the properties of the AzureOpenAI
instance are set correctly. This is a good practice as it ensures that the instance variables are being assigned as expected. However, the test should also include assertions for the new _invocation_params
and _client_params
properties if they are critical to the operation of the class.
<!-- This is an auto-generated comment by CodeRabbit -->
@gventuri I recommend undoing the release of 1.4.9. and re-doing it after merging this PR. As explained in the previous PR, we migrated to Openai v1 only for OpenAI API, not for the Azure's version - supported here. As a result, version 1.4.9 is in a non-coherent state. |
@mspronesti thanks a lot, will remove the 1.4.9 from pypi and push a 1.4.10 right after merging this! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 3
Configuration used: CodeRabbit UI
Files selected for processing (1)
- pandasai/llm/azure_openai.py (4 hunks)
Additional comments: 2
pandasai/llm/azure_openai.py (2)
12-24: The import statements and module docstring are clear and well-organized. The use of
load_dotenv
from thehelpers
module suggests that the environment variables will be loaded at module import time, which is a common practice for configuration.165-173: The
_client_params
property is well-implemented, combining specific Azure parameters with the superclass's client parameters. This is a good example of extending base class functionality in a subclass.
azure_endpoint: Union[str, None] = None | ||
"""Your Azure Active Directory token. | ||
Automatically inferred from env var `AZURE_OPENAI_AD_TOKEN` if not provided. | ||
For more: | ||
https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id. | ||
""" | ||
azure_ad_token: Union[str, None] = None | ||
"""A function that returns an Azure Active Directory token. | ||
Will be invoked on every request. | ||
""" | ||
azure_ad_token_provider: Union[str, None] = None | ||
deployment_name: str | ||
api_version: str = "" | ||
"""Legacy, for openai<1.0.0 support.""" | ||
api_base: str | ||
"""Legacy, for openai<1.0.0 support.""" | ||
api_type: str = "azure" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The class attributes are well-documented. However, the azure_ad_token_provider
should be a callable that returns a token, not a string. This needs to be corrected to reflect the proper type.
- azure_ad_token_provider: Union[str, None] = None
+ azure_ad_token_provider: Optional[Callable[[], str]] = None
Commitable suggestion
[!IMPORTANT]
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
azure_endpoint: Union[str, None] = None | |
"""Your Azure Active Directory token. | |
Automatically inferred from env var `AZURE_OPENAI_AD_TOKEN` if not provided. | |
For more: | |
https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id. | |
""" | |
azure_ad_token: Union[str, None] = None | |
"""A function that returns an Azure Active Directory token. | |
Will be invoked on every request. | |
""" | |
azure_ad_token_provider: Union[str, None] = None | |
deployment_name: str | |
api_version: str = "" | |
"""Legacy, for openai<1.0.0 support.""" | |
api_base: str | |
"""Legacy, for openai<1.0.0 support.""" | |
api_type: str = "azure" | |
azure_endpoint: Union[str, None] = None | |
"""Your Azure Active Directory token. | |
Automatically inferred from env var `AZURE_OPENAI_AD_TOKEN` if not provided. | |
For more: | |
https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id. | |
""" | |
azure_ad_token: Union[str, None] = None | |
"""A function that returns an Azure Active Directory token. | |
Will be invoked on every request. | |
""" | |
azure_ad_token_provider: Optional[Callable[[], str]] = None | |
deployment_name: str | |
api_version: str = "" | |
"""Legacy, for openai<1.0.0 support.""" | |
api_base: str | |
"""Legacy, for openai<1.0.0 support.""" | |
api_type: str = "azure" |
This class uses `BaseOpenAI` class to support Azure OpenAI features. | ||
""" | ||
|
||
azure_endpoint: Union[str, None] = None | ||
"""Your Azure Active Directory token. | ||
Automatically inferred from env var `AZURE_OPENAI_AD_TOKEN` if not provided. | ||
For more: | ||
https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id. | ||
""" | ||
azure_ad_token: Union[str, None] = None | ||
"""A function that returns an Azure Active Directory token. | ||
Will be invoked on every request. | ||
""" | ||
azure_ad_token_provider: Union[str, None] = None | ||
deployment_name: str | ||
api_version: str = "" | ||
"""Legacy, for openai<1.0.0 support.""" | ||
api_base: str | ||
"""Legacy, for openai<1.0.0 support.""" | ||
api_type: str = "azure" | ||
api_version: str | ||
engine: str | ||
|
||
def __init__( | ||
self, | ||
api_token: Optional[str] = None, | ||
api_base: Optional[str] = None, | ||
api_version: Optional[str] = None, | ||
deployment_name: str = None, | ||
is_chat_model: bool = True, | ||
**kwargs, | ||
self, | ||
api_token: Optional[str] = None, | ||
azure_endpoint: Union[str, None] = None, | ||
azure_ad_token: Union[str, None] = None, | ||
azure_ad_token_provider: Union[str, None] = None, | ||
api_base: Optional[str] = None, | ||
api_version: Optional[str] = None, | ||
deployment_name: str = None, | ||
is_chat_model: bool = True, | ||
**kwargs, | ||
): | ||
""" | ||
__init__ method of AzureOpenAI Class. | ||
|
||
Args: | ||
api_token (str): Azure OpenAI API token. | ||
api_base (str): Base url of the Azure endpoint. | ||
azure_endpoint (str): Azure endpoint. | ||
It should look like the following: | ||
<https://YOUR_RESOURCE_NAME.openai.azure.com/> | ||
azure_ad_token (str): Your Azure Active Directory token. | ||
Automatically inferred from env var `AZURE_OPENAI_AD_TOKEN` if not provided. | ||
For more: https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id. | ||
azure_ad_token_provider (str): A function that returns an Azure Active Directory token. | ||
Will be invoked on every request. | ||
api_version (str): Version of the Azure OpenAI API. | ||
Be aware the API version may change. | ||
api_base (str): Legacy, kept for backward compatibility with openai < 1.0 | ||
deployment_name (str): Custom name of the deployed model | ||
is_chat_model (bool): Whether ``deployment_name`` corresponds to a Chat | ||
or a Completion model. | ||
**kwargs: Inference Parameters. | ||
""" | ||
|
||
self.api_token = api_token or os.getenv("OPENAI_API_KEY") or None | ||
self.api_base = api_base or os.getenv("OPENAI_API_BASE") or None | ||
self.api_token = ( | ||
api_token | ||
or os.getenv("OPENAI_API_KEY") | ||
or os.getenv("AZURE_OPENAI_API_KEY") | ||
) | ||
self.azure_endpoint = azure_endpoint or os.getenv("AZURE_OPENAI_ENDPOINT") | ||
self.api_base = api_base or os.getenv("OPENAI_API_BASE") | ||
self.api_version = api_version or os.getenv("OPENAI_API_VERSION") | ||
if self.api_token is None: | ||
raise APIKeyNotFoundError( | ||
"Azure OpenAI key is required. Please add an environment variable " | ||
"`OPENAI_API_KEY` or pass `api_token` as a named parameter" | ||
) | ||
if self.api_base is None: | ||
if is_openai_v1(): | ||
if self.azure_endpoint is None: | ||
raise APIKeyNotFoundError( | ||
"Azure endpoint is required. Please add an environment variable " | ||
"`AZURE_OPENAI_API_ENDPOINT` or pass `azure_endpoint` as a named parameter" | ||
) | ||
elif self.api_base is None: | ||
raise APIKeyNotFoundError( | ||
"Azure OpenAI base is required. Please add an environment variable " | ||
"`OPENAI_API_BASE` or pass `api_base` as a named parameter" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [50-140]
The __init__
method has been updated to handle the new Azure-specific parameters. The logic for raising exceptions when required parameters are not provided is correct and follows good error handling practices. However, the api_version
check should be inside the is_openai_v1()
condition block, as it is only relevant for the v1 API.
if is_openai_v1():
if self.azure_endpoint is None:
raise APIKeyNotFoundError(
"Azure endpoint is required. Please add an environment variable "
"`AZURE_OPENAI_API_ENDPOINT` or pass `azure_endpoint` as a named parameter"
)
+ if self.api_version is None:
+ raise APIKeyNotFoundError(
+ "Azure OpenAI version is required. Please add an environment variable "
+ "`OPENAI_API_VERSION` or pass `api_version` as a named parameter"
+ )
- elif self.api_base is None:
- raise APIKeyNotFoundError(
- "Azure OpenAI base is required. Please add an environment variable "
- "`OPENAI_API_BASE` or pass `api_base` as a named parameter"
- )
- if self.api_version is None:
- raise APIKeyNotFoundError(
- "Azure OpenAI version is required. Please add an environment variable "
- "`OPENAI_API_VERSION` or pass `api_version` as a named parameter"
- )
dict: A dictionary containing Default Params. | ||
|
||
""" | ||
return {**super()._default_params, "engine": self.engine} | ||
|
||
def call(self, instruction: AbstractPrompt, suffix: str = "") -> str: | ||
""" | ||
Call the Azure OpenAI LLM. | ||
return {**super()._default_params, "model" if is_openai_v1() else "engine": self.deployment_name} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a syntax error in the _default_params
method. The ternary operator is used incorrectly. It should be fixed to correctly return the dictionary with the appropriate key based on the API version.
- return {**super()._default_params, "model" if is_openai_v1() else "engine": self.deployment_name}
+ return {**super()._default_params, ("model" if is_openai_v1() else "engine"): self.deployment_name}
Commitable suggestion
[!IMPORTANT]
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
dict: A dictionary containing Default Params. | |
""" | |
return {**super()._default_params, "engine": self.engine} | |
def call(self, instruction: AbstractPrompt, suffix: str = "") -> str: | |
""" | |
Call the Azure OpenAI LLM. | |
return {**super()._default_params, "model" if is_openai_v1() else "engine": self.deployment_name} | |
""" | |
return {**super()._default_params, ("model" if is_openai_v1() else "engine"): self.deployment_name} |
@gventuri follow-up of #753
Summary by CodeRabbit
New Features
AzureOpenAI
class to support Azure-specific configurations and model invocation.AzureOpenAI
class.call
method to support different logic based on the model type and API version.Refactor
AzureOpenAI
class.Tests
AzureOpenAI
class.