
feat(llms): support OpenAI v1 for Azure OpenAI #755

Merged — 2 commits merged into sinaptik-ai:main on Nov 15, 2023

Conversation

@mspronesti (Contributor) commented Nov 15, 2023

@gventuri follow-up of #753

Summary by CodeRabbit

  • New Features

    • Expanded the AzureOpenAI class to support Azure-specific configurations and model invocation.
    • Introduced new parameters for enhanced Azure integration within the initialization of the AzureOpenAI class.
    • Enhanced the call method to support different logic based on the model type and API version.
  • Refactor

    • Renamed attributes to better reflect their purpose within the AzureOpenAI class.
  • Tests

    • Updated tests to align with the new attribute names and logic in the AzureOpenAI class.
    • Improved test methods to use more robust mocking techniques.

coderabbitai bot (Contributor) commented Nov 15, 2023

Walkthrough

The recent update enhances the AzureOpenAI class to integrate more deeply with Azure's services. It now supports Azure-specific configurations, including endpoint details, Active Directory tokens, and deployment names. The class is more versatile, with logic that adapts to different model types and API versions, and it introduces new properties to manage invocation and client parameters.
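The model-type dispatch the walkthrough mentions can be sketched in isolation as follows. This is a minimal stand-alone illustration, not the actual pandasai source; `chat_completion` and `completion` here are stub functions standing in for the real API calls:

```python
def chat_completion(prompt: str) -> str:
    # Stand-in for a chat-model call (e.g. a /chat/completions request).
    return f"chat:{prompt}"

def completion(prompt: str) -> str:
    # Stand-in for a legacy text-completion call.
    return f"text:{prompt}"

def call(prompt: str, is_chat_model: bool) -> str:
    # Pick the endpoint based on the deployed model type,
    # mirroring the branching described in the walkthrough.
    return chat_completion(prompt) if is_chat_model else completion(prompt)

print(call("hello", is_chat_model=True))   # chat:hello
print(call("hello", is_chat_model=False))  # text:hello
```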

Changes

File path | Change summary
--- | ---
.../azure_openai.py | Expanded AzureOpenAI class to support Azure configurations, added new attributes, updated initialization, and modified call method for different model behaviors.
.../test_azure_openai.py | Updated tests to align with AzureOpenAI changes, including attribute renaming and method call updates.

🐇🌟 In the code's gentle hop, a new update lands atop,
With Azure's might, it integrates just right,
As the autumn leaves twirl, our code unfurls,
On this day in history, it's a tech poetry mystery. 🍂📜


Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • If you reply to a review comment from CodeRabbit, the bot will automatically respond.
  • To engage with CodeRabbit bot directly around the specific lines of code in the PR, mention @coderabbitai in your review comment
  • Note: Review comments are made on code diffs or files, not on the PR overview.
  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Note: For conversation with the bot, please use the review comments on code diffs or files.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

coderabbitai bot (Contributor) left a comment

Review Status

Actionable comments generated: 5

Configuration used: CodeRabbit UI

Commits: files that changed from the base of the PR and between 57cc654 and 0ed89c6.
Files selected for processing (2)
  • pandasai/llm/azure_openai.py (3 hunks)
  • tests/llms/test_azure_openai.py (7 hunks)
Additional comments: 10
pandasai/llm/azure_openai.py (3)
  • 12-24: The import of openai and the load_dotenv function call are appropriate for setting up the environment and dependencies. However, ensure that the openai library supports the Azure-specific features mentioned in the pull request summary.

  • 50-123: The __init__ method has been updated to accept new parameters for Azure configurations. The logic for raising exceptions when required parameters are not provided is sound. However, the is_openai_v1() checks seem to imply that there are different behaviors expected based on the OpenAI API version. Ensure that the is_openai_v1() function accurately reflects the API version being used and that the logic branching is correct.

  • 177-177: The type property is simple and clear, correctly returning the type of service.
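For reference, a version check like the `is_openai_v1()` helper the review refers to can be sketched as below. This is a simplified stand-in: the real helper inspects the installed `openai` package's version rather than taking a string argument:

```python
def is_openai_v1(version: str) -> bool:
    # Treat any release whose major version is >= 1 as the "v1" client API.
    major = version.split(".")[0]
    return int(major) >= 1

print(is_openai_v1("0.28.1"))  # False
print(is_openai_v1("1.3.5"))   # True
```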

tests/llms/test_azure_openai.py (7)
  • 9-11: The OpenAIObject class is a simple wrapper to convert a dictionary into an object with attributes. This is a common pattern in Python for creating objects with dynamic attributes. However, it's important to note that this approach bypasses the usual attribute setting mechanisms of Python classes, which can lead to unexpected behavior if not used carefully. For example, it won't call any property setters.

  • 14-15: The TestAzureOpenAILLM class is correctly named to reflect that it contains tests for the AzureOpenAI class. It's good practice to have a separate test class for each class in the codebase.



  • 44-51: The test for setting a proxy is good as it ensures that the class can handle proxy configurations. It's important for classes that make network requests to support proxies due to the common use of proxies in corporate environments.

  • 81-86: The test_completion method is setting up a mock response for the completion method of the AzureOpenAI class. This is a good practice as it allows the test to verify the behavior of the completion method without making actual network requests. However, the completion method is being patched on an instance of AzureOpenAI rather than the openai.Completion class or method. This could be a mistake if the intention was to mock the OpenAI API client's completion method.

  • 107-113: Similar to the previous hunk, the `test_chat_completion` method is setting up a mock response for the `chat_completion` method. Again, it's important to ensure that the correct method is being mocked. If `chat_completion` is a method of the `AzureOpenAI` class, then this is correct. If it's supposed to be a method of the OpenAI API client, then the mock should be applied to that instead.

  • 95-100: The `AzureOpenAI` instance is being created with the correct parameters, reflecting the changes in the class's `__init__` method. This is good as it ensures that the test is in sync with the class definition.
  • 129-133: The test is correctly mocking the chat_completion method and asserting that it is called with the expected arguments. This is good practice for unit tests as it verifies that the method behaves as expected when given a specific input.
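The dictionary-to-attributes wrapper discussed in the first bullet above can be reduced to a few lines. This is a generic sketch, not the project's exact `OpenAIObject`; note that nested dictionaries stay plain dicts and that property setters are bypassed:

```python
class OpenAIObject:
    """Wrap a dict so its keys become attributes (bypasses property setters)."""
    def __init__(self, data: dict):
        self.__dict__.update(data)

resp = OpenAIObject({"choices": [{"text": "42"}], "usage": {"total_tokens": 7}})
print(resp.usage)               # {'total_tokens': 7}
print(resp.choices[0]["text"])  # 42  (nested values remain dicts)
```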


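Patching a method on an instance, as the review comments above discuss, looks like this with `unittest.mock`. This is a stand-alone sketch using a hypothetical `FakeAzureLLM` class, not the project's real test code:

```python
from unittest.mock import MagicMock

class FakeAzureLLM:
    def completion(self, prompt: str) -> str:
        raise RuntimeError("would hit the network")

llm = FakeAzureLLM()
# Patch the instance method so no real request is made; the class and other
# instances remain untouched (the concern the review raises about patching
# an instance vs. the API client class).
llm.completion = MagicMock(return_value="mocked answer")

assert llm.completion("What is 2 * 21?") == "mocked answer"
llm.completion.assert_called_once_with("What is 2 * 21?")
```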

Comment on lines 148 to 177
```diff
         dict: A dictionary containing Default Params.

         """
-        return {**super()._default_params, "engine": self.engine}
+        return {**super()._default_params, "model" if is_openai_v1() else "engine": self.deployment_name}

     def call(self, instruction: AbstractPrompt, suffix: str = "") -> str:
         """
         Call the Azure OpenAI LLM.

         Args:
             instruction (AbstractPrompt): A prompt object with instruction for LLM.
             suffix (str): Suffix to pass.

         Returns:
             str: LLM response.

         """
         self.last_prompt = instruction.to_string() + suffix

         return (
             self.chat_completion(self.last_prompt)
             if self.is_chat_model
             else self.completion(self.last_prompt)
         )

+    @property
+    def _invocation_params(self) -> Dict[str, Any]:
+        """Get the parameters used to invoke the model."""
+        if is_openai_v1():
+            return super()._invocation_params
+        else:
+            return {
+                **super()._invocation_params,
+                "api_type": self.api_type,
+                "api_version": self.api_version,
+            }
+
+    @property
+    def _client_params(self) -> Dict[str, any]:
+        client_params = {
+            "api_version": self.api_version,
+            "azure_endpoint": self.azure_endpoint,
+            "azure_deployment": self.deployment_name,
+            "azure_ad_token": self.azure_ad_token,
+            "azure_ad_token_provider": self.azure_ad_token_provider,
+        }
+        return {**client_params, **super()._client_params}
+
     @property
     def type(self) -> str:
```

Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [142-174]

The properties _default_params, _invocation_params, and _client_params are well-implemented to handle different configurations based on the API version. However, the conditional expression used as a dictionary key in _default_params is hard to read: although it parses as valid Python, parenthesizing the key makes the intent explicit.

```diff
- return {**super()._default_params, "model" if is_openai_v1() else "engine": self.deployment_name}
+ return {**super()._default_params, ("model" if is_openai_v1() else "engine"): self.deployment_name}
```
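The pattern can be checked in isolation: a conditional expression is a legal dictionary key, and the parentheses only aid readability. The sketch below uses a hypothetical `default_params` helper mirroring the method in question:

```python
def default_params(openai_v1: bool, deployment_name: str) -> dict:
    base = {"temperature": 0.0}
    # The parentheses are optional -- a conditional expression is a valid
    # dict key -- but they make the key/value split obvious at a glance.
    return {**base, ("model" if openai_v1 else "engine"): deployment_name}

print(default_params(True, "my-gpt4"))   # {'temperature': 0.0, 'model': 'my-gpt4'}
print(default_params(False, "my-gpt4"))  # {'temperature': 0.0, 'engine': 'my-gpt4'}
```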

Comment on lines 25 to 42
```diff
     def test_type_without_api_version(self):
         with pytest.raises(APIKeyNotFoundError):
-            AzureOpenAI(api_token="test", api_base="test")
+            AzureOpenAI(api_token="test", azure_endpoint="test")

     def test_type_without_deployment(self):
         with pytest.raises(MissingModelError):
-            AzureOpenAI(api_token="test", api_base="test", api_version="test")
+            AzureOpenAI(api_token="test", azure_endpoint="test", api_version="test")

     def test_type_with_token(self):
         assert (
             AzureOpenAI(
                 api_token="test",
-                api_base="test",
+                azure_endpoint="test",
                 api_version="test",
                 deployment_name="test",
             ).type
             == "azure-openai"
         )
```

The tests are checking for the correct exceptions when required parameters are not provided. This is good practice as it ensures that the class constructor correctly handles missing arguments. However, the exception APIKeyNotFoundError seems to be used for different missing parameters, which might be confusing. It might be more appropriate to have different exceptions for different types of configuration errors or to rename the exception to something more generic like ConfigurationError.
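A shared base class makes the hierarchy suggested above cheap to introduce. The sketch below is illustrative only: `ConfigurationError` and `MissingEndpointError` are hypothetical names, not part of pandasai (which does define `APIKeyNotFoundError`):

```python
class ConfigurationError(Exception):
    """Base class for all configuration problems."""

class APIKeyNotFoundError(ConfigurationError):
    """Raised when no API key is supplied or found in the environment."""

class MissingEndpointError(ConfigurationError):
    """Raised when no Azure endpoint is supplied or found in the environment."""

# Callers can catch the broad category or the specific failure.
try:
    raise MissingEndpointError("azure_endpoint is required")
except ConfigurationError as exc:
    print(type(exc).__name__)  # MissingEndpointError
```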

Comment on lines 57 to 63
```diff
     def test_params_setting(self):
         llm = AzureOpenAI(
             api_token="test",
-            api_base="test",
+            azure_endpoint="test",
             api_version="test",
             deployment_name="Deployed-GPT-3",
             is_chat_model=True,
```

The test is setting up an AzureOpenAI instance with various parameters. However, the is_chat_model parameter is being set, which does not match the updated attribute name _is_chat_model as per the summary. This discrepancy should be corrected to ensure the test reflects the current state of the code.

```diff
-            is_chat_model=True,
+            _is_chat_model=True,
```

Committable suggestion

[!IMPORTANT]
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change:

```diff
     def test_params_setting(self):
         llm = AzureOpenAI(
             api_token="test",
-            api_base="test",
             azure_endpoint="test",
             api_version="test",
             deployment_name="Deployed-GPT-3",
-            is_chat_model=True,
+            _is_chat_model=True,
```

Comment on lines 69 to 76
```diff
             stop=["\n"],
         )

-        assert llm.engine == "Deployed-GPT-3"
-        assert llm.is_chat_model
+        assert llm.deployment_name == "Deployed-GPT-3"
+        assert llm._is_chat_model
         assert llm.temperature == 0.5
         assert llm.max_tokens == 50
         assert llm.top_p == 1.0
```

The assertions here are checking that the properties of the AzureOpenAI instance are set correctly. This is a good practice as it ensures that the instance variables are being assigned as expected. However, the test should also include assertions for the new _invocation_params and _client_params properties if they are critical to the operation of the class.
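Asserting on a merged-parameters property needs no network access. The following is a minimal sketch with stand-in `BaseLLM`/`AzureLLM` classes, not the project's real ones, showing how `_client_params`-style properties can be tested directly:

```python
class BaseLLM:
    @property
    def _client_params(self) -> dict:
        return {"api_key": "sk-test"}

class AzureLLM(BaseLLM):
    api_version = "2023-05-15"
    azure_endpoint = "https://example.openai.azure.com/"

    @property
    def _client_params(self) -> dict:
        # Merge Azure-specific parameters with the base client parameters.
        azure = {
            "api_version": self.api_version,
            "azure_endpoint": self.azure_endpoint,
        }
        return {**azure, **super()._client_params}

llm = AzureLLM()
assert llm._client_params == {
    "api_version": "2023-05-15",
    "azure_endpoint": "https://example.openai.azure.com/",
    "api_key": "sk-test",
}
```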





@mspronesti (Contributor, Author) commented Nov 15, 2023

@gventuri I recommend undoing the release of 1.4.9 and re-doing it after merging this PR. As explained in the previous PR, we migrated to OpenAI v1 only for the OpenAI API, not for the Azure version supported here. As a result, version 1.4.9 is in an incoherent state.

@gventuri (Collaborator) commented

@mspronesti thanks a lot, I'll remove 1.4.9 from PyPI and push a 1.4.10 right after merging this!

coderabbitai bot (Contributor) left a comment

Review Status

Actionable comments generated: 3

Configuration used: CodeRabbit UI

Commits: files that changed from the base of the PR and between 0ed89c6 and 5d54ecb.
Files selected for processing (1)
  • pandasai/llm/azure_openai.py (4 hunks)
Additional comments: 2
pandasai/llm/azure_openai.py (2)
  • 12-24: The import statements and module docstring are clear and well-organized. The use of load_dotenv from the helpers module suggests that the environment variables will be loaded at module import time, which is a common practice for configuration.

  • 165-173: The _client_params property is well-implemented, combining specific Azure parameters with the superclass's client parameters. This is a good example of extending base class functionality in a subclass.

Comment on lines +32 to 48
```python
    azure_endpoint: Union[str, None] = None
    """Your Azure Active Directory token.
    Automatically inferred from env var `AZURE_OPENAI_AD_TOKEN` if not provided.
    For more:
    https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id.
    """
    azure_ad_token: Union[str, None] = None
    """A function that returns an Azure Active Directory token.
    Will be invoked on every request.
    """
    azure_ad_token_provider: Union[str, None] = None
    deployment_name: str
    api_version: str = ""
    """Legacy, for openai<1.0.0 support."""
    api_base: str
    """Legacy, for openai<1.0.0 support."""
    api_type: str = "azure"
```

The class attributes are well-documented. However, the azure_ad_token_provider should be a callable that returns a token, not a string. This needs to be corrected to reflect the proper type.

```diff
-    azure_ad_token_provider: Union[str, None] = None
+    azure_ad_token_provider: Optional[Callable[[], str]] = None
```

Committable suggestion

[!IMPORTANT]
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change:

```diff
     azure_endpoint: Union[str, None] = None
     """Your Azure Active Directory token.
     Automatically inferred from env var `AZURE_OPENAI_AD_TOKEN` if not provided.
     For more:
     https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id.
     """
     azure_ad_token: Union[str, None] = None
     """A function that returns an Azure Active Directory token.
     Will be invoked on every request.
     """
-    azure_ad_token_provider: Union[str, None] = None
+    azure_ad_token_provider: Optional[Callable[[], str]] = None
     deployment_name: str
     api_version: str = ""
     """Legacy, for openai<1.0.0 support."""
     api_base: str
     """Legacy, for openai<1.0.0 support."""
     api_type: str = "azure"
```
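The difference between a static AD token and a token provider can be shown in a few lines. This uses a hypothetical `auth_header` helper; in the real client these are constructor arguments, and the provider is invoked on every request so short-lived tokens stay fresh:

```python
from typing import Callable, Optional

def auth_header(
    azure_ad_token: Optional[str] = None,
    azure_ad_token_provider: Optional[Callable[[], str]] = None,
) -> str:
    # A provider callable is invoked per request; a static token is used as-is.
    token = azure_ad_token_provider() if azure_ad_token_provider else azure_ad_token
    return f"Bearer {token}"

print(auth_header(azure_ad_token="static-ad-token"))
print(auth_header(azure_ad_token_provider=lambda: "fresh-ad-token"))
```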

Comment on lines 29 to 106
```diff
     This class uses `BaseOpenAI` class to support Azure OpenAI features.
     """

+    azure_endpoint: Union[str, None] = None
+    """Your Azure Active Directory token.
+    Automatically inferred from env var `AZURE_OPENAI_AD_TOKEN` if not provided.
+    For more:
+    https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id.
+    """
+    azure_ad_token: Union[str, None] = None
+    """A function that returns an Azure Active Directory token.
+    Will be invoked on every request.
+    """
+    azure_ad_token_provider: Union[str, None] = None
+    deployment_name: str
+    api_version: str = ""
+    """Legacy, for openai<1.0.0 support."""
+    api_base: str
+    """Legacy, for openai<1.0.0 support."""
     api_type: str = "azure"
-    api_version: str
-    engine: str

     def __init__(
         self,
         api_token: Optional[str] = None,
+        azure_endpoint: Union[str, None] = None,
+        azure_ad_token: Union[str, None] = None,
+        azure_ad_token_provider: Union[str, None] = None,
         api_base: Optional[str] = None,
         api_version: Optional[str] = None,
         deployment_name: str = None,
         is_chat_model: bool = True,
         **kwargs,
     ):
         """
         __init__ method of AzureOpenAI Class.

         Args:
             api_token (str): Azure OpenAI API token.
-            api_base (str): Base url of the Azure endpoint.
+            azure_endpoint (str): Azure endpoint.
+                It should look like the following:
+                <https://YOUR_RESOURCE_NAME.openai.azure.com/>
+            azure_ad_token (str): Your Azure Active Directory token.
+                Automatically inferred from env var `AZURE_OPENAI_AD_TOKEN` if not provided.
+                For more: https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id.
+            azure_ad_token_provider (str): A function that returns an Azure Active Directory token.
+                Will be invoked on every request.
             api_version (str): Version of the Azure OpenAI API.
                 Be aware the API version may change.
+            api_base (str): Legacy, kept for backward compatibility with openai < 1.0
             deployment_name (str): Custom name of the deployed model
             is_chat_model (bool): Whether ``deployment_name`` corresponds to a Chat
                 or a Completion model.
             **kwargs: Inference Parameters.
         """

-        self.api_token = api_token or os.getenv("OPENAI_API_KEY") or None
-        self.api_base = api_base or os.getenv("OPENAI_API_BASE") or None
+        self.api_token = (
+            api_token
+            or os.getenv("OPENAI_API_KEY")
+            or os.getenv("AZURE_OPENAI_API_KEY")
+        )
+        self.azure_endpoint = azure_endpoint or os.getenv("AZURE_OPENAI_ENDPOINT")
+        self.api_base = api_base or os.getenv("OPENAI_API_BASE")
         self.api_version = api_version or os.getenv("OPENAI_API_VERSION")
         if self.api_token is None:
             raise APIKeyNotFoundError(
                 "Azure OpenAI key is required. Please add an environment variable "
                 "`OPENAI_API_KEY` or pass `api_token` as a named parameter"
             )
-        if self.api_base is None:
+        if is_openai_v1():
+            if self.azure_endpoint is None:
+                raise APIKeyNotFoundError(
+                    "Azure endpoint is required. Please add an environment variable "
+                    "`AZURE_OPENAI_API_ENDPOINT` or pass `azure_endpoint` as a named parameter"
+                )
+        elif self.api_base is None:
             raise APIKeyNotFoundError(
                 "Azure OpenAI base is required. Please add an environment variable "
                 "`OPENAI_API_BASE` or pass `api_base` as a named parameter"
```

Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [50-140]

The __init__ method has been updated to handle the new Azure-specific parameters. The logic for raising exceptions when required parameters are not provided is correct and follows good error handling practices. However, the api_version check should be inside the is_openai_v1() condition block, as it is only relevant for the v1 API.

```diff
         if is_openai_v1():
             if self.azure_endpoint is None:
                 raise APIKeyNotFoundError(
                     "Azure endpoint is required. Please add an environment variable "
                     "`AZURE_OPENAI_API_ENDPOINT` or pass `azure_endpoint` as a named parameter"
                 )
+            if self.api_version is None:
+                raise APIKeyNotFoundError(
+                    "Azure OpenAI version is required. Please add an environment variable "
+                    "`OPENAI_API_VERSION` or pass `api_version` as a named parameter"
+                )
-        elif self.api_base is None:
-            raise APIKeyNotFoundError(
-                "Azure OpenAI base is required. Please add an environment variable "
-                "`OPENAI_API_BASE` or pass `api_base` as a named parameter"
-            )
-        if self.api_version is None:
-            raise APIKeyNotFoundError(
-                "Azure OpenAI version is required. Please add an environment variable "
-                "`OPENAI_API_VERSION` or pass `api_version` as a named parameter"
-            )
```
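The environment-variable fallback pattern used throughout `__init__` can be exercised on its own. The sketch below uses a hypothetical `resolve_endpoint` helper, not the actual pandasai code:

```python
import os
from typing import Optional

def resolve_endpoint(azure_endpoint: Optional[str] = None) -> str:
    # Explicit argument wins; otherwise fall back to the environment variable.
    endpoint = azure_endpoint or os.getenv("AZURE_OPENAI_ENDPOINT")
    if endpoint is None:
        raise ValueError(
            "Azure endpoint is required. Set `AZURE_OPENAI_ENDPOINT` "
            "or pass `azure_endpoint` as a named parameter."
        )
    return endpoint

os.environ["AZURE_OPENAI_ENDPOINT"] = "https://example.openai.azure.com/"
print(resolve_endpoint())  # https://example.openai.azure.com/
```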

Comment on lines 147 to +150
```diff
         dict: A dictionary containing Default Params.

         """
-        return {**super()._default_params, "engine": self.engine}
+        return {**super()._default_params, "model" if is_openai_v1() else "engine": self.deployment_name}

     def call(self, instruction: AbstractPrompt, suffix: str = "") -> str:
         """
         Call the Azure OpenAI LLM.
```

The conditional expression used as the dictionary key in _default_params is valid Python but hard to read at a glance; parenthesizing the key makes the intent explicit.

```diff
- return {**super()._default_params, "model" if is_openai_v1() else "engine": self.deployment_name}
+ return {**super()._default_params, ("model" if is_openai_v1() else "engine"): self.deployment_name}
```

Committable suggestion

[!IMPORTANT]
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change:

```diff
         dict: A dictionary containing Default Params.
-
-        """
-        return {**super()._default_params, "engine": self.engine}
-
-    def call(self, instruction: AbstractPrompt, suffix: str = "") -> str:
-        """
-        Call the Azure OpenAI LLM.
+        """
+        return {**super()._default_params, ("model" if is_openai_v1() else "engine"): self.deployment_name}
```

@gventuri gventuri merged commit 372f65d into sinaptik-ai:main Nov 15, 2023
18 checks passed