Google GenerativeAI SDK updated to latest version #163
Conversation
…I sdk to latest version
Gunpal Jain seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
Walkthrough

This pull request updates the Google GenerativeAI SDK dependency to the latest version.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant GoogleChatModel
    participant ChatAPI
    Client->>GoogleChatModel: CreateChatCompletionAsync(request)
    GoogleChatModel->>ChatAPI: Send async streaming request
    ChatAPI-->>GoogleChatModel: Stream response tokens (via await foreach)
    GoogleChatModel->>GoogleChatModel: Accumulate tokens (using StringBuilder)
    GoogleChatModel-->>Client: Return assembled Message
```

```mermaid
sequenceDiagram
    participant Client
    participant GoogleEmbeddingModel
    participant EmbeddingModelAPI
    Client->>GoogleEmbeddingModel: CreateEmbeddingsAsync(request)
    GoogleEmbeddingModel->>GoogleEmbeddingModel: Validate request & merge settings
    GoogleEmbeddingModel->>EmbeddingModelAPI: Call EmbedContentAsync(request)
    EmbeddingModelAPI-->>GoogleEmbeddingModel: Return embeddings response
    GoogleEmbeddingModel->>GoogleEmbeddingModel: Extract and validate embedding data
    GoogleEmbeddingModel-->>Client: Return EmbeddingResponse
```
Suggested labels
Suggested reviewers
Poem
we need to remove this also
Actionable comments posted: 7
🧹 Nitpick comments (8)
src/Google/src/Predefined/GeminiModels.cs (1)
17-21: Verify pricing and consider adding documentation for price multipliers.

The model uses significantly higher pricing parameters compared to other models. Please verify:
The accuracy of the pricing structure:
- Context window: 2MB
- Input price: 0.0000035 per 1K characters
- Output price: 0.0000105 per 1K characters
- Additional prices: 0.000007 and 0.000021 per 1K characters
Consider adding XML documentation to explain what the additional price multipliers represent.
Add XML documentation to explain the price multipliers:
```diff
 /// <inheritdoc cref="GoogleAIModels.Gemini15Pro" />
+/// <remarks>
+/// Price multipliers:
+/// <list type="bullet">
+/// <item><description>First parameter: Input price per 1K characters</description></item>
+/// <item><description>Second parameter: Output price per 1K characters</description></item>
+/// <item><description>Third parameter: [Explain purpose]</description></item>
+/// <item><description>Fourth parameter: [Explain purpose]</description></item>
+/// </list>
+/// </remarks>
 public class Gemini15ProModel(GoogleProvider provider)
```

src/Google/src/GoogleEmbeddingModel.cs (2)
23-24: Validate API key usage and ensure safe storage practices.

Exposing `provider.ApiKey` in a public property might inadvertently risk key leaks if logs or debugging statements reveal its value. Consider restricting direct exposure or providing a secure alternative for retrieving credentials.
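One low-effort mitigation, sketched here with hypothetical member names (not the code under review): keep the raw key non-public and expose only a redacted form for diagnostics.

```csharp
// Hypothetical sketch: the raw key stays internal; logging code uses the redacted view.
internal string ApiKey { get; }

public string RedactedApiKey =>
    ApiKey.Length <= 8
        ? "****"
        : $"{ApiKey[..4]}****{ApiKey[^4..]}";
```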
27-40: Consider passing `cancellationToken` deeper into the `Stopwatch` usage and settings-merge logic.

While the code properly passes the token to `EmbedContentAsync`, it might be beneficial to propagate cancellation checks (e.g., `cancellationToken.ThrowIfCancellationRequested()`) after each major step for quicker response to cancellation.

src/Google/src/Extensions/GoogleGeminiExtensions.cs (1)
43-53: Clarify or unify the return value of `GetStringForFunctionArgs`.

This method may return `string.Empty` or `null`. If consumers do not expect a `null` return, consider making it consistently return an empty string, or update the method documentation to reflect the possibility of `null`.

src/Google/src/Extensions/StringExtensions.cs (1)
77-91: Consider adding error details for deserialization failures in `AsFunctionResultContent`.

Directly invoking `JsonNode.Parse(args)` may throw an exception on invalid JSON. If you anticipate or want to guard against malformed responses, catch parsing exceptions and wrap them in a more informative error message.

src/Google/src/GoogleChatModel.cs (1)
114-122: Streaming approach suggestion.

The asynchronous streaming pattern is a great improvement. However, consider capturing exceptions within the loop or adding error-handling logic in case `StreamContentAsync` fails mid-stream.

src/Google/src/GoogleConfiguration.cs (1)
28-32: Consider lowering the default temperature value.

The default temperature of 1.0 might lead to more random outputs. Consider lowering it to 0.7 for a better balance between creativity and determinism.

```diff
- public double? Temperature { get; set; } = 1D;
+ public double? Temperature { get; set; } = 0.7D;
```

src/IntegrationTests/BaseTests.cs (1)
265-266: Add documentation and categorization for the explicit test.

Consider adding:
- XML documentation explaining why this test is marked as explicit
- A test category attribute for better organization

```diff
+ /// <summary>
+ /// Tests the Google embedding functionality with a comprehensive set of text samples.
+ /// This test is marked as explicit due to potential API costs and longer execution time.
+ /// </summary>
  [Explicit]
+ [Category("Embedding")]
  public async Task GoogleEmbeddingTest()
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- src/Directory.Packages.props (1 hunks)
- src/Google/src/Extensions/GoogleGeminiExtensions.cs (1 hunks)
- src/Google/src/Extensions/StringExtensions.cs (3 hunks)
- src/Google/src/GoogleChatModel.Tokens.cs (1 hunks)
- src/Google/src/GoogleChatModel.cs (8 hunks)
- src/Google/src/GoogleConfiguration.cs (1 hunks)
- src/Google/src/GoogleEmbeddingModel.cs (1 hunks)
- src/Google/src/GoogleEmbeddingSettings.cs (1 hunks)
- src/Google/src/GoogleProvider.cs (1 hunks)
- src/Google/src/Predefined/GeminiModels.cs (1 hunks)
- src/Google/src/Predefined/GoogleEmbeddingModels.cs (1 hunks)
- src/IntegrationTests/BaseTests.cs (1 hunks)
- src/IntegrationTests/Helpers.cs (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/Directory.Packages.props
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and test / Build, test and publish
🔇 Additional comments (21)
src/Google/src/Predefined/GeminiModels.cs (2)
1-4: LGTM!

The import and namespace declarations are appropriate for the context.
11-15: ❓ Verification inconclusive
Verify the capabilities and pricing for the new Gemini15Flash model.
A new model class has been added with specific pricing parameters. Please verify:
- The model's capabilities and use cases
- The pricing structure:
- Context window: 1MB
- Input price: 0.00000035 per 1K characters
- Output price: 0.00000105 per 1K characters
- Additional prices: 0.0000007 and 0.0000021 per 1K characters
🌐 Web query:
What is Google's Gemini Flash model and its pricing?
💡 Result:
Google's Gemini 2.0 Flash is a multimodal AI model optimized for speed, efficiency, and high-volume tasks like real-time interactions and complex reasoning[1][4][6]. It supports text/image/video/audio inputs with text output (multimodal output in private preview)[4][10], featuring a 1M-token context window[1][6]. Key capabilities include code generation, data analysis, function calling[4], and integration with Google Search[10].
Pricing Details
Vertex AI Platform ([2]):
| Model Feature | Standard Price | Batch API Price |
|---|---|---|
| Input Text Tokens | $0.15/M | $0.075/M |
| Input Audio Tokens | $1/M | $0.50/M |
| Output Text Tokens | $0.60/M | $0.30/M |

Modality-based approximations (actual billing uses tokens):
- Text: ~$0.0375/M characters input / $0.15/M characters output
- Image (1024x1024): ~$0.0001935 per image
- Video: ~$0.0000387 per second
- Audio: ~$0.000025 per second[2]
Gemini API ([7][9]):
| Tier | Input Cost | Output Cost |
|---|---|---|
| Free | Unlimited (15 RPM limit) | Free |
| Pay-as-you-go | $0.10/M tokens (text/image/video), $0.70/M tokens (audio) | $0.40/M tokens |

Additional costs:
- Context caching: $1/M token-hour storage[7]
- Grounding with Google Search: First 1,500 requests/day free, then $35/1K requests[2][7]
Key Comparisons
| | 2.0 Flash | Flash-Lite ([2][6]) |
|---|---|---|
| Input Text | $0.15/M tokens | $0.075/M tokens |
| Output Text | $0.60/M tokens | $0.30/M tokens |
| Specialization | Balanced speed/performance | Cost optimization |

The model is available via Google AI Studio (free tier available), Vertex AI[4], and the Gemini app[1]. Developers can access it using `gemini-2-flash-001` in SDKs[4]. For large-scale deployments, the Batch API reduces costs by ~50%[2].

Citations:
- 1: https://blog.google/technology/google-deepmind/gemini-model-updates-february-2025/
- 2: https://cloud.google.com/vertex-ai/generative-ai/pricing
- 3: https://developers.googleblog.com/en/updated-gemini-models-reduced-15-pro-pricing-increased-rate-limits-and-more/
- 4: https://cloud.google.com/vertex-ai/generative-ai/docs/gemini-v2
- 5: https://www.uctoday.com/collaboration/gemini-pricing-everything-youll-pay-for-google-gemini/
- 6: https://ai.google.dev/gemini-api/docs/models/gemini
- 7: https://ai.google.dev/pricing
- 8: https://developers.googleblog.com/en/gemini-15-flash-8b-is-now-generally-available-for-use/
- 9: https://ai.google.dev/gemini-api/docs/pricing
- 10: https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/
Verify Gemini Flash Model Capabilities and Pricing Parameters
The new `Gemini15FlashModel` class in `src/Google/src/Predefined/GeminiModels.cs` introduces support for the Gemini Flash model using a 1MB (1024×1024) context window and specific pricing parameters. External sources confirm that Google's Gemini 2.0 Flash is a multimodal model designed for high-volume, real-time interactions, featuring a 1M-token context window and tiered pricing with different costs for text and audio inputs/outputs.

Please verify the following:
Model Capabilities:
Confirm that the model supports the intended multimodal inputs (text, image, video, audio) and real-time tasks (e.g., code generation, data analysis, function calling) as described in recent Gemini documentation.

Pricing Structure:
The class uses the following pricing multipliers:
- Context window: 1MB
- Input Price: 0.35×10⁻⁶
- Output Price: 1.05×10⁻⁶
- Additional Pricing: 0.70×10⁻⁶ and 2.1×10⁻⁶
External pricing details for Gemini models (from Gemini API/Vertex AI platforms) list base costs such as ~$0.10/M tokens for input and ~$0.40/M for output. The parameters in the code appear to apply a scaling factor (approximately 3.5×) to these base rates. Please review and confirm that:
- The multipliers reflect the correct, intended pricing tiers for all supported modalities.
- The scaling factor is applied intentionally and is consistent with the latest pricing documentation for Gemini Flash.
Once verified against the current Gemini API pricing and feature specifications, please update or document any discrepancies if needed.
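The approximately 3.5× scaling factor noted above can be sanity-checked with simple arithmetic. This sketch uses the per-character price from the code and the per-million-token base rate quoted in the web query; treating one token as roughly one character for the comparison is an assumption, since actual token-to-character ratios vary.

```csharp
// Per-character input price from GeminiModels.cs: $0.00000035 per character.
double perMillionChars = 0.00000035 * 1_000_000;   // $0.35 per 1M characters

// Base input rate quoted for Gemini Flash: ~$0.10 per 1M tokens.
double baseInputRate = 0.10;

// Ratio between the two rates, assuming ~1 token per character.
double scalingFactor = perMillionChars / baseInputRate;   // 3.5
```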
src/Google/src/GoogleEmbeddingModel.cs (1)
11-15: Ensure consistent exception handling for null or invalid `id` arguments.

Currently, the primary constructor does not throw an exception if `id` is null or empty, which could lead to issues when creating downstream resources. Consider adding a safety check, or clarify assumptions to ensure that `id` is never null or empty.

src/Google/src/Extensions/GoogleGeminiExtensions.cs (2)
12-15: Confirm fallback behavior for `IsFunctionCall`.

When `response.GetFunction()` returns null, the method directly returns false. Ensure upstream code handles that condition gracefully to avoid unexpected branching logic.
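A minimal sketch of the fallback being described, with hypothetical signatures (the actual extension lives in `GoogleGeminiExtensions.cs` and may differ):

```csharp
// Hypothetical sketch: a null function is treated as "no function call",
// so callers receiving 'false' must be prepared for a plain-text response instead.
public static bool IsFunctionCall(this GenerateContentResponse response)
{
    var function = response.GetFunction();  // may return null
    return function != null;
}
```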
55-58: Handle potential deserialization exceptions from `ToFunctionParameters`.

If `JsonSerializer.Deserialize<Schema?>` fails, it can raise exceptions. Consider adding a try-catch or returning null with a logged warning to handle malformed schema gracefully.

src/Google/src/Extensions/StringExtensions.cs (1)
58-60: Confirm arguments vs. JSON node alignment for `FunctionCall`.

Parsing `args` directly as a `JsonNode` might throw if the string is malformed JSON. If there is a possibility of invalid JSON at runtime, add exception handling or validation to avoid unexpected runtime errors.

src/Google/src/GoogleChatModel.cs (9)
3-7: New references introduced.

No issues with adding these namespaces. Ensure all are utilized to avoid clutter.
34-34: Named parameter usage.

Using a named parameter for `httpClient` enhances clarity. Looks good.
36-41: Check function-calling behavior coverage.

The new `FunctionCallingBehaviour` is well defined. Confirm it is adequately tested to handle edge cases such as malformed function calls.
62-62: Ensure complete migration from `EnhancedGenerateContentResponse`.

The signature now accepts `GenerateContentResponse`. Verify that all references to `EnhancedGenerateContentResponse` have been removed or updated.
68-68: Safe handling of function arguments.

Using `?.` and `?? string.Empty` ensures no null reference. Good defensive coding.
83-83: Migration to List.

Switching from an array to a list is fine and should simplify adding or removing messages.
125-125: Final response assembly.

Returning `sb.ToString()` after accumulating stream content is straightforward and clear.
190-190: Function argument retrieval.

Switching to `.GetStringForFunctionArgs()` is consistent with the new function-call approach. Looks correct.
233-233: Usage calculation with updated response type.

Adapting `GetUsage` to accept `GenerateContentResponse` aligns with the broader refactor. Confirm usage details are correct for the new type.

src/Google/src/Predefined/GoogleEmbeddingModels.cs (1)
1-9: Introduction of the GoogleTextEmbedding class.

Good addition. It provides a clean entry point for text embeddings. Consider adding usage examples or extra documentation to clarify usage.

src/Google/src/GoogleChatModel.Tokens.cs (1)
14-14: Asynchronous token counting.

Switching to `CountTokensAsync` with `.ToList()` is a beneficial update. Ensure exceptions are handled when the API call fails or returns unexpected data.

src/Google/src/GoogleConfiguration.cs (1)
14-16: ❓ Verification inconclusive
Verify the model ID change and its compatibility.
The model ID has been updated from "gemini-pro" to "gemini-1.5-flash". Let's verify this change.
🌐 Web query:
What are the available Gemini model IDs and their differences?
💡 Result:
Google's Gemini family offers multiple model variants optimized for different use cases, with key differences in capabilities and technical specifications:
Core Model Variants
| Model ID | Version | Input Types | Output | Key Features |
|---|---|---|---|---|
| gemini-2.0-flash | Latest | Text/Image/Audio/Video | Text/Media* | Next-gen speed & multimodal generation |
| gemini-1.5-pro | 1.5 | Text/Image/Audio/Video | Text | Handles 2M tokens (~2M words), complex reasoning tasks |
| gemini-1.5-flash | 1.5 | Text/Image/Audio/Video | Text | Fast performance across diverse tasks |
| gemini-1.0-pro | 1.0 | Text | Text | Deprecated (Feb 2025), basic NLP tasks |

*Media outputs coming soon[10]
Specialized Models
- gemini-1.5-flash-8b: Optimized for high-volume tasks with lower compute requirements[10]
- gemini-2.0-flash-lite-preview: Cost-efficient text-focused variant[10]
- text-embedding-004: Text similarity measurement[10]
- aqa: Source-grounded Q&A system[10]
Versioning System
Stable Releases
- gemini-1.5-pro-002 (Latest GA)
- gemini-1.5-pro-001
- gemini-1.0-pro-vision-001

Auto-Updating References
- gemini-1.5-pro → points to latest stable (currently -002)
- gemini-2.0-flash → always latest 2.x version

Preview Models
- gemini-2.0-flash-lite-preview-02-05
- gemini-experimental (Vertex AI)[4]

Key Differentiators
Multimodal Capacity:
- Pro Models: Full audio/video/image/text processing[10]
- Flash Models: Primarily text-focused with media inputs[7]
Context Window:
- 1.5 Pro: Up to 2M tokens (~2M words)[10]
- 2.0 Flash: Not specified but optimized for speed[10]
Optimization:
- Pro Series: Complex reasoning tasks
- Flash Series: High-throughput applications
Release Stage:
- Stable versions (-001/-002) vs experimental/preview models
For current projects requiring media processing, gemini-1.5-pro offers the most comprehensive capabilities[10], while gemini-2.0-flash provides next-gen speed advantages[4]. Developers should note the deprecation of gemini-1.0-pro in February 2025[10].
Citations:
- 1: https://blog.google/technology/ai/google-gemini-ai/
- 2: https://neontri.com/blog/google-gemini-chatgpt-comparison/
- 3: https://en.wikipedia.org/wiki/Gemini_(language_model)
- 4: https://docs.paloaltonetworks.com/ai-runtime-security/administration/ai-models-public-clouds-support
- 5: https://www.youtube.com/watch?v=Yr23Qqtjcug
- 6: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/model-versions
- 7: https://firebase.google.com/docs/vertex-ai/gemini-models
- 8: https://developers.googleblog.com/en/compare-mode-in-google-ai-studio/
- 9: https://lifehacker.com/tech/difference-between-google-gemini-apps-and-models
- 10: https://ai.google.dev/gemini-api/docs/models/gemini
Model ID change verified; please review default Temperature setting.
- The updated default ModelId `"gemini-1.5-flash"` aligns with the Gemini family variants; specifically, it selects the Flash variant from version 1.5, which is designed for fast text processing.
- New configuration properties (`TopK`, `TopP`, and `Temperature`) have been added to enable fine-tuning. Note that, although a default Temperature of 1.0 is standard, it may be too high for some applications. Please double-check that this default meets your performance and quality requirements.

src/Google/src/GoogleEmbeddingSettings.cs (1)
33-41: LGTM! Well-structured settings resolution.

The Calculate method follows a good pattern, using null coalescing to resolve settings from multiple sources.
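The null-coalescing resolution pattern being praised can be sketched like this. The property name and fallback sources are assumptions based on the output-dimension settings described in this PR, not the exact signature under review:

```csharp
// Each setting falls back through the available sources in priority order:
// per-request settings, then model defaults, then provider defaults.
public static GoogleEmbeddingSettings Calculate(
    GoogleEmbeddingSettings? requestSettings,
    GoogleEmbeddingSettings? modelSettings,
    GoogleEmbeddingSettings? providerSettings)
{
    return new GoogleEmbeddingSettings
    {
        OutputDimensionality =
            requestSettings?.OutputDimensionality ??
            modelSettings?.OutputDimensionality ??
            providerSettings?.OutputDimensionality,
    };
}
```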
src/IntegrationTests/Helpers.cs (2)
123-124: LGTM! Native Google embeddings integration.

The switch to native Google embeddings is a good improvement over using OpenAI embeddings.
175-202: ❓ Verification inconclusive
Verify the impact of commenting out Azure provider.
The Azure provider case has been commented out. Let's verify if this was intentional and if there are any dependencies.
🏁 Script executed:
```shell
#!/bin/bash
# Search for Azure provider usage in the codebase
rg "AzureOpenAiProvider|ProviderType\.Azure" -l
```

Length of output: 244
Action Required: Verify Azure Provider Dependencies
The integration test helper in `src/IntegrationTests/Helpers.cs` still has the `ProviderType.Azure` case commented out. However, our search confirms that Azure provider functionality (i.e., references to `AzureOpenAiProvider`) is actively used in several files:
src/Azure/src/AzureOpenAiChatModel.cs
src/Azure/src/AzureOpenAiEmbeddingModel.cs
src/Azure/src/AzureOpenAiProvider.cs
src/Azure/src/AzureOpenAiTextToImageModel.cs
Please review and confirm whether commenting out the Azure provider in the integration tests is intentional and that it doesn’t inadvertently break any dependencies or expected test scenarios. If the exclusion is deliberate, consider adding a comment or removing the dead code to reduce confusion.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/Google/src/GoogleEmbeddingModel.cs (1)
53-53: Simplify the values array structure.

The current implementation creates an unnecessarily nested array structure. Since we're dealing with a single embedding, we can simplify this.

```diff
- Values = new[] { values.ToArray() }.ToArray(),
+ Values = new[] { values.ToArray() },
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/Google/src/GoogleEmbeddingModel.cs (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and test / Build, test and publish
🔇 Additional comments (2)
src/Google/src/GoogleEmbeddingModel.cs (2)
9-17: LGTM! Well-structured class declaration.

The class follows modern C# patterns with a primary constructor, proper inheritance, and clean dependency injection.
54-54: Preserve measured usage in the response.

You accumulate elapsed time in `usage.Time` and then store the usage, but the returned `EmbeddingResponse` currently has `Usage = Usage.Empty`. This could lead to lost information about actual usage metrics.

```diff
- Usage = Usage.Empty,
+ Usage = usage,
```
```csharp
provider.AddUsage(usage);

if (embedResponse.Embedding == null)
    throw new GenerativeAIException("Failed to create embeddings.", "");
```
🛠️ Refactor suggestion
Provide meaningful error message.
The error message for embedding creation failure is empty. Include details about what might have caused the failure to help with debugging.
```diff
- throw new GenerativeAIException("Failed to create embeddings.", "");
+ throw new GenerativeAIException(
+     "Failed to create embeddings. The embedding response was null.",
+     "Ensure the model is properly configured and the input is valid.");
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```csharp
throw new GenerativeAIException(
    "Failed to create embeddings. The embedding response was null.",
    "Ensure the model is properly configured and the input is valid.");
```
```csharp
EmbeddingSettings? settings = null,
CancellationToken cancellationToken = default)
{
    request = request ?? throw new ArgumentNullException(nameof(request));
```
🛠️ Refactor suggestion
Add input validation for request strings.
While the request itself is validated, there's no validation for `request.Strings`. Add checks for null or empty strings to prevent potential issues.
```diff
  request = request ?? throw new ArgumentNullException(nameof(request));
+ if (request.Strings == null || !request.Strings.Any())
+ {
+     throw new ArgumentException("Request must contain at least one string to embed.", nameof(request));
+ }
+ if (request.Strings.Any(string.IsNullOrEmpty))
+ {
+     throw new ArgumentException("Request strings cannot be null or empty.", nameof(request));
+ }
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```csharp
request = request ?? throw new ArgumentNullException(nameof(request));
if (request.Strings == null || !request.Strings.Any())
{
    throw new ArgumentException("Request must contain at least one string to embed.", nameof(request));
}
if (request.Strings.Any(string.IsNullOrEmpty))
{
    throw new ArgumentException("Request strings cannot be null or empty.", nameof(request));
}
```
Summary by CodeRabbit

New Features
- Updated the `Google_GenerativeAI` package for improved performance.
- Added the `GoogleTextEmbedding` class and configurable settings for output dimensions.

Refactor

Documentation

Tests