Deepseek R1 integration for Agent Interactions #236

Merged
merged 8 commits into from
Jan 30, 2025

Conversation

@dhirenmathur (Contributor) commented Jan 30, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for DeepSeek AI provider
    • Introduced new methods for retrieving global AI provider and preferred LLM
  • Improvements

    • Enhanced secret management to support DeepSeek provider
    • Updated API key validation for new provider
    • Streamlined LLM initialization process
  • Dependency Updates

    • Updated multiple libraries including FastAPI, OpenAI, and LangChain
    • Added new DeepSeek-related dependencies
  • Bug Fixes

    • Improved error handling in provider and secret management services

@coderabbitai bot (Contributor) commented Jan 30, 2025

Warning

Rate limit exceeded

@dhirenmathur has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 12 minutes and 39 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between d8ebe65 and 24d5f44.

📒 Files selected for processing (19)
  • GETTING_STARTED.md (1 hunks)
  • app/api/router.py (6 hunks)
  • app/main.py (1 hunks)
  • app/modules/auth/api_key_service.py (8 hunks)
  • app/modules/conversations/conversation/conversation_controller.py (1 hunks)
  • app/modules/conversations/conversation/conversation_service.py (3 hunks)
  • app/modules/conversations/conversations_router.py (1 hunks)
  • app/modules/intelligence/agents/agents/unit_test_agent.py (1 hunks)
  • app/modules/intelligence/provider/provider_service.py (6 hunks)
  • app/modules/intelligence/tools/code_query_tools/get_node_neighbours_from_node_id_tool.py (2 hunks)
  • app/modules/intelligence/tools/kg_based_tools/get_code_from_multiple_node_ids_tool.py (2 hunks)
  • app/modules/intelligence/tools/kg_based_tools/get_code_from_probable_node_name_tool.py (1 hunks)
  • app/modules/key_management/secret_manager.py (10 hunks)
  • app/modules/parsing/graph_construction/parsing_controller.py (1 hunks)
  • app/modules/parsing/graph_construction/parsing_service.py (1 hunks)
  • app/modules/utils/email_helper.py (1 hunks)
  • readme.md (3 hunks)
  • requirements.txt (4 hunks)
  • start.sh (1 hunks)

Walkthrough

This pull request introduces support for a new AI provider, DeepSeek, across multiple modules. The changes span provider configuration, secret management, and API routing. The implementation adds new methods for retrieving global AI providers, updates secret management to handle the new provider, and modifies dependencies to include DeepSeek-related libraries. The modifications enhance the system's flexibility in managing AI providers while maintaining existing functionality.

Changes

  • app/modules/intelligence/provider/provider_controller.py: Added get_global_ai_provider method to retrieve the global AI provider
  • app/modules/intelligence/provider/provider_router.py: Added new route get_global_ai_provider for retrieving the global AI provider
  • app/modules/intelligence/provider/provider_service.py: Introduced DeepSeek provider support; added methods for global AI provider and preferred LLM retrieval
  • app/modules/key_management/secret_manager.py: Extended secret management to support the DeepSeek provider
  • app/modules/key_management/secrets_schema.py: Updated provider types and API key validation for DeepSeek
  • requirements.txt: Updated dependencies; added DeepSeek-related libraries

Sequence Diagram

sequenceDiagram
    participant User
    participant ProviderAPI
    participant ProviderController
    participant ProviderService
    participant SecretManager

    User->>ProviderAPI: Request global AI provider
    ProviderAPI->>ProviderController: get_global_ai_provider(user_id)
    ProviderController->>ProviderService: get_global_ai_provider(user_id)
    ProviderService->>SecretManager: Retrieve provider configuration
    SecretManager-->>ProviderService: Return provider details
    ProviderService-->>ProviderController: Return global AI provider
    ProviderController-->>ProviderAPI: Return provider information
    ProviderAPI->>User: Respond with global AI provider

Poem

🐰 A Rabbit's Ode to DeepSeek Provider

Hop, hop, through code's verdant maze,
DeepSeek joins our AI-powered days!
New secrets, routes, and libraries bright,
Expanding our linguistic might! 🌟
Coding rabbits dance with glee! 🎉



@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 3

🧹 Nitpick comments (8)
app/modules/intelligence/provider/provider_service.py (4)

32-32: Consider extracting base URL to a configuration variable.

Hard-coding "https://openrouter.ai/api/v1" might complicate maintenance if the endpoint changes. Extracting it as an environment variable or config setting could improve flexibility.
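A minimal sketch of the extraction (the OPENROUTER_BASE_URL environment variable name is an assumption, not something the PR defines):

import os

# Hypothetical: allow the endpoint to be overridden via the environment,
# falling back to the value that is currently hard-coded.
OPENROUTER_BASE_URL = os.getenv(
    "OPENROUTER_BASE_URL", "https://openrouter.ai/api/v1"
)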


140-141: Unused parameter size in _get_provider_config.

Though the method signature mentions size, it’s never used in the method body. Remove or utilize this parameter to avoid confusion.

- def _get_provider_config(self, size: str) -> str:
+ def _get_provider_config(self) -> str:
    """Get the preferred provider configuration for the current user."""

152-157: Use consistent environment variable naming.

Instead of isDevelopmentMode, consider standardizing as ISDEVELOPMENTMODE. This aligns with environment variable conventions and static analysis hints.

- if os.getenv("isDevelopmentMode") == "enabled":
+ if os.getenv("ISDEVELOPMENTMODE") == "enabled":
🧰 Tools
🪛 Ruff (0.8.2)

154-154: Use capitalized environment variable ISDEVELOPMENTMODE instead of isDevelopmentMode

(SIM112)


Line range hint 261-289: Inconsistent handling for DeepSeek preference.

get_preferred_llm auto-converts DeepSeek to OpenAI, which contradicts the user’s stated preference. Confirm this is intentional. If so, consider adding a docstring or comment clarifying why DeepSeek usage is deferred.
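For example, a brief inline note (wording illustrative) would make the deferral explicit:

# NOTE: DeepSeek preferences are intentionally mapped to OpenAI here;
# native DeepSeek support for this path is not wired up yet.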

app/modules/key_management/secrets_schema.py (1)

24-25: Combine condition branches to reduce duplication.

You can simplify logic by merging the consecutive checks for sk-ant- and sk-. This helps maintain readability and reduces nested conditionals.

- elif v.startswith("sk-ant-"):
-     return v
- elif v.startswith("sk-"):
-     return v
+ elif v.startswith("sk-ant-") or v.startswith("sk-"):
+     return v
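As a further simplification, str.startswith accepts a tuple of prefixes, so the two checks can also be written as a single call: v.startswith(("sk-ant-", "sk-")).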
🧰 Tools
🪛 Ruff (0.8.2)

22-25: Combine if branches using logical or operator

Combine if branches

(SIM114)

app/modules/intelligence/provider/provider_controller.py (1)

37-45: Exception chaining for clearer tracebacks.

Raising with raise HTTPException(...) from e preserves the original traceback and makes debugging easier. Otherwise, this code is solid for exposing backend errors gracefully.

- raise HTTPException(status_code=500, detail=f"Error getting AI provider: {str(e)}")
+ raise HTTPException(status_code=500, detail=f"Error getting AI provider: {str(e)}") from e
🧰 Tools
🪛 Ruff (0.8.2)

42-44: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

app/modules/intelligence/provider/provider_router.py (1)

39-47: LGTM! Clean implementation of the global AI provider endpoint.

The implementation follows existing patterns and properly handles authentication and database session management.

Consider addressing the static analysis hint about Depends calls in argument defaults. Note that moving the Depends() calls into the function body does not work in FastAPI, since dependencies are resolved from the signature; the idiomatic fix is to wrap the parameters with typing.Annotated (the dict annotation for user is illustrative):

-    async def get_global_ai_provider(
-        db: Session = Depends(get_db),
-        user=Depends(AuthService.check_auth),
-    ):
+    async def get_global_ai_provider(
+        db: Annotated[Session, Depends(get_db)],
+        user: Annotated[dict, Depends(AuthService.check_auth)],
+    ):
🧰 Tools
🪛 Ruff (0.8.2)

42-42: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


43-43: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

app/modules/key_management/secret_manager.py (1)

124-128: Consider improving the comment about key storage limitation.

The comment "because user can only store one key for now" could be more descriptive.

-            #because user can only store one key for now
+            # Fall back to user's preferred LLM provider since the system currently supports storing only one API key per user
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c4e5a83 and d8ebe65.

📒 Files selected for processing (7)
  • app/modules/intelligence/provider/provider_controller.py (1 hunks)
  • app/modules/intelligence/provider/provider_router.py (1 hunks)
  • app/modules/intelligence/provider/provider_service.py (6 hunks)
  • app/modules/key_management/secret_manager.py (5 hunks)
  • app/modules/key_management/secrets_schema.py (3 hunks)
  • app/modules/parsing/graph_construction/parsing_service.py (1 hunks)
  • requirements.txt (3 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
app/modules/intelligence/provider/provider_controller.py

42-44: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

app/modules/intelligence/provider/provider_router.py

42-42: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


43-43: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

app/modules/key_management/secrets_schema.py

22-25: Combine if branches using logical or operator

Combine if branches

(SIM114)

app/modules/key_management/secret_manager.py

49-49: Use capitalized environment variable ISDEVELOPMENTMODE instead of isDevelopmentMode

(SIM112)


186-186: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


187-187: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


189-189: Use capitalized environment variable ISDEVELOPMENTMODE instead of isDevelopmentMode

(SIM112)

app/modules/intelligence/provider/provider_service.py

154-154: Use capitalized environment variable ISDEVELOPMENTMODE instead of isDevelopmentMode

(SIM112)


168-168: Use capitalized environment variable ISDEVELOPMENTMODE instead of isDevelopmentMode

(SIM112)


218-218: Use os.getenv("isDevelopmentMode") != "enabled" instead of not os.getenv("isDevelopmentMode") == "enabled"

Replace with != operator

(SIM201)


218-218: Use capitalized environment variable ISDEVELOPMENTMODE instead of isDevelopmentMode

(SIM112)


226-226: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

🔇 Additional comments (9)
app/modules/intelligence/provider/provider_service.py (4)

10-10: Good addition to support DeepSeek.

Importing ChatDeepSeek ensures your code can integrate with the new provider. Confirm that the langchain_deepseek library is installed and properly versioned.


50-54: DeepSeek listing is consistent with other providers.

Including DeepSeek in list_available_llms is consistent and ensures it is discoverable by users.


88-138: Solid use of a structured config dictionary.

The MODEL_CONFIGS dictionary is well-organized, mapping provider and model size. This approach simplifies expansions. Keep an eye on version increments, and consider referencing environment variables or central configuration if models update frequently.
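For illustration, the general shape such a mapping takes (the provider keys and model names below are placeholders, not the PR's actual values):

MODEL_CONFIGS = {
    "openai": {"small": "gpt-4o-mini", "large": "gpt-4o"},
    "deepseek": {"small": "deepseek-chat", "large": "deepseek-reasoner"},
}

# Resolution is then a simple two-level lookup:
model_name = MODEL_CONFIGS["deepseek"]["large"]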


234-238: Watch out for forced fallback to OpenAI.

Overriding provider = "openai" when deepseek is selected may confuse users expecting the DeepSeek small model. Consider making the fallback logic transparent by logging or removing once DeepSeek is fully supported.
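A minimal sketch of a logged fallback (variable and logger names are illustrative):

import logging

logger = logging.getLogger(__name__)

provider = "deepseek"  # e.g. the user's stored preference
if provider == "deepseek":
    # Surface the override instead of switching providers silently.
    logger.warning("DeepSeek small model not yet supported; falling back to OpenAI.")
    provider = "openai"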

app/modules/key_management/secrets_schema.py (2)

9-9: Seamless extension for new provider.

Expanding the provider literal to "deepseek" is consistent with the rest of this PR and preserves type safety with minimal changes.


39-41: Good validation for DeepSeek’s API key format.

Verifying that the key starts with sk-or- ensures correct usage and prevents accidental misconfiguration.
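A self-contained sketch of this validation pattern, assuming Pydantic v2 (the class and field names are illustrative, not the PR's actual schema):

from pydantic import BaseModel, ValidationInfo, field_validator

class SecretRequest(BaseModel):
    provider: str
    api_key: str

    @field_validator("api_key")
    @classmethod
    def check_key_format(cls, v: str, info: ValidationInfo) -> str:
        # DeepSeek keys routed through OpenRouter carry the sk-or- prefix.
        if info.data.get("provider") == "deepseek" and not v.startswith("sk-or-"):
            raise ValueError("DeepSeek API keys must start with 'sk-or-'")
        return v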

app/modules/key_management/secret_manager.py (2)

48-59: LGTM! Clean addition of DeepSeek provider support.

The implementation follows the existing pattern and properly handles the new provider.

🧰 Tools
🪛 Ruff (0.8.2)

49-49: Use capitalized environment variable ISDEVELOPMENTMODE instead of isDevelopmentMode

(SIM112)


192-221: Review the bulk deletion implementation for potential race conditions.

The bulk deletion implementation looks solid but consider these aspects:

  1. Transaction handling for multiple deletions
  2. Potential race conditions during preference updates
  3. Partial failure scenarios

Let's verify the transaction handling in the codebase:

✅ Verification successful

Bulk deletion implementation is safe from race conditions

The implementation is safe because:

  • Secret deletions are independent operations with individual error tracking
  • User preference updates are atomic single-row operations
  • Failed deletions don't affect other operations
  • The code properly handles partial failures and maintains consistency
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for transaction handling patterns in secret management
rg -A 5 "BEGIN|COMMIT|ROLLBACK" app/modules/key_management/

Length of output: 61


Script:

#!/bin/bash
# Search for database session and transaction patterns
rg -B 2 -A 5 "Session|session|db\." app/modules/key_management/

# Look for concurrent access mechanisms
rg -B 2 -A 5 "lock|Lock|mutex|Mutex|concurrent|Concurrent" app/modules/key_management/

# Check for any transaction context managers
ast-grep --pattern 'with $_: $$$'

Length of output: 66958


Script:

#!/bin/bash
# Search for concurrent user preference access patterns
rg -B 2 -A 5 "UserPreferences" app/modules/key_management/
rg -B 2 -A 5 "preferences" app/modules/key_management/

# Check for any other methods modifying user preferences
ast-grep --pattern 'def $_($$$) {
  $$$
  user_pref.preferences$$$
  $$$
}'

Length of output: 12225

requirements.txt (1)

20-27: Verify compatibility between langchain components.

The updates to langchain and its components are significant. Ensure all versions are compatible:

  • langchain==0.3.16
  • langchain-anthropic==0.3.4
  • langchain-deepseek-official==0.1.0
  • langchain-community==0.3.16
  • langchain-core==0.3.32
  • langchain-openai==0.3.2
✅ Verification successful

✓ LangChain component versions are compatible

All specified versions are available in PyPI and are either the latest or one version behind the current releases. The consistent versioning pattern (0.3.x) across core packages indicates good compatibility.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for compatibility issues in dependencies
pip install pip-audit
pip-audit langchain==0.3.16 langchain-anthropic==0.3.4 langchain-deepseek-official==0.1.0 langchain-community==0.3.16 langchain-core==0.3.32 langchain-openai==0.3.2

Length of output: 5151


Script:

#!/bin/bash
# Check package metadata and dependencies
for pkg in "langchain==0.3.16" "langchain-core==0.3.32" "langchain-community==0.3.16" "langchain-anthropic==0.3.4" "langchain-openai==0.3.2" "langchain-deepseek-official==0.1.0"; do
    echo "Checking $pkg..."
    pip index versions "$pkg"
    pip show "$pkg"
    echo "---"
done

Length of output: 3683


Script:

#!/bin/bash
# Check latest available versions
for pkg in "langchain" "langchain-core" "langchain-community" "langchain-anthropic" "langchain-openai" "langchain-deepseek-official"; do
    echo "Latest version for $pkg:"
    pip index versions "$pkg" 2>/dev/null || echo "Package not found"
done

Length of output: 7571


Comment on line 226 of app/modules/intelligence/provider/provider_service.py:

@lru_cache(maxsize=3)
🛠️ Refactor suggestion

Exercise caution with caching on methods.

functools.lru_cache on an instance method holds a reference to self, which can keep instances alive and lead to memory leaks. Ensure get_large_llm usage is stable and that the cache is invalidated if DB-based preferences change; otherwise, consider removing or refining the cache.

🧰 Tools
🪛 Ruff (0.8.2)

226-226: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)
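One way to avoid B019 is to move the cache to a module-level helper that is a pure function of hashable arguments, so no ProviderService instance is pinned by the cache (a sketch; the helper name and logic are illustrative):

from functools import lru_cache

@lru_cache(maxsize=3)
def _resolve_large_model(provider: str) -> str:
    # Pure function of its arguments: safe to cache at module level.
    return f"{provider}-large"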

Comment on lines 181 to 185
graph_constructor = GraphConstructor(user_id, extracted_dir)
n, r = graph_constructor.build_graph()
graph_manager.create_nodes(n)
graph_manager.create_edges(r)

graph_manager.close()

⚠️ Potential issue

Fix potential resource leak in graph manager closure.

The graph_manager.close() is called twice: once after creating edges and again in the finally block. This could lead to attempting to close an already closed connection.

-                graph_constructor = GraphConstructor(user_id, extracted_dir)
-                n, r = graph_constructor.build_graph()
-                graph_manager.create_nodes(n)
-                graph_manager.create_edges(r)
-                graph_manager.close()
+                graph_constructor = GraphConstructor(user_id, extracted_dir)
+                n, r = graph_constructor.build_graph()
+                graph_manager.create_nodes(n)
+                graph_manager.create_edges(r)
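With this change, the close() call in the finally block remains the single point of cleanup, so the connection is closed exactly once regardless of whether graph construction succeeds.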
@dhirenmathur merged commit 27c6b5b into main Jan 30, 2025
3 of 4 checks passed