Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Auto Flush #413

Merged
merged 2 commits into from
Jun 12, 2024
Merged

Implement Auto Flush #413

merged 2 commits into from
Jun 12, 2024

Conversation

davidvonthenen
Copy link
Contributor

@davidvonthenen davidvonthenen commented Jun 12, 2024

Proposed changes

This finished off the implementation of the Finalize functionality. Issue: #351

By implementing the Python version of the auto flush found in the Go SDK here:
deepgram/deepgram-go-sdk#237

Types of changes

What types of changes does your code introduce to the community Python SDK?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update or tests (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING doc
  • I have lint'ed all of my code using repo standards
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Further comments

What is interesting to note is the time difference in processing between the Threaded and Async IO Tasks. Async IO Tasks can be on SEVERAL orders of magnitude SLOWER than the Threaded client. It was originally the reason why I wanted to deprecate the Async Client because, for IO-intensive apps, it flat out just doesn't make any sense.

Summary by CodeRabbit

  • New Features

    • Introduced auto-flush functionality to improve message handling efficiency.
    • Added keep-alive mechanism to enhance connection stability.
    • Implemented real-time speech transcription using microphone input, with detailed setup instructions for Linux and MacOS.
  • Bug Fixes

    • Improved logging for better debugging and information clarity.
  • Documentation

    • Added README files with setup and configuration instructions for real-time transcription examples.
  • Tests

    • Added new test cases to cover edge scenarios for auto-flush and microphone mute functionalities.

Copy link
Contributor

coderabbitai bot commented Jun 12, 2024

Warning

Rate limit exceeded

@dvonthenen has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 23 minutes and 41 seconds before requesting another review.

How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

Commits

Files that changed from the base of the PR and between 33721b4 and bee1f98.

Walkthrough

The updates enhance the Deepgram live client by adding auto-flush functionality and improving event handling and keep-alive mechanisms. Key changes include new methods for message inspection and flushing, updated thread management, and new configuration options. Additionally, new test scripts demonstrate the usage of these features with real-time transcription examples.

Changes

File(s) Change Summary
deepgram/clients/live/v1/async_client.py Added auto-flush functionality, new constants, and methods for handling messages and keep-alive signals. Updated thread management and logging.
deepgram/clients/live/v1/client.py Enhanced event handling, introduced auto-flush and keep-alive features, added new constants, methods, and updated thread management.
deepgram/options.py Added new attributes and methods for configuration options, including logging, auto-flush, and keep-alive settings.
tests/edge_cases/auto_flush/async_microphone_mute/README.md Provided setup instructions and prerequisites for using the Deepgram API with real-time transcription.
tests/edge_cases/auto_flush/async_microphone_mute/main.py Implemented a script for real-time transcription using a microphone, with event handling for various speech-related events.
tests/edge_cases/auto_flush/microphone_mute/README.md Provided setup instructions and prerequisites for using the Deepgram API with real-time transcription.
tests/edge_cases/auto_flush/microphone_mute/main.py Implemented a script for real-time transcription using a microphone, with event handling for various speech-related events.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant AsyncLiveClient
    participant Server
    User->>AsyncLiveClient: start()
    AsyncLiveClient->>Server: Connect WebSocket
    AsyncLiveClient->>AsyncLiveClient: Start _flush_thread
    AsyncLiveClient->>AsyncLiveClient: Start _keep_alive_thread
    
    loop Every HALF_SECOND
        AsyncLiveClient->>AsyncLiveClient: _flush()
    end
    
    loop Every KeepAlive Interval
        AsyncLiveClient->>Server: Send KeepAlive message
    end
    
    User->>AsyncLiveClient: send(data)
    AsyncLiveClient->>Server: Send data
    
    User->>AsyncLiveClient: finish()
    AsyncLiveClient->>AsyncLiveClient: Stop _flush_thread
    AsyncLiveClient->>AsyncLiveClient: Stop _keep_alive_thread
    AsyncLiveClient->>Server: Close WebSocket
Loading

Poem

In the code where bytes do flow,
A rabbit's touch makes features grow.
With auto-flush and keep-alive,
Transcriptions now feel so alive.
Events handled with care so fine,
Deepgram's service, oh so divine.
🐇✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

Outside diff range and nitpick comments (5)
deepgram/options.py (2)

Line range hint 49-73: Refactor to simplify the constructor logic.

The constructor of DeepgramClientOptions is quite complex. Consider refactoring to separate concerns, such as extracting methods for setting default values and validating inputs.


Line range hint 148-242: Refactor to improve readability and maintainability.

The constructor of ClientOptionsFromEnv is overly complex and hard to maintain. Consider breaking down into smaller methods for handling environment variables, logging setup, and error handling.

deepgram/clients/live/v1/client.py (3)

Line range hint 33-63: Ensure thread safety and proper initialization of threading attributes.

The threading attributes _listen_thread, _keep_alive_thread, and _flush_thread are declared but not always properly initialized before use. Consider initializing these in the constructor to avoid NoneType errors.


Line range hint 144-182: Refactor to simplify the start method.

The start method is complex and handles multiple responsibilities. Consider breaking it down into smaller methods for handling options, headers, and starting threads.


Line range hint 231-655: Improve error handling and logging.

The methods _listening, _keep_alive, and _flush have complex error handling that could be simplified. Consider using a more structured approach to handle different types of exceptions and ensure that all resources are properly cleaned up.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 3177017 and 13a0833.

Files selected for processing (7)
  • deepgram/clients/live/v1/async_client.py (15 hunks)
  • deepgram/clients/live/v1/client.py (15 hunks)
  • deepgram/options.py (9 hunks)
  • tests/edge_cases/auto_flush/async_microphone_mute/README.md (1 hunks)
  • tests/edge_cases/auto_flush/async_microphone_mute/main.py (1 hunks)
  • tests/edge_cases/auto_flush/microphone_mute/README.md (1 hunks)
  • tests/edge_cases/auto_flush/microphone_mute/main.py (1 hunks)
Files not reviewed due to errors (1)
  • tests/edge_cases/auto_flush/async_microphone_mute/main.py (no review received)
Additional context used
LanguageTool
tests/edge_cases/auto_flush/async_microphone_mute/README.md

[style] ~3-~3: Consider a shorter alternative to avoid wordiness. (IN_ORDER_TO_PREMIUM)
Context: ...is example uses the Microphone as input in order to detect conversation insights in what is...


[grammar] ~7-~7: The operating system from Apple is written “macOS”. (MAC_OS)
Context: ...his example will only work on Linux and MacOS. Windows platforms are not supported. ...


[style] ~21-~21: You have already used this phrasing in nearby sentences. Shortening it will avoid wordiness. (REP_MAKE_USE_OF)
Context: ...ned within the repository. That package makes use of the [PortAudio library](http://www.port...


[uncategorized] ~21-~21: If this is a compound adjective that modifies the following noun, use a hyphen. (EN_COMPOUND_ADJECTIVE_INTERNAL)
Context: ...rtaudio.com/) which is a cross-platform open source audio library. If you are on Linux, you...

tests/edge_cases/auto_flush/microphone_mute/README.md

[style] ~3-~3: Consider a shorter alternative to avoid wordiness. (IN_ORDER_TO_PREMIUM)
Context: ...is example uses the Microphone as input in order to detect conversation insights in what is...


[grammar] ~7-~7: The operating system from Apple is written “macOS”. (MAC_OS)
Context: ...his example will only work on Linux and MacOS. Windows platforms are not supported. ...


[style] ~21-~21: You have already used this phrasing in nearby sentences. Shortening it will avoid wordiness. (REP_MAKE_USE_OF)
Context: ...ned within the repository. That package makes use of the [PortAudio library](http://www.port...


[uncategorized] ~21-~21: If this is a compound adjective that modifies the following noun, use a hyphen. (EN_COMPOUND_ADJECTIVE_INTERNAL)
Context: ...rtaudio.com/) which is a cross-platform open source audio library. If you are on Linux, you...

Ruff
tests/edge_cases/auto_flush/microphone_mute/main.py

6-6: logging imported but unused (F401)

Remove unused import: logging


8-8: time.sleep imported but unused (F401)

Remove unused import: time.sleep


39-39: f-string without any placeholders (F541)

Remove extraneous f prefix


68-68: f-string without any placeholders (F541)

Remove extraneous f prefix


71-71: f-string without any placeholders (F541)

Remove extraneous f prefix


79-79: f-string without any placeholders (F541)

Remove extraneous f prefix

tests/edge_cases/auto_flush/async_microphone_mute/main.py

8-8: logging imported but unused (F401)

Remove unused import: logging


10-10: time.sleep imported but unused (F401)

Remove unused import: time.sleep


34-34: Function definition does not bind loop variable signal (B023)


53-53: f-string without any placeholders (F541)

Remove extraneous f prefix


82-82: f-string without any placeholders (F541)

Remove extraneous f prefix


85-85: f-string without any placeholders (F541)

Remove extraneous f prefix


93-93: f-string without any placeholders (F541)

Remove extraneous f prefix

Additional comments not posted (11)
deepgram/clients/live/v1/async_client.py (11)

32-32: Introduced a new constant HALF_SECOND for timing configurations.

This constant is used for timing configurations in auto-flush and keep-alive functionalities, ensuring more precise control over timing behaviors.


55-59: Added new instance variables to manage datagrams and threading for auto-flush and keep-alive.

These variables (_last_datagram, _flush_thread, _keep_alive_thread) are crucial for the new functionality, enabling the management of state and threading for auto-flushing and keep-alive operations.


77-79: Initialization of threading variables set to None in the constructor.

Proper initialization of these variables is essential for later conditional checks and operations within the class methods.


85-85: Introduced _flush_event to manage the auto-flush mechanism.

This event object will be used to control the flushing operation, allowing for more fine-grained control over when data should be sent.


164-175: Conditional logic to start keep-alive and auto-flush threads based on configuration.

This implementation ensures that threads for keep-alive and auto-flushing are only started if enabled in the configuration, which is a good practice for resource management.


211-211: Enhanced logging for event subscription and callback handling.

Adding detailed logging at these steps improves the traceability and debuggability of the event handling process.

Also applies to: 220-220


271-276: Implemented message inspection logic within the message listening loop.

This addition is part of the auto-flush feature, where messages are inspected and potentially flushed based on certain conditions. The error handling within this block ensures robustness.


459-459: Usage of the keep_alive method within the keep-alive thread.

Regularly calling the keep_alive method in this thread supports the keep-alive functionality by sending necessary signals to maintain the connection.


547-667: Implementation of the _flush method to handle auto-flushing based on time intervals.

This method is well-implemented with checks for exit conditions and socket availability, and it uses the configuration to determine the flushing interval, aligning with the auto-flush feature's requirements.


728-751: Added a keep_alive method to send KeepAlive messages.

This method is crucial for the keep-alive functionality, ensuring that the connection remains active by sending periodic signals.


864-879: Added an _inspect method to handle message inspection for auto-flushing.

This method checks the finality of messages to decide on flushing, which is a key part of the auto-flush logic. The implementation is concise and aligns with the intended functionality.

deepgram/options.py Outdated Show resolved Hide resolved
deepgram/options.py Outdated Show resolved Hide resolved
deepgram/options.py Outdated Show resolved Hide resolved
deepgram/clients/live/v1/client.py Show resolved Hide resolved
deepgram/clients/live/v1/client.py Outdated Show resolved Hide resolved
tests/edge_cases/auto_flush/microphone_mute/README.md Outdated Show resolved Hide resolved
tests/edge_cases/auto_flush/microphone_mute/README.md Outdated Show resolved Hide resolved
tests/edge_cases/auto_flush/microphone_mute/README.md Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 13a0833 and 33721b4.

Files selected for processing (5)
  • deepgram/clients/live/v1/async_client.py (15 hunks)
  • deepgram/clients/live/v1/client.py (15 hunks)
  • deepgram/options.py (10 hunks)
  • tests/edge_cases/auto_flush/async_microphone_mute/main.py (1 hunks)
  • tests/edge_cases/auto_flush/microphone_mute/main.py (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • deepgram/clients/live/v1/async_client.py
Additional context used
Ruff
tests/edge_cases/auto_flush/microphone_mute/main.py

6-6: logging imported but unused (F401)

Remove unused import: logging


8-8: time.sleep imported but unused (F401)

Remove unused import: time.sleep


39-39: f-string without any placeholders (F541)

Remove extraneous f prefix


68-68: f-string without any placeholders (F541)

Remove extraneous f prefix


71-71: f-string without any placeholders (F541)

Remove extraneous f prefix


79-79: f-string without any placeholders (F541)

Remove extraneous f prefix

tests/edge_cases/auto_flush/async_microphone_mute/main.py

8-8: logging imported but unused (F401)

Remove unused import: logging


10-10: time.sleep imported but unused (F401)

Remove unused import: time.sleep


34-34: Function definition does not bind loop variable signal (B023)


53-53: f-string without any placeholders (F541)

Remove extraneous f prefix


82-82: f-string without any placeholders (F541)

Remove extraneous f prefix


85-85: f-string without any placeholders (F541)

Remove extraneous f prefix


93-93: f-string without any placeholders (F541)

Remove extraneous f prefix

Additional comments not posted (14)
deepgram/options.py (1)

104-113: Validate the auto_flush_reply_delta setting to ensure it is a positive number.

+ if not isinstance(auto_flush_delta, numbers.Number) or auto_flush_delta <= 0:
+     raise ValueError("auto_flush_reply_delta must be a positive number.")

Likely invalid or redundant comment.

deepgram/clients/live/v1/client.py (13)

33-33: Added constant HALF_SECOND for timing control.

This constant is used for timing control in various threading operations, ensuring more precise management of intervals.


56-63: Introduced threading locks and threads for new functionalities.

These additions are crucial for managing concurrent access to shared resources and for handling asynchronous operations effectively.


82-92: Initialization of threading-related attributes.

Proper initialization of threads and locks is essential for the robust handling of asynchronous operations.


Line range hint 144-182: Enhanced handling of WebSocket connection options and headers.

The merging of options and headers is handled cleanly, and the conditional logic for starting threads based on configuration is well-implemented.


168-182: Conditional thread start based on configuration flags.

This conditional logic ensures that resources are used efficiently by only starting threads when necessary.


219-227: Optimized event subscription logic.


277-282: Added message inspection and error handling in the transcription event processing.

This addition enhances the robustness of the system by allowing for detailed inspection and handling of transcription messages.


Line range hint 444-457: Keep-alive functionality implemented with periodic checks.

The implementation of keep-alive functionality using a counter to manage intervals is a good practice for maintaining active connections.


538-654: Implemented auto-flush functionality based on message timing.

The auto-flush mechanism based on configurable time intervals is a critical feature for managing real-time data flows efficiently.


718-746: Refined keep_alive method to handle socket state and send keep-alive messages.

The method now properly checks the socket state before attempting to send a message, which prevents errors in disconnected states.


758-769: Finalize method implementation to ensure proper closure of connections.

The method handles the socket state and sends a finalize message to properly close the connection, enhancing the reliability of the service.


788-800: Thread management in the finish method to ensure clean shutdown.

Proper management of threads during the shutdown process is crucial for preventing resource leaks and ensuring that the application exits cleanly.


851-867: Implemented message inspection logic to reset or update the last datagram timestamp based on message finality.

This logic is essential for the auto-flush mechanism, ensuring that messages are processed timely based on their content and state.

deepgram/clients/live/v1/client.py Show resolved Hide resolved
tests/edge_cases/auto_flush/microphone_mute/main.py Outdated Show resolved Hide resolved
tests/edge_cases/auto_flush/microphone_mute/main.py Outdated Show resolved Hide resolved
tests/edge_cases/auto_flush/microphone_mute/main.py Outdated Show resolved Hide resolved
tests/edge_cases/auto_flush/microphone_mute/main.py Outdated Show resolved Hide resolved
tests/edge_cases/auto_flush/async_microphone_mute/main.py Outdated Show resolved Hide resolved
tests/edge_cases/auto_flush/async_microphone_mute/main.py Outdated Show resolved Hide resolved
deepgram/options.py Outdated Show resolved Hide resolved
deepgram/options.py Show resolved Hide resolved
@davidvonthenen davidvonthenen merged commit a7d8cae into deepgram:main Jun 12, 2024
3 checks passed
@davidvonthenen davidvonthenen deleted the implement-flush branch June 12, 2024 17:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants