
Improve audio-to-text pipeline by enabling flash-attention. [$750] #16

Closed
JJassonn69 opened this issue Dec 6, 2024 · 1 comment
JJassonn69 commented Dec 6, 2024

Bounty Overview

We have identified an opportunity to improve the current audio-to-text pipeline in the Livepeer AI Network by enabling flash-attention, which will speed up the pipeline significantly and allow near-realtime operation. We are seeking the support of the community and bounty hunters to implement this optimisation quickly so it is available to developers working with Livepeer.

Required Skillset

  • Proven experience working with deep learning frameworks such as PyTorch, particularly in implementing attention mechanisms and optimising model performance.
  • Strong experience with Python.

Bounty Requirements

To successfully resolve this bounty, you must:

  1. Enable memory-efficient flash attention in the existing pipeline.
  2. Ensure that devices that do not yet support the optimisation fall back safely to the working Scaled Dot-Product Attention (SDPA) implementation.
  3. Create a separate Docker container image, similar to PR #185, to avoid dependency issues with other pipelines.
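The fallback behaviour in requirement 2 could be sketched as a small backend-selection helper. This is a minimal sketch, not part of the existing pipeline: the function name and the compute-capability threshold are illustrative assumptions.

```python
# Hypothetical helper for requirement 2: prefer Flash Attention 2 where
# the hardware and installed packages allow it, otherwise fall back to
# PyTorch's Scaled Dot-Product Attention (SDPA).
import torch

def pick_attention_implementation() -> str:
    """Return the attention backend string to pass to the model loader."""
    if torch.cuda.is_available():
        major, _ = torch.cuda.get_device_capability()
        if major >= 8:  # Flash Attention 2 needs Ampere (SM 8.0) or newer
            try:
                import flash_attn  # noqa: F401 -- flash-attn must be installed
                return "flash_attention_2"
            except ImportError:
                pass
    return "sdpa"  # safe default: supported on all devices
```

If the pipeline loads its model through Hugging Face transformers, the returned string can be passed as `from_pretrained(..., attn_implementation=pick_attention_implementation())`; both `"flash_attention_2"` and `"sdpa"` are accepted values for that argument.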

Scope Exclusions

  • None. All areas related to the issue are within scope.

Implementation Tips

  1. Consult PyTorch's flash-attention documentation to better understand how to enable it in the audio-to-text pipeline.
  2. Validate the performance improvements of the Flash Attention-enabled pipeline and ensure proper fallback functionality on unsupported devices.
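Validating the fallback path (tip 2) can start from a minimal smoke test of PyTorch's built-in SDPA kernel itself. The shapes, iteration count, and function name below are illustrative assumptions, not values taken from the pipeline:

```python
# Hypothetical smoke test for the SDPA fallback path: run PyTorch's
# scaled_dot_product_attention on random inputs, check the output shape,
# and time repeated calls as a crude performance baseline.
import time
import torch
import torch.nn.functional as F

def validate_sdpa(device: str = "cpu", iters: int = 5) -> float:
    # Typical (batch, heads, sequence, head_dim) attention inputs.
    q = torch.randn(1, 8, 128, 64, device=device)
    k = torch.randn(1, 8, 128, 64, device=device)
    v = torch.randn(1, 8, 128, 64, device=device)
    out = F.scaled_dot_product_attention(q, k, v)
    assert out.shape == q.shape  # sanity check on the fallback path
    start = time.perf_counter()
    for _ in range(iters):
        F.scaled_dot_product_attention(q, k, v)
    return time.perf_counter() - start
```

Running the same timing with the Flash Attention backend enabled on a supported GPU gives a like-for-like baseline for the speed-up claim.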

How to Apply

This bounty has been assigned to Prakarsh, who has expressed interest in addressing this issue. If you have been officially assigned this issue:

  • Communicate with the team as needed for context or clarification.
  • Provide regular updates on your progress in the issue thread.

Warning

Please ensure the issue is assigned to you before starting work. To avoid duplication of effort, submissions for unassigned issues will not be accepted.


Posted to Notion. Closing this.
