Experimental speech streaming for LMNT (useChat/useCompletion React) #922

Closed
wants to merge 41 commits into from

Collaborator

@lgrammel lgrammel commented Jan 17, 2024

Summary

Adds speech streaming to useChat and useCompletion with streamData.

  • useCompletion & useChat (for React) provide an experimental_speechUrl that can be used with HTML audio elements
  • Integration functions for LMNT speech streams through experimental_forwardLmntSpeechStream
  • streamData.experimental_appendSpeech: add speech stream chunks to data stream (used automatically through forward functions)
  • Example: examples/next-lmnt: LMNT completion & chat speech streaming
  • Docs: LMNT provider docs, API docs for experimental_forwardLmntSpeechStream
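
To illustrate the idea behind appending speech chunks to the data stream, here is a minimal sketch of how text and audio parts could share one stream and be separated again on the client. The `StreamPart` type, the line-delimited JSON wire format, and the `encodePart`/`decodeStream` helpers are all hypothetical and chosen for illustration only; they are not the format or API used by this PR.

```typescript
// Hypothetical wire format: one JSON object per line, tagged by kind.
// This is NOT the format used by the PR; it only illustrates the idea.
type StreamPart =
  | { kind: 'text'; value: string }
  | { kind: 'speech'; value: number[] }; // raw audio bytes as a plain array

// Server side: encode one part as a single newline-terminated line.
function encodePart(part: StreamPart): string {
  return JSON.stringify(part) + '\n';
}

// Client side: split the combined stream back into text and audio bytes.
function decodeStream(raw: string): { text: string; audio: Uint8Array } {
  let text = '';
  const audioChunks: number[][] = [];

  for (const line of raw.split('\n')) {
    if (line.length === 0) continue; // skip the empty piece after the last newline
    const part = JSON.parse(line) as StreamPart;
    if (part.kind === 'text') {
      text += part.value;
    } else {
      audioChunks.push(part.value);
    }
  }

  // Concatenate the audio chunks in arrival order.
  const total = audioChunks.reduce((n, chunk) => n + chunk.length, 0);
  const audio = new Uint8Array(total);
  let offset = 0;
  for (const chunk of audioChunks) {
    audio.set(chunk, offset);
    offset += chunk.length;
  }
  return { text, audio };
}
```

A plain number array keeps the sketch dependency-free; a real format would more likely use base64 or binary frames. In a browser, the resulting bytes could be wrapped in a Blob and passed to an audio element via URL.createObjectURL, though the thread does not say whether experimental_speechUrl is produced that way.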

Notes

  • The LMNT SDK does not work in the edge environment (as of v1.1.2)

@lgrammel lgrammel self-assigned this Jan 17, 2024
@untilhamza
Contributor

This is exciting

@lgrammel lgrammel changed the title from "[WIP] Speech streaming prototype." to "[RFC] Speech streaming prototype" on Jan 23, 2024
@lgrammel lgrammel requested a review from MaxLeiter January 23, 2024 18:54
@llermaly

This is awesome!

@llermaly

@lgrammel Hi Lars, I tried to test this locally with no luck. It shows this error:

 ⚠ ./app/api/chat-speech-elevenlabs/route.ts
Attempted import error: 'forwardModelFusionSpeechStream' is not exported from 'ai' (imported as 'forwardModelFusionSpeechStream').

I looked in node_modules/ai and the function is there, so I'm not sure whether I need to do anything else. (I cloned the fork, checked out the branch, and ran the example.)

Is it ready to test?

Thanks!

@lgrammel
Collaborator Author

Have you rebuilt the ai package? The easiest way is to just rebuild the whole repository (pnpm i, pnpm build) and then try out the example.

@llermaly

llermaly commented Jan 27, 2024

That did the trick, thank you! I had been running npm run dev; after running pnpm build and npm start, it worked.

It works really, really fast. I hope we can get this merged very soon.

@llermaly

Hi @MaxLeiter! Did you have a chance to take a look?

@lgrammel lgrammel changed the title from "[RFC] Speech streaming prototype" to "Speech streaming for LMNT & ElevenLabs" on Feb 19, 2024
@lgrammel lgrammel changed the title from "Speech streaming for LMNT & ElevenLabs" to "Experimental speech streaming for LMNT & ElevenLabs (useChat/useCompletion React)" on Feb 19, 2024
@lgrammel lgrammel changed the title from "Experimental speech streaming for LMNT & ElevenLabs (useChat/useCompletion React)" to "Experimental speech streaming for LMNT (useChat/useCompletion React)" on Feb 19, 2024
@lgrammel lgrammel marked this pull request as ready for review February 19, 2024 13:18
@tgonzales

Hello @lgrammel, I saw that you switched from ElevenLabs to LMNT. Is there a technical reason for this? ElevenLabs supports multiple languages, while LMNT still has no plans to launch this. Wouldn't it be worthwhile to keep both options?

Thank you, and congratulations on the excellent work.

@lgrammel
Collaborator Author

Thanks. We want to use the official ElevenLabs Node SDK, but it does not support duplex streaming yet: elevenlabs/elevenlabs-js#4

In the meantime, you could use the ModelFusion ElevenLabs integration with the adapter that I had in an earlier version of this PR.

@Iven2132

Iven2132 commented Mar 5, 2024

@lgrammel Hi! I can't find the example app for speech streaming in the Vercel AI SDK repo. Where has it gone?

@lgrammel
Collaborator Author

lgrammel commented Mar 5, 2024

This feature has not been merged yet.

@Iven2132

Iven2132 commented Mar 5, 2024

Hi @MaxLeiter, can you merge this?

@Iven2132

Iven2132 commented Mar 9, 2024

Hi @MaxLeiter, can you please approve this?

@llermaly

llermaly commented Mar 9, 2024

bump

@pixelcatgg

We could really use this as well 🙏 Thank you so much for the work on this.

const speech = new Speech(process.env.LMNT_API_KEY || 'no key');

// Note: The LMNT SDK does not work on edge yet (as of v1.1.2)
// export const runtime = 'edge';
Contributor

Hi @lgrammel, FYI: @kaikato just merged a short README in lmnt-node describing how we got this working. If you see an even better way, let us know, but with the one change to the next.config.js file it should work on edge: lmnt-com/lmnt-node#32

Contributor

I should add that when we were hacking on vercel/ai-chatbot#151, I saw issues with keeping websockets alive on edge, so for deployment I switched to nodejs and didn't look further at the time. You can see the deployment-focused work I did atop that PR here: https://github.com/shaper/lmnt-ai-chatbot/commits/main/

@lgrammel @MaxLeiter When will TTS come?

@lgrammel lgrammel closed this Apr 9, 2024
@allenchuang

Any reason why this is closed? TTS is a great feature to have

@solanacryptodev

Would be cool to see TTS added with the addition of gpt-4o

@alokwhitewolf

@lgrammel / @MaxLeiter
Any follow-up plans for adding TTS to the Vercel AI SDK?

@alicercedigital

we are very excited about this!
