Allow users to stop message streaming #1022
Conversation
Does this remove the ability to send multiple messages before receiving a reply? I personally don't think sending multiple messages is useful, but I just want to understand if that is a change. Also, creating a new chat or deleting the message while it is streaming should similarly stop the stream. Currently it stops it from showing in the UI but does not interrupt the provider. This may not need to be within the same PR, but it is something that could be added eventually.
Great question! Technically it does not - it is still possible to send multiple messages using Enter or Shift + Enter:
So it looks like the button will stop all pending messages?
Currently yes. What do you think it should do in that case? I think user expectations are loosely defined (if defined at all), because other applications often do not support concurrent messages.
I don't have any strong opinions on this; stopping all pending messages seems fine. Any idea on how LLM memory is affected? I am not sure which would be preferable in this case; I would lean toward the former. EDIT: I am actually not sure what the default behaviour would be. Will need to look into it.
No, I think it should not contain the complete message. I think it is stateless with
Done in 1d8b3f9
I think so. It feels like the main button should act like "stop all", but maybe there should be a menu entry under the three dots next to each message allowing users to stop streaming of specific messages. I think that should be implemented in a future PR.
@krassowski Thank you for opening this PR and for documenting your original issue so clearly! It is really helpful to see examples from other AI assistant applications showing the UX you're proposing here.
Jupyter AI is intended to support collaboration, and has integrations with jupyter-collaboration to allow multiple users to share the same chat UI and use the same configured LLM. My concerns with this PR are that:
- Whether the stop button is shown is determined by whether any streaming messages are incomplete, and
- Clicking the "Stop" button halts streaming on all replies.
I consider these concerns blockers, because they lead to two consequences:
- When one user sends a message, it blocks all other users from sending one through the UI until the response is finished streaming. (yes, they can use the keyboard, but the UI suggests they can't send a message)
- Users can halt streaming of responses addressed to other users.
I suggest that we re-define how stopping messages works. Specifically:
- Whether the "Stop" button is shown should be based on whether the response to the user's last question is still streaming.
- Pressing the "Stop" button should only stop streaming of the last message asked by that user.
- If a user rapidly sends 2 messages in a row, pressing the "Stop" button only needs to stop the last message from streaming.
- This is an engineering shortcut I recommend taking so we don't need to store and manage an array of all the agent responses that are still streaming; we just need a reference to the most recent one.
- I think this is fine given that there's not much of a use-case for a single user to be sending multiple messages. A user with multiple questions would expect to just ask them all at once in a single message.
- `StopRequest` should be modified to include an `id` field, which should contain an agent message ID. On receipt, the backend should stop only that message from streaming further (see the sketch just below this list).
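
A minimal sketch of what that extended request could look like on the frontend, assuming TypeScript; only the proposed `id` field comes from the suggestion above, while the type name, the `type` literal, and the overall shape are illustrative rather than taken from the codebase:

```typescript
// Hypothetical frontend counterpart of the proposed StopRequest. Only the
// new `id` field comes from the proposal above; the rest of the shape is
// an assumption for illustration.
type StopRequest = {
  type: 'stop';
  // ID of the agent message whose stream should be stopped.
  id: string;
};
```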
Here are some extra details about the codebase that I thought would help with this proposed implementation:
- You can modify `props.onSend()` in `send-button.tsx` to return the message ID of the human message that was just sent. This is defined in `chat-input.tsx`, and calls `ChatHandler.sendMessage()`, which returns a Promise that resolves to the message ID of the `HumanChatMessage` the user just created by sending the message.
- `ChatHandler.replyFor()` accepts a human message ID and returns a Promise that resolves to the `AgentChatMessage` object that is replying to that human message. This method only works when not streaming, and is unused in the codebase today. You can modify the implementation to return a promise that resolves to an `AgentStreamMessage` object as soon as the stream starts. It should be as easy as changing this:

```typescript
// resolve promise from `replyFor()` if it exists
if (
  newMessage.type === 'agent' &&
  newMessage.reply_to in this._replyForResolverDict
) {
```
to:
```typescript
// resolve promise from `replyFor()` if it exists
if (
  (newMessage.type === 'agent' || newMessage.type === 'agent-stream') &&
  newMessage.reply_to in this._replyForResolverDict
) {
```
Then, update the relevant types from `AgentChatMessage` to `AgentChatMessage | AgentStreamMessage`.
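
Putting those pieces together, here is a rough sketch of how the frontend could track and stop only the reply to the user's last message. This is not the actual implementation: the interfaces below only model the Promise semantics described above, and `sendStopRequest()` is a hypothetical helper corresponding to the extended `StopRequest`:

```typescript
// Minimal stand-ins for the real jupyter-ai types; the shapes here are
// assumptions that only capture the Promise semantics described above.
interface AgentReply {
  type: 'agent' | 'agent-stream';
  id: string;
}

interface ChatHandlerLike {
  // Resolves to the ID of the HumanChatMessage the user just created.
  sendMessage(request: { prompt: string }): Promise<string>;
  // With the change above, resolves as soon as the agent reply (or its
  // stream) starts.
  replyFor(humanMessageId: string): Promise<AgentReply>;
}

// Hypothetical helper that sends the extended StopRequest sketched earlier.
declare function sendStopRequest(agentMessageId: string): void;

// Per the engineering shortcut above, keep only the most recent reply
// rather than an array of all streaming responses.
let lastReply: AgentReply | null = null;

async function onSend(chat: ChatHandlerLike, prompt: string): Promise<void> {
  const humanMessageId = await chat.sendMessage({ prompt });
  lastReply = await chat.replyFor(humanMessageId);
}

function onStopClicked(): void {
  // Pressing "Stop" only targets the one streaming reply we tracked.
  if (lastReply?.type === 'agent-stream') {
    sendStopRequest(lastReply.id);
  }
}
```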
Let me know your thoughts on the changes I'm proposing above. Again, thanks for contributing this!
One extra comment: The "Stop" button should revert to a "Send" button if the user types while the previous response is still streaming. This preserves the current UX for stopping a stream, while also having the UI correctly reflect that it is still possible to send a message while a response is still streaming.
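
That behaviour boils down to a small piece of derived state. A sketch, assuming the component tracks the current input text and whether the user's last reply is still streaming; both variable names are illustrative, not actual variables from `send-button.tsx`:

```typescript
type ButtonAction = 'send' | 'stop';

// Any typed input switches the button back to "Send", even mid-stream.
function buttonAction(
  input: string,
  lastReplyStreaming: boolean
): ButtonAction {
  return lastReplyStreaming && input.trim() === '' ? 'stop' : 'send';
}
```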
Thank you for the feedback @dlqqq, this should be ready for another round of review :)
@krassowski The new revision of this PR looks fantastic! I left a couple of minor callouts and also would like to add additional changes to this PR to minimize the performance impact. Since the additional changes I'm proposing can be lengthy, I think it would be best to open this as a separate PR, possibly targeting your fork's branch. See below.
@krassowski I pushed a couple of commits that:
Here's a demo video: Screen.Recording.2024-10-15.at.1.08.32.PM.mov

This PR looks good to me as-is (no need to split this PR if you're too busy to do so). However, I'll leave this PR open for your feedback or additional code changes.

(P.S. In the future, if I have more complex suggestions, I will be more inclined to push to your branch and ask for your feedback. It would be more respectful of your time for me to write the code myself and be accountable for my suggestions working/not working.)
Agreed, this was not necessary. I was thinking about adding some kind of UI indicator for interrupted messages (but most other chat interfaces do not have one), and also about allowing telemetry to capture that info, but since you added the tombstone this is not a dire need.
Looks good to me. I am not a fan of "here" in the

I opened PRs for the sub-tasks:
Awesome, thank you @krassowski! Really looking forward to using this feature myself. 🎉
After this, I need to lift some of the more complex streaming logic out of `DefaultChatHandler` into `BaseChatHandler` so authors of custom slash commands don't need to constantly duplicate our implementation every minor release. With this PR as-is, the stop button would show but not work for custom slash commands that implement streaming.
I'm targeting this feature to become available in a minor release next Monday (10/21). 💪
Thanks for the help with review @dlqqq! I agree it would be great if streaming/stopping streaming worked for slash commands too.
* Allow to stop streaming of agent messages
* Lint
* Interrupt message streaming when they get deleted
* Cleanup code and ensure that old events are deleted
* Fix unit tests
* Add an `await`
* Store information about message being interrupted on the frontend
* Only allow users to stop streaming response to >their< last message
* Default to "send" action when there is any input (and lint)
* rework stop button to stop streaming on all replies to current user
* add message tombstone to interrupted messages
* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

---------

Co-authored-by: David L. Qiu <david@qiu.dev>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
- `ChatUser` does not comply with the jupyter identity model #1025 (`username` is now sourced from the identity provider, but display name and name are kept as-is)

This PR not only stops the stream from being sent to the user, but also propagates the information up to the provider/model, making it possible for custom deployments to actually kill the generating process and thus save cost.