Explorer: add cancel button to stop processing #1218

Open
ahuang11 opened this issue Dec 14, 2023 · 7 comments
Labels
explorer type: enhancement New feature or request

Comments

@ahuang11
Collaborator

Sometimes I accidentally click on a column name with over 1000 unique groups for by, and then it takes forever unless I Cmd+C or restart the kernel.

@ahuang11 ahuang11 changed the title Cancel button Add cancel button to stop processing Dec 14, 2023
@ahuang11 ahuang11 added type: enhancement New feature or request and removed TRIAGE labels Dec 14, 2023
@maximlt maximlt changed the title Add cancel button to stop processing Explorer: add cancel button to stop processing Dec 15, 2023
@maximlt
Member

maximlt commented Dec 15, 2023

I think we've already discussed this in the past. IIRC it's pretty difficult to stop things once they are running. I think we discussed removing columns that have too many unique values from the by and groupby options.

@ahuang11
Collaborator Author

I was imagining this
holoviz/panel#5962

@MarcSkovMadsen
Collaborator

MarcSkovMadsen commented Dec 18, 2023

+1. I am trying to build a custom data catalogue inspired by Intake. I would like my users to be able to explore a source using the Explorer, but if I select a datetime column in the by or groupby widget, everything suddenly blocks for a long time and sometimes even crashes the client or server.

One or more of the following would be very helpful:

  • Ability to require confirmation before proceeding with a slow, blocking operation.
  • Ability to cancel a slow, blocking operation.
  • AVOID BLOCKING!
  • Ability to set a limit on CPU, memory and time usage. If the limit is hit, the operation is stopped and the user is informed.
  • Not enabling users to select non-categorical options in widgets that should really only be for categorical data.

I think it would be possible to support a limit so that columns with .nunique() > N are not offered in the by or groupby widgets.
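
As a rough illustration of that last idea, here is a minimal sketch assuming the data is a pandas DataFrame. The helper name groupable_columns and the max_unique parameter are hypothetical and not part of hvPlot's API.

```python
import pandas as pd


def groupable_columns(df: pd.DataFrame, max_unique: int = 20) -> list[str]:
    """Return the columns with at most max_unique distinct values (hypothetical helper)."""
    counts = df.nunique()  # distinct-value count per column, NaN excluded
    return [col for col in df.columns if counts[col] <= max_unique]


# Example data: a low-cardinality column and a datetime column with 1000 unique values
df = pd.DataFrame(
    {
        "species": ["a", "b", "a", "c"] * 250,
        "timestamp": pd.date_range("2023-01-01", periods=1000, freq="D"),
    }
)
print(groupable_columns(df))
```

With the default threshold only the low-cardinality column would be offered; the datetime column with 1000 distinct values is filtered out before it can reach the by or groupby widget.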

@maximlt
Member

maximlt commented Dec 18, 2023

The explorer is pretty different from the chat components of Panel. It doesn't fetch data from the internet, which is indeed the kind of operation that can get very slow (e.g. a rate-limited API) and that you may want to stop. Ideally the explorer should not have slow and blocking operations, as you can't explore data when things get too slow :) !

Sure, we could try to add a cancel button to stop processing, but before jumping on that kind of engineering solution I'd like to look into potential UX solutions. Like:

  • How could we better explain what by/groupby do?
  • Should the explorer remove some dimensions from by/groupby if they have more than X unique values (making it a configurable explorer option)?
  • How could we better expose what type of data the dimensions contain? To make it less likely for users to pick nonsensical combinations.

@philippjfr
Member

philippjfr commented Dec 18, 2023

Agree with @maximlt: the solution is not to let a user back out of a nonsensical selection that will at best cause lengthy processing in Python and at worst crash the browser, but rather to prevent users from making such selections in the first place. Indeed, in many cases there is no backing out: the Python portion of the processing often finishes very quickly, but once Bokeh is asked to render the output it effectively freezes the browser, at which point it's too late. I too suggest we do some of the following:

  • Limiting the number of categories you can group over (i.e. we allow selecting a category with tons of options but never render over some threshold)
  • Entirely excluding float columns from the by/groupby fields
  • Displaying a warning before applying a potentially expensive/unusable option
  • Better informing users what each column contains (e.g. by showing min/max values or unique categories for each column in the selector)
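
To make a couple of these suggestions concrete, here is a minimal sketch assuming a pandas DataFrame. The function describe_group_candidates, its max_unique parameter and the "groupable" flag are hypothetical names used for illustration only, not something the explorer provides.

```python
import pandas as pd
from pandas.api.types import is_float_dtype


def describe_group_candidates(df: pd.DataFrame, max_unique: int = 20) -> pd.DataFrame:
    """Summarise each column and flag whether it looks safe for by/groupby."""
    rows = []
    for name in df.columns:
        col = df[name]
        n_unique = int(col.nunique())
        rows.append(
            {
                "column": name,
                "dtype": str(col.dtype),
                "n_unique": n_unique,
                "min": col.min(),
                "max": col.max(),
                # Assumed rule: no float columns, and a cap on distinct values
                "groupable": (not is_float_dtype(col)) and n_unique <= max_unique,
            }
        )
    return pd.DataFrame(rows).set_index("column")


df = pd.DataFrame(
    {
        "city": ["Oslo", "Paris", "Oslo", "Rome"],
        "temp": [1.5, 2.5, 3.5, 4.5],
        "year": [2020, 2021, 2020, 2021],
    }
)
summary = describe_group_candidates(df)
print(summary)
```

The summary table could back both the selector hints (dtype, min/max, unique count) and the filtering rule. The threshold and the float exclusion are the two knobs one would likely expose as options.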

@MarcSkovMadsen
Collaborator

MarcSkovMadsen commented Dec 18, 2023

I agree that we should try to limit the risk of a user selecting something that blocks.

But I also believe that it would mean a lot if the plot could be generated in its own thread by default. As it is now, the explorer will not really work in a shared application because it will block the main thread over and over again for 0.1-5 secs.

Maybe Panel should provide its own implementation of asyncify to run something async or in its own thread in one line of code.
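
For reference, the standard library already offers something close to this in asyncio.to_thread (Python 3.9+). The sketch below is not a Panel API. slow_plot is a hypothetical stand-in for a blocking plot-building call, and it sleeps rather than computes, so it releases the GIL in a way CPU-bound Python code would not.

```python
import asyncio
import time


def slow_plot(n_groups: int) -> str:
    """Hypothetical stand-in for a blocking plot-building call."""
    time.sleep(0.2)  # sleeping releases the GIL, unlike CPU-bound Python code
    return f"plot with {n_groups} groups"


async def heartbeat(ticks: list) -> None:
    """Record a few ticks to show the event loop keeps running."""
    for _ in range(3):
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.05)


async def main():
    ticks = []
    # asyncio.to_thread runs the blocking function in a worker thread
    result, _ = await asyncio.gather(
        asyncio.to_thread(slow_plot, 5),
        heartbeat(ticks),
    )
    return result, len(ticks)


result, n_ticks = asyncio.run(main())
print(result, n_ticks)
```

The heartbeat coroutine only keeps ticking because the worker thread is sleeping. A worker running pure-Python computation would still compete with the event loop for the GIL.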

@philippjfr
Member

Very doubtful that threading hvPlot would meaningfully unlock the GIL.


4 participants