Add multiprocessing #20
Comments
I can help you do that if you agree.
Hey @jdvala, this is a good idea. I would suggest using Python's multiprocessing, e.g. with a pool. What's your opinion on this?
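To make the pool suggestion concrete, here is a minimal sketch using `multiprocessing.Pool` from the standard library. The names `clean_one` and `clean_texts`, and the `processes` and `chunksize` parameters, are assumptions made for illustration; `clean_one` is only a stand-in for the real per-text cleaning step, not clean-text's implementation.

```python
from multiprocessing import Pool


def clean_one(text):
    # Hypothetical stand-in for the library's per-text cleaning step:
    # lowercases the text and collapses runs of whitespace.
    return " ".join(text.lower().split())


def clean_texts(texts, processes=None, chunksize=64):
    """Clean a list of texts, optionally across several worker processes."""
    texts = list(texts)
    if processes == 1 or len(texts) < 2:
        # Skip the pool when it cannot help; this avoids the cost of
        # starting processes and pickling the inputs.
        return [clean_one(t) for t in texts]
    with Pool(processes=processes) as pool:
        # Pool.map returns results in input order; chunksize sends the
        # texts to the workers in batches instead of one at a time.
        return pool.map(clean_one, texts, chunksize=chunksize)
```

The worker function has to live at module level so it can be pickled, and callers on platforms that start workers by spawning would need to invoke `clean_texts` under an `if __name__ == "__main__":` guard.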
Hi @jfilter, I have a few questions that I would like to discuss before starting to implement this.
I would recommend going for the second option, as people have gotten used to the current signature of the function, and if we change this, so in my opinion we have ... Secondly, if a single text is large enough, then breaking it up and parallelizing it also makes sense. At this point I am unsure which of these we should implement first.
Hey @jdvala, in my opinion, the clean function should also accept a list of texts and then return a list of processed texts. Then, we need a new parameter, e.g.
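One way the list-accepting signature could look, as a rough sketch: a single string still returns a single string, so existing callers are unaffected, while a list returns a list. The parameter name `n_jobs` and the helper `_clean_single` are hypothetical and are not taken from this thread or from clean-text's code.

```python
from multiprocessing import Pool


def _clean_single(text):
    # Hypothetical placeholder for the existing single-text cleaning logic;
    # here it only collapses runs of whitespace.
    return " ".join(text.split())


def clean(text, n_jobs=1):
    # `n_jobs` is an assumed parameter name chosen for this sketch.
    if isinstance(text, str):
        # Existing behaviour: one text in, one cleaned text out.
        return _clean_single(text)
    texts = list(text)
    if n_jobs == 1:
        # Default: process the list sequentially in the current process.
        return [_clean_single(t) for t in texts]
    with Pool(processes=n_jobs) as pool:
        # Opt-in parallel path; results keep the order of the inputs.
        return pool.map(_clean_single, texts)
```

Keeping the sequential path as the default means multiprocessing stays opt-in, which matters for small inputs where starting worker processes costs more than the cleaning itself.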
Given that cleaning text can sometimes be a very time-consuming task when the number of texts is huge, it would be really good if clean-text could provide built-in multiprocessing.
It could be really simple: provide a flag and add an option to pass a list of texts instead of a single text.
What do you think?