The current state of this package does not offer batch transcription; sentences are passed through one at a time.
This would be fine on its own, but some dependencies, such as IceNLP, appear to be initialized on every call.
This severely slows down transcription, making batch jobs (in the range of thousands of utterances) very time-consuming.
It would be nice to be able to pass in a list of strings and get back a list of lists of Tokens with matching indices, for example.
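The requested interface could look something like the sketch below. All names here are illustrative, not the package's actual API: `transcribe_batch` and the `Token` alias are hypothetical, and the per-sentence `transcribe` callable stands in for `textprocessing_manager.transcribe()`. The point is only the shape of the API; the real speedup would come from initializing heavy dependencies such as IceNLP once, outside the loop.

```python
from typing import Callable, List

# Hypothetical Token type; the real package returns its own Token objects.
Token = str

def transcribe_batch(
    transcribe: Callable[[str], List[Token]],
    sentences: List[str],
) -> List[List[Token]]:
    """Return one token list per input sentence, at the matching index.

    This wrapper only amortizes Python-level call overhead; a real fix
    would also hoist dependency initialization (e.g. IceNLP) out of the
    per-sentence path.
    """
    return [transcribe(sentence) for sentence in sentences]

# Example with a trivial stand-in transcriber (whitespace tokenization):
results = transcribe_batch(str.split, ["halló heimur", "góðan daginn"])
```

With this shape, `results[i]` always corresponds to `sentences[i]`, which makes it easy to join transcriptions back to their source utterances.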
We have it on our issue list to daemonize IceNLP; the current implementation does indeed slow the process down. A temporary workaround is to set phrasing=False in textprocessing_manager.transcribe() if you can do without phrasing, since IceNLP is only used in that step.
The LSTM model is also too slow in general; we need to do some research on that part as well.
Unfortunately, it's the phrasing part I was most interested in applying, since my data is already normalized and I have other options for G2P.
Perhaps I'll just use a more naïve phrasing approach while the issues with this dependency get worked out.
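One such naïve fallback (a sketch, not part of this package) is to treat clause-level punctuation as a phrase-break cue, splitting each sentence on commas, semicolons, and colons:

```python
import re
from typing import List

def naive_phrasing(sentence: str) -> List[str]:
    """Split a sentence into phrases at commas, semicolons, and colons.

    A crude stand-in for IceNLP-based phrasing: the break punctuation is
    stripped, and empty fragments are dropped.
    """
    parts = re.split(r"[,;:]\s*", sentence)
    return [p.strip() for p in parts if p.strip()]

phrases = naive_phrasing("Ég kom heim, en enginn var þar; allt var hljótt.")
```

This obviously misses prosodic breaks that are not marked by punctuation, but it needs no external dependencies and runs in microseconds per sentence.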