Resolve annotators and pipelines consuming lots of driver RAM #69

Closed
saif-ellafi opened this issue Dec 18, 2017 · 2 comments

@saif-ellafi
Contributor

Description

Annotators, specifically the Vivekn Sentiment Analysis annotator, are consuming a lot of driver RAM because model information is held in standard Scala collections. This becomes a storage problem both inside pipelines and when reading the models back. We need to let this information flow from disk instead.
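
The idea of letting model information flow from disk, rather than sitting in an in-memory collection, can be illustrated with a minimal, language-neutral sketch. It is written in Python with the standard-library `sqlite3` module. It is not Spark NLP code and does not reflect how the fix was actually implemented. The table layout, the function names `build_disk_store` and `lookup`, and the toy word counts are all assumptions made for illustration.

```python
import os
import sqlite3
import tempfile


def build_disk_store(path, word_counts):
    """Write (word, positive, negative) rows to a SQLite file on disk."""
    conn = sqlite3.connect(path)
    with conn:  # commits the transaction on success
        conn.execute(
            "CREATE TABLE IF NOT EXISTS counts ("
            "word TEXT PRIMARY KEY, positive INTEGER, negative INTEGER)"
        )
        conn.executemany(
            "INSERT OR REPLACE INTO counts VALUES (?, ?, ?)", word_counts
        )
    conn.close()


def lookup(conn, word):
    """Fetch one word's counts from disk; unseen words count as (0, 0)."""
    row = conn.execute(
        "SELECT positive, negative FROM counts WHERE word = ?", (word,)
    ).fetchone()
    return row if row is not None else (0, 0)


# Hypothetical toy data standing in for a trained model's word statistics.
rows = [("good", 120, 4), ("bad", 3, 97), ("plot", 40, 38)]
store_path = os.path.join(tempfile.mkdtemp(), "sentiment_counts.db")
build_disk_store(store_path, rows)

# Only the rows that are actually queried are pulled into memory.
conn = sqlite3.connect(store_path)
good_counts = lookup(conn, "good")
unknown_counts = lookup(conn, "zzz")
conn.close()
```

In this sketch the process holds only a connection handle and the rows it queries, so memory use stays roughly flat as the stored model grows, at the cost of a disk read per lookup.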

aleksei-ai self-assigned this Dec 25, 2017
@aleksei-ai
Contributor

@saifjsl Please try my fix with Kryo serialization. Let me know if it works for you: https://github.com/JohnSnowLabs/spark-nlp/tree/vivekn_sentiment_model_kryo_serialization

@saif-ellafi
Contributor Author

#78
