
Pipelines: if initial model download is interrupted, everything is ruined #2780

Closed
KChalk opened this issue Feb 7, 2020 · 1 comment


KChalk commented Feb 7, 2020

🐛 Bug

Information

Model I am using (Bert, XLNet ...): pipeline('ner') and pipeline('feature-extraction')

Language I am using the model on (English, Chinese ...): English

The problem arises when using:

  • my own modified scripts: (give details below)

The tasks I am working on are:

  • (mostly NA) my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

  1. fresh install transformers from source
  2. run:
from transformers import pipeline
model = pipeline('feature-extraction')
  3. interrupt download. rerun step 2 (see the consolidated sketch below)
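
For reference, a consolidated reproduction sketch (the interruption itself is manual, so it appears as a comment; nothing beyond pipeline() is assumed from the library):

from transformers import pipeline

# First run: press Ctrl+C while the "Downloading: ..." progress bar is active.
# Second run: the truncated file left in the cache raises the OSError below.
model = pipeline('feature-extraction')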

Error on reload:

Downloading: 100%
230/230 [00:01<00:00, 136B/s]

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
~/miniconda3/envs/hugging/lib/python3.7/site-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    466             try:
--> 467                 state_dict = torch.load(resolved_archive_file, map_location="cpu")
    468             except Exception:

~/miniconda3/envs/hugging/lib/python3.7/site-packages/torch/serialization.py in load(f, map_location, pickle_module)
    357     try:
--> 358         return _load(f, map_location, pickle_module)
    359     finally:

~/miniconda3/envs/hugging/lib/python3.7/site-packages/torch/serialization.py in _load(f, map_location, pickle_module)
    548         assert key in deserialized_objects
--> 549         deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
    550         offset = None

RuntimeError: unexpected EOF. The file might be corrupted.

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
<ipython-input-26-2fd4b689c1db> in <module>
----> 1 featify=pipeline('feature-extraction')

~/miniconda3/envs/hugging/lib/python3.7/site-packages/transformers/pipelines.py in pipeline(task, model, config, tokenizer, modelcard, **kwargs)
   1084                 "Trying to load the model with Tensorflow."
   1085             )
-> 1086         model = model_class.from_pretrained(model, config=config, **model_kwargs)
   1087 
   1088     return task(model=model, tokenizer=tokenizer, modelcard=modelcard, framework=framework, **kwargs)

~/miniconda3/envs/hugging/lib/python3.7/site-packages/transformers/modeling_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    375         for config_class, model_class in MODEL_MAPPING.items():
    376             if isinstance(config, config_class):
--> 377                 return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
    378         raise ValueError(
    379             "Unrecognized configuration class {} for this kind of AutoModel: {}.\n"

~/miniconda3/envs/hugging/lib/python3.7/site-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    468             except Exception:
    469                 raise OSError(
--> 470                     "Unable to load weights from pytorch checkpoint file. "
    471                     "If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True. "
    472                 )

OSError: Unable to load weights from pytorch checkpoint file. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True. 
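
The underlying problem is a truncated checkpoint left in the local cache, so every later load hits the same unexpected EOF. A minimal manual-recovery sketch, assuming the transformers v2.x default cache location (~/.cache/torch/transformers); filenames there are etag hashes, so the simplest fix is to clear the whole directory:

import shutil
from pathlib import Path

# Assumption: default v2.x cache dir; adjust if TRANSFORMERS_CACHE is set.
cache_dir = Path.home() / '.cache' / 'torch' / 'transformers'
shutil.rmtree(cache_dir, ignore_errors=True)  # next from_pretrained() re-downloads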

Expected behavior

Model should (download and) load. An interrupted download should not leave the cache in a state that breaks all subsequent loads.
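
In the meantime, a workaround sketch using force_download=True (a from_pretrained argument in v2.4.1 that ignores any cached file); the checkpoint name below is an assumption for the default 'feature-extraction' model:

from transformers import AutoModel, AutoTokenizer, pipeline

name = 'distilbert-base-cased'  # assumption: default feature-extraction checkpoint
model = AutoModel.from_pretrained(name, force_download=True)  # skip corrupt cache entry
tokenizer = AutoTokenizer.from_pretrained(name)
featify = pipeline('feature-extraction', model=model, tokenizer=tokenizer)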

Environment info

  • transformers version: 2.4.1
  • Platform: WSL
  • Python version: 3.7.6.final.0
  • PyTorch version (GPU?): 0.4.1 (no)
  • Tensorflow version (GPU?): none
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no
LysandreJik added the Core: Pipeline label on Feb 10, 2020

stale bot commented Apr 10, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label on Apr 10, 2020
stale bot closed this as completed on Apr 17, 2020