XLM question-answering pipeline is flaky #28000

Closed

fxmarty opened this issue Dec 13, 2023 · 1 comment
Comments

fxmarty (Contributor) commented Dec 13, 2023

System Info

transformers main; also tested on commits from the last three weeks, same issue

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline

# Run the same example repeatedly; the failure below only occurs on some iterations.
for i in range(50):
    model = AutoModelForQuestionAnswering.from_pretrained("hf-internal-testing/tiny-random-XLMModel")
    tokenizer = AutoTokenizer.from_pretrained("hf-internal-testing/tiny-random-XLMModel")
    pipe = pipeline("question-answering", model=model, tokenizer=tokenizer)
    question = "Whats my name?"
    context = "My Name is Philipp and I live in Nuremberg."
    outputs = pipe(question, context)

This sometimes fails with:

Traceback (most recent call last):
  File "<tmp 4>", line 23, in <module>
    outputs = pipe(question, context)
  File "/home/fxmarty/hf_internship/transformers/src/transformers/pipelines/question_answering.py", line 393, in __call__
    return super().__call__(examples[0], **kwargs)
  File "/home/fxmarty/hf_internship/transformers/src/transformers/pipelines/base.py", line 1132, in __call__
    return next(
  File "/home/fxmarty/hf_internship/transformers/src/transformers/pipelines/pt_utils.py", line 125, in __next__
    processed = self.infer(item, **self.params)
  File "/home/fxmarty/hf_internship/transformers/src/transformers/pipelines/question_answering.py", line 563, in postprocess
    "start": np.where(char_to_word == token_to_orig_map[s])[0][0].item(),
KeyError: 5

Expected behavior

No error. I can have a look if I have time.
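For reference, a minimal sketch of one way this KeyError can arise (illustration only, not the actual pipeline code; the dictionary contents and array sizes below are made up): token_to_orig_map only has entries for context tokens, so if the selected start/end token index lands on a question or special token, the lookup in postprocess raises a KeyError.

import numpy as np

# Hypothetical illustration: only context tokens are mapped to original words.
token_to_orig_map = {2: 0, 3: 1, 4: 2}   # token index -> original word index (context only)
start_logits = np.random.rand(8)         # tiny random model -> scores are essentially noise
s = int(np.argmax(start_logits))         # can land on a question/special token, e.g. 5

if s not in token_to_orig_map:
    print(f"token index {s} has no entry in token_to_orig_map -> KeyError in postprocess")
else:
    print(f"token index {s} maps to original word {token_to_orig_map[s]}")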

amyeroberts (Collaborator) commented

Hi @fxmarty - thanks for raising this!

To help with debugging - has this been observed with other checkpoints or only the tiny random ones for testing?
