
[WIP] Add XLNet support for Reader #205

Open · wants to merge 43 commits into master
Conversation

fmikaelian
Collaborator

@fmikaelian fmikaelian commented Jul 16, 2019

  • Implement sklearn wrapper on top of new QA script provided by HF
  • Train XLNet on SQuAD 2.0 with wrapper
  • Add ability to load pre-trained reader with .bin file instead of pickling class object, ensuring compatibility with HF and avoiding confusion
  • Report training time and hardware used
  • Set verbose parameter
  • Report evaluation metrics
  • Integrate in QAPipeline()
  • Replace log_prob (softmax probs) with the raw logits to select the best answer among paragraphs:

    final_predictions_sorted = collections.OrderedDict(
        sorted(final_predictions.items(),
               key=lambda item: item[1]['start_log_prob'] + item[1]['end_log_prob'],
               reverse=True))
  • Evaluate complete cdQA pipeline
  • Update cdQA-annotator and cdQA-ui to support no answer
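
The logit-based ranking described in the checklist can be sketched as follows. This is a minimal illustration with made-up paragraph keys and score values; the `start_log_prob`/`end_log_prob` field names mirror the snippet above, and `text` is assumed for the answer string.

```python
import collections

# Hypothetical per-paragraph predictions; field names mirror the snippet above.
final_predictions = {
    "p1": {"text": "Paris", "start_log_prob": 4.2, "end_log_prob": 3.9},
    "p2": {"text": "Lyon", "start_log_prob": 1.1, "end_log_prob": 0.8},
}

# Rank answers by the sum of the raw start/end scores. Raw logits remain
# comparable across paragraphs, whereas softmax probabilities are normalized
# within each paragraph and therefore are not.
final_predictions_sorted = collections.OrderedDict(
    sorted(final_predictions.items(),
           key=lambda item: item[1]["start_log_prob"] + item[1]["end_log_prob"],
           reverse=True))

best_answer = next(iter(final_predictions_sorted.values()))["text"]
```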

@fmikaelian fmikaelian changed the title Add XLNet support for Reader #196 Add XLNet support for Reader Jul 16, 2019
@codecov

codecov bot commented Jul 16, 2019

Codecov Report

Merging #205 (660760c) into master (bda1c32) will decrease coverage by 8.00%.
The diff coverage is 0.00%.


@@            Coverage Diff             @@
##           master     #205      +/-   ##
==========================================
- Coverage   31.23%   23.22%   -8.01%     
==========================================
  Files           7        9       +2     
  Lines        1508     2032     +524     
==========================================
+ Hits          471      472       +1     
- Misses       1037     1560     +523     
Impacted Files Coverage Δ
cdqa/reader/reader_sklearn.py 0.00% <0.00%> (ø)
cdqa/reader/utils_squad.py 0.00% <0.00%> (ø)
cdqa/reader/utils_squad_evaluate.py 0.00% <0.00%> (ø)
cdqa/reader/bertqa_sklearn.py 58.90% <0.00%> (+0.15%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@fmikaelian
Collaborator Author

ValueError during evaluation after training:

Traceback (most recent call last):
  File "tutorial-train-xlnet-squad.py", line 39, in <module>
    out_eval, final_prediction = reader.evaluate(X='dev-v2.0.json')
ValueError: too many values to unpack (expected 2)
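
This ValueError typically means `evaluate` returns more values than the two names being unpacked. A defensive pattern is to capture the full return tuple and index into it; the three-value return below is purely hypothetical, for illustration only.

```python
# Hypothetical stand-in for reader.evaluate(): suppose it returns an extra
# value (e.g. an n-best dict) beyond the two the caller expects.
def evaluate():
    return {"exact": 35.6}, {"q1": "Paris"}, {"q1": ["Paris", "Lyon"]}

# Unpacking into exactly two names would raise:
#   ValueError: too many values to unpack (expected 2)
# Capturing the tuple and indexing tolerates extra return values.
results = evaluate()
out_eval, final_prediction = results[0], results[1]
```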

@fmikaelian
Collaborator Author

To use the XLNet reader with a pretrained .bin model:

import wget
from cdqa.reader.reader_sklearn import Reader

wget.download(url='https://github.com/cdqa-suite/cdQA/releases/download/XLNet_cased_vCPU/pytorch_model.bin', out='.')

# instantiate the Reader with training params
reader = Reader(model_type='xlnet',
                model_name_or_path='xlnet-base-cased',
                output_dir='.',
                evaluate_during_training=False,
                no_cuda=False,
                fp16=False,
                pretrained_model_path='.')

# make some predictions
reader.predict(X='dev-v2.0-small.json')

@fmikaelian
Collaborator Author

  • hardware: GeForce RTX 2080
  • training time: 9 hours

@fmikaelian fmikaelian mentioned this pull request Jul 20, 2019
1 task
@andrelmfarias andrelmfarias changed the title Add XLNet support for Reader [WIP] Add XLNet support for Reader Jul 24, 2019
@andrelmfarias
Collaborator

The implementation of XLNetForQuestionAnswering is quite different from BertForQuestionAnswering, and the official HF version does not output the logits for now.
XLNetForQuestionAnswering uses beam search to find the best (most probable) span, while BertForQuestionAnswering maximises the start_score and end_score separately.
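
The difference between the two span-selection strategies can be sketched with toy scores (the values below are invented for illustration; HF's actual beam search also re-scores end positions conditioned on the start, which is simplified away here):

```python
# Toy start/end scores over 4 token positions (assumed values).
start_scores = [0.1, 2.0, 0.3, 1.5]
end_scores = [2.5, 0.1, 0.3, 0.4]
n = len(start_scores)

# BERT-style: maximise start and end independently. This can yield an
# inconsistent span where the end precedes the start.
bert_span = (max(range(n), key=lambda i: start_scores[i]),
             max(range(n), key=lambda j: end_scores[j]))

# XLNet-style (simplified): search jointly over valid spans (start <= end),
# keeping the best combined score -- an exhaustive stand-in for beam search.
xlnet_span = max(((i, j) for i in range(n) for j in range(i, n)),
                 key=lambda ij: start_scores[ij[0]] + end_scores[ij[1]])
```

With these scores, independent argmax picks start=1 and end=0 (an invalid span), while the joint search is constrained to a consistent one.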

from #196

@alex-movila

Any progress on this?
In the meantime we have even better models: RoBERTa and ERNIE 2.0.

@fmikaelian
Collaborator Author

Hi @alex-movila

You can follow our progress on this PR here. We have described all the steps needed to stay in sync with the latest changes made by @huggingface.

At the moment we depend on the pytorch-transformers repository as the backend for our QA system. The @huggingface community is progressively implementing new models. They are now in the process of adding RoBERTa (see this). They have no plans to add ERNIE at the moment (see this).

Their new API should allow the user to use any transformer to do QA. We are looking to provide the same thing with cdQA.

@fmikaelian
Collaborator Author

fmikaelian commented Sep 15, 2019

I could not replicate the official SQuAD 2.0 results with our trained XLNet model:

from cdqa.reader.reader_sklearn import Reader

reader = Reader(model_type='xlnet',
                model_name_or_path='xlnet-base-cased',
                fp16=False,
                output_dir='.',
                no_cuda=False,
                pretrained_model_path='.')

reader.evaluate(X='dev-v2.0.json')

See my colab notebook for reproducibility: https://colab.research.google.com/github/cdqa-suite/cdQA/blob/sync-huggingface/examples/tutorial-eval-xlnet-squad2.0.ipynb

{
  "exact": 35.643897919649625,
  "f1": 40.81892328134685,
  "total": 11873,
  "HasAns_exact": 67.29082321187585,
  "HasAns_f1": 77.65571459504568,
  "HasAns_total": 5928,
  "NoAns_exact": 4.087468460891506,
  "NoAns_f1": 4.087468460891506,
  "NoAns_total": 5945,
  "best_exact": 50.07159100480081,
  "best_exact_thresh": 0.0,
  "best_f1": 50.07159100480081,
  "best_f1_thresh": 0.0
}

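
As a sanity check on the metrics themselves (not a fix for the gap to the official results), the aggregate "exact" score is the count-weighted average of the HasAns and NoAns components reported above:

```python
# Values taken from the evaluation output above.
has_exact, has_total = 67.29082321187585, 5928
no_exact, no_total = 4.087468460891506, 5945

# Count-weighted average over all 11873 dev examples.
exact = (has_exact * has_total + no_exact * no_total) / (has_total + no_total)
# exact comes out to ~35.6439, matching the reported 35.643897919649625
```

So the metrics are internally consistent; the low overall score is driven almost entirely by the NoAns subset.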

It might be an issue of unoptimized hyperparameters (see huggingface/transformers#822).

@andrelmfarias can you confirm the params you used during training? (https://github.com/cdqa-suite/cdQA/blob/sync-huggingface/examples/tutorial-train-xlnet-squad.py)

@andrelmfarias
Collaborator

I had to reduce some parameters (max_length, batch_size, etc.) because the GPU could not handle training with the default parameters. It might be that.

@fmikaelian
Collaborator Author

> I could not replicate results of official SQuAD 2.0 with our trained XLNet model: […]

This issue is being discussed here: huggingface/transformers#947 (comment)
