Recreating WMT14 EN-DE transformer results #1862

Closed
sedrickkeh opened this issue Sep 6, 2020 · 3 comments
@sedrickkeh

I'm currently trying to recreate the results from the paper "Attention is All You Need". According to the paper, the transformer model achieved a BLEU score of 27.3 on the WMT14 EN-DE dataset (with test set newstest2014).

I cannot seem to achieve that score. The best score I achieved was around 25.8.

Below are the commands I'm using (mostly copied exactly from the OpenNMT website):

1. Download the data
Here, I'm using newstest2013 for validation and newstest2014 for testing, as described in the Attention is All You Need paper.

mkdir wmt14
wget https://nlp.stanford.edu/projects/nmt/data/wmt14.en-de/newstest2013.de -P ./wmt14
wget https://nlp.stanford.edu/projects/nmt/data/wmt14.en-de/newstest2013.en -P ./wmt14
wget https://nlp.stanford.edu/projects/nmt/data/wmt14.en-de/newstest2014.de -P ./wmt14
wget https://nlp.stanford.edu/projects/nmt/data/wmt14.en-de/newstest2014.en -P ./wmt14
wget https://nlp.stanford.edu/projects/nmt/data/wmt14.en-de/train.en -P ./wmt14
wget https://nlp.stanford.edu/projects/nmt/data/wmt14.en-de/train.de -P ./wmt14

After this, I also renamed the newstest files to valid.en, valid.de, test.en, test.de.
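
Concretely, the renaming is just the following (a minimal sketch, following the mapping described above: newstest2013 for validation, newstest2014 for testing):

mv wmt14/newstest2013.en wmt14/valid.en
mv wmt14/newstest2013.de wmt14/valid.de
mv wmt14/newstest2014.en wmt14/test.en
mv wmt14/newstest2014.de wmt14/test.de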

2. Preprocess the data
This is taken from https://opennmt.net/OpenNMT-py/extended.html

# delete the last line of every non-test file (train and valid)
for l in en de; do for f in wmt14/*.$l; do if [[ "$f" != *"test"* ]]; then sed -i "$ d" $f; fi; done; done
# tokenize all files with the Moses tokenizer (-a: aggressive hyphen splitting, -no-escape: keep special characters)
for l in en de; do for f in wmt14/*.$l; do perl tools/tokenizer.perl -a -no-escape -l $l -q < $f > $f.atok; done; done
# build the lowercased OpenNMT-py dataset from the tokenized train/valid files
onmt_preprocess -train_src wmt14/train.en.atok -train_tgt wmt14/train.de.atok -valid_src wmt14/valid.en.atok -valid_tgt wmt14/valid.de.atok -save_data wmt14/wmt14.atok.low -lower

3. Train
Training code taken from https://opennmt.net/OpenNMT-py/FAQ.html

mkdir results
python train.py -data wmt14/wmt14.atok.low -save_model results/wmt14_model \
        -layers 6 -rnn_size 512 -word_vec_size 512 -transformer_ff 2048 -heads 8 \
        -encoder_type transformer -decoder_type transformer -position_encoding \
        -train_steps 200000 -max_generator_batches 2 -dropout 0.1 \
        -batch_size 4096 -batch_type tokens -normalization tokens -accum_count 2 \
        -optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 8000 -learning_rate 2 \
        -max_grad_norm 0 -param_init 0 -param_init_glorot \
        -label_smoothing 0.1 -valid_steps 10000 -save_checkpoint_steps 10000 \
        -world_size 4 -gpu_ranks 0 1 2 3

4. Translate
Code taken from https://opennmt.net/OpenNMT-py/extended.html
onmt_translate -gpu 0 -model results/wmt14_model_step_200000.pt -src wmt14/test.en.atok -tgt wmt14/test.de.atok -replace_unk -verbose -output results/wmt14.test.pred.atok

5. Evaluate
Code taken from https://opennmt.net/OpenNMT-py/extended.html
perl tools/multi-bleu.perl wmt14/test.de.atok < results/wmt14.test.pred.atok

According to the FAQ page, the training command above is supposed to be able to recreate the WMT14 results. Am I doing something wrong in my steps? I looked at the related issue #637, but it seems like some of the commands there are outdated.

@francoishernandez
Member

francoishernandez commented Sep 7, 2020

What do you mean by outdated commands in #637? I don't think many params have changed since then.
Also you may want to have a look at this: https://forum.opennmt.net/t/reproducing-pre-trained-transformer-model/3591/6

@sedrickkeh
Author

sedrickkeh commented Sep 7, 2020

@francoishernandez there are some parameters such as "-epochs" and "-report_every" that no longer work in the current version, but you're right, it shouldn't be that big of a concern. However, #637 mostly talks about preprocessing and tokenizing, which I believe I'm following correctly. Do you think there's something wrong with my current preprocessing/tokenizing steps?

As for https://forum.opennmt.net/t/reproducing-pre-trained-transformer-model/3591/6, I tried the other perl script mentioned there (multi-bleu-detok.perl), and it did give an improvement, but only a very slight one (around 0.1 BLEU higher).
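
For reference, the detokenized scoring looked roughly like this (a minimal sketch, assuming the Moses detokenizer.perl and multi-bleu-detok.perl are available under tools/ next to the other scripts, and scoring against the untokenized test.de):

# detokenize the tokenized model output
perl tools/detokenizer.perl -l de -q < results/wmt14.test.pred.atok > results/wmt14.test.pred.detok
# score against the raw (untokenized) reference; the script applies its own internal tokenization
perl tools/multi-bleu-detok.perl wmt14/test.de < results/wmt14.test.pred.detok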

@vince62s
Member

Don't take the data from Stanford.
Adapt the script I made here https://github.com/OpenNMT/OpenNMT-tf/tree/master/scripts/wmt for data preparation.
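
A rough sketch of that route (the prepare_data.sh name and its argument are assumptions; check the scripts in that directory for the exact entry point and usage):

git clone https://github.com/OpenNMT/OpenNMT-tf.git
cd OpenNMT-tf/scripts/wmt
# assumed entry point: downloads the WMT EN-DE corpora and builds the subword model/vocab
./prepare_data.sh ./data_ende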
