This repository contains code for our EMNLP 2018 paper "Guided Neural Language Generation for Abstractive Summarization Using AMR".
We used the Abstract Meaning Representation Annotation Release 2.0, which contains manually annotated document and summary AMRs.
For preprocessing, clone the AMR preprocessing repository.
git clone https://github.com/sheffieldnlp/AMR-Preprocessing
Run the AMR linearization, where the input is the system summary AMR from Liu's summarizer ($F) and the raw AMR dataset ($AMR). Here we use the test dataset. If you want to train the model, also run the preprocessing on the training and validation datasets.
export F_TRAIN=/<path to AMR proxy train>/amr-release-2.0-amrs-training.txt
export F_TEST=/<path to AMR proxy test>/amr-release-2.0-amrs-test.txt
export F_DEV=/<path to AMR proxy dev>/amr-release-2.0-amrs-dev.txt
export OUTPUT=/<output path for the results>/
python var_free_amrs.py -f $F_TRAIN -output_path $OUTPUT --custom_parentheses -no_semantics --delete_amr_var
python var_free_amrs.py -f $F_TEST -output_path $OUTPUT --custom_parentheses -no_semantics --delete_amr_var
python var_free_amrs.py -f $F_DEV -output_path $OUTPUT --custom_parentheses -no_semantics --delete_amr_var
For each set (train, test, and dev), the script produces two files: the sentences (.sent) and their corresponding linearized AMRs (.tf).
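To illustrate what removing AMR variables means, here is a minimal Python sketch. It is not the code of var_free_amrs.py: the function name strip_variables, the regular expressions, and the choice to replace re-entrant variables with their concept are assumptions made for illustration only.

```python
import re

# Matches a variable declaration such as "(w / want-01": group 1 is the
# variable, group 2 is the concept. (Hypothetical pattern for this sketch.)
VAR_DECL = re.compile(r"\(\s*([a-zA-Z][\w-]*)\s*/\s*([^\s()]+)")

# Matches a role followed by a bare token, e.g. ":ARG0 b".
ROLE_REF = re.compile(r"(:[\w-]+\s+)([a-zA-Z][\w-]*)(?=[\s)]|$)")


def strip_variables(amr):
    """Return a single-line, variable-free version of a PENMAN-style AMR."""
    # Collapse the indented multi-line AMR onto one line.
    one_line = " ".join(amr.split())
    # Remember which concept each variable stands for.
    concepts = {var: concept for var, concept in VAR_DECL.findall(one_line)}
    # Drop the "var / " part of every declaration.
    no_decl = VAR_DECL.sub(lambda m: "(" + m.group(2), one_line)
    # Replace re-entrant variable references with their concept; tokens
    # that are not known variables (e.g. "imperative") are left alone.
    return ROLE_REF.sub(
        lambda m: m.group(1) + concepts.get(m.group(2), m.group(2)), no_decl
    )
```

For the AMR (w / want-01 :ARG0 (b / boy) :ARG1 (g / go-01 :ARG0 b)), this sketch yields (want-01 :ARG0 (boy) :ARG1 (go-01 :ARG0 boy)). The actual script's output may differ, for example in how parentheses are written with --custom_parentheses.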
export SRC=/<path to the linearized AMR tf training file>/all_amr-release-2.0-amrs-training.txt.tf
export TGT=/<path to the sentence training file>/all_amr-release-2.0-amrs-training.txt.sent
export SRC_VALID=/<path to the linearized AMR tf validation file>/all_amr-release-2-0-amrs-dev-all.txt.tf
export TGT_VALID=/<path to the sentence validation file>/all_amr-release-2-0-amrs-dev-all.txt.sent
export SAVE=/<path to save directory>/
python preprocess.py -train_src $SRC -train_tgt $TGT -valid_src $SRC_VALID -valid_tgt $TGT_VALID -save_data $SAVE -src_seq_length 1000 -tgt_seq_length 1000 -shuffle 1
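Before running preprocess.py, it can help to confirm that the source (.tf) and target (.sent) files are line-aligned and to see how many pairs exceed the length limits. The sketch below assumes one example per line, whitespace tokenization, and that -src_seq_length and -tgt_seq_length are maximum lengths in tokens. check_parallel is a hypothetical helper, not part of this repository.

```python
def check_parallel(src_lines, tgt_lines, max_len=1000):
    """Check that two parallel corpora have matching line counts.

    Returns the number of pairs and how many of them have a source or
    target side longer than max_len whitespace-separated tokens.
    """
    src = [line.split() for line in src_lines]
    tgt = [line.split() for line in tgt_lines]
    if len(src) != len(tgt):
        raise ValueError(
            f"line count mismatch: {len(src)} source vs {len(tgt)} target"
        )
    too_long = sum(
        1 for s, t in zip(src, tgt) if len(s) > max_len or len(t) > max_len
    )
    return {"pairs": len(src), "too_long": too_long}


# Example usage with the files exported above (paths are placeholders):
# import os
# with open(os.environ["SRC"]) as s, open(os.environ["TGT"]) as t:
#     print(check_parallel(s, t, max_len=1000))
```

A ValueError here usually means one of the two files was produced from a different input or was truncated.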
export F=/<path to test summarizer output>/summ_ramp_10_passes_len_edges_exp_0
export OUTPUT=/<path to test preprocessed output>
export AMR=/<path to AMR>/amr-release-2.0-amrs-test-proxy.txt
python var_free_amrs.py -is_dir -f $F -output_path $OUTPUT --custom_parentheses --no_semantics --delete_amr_var --with_side -side_file $AMR
python $WORK/train.py -data $PREPROCESS/van_noord/no_filter_amr_2/data -save_model $MODEL/$TYPE -rnn_size 500 -layers 2 -epochs 2000 -optim sgd -learning_rate 1 -learning_rate_decay 0.8 -encoder_type brnn -global_attention general -seed 1 -dropout 0.5 -batch_size 32
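To get a feel for the -learning_rate 1 and -learning_rate_decay 0.8 settings, the sketch below computes the learning rate after a given number of decay steps, assuming the decay is a plain multiplicative factor. When a decay step is triggered is decided by train.py and is not modelled here; decayed_learning_rate is a hypothetical helper for illustration.

```python
def decayed_learning_rate(initial_lr, decay, num_decays):
    """Learning rate after num_decays multiplicative decay steps."""
    return initial_lr * decay ** num_decays


# Learning rate after 0, 1, 2 and 3 decay steps with the settings above.
schedule = [decayed_learning_rate(1.0, 0.8, k) for k in range(4)]
```

Under this assumption, the rate drops to roughly half of its initial value after three decay steps.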
python $WORK/translate.py -src $file -output