## Converting Knowledge Graphs to Text
### ATOMIC<sub>2020</sub>-to-Text
Triples in ATOMIC<sub>2020</sub> are stored in the form `(subject, relation, target)`. We convert (verbalize) these triples into natural language text, which we later use for training/fine-tuning Pretrained Language Models (PLMs). To run the conversion (a minimal verbalization sketch follows the steps below):
1. Download ATOMIC<sub>2020</sub> [here](https://allenai.org/data/atomic-2020), put the zip file in the `/data` folder, and unzip it (we only need `dev.tsv` and `train.tsv`).
2. Run [`atomic_to_text.py`](https://github.com/phosseini/causal-reasoning/blob/main/atomic_to_text.py) (if the grammar check is enabled, this may take a while).
3. Outputs will be stored as `.txt` and `.csv` files in the `/data` folder following the name patterns: `atomic2020_dev.*` and `atomic2020_train.*`.
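The snippet below is a minimal sketch of the verbalization idea, not the exact templates implemented in `atomic_to_text.py`: each relation is mapped to a human-readable phrase and the `(subject, relation, target)` triple is rendered as a sentence. The template strings are illustrative assumptions.

```python
# Minimal verbalization sketch (NOT the exact templates used in atomic_to_text.py).
# Each relation is mapped to an illustrative phrase and the triple is rendered as a sentence.
import csv

RELATION_TEMPLATES = {
    "xEffect": "{subject}. As a result, PersonX {target}",   # assumed wording
    "xIntent": "{subject} because PersonX wanted {target}",  # assumed wording
    "isBefore": "{subject} happens before {target}",         # assumed wording
}

def verbalize(subject: str, relation: str, target: str) -> str:
    template = RELATION_TEMPLATES.get(relation)
    if template is None or target.strip() == "none":
        return ""  # skip relations without a template or empty targets
    return template.format(subject=subject.strip(), target=target.strip()) + "."

with open("data/atomic2020_dev.tsv", encoding="utf-8") as tsv_file:
    for row in csv.reader(tsv_file, delimiter="\t"):
        if len(row) < 3:
            continue
        sentence = verbalize(*row[:3])
        if sentence:
            print(sentence)
```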

### GLUCOSE-to-Text
3. Output will be stored in: `data/glucose_train.csv`

## Continual Pretraining
Once we have verbalized the ATOMIC<sub>2020</sub> knowledge graph and the GLUCOSE dataset into text, we continually pretrain a Pretrained Language Model (PLM), BERT here, on the converted texts. We call this step **continual pretraining** because we further train the PLM with Masked Language Modeling (MLM), one of the objectives originally used to pretrain BERT. Running the pretraining involves two steps:
* Setting the parameters in [`config/pretraining_config.json`](https://github.com/phosseini/causal-reasoning/blob/main/config/pretraining_config.json). Most of these parameters are self-explanatory, but we briefly explain some of them for clarity:
  * `relation_category` (for ATOMIC<sub>2020</sub>): a list of triple types (strings) used for continual pretraining. There are three main categories of triples in ATOMIC<sub>2020</sub>: `event`, `social`, and `physical`. These categories cover different types of knowledge, and models pretrained on each category, or on a combination of them, may perform differently when fine-tuned and tested on downstream tasks. We therefore added an option for choosing the triple type(s) used in pretraining.
* Running the pretraining code: [`pretraining.py`](https://github.com/phosseini/causal-reasoning/blob/main/pretraining.py) (a minimal MLM pretraining sketch is shown below).
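Below is a minimal sketch of the MLM continual-pretraining step using the HuggingFace `Trainer`, not a copy of `pretraining.py`: it assumes the verbalized sentences sit in `data/atomic2020_train.txt` (one sentence per line), and all hyperparameter values are placeholders; the real values come from `config/pretraining_config.json`.

```python
# Minimal MLM continual-pretraining sketch (not the repository's pretraining.py).
# Assumes one verbalized sentence per line in data/atomic2020_train.txt;
# hyperparameters below are illustrative placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "bert-large-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

dataset = load_dataset("text", data_files={"train": "data/atomic2020_train.txt"})
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Dynamically mask 15% of tokens, as in BERT's original MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="output/atomic-bert-large",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=5e-5,
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
).train()
```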

### Using our models on HuggingFace🤗
```python
from transformers import AutoModel

atomic_roberta_model = AutoModel.from_pretrained("phosseini/atomic-roberta-large")
```
Full list of models on HuggingFace:
| Model | Training Data |
| :---: | :---: |
| [`phosseini/glucose-bert-large`](https://huggingface.co/phosseini/glucose-bert-large) | [GLUCOSE](https://github.com/ElementalCognition/glucose) |
| [`phosseini/glucose-roberta-large`](https://huggingface.co/phosseini/glucose-roberta-large) | [GLUCOSE](https://github.com/ElementalCognition/glucose) |
| [`phosseini/atomic-bert-large`](https://huggingface.co/phosseini/atomic-bert-large)| [ATOMIC<sub>2020</sub>](https://allenai.org/data/atomic-2020) `event` relations |
| [`phosseini/atomic-bert-large-full`](https://huggingface.co/phosseini/atomic-bert-large-full)| [ATOMIC<sub>2020</sub>](https://allenai.org/data/atomic-2020) `event`, `social`, `physical` relations |
| [`phosseini/atomic-roberta-large`](https://huggingface.co/phosseini/atomic-roberta-large)| [ATOMIC<sub>2020</sub>](https://allenai.org/data/atomic-2020) `event` relations |
| [`phosseini/atomic-roberta-large-full`](https://huggingface.co/phosseini/atomic-roberta-large-full)| [ATOMIC<sub>2020</sub>](https://allenai.org/data/atomic-2020) `event`, `social`, `physical` relations |

## Fine-tuning
After pretraining the PLM on the new data, we fine-tune it and evaluate its performance on downstream tasks. So far, we have tested our models on two benchmarks, COPA and TCR. In the following, we explain the fine-tuning process for each of them.
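As an illustration only, not the repository's actual fine-tuning script, the sketch below shows how one of the pretrained models from the table above could be wrapped in a multiple-choice head for a COPA-style instance; the example premise/alternatives are for demonstration, and the classification head is randomly initialized until it is fine-tuned.

```python
# Illustrative COPA-style setup only -- not the repository's fine-tuning code.
# The multiple-choice head is randomly initialized and must be fine-tuned
# before its predictions are meaningful.
import torch
from transformers import AutoModelForMultipleChoice, AutoTokenizer

model_name = "phosseini/atomic-bert-large"  # any model from the table above
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMultipleChoice.from_pretrained(model_name)

premise = "The man broke his toe."  # COPA-style example
choices = ["He got a hole in his sock.",
           "He dropped a hammer on his foot."]

# Pair the premise with each alternative; the model scores each pair.
encoded = tokenizer([premise] * len(choices), choices,
                    padding=True, return_tensors="pt")
batch = {k: v.unsqueeze(0) for k, v in encoded.items()}  # (1, num_choices, seq_len)

with torch.no_grad():
    logits = model(**batch).logits                       # shape: (1, num_choices)
print("Higher-scoring alternative:", logits.argmax(dim=-1).item())
```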
