[Re] Word Meanings Representation in Context

This repository hosts the code, data and results of our reproducibility experiment for the ACL 2021 paper Exploring the Representation of Word Meanings in Context: A Case Study on Homonymy and Synonymy by Marcos Garcia. The paper examines both static and contextualized word embeddings to assess how well they represent lexical-semantic relations such as homonymy and synonymy. Our goal is to reproduce the results summarised in Table 4 of the original paper and to test the author's hypothesis on a newly compiled Italian dataset. The original repository can be reached at https://github.com/marcospln/homonymy_acl21.

Datasets

For our experiment we work with .tsv datasets of triples in five languages: English, Spanish, Portuguese, Galician and Italian. A triple is a set of three sentences, each containing a target word marked by <b></b> tags. Two of the target words have the same meaning, while the third is an outlier.

Target	POS	Context	Overlap	Sent1	Sent2	Sent3
coach	same|same|same	same|same|same	false|false|false	We're going to the airport by <b>coach</b>.	We're going to the airport by <b>bus</b>.	We're going to the airport by <b>bicycle</b>.
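As an illustration of the triple format, the following minimal sketch reads tab-separated rows and extracts the target word marked by <b></b> tags in each sentence. It is not taken from the repository's scripts: the function name read_triples and the SAMPLE string are hypothetical, and the column names are assumed to match the header shown above.

```python
import csv
import io
import re

# Sample copied from the header and row shown above (tab-separated).
SAMPLE = (
    "Target\tPOS\tContext\tOverlap\tSent1\tSent2\tSent3\n"
    "coach\tsame|same|same\tsame|same|same\tfalse|false|false\t"
    "We're going to the airport by <b>coach</b>.\t"
    "We're going to the airport by <b>bus</b>.\t"
    "We're going to the airport by <b>bicycle</b>.\n"
)

# Matches the target word enclosed in <b></b> tags.
TARGET_RE = re.compile(r"<b>(.*?)</b>")


def read_triples(handle):
    """Yield (sentences, target_words) for each row of a triples .tsv file.

    Hypothetical helper: assumes the columns Sent1, Sent2 and Sent3 exist
    and that every sentence contains exactly one <b></b> pair.
    """
    # QUOTE_NONE keeps quotation marks inside sentences as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        sentences = [row["Sent1"], row["Sent2"], row["Sent3"]]
        targets = [TARGET_RE.search(s).group(1) for s in sentences]
        yield sentences, targets


triples = list(read_triples(io.StringIO(SAMPLE)))
print(triples[0][1])
```

To read a real file, pass an open file handle (for example, one opened with encoding="utf-8" and newline="") instead of the in-memory sample.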

For each .tsv dataset we also need its corresponding .conllu version. These resources are provided in the datasets folder.

Run the Experiment

execute get_fasttext_models.sh to get the fastText models required to successfully run the experiment;
execute generate_comparisons.sh to generate an embedding for each sentence in each triple and compare them: (emb_sent_1 vs emb_sent_2), (emb_sent_1 vs emb_sent_3) and (emb_sent_2 vs emb_sent_3);
execute evaluate_comparisons.sh to compute the accuracy scores for each language variety.
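The comparison and evaluation steps above can be sketched roughly as follows. This is not the code in generate_comparisons.sh or evaluate_comparisons.sh: the toy vectors stand in for real embeddings, cosine similarity is assumed as the comparison measure, and the scoring rule (a triple counts as correct when sentences 1 and 2 are more similar to each other than either is to sentence 3) is an assumption that treats sentence 3 as the outlier. All function names are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def compare_triple(embeddings):
    """Return similarities for the pairs (1, 2), (1, 3) and (2, 3)."""
    return {
        (i + 1, j + 1): cosine(embeddings[i], embeddings[j])
        for i, j in combinations(range(3), 2)
    }


def is_correct(sims):
    # ASSUMPTION: sentences 1 and 2 share a meaning and sentence 3 is the
    # outlier, so the (1, 2) pair should be the most similar of the three.
    return sims[(1, 2)] > sims[(1, 3)] and sims[(1, 2)] > sims[(2, 3)]


def accuracy(all_sims):
    """Fraction of triples judged correct under the assumed rule."""
    return sum(is_correct(s) for s in all_sims) / len(all_sims)


# Toy 2-dimensional vectors standing in for real sentence embeddings.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
sims = compare_triple(toy)
print(sims)
print(accuracy([sims]))
```

Plain Python lists keep the sketch dependency-free; with real embeddings a vectorised library would be the more natural choice.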

Results

The outputs of the two scripts, generate_comparisons.sh and evaluate_comparisons.sh, are stored in triples_comparisons and results, respectively. We provide our results in the repro_results folder.

