From cea4a8dde699b6e29b43f1b60a6755afa301fba5 Mon Sep 17 00:00:00 2001
From: Sean Rosario
Date: Mon, 13 Nov 2017 20:54:35 -0500
Subject: [PATCH 1/5] Added Universal Sentence Rep notes file with TLDR

---
 ...pervised-learning-of-universal-sentence-representations.md | 4 ++++
 1 file changed, 4 insertions(+)
 create mode 100644 notes/supervised-learning-of-universal-sentence-representations.md

diff --git a/notes/supervised-learning-of-universal-sentence-representations.md b/notes/supervised-learning-of-universal-sentence-representations.md
new file mode 100644
index 0000000..e6daa9e
--- /dev/null
+++ b/notes/supervised-learning-of-universal-sentence-representations.md
@@ -0,0 +1,4 @@
+## [Supervised Learning of Universal Sentence Representations from Natural Language Inference Data](https://arxiv.org/abs/1705.02364)
+
+TLDR; The authors show that supervised training on the NLI task can produce high-quality "universal" sentence embeddings that outperform existing models on transfer tasks. They train the sentence vectors on the SNLI corpus using 4 different sentence encoder architectures.
+

From 0bc2758086af42fd7fe5668c8bd90197a43e79fc Mon Sep 17 00:00:00 2001
From: Sean Rosario
Date: Mon, 13 Nov 2017 21:56:47 -0500
Subject: [PATCH 2/5] Added key points to InferSent notes

---
 ...d-learning-of-universal-sentence-representations.md | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/notes/supervised-learning-of-universal-sentence-representations.md b/notes/supervised-learning-of-universal-sentence-representations.md
index e6daa9e..84ecab5 100644
--- a/notes/supervised-learning-of-universal-sentence-representations.md
+++ b/notes/supervised-learning-of-universal-sentence-representations.md
@@ -2,3 +2,13 @@
 
 TLDR; The authors show that supervised training on the NLI task can produce high-quality "universal" sentence embeddings that outperform existing models on transfer tasks. They train the sentence vectors on the SNLI corpus using 4 different sentence encoder architectures.
 
+### Key Points
+- The SNLI corpus is a large corpus of sentence pairs that have been manually categorized into 3 classes: entailment, contradiction, and neutral. The SNLI task is well suited to learning sentence vectors because it forces the model to learn semantic representations.
+
+- The 4 sentence encoder architectures used are (see the sketch below):
+  - LSTM/GRU: Essentially the encoder of a seq2seq model
+  - BiLSTM: Bi-directional LSTM where each dimension of the two (forward and backward) encodings is either summed or max-pooled
+  - Self-attentive network: Weighted linear combination (attention) over each hidden state vector of a BiLSTM
+  - Hierarchical ConvNet: The authors introduce a variation of the AdaSent model, where at each layer of the CNN, a max pool is taken over the feature maps. Each of these max-pooled vectors is concatenated to obtain the final sentence encoding.
+
+- The trained models are used to get sentence representations for different tasks such as classification (e.g. sentiment analysis, subjectivity/objectivity), entailment (e.g. the SICK dataset), caption-image retrieval, and a few other tasks.
\ No newline at end of file
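The BiLSTM-Max encoder referenced in the notes above can be sketched in a few lines of PyTorch. This is a minimal sketch, not the authors' implementation: the class and variable names are illustrative, and the dimensions (300-dim word vectors, 2048-dim hidden states, giving a 4096-dim sentence vector) are chosen to match the best configuration described in the notes.

```python
import torch
import torch.nn as nn

class BiLSTMMaxEncoder(nn.Module):
    """Bi-directional LSTM with max pooling over time (BiLSTM-Max).
    Dimensions follow the 4096-dim configuration mentioned in the notes;
    names and hyperparameter values here are illustrative."""

    def __init__(self, word_dim=300, hidden_dim=2048):
        super().__init__()
        # Forward and backward states are concatenated, so the final
        # sentence vector has 2 * hidden_dim = 4096 dimensions.
        self.lstm = nn.LSTM(word_dim, hidden_dim,
                            bidirectional=True, batch_first=True)

    def forward(self, embedded):
        # embedded: (batch, seq_len, word_dim) pretrained word vectors
        hidden, _ = self.lstm(embedded)      # (batch, seq_len, 2 * hidden_dim)
        # Max over the time axis: each output dimension keeps its largest
        # value across all time steps, giving a fixed-size sentence vector.
        sentence_vec, _ = hidden.max(dim=1)  # (batch, 2 * hidden_dim)
        return sentence_vec

# Example: encode a batch of 8 sentences of 20 tokens each.
encoder = BiLSTMMaxEncoder()
dummy_words = torch.randn(8, 20, 300)        # stand-in for GloVe embeddings
print(encoder(dummy_words).shape)            # torch.Size([8, 4096])
```

Taking the max over time yields a fixed-size vector regardless of sentence length, which is what makes the encoding usable as a drop-in sentence representation for the transfer tasks.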
From dfc31ab07376f9abacc1b4297aae589f83d7680a Mon Sep 17 00:00:00 2001
From: Sean Rosario
Date: Mon, 13 Nov 2017 22:53:30 -0500
Subject: [PATCH 3/5] Added more key points to InferSent notes

---
 ...-learning-of-universal-sentence-representations.md | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/notes/supervised-learning-of-universal-sentence-representations.md b/notes/supervised-learning-of-universal-sentence-representations.md
index 84ecab5..18b4faf 100644
--- a/notes/supervised-learning-of-universal-sentence-representations.md
+++ b/notes/supervised-learning-of-universal-sentence-representations.md
@@ -7,8 +7,13 @@ TLDR; The authors show that supervised training on the NLI task can produce high
 
 - The 4 sentence encoder architectures used are (see the sketch below):
   - LSTM/GRU: Essentially the encoder of a seq2seq model
-  - BiLSTM: Bi-directional LSTM where each dimension of the two (forward and backward) encodings is either summed or max-pooled
+  - BiLSTM: Bi-directional LSTM where each dimension of the two (forward and backward) encodings is either summed or max-pooled.
   - Self-attentive network: Weighted linear combination (attention) over each hidden state vector of a BiLSTM
-  - Hierarchical ConvNet: The authors introduce a variation of the AdaSent model, where at each layer of the CNN, a max pool is taken over the feature maps. Each of these max-pooled vectors is concatenated to obtain the final sentence encoding.
+  - Hierarchical ConvNet: The authors introduce a variation of the AdaSent model, where at each layer of the CNN, a max pool is taken over the feature maps. These max-pooled vectors are concatenated to obtain the final sentence encoding.
+
+- The BiLSTM-Max with a 4096-dimensional encoding performs best of all the models, on the SNLI task as well as on the transfer tasks.
+
+- Some models are sensitive to over-specialization on the SNLI training task: they can perform better on SNLI itself but transfer less well than other models.
+
+- The trained models are used to get sentence representations and test performance on 12 different transfer tasks, such as classification (e.g. sentiment analysis, subjectivity/objectivity), entailment (e.g. the SICK dataset), caption-image retrieval, and a few other tasks.
 
-- The trained models are used to get sentence representations for different tasks such as classification (e.g. sentiment analysis, subjectivity/objectivity), entailment (e.g. the SICK dataset), caption-image retrieval, and a few other tasks.
\ No newline at end of file

From 27565ceaf39ac5c4e420fa1ff661d196963c7c4c Mon Sep 17 00:00:00 2001
From: Sean Rosario
Date: Mon, 13 Nov 2017 22:55:22 -0500
Subject: [PATCH 4/5] Added InferSent implementation link

---
 ...supervised-learning-of-universal-sentence-representations.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/notes/supervised-learning-of-universal-sentence-representations.md b/notes/supervised-learning-of-universal-sentence-representations.md
index 18b4faf..dcd68b6 100644
--- a/notes/supervised-learning-of-universal-sentence-representations.md
+++ b/notes/supervised-learning-of-universal-sentence-representations.md
@@ -17,3 +17,5 @@ TLDR; The authors show that supervised training on the NLI task can produce high
 
 - The trained models are used to get sentence representations and test performance on 12 different transfer tasks, such as classification (e.g. sentiment analysis, subjectivity/objectivity), entailment (e.g. the SICK dataset), caption-image retrieval, and a few other tasks.
 
+- PyTorch implementation available [here](https://github.com/facebookresearch/InferSent)
+
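The supervised NLI training mentioned in the TLDR classifies a premise/hypothesis pair from the two sentence vectors produced by a shared encoder. The paper combines the two vectors as the concatenation, absolute difference, and element-wise product before a small classifier; the sketch below assumes that combination, but the layer sizes, the Tanh nonlinearity, and all names are illustrative assumptions rather than the authors' exact setup.

```python
import torch
import torch.nn as nn

class NLIClassifier(nn.Module):
    """Pair classifier used to train the sentence encoder on SNLI.
    Combines premise vector u and hypothesis vector v as
    [u, v, |u - v|, u * v], then applies an MLP over the 3 classes
    (entailment / contradiction / neutral)."""

    def __init__(self, sent_dim=4096, hidden=512, n_classes=3):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(4 * sent_dim, hidden),
            nn.Tanh(),                       # nonlinearity is an assumption
            nn.Linear(hidden, n_classes),
        )

    def forward(self, u, v):
        # u, v: (batch, sent_dim) sentence embeddings from a shared encoder
        features = torch.cat([u, v, torch.abs(u - v), u * v], dim=1)
        return self.mlp(features)            # class logits

# Usage with 4096-dim vectors like those from the BiLSTM-Max sketch above:
u = torch.randn(8, 4096)
v = torch.randn(8, 4096)
logits = NLIClassifier()(u, v)               # shape: (8, 3)
```

Because only the encoder below this matching layer is reused, the classifier head is discarded after training and the sentence vectors are fed to the downstream transfer tasks.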
From 86e6b53946d4cb6f05b26c88d0c6ca3340e4be95 Mon Sep 17 00:00:00 2001
From: Sean D'Rosario
Date: Mon, 13 Nov 2017 23:04:11 -0500
Subject: [PATCH 5/5] Added NLI Sentence embedding paper to the list

First one under the 2017-05 section
---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index cd987d1..824b6d5 100644
--- a/README.md
+++ b/README.md
@@ -126,6 +126,7 @@ Weakly-Supervised Classification and Localization of Common Thorax Diseases [[CV
 
 #### 2017-05
 
+- Supervised Learning of Universal Sentence Representations from Natural Language Inference Data [[arXiv](https://arxiv.org/abs/1705.02364)] [[code](https://github.com/facebookresearch/InferSent)]
 - pix2code: Generating Code from a Graphical User Interface Screenshot [[arXiv](https://arxiv.org/abs/1705.07962)] [[article](https://uizard.io/research#pix2code)] [[code](https://github.com/tonybeltramelli/pix2code)]
 - The Cramer Distance as a Solution to Biased Wasserstein Gradients [[arXiv](https://arxiv.org/abs/1705.10743)]
 - Reinforcement Learning with a Corrupted Reward Channel [[arXiv](https://arxiv.org/abs/1705.08417)]