Merge pull request dennybritz#32 from SeanRosario/master
Added universal sentence embeddings paper to list with notes
dennybritz authored Nov 14, 2017
2 parents 90bbd6c + 86e6b53 commit feab73f
Showing 2 changed files with 22 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -154,6 +154,7 @@ Weakly-Supervised Classification and Localization of Common Thorax Diseases [[CV

#### 2017-05

- Supervised Learning of Universal Sentence Representations from Natural Language Inference Data [[arXiv](https://arxiv.org/abs/1705.02364)] [[code](https://github.com/facebookresearch/InferSent)]
- pix2code: Generating Code from a Graphical User Interface Screenshot [[arXiv](https://arxiv.org/abs/1705.07962)] [[article](https://uizard.io/research#pix2code)] [[code](https://github.com/tonybeltramelli/pix2code)]
- The Cramer Distance as a Solution to Biased Wasserstein Gradients [[arXiv](https://arxiv.org/abs/1705.10743)]
- Reinforcement Learning with a Corrupted Reward Channel [[arXiv](https://arxiv.org/abs/1705.08417)]
21 changes: 21 additions & 0 deletions notes/supervised-learning-of-universal-sentence-representations.md
@@ -0,0 +1,21 @@
## [Supervised Learning of Universal Sentence Representations from Natural Language Inference Data](https://arxiv.org/abs/1705.02364)

TLDR; The authors show that supervised training on the NLI task can produce high-quality "universal" sentence embeddings that outperform existing models on transfer tasks. They train the sentence encoders on the SNLI corpus using 4 different architectures.

### Key Points
- The SNLI corpus is a large corpus of sentence pairs that have been manually categorized into 3 classes: entailment (e.g., "A man is playing a guitar" entails "A man is playing an instrument"), contradiction, and neutral. The SNLI task is good for learning sentence vectors because it forces the model to learn semantic representations.

- The 4 sentence encoding architectures used are:
  - LSTM/GRU: Essentially the encoder of a seq2seq model.
  - BiLSTM: Bi-directional LSTM where each dimension of the concatenated forward and backward hidden states is either mean-pooled or max-pooled over time (see the sketch after this list).
  - Self-attentive network: A weighted linear combination (attention) over the hidden state vectors of a BiLSTM.
  - Hierarchical ConvNet: A variation of the AdaSent model where, at each layer of the CNN, a max-pool is taken over the feature maps; the pooled vectors from all layers are concatenated to obtain the final sentence encoding.
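
Below is a minimal PyTorch sketch of the BiLSTM-max encoder together with the paper's [u; v; |u - v|; u * v] pair-feature classifier head. It is an illustration under simplifying assumptions (no padding mask, toy classifier sizes), not the actual InferSent code; all class and variable names are ours.

```python
import torch
import torch.nn as nn

class BiLSTMMaxEncoder(nn.Module):
    """Encode a batch of word-embedding sequences into fixed-size vectors
    by max-pooling the BiLSTM hidden states over time (no padding mask,
    for brevity)."""
    def __init__(self, word_dim=300, hidden_dim=2048):
        super().__init__()
        self.lstm = nn.LSTM(word_dim, hidden_dim,
                            bidirectional=True, batch_first=True)

    def forward(self, embedded):           # (batch, seq_len, word_dim)
        states, _ = self.lstm(embedded)    # (batch, seq_len, 2 * hidden_dim)
        sentence, _ = states.max(dim=1)    # max over time -> (batch, 4096)
        return sentence

class NLIClassifier(nn.Module):
    """Combine premise u and hypothesis v as [u; v; |u - v|; u * v] and
    predict entailment / contradiction / neutral, as in the paper."""
    def __init__(self, sent_dim=4096, n_classes=3):
        super().__init__()
        self.encoder = BiLSTMMaxEncoder()
        self.mlp = nn.Sequential(nn.Linear(4 * sent_dim, 512),
                                 nn.Tanh(),
                                 nn.Linear(512, n_classes))

    def forward(self, premise, hypothesis):   # both: (batch, seq_len, 300)
        u = self.encoder(premise)
        v = self.encoder(hypothesis)
        features = torch.cat([u, v, (u - v).abs(), u * v], dim=1)
        return self.mlp(features)             # (batch, 3) class logits

# Smoke test with random "embeddings":
logits = NLIClassifier()(torch.randn(2, 10, 300), torch.randn(2, 10, 300))
```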

- BiLSTM-Max with a 4096-dimensional encoding performs best of all the models, on the SNLI task as well as on the transfer tasks.

- Some models are prone to over-specialization on the SNLI training task: a model can perform better on SNLI itself yet transfer worse than other models.

- The trained models are used to extract sentence representations and test performance on 12 different transfer tasks, such as classification (e.g., sentiment analysis, subjectivity/objectivity), entailment (e.g., the SICK dataset), caption-image retrieval, and a few others, typically by training a simple classifier on top of the frozen embeddings (see the sketch below).
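
A minimal sketch of that transfer protocol, assuming a frozen encoder wrapped in a hypothetical `encode` function; the paper's actual evaluation uses the SentEval tooling.

```python
# Simplified transfer evaluation (our simplification, not the paper's code):
# the sentence encoder stays frozen and only a logistic-regression probe is
# trained per task. `encode` is a hypothetical wrapper mapping a list of
# sentences to an (n, 4096) feature array.
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def evaluate_transfer(encode, train_texts, y_train, test_texts, y_test):
    X_train, X_test = encode(train_texts), encode(test_texts)
    probe = LogisticRegression(max_iter=1000)  # the only trained component
    probe.fit(X_train, y_train)
    return accuracy_score(y_test, probe.predict(X_test))
```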

- A PyTorch implementation is available [here](https://github.com/facebookresearch/InferSent); a usage sketch follows below.
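
For reference, a loading/encoding sketch adapted from the repo's README; the exact class names, checkpoint paths, and keyword arguments vary across InferSent versions, so treat them as assumptions and defer to the repo.

```python
# Sketch only: the entry points (InferSent, set_w2v_path, build_vocab,
# encode) follow the repo's README and may differ by version.
import torch
from models import InferSent  # models.py ships with the InferSent repo

params_model = {'bsize': 64, 'word_emb_dim': 300, 'enc_lstm_dim': 2048,
                'pool_type': 'max', 'dpout_model': 0.0, 'version': 2}
model = InferSent(params_model)
model.load_state_dict(torch.load('encoder/infersent2.pkl'))  # downloaded checkpoint
model.set_w2v_path('fastText/crawl-300d-2M.vec')             # word vectors

sentences = ['A man is playing a guitar.', 'A man plays an instrument.']
model.build_vocab(sentences, tokenize=True)
embeddings = model.encode(sentences, tokenize=True)  # (2, 4096) numpy array
```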
