Microsoft Research
Published June 22, 2016, 20:35
Word embeddings are often constructed with discriminative models such as deep nets and word2vec. Mikolov et al. (2013) showed that these embeddings exhibit linear structure that is useful for solving "word analogy tasks". Subsequently, Levy and Goldberg (2014) and Pennington et al. (2014) tried to explain why such linear structure should arise in embeddings derived from nonlinear methods. We provide a new generative-model "explanation" for various word embedding methods as well as for the above-mentioned linear structure. The model also gives a generative explanation of older vector space methods such as the PMI method of Church and Hanks (1990). It makes surprising predictions (e.g., the spatial isotropy of word vectors), which are empirically verified. It also leads directly to a linear algebraic understanding of how a word embedding behaves when the word is polysemous (has multiple meanings), and of how to recover the different meanings from the embedding. This methodology and generative model may be useful for other NLP tasks and neural models. Joint work with Sanjeev Arora, Yuanzhi Li, Yingyu Liang, and Andrej Risteski (listed in alphabetical order).
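For readers unfamiliar with the "word analogy tasks" mentioned above, here is a minimal sketch of how linear structure is typically used to answer "a is to b as c is to ?". The vectors are hand-picked 2-D toy values, not trained embeddings, and the names `toy_vectors` and `solve_analogy` are hypothetical helpers introduced only for this illustration. It does not implement the generative model described in the talk.

```python
import math

# Toy 2-D embeddings, hand-picked for illustration only (NOT trained vectors).
# Axis 0 loosely encodes "royalty", axis 1 loosely encodes "gender".
toy_vectors = {
    "king":  (0.9, 0.8),
    "queen": (0.9, -0.8),
    "man":   (0.1, 0.8),
    "woman": (0.1, -0.8),
    "apple": (-0.7, 0.0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' via the nearest neighbour of b - a + c."""
    target = tuple(
        vb - va + vc
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    )
    # Exclude the three query words, as is common practice in analogy benchmarks.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


answer = solve_analogy("man", "king", "woman", toy_vectors)
print(answer)
```

With real embeddings the same arithmetic is applied to high-dimensional trained vectors, and the match is only approximate. The toy values here were chosen so that the offset is exact.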