File:Linguistic Regularities in Continuous Space Word Representations Rvecs.pdf

Linguistic_Regularities_in_Continuous_Space_Word_Representations_Rvecs.pdf (0 × 0 pixels, file size: 121 KB, MIME type: application/pdf)

Tomas Mikolov∗, Wen-tau Yih, Geoffrey Zweig. Microsoft Research, Redmond, WA 98052


Continuous space language models have recently demonstrated outstanding results across a variety of tasks. In this paper, we examine the vector-space word representations that are implicitly learned by the input-layer weights. We find that these representations are surprisingly good at capturing syntactic and semantic regularities in language, and that each relationship is characterized by a relation-specific vector offset. This allows vector-oriented reasoning based on the offsets between words. For example, the male/female relationship is automatically learned, and with the induced vector representations, “King - Man + Woman” results in a vector very close to “Queen.” We demonstrate that the word vectors capture syntactic regularities by means of syntactic analogy questions (provided with this paper), and are able to correctly answer almost 40% of the questions. We demonstrate that the word vectors capture semantic regularities by using the vector offset method to answer SemEval-2012 Task 2 questions. Remarkably, this method outperforms the best previous systems.
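
The vector offset reasoning described in the abstract can be illustrated with a minimal sketch. The three-dimensional vectors below are hand-made toy values, not embeddings learned by the authors' model, and the function name `answer_analogy` is hypothetical. The sketch assumes that candidates are ranked by cosine similarity to the offset vector and that the three query words are excluded from the candidates.

```python
import numpy as np


def answer_analogy(vectors, a, b, c):
    """Return the word d that best completes "a is to b as c is to d".

    vectors maps each word to a 1-D NumPy array. The target point is
    vectors[b] - vectors[a] + vectors[c]; the answer is the remaining
    word whose vector has the highest cosine similarity to that target.
    """
    target = vectors[b] - vectors[a] + vectors[c]
    target = target / np.linalg.norm(target)

    best_word, best_score = None, -np.inf
    for word, vec in vectors.items():
        if word in (a, b, c):
            # Assumption: the query words themselves are not valid answers.
            continue
        score = float(np.dot(vec, target) / np.linalg.norm(vec))
        if score > best_score:
            best_word, best_score = word, score
    return best_word


# Toy vectors (assumed for illustration): axis 0 ~ royalty, axis 1 ~ gender,
# axis 2 ~ an unrelated direction used by the distractor word.
toy_vectors = {
    "king": np.array([1.0, 1.0, 0.0]),
    "queen": np.array([1.0, -1.0, 0.0]),
    "man": np.array([0.0, 1.0, 0.0]),
    "woman": np.array([0.0, -1.0, 0.0]),
    "apple": np.array([0.0, 0.0, 1.0]),
}

# "man is to king as woman is to ?" computes King - Man + Woman.
print(answer_analogy(toy_vectors, "man", "king", "woman"))  # queen
```

With real embeddings the offset rarely lands exactly on a word vector, which is why the nearest candidate is returned instead of testing for equality.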

File history


current: 16:02, 22 December 2016; 0 × 0 (121 KB); uploaded by Slikos

The following page links to this file: