
Table 2: Results on the word analogy task, given as percent accuracy. Underlined scores are best within groups of similarly-sized models; bold scores are best overall. HPCA vectors are publicly available; (i)vLBL results are from (Mnih et al., 2013); skip-gram (SG) and CBOW results are from (Mikolov et al., 2013a,b); we trained SG† and CBOW† using the word2vec tool. See text for details and a description of the SVD models.

… dataset for NER (Tjong Kim Sang and De Meulder, 2003).

Word analogies. The word analogy task consists of questions like, “a is to b as c is to __?” The dataset contains 19,544 such questions, divided into a semantic subset and a syntactic subset. The semantic questions are typically analogies about people or places, like “Athens is to Greece as Berlin is to __?”. The syntactic questions are typically analogies about verb tenses or forms of adjectives, for example “dance is to dancing as fly is to __?”. To correctly answer the question, the model should uniquely identify the missing term, with only an exact correspondence counted as a correct match. We answer the question “a is to b as c is to __?” by finding the word $d$ whose representation $w_d$ is closest to $w_b - w_a + w_c$ according to the cosine similarity.
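
The answering rule described above can be illustrated with a minimal Python sketch. It is not the evaluation code used for Table 2: the function names, the toy three-dimensional vectors, and the choice to exclude the three question words from the candidate set are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def answer_analogy(vectors, a, b, c):
    """Return the word d whose vector is closest to w_b - w_a + w_c."""
    target = [wb - wa + wc
              for wa, wb, wc in zip(vectors[a], vectors[b], vectors[c])]
    best_word, best_sim = None, -math.inf
    for word, vec in vectors.items():
        # Assumption: the question words themselves are not valid answers.
        if word in (a, b, c):
            continue
        sim = cosine(vec, target)
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word


def percent_accuracy(vectors, questions):
    """Percent of (a, b, c, expected) questions answered with an exact match."""
    correct = sum(1 for a, b, c, expected in questions
                  if answer_analogy(vectors, a, b, c) == expected)
    return 100.0 * correct / len(questions)


# Hypothetical toy vectors, chosen so that greece - athens + berlin == germany.
toy_vectors = {
    "athens": [1.0, 0.0, 1.0],
    "greece": [1.0, 1.0, 1.0],
    "berlin": [0.0, 0.0, 2.0],
    "germany": [0.0, 1.0, 2.0],
    "dancing": [3.0, 0.5, 0.0],
}

print(answer_analogy(toy_vectors, "athens", "greece", "berlin"))
```

With real embeddings, `vectors` would map every vocabulary word to its learned representation, and `percent_accuracy` would be computed separately over the semantic and syntactic subsets.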
