Embedding Spaces
Recently Published Documents


TOTAL DOCUMENTS: 59 (five years: 27)

H-INDEX: 8 (five years: 1)

2021 · Author(s): Yuan An, Alexander Kalinowski, Jane Greenberg

Author(s): Sixiao Zhang, Hongxu Chen, Xiao Ming, Lizhen Cui, Hongzhi Yin, et al.

Author(s): Anushika Liyanage, Surangika Ranathunga, Sanath Jayasena

2021 · Author(s): Maxime D. Armstrong, Diego Maupomé, Marie-Jean Meurs

2021 · Vol. 15 (02) · pp. 263-290 · Author(s): Renjith P. Ravindran, Kavi Narayana Murthy

Word embeddings have recently become a vital component of many Natural Language Processing (NLP) systems. Word embedding techniques represent the words of a language as vectors in an n-dimensional real space, and these vectors have been shown to encode a significant amount of syntactic and semantic information. When used in NLP systems, such representations have improved performance across a wide range of tasks. However, it is not clear how the syntactic properties of words interact with their more widely studied semantic properties, or which factors in the modeling formulation encourage an embedding space to capture the syntactic rather than the semantic behavior of words. We investigate several aspects of word embedding spaces and of the underlying modeling assumptions that maximize syntactic coherence, i.e., the degree to which words with similar syntactic properties form distinct neighborhoods in the embedding space. We do so in order to determine which of the existing models maximizes syntactic coherence, making it a more reliable source for extracting syntactic category (POS) information. Our analysis shows that the syntactic coherence of S-CODE is superior to that of more popular and more recent embedding techniques such as Word2vec, fastText, GloVe, and LexVec when measured under compatible parameter settings. Our investigation also gives deeper insights into the geometry of the embedding space with respect to syntactic coherence, and into how this geometry is influenced by context size, word frequency, and the dimensionality of the embedding space.
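The abstract defines syntactic coherence as the degree to which words with similar syntactic properties form distinct neighborhoods, but does not give the exact metric. Below is a minimal sketch of one plausible instantiation, assuming a k-nearest-neighbour POS-agreement measure: the average fraction of each word's k nearest neighbours (by cosine similarity) that share its POS tag. The function name, the toy vectors, and the POS tags are illustrative assumptions, not taken from the paper.

```python
# Sketch of a k-NN POS-agreement measure of syntactic coherence.
# All names and data here are hypothetical, for illustration only.
import numpy as np

def syntactic_coherence(vectors, pos_tags, k=2):
    """Mean fraction of each word's k nearest neighbours
    (by cosine similarity) that share the word's POS tag.

    vectors:  (n_words, dim) array of word embeddings
    pos_tags: list of n_words POS labels
    """
    # Normalise rows so that dot products equal cosine similarities.
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit.T
    np.fill_diagonal(sims, -np.inf)  # exclude each word itself

    agreements = []
    for i in range(len(pos_tags)):
        neighbours = np.argsort(sims[i])[::-1][:k]  # top-k most similar
        agreements.append(
            np.mean([pos_tags[j] == pos_tags[i] for j in neighbours])
        )
    return float(np.mean(agreements))

# Toy example: two noun-like and two verb-like vectors.
vecs = np.array([[1.0, 0.1], [0.9, 0.2],   # noun-like region
                 [0.1, 1.0], [0.2, 0.9]])  # verb-like region
tags = ["NOUN", "NOUN", "VERB", "VERB"]
print(syntactic_coherence(vecs, tags, k=1))  # 1.0: neighbourhoods are pure
```

Under a measure like this, a space in which POS classes form tight, well-separated clusters scores near 1.0, while a space in which syntactic classes are intermixed scores near the chance level of tag agreement.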


Author(s): Julien Ducoulombier, Victor Turchin, Thomas Willwacher

Author(s): Kelly Marchisio, Youngser Park, Ali Saad-Eldin, Anton Alyakin, Kevin Duh, et al.

2021 · Author(s): Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze

2021 · Author(s): Niklas Friedrich, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš
