TENDENCIES OF DEVELOPMENT SCIENCE AND PRACTICE
ALGORITHMS FOR INTERPRETING WORD VECTORS
Iskandarova S.N.
Samarkand branch of Tashkent University of Information Technologies named after Muhammad al-Khwarizmi, Samarkand, Uzbekistan
Makhamova D.A.
Samarkand branch of Tashkent University of Information Technologies named after Muhammad al-Khwarizmi, Samarkand, Uzbekistan
GloVe is closely related to Word2Vec: the algorithms appeared around the same time, and both rely on the interpretability of word vectors. The GloVe model aims to make efficient use of global word co-occurrence statistics. Using stochastic gradient descent, GloVe minimizes the difference between the dot product of two word vectors and the logarithm of their probability of joint occurrence. The resulting representations capture meaningful linear substructures of the word vector space: for example, they can relate the different satellites of one planet, or a city's name to its postal code.
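The objective just described can be sketched in a few lines of NumPy. Everything here is illustrative rather than the original training setup: the tiny co-occurrence matrix, vector dimensionality, learning rate, and number of epochs are all made up, while the weighting function uses the commonly cited constants x_max = 100 and alpha = 0.75.

```python
import numpy as np

# Toy symmetric co-occurrence matrix X (values are made up for illustration).
X = np.array([[0., 8., 2.],
              [8., 0., 4.],
              [2., 4., 0.]])

rng = np.random.default_rng(0)
V, d = X.shape[0], 5
W  = rng.normal(scale=0.1, size=(V, d))   # main word vectors
Wc = rng.normal(scale=0.1, size=(V, d))   # context word vectors
b  = np.zeros(V)                          # word biases
bc = np.zeros(V)                          # context biases

def f(x, x_max=100.0, alpha=0.75):
    # Weighting function: down-weights rare pairs, caps very frequent ones.
    return (x / x_max) ** alpha if x < x_max else 1.0

# Only pairs that actually co-occur contribute to the loss.
pairs = [(i, j) for i in range(V) for j in range(V) if X[i, j] > 0]

def loss():
    # Weighted squared difference between the dot product (plus biases)
    # and the log co-occurrence count.
    return sum(f(X[i, j]) * (W[i] @ Wc[j] + b[i] + bc[j] - np.log(X[i, j])) ** 2
               for i, j in pairs)

initial = loss()
lr = 0.05
for _ in range(200):                      # plain SGD over the nonzero pairs
    for i, j in pairs:
        diff = W[i] @ Wc[j] + b[i] + bc[j] - np.log(X[i, j])
        g = 2 * f(X[i, j]) * diff
        W[i], Wc[j] = W[i] - lr * g * Wc[j], Wc[j] - lr * g * W[i]
        b[i] -= lr * g
        bc[j] -= lr * g
final = loss()
```

On this toy matrix the loss drops steadily, which is all the sketch is meant to show; real GloVe training uses AdaGrad and corpus-scale counts.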
In Word2Vec, word co-occurrence is not modeled directly; it merely serves to generate more training samples. GloVe, by contrast, models co-occurrence counts explicitly rather than relying on local context statistics alone, so word vectors are grouped according to their global similarity.
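To make concrete what "co-occurrence" means here, the counts can be collected from a corpus with a symmetric context window. The two-sentence corpus and the window size below are invented for the example:

```python
from collections import Counter

# Toy corpus and a symmetric context window of 2 (both assumptions).
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
window = 2

counts = Counter()
for sentence in corpus:
    tokens = sentence.split()
    for i, word in enumerate(tokens):
        # Count every word within `window` positions on either side.
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                counts[(word, tokens[j])] += 1
```

Arranged as a matrix indexed by word pairs, these counts form the co-occurrence matrix that GloVe factorizes.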
Pretrained Models
GloVe embeddings are readily available from the Stanford University website.
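The downloadable files are plain text, one word per line followed by its vector components, so they can be loaded without any special library. A minimal loader sketch follows; the `tiny_glove.txt` file written here is a synthetic stand-in for a real GloVe file:

```python
import numpy as np

def load_glove(path):
    """Parse the plain-text GloVe format: each line holds a word
    followed by its space-separated vector components."""
    vectors = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=float)
    return vectors

# Synthetic two-word, two-dimensional file standing in for a real download.
with open("tiny_glove.txt", "w", encoding="utf-8") as fh:
    fh.write("king 0.1 0.2\nqueen 0.1 0.3\n")

vecs = load_glove("tiny_glove.txt")
```

The same function works unchanged on the real Stanford files (e.g. the Wikipedia + Gigaword vectors), only with many more lines and dimensions.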
Advantages
❖ Simple architecture without a neural network.
❖ The model is fast, which may be sufficient for simple applications.
❖ GloVe improves on Word2Vec: it adds word-frequency information and outperforms Word2Vec on most benchmarks.
❖ Meaningful embeddings.
Flaws
While the co-occurrence matrix provides global information, GloVe is still trained at the word level, so it carries little information about the sentence or context in which a word is used. It also handles unknown and rare words poorly.