Measurement of Text Similarity: A Survey
Jiapeng Wang and Yihong Dong
Figure 2. Word2vec's model architectures. The continuous bag of words (CBOW) architecture predicts the current word based on the context, and the skip-gram architecture predicts surrounding words given the current word [34].

• Glove

Glove is a word representation tool based on global word-frequency statistics. It captures the semantic information of words by modeling the contextual relationships between words; its core idea is that words with similar meanings often appear in similar contexts [35].

• BERT

BERT's full name is bidirectional encoder representations from transformers; only the transformer encoder is used, because a decoder cannot capture a bidirectional representation. The main innovation of the model is its pre-training approach, which covers the masked language model and next sentence prediction tasks and thereby captures word-level and sentence-level representations, respectively [36]. However, computing similarity with BERT requires costly interactive (pairwise) computation, so it is generally not used directly to compute text similarity in downstream tasks. BERT's model architecture is shown in Figure 3; brief usage sketches of Word2vec, Glove, and BERT for similarity computation follow the figure.

Figure 3. BERT's model architecture. BERT uses a bidirectional transformer; its representations are jointly conditioned on both the left and right context in all layers [36].
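As an illustration (not part of the original survey), the following minimal sketch trains both Word2vec architectures of Figure 2 and queries word similarity. It assumes the gensim library with 4.x parameter names (older versions use size instead of vector_size), and the corpus is a toy placeholder:

```python
# Minimal Word2vec sketch using gensim (assumed installed, version 4.x naming).
# sg=0 selects the CBOW architecture, sg=1 selects skip-gram (cf. Figure 2).
from gensim.models import Word2Vec

corpus = [
    ["text", "similarity", "is", "measured", "between", "documents"],
    ["word", "vectors", "capture", "semantic", "similarity"],
    ["similar", "words", "appear", "in", "similar", "contexts"],
]

# CBOW: predict the current word from its context window.
cbow = Word2Vec(corpus, vector_size=50, window=3, min_count=1, sg=0)
# Skip-gram: predict the surrounding words from the current word.
skipgram = Word2Vec(corpus, vector_size=50, window=3, min_count=1, sg=1)

# Cosine similarity between two learned word vectors.
print(cbow.wv.similarity("text", "word"))
print(skipgram.wv.similarity("text", "word"))
```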
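Likewise, a hedged sketch of word similarity with pre-trained Glove vectors, assuming one of the publicly distributed plain-text vector files (the path "glove.6B.50d.txt" is a hypothetical local file) in which each line holds a word followed by its vector components:

```python
# Sketch: cosine similarity between pre-trained GloVe word vectors.
# Assumes a GloVe text file where each line is: word v1 v2 ... vd
import numpy as np

def load_glove(path):
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

glove = load_glove("glove.6B.50d.txt")  # hypothetical local path
print(cosine(glove["king"], glove["queen"]))
```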
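Finally, because pairwise (interactive) computation with BERT is costly, a common workaround is to encode each sentence independently and compare the pooled token embeddings. The sketch below is one such baseline, assuming the Hugging Face transformers and torch packages and simple mean pooling; it is an assumption for illustration, not the method prescribed by the survey:

```python
# Sketch: sentence similarity with a pre-trained BERT encoder.
# Each sentence is encoded independently; mean pooling over tokens
# (excluding padding) is one simple pooling choice among several.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(sentence):
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (1, seq_len, 768)
    mask = inputs["attention_mask"].unsqueeze(-1)    # mask out padding positions
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

a = embed("The cat sat on the mat.")
b = embed("A cat was sitting on a rug.")
print(torch.nn.functional.cosine_similarity(a, b).item())
```

Mean-pooled embeddings from an un-fine-tuned BERT are only a rough baseline; encoders fine-tuned for sentence similarity typically perform better.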