Average    1,405.5    772.5    54.96%

To check this, I computed the lexical density of two other spoken corpora: the English speeches in EPTIC and the spoken demographic part of the BNC. The European Parliament Translation and Interpreting Corpus (EPTIC) is an intermodal, comparable and parallel corpus comprising speeches delivered at the EP together with their official interpretations and translations (Bernardini et al., 2016). I relied on the English source speeches of EPTIC (23,549 words, see Table 11).

The second corpus used for the comparison is the spoken_demographic subcorpus of the British National Corpus (BNC). The BNC, a collaboration between commercial and academic partners, is a 100-million-word collection of samples of written and spoken language from a wide range of sources. It aims to represent a wide cross-section of British English from the later part of the 20th century (Leech, 1992). The written part represents 90% of the BNC and the spoken part 10%. The latter consists of transcriptions of informal conversations and spoken language gathered in different contexts. The spoken_demographic subcorpus contains 4,190,072 words.

Table 11 – Lexical density across spoken corpora

Subcorpus                 Number of words   Number of lexical words   Lexical density
EPTIC-st-in-en                     23,549                    16,088            68.32%
BNC_spoken_demographic          4,657,760                 2,256,866            48.45%
IN                                  5,622                     3,090            54.96%

As shown in Table 11, lexical density is much higher in EPTIC (68.32%) than in the BNC_spoken_demographic subcorpus (48.45%), which is to be expected given the different genres represented in each corpus: the former is made up of speeches delivered by Members of the European Parliament, while the latter consists of everyday conversations collected in contexts such as radio shows or phone-ins.
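The percentages in Table 11 can be reproduced from the raw counts. The following is a minimal Python sketch, assuming lexical density is simply the number of lexical words divided by the total number of words, expressed as a percentage (the figures in the table are consistent with this). The names `lexical_density` and `TABLE_11` are illustrative and do not come from the original study.

```python
def lexical_density(lexical_words: float, total_words: float) -> float:
    """Return lexical density as a percentage of the total word count."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * lexical_words / total_words


# Counts copied from Table 11 as (total words, lexical words).
# The dictionary name and layout are assumptions made for this sketch.
TABLE_11 = {
    "EPTIC-st-in-en": (23_549, 16_088),
    "BNC_spoken_demographic": (4_657_760, 2_256_866),
    "IN": (5_622, 3_090),
}

# Lexical density per subcorpus, rounded to two decimals as in the table.
densities = {
    name: round(lexical_density(lexical, total), 2)
    for name, (total, lexical) in TABLE_11.items()
}

for name, ld in densities.items():
    print(f"{name}: {ld:.2f}%")
```

Run as is, the sketch recomputes the three lexical density values from the word counts, which makes the ordering of the corpora easy to verify.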
Interestingly, the source speeches included in the corpus display a lexical density (54.96%) that is higher than that of everyday conversations but lower than that of parliamentary debates.

Two POS tags were not taken into consideration in the lexical density scores reported for the French outputs, SENT (sentence-break punctuation) and SYM (symbol), because they are irrelevant to the calculation of lexical density. Table 12 shows the lexical density of all interpreting tasks and all students' outputs.

Table 12 – Lexical density (LD) across inputs and outputs in LOCOSSI

IN01    IN02    IN03    IN04