Average    1,405.5    772.5    54.96%
To check this, I computed the lexical density of two other spoken 
corpora: the English speeches in EPTIC and the spoken demographic part of the BNC. 
The European Parliament Translation and Interpreting Corpus (EPTIC) is an 
intermodal, comparable and parallel corpus, which comprises speeches delivered 
at the EP with their official interpretations and translations (Bernardini et al., 2016). 
I relied on the English source speeches of EPTIC (23,549 words, see Table 11). 
The second corpus used for the comparison is the spoken_demographic 
subcorpus of the British National Corpus (BNC). The BNC, which is a collaboration 
between commercial and academic partners, is a 100-million-word collection of 
samples of written and spoken language from a wide range of sources. It aims to 
represent a wide cross-section of British English from the later part of the 20th 
century (Leech, 1992). The written part represents 90% of the BNC while the 
spoken part represents 10% of it. The latter consists of transcriptions of informal 
conversations and spoken language gathered in different contexts. The 
spoken_demographic subcorpus contains 4,190,072 words. 
Table 11 – Lexical density across spoken corpora 

Subcorpus                 Number of words   Number of lexical words   Lexical density
EPTIC-st-in-en                     23,549                    16,088            68.32%
BNC_spoken_demographic          4,657,760                 2,256,866            48.45%
IN                                  5,622                     3,090            54.96%
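
The percentages in Table 11 follow from dividing the number of lexical words by the total number of words. The short sketch below recomputes them as a sanity check; the word counts are copied from the table, while the function and variable names are illustrative and not taken from the original study.

```python
# (number of words, number of lexical words), copied from Table 11
TABLE_11 = {
    "EPTIC-st-in-en": (23_549, 16_088),
    "BNC_spoken_demographic": (4_657_760, 2_256_866),
    "IN": (5_622, 3_090),
}


def lexical_density(n_words, n_lexical):
    """Return lexical density as a percentage of all words."""
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    return 100.0 * n_lexical / n_words


# Round to two decimals, the precision used in the table
densities = {
    name: round(lexical_density(words, lexical), 2)
    for name, (words, lexical) in TABLE_11.items()
}

for name, ld in densities.items():
    print(f"{name}: {ld:.2f}%")
```

Rounded to two decimals, the recomputed values match the figures reported in the table.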


As shown in Table 11, lexical density is much higher in EPTIC (68.32%) than in the 
BNC_spoken_demographic subcorpus (48.45%), which is to be expected given the 
different genres represented in each corpus. The first consists of speeches 
delivered by Members of the European Parliament, while the second consists of 
everyday conversations that were collected in contexts such as radio shows or 
phone-ins. Interestingly, the source speeches included in the corpus (IN, 54.96%) 
display a lexical density higher than that of everyday conversations but lower 
than that of parliamentary debates. 
Two POS tags were excluded from the lexical density scores reported for the 
French outputs, SENT (sentence-break punctuation) and SYM (symbol), because 
they are irrelevant to the calculation of lexical density. Table 12 shows the 
lexical density of all interpreting tasks and all students’ outputs. 
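
As a minimal sketch of how such an exclusion might work on POS-tagged output, the code below drops SENT and SYM tokens before counting. Only those two tags come from the text. The list of tag prefixes treated as lexical (`LEXICAL_PREFIXES`), the sample tags and the function name are assumptions made for illustration, since the exact set of lexical categories is not specified here.

```python
# Tags excluded from the count, as stated in the text
EXCLUDED_TAGS = {"SENT", "SYM"}

# ASSUMPTION: hypothetical tag prefixes treated as lexical (content) words;
# the original study may use a different set.
LEXICAL_PREFIXES = ("NOM", "NAM", "VER", "ADJ", "ADV")


def lexical_density_from_tags(tagged_tokens):
    """Return lexical density (%) for a list of (token, tag) pairs."""
    # Drop punctuation and symbols before counting words
    counted = [(tok, tag) for tok, tag in tagged_tokens if tag not in EXCLUDED_TAGS]
    if not counted:
        return 0.0
    lexical = sum(1 for _, tag in counted if tag.startswith(LEXICAL_PREFIXES))
    return 100.0 * lexical / len(counted)


# Hypothetical tagged sentence, for illustration only
sample = [
    ("le", "DET:ART"),
    ("parlement", "NOM"),
    ("vote", "VER:pres"),
    ("la", "DET:ART"),
    ("loi", "NOM"),
    (".", "SENT"),
]

ld = lexical_density_from_tags(sample)
print(f"{ld:.2f}%")
```

In this toy sentence the final punctuation mark is ignored, so three lexical words are counted out of five words in total.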
Table 12 – Lexical density (LD) across inputs and outputs in LOCOSSI 
IN01 
IN02 
IN03 
IN04 
