
Average    1,405.5    772.5    54.96%
To check this, I computed the lexical density of two other spoken
corpora: English speeches in EPTIC and the spoken demographic part of the BNC.
The European Parliament Translation and Interpreting Corpus (EPTIC) is an
intermodal, comparable and parallel corpus, which comprises speeches delivered
at the EP with their official interpretations and translations (Bernardini et al., 2016).
I relied on the English source speeches of EPTIC (23,549 words, see Table 11).
The second corpus used for the comparison is the spoken_demographic
subcorpus of the British National Corpus (BNC). The BNC, which is a collaboration
between commercial and academic partners, is a 100-million-word collection of
samples of written and spoken language from a wide range of sources. It aims to
represent a wide cross-section of British English from the later part of the 20th
century (Leech, 1992). The written part represents 90% of the BNC while the
spoken part represents 10% of it. The latter consists of transcriptions of informal
conversations and spoken language gathered in different contexts. The
spoken_demographic subcorpus contains 4,190,072 words.
Table 11 – Lexical density across spoken corpora

Subcorpus                 Number of words   Number of lexical words   Lexical density
EPTIC-st-in-en                     23,549                    16,088            68.32%
BNC_spoken_demographic          4,657,760                 2,256,866            48.45%
IN                                  5,622                     3,090            54.96%
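
The percentages in Table 11 are consistent with lexical density computed as the number of lexical words divided by the total number of words. The following minimal sketch recomputes them from the counts in the table; the function name `lexical_density` and the dictionary `TABLE_11` are illustrative names, not part of the original study.

```python
# Counts copied from Table 11: (number of words, number of lexical words).
TABLE_11 = {
    "EPTIC-st-in-en": (23_549, 16_088),
    "BNC_spoken_demographic": (4_657_760, 2_256_866),
    "IN": (5_622, 3_090),
}


def lexical_density(lexical_words: int, total_words: int) -> float:
    """Return lexical density as a percentage of the total word count."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * lexical_words / total_words


# Recompute the density of each subcorpus, rounded to two decimals as in the table.
densities = {
    name: round(lexical_density(lexical, total), 2)
    for name, (total, lexical) in TABLE_11.items()
}

for name, density in densities.items():
    print(f"{name}: {density:.2f}%")
```

Rounded to two decimals, the recomputed values agree with the last column of Table 11.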

Results and discussion

As shown in Table 11, lexical density is much higher in EPTIC (68.32%) than in the
BNC_spoken_demographic subcorpus (48.45%), which is to be expected given the different
genres represented in each corpus. The first is made up of speeches delivered by
Members of the European Parliament, while the second consists of everyday
conversations collected in contexts such as radio shows or phone-ins.
Interestingly, the source speeches included in the corpus (IN, 54.96%) display a
lexical density higher than that of everyday conversations but lower than that of
parliamentary debates.
Two POS tags, SENT (sentence-break punctuation) and SYM (symbol), were excluded
from the lexical density scores reported for the French outputs because they are
irrelevant to the calculation of lexical density. Table 12 shows the lexical
density of all interpreting tasks and all students’ outputs.
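
The exclusion of SENT and SYM can be sketched as a filter applied to POS-tagged tokens before the ratio is computed. This is only an illustration under stated assumptions: the text does not list which tags count as lexical, so the prefixes in `LEXICAL_PREFIXES` (nouns, proper nouns, adjectives, adverbs, verbs) are a hypothetical choice, as are the function name and the sample sentence.

```python
# Tags dropped before counting, as stated in the text.
EXCLUDED_TAGS = {"SENT", "SYM"}

# ASSUMPTION: tags treated as lexical (content) words. The source does not
# specify this list; these prefixes are chosen only for illustration.
LEXICAL_PREFIXES = ("NOM", "NAM", "ADJ", "ADV", "VER")


def lexical_density_from_tags(tagged_tokens):
    """Return lexical density (%) for a list of (word, pos_tag) pairs."""
    # Ignore sentence-break punctuation and symbols entirely.
    counted = [(word, tag) for word, tag in tagged_tokens
               if tag not in EXCLUDED_TAGS]
    if not counted:
        raise ValueError("no countable tokens after filtering")
    # Count tokens whose tag starts with one of the assumed lexical prefixes.
    lexical = sum(1 for _, tag in counted if tag.startswith(LEXICAL_PREFIXES))
    return 100.0 * lexical / len(counted)


# Hypothetical tagged sentence, invented for the example.
sample = [
    ("le", "DET:ART"),
    ("parlement", "NOM"),
    ("vote", "VER:pres"),
    ("aujourd'hui", "ADV"),
    (".", "SENT"),
]

sample_density = lexical_density_from_tags(sample)
print(f"{sample_density:.2f}%")
```

In this toy sentence, three of the four remaining tokens are lexical once the SENT token is dropped, whereas leaving it in the denominator would lower the score.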
Table 12 – Lexical density (LD) across inputs and outputs in LOCOSSI
IN01
IN02
IN03
IN04
