


Part-of-speech tagging (POS-tagging) is the annotation of corpora with word-class 
information, or in other words, the process of classifying words into their parts 
of speech and labelling them accordingly. The TreeTagger (Schmid, 1994) is a tool 
for annotating text with part-of-speech and lemma information. It was developed by 
Helmut Schmid in the TC project (Stuttgart). Finally, Sketch Engine is a corpus 
query system, which allows the user to view and analyse word sketches (Kilgarriff 
et al., 2004). The POS-tag description used for this subcorpus can be found in the 
appendices (Appendix 3).
This French subcorpus, made up of 30,511 words, was also POS-tagged, using 
the TreeTagger for French on the Sketch Engine platform. More information about 
the French tagset can be found in Appendix 4. 
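To make the annotation layers concrete, the following is a minimal sketch of how tagger output with one token per line and three tab-separated columns (word, POS tag, lemma) could be read into a simple structure. The sample lines, the tag labels and the names `TaggedToken` and `parse_tagged_lines` are illustrative assumptions; they are not taken from the subcorpora or from the tagsets in Appendices 3 and 4.

```python
from typing import List, NamedTuple


class TaggedToken(NamedTuple):
    """One annotated token: surface form, POS tag and lemma."""
    word: str
    pos: str
    lemma: str


def parse_tagged_lines(text: str) -> List[TaggedToken]:
    """Parse lines of the form 'word<TAB>tag<TAB>lemma'.

    Blank lines and lines without exactly three columns are skipped.
    """
    tokens = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        tokens.append(TaggedToken(*parts))
    return tokens


# Invented sample; the tag labels are placeholders, not the actual tagset.
SAMPLE = "The\tDT\tthe\nspeeches\tNNS\tspeech\n"
tokens = parse_tagged_lines(SAMPLE)
for tok in tokens:
    print(tok.word, tok.pos, tok.lemma)
```

Keeping the lemma alongside the tag allows a later query to group inflected forms under a single headword.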


Data and methodology 
 
3.4 Database 
In this section, I explain how I built the database on which the analyses are based. 
I will first take a look at the source speeches, focusing on the difficulties they 
contain and their frequency of occurrence. The second part of this section outlines 
the taxonomy of rendition types I used. Rendition types are illustrated with 
authentic examples taken from LOCOSSI. 
3.4.1 Source speeches: types of difficulties 
As mentioned in Section 2.4, this study focuses on different difficulties faced by 
trainees in SI. These potentially problematic items are numbers, proper names, 
complex noun phrases, single-word terms, culture-specific items, phrasal verbs 
and idioms. I manually identified all the occurrences of each type of difficulty in the 
four source speeches. All these occurrences were then included in a database (an 
Excel spreadsheet). Table 4 shows examples of each category. As the “complex 
noun phrases” category was the most populated (see Table 5), I decided to 
subdivide it into eight subcategories, which represent different syntactic patterns 
(where A stands for adjective, N for noun, prep for preposition, and Adv for 
adverb).
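The identification described above was carried out manually. Purely as an illustration of the notation, the sketch below shows how a pattern label built from the symbols A, N, prep and Adv could be derived from a sequence of coarse word-class tags and then counted. The input tag names (`ADJ`, `NOUN`, `ADP`, `ADV`), the `+` separator, the function name `pattern_label` and the example sequences are all assumptions made for this sketch; they do not reproduce the eight subcategories or any data from the database.

```python
from collections import Counter

# Hypothetical mapping from coarse word-class tags to the pattern symbols
# used in the text (A = adjective, N = noun, prep = preposition, Adv = adverb).
SYMBOLS = {"ADJ": "A", "NOUN": "N", "ADP": "prep", "ADV": "Adv"}


def pattern_label(tags):
    """Join the pattern symbols of a tag sequence, e.g. ['ADJ', 'NOUN'] -> 'A+N'.

    Tags outside the mapping are rendered as '?' so they stay visible.
    """
    return "+".join(SYMBOLS.get(tag, "?") for tag in tags)


# Invented tag sequences standing in for complex noun phrases.
examples = [
    ["ADJ", "NOUN"],
    ["NOUN", "ADP", "NOUN"],
    ["ADV", "ADJ", "NOUN"],
    ["ADJ", "NOUN"],
]

# Count how many example phrases fall under each pattern.
counts = Counter(pattern_label(tags) for tags in examples)
for label, n in counts.most_common():
    print(label, n)
```

A frequency count of this kind is what makes it possible to see which syntactic pattern is the most populated.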
Table 4 – Examples of difficulties 
Type of difficulty | Examples (taken from LOCOSSI)
