Parliament Interpreting Corpus), an electronic parallel corpus of source and target
speeches in Italian, English and Spanish (Monti et al., 2005; Bendazzoli & Sandrelli, 2005). The plenary sittings of the European Parliament were chosen as source material for this corpus because they are highly homogeneous: all speeches are delivered in the same formal setting (Marzocchi & Zucchetto, 1997). EPIC allows both parallel and comparable analyses and contains about 180,000 words. It is available online to the whole interpreting community, to help share knowledge of interpreting and even to enhance its teaching.

DIRSI, i.e. Directionality in Simultaneous Interpreting (Bendazzoli & Sandrelli, 2009), includes interpreters’ output into both their native language and their foreign working language. This corpus was compiled from audio recordings of international conferences on health-related subjects held in Italy between 2005 and 2008, and it includes recordings from different sessions (i.e. opening statements, presentations, and closing sessions). The language pair is English and Italian, and five professional interpreters agreed to contribute data. Debates were excluded from the corpus because of their degree of interactivity. The creation of this corpus was made possible by the experience previously gained with the EPIC project.

It is important to add that interpreting corpora created ad hoc by individual researchers for manual analysis are still used today and complement the realm of CIS. The notion of corpus in interpreting studies was initially linked to empirical research based on authentic data (i.e. from real-life interpreting assignments). Because electronic corpora are difficult to compile, that notion still applies to data sets that continue to be analysed manually rather than electronically.

In the following paragraphs, I will describe the main characteristics of interpreting corpora (i.e.
interpreting mode and setting, corpus size, languages and data accessibility) that have recently been compiled or are currently being compiled.

The first CIS projects focused on professional simultaneous interpreting performed in conference settings. Two specific sources of data have been predominant, namely TV broadcasting and the European Parliament (EP). At the EP, for instance, source speeches are interpreted simultaneously into as many as 23 languages (sometimes through relay interpreting), and these speeches can be used for research purposes (Bendazzoli, 2010). Other settings have also been explored, such as festivals, medical conferences, and football press conferences. Simultaneous interpreting is nevertheless not the only interpreting mode that can be analysed. Asian research centres focus more on consecutive interpreting because of data accessibility, their data source being televised press conferences of Chinese political representatives (Wang, 2015). More recent projects also focus on short consecutive interpreting in community settings or on dialogue interpreting. It is also important to note that efforts are being made to develop sign-language corpora despite the difficulty of collecting data owing to anonymity issues (see Metzger & Roy, 2011).

Compared with the spoken part of the British National Corpus (10 million words), interpreting corpora are quite small (Dembry & Love, 2015). Projects based on EP data have yet to reach the size of general reference corpora, although this may simply be a matter of time and labour. It is hoped, however, that CIS projects will develop in international organisations other than the EP so as to diversify interpreting settings. This is already being considered in some organisations, such as the European Commission (Spinollo, 2018; Scardulla, 2016).
Even though current projects carried out in Asia are expected to generate fairly large resources, very large corpora are unlikely to be compiled in the near future.

In terms of languages, we can see a wide range of language combinations, which confirms one of the “special challenges” (Setton 2011: 68) of CIS: multilingualism. English is represented in many studies, but it is encouraging to see that non-European languages such as Hebrew, Japanese and Chinese are represented as well.

The last feature of interpreting corpora I would like to focus on is data accessibility, which has always been an issue in CIS. In the oldest studies mentioned in Setton’s (2011) overview (see Appendix 1), transcripts were rarely made available, and sound files were recorded on tape rather than in digital form. Transcript files were thus hard to access (Diriker, 2004), and the analysis had to be carried out manually. In the same period, some studies were nevertheless based on machine-readable corpora (Cencini, 2002; Fumagalli, 2000), but the transcripts were “not available for outside use” (Setton 2011: 40). In other cases, transcripts are available on CDs (e.g. Vuorikoski, 2004; Monacelli, 2009) or on the web, as is the case for the two corpora mentioned earlier, EPIC and DIRSI. In the future, data accessibility should be facilitated, at least within the research community.