Daniel A. Nkemleke

Date: 10.08.2017

Daniel A. Nkemleke
Department of English
Ecole Normale Supérieure
University of Yaoundé I

The study of language based on examples of “real life” language use, collected, stored and processed via computer
Facilitated by the advent of computer technology (1960s)
Latin: corpus (body): body of text → any collection of more than one text, written or spoken

Before 1940s/1950s: “early corpus linguistics” → corpus-based methodology (“Primitive corpora?”)
Between 1960s and 1980s: a minority of linguists continued corpus-based work (Quirk: SEU, Francis & Kučera: Brown Corpus, Svartvik: London-Lund Corpus)
Computer technology: major support for CL
First African Corpus: 1989 (ICE-East Africa) (Schmied 1989)
Second African Corpus: 1992 CCE (Tiamajou 1993)/ Nigeria??

“Thirty years ago when this research started it was considered impossible to process texts of several million words in length.
Twenty years ago it was considered marginally possible but lunatic.
Ten years ago it was considered quite possible but still lunatic. Today it is very popular.”
(Thomas/Short 1996: 4)

L1 Corpora
Brown Corpus of American English
Lancaster-Oslo/Bergen Corpus (LOB)
London-Lund Corpus
British National Corpus (BNC)
Birmingham Corpus of British English
L2 Corpora
ICE-East Africa (Kenya & Tanzania)
Corpus of Cameroon English
Corpus of Nigerian English ??
Kolhapur Corpus of Indian English
Multinational Corpus Project
International Corpus of English (ICE)

1. Sampling & representativeness
Interest in whole variety of English
Attempts to construct a “representative” sample corpus that maximally represents the variety
Aim: as accurate and reasonable a picture as possible of a language population

2. Finite size
Body of a finite number of words, e.g. 1,000,000
Figure determined at the beginning of the project
Exception: monitor corpus (constant addition of texts)
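
The contrast between a finite sample corpus and a monitor corpus can be sketched in a few lines of Python. This is only an illustration: the `Corpus` class, its `target` parameter and the tiny 10-word target are assumptions made for the example, not part of any real corpus project.

```python
class Corpus:
    """A text collection with an optional word-count target (hypothetical).

    target=None models a monitor corpus, which keeps growing.
    """

    def __init__(self, target=None):
        self.target = target
        self.texts = []
        self.size = 0  # running total of whitespace-separated words

    def add(self, text):
        """Add a text unless it would push a finite corpus past its target."""
        n = len(text.split())
        if self.target is not None and self.size + n > self.target:
            return False
        self.texts.append(text)
        self.size += n
        return True


sample = Corpus(target=10)  # tiny stand-in for a 1,000,000-word target
monitor = Corpus()          # no target: texts can always be added

for text in ["one two three four", "five six seven eight", "nine ten eleven"]:
    sample.add(text)
    monitor.add(text)

print(sample.size, monitor.size)
```

The finite corpus refuses the third text because it would exceed the target fixed at the start, while the monitor corpus accepts everything.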

3. Machine-readable form
Past: “corpus” referred to printed text
Nowadays: “corpus” implies machine-readable
Few in book form (e.g. original London-Lund)
Occasionally other forms of media (microfiche, recordings)

4. Standard reference
Tacitly a corpus constitutes a standard reference
Presupposition: wide availability to other researchers
Direct comparison of results with other varieties

Began in 1992 with the collaboration of two British universities (Birmingham/Liverpool)
Assistance of the British Council in Yaoundé
Target of a million words reached in 1994
Data used for classroom activities/research since then
2005: project benefited from a grant from the Alexander von Humboldt Foundation (AvH)
→ Goal: further development (tagging) of the database (TU Chemnitz)

Provide authentic data for the description of the main features and problems inherent in the variety of English which is written in Cameroon
Provide a source of authentic material for English language teaching/learning in Cameroon
Serve as a database for comparative studies on CamE in relation to other varieties of English

Dialogues
1. Conversations
2. Phone calls
3. Broadcast discussions
4. Classroom lessons
5. Interviews
6. Parliamentary debates
7. Legal cross-examination
8. Business transactions

13 possible ways in which a corpus may be useful
1. Corpora as a source of empirical data
2. Corpora in language teaching and learning
3. Corpora in lexical studies
4. Corpora in grammar studies
5. Corpora in speech research
6. Corpora and semantic studies
7. Corpora in pragmatic and discourse studies
8. Corpora in sociolinguistic studies
9. Corpora and stylistic studies
10. Corpora in historical linguistics
11. Corpora in dialectology and variational studies
12. Corpora in psycholinguistics
13. Corpora in cultural studies

Linguists can make more objective statements about language use in the variety and compare it with other varieties
Nkemleke/Mbangwana (2001)
Nkemleke (2003)
Nkemleke (2004a, 2004b)
Nkemleke (2005)
Nkemleke (2006)
Nkemleke (2007a, 2007b)
Nkemleke (fc: 2008a, 2008b, 2008c)
Schmied/Nkemleke (fc: 2008a, 2008b)
A number of post-graduate projects in ENS/Faculty

CCE data used for classroom activities over the years

Support teachers’ classroom explanations
Learners as researchers
Data-driven learning
Critical look at existing language teaching material

CCE data used for studies on aspects of Cameroon English usage, e.g. Hans-Georg Wolf used data from the corpus in his book English in Cameroon, published in 2001 by Mouton de Gruyter (Berlin/New York).

Keep informed about new words, changing meanings
Call up word combinations, co-occurring words
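
Calling up word combinations and co-occurring words can be sketched as a small concordance and collocate count. This is a minimal illustration in plain Python, not the tool used with the CCE. The function names (`tokenize`, `kwic`, `collocates`), the window sizes and the sample sentences are all invented for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def kwic(tokens, node, width=3):
    """Return key-word-in-context lines for every hit of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines


def collocates(tokens, node, window=2):
    """Count words occurring within `window` tokens of each hit of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


# Invented sample text, standing in for corpus data.
sample = ("The chief opened the meeting. The chief thanked the elders. "
          "Then the village chief closed the meeting.")
tokens = tokenize(sample)

for line in kwic(tokens, "chief"):
    print(line)
print(collocates(tokens, "chief").most_common(3))
```

On real corpus data, raw counts like these are dominated by function words, so a frequency threshold or an association measure would normally be applied before reading the results.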

ICE-Cameroon is on-going
Future possibility of more specialized corpora
e.g. academic texts, fiction

Thank You!
