Phraseology and Culture in English
Appendix I:
1. Frequencies for ten neutral collocations in the British National Corpus, the COBUILD Collocations Database and selected top-level domains on the World Wide Web

The BNC contains c. 100,000,000 words, the COBUILD database slightly over 200,000,000. Comparing the frequencies for these two corpora shows that, as expected, the figures for COBUILD are consistently higher, but also that there still is considerable fluctuation in the frequency of occurrence of individual collocations. The figures for the individual top-level web domains need to be interpreted in this light. They give a rough idea of the proportion of the various domains relative to each other, and they allow a rough estimate of the amount of material looked at in comparison to the 100,000,000 words of the BNC. Thus, the .edu and .uk domains appear broadly comparable in size, and so do the .au and .ca ones, which seem to contain roughly a fourth of the material found in the two bigger ones.

While it is fairly safe to determine rough proportions, it is much more risky to calculate approximate numbers of words. For example, by extrapolation from the BNC, the 7,650 instances of deep breath in the Australian web material would indicate a size of c. 1.34 billion words. Performing the same calculation on early age, on the other hand, we would arrive at the rather different estimate of 5.5 billion words. In view of such fluctuation, estimates should not be based on individual collocations but on aggregate frequencies.

460 Christian Mair

Table 2. Distribution of established collocational markers of Britishness on the Web (figures = rounded percentages)
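The extrapolation described in Appendix I, section 1 can be sketched in a few lines of Python. This is a minimal illustration, not the author's procedure: it assumes that a domain's size is estimated by scaling the 100,000,000 words of the BNC by the ratio of web hits to BNC hits, and it contrasts single-collocation estimates with one based on summed frequencies. The function names and all counts in `counts` are hypothetical and are not taken from the appendix table, which is not reproduced here.

```python
BNC_SIZE = 100_000_000  # approximate BNC size in words, as stated in the text


def extrapolate_size(web_count, ref_count, ref_size=BNC_SIZE):
    """Estimate the size of a web domain (in words) from one collocation.

    Assumes the collocation is equally frequent per word in both sources,
    which is exactly the assumption the text warns is unreliable.
    """
    if ref_count <= 0:
        raise ValueError("reference count must be positive")
    return web_count / ref_count * ref_size


def aggregate_size(pairs, ref_size=BNC_SIZE):
    """Estimate the size from summed frequencies over several collocations."""
    pairs = list(pairs)
    web_total = sum(web for web, _ in pairs)
    ref_total = sum(ref for _, ref in pairs)
    return extrapolate_size(web_total, ref_total, ref_size)


# Hypothetical (web_count, bnc_count) pairs, for illustration only.
counts = {
    "collocation A": (8_000, 500),
    "collocation B": (11_000, 200),
}

for name, (web, ref) in counts.items():
    estimate = extrapolate_size(web, ref)
    print(f"{name}: {estimate / 1e9:.2f} billion words")

total = aggregate_size(counts.values())
print(f"aggregate: {total / 1e9:.2f} billion words")
```

With these invented counts the two single-collocation estimates differ by more than a factor of three, while the aggregate estimate falls between them. That is the kind of fluctuation the text cites as the reason to base estimates on aggregate frequencies.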