2. The principle of emphasizing novel language data
Sinclair (1991) argues that to discover how people really use language, we must observe the language they actually use. The data used in linguistic research must therefore be authentic language, that is, large quantities of naturally occurring text, rather than data obtained through introspection and intuition. He writes (1991: 4): “...the contrast exposed between the impressions of language detail noted by people, and the evidence compiled objectively from texts is huge and systematic. It leads one to suppose that human intuition about language is highly specific, and not at all a good guide to what actually happens when the same people actually use the language.”
Sinclair's view of language data suggests the following:
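(1) Texts should be selected according to the social role they play, not according to whether they can illustrate a particular language point. Unfortunately, many grammarians and other linguists currently select data in order to verify a particular linguistic phenomenon. In other words, when they find a phenomenon interesting, they gather the various usages surrounding it for analysis. This is inadvisable: if we attend only to what is unusual in English, we risk overlooking the more regular, humdrum patterns of the language.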
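(2) The amount of data studied should be large. The larger the corpus, the more precisely it can describe frequently occurring lexical items. As the data grow, our understanding of core expressions changes, and what once seemed important may turn out to be less so once it has been checked against the corpus. A large corpus can reveal what is central and typical, and can distinguish typical usage both from uncommon usage and from merely possible usage. How large, then, should an ordinary corpus be? As a general rule, it should contain at least one million words. According to Leech (1991), the earliest corpora contained about one million words, far more than linguists had actually been working with. The corpus Sinclair (1991) describes in Corpus, Concordance, Collocation contains over seven million words, while the Bank of English in 1997 contained more than 300 million words. Clearly, the larger the corpus, the easier it is to discover the regularities of language use.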
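(3) Observing large amounts of data allows every aspect of language to be analysed from many angles, and frequency is crucial among them. Without frequency information, language cannot be studied. Frequency studies show that some word strings co-occur surprisingly often, and that even so-called fixed expressions display a surprising degree of variability.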
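The frequency information described in (3) can be sketched in a few lines of Python. This is only a minimal illustration: the three-sentence toy corpus and the helper names `tokenize` and `ngram_counts` are assumptions made for the example, and a real study would read millions of words from files.

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def ngram_counts(tokens, n):
    """Count every contiguous run of n tokens (a 'word string')."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Hypothetical toy corpus, for illustration only.
corpus = (
    "It is visible to the naked eye. "
    "The star is invisible to the naked eye. "
    "You can see it with the naked eye."
)

tokens = tokenize(corpus)
word_freq = Counter(tokens)          # frequency of single word forms
trigrams = ngram_counts(tokens, 3)   # frequency of three-word strings

# Note: this simple version lets n-grams run across sentence boundaries.
print(word_freq.most_common(3))
print(trigrams.most_common(2))
```

Ranking the n-gram counts is one simple way to see which word strings recur; it does not by itself show whether the recurrence is more than chance, which would need an association measure.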
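(4) The data must be arranged systematically. The more data there are, the more they need to be organized; without systematic organization, it is difficult to establish the frequency of collocations. Software built around the word form as its unit can retrieve every instance of a given word form, display the words occurring immediately before and after it, and sort the resulting lines alphabetically so that patterns become visible. As Sinclair (1991: 4) puts it: “...the ability to examine large text corpora in a systematic manner allows access to a quality of evidence that has not been available before.”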
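The word-form software described in (4) is essentially a concordancer. The sketch below shows one possible way to build key-word-in-context lines and sort them alphabetically; the function names `kwic`, `left_word` and `sort_by_left`, the context width, and the shortened sample sentences are all assumptions made for illustration.

```python
import re

def kwic(text, node, width=30):
    """Return (left, node, right) context triples for each match of `node`."""
    pattern = r"\b" + re.escape(node) + r"\b"
    lines = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append((left, m.group(0), right))
    return lines

def left_word(line, position=1):
    """Return the word `position` places to the left of the node, or ''."""
    words = re.findall(r"[a-z']+", line[0].lower())
    return words[-position] if len(words) >= position else ""

def sort_by_left(lines, position=1):
    """Sort concordance lines alphabetically on a chosen left-hand word."""
    return sorted(lines, key=lambda line: left_word(line, position))

# Shortened sample text, for illustration only.
sample = (
    "The mite is just visible to the naked eye and feeds on honey bees. "
    "The Greeks were certainly able to see Merope with the naked eye. "
    "It couldn't be seen by the naked eye."
)

# Sorting on the second word to the left groups the lines by preposition.
for left, node, right in sort_by_left(kwic(sample, "naked eye"), position=2):
    print(f"{left:>30} [{node}] {right}")
```

Sorting on different positions to the left or right of the node is what lets recurrent patterns line up visually in the output.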
(5) Adopt a problem-oriented research method. Research of this kind, which draws on corpus data to solve a problem, is what Tognini-Bonelli (2001) calls “corpus-based” research, in contrast to “corpus-driven” research.
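(6) Annotate the corpus so that software can retrieve a category (such as the passive voice, infinitive clauses or complements) rather than a single word form. For example, Biber et al. (1994) counted the frequency of clauses introduced by “that-” and “wh-”, Halliday (1993) used a large corpus to calculate the frequencies of positive and negative clauses, and Kettermann (1997) used an annotated corpus to answer questions about language acquisition.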
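To make point (6) concrete, the sketch below queries a small annotated corpus for a category (the passive) rather than for a word form. It assumes Penn Treebank-style part-of-speech tags; the hand-tagged sentences are hypothetical, and the rule used (a form of "be", optional adverbs, then a past participle) is a rough heuristic chosen for the example, not the method of any study cited above.

```python
BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}

def find_passives(tagged_sentence):
    """Return passive-like sequences: a form of 'be' + adverbs* + VBN."""
    hits = []
    for i, (word, _tag) in enumerate(tagged_sentence):
        if word.lower() not in BE_FORMS:
            continue
        j = i + 1
        # Skip any adverbs (tag RB) between 'be' and the participle.
        while j < len(tagged_sentence) and tagged_sentence[j][1] == "RB":
            j += 1
        if j < len(tagged_sentence) and tagged_sentence[j][1] == "VBN":
            hits.append(" ".join(w for w, _ in tagged_sentence[i:j + 1]))
    return hits

# Hypothetical hand-tagged sentences, for illustration only.
tagged_corpus = [
    [("The", "DT"), ("animal", "NN"), ("was", "VBD"), ("resurrected", "VBN"),
     ("by", "IN"), ("science", "NN")],
    [("It", "PRP"), ("could", "MD"), ("not", "RB"), ("be", "VB"),
     ("seen", "VBN")],
    [("Galileo", "NNP"), ("observed", "VBD"), ("the", "DT"), ("moons", "NNS")],
    [("The", "DT"), ("stars", "NNS"), ("are", "VBP"), ("easily", "RB"),
     ("resolved", "VBN")],
]

passives = [hit for sent in tagged_corpus for hit in find_passives(sent)]
print(passives)
print(len(passives), "passive-like sequences in", len(tagged_corpus), "sentences")
```

The same search would be impossible on raw word forms alone, because the participles differ from sentence to sentence; it is the annotation that makes the category retrievable.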
3. The principle of emphasizing the description of units of meaning
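Sinclair (1991) observes that the meaning of some words changes when they occur in phrases. In “have a baby”, “have a bath”, “have a cigarette”, “have such conduct” (that is, tolerate such conduct), “have a meal”, “have a severe headache” and “have a walk”, “have” is a frequently used verb, but in these phrases it loses most of its original meaning: the meaning is no longer confined to the word but spreads over the whole phrase. This phenomenon is called “progressive delexicalization”.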
Sinclair's view of describing units of meaning suggests the following:
(1) in describing linguistic units, the restricted context must be fully taken into account;
(2) in studying formulaic language, attention should be paid to how the meaning of the same chunk changes across different contexts.
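Take “naked eye” as an example. The British National Corpus contains 148 examples of “naked eye”. Analysis of these examples shows that the contexts in which “naked eye” typically occurs are not fixed, but they are restricted, as follows: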
Context 1: “naked eye” co-occurs with “the”, e.g.: We merely became accustomed to the general life of the common birds and animals, and to the appearances of trees and clouds and everything upon the surface that showed itself to the naked eye.
Context 2: “the naked eye” co-occurs with “to”, e.g.: The legs are flailing wildly—tiny stretches of insect flesh—no thicker than a hair to my naked eye, but obviously larger than life to this poor,
Context 3: “the naked eye” co-occurs with “with”, e.g.: The interesting point is that the Greeks were certainly able to see Merope with the naked eye, whereas today this is virtually impossible.
Context 4: “the naked eye” co-occurs with “by”, e.g.: It would have been no use asking him whether he thought there was a unifying purpose in life, whether it could really be chance that an animal so small that it couldn’t be seen by the naked eye could die millions of years ago in the depths of the sea and be resurrected by science to prove a man innocent or guilty.
Context 5: “the naked eye” co-occurs with “via”, e.g.: It is known more usually under the name Gill-maggot, because of the length and shape of the female’s egg-sacs which look like miniature white maggots when viewed via the naked eye.
Context 6: “the naked eye” co-occurs with “visible”, e.g.: The mite is just visible to the naked eye and feeds on honey bees and their grubs by sucking their body fluids.
Context 7: “the naked eye” co-occurs with “invisible”, e.g.: Through his telescope Galileo observed more things in the heavens than had ever been dreamed of: moons of Jupiter and myriads of stars invisible to the naked eye.
Context 8: “the naked eye” co-occurs with “obvious”, e.g.: The Small Cloud is very obvious with the naked eye, and binoculars show it well, though admittedly it cannot rival the splendour of the Large Cloud; it has no well-defined shape, but is easy to resolve, at least in part.
Context 9: “the naked eye” co-occurs with “separable”, e.g.: The pairs are separable with the naked eye, but closer binaries—or, of course, optical doubles—require binoculars or a telescope.
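The nine contexts above can be summarized by counting the words that occur just to the left of “naked eye”. The sketch below does this for shortened versions of the nine examples only, not for all 148 corpus lines, so its counts say nothing about the corpus as a whole; the function name `left_collocates` and the three-word span are assumptions made for the illustration.

```python
from collections import Counter
import re

def left_collocates(lines, node, span=3):
    """Count the words found within `span` tokens to the left of `node`."""
    node_tokens = node.lower().split()
    counts = Counter()
    for line in lines:
        tokens = re.findall(r"[a-z']+", line.lower())
        for i in range(len(tokens) - len(node_tokens) + 1):
            if tokens[i:i + len(node_tokens)] == node_tokens:
                counts.update(tokens[max(0, i - span):i])
    return counts

# Shortened versions of the nine examples quoted in the text.
examples = [
    "everything upon the surface that showed itself to the naked eye",
    "no thicker than a hair to my naked eye",
    "the Greeks were certainly able to see Merope with the naked eye",
    "it couldn't be seen by the naked eye",
    "when viewed via the naked eye",
    "The mite is just visible to the naked eye",
    "myriads of stars invisible to the naked eye",
    "The Small Cloud is very obvious with the naked eye",
    "The pairs are separable with the naked eye",
]

counts = left_collocates(examples, "naked eye")
for word, freq in counts.most_common():
    print(f"{word:<10} {freq}")
```

Even on this tiny sample the output groups into a determiner slot, a preposition slot, and a slot for words of seeing or visibility, which is the restricted context the examples illustrate.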