Reading in a Foreign Language
April 2010, Volume 22, No. 1, ISSN 1539-0578, pp. 222–236

Words as species: An alternative approach to estimating productive vocabulary size
Paul M. Meara
Swansea University
United Kingdom
Juan Carlos Olmos Alcoy
University of Dundee
United Kingdom
Abstract
This paper addresses the issue of how we might be able to assess productive vocabulary size in second language learners. It discusses some previous attempts to develop measures of this sort, and argues that a fresh approach is needed in order to overcome some persistent problems that dog research in this area. The paper argues that there might be some similarities between assessing productive vocabularies—where many of the words known by learners do not actually appear in the material we can extract them from—and counting animals in the natural environment. If this is so, then there might be a case for adapting the capture-recapture methods developed by ecologists to measure animal populations. The paper reports a preliminary attempt to develop this analogy.
Keywords: productive vocabulary, capture-recapture, word counts, ecological models
Paul Nation’s (1990) Vocabulary Levels Test has perhaps been the single most important development in vocabulary acquisition research in the last 20 years. The test provides a rough estimate of a learner’s receptive vocabulary size in the form of a vocabulary profile. Simple to use, and easy to understand, it has been widely adopted by researchers around the world, and has rapidly become the de facto standard vocabulary size test. The vocabulary size estimates that it produces appear to be remarkably reliable and robust. This has led to the Vocabulary Levels Test being used in a very large number of empirical studies where vocabulary size is a critical variable, and particularly in studies that have examined the relationship between vocabulary size and reading ability in second language (L2) learners. Inevitably, however, the development of a standard assessment tool of this sort opens up other areas of research, and the Vocabulary Levels Test is no exception to this generalisation. The availability of a reliable measure of receptive vocabulary size leads to some very interesting questions about the relationship between receptive vocabulary and active productive vocabulary. This issue is one extensively addressed in Nation’s work.
The basic distinction between active and passive vocabulary is a staple idea that is widely taken for granted in introductory books on vocabulary acquisition, and in instructional texts designed to teach vocabularies. Some writers, for example, go so far as to list vocabulary items that need to be acquired productively and other vocabulary items that only need to be learned for recognition purposes. Despite the fact that many researchers have written about this topic at a theoretical level (Corson, 1983, 1995; Laufer, 1998; Melka, 1997; Melka Teichroew, 1982, 1989), the idea of productive vocabulary remains a fundamentally elusive one. The main reason for this is that it has proved surprisingly difficult to develop simple and elegant tests of productive vocabulary size that have any degree of face validity, and this makes it difficult to answer, with confidence, questions such as: How are receptive and productive vocabulary related? Do receptive and productive vocabulary grow at the same rate? Are there thresholds in the development of a passive vocabulary? Not surprisingly, perhaps, given the widespread use of Nation’s Vocabulary Levels Test to assess receptive vocabulary, the approach most widely used in the recent research literature that investigates productive vocabulary in L2 learners is an adaptation of the original Vocabulary Levels Test usually known as the Productive Levels Test (Laufer & Nation, 1999). Laufer has used the two tests in combination to make some very interesting theoretical claims about the relationship between receptive and productive vocabulary, and how the two facets of vocabulary knowledge develop at different rates (Laufer, 1998). However, the data provided by the Productive Levels Test are much more difficult to interpret than the data provided by the original Vocabulary Levels Test, and in our view it is worthwhile looking at alternative approaches to estimating productive vocabulary size. This is not to denigrate the usefulness of the Productive Levels Test approach, of course, but rather because we think that productive vocabulary may be a more complicated notion than it appears to be at first sight, one that would benefit from being examined from a number of different and perhaps unconventional points of view.
In our previous research, we have developed three main ideas, which we think might allow us to “triangulate” the idea of productive vocabulary size. For obvious reasons, most traditional studies of productive vocabulary require learners to produce short texts for evaluation, but this material is difficult to collect, particularly when you are dealing with low level learners who are reluctant to produce extended texts. Our first solution to this problem was to move away from using written texts as the raw data for research on productive vocabulary size. We (Meara & Fitzpatrick, 2000) argued that ordinary texts generated by learners tended to contain very large numbers of highly frequent words, and very few infrequent words, which were the true indicators of a large productive vocabulary. We tried to get round this problem by getting learners to generate “texts” derived from a set of word association tests called Lex30. These data typically consisted of relatively infrequent L2 words that could be profiled using standard vocabulary assessment tools such as Range (Heatley, Nation, & Coxhead, 2002), and we argued that these profiles provided a better picture of the scope of a testee’s productive vocabulary than other, more traditional test types did. Unfortunately, although the test scores tended to correlate with tests of receptive vocabulary size, it was not obvious how the profiles provided by the Lex30 test could be converted into proper estimates of productive vocabulary size.
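To make the profiling step concrete, the sketch below shows the general shape of a frequency-band profile of the kind that a tool like Range produces for Lex30 responses. It is an illustration only, not the Range software itself: the band word lists are tiny invented placeholders, and a real profile would be built from Nation’s full 1,000-word frequency band lists.

    # A minimal sketch of frequency-band profiling. The band lists below are
    # hypothetical fragments, standing in for full 1,000-word frequency bands.
    from collections import Counter

    BANDS = {
        "1k": {"the", "house", "water", "time"},   # illustrative only
        "2k": {"lonely", "forest", "visitor"},     # illustrative only
    }

    def profile(responses):
        """Count how many response words fall into each frequency band."""
        counts = Counter()
        for word in responses:
            band = next((b for b, ws in BANDS.items() if word in ws), "off-list")
            counts[band] += 1
        return counts

    print(profile(["the", "lonely", "philosopher", "water"]))
    # Counter({'1k': 2, '2k': 1, 'off-list': 1})

A profile of this sort tells us how a learner’s output is distributed across frequency bands, but, as noted above, it does not by itself yield a single vocabulary size figure.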
In our second approach to estimating productive vocabulary (Meara & Bell, 2001), we returned to using texts generated by L2 writers, and attempted to develop an “extrinsic” measure of vocabulary richness. This paper analysed sets of short texts produced by L2 learners, and for each text generated a curve that described the incidence of “unusual” words in short segments of text. We then showed that these curves could be summarised in terms of a single parameter, λ, and argued that this parameter might be related to overall productive vocabulary size. This approach successfully distinguished between learners of English at different proficiency levels, but as with the Lex30 test, Meara and Bell were not able to establish a direct, quantifiable relationship between λ and overall productive vocabulary size.
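The details of the curve-fitting are not reproduced here, but the following sketch shows one simple way λ could be estimated, on the assumption that λ is the mean number of “unusual” words per fixed-length segment (the maximum-likelihood estimate if the segment counts are Poisson-distributed). The segment length, the stand-in list of “common” words, and the function name are all assumptions made for the illustration, not taken from Meara and Bell’s implementation.

    # A hedged sketch: estimate lambda as the mean count of "unusual" words
    # per fixed-length segment (the MLE for a Poisson mean). COMMON is a tiny
    # stand-in for a real high-frequency word list; segment_len is assumed.
    COMMON = {"the", "a", "and", "in", "was", "there", "bears", "lived"}

    def estimate_lambda(text, segment_len=10):
        tokens = text.lower().split()
        segments = [tokens[i:i + segment_len]
                    for i in range(0, len(tokens), segment_len)]
        counts = [sum(w not in COMMON for w in seg) for seg in segments]
        return sum(counts) / len(counts)   # mean "unusual" words per segment

    print(estimate_lambda("Once upon a time there was a dark and lonely "
                          "wood where three bears lived"))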
In our third approach (Meara & Miralpeix, 2007) we attempted to estimate productive vocabulary directly by looking at the frequency distribution of words used by L2 writers, and comparing these profiles to a set of theoretical profiles derived from Zipf’s law (Zipf, 1935). Meara and Miralpeix argued that it might be possible to estimate a learner’s productive vocabulary size by identifying a theoretical vocabulary profile that closely matched the actual data produced by the learner. This general approach proved to be solid enough to distinguish between advanced and less advanced learners. More importantly, however, this approach actually allows us to quantify the productive vocabulary that seems to be behind a particular text. For example, it allows us to tentatively make statements like “the text in Example 1 implies a productive vocabulary of around 6,400 words.” This is a significant advance, which opens up a number of promising avenues of research, but it rests on a number of assumptions about the way L2 learners acquire words, which may not be fully justified. A sketch of the profile-matching logic is given below, after Example 1.
Example 1. V-Size estimates that the following text was generated by a speaker with a productive vocabulary of at least 6,400 words.

Once upon a time there was a dark and lonely wood, where three bears lived. The bears lived in a small cottage at the end of a dark and lonely road, where few people ever strayed. The bears liked it a lot. They did not get many visitors, but that was fine. The rest of the time they kept to themselves, and went about their business in a calm and peaceful way.
Father Bear was the one who liked the dark and lonely bit best. He was a philosopher by nature, who loved to read dark and lonely poetry written in the dead of Winter by Scandinavian poets who also lived in dark and lonely woods, and generally suffered from Angst. Mother Bear didn’t have much time for Angst. She was practical and organised, and liked the dark and lonely wood because nothing ever happened there to disturb her domestic routine. Yes, it would have been nice if Father Bear did a bit more of the cooking and cleaning, and yes, it would have been nice if Tesco had a branch at the edge of the wood, but it was better than having noisy neighbours who bothered you all the time. Baby Bear still hadn’t decided if he liked the dark and lonely wood or not. It was scary at night, and it was easy to get lost in the wood if you forgot to leave your marks on the trees where the paths split. But Baby Bear had been to the town once too, and he definitely did not like it. Not one bit.
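The sketch below illustrates the general logic of the profile-matching approach applied to a text such as Example 1: generate a theoretical band-coverage profile for each candidate vocabulary size under Zipf’s law, then pick the candidate whose profile lies closest to the observed one. This is an illustration only, not the V-Size implementation; the band boundaries, the candidate range, the function names, and the Euclidean distance metric are all assumptions made for the example.

    # A hedged sketch of Zipf-based profile matching (not the V-Size tool).
    # Band boundaries, candidate sizes, and the distance metric are assumed.
    import math

    def zipf_profile(vocab_size, bands=(1000, 2000, 5000)):
        """Expected share of tokens in each frequency band if word
        frequencies follow Zipf's law (frequency proportional to 1/rank)."""
        weights = [1 / r for r in range(1, vocab_size + 1)]
        total = sum(weights)
        profile, prev = [], 0
        for b in bands:
            cut = min(b, vocab_size)
            profile.append(sum(weights[prev:cut]) / total)
            prev = cut
        return profile

    def estimate_size(observed, candidates=range(1000, 10001, 500)):
        """Pick the candidate vocabulary size whose theoretical profile is
        closest (Euclidean distance) to the observed band profile."""
        return min(candidates,
                   key=lambda v: math.dist(observed, zipf_profile(v)))

    # e.g. a text whose tokens are 62% first-1,000 words, 12% second-1,000
    # words, and 10% 3k-5k words (invented figures for illustration):
    print(estimate_size([0.62, 0.12, 0.10]))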
Obviously, it would be very useful to have a tool that would allow us to estimate a learner’s productive vocabulary size with some degree of confidence. For this reason, we have also been pursuing other approaches to estimating vocabulary size. Our hope is that the different approaches will all turn out to provide answers that are broadly similar, and if we could achieve this, then it might be possible to develop a reliable, practical test of productive vocabulary size, which would allow us to take further the ideas raised in Laufer’s (1998) paper. This paper sketches an approach that is rather different from the approaches we have developed in our previous work, but one that we feel is very much in the spirit of Paul Nation’s “thinking outside the box” approach to vocabulary testing.
Estimating Population Sizes in the Field
The main problem with estimating productive vocabulary size is that it is extremely difficult to get all the data that we need from our participants. If we were dealing with learners with very small vocabularies, then it might be possible to devise a set of tests that assessed whether our learners could produce each of the words in a short list of target words that we are interested in. In practice, however, this only works where we are dealing with very small vocabularies. In real testing situations, it is logistically impractical to test the entire vocabulary of a learner who has more than a very elementary vocabulary. In this paper, for example, we are interested in learners of Spanish. Threshold Level Spanish (Slagter, 1979) comprises a lexicon of around 1,500 words, which gives learners only a very limited level of competence in Spanish. Testing vocabulary exhaustively at this level is difficult, though it is just about feasible with very co-operative participants. Testing the vocabulary of more advanced participants becomes increasingly difficult as their vocabulary grows. Consequently, if we want to test the vocabularies of even moderately advanced students, we have no option but to resort to sampling methods, and to extrapolate from the results we get when we test a small number of words. Obviously, the trick here lies in devising a sampling method that is appropriate and transparent. We may not be able to get L2 learners to produce for us all the words that they know, but we might be able to develop a testing methodology that allows us to extrapolate meaningfully from the words that we can elicit.
This problem is not unique to linguistics. Analogous problems also occur in other areas of study, and are particularly important in ecology, where we want to count the number of animals in a given habitat area. A typical problem of this sort is when we want to estimate the number of deer inhabiting a forest, the number of elephants occupying a national park, or the number of cockroaches infesting a hotel. Simply counting the animals is not straightforward: The animals are not co-operative and do not line up in a way that allows us to number them reliably. This makes it notoriously difficult to make good estimates of animal populations, a problem that can have serious consequences if we are trying to manage the population and control the number of animals that a particular environment can provide for, or, as in the case of the cockroaches, if we have to eliminate them altogether.
Ecologists have developed a number of methods that allow them to resolve this problem. All of these methods rely on capturing a small number of animals, and then extrapolating this basic count to an estimate of the actual number of animals that could have been caught. The basic approach is known as the capture-recapture methodology, first developed by Petersen (1896), and further developed by Lincoln (1930). In this approach, we first develop a way of capturing the animals we are interested in, and standardise it. Suppose, for example, that we want to count the number of fish in a river. We could identify a suitable stretch of river to investigate, and then distribute traps that will catch the fish without harming them. We leave the traps out for a set time, overnight, for instance, and count the number of fish that we have trapped. We then mark these animals in a way that will allow us to identify them, before releasing them back into the wild. The next night, we carry out the same counting exercise, enumerating the fish trapped overnight. This gives us three numbers: We have N, the number of fish captured on Day 1; M, the number of fish captured on Day 2; and X, the number of fish that were captured on both occasions. Petersen argued that it was possible to extrapolate from these figures to the total number of fish in the stretch of river. Petersen’s estimate is calculated as follows:
E = (N * M) / X
That is, Petersen’s estimate of the size of the fish population is the product of the two separate counts divided by the number of fish counted on both occasions. A simple example will make this idea more concrete. Suppose that on Day 1 we count 100 fish in a 10 mile stretch of river, and we mark them all. On Day 2, we find 60 fish, 20 of which were also noted on Day 1. Petersen’s estimate of the number of fish inhabiting the stretch of river would be

E = (100 * 60) / 20 = 6,000 / 20 = 300
If the river is actually 100 miles long, with similar conditions throughout, then our 10 mile stretch reprents a 10% sample of the whole river, so we could extrapolate that there are about 3,000 fish in the entire length of the river.
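For readers who prefer to check the arithmetic in code, the worked example can be reproduced in a few lines of Python. This is a minimal sketch: the function name is ours, but the formula is exactly Petersen’s estimate as given above.

    # The Lincoln-Petersen estimate, reproducing the worked example above.
    def petersen_estimate(day1_count, day2_count, recaptured):
        """E = (N * M) / X: population estimate from two capture occasions."""
        if recaptured == 0:
            raise ValueError("no recaptures: the estimate is undefined")
        return (day1_count * day2_count) / recaptured

    stretch = petersen_estimate(100, 60, 20)  # 300.0 fish in the 10 mile stretch
    river = stretch * (100 / 10)              # extrapolated to the 100 mile river
    print(stretch, river)                     # prints: 300.0 3000.0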
There are a number of points to make about this estimate. Firstly, the estimate is quite a lot larger than the totals counted on either of the two data collection times. Secondly, it assumes that the way we counted the fish was a reasonable one, one that gave us a good chance of capturing the fish we want to count, and that the 10 mile stretch we have selected represents in some way the entire river. Thirdly, the mathematics only works in a straightforward way if we assume that the two collection times are equivalent, and if each animal has an equal chance of being counted on both collection times. The population of fish needs to be constant from Day 1 to Day 2—if half our fish were killed by otters, or died from poisoning overnight, then Petersen’s model would simply not apply. Finally, we are assuming that the data collection on Day 2 is “equivalent” to the data collection on Day 1, and so on. If these assumptions do not hold, then the model will not work, but if the assumptions are broadly correct, then the two capture events allow us to make a rough estimate of the number of fish in the river, even though we are not able to count every single one of them, and even though we only sampled a part of the entire river.
Petersen’s method has been widely used in ecological studies, where researchers have been interested in estimating the size of elusive animal populations, and it turns out to be surprisingly accurate and reliable. Seber (1982, 1986) provided a number of examples of how the method has been used in practice.
The question we ask in this paper is whether it might be possible to adapt this approach to making estimates about productive vocabulary size. At first sight, it seems unlikely that this ecological approach would provide a good analogy for what happens with words. Words are not animals, and their characteristics are very unlike those of fish or elephants. Indeed, you could argue that words are not entities at all—rather they are processes or events, which need to be counted in ways that are different from the ways we use to count objects. Nevertheless, there seems to be a case for exploring this idea a little further, before we reject it out of hand.
