
Journal of Chinese Language and Computing, 15 (3): (161-171)
Several Discussions on Chinese Letter-word Phrases*
Zheng Zezhi 1  Zhang Pu 2
1. Xiamen University, Xiamen, Fujian, China
2. Beijing Language and Culture University, Beijing, China
; zhangpu@
Based on an investigation of the usage of letter-word phrases in the “People’s Daily” (2002), this paper presents several discussions on letter-word phrases, including classifying the ELWPs occurring in the corpus, defining ELWPs, analyzing and categorizing mono-alphabetic ELWPs and digital ELWPs, and exploring ELWP parallel structures. We hope the work will benefit Chinese information processing and the standardization of letter-word phrases.
ELWP, Letter-word phrase, mono-alphabetic ELWP, digital ELWP, parallel structure
1 Introduction
Letter-word phrases have been studied since 1994, when Yongquan Liu published the first paper on lettered-words, “Discussing lettered-words” (Yongquan Liu, 1994.10). As lettered-word phrases emerged, more and more people paid attention to this new language phenomenon, and the study broadened from the form of letter-word phrases to research on their pronunciation and semantics. These days, the Chinese-speaking districts of Southeast Asia and China are the main areas using letter-word phrases. The leading research questions about letter-word phrases are how to standardize lettered-words, how to select them for dictionaries, and how to read them.
Although many people study letter-word phrases, almost all of their studies are illustrative or purely qualitative. Research that combines quantitative and qualitative analysis is rare, and reports on the usage of letter-word phrases in large-scale real texts are rarer still. So far we do not know clearly the semantic classes, the pragmatic status, or the range of use of letter-word phrases. We have investigated the usage of foreign letters in the “People’s Daily” (2002) (Runzhi Guan, et al., 2005; Zezhi Zheng, et al., 2005). Based on that investigation, this paper gives a general description of the usage of foreign letters in Chinese texts and then discusses mono-alphabetic ELWPs, digital ELWPs, and ELWP parallel structures. We hope the study will benefit Chinese information processing and the standardization of letter-word phrases.
* This is a revised version of the paper presented at the 6th Chinese Lexical Semantics International Congress of Linguists, held at Xiamen, China in April 2005. I am grateful for the support from the project (serial number: 04L2004-01-01-03) of the Research Center of National Language Monitoring Agency (press & books), which funded the research reported in this study. The authors are responsible for any remaining errors and inadequacies in the paper.
2 The definition of ELWPs
For lettered-words, Yongquan Liu’s definition was (Yongquan Liu, 2002): “The words which are composed of Roman alphabets (including Chinese pinyin alphabets) or Greek alphabets or a composite of Roman alphabets and symbols and figures and Chinese characters or a composite of Greek alphabets and symbols and figures and Chinese characters.” This definition covers the main properties of lettered-words, but it is not practical for automatically extracting and recognizing lettered-words.
The reasons why we use the term “letter-word phrases” rather than “lettered-words” are: 1. The letter strings in lettered-words are almost all abbreviations of foreign words; 2. Most of the time, letter strings must be joined with Chinese characters to denote a concept, for instance to express a proper noun or a term; 3. The dividing line between Chinese words and phrases is vague.
As a matter of fact, neither “lettered-words” nor “letter-word phrases” covers all usages of foreign letters in Chinese texts, such as 妫(音:Guī), v+m, .dbx, .dll, .doc, .eml, (C), C, [M], c:\io.sys, (c), c:\kkk, c:\windows, c:\windows\, C:\Winnt\system32, etc. These are neither lettered-words nor letter-word phrases. So we have to know all usages of foreign letters in order to make clear what letter-word phrases are. Herein we present an engineering definition of lettered words or phrases, ELWP for short. An ELWP is a character string that appears in Chinese texts and consists of word-symbols and mark-symbols, or of word-symbols, mark-symbols, and Chinese characters. The character string has a definite sense and syntactic function (such as 卡拉OK, CD盘, VISA卡, HSK, 3D动画, ISO9000认证, IEC标准, etc.). As a word or a phrase, it is self-contained: in texts its letter sequence cannot be changed, and nothing can be inserted into or deleted from its composition.
In the definition, we use two terms, as follows:
•Word-symbol: It refers to “the minimum unit of pinyin characters or phonetic notation symbols” (GB/T 12200.2-94, Part II of the “Chinese information processing glossary”: Chinese and Chinese characters). It includes Roman alphabets, Greek alphabets, Cyrillic alphabets, Japanese katakana and hiragana, etc.
•Mark-symbol: It refers to punctuation marks, money-symbols ($, ¢, £, ¤, ¥, €, ﹩, $, ¢, £, ¥, etc.), measure-symbols (℃, ℉, ㏕, ㎎, ㎏, ㏒, ㏑, ㎜, ㏎, ㎝, ㏄, ㎞, ㎡, ℅, etc.), numeral symbols (Arabic numerals and Roman numerals), calculating-symbols (+, -, /, etc.) and other symbols (№, ℡, ™, &, *, #, ©, ®, etc.).
According to the definition, an ELWP can be an Internet address, an e-mail address, a computer file name or file path, a computer virus name, a formula that contains word-symbols, a figure + measure unit, foreign words or phrases, Chinese pinyin, all kinds of proper nouns (standard names, agreement names, commodity names, brand names, company names, code names, etc.), lettered words, etc.
In order to recognize ELWPs automatically, we have also presented a formal definition of ELWPs (Zezhi Zheng, Pu Zhang, 2005).
Hereinafter, we use ELWP to denote the engineering definition of lettered words or phrases, and use lettered-word phrases to denote words or phrases such as “DVD机, CT, etc.”.
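The formal definition cited above is not reproduced in this paper. As a rough illustration only, the surface part of the engineering definition (a string built from word-symbols, optionally with mark-symbols and Chinese characters) can be sketched in Python. The function names `char_class` and `looks_like_elwp`, and the use of Unicode character names and categories to approximate word-symbols and mark-symbols, are assumptions of this sketch, not the authors' procedure. The check covers composition only; it cannot decide whether a string has a definite sense and syntactic function.

```python
import unicodedata

# Script prefixes used here to approximate the paper's "word-symbols"
# (Roman, Greek, Cyrillic letters and Japanese kana). Assumption of this sketch.
WORD_SCRIPTS = ("LATIN", "FULLWIDTH LATIN", "GREEK", "CYRILLIC",
                "HIRAGANA", "KATAKANA")


def char_class(ch: str) -> str:
    """Classify one character as 'chinese', 'word', 'mark', or 'other'."""
    name = unicodedata.name(ch, "")
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "chinese"
    if ch.isalpha() and name.startswith(WORD_SCRIPTS):
        return "word"
    # Numbers (N*), punctuation (P*) and symbols (S*) stand in for mark-symbols.
    if unicodedata.category(ch)[0] in "NPS":
        return "mark"
    return "other"


def looks_like_elwp(s: str) -> bool:
    """Necessary surface condition only: at least one word-symbol, and
    nothing besides word-symbols, mark-symbols and Chinese characters."""
    classes = {char_class(ch) for ch in s}
    return "word" in classes and "other" not in classes


if __name__ == "__main__":
    for candidate in ["卡拉OK", "ISO9000认证", "c:\\windows", "维生素"]:
        print(candidate, looks_like_elwp(candidate))
```

A string that passes this check is only a candidate; deciding whether it is a self-contained word or phrase still requires the contextual analysis described in the paper.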
Tab.1 Some examples of ELWPs
日本电话电信公司DoCoMo “A.O.C”字样
格C 盘(Harm.FormatC.bwp20) 天然维生素E
http:// “教育考试服务中心”
(EducationalTestingService,ETS) c:\bbb
著名品牌“EPSON 爱普生”
丰富的维生素E 、
美国教育考试服务处(ETS)举办的、 给“E 学生”、
著名品牌“EPSON 爱普生”的商标 销售假冒EPSON(爱普生)墨盒 唱“卡拉OK”
3 The usage state of foreign letters in Chinese texts
According to the definition of ELWPs, we examined the usage of foreign letters in Chinese texts and then classified the usage states (shown in Fig.1). Previously, Yongquan Liu (2002) mentioned five classification methods. Our classification is based on the usage state of foreign letters and takes an engineering point of view. It is valuable for Chinese information processing and for the standardization of letter-word phrases.
In Fig.1, the part containing statistical values is the part that we will analyze. The ELWPs with punctuation are discussed in the paper by Runzhi Guan et al. (2005).
The ELWPs in Fig.1’s big weak frame (the ELWPs with Chinese characters and without figures and punctuation) are letter-word phrases, as accepted by many scholars. The ELWPs in Fig.1’s small weak frame are regarded as abbreviations of foreign words, and not as lettered-words (Mingyang Hu, 2002). We consider that they resemble the ELWPs in Fig.1’s big weak frame and should be regarded as letter-word phrases or lettered-words.
4 The mono-alphabetic ELWP
The mono-alphabetic ELWPs are the ELWPs that contain only one word-symbol, such as “e龙公司, G网, K粉, Q号, C盘, D大调小提琴协奏曲, A师, e交通, etc.”.
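As a minimal sketch of this criterion, the test below counts the word-symbols in a candidate string and accepts it when there is exactly one. For simplicity it assumes that word-symbols are restricted to ASCII Roman letters; the function names are hypothetical and are not taken from the original study.

```python
def count_letters(s: str) -> int:
    """Count ASCII Roman letters; other word-symbols (Greek, kana, ...)
    are ignored in this simplified sketch."""
    return sum(1 for ch in s if ch.isascii() and ch.isalpha())


def is_mono_alphabetic(s: str) -> bool:
    """True when the string contains exactly one letter."""
    return count_letters(s) == 1


if __name__ == "__main__":
    for candidate in ["e龙公司", "G网", "D大调小提琴协奏曲", "HSK", "CD盘"]:
        print(candidate, is_mono_alphabetic(candidate))
```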
Among letter-word phrases, the mono-alphabetic ELWPs have their own particularities: letters such as A, B, C, D can be used as serial numbers or class code names, and may form letter-word phrases when combined with Chinese characters. We therefore examine them separately, hoping to find their semantic and pragmatic usages. The roles of the letter in mono-alphabetic ELWPs are shown in Fig.2.
There are 358 different mono-alphabetic ELWPs (used 1545 times, in 1138 texts) in the “People’s Daily” (2002).
According to Tab.2, a single letter is used in ELWPs to indicate a sequence number, a class code, or a team number. Because all three usages have, in a way, an ordering function, sometimes we cannot separate them clearly. Some of them, however, can be distinguished entirely, such as:
a) As a team number: C组, 世界杯D组, 世界杯G组, E组, F组, G组, H组, N组, etc.
b) As a class code: F字签证(访问类签证), A型乙肝, 摩托罗拉V字头(摩托罗拉V字头是时尚类手机), etc.
c) As a sequence number: A级, A级标准, A级影片, D级危房, etc.
d) What is the difference between “D类危房” and “D级危房”? The letters in “维生素A、维生素B、维生素C” are sequence numbers and also class codes, and the same holds for “印楝素A、印楝素B、印楝素D”.
A letter used as a symbol, such as: C馆, C国, C盘, C票, C区, A店, A国, A里, 阿Q, 阿Q精神, B立柱, etc.
Borrowing-form ELWPs use the shape of a letter to form a letter-word phrase, such as “H型钢, O型着陆, S弯度, S形, S形流水, S形平行状, S形舞, T型台, T恤, T恤衫, T字台, etc.” Such letter-word phrases may come from Chinese, such as “T型台”, or from a foreign language, for instance “T恤衫”.
Fig.2 The letter role in the mono-alphabetic ELWP
Tab.2 The state of mono-alphabetic ELWPs
A foreign word abbreviated to one letter, which then forms an ELWP, such as “C网(窄带CDMA的简称), e城便利站, e城便利站终端机, E电视台, E化, e交通, e教育, 四通四S, 窄带C网, etc.” There are not many such ELWPs, but they are common letter-word phrases.
Pinyin letters, for instance “学习b, p, m, f, d, t, n, l八个声母, a, o, e, i, u, ü六个韵母.” A pinyin letter cannot be a lettered-word or a letter-word phrase because it has no meaning and is used only as phonetic notation. Pinyin letters are nevertheless used in texts and are therefore among the objects we examine.
Parallel structures refer to mono-alphabetic ELWPs that are used in parallel constructions in texts, such as “A、B、C3组, A、B、O血型, A、B角制, A、B、H流通股, 准驾车型为A、B、N、P, etc.”. Taken out of their sentences, the letters lose their meanings, but within sentences they are used normally. They should be regarded as lettered-words or letter-word phrases.
Unit names: the letter in a mono-alphabetic ELWP can be used as a unit name, for instance “K” in the sentence “理论上也有100多K,……”.
Chinese pinyin letters are not lettered-words or letter-word phrases, but when they are used as an abbreviation, such as “HSK” (Hanyu Shuiping Kaoshi, the Chinese proficiency test), we consider them to be lettered-words or letter-word phrases, because from word form to usage their functions are similar to those of abbreviations of foreign words.
5 The digital ELWP
Arabic numerals have become an inseparable part of Chinese writing, but no one has examined the usage of blends of letters and numbers, or of letters, numbers, and Chinese characters, in Chinese texts. In this section we describe and discuss these usages, based on the investigation of digital ELWPs in the “People’s Daily” (2002).
In this paper, digital ELWPs refer to the ELWPs with numbers and without punctuation, such as “Win2000, 65MW, 2B铅笔, 丰田8A, 运8F400飞机, system32, TOP500, CDMA2000标准, 摩托罗拉V70手机……”.
There are 691 different digital ELWPs used in the “People’s Daily” (2002); they account for 10% of all ELWPs. They can be divided into two groups: one in which the numbers are prior to the letters and another in which the numbers are posterior to the letters. We can thus examine the different functions of numbers at different positions in ELWPs.
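The paper does not state the exact rule used to assign a digital ELWP to one of the two groups. One plausible heuristic, offered here only as an assumption, is to compare the position of the first digit with the position of the first letter. The function name `number_position` and the restriction to ASCII letters and digits are choices of this sketch.

```python
import re
import unicodedata
from typing import Optional

LETTER = re.compile(r"[A-Za-z]")
DIGIT = re.compile(r"[0-9]")


def number_position(elwp: str) -> Optional[str]:
    """Return 'prior' if the first digit precedes the first letter,
    'posterior' if it follows, and None if the string is not a digital
    ELWP candidate (punctuation present, or no letter, or no digit)."""
    # Digital ELWPs are defined as containing numbers and no punctuation.
    if any(unicodedata.category(ch).startswith("P") for ch in elwp):
        return None
    letter = LETTER.search(elwp)
    digit = DIGIT.search(elwp)
    if letter is None or digit is None:
        return None
    return "prior" if digit.start() < letter.start() else "posterior"


if __name__ == "__main__":
    for candidate in ["65MW", "运8F400飞机", "Win2000", "CDMA2000标准"]:
        print(candidate, number_position(candidate))
```

This heuristic agrees with most of the examples quoted in the text, but not with all of them: “Oct4基因” is listed in Section 5.1 even though its letters come first, so the authors' actual grouping evidently relied on more than character order.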
5.1 The ELWPs in which the numbers are prior to letters
There are 212 different ELWPs (used 293 times, in 251 texts) in which numbers are prior to letters. In these ELWPs, the strings of numbers + letters have 3 usages:
1) As code numbers of product names or product models, such as “丰田8A, 运8F400飞机, 5ELX, 5V电喷发动机, 600MW机组, 603G3503花岗石……”.
2) As abbreviations of specialized terms, such as “3G标准, 3G网络, 2MDDN专线, Oct4基因, 4S网点……”.
3) As strings of figure+unit, such as “15ml, 167×600KW, 16V, 1800MHz, 1G……”.
The state of these ELWPs is shown in Tab.3.
Strings of figure+unit include “100Hz, 100M3, 100mg, 10KV, 10m, 10mg, 10MW, 11M, 1250GB, 140TB, 150CC组, 150mg……”.
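To illustrate how figure+unit strings might be separated from product codes, the following sketch splits a string into a figure and a unit and accepts it only when the unit appears in a small list. The list `UNITS` is copied from the examples quoted in this section and is not exhaustive. The pattern, the list, and the function name `split_figure_unit` are assumptions of this sketch, not the authors' method.

```python
import re
from typing import Optional, Tuple

# Unit spellings taken from the examples quoted in the text (not exhaustive).
UNITS = {"Hz", "MHz", "M3", "mg", "ml", "m", "V", "KV",
         "KW", "MW", "M", "G", "GB", "TB"}

# A figure (optionally with a decimal part) followed by letters and at most
# one trailing digit, as in "100M3".
FIGURE_UNIT = re.compile(r"(\d+(?:\.\d+)?)([A-Za-z]+\d?)")


def split_figure_unit(s: str) -> Optional[Tuple[str, str]]:
    """Return (figure, unit) if s is a figure followed by a listed unit,
    otherwise None."""
    m = FIGURE_UNIT.fullmatch(s)
    if m is None or m.group(2) not in UNITS:
        return None
    return m.group(1), m.group(2)


if __name__ == "__main__":
    for candidate in ["1800MHz", "100M3", "5ELX", "丰田8A"]:
        print(candidate, split_figure_unit(candidate))
```

A surface pattern of this kind cannot separate every case. “1G” and “3G” have the same shape, yet the text treats the first as figure+unit and the second as an abbreviated term, so context is still needed. Compound expressions such as “167×600KW” are also outside this simple pattern.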
77.78% of term ELWPs are shortened forms, such as “国际3G, 3G技术, 3G基础设施服务, 3G时代, 3G市场, 3G网络, 同步3G移动通讯, 3G移动电话, 1X, 1X系统……”. 66.67% of product-model ELWPs use a figure to represent a quantity and a figure+unit string to represent the product model, such as “宝马745h, 日本东芝800mA电视显像透视机, 火龙牌90W电热毯, 600MW机组……”, and the others use figures to represent code numbers or some measurement, such as “603G3503花岗石, 丰田8A, 运8F400飞机, 3PE防腐钢管, 2B铅……”
Other names: most of them are company names, conference names, technique names, agreement names, and standard names, such as “3Com公司, 3M公司, 3G标准, 3G无线传输标准, 3G牌照, 欧洲3G牌照, 开放式3G平台, 3G用户, 3GPP标准会议, 3GPP工作组会议, 3GPP2, 3GPP2会议, 3GPP2全会……”
Among the grouping and classifying ELWPs, only 3B (a credit grade of a fund) does not denote a group; all the others use figure+unit strings to denote groups or classes, such as “125CCA组, 125CCB组, 125CC组, 1600CC组, 250CC组, 3B, 国产125CC组, 国产150CC组, 国家3A级旅游度假区, 专业125CCA组……”
Other ELWPs include expressions (such as 3+X), common word phrases (such as 附件2B), and fragments left when ELWP parallel structures are split (such as 鑫诺卫星2A, 3A转发器).
5.2 The ELWPs in which numbers are posterior to letters
There are 479 different ELWPs (used 713 times, in 572 texts) in the “People’s Daily” (2002) in which numbers are posterior to letters. In these ELWPs, the strings of letters + figures are used as code numbers or symbols; no letters are used as measure units, and just a few are used as abbreviations. The state of these ELWPs is shown in Tab.4.
