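Foreign Trade Export Company Data Mining: Foreign-Language Translation Reference (document contains the English original and its Chinese translation)
Original text: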
What is Data Mining?
Simply stated, data mining refers to extracting or “mining” knowledge from large amounts of data. The term is actually a misnomer. Remember that the mining of gold from rocks or sand is referred to as gold mining rather than rock or sand mining. Thus, “data mining” should have been more appropriately named “knowledge mining from data”, which is unfortunately somewhat long. “Knowledge mining”, a shorter term, may not reflect the emphasis on mining from large amounts of data. Nevertheless, mining is a vivid term characterizing the process that finds a small set of precious nuggets from a great deal of raw material. Thus, such a misnomer, which carries both “data” and “mining”, became a popular choice. There are many other terms carrying a similar or slightly different meaning to data mining, such as knowledge mining from databases, knowledge extraction, data/pattern analysis, data archaeology, and data dredging.
Many people treat data mining as a synonym for another popularly used term, “Knowledge Discovery in Databases”, or KDD. Alternatively, others view data mining as simply an essential step in the process of knowledge discovery in databases. Knowledge discovery consists of an iterative sequence of the following steps:
· data cleaning: to remove noise or irrelevant data,
· data integration: where multiple data sources may be combined,
· data selection: where data relevant to the analysis task are retrieved from the database,
· data transformation: where data are transformed or consolidated into forms appropriate for mining by performing summary or aggregation operations, for instance,
· data mining: an essential process where intelligent methods are applied in order to extract data patterns,
· pattern evaluation: to identify the truly interesting patterns representing knowledge based on some interestingness measures, and
· knowledge presentation: where visualization and knowledge representation techniques are used to present the mined knowledge to the user.
The data mining step may interact with the user or a knowledge base. The interesting patterns are presented to the user, and may be stored as new knowledge in the knowledge base. Note that according to this view, data mining is only one step in the entire process, albeit an essential one since it uncovers hidden patterns for evaluation.
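The steps listed above can be sketched as a small pipeline. The following is only a minimal illustration, not an implementation from the text: the toy purchase records, the function names, and the choice of pair co-occurrence counting with a minimum-support threshold as the "intelligent method" and "interestingness measure" are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: two "sources" of (customer id, item) purchase records.
SOURCE_A = [(1, "bread"), (1, "milk"), (2, "bread"), (2, None), (3, "milk")]
SOURCE_B = [(2, "milk"), (3, "bread"), (3, "butter"),
            (4, "bread"), (4, "milk"), (4, "bag")]


def clean(records):
    """Data cleaning: drop records with a missing item (treated here as noise)."""
    return [(cid, item) for cid, item in records if item is not None]


def integrate(*sources):
    """Data integration: combine several sources into one record list."""
    return [rec for src in sources for rec in src]


def select(records, relevant_items):
    """Data selection: keep only records relevant to the analysis task."""
    return [(cid, item) for cid, item in records if item in relevant_items]


def transform(records):
    """Data transformation: aggregate records into one item set per customer."""
    baskets = {}
    for cid, item in records:
        baskets.setdefault(cid, set()).add(item)
    return baskets


def mine(baskets):
    """Data mining: count how many baskets contain each pair of items."""
    counts = Counter()
    for items in baskets.values():
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts


def evaluate(counts, n_baskets, min_support):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {pair: c / n_baskets
            for pair, c in counts.items()
            if c / n_baskets >= min_support}


def present(patterns):
    """Knowledge presentation: render the surviving patterns as readable lines."""
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]


def run_pipeline(min_support=0.5):
    """Run the seven steps in order on the toy data."""
    records = integrate(clean(SOURCE_A), clean(SOURCE_B))
    records = select(records, {"bread", "milk", "butter"})
    baskets = transform(records)
    patterns = evaluate(mine(baskets), len(baskets), min_support)
    return present(patterns)


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

On this toy data, `run_pipeline()` keeps a single pattern, bread + milk with support 1.00; lowering `min_support` to 0.25 also lets the two butter pairs through. This shows why the evaluation step is separate from mining: the same mined counts yield different "knowledge" depending on the interestingness threshold.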
We agree that data mining is a knowledge discovery process. However, in industry, in media, and in the database research milieu, the term “data mining” is becoming more popular than the longer term of “knowledge discovery in databases”. Therefore, in this book, we choose to use the term “data mining”. We adopt a broad view of data mining functionality: data mining is the process of discovering interesting knowledge from large amounts of data stored either in databases, data warehouses, or other information repositories.
Based on this view, the architecture of a typical data mining system may have the following major components:
1. Database, data warehouse, or other information repository. This is one or a set of databases, data warehouses, spreadsheets, or other kinds of information repositories. Data cleaning and data integration techniques may be performed on the data.
2. Database or data warehouse server. The database or data warehouse server is responsible for fetching the relevant data, based on the user’s data mining request.
3. Knowledge base. This is the domain knowledge that is used to guide the search, or evaluate the interestingness of resulting patterns. Such knowledge can include concept hierarchies, used to organize attributes or attribute values into different levels of abstraction. Knowledge such as user beliefs, which can be used to assess a pattern’s interestingness based on its unexpectedness, may also be included. Other examples of domain knowledge are additional interestingness constraints or thresholds, and metadata (e.g., describing data from multiple heterogeneous sources).
4. Data mining engine. This is essential to the data mining system and ideally consists of a set of functional modules for tasks such as characterization, association analysis, classification, evolution and deviation analysis.
5. Pattern evaluation module. This component typically employs interestingness measures and interacts with the data mining modules so as to focus the search towards interesting patterns. It may access interestingness thresholds stored in the knowledge base. Alternatively, the pattern evaluation module may be integrated with the mining module, depending on the implementation of the data mining method used. For efficient data mining, it is highly recommended to push the evaluation of pattern interestingness as deep as possible into the mining process so as to confine the search to only the interesting patterns.
6. Graphical user interface. This module communicates between users and the data mining system, allowing the user to interact with the system by specifying a data mining query or task, providing information to help focus the search, and performing exploratory data mining based on the intermediate data mining results. In addition, this component allows the user to browse database and data warehouse schemas or data