An Analysis of Digital Archive Information Management Based on Data Mining

Source: www.51fabiao.org  Author: meisishow  Published: 2014-06-21 17:42

With today's development of information technology, libraries, and university libraries in particular, no longer handle only simple catalog conversion and management. They must also systematically manage and archive new kinds of files, including documents, format conversions, pictures, audio and video materials, and other multimedia. Systematic file management has therefore become an inevitable trend in library management, and the technical and legal issues it raises call for comprehensive study and discussion.
Data mining is the process of extracting implicit, previously unknown, but potentially useful information and knowledge from large volumes of incomplete, noisy, fuzzy, and random data. The data may be structured, such as data in a relational database; semi-structured, such as text, graphics, and image data; or even heterogeneous data distributed across a network. The methods of discovering knowledge may be mathematical or non-mathematical, deductive or inductive. Discovered knowledge can be used for information management, query optimization, decision support, process control, and so on, and also for maintaining the data itself.
Data mining draws on years of work in mathematical statistics, artificial intelligence, knowledge engineering, and other fields to build its own theoretical system. It involves databases, artificial intelligence, statistics, machine learning, artificial neural networks, visualization, parallel computing, and other disciplines, and it is currently one of the frontier research areas in the international database and decision support communities.

First, the functions of data mining
Data mining predicts future trends and behavior and so supports proactive, knowledge-based decision-making. Its ultimate goal is to discover meaningful knowledge in databases. By function, it can be divided into the following categories.

1, association analysis

Association analysis uncovers related links among the large amounts of data in a database. The commonly used techniques are association rules and sequential patterns. Association rules discover the interconnection or correlation between one thing and another.
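
As a minimal sketch of how an association rule between two items can be measured, the example below computes support (how often a pair occurs together) and confidence (how often the second item appears when the first does). The function name `pair_rules`, the thresholds, and the borrowing sessions are all hypothetical and are not taken from the article.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore repeats within one session
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


# Hypothetical borrowing sessions: the file categories one user consulted together.
sessions = [
    ["personnel", "payroll"],
    ["personnel", "payroll", "research"],
    ["research", "teaching"],
    ["personnel", "payroll", "teaching"],
]
rules = pair_rules(sessions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

Only pairs are considered here to keep the sketch short. A full association-rule miner such as Apriori would extend the same counting to larger item sets.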

2, clustering

The input data carry no class labels. Clustering groups the data, according to certain rules, into a reasonable set of classes or clusters, so that objects in the same cluster are highly similar while objects in different clusters differ greatly. Clustering improves our understanding of objective reality and is a prerequisite for concept description and deviation analysis. Clustering techniques include traditional pattern recognition methods and mathematical taxonomy.
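
The idea of grouping unlabeled data can be illustrated with a small one-dimensional k-means sketch. The article does not name a specific algorithm, so k-means is used here only as a familiar stand-in. The function `kmeans_1d`, the starting centers, and the access counts are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around the nearest center, then move each center to its group mean."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # An empty cluster keeps its previous center.
        new_centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:  # assignments have stabilized
            break
        centers = new_centers
    return centers, clusters


# Hypothetical yearly access counts for six files.
access_counts = [1, 2, 3, 20, 22, 24]
centers, clusters = kmeans_1d(access_counts, centers=[1, 24])
print(centers)
print(clusters)
```

With these inputs the rarely used and frequently used files fall into two separate groups without any labels being supplied.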

3, automatic prediction of trends and behavior

Data mining automates classification and prediction over large databases. It searches for predictive information automatically and produces models that describe important data classes or predict future trends. Questions that previously required a great deal of manual analysis can now be answered quickly and directly from the data itself.
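
As one simple illustration of predicting a trend from past data, the sketch below fits a straight line by ordinary least squares and extrapolates one year ahead. This is a swapped-in technique chosen for brevity, not a method the article specifies. The function `fit_line` and the yearly request counts are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical number of retrieval requests per year.
years = [2010, 2011, 2012, 2013]
requests = [100, 120, 140, 160]

slope, intercept = fit_line(years, requests)
forecast_2014 = slope * 2014 + intercept
print(f"Expected requests in 2014: {forecast_2014:.0f}")
```

Real usage data are rarely this regular, so a forecast of this kind is best read as a rough guide.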

4, concept description

For complex data in a database, it is desirable to form a concise description of a set of related data. Concept description states the meaning of a class of objects and summarizes its relevant characteristics. It is divided into characteristic description and discriminant description: the former describes the features common to a class of objects, and the latter describes the differences between classes. Generating the characteristic description of a class involves only the features shared by all objects in that class. There are many methods for generating discriminant descriptions, such as decision trees and genetic algorithms.

5, deviation detection

Databases often contain anomalous records, and it is worthwhile to detect these deviations. Deviations hold a great deal of potential knowledge, such as anomalous instances in a classification, exceptions that do not satisfy a rule, differences between observations and model predictions, and changes in a quantity over time. The basic approach is to look for meaningful differences between observed results and reference values. It is commonly used to detect financial and banking fraud, and in market analysis to identify customers with unusual spending habits.
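
A minimal sketch of comparing observations against a reference value: here the reference is the mean of the series, and a record is flagged when it lies more than a chosen number of standard deviations away. The function `find_deviations`, the threshold of 2.0, and the download counts are hypothetical choices for illustration.

```python
from statistics import mean, pstdev


def find_deviations(values, threshold=2.0):
    """Return (index, value) pairs lying more than `threshold` standard deviations from the mean."""
    reference = mean(values)
    spread = pstdev(values)
    if spread == 0:  # all values identical, so nothing deviates
        return []
    return [
        (index, value)
        for index, value in enumerate(values)
        if abs(value - reference) / spread > threshold
    ]


# Hypothetical daily download counts for one account.
daily_downloads = [10, 12, 11, 9, 10, 11, 95, 10]
outliers = find_deviations(daily_downloads)
print(outliers)
```

A single extreme value inflates both the mean and the standard deviation, so on real data a median-based reference may be the more robust choice.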

Second, applications of data mining in building a modern university archive
1, University archives hold many types of electronic files, including collection files produced by digitizing holdings, electronic files of all kinds stored in the electronic document center, files gathered by software, and construction and maintenance information for the archive information system. When the archive's information needs are studied from the user's point of view, data mining provides a method for fully grasping and accurately understanding those needs.

(1) Use Web access mining techniques to discover association patterns, sequential patterns, and Web access trends, and build a multi-dimensional model of user interests. This makes it possible to gauge how popular the existing archive information and services are, to identify trends in user access patterns and needs, and to study different aspects of users' information needs, which provides a scientific basis for optimizing the archive's information resources.

(2) Collect from the university archive's network server the Web users' registration information, access records, and information about their interaction with the system. After cleansing, filtering, and transformation, this raw data forms the user access log database, the user registration database, the user customization database, user feedback, and other data sets used for statistical analysis.

2, From the perspective of building the university archive's collection of information resources, data mining provides a scientific basis for developing the archive.

(1) Use the access information recorded by the archive and records management software to mine how the archive's resources are used. Files with high usage and high demand can be given priority for digitization. For example, analyze the failed user requests recorded in the archive retrieval logs, compile statistics by category and frequency of use, and apply clustering and aggregation algorithms to discover which resources the collection is missing, so that the archive's information resources can be supplemented and enriched in a targeted way.
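
The failed-request analysis can be sketched as a simple count of unsuccessful retrievals per category. The log layout (a category name paired with a found flag), the function `missing_resources`, and the sample categories are hypothetical. A real records management system would expose its own log format.

```python
from collections import Counter


def missing_resources(requests, min_failures=2):
    """Return (category, failure_count) for categories with repeated failed retrievals."""
    failures = Counter(category for category, found in requests if not found)
    return [
        (category, count)
        for category, count in failures.most_common()
        if count >= min_failures
    ]


# Hypothetical retrieval log: (requested category, whether a matching file was found).
log = [
    ("student records", False),
    ("research projects", True),
    ("student records", False),
    ("construction drawings", False),
    ("student records", True),
    ("construction drawings", False),
    ("research projects", False),
]
gaps = missing_resources(log)
print(gaps)
```

Categories that fail repeatedly are candidates for targeted acquisition or priority digitization.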

(2) Apply text mining in the university archive, using association, classification, clustering, and other methods to mine, classify, process, sort, and even reorganize archive information by topic, and so build libraries of characteristic archive information and libraries of archive information on various subjects.

3, From the perspective of traditional university archive information management, data mining and forecasting play an important role in optimizing the collection and in planning future work.

(1) When providing access services, analyze each user's borrowing records to discover association rules or patterns among the various types of archive information, so that the collection can be further optimized.

(2) Establish text feature representation, feature extraction, feature matching, feature set reduction, and evaluation models for the university archive's holdings, so that large collections of documents can be summarized, classified, clustered, and subjected to association analysis and distribution analysis. By generalizing and summarizing the knowledge obtained, future trends in archival work can be predicted.

Third, data mining of management data
The management data of a university archive come from intelligent monitoring systems, fire protection systems, temperature and humidity control systems, intelligent shelving, file management systems, and usage systems, all of which generate large amounts of data in daily work. We must use data mining tools to extract valuable knowledge from this seemingly useless data and apply it to university archive work, so that it contributes to innovation in the archive.

The key to university archival work is to treat service to students and staff as the center of everything, and how to use modern tools to improve service quality is a problem that continues to face us. Data mining offers university archives an effective way to make their services intelligent, personalized, and quality-oriented. An intelligent retrieval system can call on the user interest model, automatically adjust the search strategy to fit the user's interests, quickly cluster and classify the search results, and present them sorted in a reasonable order. For research institutes, the Academy of Social Sciences, and other research-oriented archive users, data mining can be used for targeted mining of archive information, with the analysis delivered to the user as a report that includes an overview of the findings. This not only achieves secondary development of the university archive but also gives users a pleasant surprise.

The network was originally just a tool for scientists and researchers to exchange files, and the Internet was used for education and research with government subsidies. In the United States, universities receive funding to support their libraries. Digital library and archive networks are not for profit, and their output is a long-term contribution to society, teaching, and research. Today the Internet has become increasingly commercialized, and in the digital economy the network has become a very effective technology investment. University digital libraries can also consider establishing for-profit archive networks that borrow business models from network commerce, such as online advertising, banner advertising, sponsorships, subscriptions, B2C, and so on. The revenue can be used to develop the archive network of the university digital library. At present these emerging economic models are poorly understood. Public policy is made mainly by government departments: implementing e-government, developing network resources, and promoting the move of publishing from print to the network are important tasks for the departments concerned. University policies, attitudes, and practices are essential to the development of digital libraries. Market instruments and policy should be balanced, and the construction and operation of the archive network, together with online content delivery and preservation, should all be taken into account.