Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education

Taiyuan, China


Xu S.,Shanxi University | Wang J.,Shanxi University | Wang J.,Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education
Neurocomputing | Year: 2017

Many fields now produce large volumes of data streams. Mining interesting knowledge and patterns from a continuous data stream has become a problem that must be solved. Unlike conventional classification algorithms, data stream classification algorithms have to adjust their classification models as the data stream changes because of concept drift, whereas conventional classification models remain fixed once they are trained. To solve this problem, a dynamic extreme learning machine for data stream classification (DELM) is proposed. DELM uses an online learning mechanism to train an ELM as the base classifier and trains a double hidden layer structure to improve the performance of the ELM. When a concept drift alert is raised, more hidden layer nodes are added to the ELM to improve the generalization ability of the classifier. If the value measuring concept drift reaches its upper limit or the accuracy of the ELM is low, the current classifier is deleted and the algorithm trains a new classifier on new data so as to learn the new concept. The experimental results show that DELM improves classification accuracy and adapts to a new concept in a short time. © 2017 Elsevier B.V.
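
The control policy described in the abstract above (keep the classifier, grow its hidden layer on a drift alert, or discard and retrain it) can be sketched as follows. This is only an illustration of the decision logic, not the authors' implementation: the names `ClassifierState`, `drift_action`, and `step`, and all threshold and size values, are hypothetical, and the ELM training itself is omitted.

```python
from dataclasses import dataclass


@dataclass
class ClassifierState:
    hidden_nodes: int = 20  # hypothetical initial hidden layer size
    generation: int = 0     # number of times the classifier was rebuilt


def drift_action(drift_value, accuracy,
                 warn_level=0.5, drift_limit=0.8, min_accuracy=0.6):
    """Return 'reset', 'grow', or 'keep' (all thresholds are assumed values)."""
    # Drift measure at its upper limit, or accuracy too low: discard the model.
    if drift_value >= drift_limit or accuracy < min_accuracy:
        return "reset"
    # Drift alert only: enlarge the hidden layer.
    if drift_value >= warn_level:
        return "grow"
    return "keep"


def step(state, drift_value, accuracy, grow_by=5, initial_nodes=20):
    """Apply the chosen action and return the new classifier state."""
    action = drift_action(drift_value, accuracy)
    if action == "reset":
        # A new classifier would be trained on new data at this point.
        return ClassifierState(initial_nodes, state.generation + 1)
    if action == "grow":
        return ClassifierState(state.hidden_nodes + grow_by, state.generation)
    return state


if __name__ == "__main__":
    state = ClassifierState()
    # Each pair is (drift measure, recent accuracy) for one chunk of the stream.
    for drift_value, accuracy in [(0.1, 0.9), (0.6, 0.9), (0.9, 0.9)]:
        state = step(state, drift_value, accuracy)
        print(state)
```

In a real system the drift measure and accuracy would be computed from recent stream chunks; here they are passed in directly to keep the sketch short.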


Li F.-J.,Shanxi University | Qian Y.-H.,Shanxi University | Qian Y.-H.,Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education | Wang J.-T.,Shanxi University | Liang J.-Y.,Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education
Proceedings - International Conference on Machine Learning and Cybernetics | Year: 2015

As an important reflection of human cognitive ability, multi-granulation analysis yields a more reasonable solution to a problem than single-granulation analysis. Clustering analysis is an active area of machine learning and a fundamental technique of information granulation. By using different clustering algorithms, or different parameters of one algorithm, a data set can be granulated into multiple granular spaces. Clustering ensemble over these granular spaces is an effective strategy for multi-granulation information fusion. The existing clustering ensemble algorithms can be categorized into three types: feature-based methods, combinatorial methods and graph-based methods. Given that each type of method has its own advantages and disadvantages, combining their advantages should yield better granulation results. Based on this consideration, this paper introduces a clustering ensemble method based on Dempster-Shafer evidence theory that combines the advantages of the combinatorial and graph-based methods. In this strategy, the definition of the mass functions considers the neighbors of an object using graph binarization, and the final clustering ensemble result is generated by applying Dempster's combination rule. The form of Dempster's combination rule makes the algorithm conform to the pattern of the combinatorial method. Experimental results on fourteen numerical real-world data sets from the UCI Machine Learning Repository show that the proposed method performs better than seven other clustering ensemble methods. © 2015 IEEE.
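
Dempster's combination rule, which the abstract above uses to fuse evidence, can be illustrated with a small sketch. It shows only the generic rule for two mass functions over a frame of candidate cluster labels; how the paper builds its mass functions from neighbors and graph binarization is not reproduced here, and the function name `dempster_combine` and the example masses are assumptions.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions given as {frozenset: mass} dictionaries."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += mass_a * mass_b
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize so that the combined masses sum to 1.
    return {focal: mass / norm for focal, mass in combined.items()}


if __name__ == "__main__":
    c1, c2 = frozenset({"c1"}), frozenset({"c2"})
    either = c1 | c2  # the whole frame: "cluster unknown"
    # Hypothetical evidence from two base clusterings about one object.
    m1 = {c1: 0.6, either: 0.4}
    m2 = {c1: 0.5, c2: 0.3, either: 0.2}
    for focal, mass in dempster_combine(m1, m2).items():
        print(sorted(focal), round(mass, 4))
```

The object would then be assigned to the label with the largest combined mass, here `c1`.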


Wang S.,Shanxi University | Wang S.,Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education | Li D.,Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education | Li D.,Shanxi University | And 3 more authors.
Expert Systems with Applications | Year: 2011

Owing to its openness, virtualization and sharing criterion, the Internet has rapidly become a platform for people to express their opinions, attitudes, feelings and emotions. As subjectivity texts are often too numerous for people to read through, how to automatically classify them into sentiment orientation categories (e.g. positive/negative) has become an important research problem. In this paper, an effective feature selection method based on Fisher's discriminant ratio is proposed for subjectivity text sentiment classification. To validate the proposed method, we compared it with a method based on Information Gain, with a Support Vector Machine adopted as the classifier. Two experiments were conducted by combining different feature selection methods with two kinds of candidate feature sets. On 2739 subjectivity documents from COAE2008s and 1006 car-related subjectivity documents, the experimental results indicate that Fisher's discriminant ratio based on word frequency estimation performs best, with accuracies of 86.61% and 82.80% on the two corpora respectively, when the candidate features are the words that appear in both positive and negative texts. © 2011 Elsevier Ltd. All rights reserved.
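
As a rough illustration of feature scoring with Fisher's discriminant ratio, the sketch below uses the textbook two-class form, (difference of class means)^2 / (sum of class variances), applied to per-document term counts. The paper's exact formulation based on word frequency estimation may differ; the function names and the toy counts are hypothetical.

```python
from statistics import mean, pvariance


def fisher_ratio(pos_values, neg_values):
    """Textbook two-class Fisher discriminant ratio for one feature."""
    numerator = (mean(pos_values) - mean(neg_values)) ** 2
    denominator = pvariance(pos_values) + pvariance(neg_values)
    if denominator == 0:
        # No within-class spread: perfectly separating or uninformative.
        return float("inf") if numerator > 0 else 0.0
    return numerator / denominator


def rank_terms(term_counts):
    """Sort terms by decreasing ratio; term_counts maps term -> (pos, neg)."""
    scores = {t: fisher_ratio(pos, neg) for t, (pos, neg) in term_counts.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical counts of each term in 4 positive and 4 negative documents.
    counts = {
        "good": ([3, 2, 4, 3], [0, 1, 0, 1]),
        "the": ([5, 6, 5, 6], [5, 6, 6, 5]),
    }
    for term, score in rank_terms(counts):
        print(term, round(score, 3))
```

Terms whose frequency differs strongly between classes relative to their within-class spread rank first; a term such as "the", distributed alike in both classes, scores zero.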


Meng Y.,Shanxi University | Liang J.,Shanxi University | Liang J.,Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education | Qian Y.,Shanxi University | Qian Y.,Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education
Knowledge-Based Systems | Year: 2016

Functional data, an important data type, is prevalent in many fields such as economics, biology, finance, and meteorology. Its underlying process is often seen as a continuous curve. Classification of functional data is a basic data mining task. The common method is a two-stage learning process: first, by means of basis functions, the functional data series is converted into multivariate data; second, a machine learning algorithm is employed to perform the classification task based on the new representation. The problem is that a majority of learning algorithms are based on Euclidean distance, whereas the distance between functional samples is the L2 distance. In this context, there are three very interesting problems. (1) Is seeing a functional sample as a point in the corresponding Euclidean space feasible? (2) How should an orthonormal basis be selected for a given functional data type? (3) Which is better, orthogonal or non-orthogonal representation, given the same finite number of basis functions? These issues are the main motivation of this study. For the first problem, theoretical studies show that seeing a functional sample as a point in the corresponding Euclidean space is feasible under the orthonormal representation. For the second problem, through experimental analysis, we find that the Fourier basis is suitable for representing stable functions (especially periodic functions), the wavelet basis is good at differentiating functions with local differences, and the data-driven functional principal component basis could be the first preference, especially when one has no prior knowledge of the functional data type. For the third problem, experimental results show that orthogonal representation is better than non-orthogonal representation from the viewpoint of classification performance. These results have important significance for studying functional data classification. © 2015 Elsevier B.V.
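
The first result in the abstract above, that a functional sample can be treated as a point in Euclidean space under an orthonormal representation, can be checked numerically on a small example. The sketch assumes functions on [0, 1], a three-term orthonormal Fourier basis, and a midpoint-rule approximation of the L2 distance; the helper names and coefficient values are illustrative only and are not taken from the paper.

```python
import math


def expand(coeffs, basis):
    """Return the function sum_k coeffs[k] * basis[k](x)."""
    return lambda x: sum(c * phi(x) for c, phi in zip(coeffs, basis))


def l2_distance(f, g, n=2000):
    """Approximate the L2 distance on [0, 1] with the midpoint rule."""
    h = 1.0 / n
    total = sum((f((i + 0.5) * h) - g((i + 0.5) * h)) ** 2 for i in range(n))
    return math.sqrt(total * h)


# Orthonormal on [0, 1]: constant, scaled cosine, scaled sine.
fourier = [
    lambda x: 1.0,
    lambda x: math.sqrt(2) * math.cos(2 * math.pi * x),
    lambda x: math.sqrt(2) * math.sin(2 * math.pi * x),
]
# Not orthogonal on [0, 1]: monomials.
monomials = [lambda x: 1.0, lambda x: x, lambda x: x * x]

if __name__ == "__main__":
    a = [1.0, 0.5, -0.2]
    b = [0.3, -0.1, 0.4]
    print("Euclidean distance of coefficients:", math.dist(a, b))
    print("L2 distance, Fourier basis:  ",
          l2_distance(expand(a, fourier), expand(b, fourier)))
    print("L2 distance, monomial basis: ",
          l2_distance(expand(a, monomials), expand(b, monomials)))
```

With the orthonormal basis the two distances agree up to numerical error, so a Euclidean-distance learner sees the same geometry as the functional space. With the monomial basis the coefficient distance no longer matches the L2 distance.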


Wang F.,Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education | Wang F.,Shanxi University | Liang J.,Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education | Liang J.,Shanxi University
Neurocomputing | Year: 2016

Feature selection for large-scale data sets has been conceived as a very important data preprocessing step in the area of machine learning. Data sets in real databases usually take on hybrid forms, i.e., the coexistence of categorical and numerical data. In this paper, based on the idea of decomposition and fusion, an efficient feature selection approach for large-scale hybrid data sets is studied. According to this approach, one can get an effective feature subset in a much shorter time. By employing two common classifiers as the evaluation function, experiments have been carried out on twelve UCI data sets. The experimental results show that the proposed approach is effective and efficient. © 2016 Elsevier B.V.


Kang X.,Shanxi University | Kang X.,Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education | Li D.,Shanxi University | Li D.,Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education | And 4 more authors.
Fuzzy Sets and Systems | Year: 2012

This paper introduces granular computing (GrC) into formal concept analysis (FCA). It provides a unified model for concept lattice building and rule extraction on a fuzzy granularity base for different granulations. One of the strengths of GrC is that larger granulations help to hide some specific details, whereas FCA in a GrC context can prevent losses due to concept lattice complexity. However, the number of superfluous rules increases exponentially with the scale of the decision context. To overcome this we present some inference rules and maximal rules and prove that the set of all these maximal rules is complete and nonredundant. Thus, users who want to obtain decision rules should generate maximal rules. Examples demonstrate that application of the method is valid and practicable. In summary, this approach utilizes FCA in a GrC context and provides a practical basis for data analysis and processing. © 2012 Elsevier B.V. All rights reserved.
