Guo H.-S.,Shanxi University | Wang W.-J.,Shanxi University | Wang W.-J.,Key Laboratory of Computational Intelligence and Chinese Information Processing
Ruan Jian Xue Bao/Journal of Software | Year: 2013

Although the granular support vector machine (GSVM) can improve learning speed, its generalization performance may decrease because the original data distribution is inevitably changed, for two reasons: (1) a granule is usually replaced by an individual datum; (2) granulation and learning are carried out in different spaces. To address this problem, this study presents a granular support vector regression (SVR) model based on dynamical granulation, namely DGSVR, which uses a dynamical hierarchical granulation method. With DGSVR, the original data are mapped into a high-dimensional space by a Mercer kernel to reveal the distribution features implicit in the original sample space, and the data are initially divided into granules. Then, granules carrying important regression information are identified by measuring the distances between the granules and the regression hyperplane. Using the radius and density of the granules, the deep dynamical granulation process is repeated until no informative granules remain to be granulated. Finally, the granules at different granulation levels are extracted and trained by SVR. Experimental results on benchmark function datasets and UCI regression datasets demonstrate that the DGSVR model completes the dynamical granulation process quickly and converges. The study concludes that the model can improve generalization performance while achieving high learning efficiency. © Copyright 2013, Institute of Software, the Chinese Academy of Sciences.
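
The abstract refers to a granule's radius measured in the kernel-induced space. As a rough sketch only, and not the authors' implementation, the distance from a mapped point to a granule's center can be computed with the standard kernel trick, without forming the mapping explicitly. The function names, the choice of an RBF kernel, and the value of `gamma` are all assumptions made for illustration; the paper's definition of density is not given in the abstract and is not modeled here.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel, one common Mercer kernel (gamma is an assumed value)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def kernel_distance_to_center(x, granule, kernel=rbf_kernel):
    """Distance between phi(x) and the mean of phi over the granule.

    Uses ||phi(x) - mu||^2 = k(x,x) - (2/n) sum_i k(x,g_i) + (1/n^2) sum_ij k(g_i,g_j).
    """
    n = len(granule)
    k_xx = kernel(x, x)
    k_xg = sum(kernel(x, g) for g in granule) / n
    k_gg = sum(kernel(g, h) for g in granule for h in granule) / (n * n)
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(k_xx - 2.0 * k_xg + k_gg, 0.0))

def granule_radius(granule, kernel=rbf_kernel):
    """Hypothetical radius: the largest kernel-space distance from a member to the center."""
    return max(kernel_distance_to_center(x, granule, kernel) for x in granule)

# A two-point granule in one dimension.
radius = granule_radius([(0.0,), (2.0,)])
```

A granule with a large radius under such a measure would be a natural candidate for further splitting, though the actual splitting criterion in DGSVR also involves density and distance to the regression hyperplane.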

Qian Y.H.,Key Laboratory of Computational Intelligence and Chinese Information Processing | Liang J.Y.,Key Laboratory of Computational Intelligence and Chinese Information Processing | Song P.,Shanxi University | Dang C.Y.,City University of Hong Kong
International Journal of Information Technology and Decision Making | Year: 2010

Set-valued information systems are generalized models of single-valued information systems. Their semantic interpretations can be classified into two categories: disjunctive and conjunctive. We focus on the former in this paper. By introducing four types of dominance relations to disjunctive set-valued information systems, we establish a dominance-based rough set approach, which is mainly based on substituting dominance relations for the indiscernibility relation. Furthermore, we develop a new approach to sorting objects in disjunctive set-valued ordered information systems, which is based on the dominance class of an object induced by a dominance relation. Finally, we propose criterion reductions of disjunctive set-valued ordered information systems that eliminate only information that is not essential to the ordering of objects. The approaches show how to simplify a disjunctive set-valued ordered information system. Throughout this paper, we establish in detail the interrelationships among the four types of dominance relations, including the corresponding dominance classes, rough set approaches, sorting of objects, and criterion reductions. These results provide feasible approaches to intelligent decision making in disjunctive set-valued ordered information systems. © 2010 World Scientific Publishing Company.

Kang X.,Key Laboratory of Computational Intelligence and Chinese Information Processing | Li D.,Shanxi University | Wang S.,Key Laboratory of Computational Intelligence and Chinese Information Processing | Qu K.,Key Laboratory of Computational Intelligence and Chinese Information Processing
Information Sciences | Year: 2013

This paper proposes a rough set model based on formal concept analysis. In this model, a solution to an algebraic structure problem is first provided in an information system: a lattice structure is inferred from the information system and corresponding nodes are called rough concepts. How to deal with common problems in rough set theory based on rough concepts is then explored, such as upper and lower approximation operators, reducts and cores. Decision dependency has become a common form of knowledge representation owing to its properties of expressiveness and ease of understanding, so it has been widely used in practice. Finally, application of rough concepts to the extraction of decision dependencies from a decision table is studied; a complete and non-redundant set of decision dependencies can be obtained from a decision table. Examples demonstrate that application of the method presented in this paper is valid and practicable. The results not only provide a better understanding of rough set theory from the perspective of formal concept analysis, but also demonstrate a new way of combining rough set theory and formal concept analysis. © 2012 Elsevier Inc. All rights reserved.
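
For readers unfamiliar with the upper and lower approximation operators mentioned above, the following minimal sketch shows the classical (Pawlak) definitions computed from indiscernibility classes. It does not implement the paper's rough-concept lattice; the toy table, attribute names, and function names are invented for illustration.

```python
from collections import defaultdict

def equivalence_classes(table, attributes):
    """Group objects that share the same values on the chosen attributes."""
    classes = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj)
    return list(classes.values())

def approximations(table, attributes, target):
    """Return the (lower, upper) approximations of the object set `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attributes):
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper

# Hypothetical information system with four objects and two attributes.
table = {
    "o1": {"fever": "yes", "cough": "yes"},
    "o2": {"fever": "yes", "cough": "yes"},
    "o3": {"fever": "no", "cough": "yes"},
    "o4": {"fever": "no", "cough": "no"},
}
lower, upper = approximations(table, ["fever", "cough"], {"o1", "o3"})
```

Here `o1` and `o2` are indiscernible, so `o1` appears only in the upper approximation, while `o3` forms its own class and belongs to both.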

Cao F.,Shanxi University | Cao F.,Key Laboratory of Computational Intelligence and Chinese Information Processing | Liang J.,Shanxi University | Liang J.,Key Laboratory of Computational Intelligence and Chinese Information Processing
Expert Systems with Applications | Year: 2011

As data sizes grow at a rapid pace, clustering a very large data set inevitably becomes time-consuming. To improve the efficiency of clustering, sampling is usually used to scale down the size of the data set. However, once sampling is applied, allocating the unlabeled objects to the proper clusters is a very difficult problem. In this paper, based on the frequency of attribute values in a given cluster and the distributions of attribute values in different clusters, a novel similarity measure is proposed to allocate each unlabeled object to the appropriate cluster when clustering categorical data. Furthermore, a labeling algorithm for categorical data is presented, and its time complexity is analyzed. The effectiveness of the proposed algorithm is shown by experiments on real-world data sets. © 2010 Elsevier Ltd. All rights reserved.
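
The labeling step described above can be illustrated with a deliberately simplified stand-in. The sketch below scores an unlabeled object against each cluster using only the within-cluster relative frequency of its attribute values; it omits the cross-cluster distribution term that the paper's measure also uses, so it should not be read as the proposed similarity measure. All names and the toy data are hypothetical.

```python
from collections import Counter

def value_frequencies(cluster):
    """Per-attribute relative frequencies of values within one cluster."""
    n = len(cluster)
    columns = zip(*cluster)  # one tuple of values per attribute
    return [{v: c / n for v, c in Counter(col).items()} for col in columns]

def similarity(obj, freqs):
    """Average within-cluster frequency of the object's attribute values."""
    return sum(f.get(v, 0.0) for v, f in zip(obj, freqs)) / len(obj)

def label(obj, clusters):
    """Index of the cluster whose value frequencies best match the object."""
    scores = [similarity(obj, value_frequencies(c)) for c in clusters]
    return max(range(len(clusters)), key=scores.__getitem__)

# Two clusters obtained from a sample, each a list of categorical tuples.
clusters = [
    [("red", "small"), ("red", "medium")],
    [("blue", "large"), ("blue", "large")],
]
assigned = label(("blue", "small"), clusters)
```

Because the per-cluster frequency tables can be computed once and reused, labeling each remaining object costs time proportional to the number of clusters times the number of attributes in a scheme like this one.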

Wang S.,Shanxi University | Wang S.,Key Laboratory of Computational Intelligence and Chinese Information Processing | Li D.,Shanxi University | Li D.,Key Laboratory of Computational Intelligence and Chinese Information Processing | And 2 more authors.
Knowledge-Based Systems | Year: 2013

The vast amount of subjective text spread across the Internet has increased the demand for text sentiment classification technology. A well-known factor that often weakens classifier performance is the imbalanced distribution of review texts between the positive and negative classes. In this paper, we address the sentiment classification problem for imbalanced text sets. For this problem, we propose the BRC algorithm, which clarifies the disordered boundary by cutting the majority-class samples in the dense boundary region. The classifier is constructed based on the Support Vector Machine. To find the better feature weight scheme, combination strategy of sample cutting, and parameters in BRC, three groups of experiments are designed on six text sets covering five domains. The experimental results show that the feature weight scheme Presence has the best performance. The combination strategy BRC + RS offers a tradeoff between the evaluation measures Precision and Recall on the two categories and yields a larger increase in the overall evaluation measure, Accuracy. It should be noted that the method of determining the parameters α and β in BRC is empirical. Although the boundary region cutting algorithm BRC is aimed at text sentiment classification, we believe that it is also suitable for any two-category classification problem with imbalanced sample data. © 2012 Elsevier B.V. All rights reserved.
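
The abstract does not define the Presence feature weight scheme. Assuming it means binary term presence, as the name suggests, the sketch below contrasts it with raw term frequency on a toy corpus. The whitespace tokenizer, function names, and example documents are illustrative assumptions, and the BRC cutting step itself is not shown.

```python
def build_vocabulary(docs):
    """Sorted list of distinct whitespace-separated tokens across all documents."""
    return sorted({token for doc in docs for token in doc.split()})

def presence_vector(doc, vocab):
    """Binary weights: 1 if the term occurs in the document, else 0."""
    tokens = set(doc.split())
    return [1 if term in tokens else 0 for term in vocab]

def frequency_vector(doc, vocab):
    """Raw term counts, shown only for comparison with presence weights."""
    tokens = doc.split()
    return [tokens.count(term) for term in vocab]

docs = ["good good movie", "bad movie"]
vocab = build_vocabulary(docs)
presence = [presence_vector(d, vocab) for d in docs]
frequency = [frequency_vector(d, vocab) for d in docs]
```

Under presence weighting a repeated word such as "good" contributes no more than a single occurrence, which is the only difference between the two vectors for the first document.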
