Zhong C.,Tongji University | Zhong C.,Key Laboratory of Embedded System | Zhong C.,Ningbo University | Miao D.,Tongji University | And 3 more authors.
Pattern Recognition | Year: 2010

Many clustering approaches have been proposed in the literature, but most of them are vulnerable to differences in cluster size, shape and density. In this paper, we present a graph-theoretical clustering method that is robust to such differences. Based on a graph composed of two rounds of minimum spanning trees (MST), the proposed method (2-MSTClus) classifies clustering problems into two groups, i.e. separated cluster problems and touching cluster problems, and identifies the two groups automatically. It contains two clustering algorithms, which deal with separated clusters and touching clusters in two phases, respectively. In the first phase, two rounds of minimum spanning trees are employed to construct a graph and detect separated clusters, which cover distance-separated and density-separated clusters. In the second phase, touching clusters, which are subgroups produced in the first phase, are partitioned by comparing cuts on each of the two rounds of minimum spanning trees. The proposed method is robust to varied cluster sizes, shapes and densities, and can discover the number of clusters. Experimental results on synthetic and real datasets demonstrate the performance of the proposed method. © 2009 Elsevier Ltd. All rights reserved.
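
The graph construction mentioned in the abstract above can be illustrated with a minimal sketch. It assumes Euclidean distances over a complete graph, and it assumes that the second-round tree is computed after removing the first-round edges. The function names `kruskal_forest` and `two_round_mst_graph` are hypothetical, and the sketch covers only the construction of the two-round graph, not the paper's separated-cluster or touching-cluster algorithms.

```python
from itertools import combinations
from math import dist


def kruskal_forest(points, excluded=frozenset()):
    """Minimum spanning forest of the complete Euclidean graph on `points`,
    ignoring any edge (i, j) with i < j listed in `excluded`."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
        if (i, j) not in excluded
    )
    forest = set()
    for _, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two components, so keep it
            parent[root_i] = root_j
            forest.add((i, j))
    return forest


def two_round_mst_graph(points):
    """Return (first-round edges, second-round edges).

    Assumption: the second round runs on the graph left after deleting
    the first-round edges, so the two edge sets never overlap.
    """
    first = kruskal_forest(points)
    second = kruskal_forest(points, excluded=frozenset(first))
    return first, second


# Two well-separated groups of three collinear points each.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
first_round, second_round = two_round_mst_graph(points)
print(sorted(first_round))
print(sorted(second_round))
```

In this toy input the first round links the two groups with a single long edge, while the second round has to cross the gap with several long edges. A contrast of this kind between short and long edges is what a separation test could look at; the actual criteria are those defined in the paper.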

Zhang X.,Sichuan Normal University | Zhang X.,Tongji University | Zhang X.,Key Laboratory of Embedded System | Miao D.,Tongji University | Miao D.,Key Laboratory of Embedded System
Information Sciences | Year: 2014

Attribute reduction is an essential subject in rough set theory, but because of quantitative extension, it becomes a problem when considering probabilistic rough set (PRS) approaches. The decision-theoretic rough set (DTRS) has a threshold semantics and decision feature and thus becomes a typical and fundamental PRS. Based on reduction target structures, this paper investigates hierarchical attribute reduction for a two-category DTRS and is divided into five parts. (1) The knowledge-preservation property and reduct are explored by knowledge coarsening. (2) The consistency-preservation principle and reduct are constructed by a consistency mechanism. (3) Region preservation is analyzed, and the separability between consistency preservation and region preservation is concluded; thus, the double-preservation principle and reduct are studied. (4) Structure targets are proposed by knowledge structures, and an attribute reduction is further described and simulated; thus, general reducts are defined to preserve the structure targets or optimal measures. (5) The hierarchical relationships of the relevant four targets and reducts are analyzed, and a decision table example is provided for illustration. This study offers promotion, rationality, structure, hierarchy and generalization, and it establishes a fundamental reduction framework for two-category DTRS. The relevant results also provide some new insights into the attribute reduction problem for PRS. © 2014 Elsevier Inc. All rights reserved.
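
As a minimal sketch of the threshold semantics mentioned in the abstract above, the following code computes the three regions of a two-category DTRS under the usual convention: an equivalence class goes to the positive region when its conditional probability of the target category is at least alpha, to the negative region when it is at most beta, and to the boundary region otherwise. The function name `dtrs_regions`, the toy decision table, and the threshold values 0.7 and 0.3 are assumptions for illustration and are not taken from the paper.

```python
from collections import defaultdict


def dtrs_regions(table, attrs, decision, target, alpha=0.7, beta=0.3):
    """Split row indices into POS/BND/NEG regions for one target category.

    Rows that agree on every attribute in `attrs` form one equivalence class;
    the class is placed by its conditional probability of `target`.
    """
    blocks = defaultdict(list)
    for index, row in enumerate(table):
        blocks[tuple(row[a] for a in attrs)].append(index)

    regions = {"POS": set(), "BND": set(), "NEG": set()}
    for members in blocks.values():
        hits = sum(table[i][decision] == target for i in members)
        probability = hits / len(members)
        if probability >= alpha:
            name = "POS"
        elif probability <= beta:
            name = "NEG"
        else:
            name = "BND"
        regions[name].update(members)
    return regions


# Hypothetical decision table: condition attributes a, b and decision d.
table = [
    {"a": 0, "b": 0, "d": 1},
    {"a": 0, "b": 0, "d": 1},
    {"a": 0, "b": 1, "d": 1},
    {"a": 0, "b": 1, "d": 0},
    {"a": 1, "b": 0, "d": 0},
    {"a": 1, "b": 0, "d": 0},
]

print(dtrs_regions(table, ["a", "b"], "d", target=1))
print(dtrs_regions(table, ["a"], "d", target=1))
```

Dropping attribute `b` merges two equivalence classes and moves rows 2 and 3 from the boundary region into the positive region. Because regions can change in this way when knowledge is coarsened, region preservation needs separate treatment in the probabilistic setting.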

Wang W.,Tongji University | Wang W.,Performance Technology | Wang W.,Key Laboratory of Embedded System | Zeng G.,Tongji University | And 3 more authors.
Expert Systems with Applications | Year: 2010

Content trust is one of the main components in information retrieval research. As it gets easier to add information to the Web via HTML pages, wikis, blogs, and other documents, it gets harder to distinguish accurate or trustworthy information from inaccurate or untrustworthy information on the Web. Current spam detection technology is based on a binary metric; that is, spam detection is treated as binary classification. To meet users' needs and preferences, a more accurate metric is needed for content trust as well as for detecting spam information. In this paper, we use the notion of content trust for spam detection and regard it as a ranking problem. Besides traditional text feature attributes, evidence based on information quality is introduced to define the trust features of spam information, and a novel content trust learning algorithm based on this evidence is proposed. Finally, a Web spam detection system is developed, and experiments on real Web data show that the proposed method performs very well in practice. © 2010 Elsevier Ltd.

Feng Q.,Shanxi Normal University | Feng Q.,Tongji University | Feng Q.,Key Laboratory of Embedded System | Miao D.,Tongji University | And 3 more authors.
Expert Systems with Applications | Year: 2010

Decision rules mining is an important technique in machine learning and data mining. It has been studied intensively during the past few years. However, most existing algorithms are based on flat datasets, from which the set of mined decision rules may be very large for large-scale data. Such a set of rules is neither easily understandable nor really useful for users. Moreover, too many rules may lead to overfitting. Thus, an approach to hierarchical decision rules mining is provided in this paper. It can mine decision rules at different levels of abstraction. The aim of this approach is to improve the quality and efficiency of decision rules mining by combining the hierarchical structure of the multidimensional data model with the techniques of rough set theory. The approach follows the so-called separate-and-conquer strategy. It not only provides a method of hierarchical decision rules mining but, more importantly, also reveals that a property-preserving relationship exists among decision rules mined at different levels, which can further improve the efficiency of decision rules mining. © 2009 Elsevier Ltd. All rights reserved.
