Pizzuti C.,CNR Institute for High Performance Computing and Networking | Rombo S.E.,University of Palermo
Bioinformatics | Year: 2014

Motivation: Protein-protein interaction (PPI) networks are powerful models to represent the pairwise protein interactions of the organisms. Clustering PPI networks can be useful for isolating groups of interacting proteins that participate in the same biological processes or that perform together specific biological functions. Evolutionary orthologies can be inferred this way, as well as functions and properties of yet uncharacterized proteins. Results: We present an overview of the main state-of-the-art clustering methods that have been applied to PPI networks over the past decade. We distinguish five specific categories of approaches, describe and compare their main features and then focus on one of them, i.e. population-based stochastic search. We provide an experimental evaluation, based on some validation measures widely used in the literature, of techniques in this class, that are as yet less explored than the others. In particular, we study how the capability of Genetic Algorithms (GAs) to extract clusters in PPI networks varies when different topology-based fitness functions are used, and we compare GAs with the main techniques in the other categories. The experimental campaign shows that predictions returned by GAs are often more accurate than those produced by the contestant methods. Interesting issues still remain open about possible generalizations of GAs allowing for cluster overlapping. © The Author 2014.

Cuzzocrea A.,CNR Institute for High Performance Computing and Networking
Concurrency Computation Practice and Experience | Year: 2011

Data and Knowledge Grids represent emerging and attractive application scenarios for Grid Computing, and pose novel and previously unrecognized challenges to the research community. Basically, Data and Knowledge Grids are founded on high-performance Grid infrastructures, and add to the latter meaningful data- and knowledge-oriented abstractions and metaphors that fit well with the innovative requirements of modern complex Intelligent Information Systems. To this end, Service-oriented Architectures and Paradigms are the most popular for Grids, and on the whole represent an active and widely recognized area of Grid Computing research. In this paper, we introduce the so-called Grid-based RTSOA frameworks, which essentially combine Grid Computing with real-time service management and execution paradigms, and emphasize real-time bound constraints as a novel research perspective for data-intensive e-science Grid applications. Grid-based RTSOA frameworks are then specialized to the particular context of Data Transformation services over Grids, which play a relevant role for both Data and Knowledge Grids. Finally, we complete the main contribution of this paper with a rigorous theoretical model for efficiently supporting Grid-based RTSOA frameworks, with particular emphasis on the context of Data Transformation services over Grids, along with its comprehensive experimental assessment and analysis. © 2010 John Wiley & Sons, Ltd.

Talia D.,CNR Institute for High Performance Computing and Networking
CEUR Workshop Proceedings | Year: 2011

Cloud computing systems provide large-scale infrastructures for high-performance computing that are "elastic" since they are able to adapt to user and application needs. Clouds are used through a service-oriented interface that implements the *-as-a-service paradigm to offer Cloud services on demand. This paper discusses Cloud computing models and architectures, their use in parallel and distributed applications, and examines analogies, differences and potential synergies between Cloud computing and multi-agent systems. This analysis is carried out with the goal of implementing high-performance complex systems and intelligent applications by using Cloud systems and software agents. The convergence of interests between multi-agent systems that need reliable distributed infrastructures and Cloud computing systems that need intelligent software with dynamic, flexible, and autonomous behavior can result in new systems and applications.

Folino F.,CNR Institute for High Performance Computing and Networking | Pizzuti C.,CNR Institute for High Performance Computing and Networking
IEEE Transactions on Knowledge and Data Engineering | Year: 2014

The discovery of evolving communities in dynamic networks is an important research topic that poses challenging tasks. Evolutionary clustering is a recent framework for clustering dynamic networks that introduces the concept of temporal smoothness inside the community structure detection method. Evolutionary-based clustering approaches try to maximize cluster accuracy with respect to incoming data of the current time step, and minimize clustering drift from one time step to the successive one. In order to optimize both these two competing objectives, an input parameter that controls the preference degree of a user towards either the snapshot quality or the temporal quality is needed. In this paper the detection of communities with temporal smoothness is formulated as a multiobjective problem and a method based on genetic algorithms is proposed. The main advantage of the algorithm is that it automatically provides a solution representing the best trade-off between the accuracy of the clustering obtained, and the deviation from one time step to the successive. Experiments on synthetic data sets show the very good performance of the method when compared with state-of-the-art approaches. © 2013 IEEE.
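The trade-off described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names, the convention that both objectives are costs to be minimized, and the sample values are assumptions made for illustration only. The first function shows the single-objective style that needs a user-supplied preference parameter (here called `alpha`); the second shows the multiobjective view, which keeps every non-dominated pair instead of fixing a weight in advance.

```python
from typing import List, Tuple


def weighted_cost(snapshot_cost: float, temporal_cost: float, alpha: float) -> float:
    """Combine the two objectives with a user-chosen weight (hypothetical name: alpha).

    alpha = 1 considers only the current snapshot; alpha = 0 considers only
    the drift from the previous time step.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * snapshot_cost + (1.0 - alpha) * temporal_cost


def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the non-dominated (snapshot_cost, temporal_cost) pairs.

    Both coordinates are treated as costs to minimize. A pair is dominated
    when a different pair is no worse in both coordinates.
    """
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] <= p[0] and q[1] <= p[1]
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


if __name__ == "__main__":
    # Made-up candidate clusterings, each scored as (snapshot_cost, temporal_cost).
    candidates = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
    print(pareto_front(candidates))
    print(weighted_cost(2.0, 4.0, alpha=0.5))
```

In the weighted form the result depends on the chosen `alpha`, whereas the Pareto filter returns the whole set of trade-offs from which a single solution can then be selected.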

Cuzzocrea A.,CNR Institute for High Performance Computing and Networking
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2011

In this paper, we introduce a novel framework for estimating OLAP queries over uncertain and imprecise multidimensional data streams, along with three relevant research contributions: (i) a probabilistic data stream model, which describes both precise and imprecise multidimensional data stream readings in terms of nice confidence-interval-based Probability Distribution Functions (PDF); (ii) a possible-world semantics for uncertain and imprecise multidimensional data streams, which is based on an innovative data-driven approach that exploits "natural" features of OLAP data, such as the presence of clusters and high correlations; (iii) an innovative approach for providing theoretically-founded estimates to OLAP queries over uncertain and imprecise multidimensional data streams that exploits the well-recognized probabilistic estimators theory. © 2011 Springer-Verlag Berlin Heidelberg.

Falco I.D.,CNR Institute for High Performance Computing and Networking
Applied Soft Computing Journal | Year: 2013

In this paper, a new approach based on Differential Evolution (DE) for the automatic classification of items in medical databases is proposed. Based on it, a tool called DEREx is presented, which automatically extracts explicit knowledge from the database in the form of IF-THEN rules containing AND-connected clauses on the database variables. Each DE individual codes for a set of rules. For each class more than one rule can be contained in the individual, and these rules can be seen as logically connected in OR. Furthermore, all the classifying rules for all the classes are found at once in one step. DEREx is intended as a useful support to decision making whenever explanations on why an item is assigned to a given class should be provided, as is the case for diagnosis in the medical domain. The major contribution of this paper is that DEREx is the first classification tool in the literature that is based on DE and automatically extracts sets of IF-THEN rules without the intervention of any other mechanism. In fact, all other classification tools based on DE existing in the literature either simply find centroids for the classes rather than extracting rules, or are hybrid systems in which DE simply optimizes some parameters whereas the classification capabilities are provided by other mechanisms. For the experiments eight databases from the medical domain have been considered. First, among ten classical DE variants, the most effective of them in terms of highest classification accuracy in a ten-fold cross-validation has been found. Secondly, the tool has been compared over the same eight databases against a set of fifteen classifiers widely used in the literature. The results have proven the effectiveness of the proposed approach, since DEREx turns out to be the best performing tool in terms of highest classification accuracy. Statistical analysis has also confirmed that DEREx is the best classifier. When compared to the other rule-based classification tools used here, DEREx needs the lowest average number of rules to face a problem, and the average number of clauses per rule is not very high. In conclusion, the tool presented here is preferable to the other classifiers because it shows good classification accuracy, automatically extracts knowledge, and provides users with it in an easily comprehensible form. © 2012 Elsevier B.V. All rights reserved.
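The rule structure described in this abstract (AND-connected clauses within a rule, several rules per class acting as an OR) can be sketched in Python as follows. This is only an illustration of how such a rule set might be evaluated, not the DEREx encoding or its DE search: the interval form of a clause, the first-match conflict policy, the class and function names, and the example variables are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Clause:
    """One condition on a database variable (assumed form: low <= value <= high)."""
    variable: str
    low: float
    high: float

    def holds(self, item: Dict[str, float]) -> bool:
        return self.low <= item[self.variable] <= self.high


@dataclass
class Rule:
    """IF all clauses hold (AND) THEN assign `label`."""
    clauses: List[Clause]
    label: str

    def fires(self, item: Dict[str, float]) -> bool:
        return all(clause.holds(item) for clause in self.clauses)


def classify(rules: List[Rule], item: Dict[str, float],
             default: Optional[str] = None) -> Optional[str]:
    """Return the label of the first rule that fires.

    Several rules may share a label, so they behave as an OR for that class.
    Taking the first match is an assumed conflict policy.
    """
    for rule in rules:
        if rule.fires(item):
            return rule.label
    return default


if __name__ == "__main__":
    # Hypothetical variables and thresholds, for illustration only.
    rules = [
        Rule([Clause("glucose", 140.0, 400.0), Clause("bmi", 30.0, 60.0)], "positive"),
        Rule([Clause("glucose", 200.0, 400.0)], "positive"),
        Rule([Clause("glucose", 0.0, 139.9)], "negative"),
    ]
    print(classify(rules, {"glucose": 150.0, "bmi": 33.0}))
    print(classify(rules, {"glucose": 100.0, "bmi": 22.0}))
```

A rule set of this kind is what makes the classification explainable: the rule that fired can be shown to the user as the reason for the assigned class.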

Cuzzocrea A.,CNR Institute for High Performance Computing and Networking
Proceedings - International Computer Software and Applications Conference | Year: 2013

This paper explores the convergence of Data Warehousing, OLAP and data-intensive Cloud Infrastructures in the context of so-called analytics over Big Data. The paper briefly reviews some state-of-the-art proposals, highlights open research issues and, finally, it draws possible research directions in this scientific field. © 2013 IEEE.

Masciari E.,CNR Institute for High Performance Computing and Networking
Information Sciences | Year: 2012

Data streams are potentially infinite data sources that flow continuously while monitoring a physical phenomenon, like temperature levels, or other kinds of human activities, such as clickstreams, telephone call records, and so on. RFID technology has led in recent years to the generation of huge streams of data. Moreover, RFID-based systems allow the effective management of items tagged by RFID tags, especially for supply chain management or object tracking. In this paper we introduce SMART (Stream Monitoring enterprise Activities by RFID Tags), a system based on an outlier template definition for detecting anomalies in RFID streams. We describe SMART features and its application to a real-life scenario that shows the effectiveness of the proposed method for enterprise management. Moreover, we describe an outlier detection approach we defined and effectively exploited in SMART. © 2012 Elsevier Inc. All rights reserved.

Coronato A.,CNR Institute for High Performance Computing and Networking
Sensors | Year: 2012

The design and realization of health monitoring applications have attracted the interest of large communities both from industry and academia. Several research challenges have been faced and issues tackled in order to realize effective applications for the management and monitoring of people with chronic diseases, people with disabilities, and elderly people. However, there is a lack of efficient tools that enable rapid and possibly cheap realization of reliable health monitoring applications. The paper presents Uranus, a service-oriented middleware architecture, which provides basic functions for the integration of different kinds of biomedical sensors. Uranus also has distinguishing characteristics like services for the run-time verification of the correctness of running applications and mechanisms for the recovery from failures. The paper concludes with two case studies as proof of concept. © 2012 by the authors.

Forestiero A.,CNR Institute for High Performance Computing and Networking
Information Sciences | Year: 2016

Many distributed systems continuously gather, produce and elaborate data, often as data streams that can change over time. Discovering anomalous data is fundamental to obtain critical and actionable information such as intrusions, faults, and system failures. This paper proposes a multi-agent algorithm to detect anomalies in distributed data streams. As data items arrive from any source, they are associated with bio-inspired agents and randomly disseminated onto a virtual space. The loaded agents move on the virtual space in order to form a group following the flocking algorithm. The agents group on the basis of a predefined concept of similarity of their associated objects. Only the agents associated with similar objects form a flock, whereas the agents associated with objects dissimilar to each other do not group in flocks. Anomalies are objects associated with isolated agents or objects associated with agents belonging to flocks with few elements. Swarm intelligence features of the approach, such as adaptivity, parallelism, asynchronism, and decentralization, make the algorithm scalable to very large data sets and very large distributed systems. Experimental results for real and synthetic datasets confirm the validity of the proposed model. © 2016 Elsevier Inc.
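The decision rule stated in this abstract (an object is anomalous when its agent is isolated or belongs to a very small flock) can be illustrated with a simplified Python sketch. It does not implement the flocking algorithm or any multi-agent dynamics: a plain gap-based grouping of one-dimensional values stands in for flock formation, and the function name and the parameters `max_gap` and `min_group_size` are assumptions introduced for the example.

```python
from typing import List, Sequence


def flag_anomalies(values: Sequence[float], max_gap: float,
                   min_group_size: int) -> List[int]:
    """Return indices of values that end up alone or in a small group.

    Stand-in for flock formation: after sorting, consecutive values that
    differ by at most `max_gap` are placed in the same group. Members of
    groups smaller than `min_group_size` are reported as anomalies.
    """
    if not values:
        return []
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if values[cur] - values[prev] <= max_gap:
            groups[-1].append(cur)   # similar enough: joins the current group
        else:
            groups.append([cur])     # too different: starts a new group
    return sorted(i for group in groups if len(group) < min_group_size for i in group)


if __name__ == "__main__":
    # Made-up readings: most cluster near 1.0, two sit near 5.0, one is at 9.0.
    readings = [1.0, 1.1, 1.2, 9.0, 1.3, 5.0, 5.1]
    print(flag_anomalies(readings, max_gap=0.5, min_group_size=3))
```

The threshold on group size plays the role of "flocks with few elements" in the abstract; how large a flock must be to count as normal is a tuning choice that the abstract does not specify.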
