Gif-sur-Yvette, France

Chen L.,Maison de la Simulation | Petiton S.,Lille University of Science and Technology
Proceedings - IEEE International Conference on Cluster Computing, ICCC | Year: 2015

Krylov subspace methods (KSMs) are widely used in solving large-scale sparse linear problems. The orthogonalization process in methods like GMRES consumes a majority of the time. Since modern manycore accelerators provide ample computing power, communication overheads remain a bottleneck, especially in clusters with a great number of nodes. The HA-PACS/TCA of Tsukuba University is a CPU-GPU hybrid cluster equipped with different interconnects for communications among GPUs. We test a group of Krylov basis computation methods with different sparse matrices and interconnects on HA-PACS/TCA. Results show that an auto-tuning scheme is required to deal with various types of matrices. © 2015 IEEE.


Ye F.,CEA Saclay Nuclear Research Center | Calvin C.,CEA Saclay Nuclear Research Center | Petiton S.G.,Maison de la Simulation | Petiton S.G.,Lille University of Science and Technology
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2015

Sparse Matrix-Vector Multiplication (SpMV) is fundamental to a broad spectrum of scientific and engineering applications, such as many iterative numerical methods. The widely used Compressed Sparse Row (CSR) sparse matrix storage format was chosen to carry out this study for sustainability and reusability reasons. We parallelized a vectorized SpMV kernel for the Intel Many Integrated Core (MIC) architecture using MPI and OpenMP, in both pure and hybrid versions. In comparison to pure models and vendor-supplied BLAS libraries across different mainstream architectures (CPU, GPU), the hybrid model exhibits a substantial improvement. To further assess the behavior of the hybrid model, we attribute its performance shortfalls to vectorization rate, irregularity of non-zeros, and load balancing issues. A mathematical relationship between the first two factors and the performance is then proposed based on the experimental data. © Springer International Publishing Switzerland 2015.
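For readers unfamiliar with the CSR format named in this abstract, the following is a minimal serial sketch of a CSR SpMV kernel. It is not the authors' vectorized MPI/OpenMP implementation; the function name `csr_spmv`, the array names, and the sample matrix are assumptions chosen for illustration.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Return y = A * x for a matrix A stored in CSR form (illustrative sketch)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Non-zeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical 3x3 example:
# A = [[4, 0, 1],
#      [0, 3, 0],
#      [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]   # non-zero entries, row by row
col_idx = [0, 2, 1, 0, 2]            # column index of each non-zero
row_ptr = [0, 2, 3, 5]               # start offset of each row in values
y = csr_spmv(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

The indirect access `x[col_idx[k]]` and the varying row lengths are where the irregularity of non-zeros and the load balancing concerns mentioned in the abstract show up in a kernel of this shape.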


Chen L.,Maison de la Simulation | Petiton S.G.,Maison de la Simulation | Petiton S.G.,Lille University of Science and Technology | Drummond L.A.,Lawrence Berkeley National Laboratory | Hugues M.,French Institute for Research in Computer Science and Automation
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2015

Krylov Subspace Methods (KSMs) are widely used for solving large-scale linear systems and eigenproblems. However, the computation of Krylov subspace bases suffers from the overhead of performing global reduction operations when computing the inner vector products in the orthogonalization steps. In this paper, a hypergraph-based communication optimization scheme is applied to the Arnoldi and incomplete Arnoldi methods of forming a Krylov subspace basis from a sparse matrix, and features of these methods are compared analytically. Finally, experiments on a CPU-GPU heterogeneous cluster show that our optimization improves the Arnoldi implementations for a generic matrix, with up to a 10x speedup for some special diagonal-structured matrices. The performance advantage also varies for different subspace sizes and matrix formats, which requires further integration of an auto-tuning strategy. © Springer International Publishing Switzerland 2015.
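To make the orthogonalization cost in this abstract concrete, here is a minimal serial sketch of the standard Arnoldi process. It assumes modified Gram-Schmidt orthogonalization and a small dense matrix, neither of which is stated in the abstract; it does not reproduce the paper's incomplete Arnoldi variant or its hypergraph-based optimization. The helper names `dot`, `matvec`, and `arnoldi` are illustrative.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def arnoldi(A, v0, m):
    """Run up to m Arnoldi steps; return basis Q and Hessenberg matrix H."""
    norm0 = math.sqrt(dot(v0, v0))
    Q = [[a / norm0 for a in v0]]
    H = [[0.0] * m for _ in range(m + 1)]
    for j in range(m):
        w = matvec(A, Q[j])
        for i in range(j + 1):
            # In a distributed setting each inner product is a global reduction.
            H[i][j] = dot(Q[i], w)
            w = [wk - H[i][j] * qk for wk, qk in zip(w, Q[i])]
        H[j + 1][j] = math.sqrt(dot(w, w))
        if H[j + 1][j] < 1e-12:
            break  # the Krylov subspace is invariant; stop early
        Q.append([wk / H[j + 1][j] for wk in w])
    return Q, H


# Hypothetical 3x3 symmetric example.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
Q, H = arnoldi(A, [1.0, 0.0, 0.0], 2)
```

Step j performs j + 1 inner products, so the number of reductions grows with the subspace size, which is the overhead the abstract refers to.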


Liu Z.,Maison de la Simulation | Liu Z.,University of Versailles | Emad N.,Maison de la Simulation | Emad N.,University of Versailles | And 2 more authors.
International Journal of Parallel Programming | Year: 2014

A parallel implementation based on the implicitly restarted Arnoldi method (MIRAM) is proposed for calculating the dominant eigenpair of stochastic matrices derived from very large real networks. Their high damping factor makes many existing algorithms less efficient, while MIRAM could be promising. We also apply this method in an epidemic application. We describe in this paper a stochastic model based on PageRank to simulate the epidemic spread, where a PageRank-like infection vector is calculated by MIRAM to help establish an efficient vaccination strategy. MIRAM is implemented within the framework of Trilinos, targeting big data and sparse matrices representing scale-free networks, also known as power law networks. A hypergraph partitioning approach is employed to minimize the communication overhead. The algorithm is tested on Grid5000, a nationwide cluster of clusters. Experiments on very large networks such as Twitter and Yahoo with over 1 billion nodes are conducted. With our parallel implementation, a speedup of (Formula presented.) is met compared to the sequential solver. © 2014 Springer Science+Business Media New York
