Varini N.,ICHEC Inc |
Varini N.,Curtin University Australia |
Ceresoli D.,CNR Institute of Molecular Science and Technologies |
Martin-Samos L.,CNR Institute of Materials |
And 4 more authors.
Computer Physics Communications | Year: 2013
One of the most promising techniques for studying the electronic properties of materials is based on the Density Functional Theory (DFT) approach and its extensions. DFT has been widely applied to traditional solid-state physics problems, where periodicity and symmetry play a crucial role in reducing the computational workload. With growing compute power and the development of improved DFT methods, the range of potential applications now includes other scientific areas such as Chemistry and Biology. However, cross-disciplinary combinations of traditional Solid-State Physics, Chemistry and Biology drastically increase the system complexity while reducing the degree of periodicity and symmetry. Large simulation cells containing hundreds or even thousands of atoms are needed to model these kinds of physical systems. The treatment of those systems remains a computational challenge even with modern supercomputers. In this paper we describe our work to improve the scalability of Quantum ESPRESSO (Giannozzi et al., 2009) for treating very large cells and huge numbers of electrons. To this end we have introduced an extra level of parallelism, over electronic bands, in three kernels for solving computationally expensive problems: the Sternheimer equation solver (Nuclear Magnetic Resonance, package QE-GIPAW), the Fock operator builder (electronic ground-state, package PWscf) and most of the Car-Parrinello routines (Car-Parrinello dynamics, package CP). Final benchmarks show our success in computing the Nuclear Magnetic Resonance (NMR) chemical shift of a large biological assembly, the electronic structure of defected amorphous silica with hybrid exchange-correlation functionals and the equilibrium atomic structure of eight porphyrins anchored to a carbon nanotube, on many thousands of CPU cores. © 2013 Elsevier B.V. All rights reserved.
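The extra level of parallelism over electronic bands mentioned in this abstract can be pictured as dividing the band indices among groups of processes. The following is a minimal sketch only, assuming a contiguous block distribution; the helper name `distribute_bands` and the distribution scheme are illustrative assumptions and are not taken from the Quantum ESPRESSO source.

```python
def distribute_bands(n_bands, n_groups):
    """Split band indices 0..n_bands-1 into n_groups contiguous blocks.

    Hypothetical helper for illustration only. The first
    (n_bands % n_groups) groups receive one extra band so that the
    load is as even as possible.
    """
    if n_groups <= 0:
        raise ValueError("n_groups must be positive")
    base, extra = divmod(n_bands, n_groups)
    blocks = []
    start = 0
    for group in range(n_groups):
        size = base + (1 if group < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


if __name__ == "__main__":
    # Each band group would then work on its own block of bands.
    for group, bands in enumerate(distribute_bands(10, 4)):
        print(group, list(bands))
```

In a real code each band group would typically also be parallelised internally (for example over plane waves), so the band level multiplies the number of cores that can be used rather than replacing the existing levels.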
Fursin G.,University Paris - Sud |
Miceli R.,University of Rennes 1 |
Miceli R.,ICHEC Inc |
Lokhmotov A.,ARM Inc |
And 6 more authors.
Scientific Programming | Year: 2014
Empirical auto-tuning and machine learning techniques have shown high potential to improve execution time, power consumption, code size, reliability and other important metrics of various applications for more than two decades. However, they are still far from widespread production use due to the lack of native support for auto-tuning in an ever-changing and complex software and hardware stack, large and multi-dimensional optimization spaces, excessively long exploration times, and the lack of unified mechanisms for preserving and sharing optimization knowledge and research material. We present a possible collaborative approach to solving the above problems using the Collective Mind knowledge management system. In contrast with the previous cTuning framework, this modular infrastructure makes it possible to preserve and share through the Internet whole auto-tuning setups with all related artifacts and their software and hardware dependencies, rather than just performance data. It also makes it possible to gradually structure, systematize and describe all available research material, including tools, benchmarks, data sets, search strategies and machine learning models. Researchers can take advantage of shared components and data with extensible meta-description to quickly and collaboratively validate and improve existing auto-tuning and benchmarking techniques or prototype new ones. The community can now gradually learn and improve the complex behavior of all existing computer systems while exposing behavior anomalies or model mispredictions to an interdisciplinary community in a reproducible way for further analysis. We present several practical, collaborative and model-driven auto-tuning scenarios. We also decided to release all material at c-mind.org/repo to set an example for collaborative and reproducible research, as well as for our new publication model in computer engineering, where experimental results are continuously shared and validated by the community. © 2014 - IOS Press and the authors. All rights reserved.
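The empirical auto-tuning idea in this abstract, exploring an optimization space and keeping the best-measured configuration, can be sketched as follows. This is an illustrative sketch only: the names `autotune` and `timed`, the exhaustive search strategy and the example parameters are assumptions made here, and none of them belong to the Collective Mind or cTuning APIs.

```python
import itertools
import time


def autotune(measure, space):
    """Try every configuration in `space` and return the cheapest one.

    measure -- callable taking the parameters as keyword arguments and
               returning a cost (lower is better), e.g. a run time
    space   -- dict mapping parameter name to a list of candidate values

    Exhaustive search is an assumption made for brevity; large spaces
    need a smarter search strategy.
    """
    names = sorted(space)
    results = []
    for values in itertools.product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        results.append((measure(**config), config))
    best_cost, best_config = min(results, key=lambda item: item[0])
    return best_config, best_cost, results


def timed(func, repeats=3):
    """Wrap `func` so that calling it returns its best wall-clock time."""
    def measure(**config):
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            func(**config)
            best = min(best, time.perf_counter() - start)
        return best
    return measure


if __name__ == "__main__":
    data = list(range(200_000))

    def blocked_sum(block):
        # Toy kernel: sum the data in chunks of `block` elements.
        return sum(sum(data[i:i + block]) for i in range(0, len(data), block))

    config, cost, _ = autotune(timed(blocked_sum), {"block": [16, 256, 4096]})
    print(config, cost)
```

Keeping the full `results` list, not just the winner, mirrors the point made in the abstract that the whole experimental record is worth preserving and sharing.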
Del Bene J.E.,Youngstown State University |
Alkorta I.,Institute Quimica Medica IQM CSIC |
Elguero J.,Institute Quimica Medica IQM CSIC |
Sanchez-Sanz G.,ICHEC Inc
Journal of Physical Chemistry A | Year: 2017
Ab initio MP2/aug'-cc-pVTZ calculations have been performed on the binary complexes XY:PH3 for XY = ClCl, FCl, and FBr; and PH3:N-base for N-base = NCH, NH3, NCF, NCCN, and N2; and the corresponding ternary complexes XY:PH3:N-base, to investigate P···N pnicogen bond formation through the lone-pair hole at P in the binary complexes and P···N pnicogen bond formation assisted by P···Y halogen bond formation through the σ-hole at Y. Although the binary complexes PH3:N-base that form through the lone-pair hole have very small binding energies, they are not equilibrium structures on their potential surfaces. The presence of the P···Y halogen bond makes PH3 a better electron-pair acceptor through its lone-pair hole, leading to stable ternary complexes XY:PH3:N-base. The halogen bonds in ClCl:PH3 and ClCl:PH3:NCCN are traditional halogen bonds, but in the remaining binary and ternary complexes, they are chlorine- or bromine-shared halogen bonds. For a given nitrogen base, the P···N pnicogen bond in the ternary complex FCl:PH3:N-base appears to be stronger than that bond in FBr:PH3:N-base, which is stronger than the P···N bond in the corresponding ClCl:PH3:N-base complex. EOM-CCSD spin-spin coupling constants for the binary and ternary complexes with ClCl and FCl are also consistent with the changing nature of the halogen bonds in these complexes. At long P-Cl distances, the coupling constant 1xJ(P-Cl) increases with decreasing distance but then decreases as the P-Cl distance continues to decrease, and the halogen bonds become chlorine-shared bonds. At the shorter distances, 1xJ(P-Cl) approaches the value of 1J(P-Cl) for the cation +(Cl-PH3). The coupling constants 1pJ(P-N) are small and, with one exception, are greater in the ClCl:PH3:N-base complexes than in the FCl:PH3:N-base complexes, despite the shorter P-N distances in the latter. © 2017 American Chemical Society.
Dubois V.,CNRS Institute of Chemistry |
Jannes G.,CNRS Institute of Chemistry |
Jannes G.,ICHEC Inc
Applied Catalysis A: General | Year: 2014
2-Methyl-2-nitropropane hydrogenation proceeds along a rake scheme that may encompass homogeneous steps as well as reactions on the support and on the metallic phase. This complexity makes it useful as a test reaction. Reaction rates, selectivities and the accumulation of intermediate compounds provide experimental information on the modification of catalytic parameters such as support functionality and metallic dispersion. Experiments were carried out in homogeneous phase, on bare carbon supports, with mechanical mixtures of bare carbon supports and carbon-supported catalysts, and with catalysts prepared on modified supports. We have shown that hydrogen activation takes place on the metal, and that organic reactants and intermediates may be activated on both the metal and the carbon support and may react with hydrogen available after spillover and jumpover migrations. © 2013 Elsevier B.V.
Sabatini R.,International School for Advanced Studies |
Gorni T.,University of Modena and Reggio Emilia |
Gorni T.,ICHEC Inc |
De Gironcoli S.,International School for Advanced Studies |
De Gironcoli S.,CNR Institute of Materials
Physical Review B - Condensed Matter and Materials Physics | Year: 2013
We present a simple revision of the VV10 nonlocal density functional by Vydrov and Van Voorhis for dispersion interactions. Unlike the original functional, our modification allows the nonlocal correlation energy and its derivatives to be efficiently evaluated in a plane-wave framework along the lines pioneered by Román-Pérez and Soler. Our revised functional maintains the outstanding precision of the original VV10 in noncovalently bound complexes and performs well in representative covalent, ionic, and metallic solids. © 2013 American Physical Society.
O'Sullivan J.,University College Dublin |
Sweeney C.,University College Dublin |
Nolan P.,ICHEC Inc |
Gleeson E.,Met. Eireann
International Journal of Climatology | Year: 2015
There is a paucity of high-resolution, dynamically downscaled climate model output over Ireland that provides temperature projections for the mid-21st century. This study aims to address this shortcoming. A preliminary investigation of global climate model (GCM) data and high-resolution regional climate model (RCM) data shows that the latter exhibit greater variability over Ireland by reducing the dominance of the surrounding seas on the climate signal. This motivates the subsequent dynamical downscaling and analysis of the temperature output from three high-resolution (4-7 km grid size) RCMs over Ireland. The three RCMs, driven by four GCMs from CMIP3 and CMIP5, were run under different Special Report on Emissions Scenarios (SRES) and representative concentration pathway (RCP) future scenarios. Projections of mean and extreme temperature changes are considered for the mid-century (2041-2060) and assessed relative to the control period of 1981-2000. Analysis of the RCM data shows that annual mean temperatures are projected to rise between 0.4 and 1.8 °C above control levels by mid-century. On a seasonal basis, results differ by forcing scenario. Future summers have the largest projected warming under RCP 8.5, where the greatest warming is seen in the southeast of Ireland. The remaining two high-emission scenarios (SRES A1B and A2) project future winters to have the greatest warming, with almost uniform increases of 1.5-2 °C across the island. Changes in the bidecadal 5th and 95th percentile values of daily minimum and maximum temperatures, respectively, are also analysed. The greatest change in daily minimum temperature is projected for future winters (indicating fewer cold nights and frost days), a pattern that is consistent across all scenarios/forcings. An investigation into the distribution of temperature under RCP 8.5 shows a strong summer increase compounded by increased variability, and a winter increase compounded by an increase in skewness. © 2015 Royal Meteorological Society.
Bull J.M.,University of Edinburgh |
Reid F.,University of Edinburgh |
McDonnell N.,ICHEC Inc
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2012
We present a set of extensions to an existing microbenchmark suite for OpenMP. The new benchmarks measure the overhead of the task construct introduced in the OpenMP 3.0 standard, and associated task synchronisation constructs. We present the results from a variety of compilers and hardware platforms, which demonstrate some significant differences in performance between different OpenMP implementations. © 2012 Springer-Verlag.
Farber R.,ICHEC Inc
Scientific Computing | Year: 2011
OpenCL and NVIDIA's CUDA are competing to become the common application platform for both GPU computing and x86 computers. The Portland Group (PGI) has introduced a native CUDA-x86 compiler that changes the decision-making process dramatically and makes CUDA a candidate for all. The PGI CUDA C/C++ compiler is a native compiler that transparently compiles CUDA to run on x86 systems even when a GPU is not present in the system. In 2012, the PGI compiler will be able to create a unified binary, which will simplify the software distribution process tremendously. On the other hand, OpenCL is an open, royalty-free standard for cross-platform, parallel programming of modern processors found in personal computers, servers and handheld/embedded devices. A strength of OpenCL is the flexibility it provides to support portability across multiple device types and configurations. OpenCL also offers an offline compilation capability that can be used to protect kernel source code, although it limits execution to the devices for which the kernels were precompiled.
Spiga F.,ICHEC Inc |
Girotto I.,ICHEC Inc
Proceedings - 20th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2012 | Year: 2012
GPU computing has revolutionized HPC by bringing the performance of the supercomputer to the desktop. Attractive price, performance, and power characteristics allow multiple GPUs to be plugged into both desktop machines and supercomputer nodes for increased performance. Excellent performance and scalability can be achieved for some problems using hybrid combinations of multiple GPUs and CPU computing resources. This paper presents the acceleration of the open-source QUANTUM ESPRESSO package with the freely available PHIGEMM library. Specifically, the parallel implementation and scaling of the PHIGEMM matrix-matrix multiplication are discussed. This library can be called from applications through all standard GEMM interfaces and is able to perform matrix-matrix multiplications using one or more GPUs as well as the host multi-core processor. An 8.9-times speedup is reported in the overall run-time of a representative AUSURF112 benchmark for a PWSCF calculation. In addition, multi-GPU scaling and performance for 3D-FFTs are discussed. © 2012 IEEE.
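The idea of sharing one matrix-matrix multiplication between several compute resources, as this abstract describes for GPUs plus the host processor, can be illustrated with a small sketch. The code below is not the PHIGEMM API: the function names, the row-wise split and the `split` parameter are assumptions made for illustration, and two Python threads merely stand in for the two devices.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(a, b):
    """Product of two matrices stored as row-major nested lists."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns]
            for row in a]


def split_gemm(a, b, split=0.5):
    """Compute a @ b by giving the first `split` fraction of the rows of
    `a` to one worker and the remaining rows to another.

    Hypothetical sketch: in a hybrid setting the two blocks would be
    dispatched to different devices (for example GPU and CPU); here two
    threads stand in for them.
    """
    if not 0.0 <= split <= 1.0:
        raise ValueError("split must be between 0 and 1")
    cut = round(len(a) * split)
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(matmul, a[:cut], b)
        second = pool.submit(matmul, a[cut:], b)
        # Stacking the two row blocks reproduces the full product.
        return first.result() + second.result()


if __name__ == "__main__":
    a = [[1, 2], [3, 4], [5, 6]]
    b = [[7, 8], [9, 10]]
    print(split_gemm(a, b, split=0.5))
```

Because the row blocks of the result are independent, the split fraction can be tuned to the relative speed of the two resources without changing the answer.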