Luo Q.,National High Performance Computing Center | Zhou Y.,National High Performance Computing Center | Kong C.,National High Performance Computing Center | Liu G.,National High Performance Computing Center | And 2 more authors.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2013

Two NUMA architectures with different memory subsystems are experimentally analyzed in this paper. By applying a benchmark with various access patterns, we show that the memory systems of the Xeon E5620, which uses a Global Queue, and the LS 3A, which uses a typical crossbar switch, behave quite differently. The experimental results reveal that the LS 3A and the Xeon E5620 share some features, but our study also uncovered significant differences between the two platforms: because the locations and mechanisms of contention differ, the memory access model for the E5620 does not fit the LS 3A. Through comparison, we find that one advantage of the LS 3A is that it obtains steady bandwidth for both local and remote threads, and it treats local and remote accesses more fairly under some circumstances. Moreover, the LS 3A is less sensitive to remote access than the E5620, so non-local memory access causes no obvious performance degradation. © 2013 IFIP International Federation for Information Processing.


Meng H.,Hefei University of Technology | Meng H.,National High Performance Computing Center | Li J.,Hefei University of Technology | Li J.,National High Performance Computing Center | And 4 more authors.
International Review on Computers and Software | Year: 2013

Despite their complex deduplication workflows, existing source deduplication solutions for cloud backup systems achieve an unsatisfactory deduplication efficiency/overhead ratio because they make insufficient use of file semantics. In this paper, we present MMSD, a metadata-aware multi-tiered source deduplication cloud backup system for the personal computing environment, which obtains an optimal trade-off between deduplication efficiency and deduplication metadata storage overhead, and thereby achieves a shorter backup window than existing approaches. MMSD makes full use of file metadata (file size, type, timestamp, path information, and modification frequency) and of the advantage that whole-file-level deduplication holds over chunk-level deduplication, efficiently combining the two deduplication levels. Our experimental results on real-world datasets show that, compared with state-of-the-art source deduplication methods, MMSD improves the overall deduplication efficiency from 75% to 91% with only 33.8% of the deduplication metadata storage overhead, and shortens the backup window by at least 54.2%. © 2013 Praise Worthy Prize S.r.l. - All rights reserved.


Luo T.,National High Performance Computing Center | Luo T.,Hefei University of Technology | Luo T.,CAS Institute of Software | Yuan W.,CAS Institute of Software | And 5 more authors.
International Review on Computers and Software | Year: 2013

Compared with traditional data warehouse applications, big data analytics workloads are huge and complex, and they require massive performance and scalability. In this paper, we explore the feasibility and versatility of building a hybrid system that retains the analytical DBMS while also handling the demands of rapidly growing data applications. We propose a hybrid system prototype that uses the DBMS as the underlying storage and execution unit and Hadoop as an index layer and cache. Experiments show that our system meets these demands and is well suited to similar big data analysis applications. © 2013 Praise Worthy Prize S.r.l. - All rights reserved.


Luo Q.,National High Performance Computing Center | Luo Q.,Shenzhen University | Liu C.,Shenzhen University | Kong C.,Shenzhen University | And 3 more authors.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2012

This paper provides a set of typical memory access patterns, programmed in C, that can be used as a benchmark to characterize the various techniques and algorithms that aim to improve the performance of NUMA memory access. These access patterns, called MAP-numa (Memory Access Patterns for NUMA), currently comprise three classes, whose working data sets correspond to a 1-dimensional array, a 2-dimensional matrix, and a 3-dimensional cube. The benchmark is dedicated to NUMA memory access optimization rather than to measuring memory bandwidth and latency, and it is an alternative to existing benchmarks such as STREAM and pChase. It is used to verify the effectiveness of optimizations (made automatically or manually to source code or executable binaries) by investigating which locality leakage they remedy. We present experimental results that illustrate the use of MAP-numa to evaluate optimizations based on Oprofile sampling. © IFIP International Federation for Information Processing 2012.


Luo Q.,National High Performance Computing Center | Luo Q.,Shenzhen University | Liu C.,Shenzhen University | Kong C.,Shenzhen University | And 3 more authors.
Parallel and Distributed Computing, Applications and Technologies, PDCAT Proceedings | Year: 2012

Sustaining memory locality is critical for obtaining high performance on NUMA systems, but how to identify a locality leakage problem, and how to measure that leakage, remain open issues. This paper provides an algorithm to quantitatively measure locality leakage based on the memory trace produced by IBS (Instruction-Based Sampling). A "perfect matrix" PM, representing the highest-locality pattern, is generated from the virtual memory address trace; a "communication matrix" CM, describing the actual memory access pattern, is obtained from the physical memory address trace. Penalty factors are calculated from PM and CM, taking the hardware NUMA factor into account. The leakage is measured as the difference between the penalty factors of PM and those of CM, which can be used to estimate the performance decrease and to guide optimization. Experimental results are presented that confirm the effectiveness and accuracy of our quantitative measurement. © 2012 IEEE.
