Li S.,Hefei University of Technology |
Li S.,Key Laboratory on High Performance Computing of Anhui Province |
Li S.,Intel Corporation |
Cheng B.,Intel Corporation |
And 5 more authors.
Jisuanji Yanjiu yu Fazhan/Computer Research and Development | Year: 2012
Long B.,Anhui University of Science and Technology |
Long B.,Key Laboratory on High Performance Computing of Anhui Province |
Sun G.-Z.,Anhui University of Science and Technology |
Sun G.-Z.,Key Laboratory on High Performance Computing of Anhui Province |
And 3 more authors.
Tien Tzu Hsueh Pao/Acta Electronica Sinica | Year: 2011
We present a hybrid index structure for high-dimensional data, named the HKD-tree (Hybrid K-Dimensional Tree). To exploit the two-level parallelism of multi-core clusters, it combines a KD-tree with locality-sensitive hashing (LSH), using LSH in the leaf nodes of the KD-tree. Compared with traditional index structures, the hybrid index offers effective parallel processing and good scalability, which makes it suitable for indexing high-dimensional data on multi-core cluster platforms. Experimental results show that the hybrid index structure outperforms traditional index structures on multi-core cluster systems.
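The idea of placing LSH tables in KD-tree leaves can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the names `Leaf`, `Node`, `build` and `query`, the random-hyperplane hash family, and the parameters `leaf_size` and `n_bits` are all assumptions made for illustration, and the distribution of subtrees across cluster nodes and cores is left out.

```python
import random

class Leaf:
    """KD-tree leaf that indexes its points with a random-hyperplane LSH table."""
    def __init__(self, points, dim, n_bits, rng):
        # Each hash bit is the sign of a dot product with a random direction.
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
        self.buckets = {}
        for p in points:
            self.buckets.setdefault(self._key(p), []).append(p)

    def _key(self, p):
        return tuple(sum(a * b for a, b in zip(plane, p)) >= 0
                     for plane in self.planes)

    def candidates(self, q):
        # Points sharing the query's bucket are the approximate-neighbour candidates.
        return self.buckets.get(self._key(q), [])

class Node:
    """Internal KD-tree node: splits space on one axis at a threshold."""
    def __init__(self, axis, split, left, right):
        self.axis, self.split, self.left, self.right = axis, split, left, right

def build(points, dim, leaf_size=8, n_bits=4, depth=0, rng=None):
    if rng is None:
        rng = random.Random(0)
    if len(points) <= leaf_size:
        return Leaf(points, dim, n_bits, rng)
    axis = depth % dim
    split = sorted(p[axis] for p in points)[len(points) // 2]
    left = [p for p in points if p[axis] < split]
    right = [p for p in points if p[axis] >= split]
    if not left:  # all values on this axis are equal: stop splitting here
        return Leaf(points, dim, n_bits, rng)
    return Node(axis, split,
                build(left, dim, leaf_size, n_bits, depth + 1, rng),
                build(right, dim, leaf_size, n_bits, depth + 1, rng))

def query(node, q):
    # Route the query through the KD-tree, then probe the leaf's LSH table.
    while isinstance(node, Node):
        node = node.left if q[node.axis] < node.split else node.right
    return node.candidates(q)
```

In this sketch the KD-tree levels partition the data coarsely and each leaf is an independent LSH index, so one plausible mapping onto a multi-core cluster is subtrees to machines and leaf probes to cores. That mapping is an assumption; the abstract does not describe the actual scheme.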
Fang W.,Hefei University of Technology |
Fang W.,Key Laboratory on High Performance Computing of Anhui Province |
Sun G.,Hefei University of Technology |
Sun G.,Key Laboratory on High Performance Computing of Anhui Province |
And 4 more authors.
Jisuanji Yanjiu yu Fazhan/Computer Research and Development | Year: 2011
The three-dimensional fast Fourier transform (3D-FFT) is widely used in physics. It is critical to many applications because it demands heavy computation and communication, so in most cases the 3D-FFT dominates the computational time. Traditional parallel 3D-FFT algorithms are not suitable for the sparse lattices often encountered in quantum computing, because the block partitioning they use can incur much redundant computation and communication, owing to the sparsity of non-zero elements in the FFT grid. In this paper we propose a novel parallel 3D-FFT algorithm. Unlike previous methods, the new algorithm uses slice partitioning and reorders the computation to minimize calculation time and communication cost. Thanks to slice partitioning, the new method is highly scalable and automatically satisfies the demands of load balancing. We compare it with traditional algorithms in theory and in practice. Theoretical performance analysis shows that the new method can greatly reduce the computational time and increase parallel speedup. Experiments were carried out on several high-performance machines, including KD-50, IBM JS22 and DAWNING. The results show that the new algorithm performs much better than traditional algorithms on 3D-FFTs for sparse lattices.
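Why sparsity permits skipped work can be seen in a small serial sketch. It is not the authors' algorithm: the function names, the dictionary layout of non-zero z-columns, and the naive O(n^2) `dft` standing in for a real FFT routine are all assumptions made for illustration. The transform runs along z only for stored columns, then handles each z-plane independently, which is the property a slice partitioning could exploit.

```python
import cmath

def dft(x):
    """Naive O(n^2) 1D DFT, standing in for a real FFT routine."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def sparse_dft3(columns, n):
    """3D DFT of an n*n*n grid given as {(x, y): [n values along z]},
    where only the non-zero columns are stored."""
    # Step 1: transform along z, touching only the stored columns.
    zt = {xy: dft(col) for xy, col in columns.items()}
    rows = sorted({x for x, _ in zt})  # x indices that hold any data
    out = [[[0j] * n for _ in range(n)] for _ in range(n)]
    # Step 2: every z-plane is independent, so planes could be dealt out
    # to different processes (hypothetical mapping, not shown here).
    for kz in range(n):
        # Along y, only for the x rows that hold data.
        yt = {x: dft([zt[(x, y)][kz] if (x, y) in zt else 0j for y in range(n)])
              for x in rows}
        # Along x, for every ky.
        for ky in range(n):
            res = dft([yt[x][ky] if x in yt else 0j for x in range(n)])
            for kx in range(n):
                out[kx][ky][kz] = res[kx]
    return out

def direct_dft3(columns, n):
    """Triple-sum definition of the 3D DFT, used only as a reference."""
    out = [[[0j] * n for _ in range(n)] for _ in range(n)]
    for kx in range(n):
        for ky in range(n):
            for kz in range(n):
                out[kx][ky][kz] = sum(
                    v * cmath.exp(-2j * cmath.pi * (kx * x + ky * y + kz * z) / n)
                    for (x, y), col in columns.items()
                    for z, v in enumerate(col))
    return out
```

For a grid with only a few non-zero columns, step 1 performs one 1D transform per stored column instead of n*n of them, and the y-direction pass skips empty rows. The savings the paper reports on real machines also depend on communication patterns that this serial sketch does not model.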