Fantalgo LLC

Elkridge, MD, United States

Berlin K.,University of Maryland University College | Gumerov N.A.,University of Maryland University College | Gumerov N.A.,Fantalgo LLC | Fushman D.,University of Maryland University College | And 2 more authors.
Journal of Applied Crystallography | Year: 2014

The need for fast approximate algorithms for Debye summation arises in computations performed in crystallography, small/wide-angle X-ray scattering and small-angle neutron scattering. When integrated into structure refinement protocols these algorithms can provide significant speed up over direct all-atom-to-all-atom computation. However, these protocols often employ an iterative gradient-based optimization procedure, which then requires derivatives of the profile with respect to atomic coordinates. This article presents an accurate, O(N) cost algorithm for the computation of scattering profile derivatives. The results reported here show orders of magnitude improvement in computational efficiency, while maintaining the prescribed accuracy. This opens the possibility to efficiently integrate small-angle scattering data into the structure determination and refinement of macromolecular systems. © 2014.
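For orientation, the quantity being differentiated can be sketched directly. The block below is a brute-force O(N^2) reference, not the O(N) algorithm of the article. It assumes the Debye profile I(q) = sum_i sum_j f_i f_j sin(q r_ij)/(q r_ij) with q-independent form factors f_i, and the function names debye_intensity and debye_gradient are hypothetical. A reference like this is mainly useful for checking a fast method against exact values on small systems.

```python
import numpy as np

def debye_intensity(coords, f, q):
    """Direct O(N^2) Debye sum at a single scattering magnitude q."""
    coords = np.asarray(coords, dtype=float)
    f = np.asarray(f, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    # np.sinc(x) is sin(pi x)/(pi x), so rescale the argument by 1/pi.
    return float((np.outer(f, f) * np.sinc(q * r / np.pi)).sum())

def debye_gradient(coords, f, q):
    """Direct O(N^2) gradient of I(q) with respect to each atom position.

    Assumes q > 0 and that no two atoms coincide. Returns an (N, 3) array.
    """
    coords = np.asarray(coords, dtype=float)
    f = np.asarray(f, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]   # diff[k, j] = x_k - x_j
    r = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(r, 1.0)     # placeholder to avoid 0/0; masked out below
    x = q * r
    dsinc = (x * np.cos(x) - np.sin(x)) / x**2       # d/dx [sin(x)/x]
    # Each unordered pair appears twice in the double sum, hence the factor 2.
    coef = 2.0 * np.outer(f, f) * dsinc * q / r
    np.fill_diagonal(coef, 0.0)  # self terms do not depend on coordinates
    return (coef[:, :, None] * diff).sum(axis=1)
```

Comparing debye_gradient against central finite differences of debye_intensity is a simple way to validate either function.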


Gumerov N.A.,University of Maryland University College | Gumerov N.A.,Fantalgo LLC | Berlin K.,University of Maryland University College | Fushman D.,University of Maryland University College | And 2 more authors.
Journal of Computational Chemistry | Year: 2012

Debye summation, which involves the summation of sinc functions of distances between all pairs of atoms in three-dimensional space, arises in computations performed in crystallography, small/wide angle X-ray scattering (SAXS/WAXS), and small angle neutron scattering (SANS). Direct evaluation of Debye summation has quadratic complexity, which results in a computational bottleneck when determining crystal properties, or running structure refinement protocols that involve SAXS or SANS, even for moderately sized molecules. We present a fast approximation algorithm that efficiently computes the summation to any prescribed accuracy ε in linear time. The algorithm is similar to the fast multipole method (FMM), and is based on a hierarchical spatial decomposition of the molecule coupled with local harmonic expansions and translation of these expansions. An even more efficient implementation is possible when the scattering profile is all that is required, as in small angle scattering reconstruction (SAS) of macromolecules. We examine the relationship of the proposed algorithm to existing approximate methods for profile computations, and show that these methods may result in inaccurate profile computations, unless an error-bound derived in this article is used. Our theoretical and computational results show orders of magnitude improvement in computational complexity over existing methods, while maintaining prescribed accuracy. © 2012 Wiley Periodicals, Inc.
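The quadratic-cost direct evaluation mentioned in this abstract can be sketched as follows. This is only the brute-force baseline, not the hierarchical algorithm of the article. It assumes constant (q-independent) form factors, and the name debye_direct is hypothetical.

```python
import numpy as np

def debye_direct(coords, f, q_values):
    """Direct Debye sum I(q) = sum_ij f_i f_j sinc(q r_ij), O(N^2) per q value."""
    coords = np.asarray(coords, dtype=float)
    f = np.asarray(f, dtype=float)
    # All pairwise distances; this N x N matrix is the quadratic bottleneck.
    diff = coords[:, None, :] - coords[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    weights = np.outer(f, f)
    profile = []
    for q in q_values:
        # np.sinc(x) is sin(pi x)/(pi x), so rescale the argument by 1/pi.
        profile.append((weights * np.sinc(q * r / np.pi)).sum())
    return np.array(profile)
```

For two unit scatterers separated by a distance d, this reduces to I(q) = 2 + 2 sin(qd)/(qd), which gives a quick sanity check.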


Gumerov N.A.,University of Maryland University College | Duraiswami R.,Fantalgo LLC
Journal of Computational Physics | Year: 2013

Vortex methods are used to efficiently simulate incompressible flows using Lagrangian techniques. Use of the FMM (Fast Multipole Method) allows considerable speed up of both velocity evaluation and vorticity evolution terms in these methods. Both equations require field evaluation of constrained (divergence free) vector valued quantities (velocity, vorticity) and cross terms from these. These are usually evaluated by performing several FMM accelerated sums of scalar harmonic functions. We present a formulation of vortex methods based on the Lamb-Helmholtz decomposition of the velocity in terms of two scalar potentials. In its original form, this decomposition is not invariant with respect to translation, violating a key requirement for the FMM. One of the key contributions of this paper is a theory for translation for this representation. The translation theory is developed by introducing "conversion" operators, which enable the representation to be restored in an arbitrary reference frame. Using this form, efficient vortex element computations can be made, which need evaluation of just two scalar harmonic FMM sums for evaluating the velocity and vorticity evolution terms. Details of the decomposition, translation and conversion formulae, and sample numerical results are presented. © 2013 Elsevier Inc.


Hu Q.,University of Maryland College Park | Hu Q.,University of Maryland University College | Gumerov N.A.,University of Maryland College Park | Gumerov N.A.,Fantalgo LLC | And 5 more authors.
Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS | Year: 2014

The fast multipole method (FMM) is often used to accelerate the calculation of particle interactions in particle-based methods to simulate incompressible flows. To evaluate the most time-consuming kernels, the Biot-Savart equation and the stretching term of the vorticity equation, we mathematically reformulated them so that only two Laplace scalar potentials are used instead of six. This automatically ensures divergence-free far-field computation. Based on this formulation, we developed a new FMM-based vortex method on heterogeneous architectures, which distributes the work between multicore CPUs and GPUs to best utilize the hardware resources and achieve excellent scalability. The algorithm uses new data structures which can dynamically manage inter-node communication and load balance efficiently, with only a small parallel construction overhead. This algorithm can scale to large-sized clusters showing both strong and weak scalability. Careful error and timing trade-off analysis is also performed for the cutoff functions induced by the vortex particle method. Our implementation can perform one time step of the velocity+stretching calculation for one billion particles on 32 nodes in 55.9 seconds, which yields 49.12 Tflop/s. © 2014 IEEE.


Hu Q.,University of Maryland University College | Gumerov N.A.,University of Maryland University College | Gumerov N.A.,Fantalgo LLC | Duraiswami R.,University of Maryland University College | Duraiswami R.,Fantalgo LLC
Computers and Fluids | Year: 2013

Many physics-based simulations can be efficiently and accurately performed using particle methods, which focus computational resources at the location of sources or discontinuities (particles), and on evaluation of relevant fields at locations of interest. These particle methods result in the so-called N-body problem. The N-body problem also arises in interpolation using implicit functions, in simulation of molecular and stellar dynamics, and other areas. Fast and accurate N-body simulations are the goal of this paper. The Fast Multipole Method (FMM) has been proposed for these. In this paper we provide efficient data structures implemented on Graphical Processing Units (GPUs), and a novel parallel formulation of the FMM on GPUs to address this problem. As an example application, we simulate the interactions between vortex rings. Except for initial setup, our approach processes all the computations and updates on the GPU. Further, we provide interactive visualization of the simulation as it proceeds. Whereas the cost of direct simulation of the interaction of vortices and particles is O(n^2 + nm) per time step, where n is the number of vortex elements and m is the number of particles, our algorithm reduces it to O(n + m) cost. © 2013.
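The O(n^2 + nm) direct cost quoted above comes from evaluating every source against every evaluation point. A minimal sketch of that baseline is shown below, using a scalar 1/r (Laplace) kernel as a stand-in; the paper's vortex kernels are vector valued, and the name direct_potential is hypothetical. This is the computation the FMM replaces, not the FMM itself.

```python
import numpy as np

def direct_potential(sources, strengths, targets):
    """Direct sum phi(y_i) = sum_j s_j / |y_i - x_j| over all source/target pairs.

    Costs O(len(targets) * len(sources)). Coincident points (for example, a
    source evaluated at its own location) contribute zero.
    """
    sources = np.asarray(sources, dtype=float)
    strengths = np.asarray(strengths, dtype=float)
    targets = np.asarray(targets, dtype=float)
    diff = targets[:, None, :] - sources[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    safe_r = np.where(r > 0, r, 1.0)          # avoid division by zero
    inv_r = np.where(r > 0, 1.0 / safe_r, 0.0)
    return inv_r @ strengths
```

Calling it once with targets equal to the n sources (the n^2 term) and once with the m passive particles as targets (the nm term) reproduces the per-step cost structure described in the abstract.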


Hu Q.,University of Maryland Institute for Advanced Computer Studies | Hu Q.,University of Maryland University College | Gumerov N.A.,University of Maryland Institute for Advanced Computer Studies | Gumerov N.A.,Fantalgo LLC | And 3 more authors.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012 | Year: 2012

The Fast Multipole Method (FMM) allows O(N) evaluation to any arbitrary precision of N-body interactions that arise in many scientific contexts. These methods have been parallelized, with a recent set of papers attempting to parallelize them on heterogeneous CPU/GPU architectures [1]. While impressive performance was reported, the algorithms did not demonstrate complete weak or strong scalability. Further, the algorithms were not demonstrated on nonuniform distributions of particles that arise in practice. In this paper, we develop an efficient scalable version of the FMM that can be scaled well on many heterogeneous nodes for nonuniform data. Key contributions of our work are data structures that allow uniform work distribution over multiple computing nodes, and that minimize the communication cost. These new data structures are computed using a parallel algorithm, and only require a small additional computation overhead. Numerical simulations on a heterogeneous cluster empirically demonstrate the performance of our algorithm. © 2012 IEEE.


Gumerov N.A.,University of Maryland University College | Duraiswami R.,University of Maryland University College | Duraiswami R.,Fantalgo LLC
Journal of Computational Physics | Year: 2014

In a number of problems in computational physics, a finite sum of kernel functions centered at N particle locations located in a box in three dimensions must be extended by imposing periodic boundary conditions on box boundaries. Even though the finite sum can be efficiently computed via fast summation algorithms, such as the fast multipole method (FMM), the periodized extension is usually treated via a different algorithm, Ewald summation, accelerated via the fast Fourier transform (FFT). A different approach to compute this periodized sum just using a blackbox finite fast summation algorithm is presented in this paper. The method splits the periodized sum into two parts. The first, comprising the contribution of all points outside a large sphere enclosing the box, and some of its neighbors, is approximated inside the box by a collection of kernel functions ("sources") placed on the surface of the sphere or using an expansion in terms of spectrally convergent local basis functions. The second part, comprising the part inside the sphere, and including the box and its immediate neighborhood, is treated via available summation algorithms. The coefficients of the sources are determined by least squares collocation of the periodicity condition of the total potential, imposed on a circumspherical surface for the box. While the method is presented in general, details are worked out for the case of evaluating electrostatic potentials and forces. Results show that when used with the FMM, the periodized sum can be computed to any specified accuracy, at an additional cost of the order of the free-space FMM. Several technical details and efficient algorithms for auxiliary computations are provided, as are numerical comparisons. © 2014 Elsevier Inc.


Adelman R.,University of Maryland University College | Gumerov N.A.,University of Maryland University College | Gumerov N.A.,Fantalgo LLC | Duraiswami R.,University of Maryland University College | Duraiswami R.,Fantalgo LLC
IEEE Transactions on Antennas and Propagation | Year: 2016

The Galerkin boundary element method (BEM), also known as the method of moments, is a powerful method for solving the Laplace equation in three dimensions. There are advantages to Galerkin formulations for integral equations, as they treat problems associated with kernel singularity, and lead to symmetric and better conditioned matrices. However, the Galerkin method requires the computation of double surface integrals over pairs of triangles. There are many semianalytical methods to treat these integrals, which all have some issues and are discussed in this paper. Novel methods inspired by the treatment of these kernels in the fast multipole method are presented for computing all the integrals that arise in the Galerkin formulation to any accuracy. Integrals involving completely geometrically separated triangles are nonsingular, and are computed using a technique based on spherical harmonics and multipole expansions and translations, which requires the integration of polynomial functions over the triangles. Integrals involving cases where the triangles have common vertices, edges, or are coincident are treated via scaling and symmetry arguments, combined with automatic recursive geometric decomposition of the integrals. The methods are validated, and example results are presented. © 2016 IEEE.


Adelman R.,University of Maryland College Park | Gumerov N.A.,University of Maryland College Park | Gumerov N.A.,Fantalgo LLC | Duraiswami R.,University of Maryland College Park | Duraiswami R.,Fantalgo LLC
Journal of the Acoustical Society of America | Year: 2014

Analytical solutions to acoustic scattering problems involving spheroids and disks have long been known and have many applications. However, these solutions require special functions that are not easily computable. Therefore, their asymptotic forms are typically used instead since they are more readily available. In this paper, these solutions are explored, and computational software is provided for calculating their nonasymptotic forms, which are accurate over a wide range of frequencies and distances. This software, which runs in MATLAB, computes the solutions to acoustic scattering problems involving spheroids and disks by semi-analytical means, and is freely available from the authors' webpage. © 2014 Acoustical Society of America.
