
From:  figuei@lutece.rutgers.edu (Francisco Figueirido)
Date:  Thu, 3 Jun 93 08:59:04 -0400
Subject:  Hierarchical multipole solvers


I am currently testing an implementation of the Barnes-Hut algorithm that I
have inserted into an MD simulation package (IMPACT) we use here at Rutgers.
I haven't tried the fast multipole method (FMM), though; it is known to have
a much larger overhead than the O(N log N) tree method. These are the
references I have collected so far (in BibTeX format):

@string{siamscistat = "SIAM Journal on Scientific and Statistical Computing"}
@string{jcompphys   = "Journal of Computational Physics"}

@Article{appel,
  author =       "Andrew W. Appel",
  title =        "An efficient program for many-body simulation",
  journal =      siamscistat,
  volume =       "6",
  pages =        "85--103",
  year =         "1985",
  abstract =     "The simulation of $N$ particles interacting in a
		 gravitational force field is useful in astrophysics, but such
		 simulations become costly for large $N$. Representing the
		 universe as a tree structure with the particles at the leaves
		 and internal nodes labeled with the centers of mass of their
		 descendants allows several simultaneous attacks on the
		 computation time required by the problem. These approaches
		 range from algorithmic changes (replacing an $O(N^2)$
		 algorithm with an algorithm whose time-complexity is believed
		 to be $O(N\log N)$) to data structure modifications,
		 code-tuning, and hardware modifications. The changes reduced
		 the running time of a large problem ($N=10000$) by a factor
		 of four hundred. This paper describes both the particular
		 program and the methodology underlying such speedups.",
}

@Article{barnes:hut,
  author =       "Josh Barnes and Piet Hut",
  title =        "A hierarchical ${O}({N}\log {N})$ force-calculation
                 algorithm",
  journal =      "Nature",
  volume =       "324",
  pages =        "446--449",
  year =         "1986",
  abstract = "Until recently the gravitational $N$-body problem has been
modelled numerically either by direct integration, in which the computation
needed increases as $N^2$, or by an iterative potential method in which the
number of operations grows as $N\,\log N$. Here we describe a novel method of
directly calculating the forces on $N$ bodies that grows only as $N\,\log N$.
The technique uses a tree-structured hierarchical subdivision of space into
cubic cells, each of which is recursively divided into eight subcells whenever
more than one particle is found to occupy the same cell. This tree is
constructed anew at every time step, avoiding ambiguity and tangling.
Advantages over potential-solving codes are: accurate local interactions;
freedom from geometrical assumptions and restrictions; and applicability to a
wide class of systems, including (proto-)planetary, stellar, galactic and
cosmological ones. Advantages over previous hierarchical tree-codes include
simplicity and the possibility of rigorous analysis of error. Although we
concentrate here on stellar dynamical applications, our techniques of
efficiently handling a large number of long-range interactions and
concentrating computational effort where most needed have potential
applications in other areas of astrophysics as well."
}
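
(A side note for those who have not seen these codes: the structure
Barnes & Hut describe above is an oct-tree built by recursive cell
splitting, with the internal nodes then labeled by the centers of mass
of their descendants, exactly as in Appel's abstract. Below is a
minimal sketch of both passes in C. The type and function names are
mine, not from either paper, unit masses are assumed, and a production
code must also guard against coincident bodies, which would recurse
forever here.)

#include <stdlib.h>

typedef struct Cell {
    double cx, cy, cz, half;  /* geometric center and half-width      */
    struct Cell *sub[8];      /* the eight subcells, NULL where empty */
    int    nbody;             /* number of bodies at or below here    */
    double x, y, z;           /* the body itself when nbody == 1      */
    double mass, mx, my, mz;  /* total mass and center of mass,
                                 filled in by the upward pass         */
} Cell;

/* Octant of cell c containing the point (x,y,z). */
static int octant(const Cell *c, double x, double y, double z)
{
    return (x > c->cx) | ((y > c->cy) << 1) | ((z > c->cz) << 2);
}

/* Return subcell i of c, creating it on first use. */
static Cell *subcell(Cell *c, int i)
{
    if (c->sub[i] == NULL) {
        Cell *s = calloc(1, sizeof(Cell));
        double h = c->half / 2.0;
        s->half = h;
        s->cx = c->cx + ((i & 1) ? h : -h);
        s->cy = c->cy + ((i & 2) ? h : -h);
        s->cz = c->cz + ((i & 4) ? h : -h);
        c->sub[i] = s;
    }
    return c->sub[i];
}

/* Insert a body, splitting any cell that already holds one. */
void insert(Cell *c, double x, double y, double z)
{
    if (c->nbody == 0) {          /* empty cell: keep the body here */
        c->x = x; c->y = y; c->z = z;
        c->nbody = 1;
        return;
    }
    if (c->nbody == 1)            /* occupied leaf: push resident down */
        insert(subcell(c, octant(c, c->x, c->y, c->z)), c->x, c->y, c->z);
    insert(subcell(c, octant(c, x, y, z)), x, y, z);
    c->nbody++;
}

/* Upward pass: label each cell with the total mass and center of
   mass of its descendants (unit masses for illustration). */
void upward(Cell *c)
{
    if (c->nbody == 1) {
        c->mass = 1.0;
        c->mx = c->x; c->my = c->y; c->mz = c->z;
        return;
    }
    c->mass = c->mx = c->my = c->mz = 0.0;
    for (int i = 0; i < 8; i++) {
        Cell *s = c->sub[i];
        if (s == NULL) continue;
        upward(s);
        c->mass += s->mass;
        c->mx += s->mass * s->mx;
        c->my += s->mass * s->my;
        c->mz += s->mass * s->mz;
    }
    c->mx /= c->mass; c->my /= c->mass; c->mz /= c->mass;
}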

@Book{greengard,
  author =       "Leslie Greengard",
  title =        "The Rapid Evaluation of Potential Fields in Particle
                 Systems",
  publisher =    "The MIT Press",
  address =      "Cambridge, Massachusetts",
  year =         "1988",
}


@PhDThesis{draghicescu,
  author =      "Cristina I. Draghicescu",
  title =       "Efficient Algorithms for Particle Methods",
  school =      "The Pennsylvania State University",
  year =        "1991",
  abstract =    "A fast algorithm is presented, which reduces the amount of
work necessary for computing pairwise interactions in a system of $n$
particles from $O(n^2)$ to $O(n(\log n)^p)$, where $p$ depends on the
problem in question. Error and work estimates are given.\par I
illustrate its application to the approximation of the Euler equations
in fluid dynamical simulations using the point vortex method. The
algorithm can be applied for both two- and three-dimensional
simulations; in the first case I show that, with a proper choice of
parameters, the accuracy and stability of the direct method are
preserved.\par Also discussed is the application of the algorithm to the
problem of evaluating interactions in molecular simulations.  A slightly
modified version can be used to reduce the complexity of the integral
equation method for boundary value problems. I implemented the algorithm
for such a problem and provide the numerical results. On a SUN 4 the
algorithm reduces the CPU time required for a calculation with 500,000
points from a month to 15 minutes and is three times faster than the
direct method for as few as 128 particles.",
}

@PhDThesis{salmon,
  author =       "John K. Salmon",
  title =        "Parallel Hierarchical ${N}$-body Methods",
  school =       "California Institute of Technology",
  year =         "1991",
  abstract = "Recent algorithmic advances utilizing hierarchical data
structures have resulted in a dramatic reduction in the time required
for computer simulation of $N$-body systems with long-range
interactions.  Computations which required $O(N^2)$ operations can now
be done in $O(N\,\log N)$ or $O(N)$. We review these tree methods and
find that they may be distinguished based on a few simple features. \par
The Barnes-Hut (BH) algorithm has received a great deal of attention,
and is the subject of the remainder of the dissertation. We present a
generalization of the BH tree and analyze the statistical properties of
such trees in detail.  We also consider the expected number of
operations entailed by an execution of the BH algorithm. We find an
optimal number for $m$, the maximum number of bodies in a terminal cell,
and confirm that the number of operations is $O(N\,\log N)$, even if the
distribution of bodies is not uniform. \par The mathematical basis of
all hierarchical methods is the multipole approximation. We discuss
multipole approximations, for the case of arbitrary, spherically
symmetric, and Newtonian Green's functions. We describe methods for
computing multipoles and evaluating multipole approximations in each of
these cases, emphasizing the tradeoff between generality and algorithmic
complexity. \par $N$-body simulations in computational astrophysics can
require $10^6$ or even more bodies. Algorithmic advances are not
sufficient, in and of themselves, to make computations of this size
feasible. Parallel computation offers, {\em a priori\/}, the necessary
computational power in terms of speed and memory. We show how the BH
algorithm can be adapted to execute in parallel. We use orthogonal
recursive bisection to partition space. The logical communication
structure that emerges is that of a hypercube. A local version of the BH
tree is constructed in each processor by iteratively exchanging data
along each edge of the logical hypercube. We obtain speedups in excess
of 380 on a 512 processor system for simulations of galaxy mergers with
180,000 bodies. We analyze the performance of the parallel version of
the algorithm and find that the overhead is due primarily to
interprocessor synchronization delays and redundant computation.
Communication is not a significant factor."
}
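
(The force walk that all of these tree codes share is short enough to
show here. A cell of side s at distance d from the target body is
accepted as a single interaction, truncated at the monopole, i.e. its
center of mass, when s/d < theta; otherwise it is opened and its
subcells examined. This is the standard Barnes-Hut opening test. The
sketch reuses the Cell type above, takes G = 1, and uses a Plummer
softening eps; note that with eps > 0 the body's interaction with
itself contributes zero force, since dx = dy = dz = 0 there, though a
real code would skip it explicitly.)

#include <math.h>

void treeforce(const Cell *c, double x, double y, double z,
               double theta, double eps,
               double *ax, double *ay, double *az)
{
    double dx = c->mx - x, dy = c->my - y, dz = c->mz - z;
    double d2 = dx*dx + dy*dy + dz*dz + eps*eps;
    double s  = 2.0 * c->half;            /* side length of the cell */

    if (c->nbody == 1 || s*s < theta*theta * d2) {
        /* far enough away (or a single body): one interaction */
        double inv = 1.0 / sqrt(d2);
        double f   = c->mass * inv * inv * inv;
        *ax += f * dx; *ay += f * dy; *az += f * dz;
    } else {
        /* too close: open the cell and recurse into its subcells */
        for (int i = 0; i < 8; i++)
            if (c->sub[i])
                treeforce(c->sub[i], x, y, z, theta, eps, ax, ay, az);
    }
}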

@Article{SL:JStatPhys:91,
  author =      "K. E. Schmidt and Michael A. Lee",
  title =       "Implementing the Fast Multipole Method in Three Dimensions",
  journal =     "Journal of Statistical Physics",
  volume =      "63",
  pages =       "1223--1235",
  year =        "1991",
  abstract =    "The Rokhlin-Greengard fast multipole algorithm for
evaluating Coulomb and multipole potentials has been implemented and
analyzed in three dimensions. The implementation is presented for
bounded charged systems and systems with periodic boundary conditions.
The results include timings and error characterizations.",
}

@Article{Hernquist:JCP:87,
  author =      "Lars Hernquist",
  title =       "Vectorization of Tree Traversals",
  journal =     jcompphys,
  volume =      "87",
  pages =       "137--147",
  year =        "1990",
  abstract =    "A simple method for vectorizing tree searches,
which operates by processing all relevant nodes at the same depth in the
tree simultaneously, is described. This procedure appears to be general,
assuming that gather-scatter operations are vectorizable, but is most
efficient if the traversals proceed monotonically from the root to the
leaves, or {\em vice versa\/}. Particular application is made to the
hierarchical tree approach for computing the self-consistent interaction
of $N$ bodies. It is demonstrated that full vectorization of the
requisite tree searches is feasible, resulting in a factor $\approx$
4--5 improvement in CPU efficiency in the traversals on a CRAY X-MP. The
overall gain in the case of the Barnes-Hut tree code algorithm is a
factor $\approx$ 2--3, implying a net speed-up of $\approx$ 400--500 on a
CRAY X-MP over a VAX 11/780 or SUN 3/50.",
}
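
(Hernquist's trick is easy to picture next to the recursive walk
above: keep a flat list of the cells still open at the current depth,
so that the accept/open decision and the force sum become straight
loops over arrays, which is the shape a vector unit wants. A sketch,
again reusing the types above; the fixed bound MAXLIST is mine, and a
real code would size these lists dynamically.)

#define MAXLIST 65536

void treeforce_bydepth(const Cell *root, double x, double y, double z,
                       double theta, double eps,
                       double *ax, double *ay, double *az)
{
    static const Cell *cur[MAXLIST], *nxt[MAXLIST];
    int ncur = 1, nnxt, i, j;

    cur[0] = root;
    while (ncur > 0) {                 /* one pass per tree level */
        nnxt = 0;
        for (i = 0; i < ncur; i++) {   /* flat, vectorizable loop */
            const Cell *c = cur[i];
            double dx = c->mx - x, dy = c->my - y, dz = c->mz - z;
            double d2 = dx*dx + dy*dy + dz*dz + eps*eps;
            double s  = 2.0 * c->half;
            if (c->nbody == 1 || s*s < theta*theta * d2) {
                double inv = 1.0 / sqrt(d2);
                double f   = c->mass * inv * inv * inv;
                *ax += f*dx; *ay += f*dy; *az += f*dz;
            } else {
                for (j = 0; j < 8; j++)    /* open: defer subcells */
                    if (c->sub[j]) nxt[nnxt++] = c->sub[j];
            }
        }
        for (i = 0; i < nnxt; i++)     /* next level becomes current */
            cur[i] = nxt[i];
        ncur = nnxt;
    }
}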

@Article{Makino:JCP:87,
  author =      "Junichiro Makino",
  title =       "Vectorization of a Treecode",
  journal =     jcompphys,
  volume =      "87",
  pages =       "148--160",
  year =        "1990",
  abstract =    "Vectorized algorithms for the force calculation
and tree construction in the Barnes-Hut tree algorithm are described.
The basic idea for the vectorization of the force calculation is to
vectorize the tree traversal across particles, so that all particles in
the system traverse the tree simultaneously. The tree construction
algorithm also makes use of the fact that particles can be treated in
parallel. Thus these algorithms take advantage of the internal
parallelism in the $N$-body system and the tree algorithm most
effectively. As a natural result, these algorithms can be used on a wide
range of vector/parallel architectures, including current supercomputers
and highly parallel architectures such as the Connection Machine. The
vectorized code runs about five times faster than the non-vector code on
a Cyber 205 for an $N$-body system with $N=8192$.",
}

@Article{Barnes:JCP:87,
  author =      "Joshua E. Barnes",
  title =       "A Modified Tree Code: Don't Laugh; It Runs",
  journal =     jcompphys,
  volume =      "87",
  pages =       "161--170",
  year =        "1990",
  abstract =    "I describe a modification of the Barnes-Hut tree
algorithm together with a series of numerical tests of this method. The
basic idea is to improve the performance of the code on heavily
vector-oriented machines such as the Cyber 205 by exploiting the fact
that nearby particles tend to have very similar interaction lists. By
building an interaction list good everywhere within a cell containing a
modest number of particles and reusing this interaction list for each
particle in the cell in turn, the balance of computation can be shifted
from recursive descent to force summation. Instead of vectorizing tree
descent, this scheme simply avoids it in favor of force summation, which
is quite easy to vectorize. A welcome side-effect of this modification
is that the force calculation, which now treats a larger fraction of the
local interactions exactly, is significantly more accurate than the
unmodified method.",
}
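
(The payoff of Barnes's modification is the shape of the inner loop.
Once a single interaction list has been gathered for a whole group of
nearby bodies, by a descent like the ones above but with the opening
test applied against the group's cell rather than one body, the force
summation is a dense, branch-free double loop over flat arrays. A
sketch of that reuse follows; the array names are illustrative only,
with the list flattened to positions (lx,ly,lz) and masses lm.)

void sumforces(int nb, const double *bx, const double *by,
               const double *bz, int nl, const double *lx,
               const double *ly, const double *lz, const double *lm,
               double eps, double *ax, double *ay, double *az)
{
    for (int i = 0; i < nb; i++) {       /* every body in the group */
        double fx = 0.0, fy = 0.0, fz = 0.0;
        for (int j = 0; j < nl; j++) {   /* reuses the one shared list */
            double dx = lx[j] - bx[i];
            double dy = ly[j] - by[i];
            double dz = lz[j] - bz[i];
            double d2 = dx*dx + dy*dy + dz*dz + eps*eps;
            double inv = 1.0 / sqrt(d2);
            double f   = lm[j] * inv * inv * inv;
            fx += f*dx; fy += f*dy; fz += f*dz;
        }
        ax[i] = fx; ay[i] = fy; az[i] = fz;
    }
}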

 "at@at" Article{Makino:JCP:88,
  author =      "Junichiro Makino",
  title =       "Comparison of Two Different Tree Algorithms",
  journal =     jcompphys,
  volume =      "88",
  pages =       "393--408",
  year =        "1990",
  abstract =    "The efficiency of two different algorithms of
hierarchical force calculation is discussed. Both algorithms utilize the
tree structure to reduce the cost of the force calculation from $O(N^2)$
to $O(N\log N)$. The only difference lies in the method of the
construction of the tree. One algorithm uses the oct-tree, which is the
recursive division of a cube into eight subcubes. The other method makes
the tree by repeatedly replacing a mutually nearest pair in the system
by a super-particle. Numerical experiments showed that the cost of the
force calculation using these two schemes is quite similar for the same
relative accuracy of the obtained force. The construction of the
mutual-nearest-neighbor tree is more expensive than the construction of
the oct-tree roughly by a factor of 10. On the conventional mainframes
this difference is not important because the cost of the tree
construction is only a small fraction of the total calculation cost. On
vector processors, the oct-tree scheme is currently faster because the
tree construction is relatively more expensive on the vector
processors.",
}



