Publications

Results 9451–9475 of 9,998
Parallel hypergraph partitioning for scientific computing

Boman, Erik G.; Devine, Karen D.; Heaphy, Robert T.; Hendrickson, Bruce A.

Graph partitioning is often used for load balancing in parallel computing, but it is known that hypergraph partitioning has several advantages. First, hypergraphs more accurately model communication volume, and second, they are more expressive and can better represent nonsymmetric problems. Hypergraph partitioning is particularly suited to parallel sparse matrix-vector multiplication, a common kernel in scientific computing. We present a parallel software package for hypergraph (and sparse matrix) partitioning developed at Sandia National Labs. The algorithm is a variation on multilevel partitioning. Our parallel implementation is novel in that it uses a two-dimensional data distribution among processors. We present empirical results that show our parallel implementation achieves good speedup on several large problems (up to 33 million nonzeros) with up to 64 processors on a Linux cluster.
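The claim that hypergraphs model communication volume more accurately can be illustrated with a small sketch. It assumes the standard column-net model for row-wise sparse matrix-vector multiplication and the usual "connectivity minus one" metric; it is not taken from the Sandia package, and the function names and the 4x4 example matrix are hypothetical. It also assumes every diagonal entry is nonzero and that vector entry j is stored with row j, which is the condition under which the metric equals the number of words communicated.

```python
from collections import defaultdict


def column_net_hypergraph(nonzeros):
    """Build the column-net hypergraph of a sparse matrix.

    Vertices are rows; hyperedge (net) j contains every row that has a
    nonzero in column j, i.e. every row that needs vector entry x[j].
    """
    nets = defaultdict(set)
    for row, col in nonzeros:
        nets[col].add(row)
    return dict(nets)


def communication_volume(nets, part_of):
    """Connectivity-minus-one metric: for each net, count the parts it
    touches and charge one word for every part beyond the first."""
    total = 0
    for rows in nets.values():
        parts_touched = {part_of[r] for r in rows}
        total += len(parts_touched) - 1
    return total


# Hypothetical 4x4 matrix given as (row, column) positions of its nonzeros.
nonzeros = [(0, 0), (0, 1), (1, 0), (1, 1),
            (2, 2), (2, 3), (3, 2), (3, 3),
            (0, 3)]
nets = column_net_hypergraph(nonzeros)

# Two candidate assignments of rows to two processors.
blocked = {0: 0, 1: 0, 2: 1, 3: 1}
interleaved = {0: 0, 1: 1, 2: 0, 3: 1}

volume_blocked = communication_volume(nets, blocked)
volume_interleaved = communication_volume(nets, interleaved)
print(volume_blocked, volume_interleaved)
```

A partitioner would search for the assignment that minimizes this quantity subject to a load-balance constraint; the sketch only evaluates the objective for two fixed assignments.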

Accelerating list management for MPI

Hemmert, Karl S.; Rodrigues, Arun; Underwood, Keith

The latency and throughput of MPI messages are critically important to a range of parallel scientific applications. In many modern networks, both of these performance characteristics are largely driven by the performance of a processor on the network interface. Because of the semantics of MPI, this embedded processor is forced to traverse a linked list of posted receives each time a message is received. As this list grows long, the latency of message reception grows and the throughput of MPI messages decreases. This paper presents a novel hardware feature to handle list management functions on a network interface. By moving functions such as list insertion, list traversal, and list deletion to the hardware unit, latencies are decreased by up to 20% in the zero-length queue case, with dramatic improvements in the presence of long queues. Similarly, the throughput is increased by up to 10% in the zero-length queue case and by nearly 100% in the presence of queues of 30 messages.
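The cost of traversing the posted-receive list can be seen in a minimal software sketch. It assumes matching on source and tag only (a real MPI implementation also matches on the communicator), uses a Python list in place of a linked list, and says nothing about the proposed hardware unit. The class name, the `ANY` wildcard, and the buffer labels are all hypothetical.

```python
ANY = -1  # hypothetical wildcard, standing in for MPI_ANY_SOURCE / MPI_ANY_TAG


class PostedReceiveQueue:
    """Posted receives kept in posting order, oldest first."""

    def __init__(self):
        self._entries = []

    def post(self, source, tag, buffer_id):
        """List insertion: append a new posted receive."""
        self._entries.append((source, tag, buffer_id))

    def match(self, source, tag):
        """List traversal and deletion for one incoming message.

        Returns (buffer_id, entries_examined). The first matching entry
        in posting order wins and is removed. If nothing matches, the
        message is unexpected and buffer_id is None.
        """
        for index, (src, tg, buffer_id) in enumerate(self._entries):
            if src in (ANY, source) and tg in (ANY, tag):
                del self._entries[index]
                return buffer_id, index + 1
        return None, len(self._entries)


queue = PostedReceiveQueue()
for tag in range(30):
    queue.post(source=0, tag=tag, buffer_id=f"buf{tag}")

# A message matching the newest receive walks the whole list ...
deep_match = queue.match(source=0, tag=29)
# ... while one matching the oldest receive stops at the first entry.
shallow_match = queue.match(source=0, tag=0)
# A message from an unposted source examines every remaining entry.
unexpected = queue.match(source=1, tag=5)
print(deep_match, shallow_match, unexpected)
```

The number of entries examined grows linearly with the depth of the match, which is the per-message work the abstract attributes to the embedded processor.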

A multiscale discontinuous Galerkin method with the computational structure of a continuous Galerkin method

Scovazzi, Guglielmo S.; Bochev, Pavel B.

Proliferation of degrees-of-freedom has plagued discontinuous Galerkin methodology from its inception over 30 years ago. This paper develops a new computational formulation that combines the advantages of discontinuous Galerkin methods with the data structure of their continuous Galerkin counterparts. The new method uses local, element-wise problems to project a continuous finite element space into a given discontinuous space, and then applies a discontinuous Galerkin formulation. The projection leads to parameterization of the discontinuous degrees-of-freedom by their continuous counterparts and has a variational multiscale interpretation. This significantly reduces the computational burden and, at the same time, little or no degradation of the solution occurs. In fact, the new method produces improved solutions compared with the traditional discontinuous Galerkin method in some situations.

Density functional theory study of transition metal porphine adsorption on gold surface and electric field induced conformation changes

Proposed for publication in the Journal of the American Chemical Society.

Rempe, Susan R.; Schultz, Peter A.; Chandross, M.

We apply density functional theory (DFT) and the DFT+U technique to study the adsorption of transition metal porphine molecules on atomistically flat Au(111) surfaces. DFT calculations using the Perdew–Burke–Ernzerhof exchange-correlation functional correctly predict the palladium porphine (PdP) low-spin ground state. PdP is found to adsorb preferentially on gold in a flat geometry, not in an edgewise geometry, in qualitative agreement with experiments on substituted porphyrins. It exhibits no covalent bonding to Au(111), and the binding energy is a small fraction of an electronvolt. The DFT+U technique, parametrized to B3LYP-predicted spin state ordering of the Mn d-electrons, is found to be crucial for reproducing the correct magnetic moment and geometry of the isolated manganese porphine (MnP) molecule. Adsorption of Mn(II)P on Au(111) substantially alters the Mn ion spin state. Its interaction with the gold substrate is stronger and more site-specific than that of PdP. The binding can be partially reversed by applying an electric potential, which leads to significant changes in the electronic and magnetic properties of adsorbed MnP and 0.1 Å changes in the Mn-nitrogen distances within the porphine macrocycle. We conjecture that this DFT+U approach may be a useful general method for modeling first-row transition metal ion complexes in a condensed-matter setting.

Effect of deformation path sequence on the behavior of nanoscale copper bicrystal interfaces

Proposed for publication in the Journal of Engineering Materials and Technology.

Plimpton, Steven J.

Molecular dynamics calculations are performed to study the effect of deformation sequence and history on the inelastic behavior of copper interfaces on the nanoscale. An asymmetric 45 deg tilt bicrystal interface is examined, representing an idealized high-angle grain boundary interface. The interface model is subjected to three different deformation paths: tension then shear, shear then tension, and combined proportional tension and shear. Analysis shows that path-history dependent material behavior is confined within a finite layer of deformation around the bicrystal interface. The relationships between length scale and interface properties, such as the thickness of the path-history dependent layer and the interface strength, are discussed in detail.

Nonlinear magnetohydrodynamics simulation using high-order finite elements

Proposed for publication in the Journal of Computational Physics.

Plimpton, Steven J.

A conforming representation composed of 2D finite elements and finite Fourier series is applied to 3D nonlinear non-ideal magnetohydrodynamics using a semi-implicit time-advance. The self-adjoint semi-implicit operator and variational approach to spatial discretization are synergistic and enable simulation in the extremely stiff conditions found in high temperature plasmas without sacrificing the geometric flexibility needed for modeling laboratory experiments. Growth rates for resistive tearing modes with experimentally relevant Lundquist number are computed accurately with time-steps that are large with respect to the global Alfvén time and moderate spatial resolution when the finite elements have basis functions of polynomial degree (p) two or larger. An error diffusion method controls the generation of magnetic divergence error. Convergence studies show that this approach is effective for continuous basis functions with p ≥ 2, where the number of test functions for the divergence control terms is less than the number of degrees of freedom in the expansion for vector fields. Anisotropic thermal conduction at realistic ratios of parallel to perpendicular conductivity (χ∥/χ⊥) is computed accurately with p ≥ 3 without mesh alignment. A simulation of tearing-mode evolution for a shaped toroidal tokamak equilibrium demonstrates the effectiveness of the algorithm in nonlinear conditions, and its results are used to verify the accuracy of the numerical anisotropic thermal conduction in 3D magnetic topologies.

An improved convergence bound for aggregation-based domain decomposition preconditioners

Proposed for publication in the SIAM Journal on Matrix Analysis and Applications.

Sala, Marzio S.; Shadid, John N.; Tuminaro, Raymond S.

In this paper we present a two-level overlapping domain decomposition preconditioner for the finite-element discretization of elliptic problems in two and three dimensions. The computational domain is partitioned into overlapping subdomains, and a coarse space correction, based on aggregation techniques, is added. Our definition of the coarse space does not require the introduction of a coarse grid. We consider a set of assumptions on the coarse basis functions to bound the condition number of the resulting preconditioned system. These assumptions involve only geometrical quantities associated with the aggregates and the subdomains. We prove that the condition number using the two-level additive Schwarz preconditioner is O(H/δ + H₀/δ), where H and H₀ are the diameters of the subdomains and the aggregates, respectively, and δ is the overlap among the subdomains and the aggregates. This extends the bounds presented in [C. Lasser and A. Toselli, Convergence of some two-level overlapping domain decomposition preconditioners with smoothed aggregation coarse spaces, in Recent Developments in Domain Decomposition Methods, Lecture Notes in Comput. Sci. Engrg. 23, L. Pavarino and A. Toselli, eds., Springer-Verlag, Berlin, 2002, pp. 95-117; M. Sala, Domain Decomposition Preconditioners: Theoretical Properties, Application to the Compressible Euler Equations, Parallel Aspects, Ph.D. thesis, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, 2003; M. Sala, Math. Model. Numer. Anal., 38 (2004), pp. 765-780]. Numerical experiments on a model problem are reported to illustrate the performance of the proposed preconditioner.
