TYPE Conference Presenation YEAR 2021

Scopus OSTI DOI

Brost, Randolph B.; Carrier, Erin E.; Carroll, Michelle C.; Groth, Katrina M.; Kegelmeyer, William P.; Leung, Vitus J.; Link, Hamilton E.; Patterson, Andrew J.; Phillips, Cynthia A.; Richter, Samuel N.; Robinson, David G.; Staid, Andrea S.; Woodbridge, Diane M.-K.

This report summarizes the work performed under the Sandia LDRD project "Adverse Event Prediction Using Graph-Augmented Temporal Analysis." The goal of the project was to de- velop a method for analyzing multiple time-series data streams to identify precursors provid- ing advance warning of the potential occurrence of events of interest. The proposed approach combined temporal analysis of each data stream with reasoning about relationships between data streams using a geospatial-temporal semantic graph. This class of problems is relevant to several important topics of national interest. In the course of this work we developed new temporal analysis techniques, including temporal analysis using Markov Chain Monte Carlo techniques, temporal shift algorithms to refine forecasts, and a version of Ripley's K-function extended to support temporal precursor identification. This report summarizes the project's major accomplishments, and gathers the abstracts and references for the publication sub- missions and reports that were prepared as part of this work. We then describe work in progress that is not yet ready for publication.

More Details

TYPE SAND Report YEAR 2018

OSTI DOI

Statistical Models of Dengue Fever

Link, Hamilton E.; Richter, Samuel N.; Leung, Vitus J.; Brost, Randolph B.; Phillips, Cynthia A.; Staid, Andrea S.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI DOI

Level-spread: A new job allocation policy for dragonfly networks

Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018

Zhang, Yijia; Tuncer, Ozan; Kaplan, Fulya; Olcoz, Katzalin; Leung, Vitus J.; Coskun, Ayse K.

The dragonfly network topology has attracted attention in recent years owing to its high radix and constant diameter. However, the influence of job allocation on communication time in dragonfly networks is not fully understood. Recent studies have shown that random allocation is better at balancing the network traffic, while compact allocation is better at harnessing the locality in dragonfly groups. Based on these observations, this paper introduces a novel allocation policy called Level-Spread for dragonfly networks. This policy spreads jobs within the smallest network level that a given job can fit in at the time of its allocation. In this way, it simultaneously harnesses node adjacency and balances link congestion. To evaluate the performance of Level-Spread, we run packet-level network simulations using a diverse set of application communication patterns, job sizes, and communication intensities. We also explore the impact of network properties such as the number of groups, number of routers per group, machine utilization level, and global link bandwidth. Level-Spread reduces the communication overhead by 16% on average (and up to 71%) compared to the state-of-The-Art allocation policies.

More Details

TYPE Conference Poster YEAR 2018

Scopus OSTI DOI

Taxonimist: Application Detection through Rich Monitoring Data

Ates, Emre; Tuncer, Ozan; Turk, Ata T.; Leung, Vitus J.; Brandt, James M.; Egele, Manuel E.; Coskun, Ayse K.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Detection and Diagnosis of Performance Variations

Tuncer, Ozan; Ates, Emre; Zhang, Yijia Z.; Turk, Ata T.; Brandt, James M.; Leung, Vitus J.; Egele, Manuel E.; Coskun, Ayse K.

Abstract not provided.

More Details

TYPE Presentation YEAR 2018

OSTI

Tuncer, Ozan T.; Zhang, Yijia Z.; Leung, Vitus J.; Coskun, Ayse K.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Task Placement to Reduce Application Communication Costs

Devine, Karen D.; Brandt, James M.; Deveci, Mehmet D.; Gentile, Ann C.; Leung, Vitus J.; Olivier, Stephen L.; Pedretti, Kevin P.; Rajamanickam, Sivasankaran R.; Taylor, Mark A.

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

Adverse event prediction using graph-augmented temporal analysis

Brost, Randolph B.; Leung, Vitus J.; Link, Hamilton E.; Phillips, Cynthia A.; Staid, Andrea S.

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

Unveiling the Interplay between Global Link Arrangements and Network Management Algorithms on Dragonfly Networks

Proceedings - 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID 2017

Kaplan, Fulya; Tuncer, Ozan; Leung, Vitus J.; Hemmert, Karl S.; Coskun, Ayse K.

Network messaging delay historically constitutes a large portion of the wall-clock time for High Performance Computing (HPC) applications, as these applications run on many nodes and involve intensive communication among their tasks. Dragonfly network topology has emerged as a promising solution for building exascale HPC systems owing to its low network diameter and large bisection bandwidth. Dragonfly includes local links that form groups and global links that connect these groups via high bandwidth optical links. Many aspects of the dragonfly network design are yet to be explored, such as the performance impact of the connectivity of the global links, i.e., global link arrangements, the bandwidth of the local and global links, or the job allocation algorithm. This paper first introduces a packet-level simulation framework to model the performance of HPC applications in detail. The proposed framework is able to simulate known MPI (message passing interface) routines as well as applications with custom-defined communication patterns for a given job placement algorithm and network topology. Using this simulation framework, we investigate the coupling between global link bandwidth and arrangements, communication pattern and intensity, job allocation and task mapping algorithms, and routing mechanisms in dragonfly topologies. We demonstrate that by choosing the right combination of system settings and workload allocation algorithms, communication overhead can be decreased by up to 44%. We also show that circulant arrangement provides up to 15% higher bisection bandwidth compared to the other arrangements, but for realistic workloads, the performance impact of link arrangements is less than 3%.

More Details

TYPE Conference Poster YEAR 2017

Scopus OSTI DOI

Unveiling the Interplay Between Global Link Arrangements and Network Management Algorithms on Dragonfly Networks

Kaplan, Fulya K.; Tuncer, Ozan T.; Leung, Vitus J.; Hemmert, Karl S.; Coskun, Ayse K.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI DOI

Beyond BUS-PASS

Leung, Vitus J.

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

Diagnosing Performance Variations in HPC Architectures Using Machine Learning

Tuncer, Ozan T.; Ates, Emre A.; Zhang, Yijia Z.; Turk, Ata T.; Brandt, James M.; Leung, Vitus J.; Egele, Manuel E.; Coskun, Ayse K.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Diagnosing performance variations in HPC applications using machine learning

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Hastings, Emily H.; Rincon-Cruz, David R.; Spehlmann, Marc S.; Meyers, Sofia M.; Xu, Anda X.; Bunde, David P.; Leung, Vitus J.

Abstract not provided.

More Details

TYPE Presentation YEAR 2015

OSTI DOI

Architecture-aware Task Placement

Deveci, Mehmet D.; Devine, Karen D.; Leung, Vitus J.; Prokopenko, Andrey V.; Rajamanickam, Sivasankaran R.

Abstract not provided.

More Details

TYPE Presentation YEAR 2015

OSTI

Comparing global link arrangements for Dragonfly networks

Leung, Vitus J.; Bunde, David P.; Hastings, Emily H.; Meyers, Sofia M.; Rincon-Cruz, David R.; Spehlmann, Marc S.; Xu, Anda X.

Abstract not provided.

More Details

TYPE Presentation YEAR 2015

OSTI

Local search to improve task mapping

Leung, Vitus J.

Abstract not provided.

More Details

TYPE Conference YEAR 2014

OSTI

Statistically significant relational data mining :

Berry, Jonathan W.; Leung, Vitus J.; Phillips, Cynthia A.; Pinar, Ali P.; Robinson, David G.

This report summarizes the work performed under the project (3z(BStatitically significant relational data mining.(3y (BThe goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs. Our goal was to develop better algorithms and better ways to evaluate algorithm quality. We concetrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second are statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model. Statisticians favor these models for graph analysis. The new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.

More Details

TYPE SAND Report YEAR 2014

OSTI DOI

Task Mapping Stencil Computations for Non-Contiguous Allocations

Leung, Vitus J.

Abstract not provided.

More Details

TYPE Conference YEAR 2014

OSTI

Backfilling with guarantees made as jobs arrive

Proposed for publication in Concurrency and Computation: Practice and Experience.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Journal of Physics: Conference Series

Devine, Karen D.; Diachin, L.; Kraftcheck, J.; Jansen, K.E.; Leung, Vitus J.; Luo, X.; Miller, M.; Ollivier-Gooch, C.; Ovcharenko, A.; Sahni, O.; Shephard, M.S.; Tautges, T.; Xie, T.; Zhou, M.

SciDAC applications have a demonstrated need for advanced software tools to manage the complexities associated with sophisticated geometry, mesh, and field manipulation tasks, particularly as computer architectures move toward the petascale. In this paper, we describe a software component - an abstract data model and programming interface - designed to provide support for parallel unstructured mesh operations. We describe key issues that must be addressed to successfully provide high-performance, distributed-memory unstructured mesh services and highlight some recent research accomplishments in developing new load balancing and MPI-based communication libraries appropriate for leadership class computing. Finally, we give examples of the use of parallel adaptive mesh modification in two SciDAC applications. © 2009 IOP Publishing Ltd.

More Details

TYPE Conference YEAR 2009

Scopus OSTI

Recognition for Optimizing Hardware Allocation

Leung, Vitus J.

Abstract not provided.

More Details

TYPE Presentation YEAR 2008

OSTI

Tutorial : the Zoltan toolkit

Devine, Karen D.; Boman, Erik G.; Chevalier, Cedric C.; Leung, Vitus J.; Riesen, Lee A.

Abstract not provided.

More Details

TYPE Conference YEAR 2008

OSTI

Boman, Erik G.; Leung, Vitus J.; Riesen, Lee A.

Abstract not provided.

More Details

TYPE Conference YEAR 2007

OSTI

Dynamic Services (Zoltan)

Leung, Vitus J.

Abstract not provided.

More Details

TYPE Presentation YEAR 2007

OSTI

Parallel Job Scheduling Policies to Improve Fairness: A Case Study

Journal of Scheduling

Leung, Vitus J.

Abstract not provided.

More Details

TYPE Journal Article YEAR 2006

OSTI

On the Placement of Imperfect Sensors in Municipal Water Networks

Hart, William E.; Berry, Jonathan W.; Watson, Jean-Paul W.; Carr, Robert D.; Phillips, Cynthia A.; Leung, Vitus J.

Abstract not provided.

More Details

TYPE Conference YEAR 2006

OSTI

Fairness of Job Scheduling in Cplant

Leung, Vitus J.

Abstract not provided.

More Details

TYPE Presentation YEAR 2006

OSTI

Communication patterns and allocation strategies

Proceedings - International Parallel and Distributed Processing Symposium, IPDPS 2004 (Abstracts and CD-ROM)

Bunde, David P.; Leung, Vitus J.; Mache, Jens

Motivated by observations about job runtimes on the CPlant system, we use a trace-driven microsimulator to begin characterizing the performance of different classes of allocation algorithms on jobs with different communication patterns in space-shared parallel systems with mesh topology. We show that relative performance varies considerably with communication pattern. The Paging strategy using the Hilbert space-filling curve and the Best Fit heuristic performed best across several communication patterns.

More Details

TYPE Conference YEAR 2004

Scopus OSTI DOI

Communication-aware processor allocation for supercomputers

Leung, Vitus J.; Phillips, Cynthia A.

We give processor-allocation algorithms for grid architectures, where the objective is to select processors from a set of available processors to minimize the average number of communication hops. The associated clustering problem is as follows: Given n points in R{sup d}, find a size-k subset with minimum average pairwise L{sub 1} distance.We present a natural approximation algorithm and show that it is a 7/4-approximation for 2D grids. In d dimensions, the approximation guarantee is 2 - 1/2d, which is tight. We also give a polynomial-time approximation scheme (PTAS) for constant dimension d and report on experimental results.

More Details

TYPE Conference YEAR 2004

OSTI

Algorithmic support for commodity-based parallel computing systems

Leung, Vitus J.; Leung, Vitus J.; Phillips, Cynthia A.

The Computational Plant or Cplant is a commodity-based distributed-memory supercomputer under development at Sandia National Laboratories. Distributed-memory supercomputers run many parallel programs simultaneously. Users submit their programs to a job queue. When a job is scheduled to run, it is assigned to a set of available processors. Job runtime depends not only on the number of processors but also on the particular set of processors assigned to it. Jobs should be allocated to localized clusters of processors to minimize communication costs and to avoid bandwidth contention caused by overlapping jobs. This report introduces new allocation strategies and performance metrics based on space-filling curves and one dimensional allocation strategies. These algorithms are general and simple. Preliminary simulations and Cplant experiments indicate that both space-filling curves and one-dimensional packing improve processor locality compared to the sorted free list strategy previously used on Cplant. These new allocation strategies are implemented in Release 2.0 of the Cplant System Software that was phased into the Cplant systems at Sandia by May 2002. Experimental results then demonstrated that the average number of communication hops between the processors allocated to a job strongly correlates with the job's completion time. This report also gives processor-allocation algorithms for minimizing the average number of communication hops between the assigned processors for grid architectures. The associated clustering problem is as follows: Given n points in {Re}d, find k points that minimize their average pairwise L{sub 1} distance. Exact and approximate algorithms are given for these optimization problems. One of these algorithms has been implemented on Cplant and will be included in Cplant System Software, Version 2.1, to be released. In more preliminary work, we suggest improvements to the scheduler separate from the allocator.

More Details

TYPE Report YEAR 2003

OSTI DOI

Witkowski, Walter R.; Jung, Joseph J.; Dohrmann, Clark R.; Leung, Vitus J.

The ability to generate a suitable finite element mesh in an automatic fashion is becoming the key to being able to automate the entire engineering analysis process. However, placing an all-hexahedron mesh in a general three-dimensional body continues to be an elusive goal. The approach investigated in this research is fundamentally different from any other that is known of by the authors. A physical analogy viewpoint is used to formulate the actual meshing problem which constructs a global mathematical description of the problem. The analogy used was that of minimizing the electrical potential of a system charged particles within a charged domain. The particles in the presented analogy represent duals to mesh elements (i.e., quads or hexes). Particle movement is governed by a mathematical functional which accounts for inter-particles repulsive, attractive and alignment forces. This functional is minimized to find the optimal location and orientation of each particle. After the particles are connected a mesh can be easily resolved. The mathematical description for this problem is as easy to formulate in three-dimensions as it is in two- or one-dimensions. The meshing algorithm was developed within CoMeT. It can solve the two-dimensional meshing problem for convex and concave geometries in a purely automated fashion. Investigation of the robustness of the technique has shown a success rate of approximately 99% for the two-dimensional geometries tested. Run times to mesh a 100 element complex geometry were typically in the 10 minute range. Efficiency of the technique is still an issue that needs to be addressed. Performance is an issue that is critical for most engineers generating meshes. It was not for this project. The primary focus of this work was to investigate and evaluate a meshing algorithm/philosophy with efficiency issues being secondary. The algorithm was also extended to mesh three-dimensional geometries. Unfortunately, only simple geometries were tested before this project ended. The primary complexity in the extension was in the connectivity problem formulation. Defining all of the interparticle interactions that occur in three-dimensions and expressing them in mathematical relationships is very difficult.

More Details

TYPE Report YEAR 2000

OSTI DOI