Publications

Results 51–100 of 119

A task-based linear algebra Building Blocks approach for scalable graph analytics

2015 IEEE High Performance Extreme Computing Conference, HPEC 2015

Wolf, Michael W.; Berry, Jonathan W.; Stark, Dylan S.

It is challenging to obtain scalable HPC performance on real applications, especially for data science applications with irregular memory access and computation patterns. To drive co-design efforts in architecture, system, and application design, we are developing miniapps representative of data science workloads. These in turn stress the state of the art in Graph BLAS-like Graph Algorithm Building Blocks (GABB). In this work, we outline a Graph BLAS-like, linear algebra based approach to miniTri, one such miniapp. We describe a task-based prototype implementation and give initial scalability results.


Two-level main memory co-design: Multi-threaded algorithmic primitives, analysis, and simulation

Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2015

Bender, Michael A.; Berry, Jonathan W.; Hammond, Simon D.; Hemmert, Karl S.; McCauley, Samuel; Moore, Branden J.; Moseley, Benjamin; Phillips, Cynthia A.; Resnick, David R.; Rodrigues, Arun

A fundamental challenge for supercomputer architecture is that processors cannot be fed data from DRAM as fast as CPUs can consume it. Therefore, many applications are memory-bandwidth bound. As the number of cores per chip increases, and traditional DDR DRAM speeds stagnate, the problem is only getting worse. A variety of non-DDR 3D memory technologies (Wide I/O 2, HBM) offer higher bandwidth and lower power by stacking DRAM chips on the processor or nearby on a silicon interposer. However, such a packaging scheme cannot contain sufficient memory capacity for a node. It seems likely that future systems will require at least two levels of main memory: high-bandwidth, low-power memory near the processor and low-bandwidth high-capacity memory further away. This near memory will probably not have significantly faster latency than the far memory. This, combined with the large size of the near memory (multiple GB) and power constraints, may make it difficult to treat it as a standard cache. In this paper, we explore some of the design space for a user-controlled multi-level main memory. We present algorithms designed for the heterogeneous bandwidth, using streaming to exploit data locality. We consider algorithms for the fundamental application of sorting. Our algorithms asymptotically reduce memory-block transfers under certain architectural parameter settings. We use and extend Sandia National Laboratories' SST simulation capability to demonstrate the relationship between increased bandwidth and improved algorithmic performance. Memory access counts from simulations corroborate predicted performance. This co-design effort suggests implementing two-level main memory systems may improve memory performance in fundamental applications.


Cooperative Computing for Autonomous Data Centers

Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium, IPDPS 2015

Berry, Jonathan W.; Collins, Michael; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared; Smith, Randy

We present a new distributed model for graph computations motivated by limited information sharing. Two or more independent entities have collected large social graphs. They wish to compute the result of running graph algorithms on the entire set of relationships. Because the information is sensitive or economically valuable, they do not wish to simply combine the information in a single location. We consider two models for computing the solution to graph algorithms in this setting: 1) limited-sharing: the two entities can share only a polylogarithmic size subgraph; 2) low-trust: the entities must not reveal any information beyond the query answer, assuming they are all honest but curious. We believe this model captures realistic constraints on cooperating autonomous data centers. We have results for both models for s-t connectivity, one of the simplest graph problems that requires global information in the worst case. In the limited-sharing model, our results exploit social network structure. Standard communication complexity gives polynomial lower bounds on s-t connectivity for general graphs. However, if the graph for each data center has a giant component and these giant components intersect, then we can overcome this lower bound, computing s-t connectivity while exchanging O(log² n) bits for a constant number of data centers. We can also test the assumption that the giant components overlap using O(log² n) bits provided the (unknown) overlap is sufficiently large. The second result is in the low-trust model. We give a secure multi-party computation (MPC) algorithm that 1) does not make cryptographic assumptions when there are 3 or more entities, and 2) is efficient, especially when compared to the usual garbled circuit approach. The entities learn only the yes/no answer. No party learns anything about the others' graph, not even node names. This algorithm does not require any special graph structure. This secure MPC result for s-t connectivity is one of the first that involves a few parties computing on large inputs, instead of many parties computing on a few local values.


Cyber Graph Queries for Geographically Distributed Data Centers

Berry, Jonathan W.; Collins, Michael C.; Kearns, Aaron K.; Phillips, Cynthia A.; Saia, Jared S.

We present new algorithms for a distributed model for graph computations motivated by limited information sharing, which we first discussed in [20]. Two or more independent entities have collected large social graphs. They wish to compute the result of running graph algorithms on the entire set of relationships. Because the information is sensitive or economically valuable, they do not wish to simply combine the information in a single location. We consider two models for computing the solution to graph algorithms in this setting: 1) limited-sharing: the two entities can share only a polylogarithmic size subgraph; 2) low-trust: the entities must not reveal any information beyond the query answer, assuming they are all honest but curious. We believe this model captures realistic constraints on cooperating autonomous data centers. We have algorithms for s-t connectivity in both models. We also give an algorithm in the low-communication model for finding a planted clique. This is an anomaly-detection problem, finding a subgraph that is larger and denser than expected. For both the low-communication algorithms, we exploit structural properties of social networks to prove performance bounds better than what is possible for general graphs. For s-t connectivity, we use known properties. For planted clique, we propose a new property: bounded number of triangles per node. This property is based upon evidence from the social science literature. We found that classic examples of social networks do not have the bounded-triangles property. This is because many social networks contain elements that are non-human, such as accounts for a business, or other automated accounts. We describe some initial attempts to distinguish human nodes from automated nodes in social networks based only on topological properties.


Why do simple algorithms for triangle enumeration work in the real world?

Internet Mathematics

Berry, Jonathan W.; Fostvedt, Luke A.; Nordman, Daniel J.; Phillips, Cynthia A.; Comandur, Seshadhri C.; Wilson, Alyson G.

Listing all triangles is a fundamental graph operation. Triangles can have important interpretations in real-world graphs, especially social and other interaction networks. Despite the lack of provably efficient (linear, or slightly superlinear) worst-case algorithms for this problem, practitioners run simple, efficient heuristics to find all triangles in graphs with millions of vertices. How are these heuristics exploiting the structure of these special graphs to provide major speedups in running time? We study one of the most prevalent algorithms used by practitioners. A trivial algorithm enumerates all paths of length 2, and checks if each such path is incident to a triangle. A good heuristic is to enumerate only those paths of length 2 in which the middle vertex has the lowest degree. It is easily implemented and is empirically known to give remarkable speedups over the trivial algorithm. We study the behavior of this algorithm over graphs with heavy-tailed degree distributions, a defining feature of real-world graphs. The erased configuration model (ECM) efficiently generates a graph with asymptotically (almost) any desired degree sequence. We show that the expected running time of this algorithm over the distribution of graphs created by the ECM is controlled by the ℓ_{4/3}-norm of the degree sequence. Norms of the degree sequence are a measure of the heaviness of the tail, and it is precisely this feature that allows nontrivial speedups of simple triangle enumeration algorithms. As a corollary of our main theorem, we prove expected linear-time performance for degree sequences following a power law with exponent α ≥ 7/3, and nontrivial speedup whenever α ∈ (2, 3).
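The gap between the trivial algorithm and the lowest-degree-middle heuristic described in this abstract can be illustrated by counting the length-2 paths each one inspects. The following is only a minimal sketch, not the authors' code: the function name `wedge_counts` and the tie-breaking rule (degree first, then vertex label) are assumptions made for illustration.

```python
def wedge_counts(edges):
    """Return (all_wedges, low_degree_wedges) for an undirected simple graph.

    all_wedges: paths of length 2 inspected by the trivial algorithm.
    low_degree_wedges: paths whose middle vertex ranks lowest of the three,
    where rank is (degree, label) so that ties break consistently.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    rank = {v: (len(adj[v]), v) for v in adj}
    all_wedges = 0
    low_wedges = 0
    for v, nbrs in adj.items():
        d = len(nbrs)
        all_wedges += d * (d - 1) // 2
        # Only neighbors that outrank v can be endpoints of a counted wedge.
        higher = sum(1 for w in nbrs if rank[w] > rank[v])
        low_wedges += higher * (higher - 1) // 2
    return all_wedges, low_wedges


# A star with one hub and 1000 leaves: an extreme heavy-tailed example.
star = [(0, i) for i in range(1, 1001)]
print(wedge_counts(star))
```

On the star graph the trivial algorithm inspects every pair of leaves through the hub, while the heuristic inspects none, because the hub is never the lowest-degree vertex of a path.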


Statistically significant relational data mining

Berry, Jonathan W.; Leung, Vitus J.; Phillips, Cynthia A.; Pinar, Ali P.; Robinson, David G.

This report summarizes the work performed under the project "Statistically significant relational data mining." The goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs. Our goal was to develop better algorithms and better ways to evaluate algorithm quality. We concentrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second is a set of statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model. Statisticians favor these models for graph analysis. The new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.


Challenges in parallel graph processing

Parallel Processing Letters

Hendrickson, Bruce A.; Berry, Jonathan W.

Graph algorithms are becoming increasingly important for solving many problems in scientific computing, data mining and other domains. As these problems grow in scale, parallel computing resources are required to meet their computational and memory requirements. Unfortunately, the algorithms, software, and hardware that have worked well for developing mainstream parallel scientific applications are not necessarily effective for large-scale graph problems. In this paper we present the inter-relationships between graph problems, software, and parallel hardware in the current state of the art and discuss how those issues present inherent challenges in solving large-scale graph problems. The range of these challenges suggests a research agenda for the development of scalable high-performance software for graph problems.


Sensor placement for municipal water networks

Phillips, Cynthia A.; Boman, Erik G.; Carr, Robert D.; Hart, William E.; Berry, Jonathan W.; Watson, Jean-Paul W.; Hart, David B.; Mckenna, Sean A.; Riesen, Lee A.

We consider the problem of placing a limited number of sensors in a municipal water distribution network to minimize the impact over a given suite of contamination incidents. In its simplest form, the sensor placement problem is a p-median problem that has structure extremely amenable to exact and heuristic solution methods. We describe the solution of real-world instances using integer programming or local search or a Lagrangian method. The Lagrangian method is necessary for solution of large problems on small PCs. We summarize a number of other heuristic methods for effectively addressing issues such as sensor failures, tuning sensors based on local water quality variability, and problem size/approximation quality tradeoffs. These algorithms are incorporated into the TEVA-SPOT toolkit, a software suite that the US Environmental Protection Agency has used and is using to design contamination warning systems for US municipal water systems.


Listing triangles in expected linear time on a class of power law graphs

Berry, Jonathan W.

Enumerating triangles (3-cycles) in graphs is a kernel operation for social network analysis. For example, many community detection methods depend upon finding common neighbors of two related entities. We consider Cohen's simple and elegant solution for listing triangles: give each node a 'bucket.' Place each edge into the bucket of its endpoint of lowest degree, breaking ties consistently. Each node then checks each pair of edges in its bucket, testing for the adjacency that would complete that triangle. Cohen presents an informal argument that his algorithm should run well on real graphs. We formalize this argument by providing an analysis for the expected running time on a class of random graphs, including power law graphs. We consider a rigorously defined method for generating a random simple graph, the erased configuration model (ECM). In the ECM each node draws a degree independently from a marginal degree distribution, endpoints pair randomly, and we erase self loops and multiedges. If the marginal degree distribution has a finite second moment, it follows immediately that Cohen's algorithm runs in expected linear time. Furthermore, it can still run in expected linear time even when the degree distribution has such a heavy tail that the second moment is not finite. We prove that Cohen's algorithm runs in expected linear time when the marginal degree distribution has finite 4/3 moment and no vertex has degree larger than √n. In fact we give the precise asymptotic value of the expected number of edge pairs per bucket. A finite 4/3 moment is required; if it is unbounded, then so is the number of pairs. The marginal degree distribution of a power law graph has bounded 4/3 moment when its exponent α is more than 7/3. Thus for this class of power law graphs, with degree at most √n, Cohen's algorithm runs in expected linear time. This is precisely the value of α for which the clustering coefficient tends to zero asymptotically, and it is in the range that is relevant for the degree distribution of the World-Wide Web.
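The bucket scheme described in this abstract can be sketched in a few lines. This is an illustrative reading of the prose, not Cohen's or the authors' implementation: the function name `list_triangles` and the choice to break degree ties by vertex label are assumptions.

```python
from itertools import combinations


def list_triangles(edges):
    """List each triangle once using per-node edge buckets."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self loops; the sets below drop multiedges
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Rank by (degree, label) so that ties break consistently.
    rank = {v: (len(adj[v]), v) for v in adj}
    # Each edge goes into the bucket of its lower-ranked endpoint.
    buckets = {v: [w for w in adj[v] if rank[v] < rank[w]] for v in adj}
    triangles = []
    for u, others in buckets.items():
        # Check every pair of bucket edges for the closing adjacency.
        for v, w in combinations(others, 2):
            if w in adj[v]:
                triangles.append(tuple(sorted((u, v, w))))
    return sorted(triangles)


k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(list_triangles(k4))
```

Each triangle is reported exactly once, from the bucket of its lowest-ranked vertex, so the total work is proportional to the number of edge pairs summed over all buckets, which is the quantity the abstract analyzes.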


Low-memory Lagrangian relaxation methods for sensor placement in municipal water networks

World Environmental and Water Resources Congress 2008: Ahupua'a - Proceedings of the World Environmental and Water Resources Congress 2008

Berry, Jonathan W.; Boman, Erik G.; Phillips, Cynthia A.; Riesen, Lee A.

Placing sensors in municipal water networks to protect against a set of contamination events is a classic p-median problem for most objectives when we assume that sensors are perfect. Many researchers have proposed exact and approximate solution methods for this p-median formulation. For full-scale networks with large contamination event suites, one must generally rely on heuristic methods to generate solutions. These heuristics provide feasible solutions, but give no quality guarantee relative to the optimal placement. In this paper we apply a Lagrangian relaxation method in order to compute lower bounds on the expected impact of suites of contamination events. In all of our experiments with single objectives, these lower bounds establish that the GRASP local search method generates solutions that are provably optimal to within a fraction of a percentage point. Our Lagrangian heuristic also provides good solutions itself and requires only a fraction of the memory of GRASP. We conclude by describing two variations of the Lagrangian heuristic: an aggregated version that trades off solution quality for further memory savings, and a multi-objective version which balances objectives with additional goals. © 2008 ASCE.
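To make the idea of a Lagrangian lower bound concrete, the sketch below applies the textbook relaxation of a generic p-median problem, in which the "each event is assigned to exactly one site" constraints are priced with multipliers. It is not the paper's formulation or code: the impact matrix `d`, the function names, the starting multipliers, and the diminishing step rule are all assumptions chosen for a small runnable example.

```python
from itertools import combinations


def lagrangian_bound(d, p, iters=200, step0=1.0):
    """Lower bound on the p-median optimum; d[i][j] = impact of event i at site j."""
    n_events, n_sites = len(d), len(d[0])
    lam = [min(row) for row in d]  # assumed starting multipliers
    best = float("-inf")
    for k in range(iters):
        # Reduced cost of opening site j under the current multipliers.
        rho = [sum(min(0.0, d[i][j] - lam[i]) for i in range(n_events))
               for j in range(n_sites)]
        chosen = sorted(range(n_sites), key=lambda j: rho[j])[:p]
        bound = sum(lam) + sum(rho[j] for j in chosen)
        best = max(best, bound)  # every multiplier vector gives a valid bound
        step = step0 / (k + 1)   # simple diminishing step (an assumption)
        for i in range(n_events):
            served = sum(1 for j in chosen if d[i][j] - lam[i] < 0)
            lam[i] += step * (1 - served)  # subgradient of the relaxed constraint
    return best


def brute_force_optimum(d, p):
    """Exact optimum by enumeration; only feasible for tiny instances."""
    sites = range(len(d[0]))
    return min(sum(min(row[j] for j in chosen) for row in d)
               for chosen in combinations(sites, p))


d = [[0, 4, 6], [5, 0, 3], [7, 2, 0], [1, 6, 8]]
print(lagrangian_bound(d, 2), brute_force_optimum(d, 2))
```

Only the multiplier vector and one reduced cost per site are kept between iterations, which hints at why a Lagrangian approach can need far less memory than a search over full solutions. The gap between the bound and the value of any feasible placement certifies how far that placement can be from optimal.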


Tolerating the community detection resolution limit with edge weighting

Proposed for publication in the Proceedings of the National Academy of Sciences.

Hendrickson, Bruce A.; Laviolette, Randall A.; Phillips, Cynthia A.; Berry, Jonathan W.

Communities of vertices within a giant network such as the World-Wide Web are likely to be vastly smaller than the network itself. However, Fortunato and Barthelemy have proved that modularity maximization algorithms for community detection may fail to resolve communities with fewer than √(L/2) edges, where L is the number of edges in the entire network. This resolution limit leads modularity maximization algorithms to have notoriously poor accuracy on many real networks. Fortunato and Barthelemy's argument can be extended to networks with weighted edges as well, and we derive this corollary argument. We conclude that weighted modularity algorithms may fail to resolve communities with fewer than √(Wε/2) total edge weight, where W is the total edge weight in the network and ε is the maximum weight of an inter-community edge. If ε is small, then small communities can be resolved. Given a weighted or unweighted network, we describe how to derive new edge weights in order to achieve a low ε, modify the 'CNM' community detection algorithm to maximize weighted modularity, and show that the resulting algorithm has greatly improved accuracy. In experiments with an emerging community standard benchmark, we find that our simple CNM variant is competitive with the most accurate community detection methods yet proposed.
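A short calculation shows how much a small ε lowers the resolution threshold quoted in this abstract. The sketch below only evaluates the two stated expressions; the function names and the sample values of L, W, and ε are assumptions for illustration and do not come from the paper's experiments.

```python
import math


def unweighted_threshold(num_edges):
    """Community size in edges below which resolution may fail: sqrt(L/2)."""
    return math.sqrt(num_edges / 2)


def weighted_threshold(total_weight, max_inter_weight):
    """Weighted analogue from the abstract: sqrt(W * epsilon / 2)."""
    return math.sqrt(total_weight * max_inter_weight / 2)


# Hypothetical network with 20,000 edges of unit weight.
print(unweighted_threshold(20000))
# Same total weight, but inter-community edges reweighted down to 0.01.
print(weighted_threshold(20000, 0.01))
```

With unit weights (W = L and ε = 1) the weighted expression reduces to the unweighted one, so reweighting that drives ε down by a factor of 100 shrinks the threshold by a factor of 10.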


The TEVA-SPOT toolkit for drinking water contaminant warning system design

World Environmental and Water Resources Congress 2008: Ahupua'a - Proceedings of the World Environmental and Water Resources Congress 2008

Hart, William E.; Berry, Jonathan W.; Boman, Erik G.; Murray, Regan; Phillips, Cynthia A.; Riesen, Lee A.; Watson, Jean-Paul W.

We present the TEVA-SPOT Toolkit, a sensor placement optimization tool developed within the USEPA TEVA program. The TEVA-SPOT Toolkit provides a sensor placement framework that facilitates research in sensor placement optimization and enables the practical application of sensor placement solvers to real-world CWS design applications. This paper provides an overview of its key features, and then illustrates how this tool can be flexibly applied to solve a variety of different types of sensor placement problems. © 2008 ASCE.


Limited-memory techniques for sensor placement in water distribution networks

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Hart, William E.; Berry, Jonathan W.; Boman, Erik G.; Phillips, Cynthia A.; Riesen, Lee A.; Watson, Jean-Paul W.

The practical utility of optimization technologies is often impacted by factors that reflect how these tools are used in practice, including whether various real-world constraints can be adequately modeled, the sophistication of the analysts applying the optimizer, and related environmental factors (e.g. whether a company is willing to trust predictions from computational models). Other features are less appreciated, but of equal importance in terms of dictating the successful use of optimization. These include the scale of problem instances, which in practice drives the development of approximate solution techniques, and constraints imposed by the target computing platforms. End-users often lack state-of-the-art computers, and thus runtime and memory limitations are often a significant, limiting factor in algorithm design. When coupled with large problem scale, the result is a significant technological challenge. We describe our experience developing and deploying both exact and heuristic algorithms for placing sensors in water distribution networks to mitigate damage due to intentional or accidental introduction of contaminants. The target computing platforms for this application have motivated limited-memory techniques that can optimize large-scale sensor placement problems. © 2008 Springer Berlin Heidelberg.


Scheduling manual sampling for contamination detection in municipal water networks

8th Annual Water Distribution Systems Analysis Symposium 2006

Berry, Jonathan W.; Lin, Henry; Lauer, Erik; Phillips, Cynthia

Cities without an early warning system of indwelling sensors can consider monitoring their networks manually, especially during times of heightened security levels. We consider the problem of calculating an optimal schedule for manual sampling in a municipal water network. Preliminary computations with a small-scale example indicate that during normal times, manual sampling can provide some benefit, but it is far inferior to an indwelling sensor network. However, given information that significantly constrains the nature of an imminent threat, manual sampling can perform as well as a small sensor network designed to handle normal threats. Copyright ASCE 2006.
