Publications

Results 151–172 of 172

Sampling graphs with a prescribed joint degree distribution using Markov chains

2011 Proceedings of the 13th Workshop on Algorithm Engineering and Experiments, ALENEX 2011

Stanton, Isabelle; Pinar, Ali P.

One of the most influential results in network analysis is that many natural networks exhibit a power-law or log-normal degree distribution. This has inspired numerous generative models that match this property. However, more recent work has shown that while these generative models do have the right degree distribution, they are not good models for real life networks due to their differences on other important metrics like conductance. We believe this is, in part, because many of these real-world networks have very different joint degree distributions, i.e. the probability that a randomly selected edge will be between nodes of degree k and l. Assortativity is a sufficient statistic of the joint degree distribution, and it has been previously noted that social networks tend to be assortative, while biological and technological networks tend to be disassortative. We suggest that the joint degree distribution of graphs is an interesting avenue of study for further research into network structure. We provide a simple greedy algorithm for constructing simple graphs from a given joint degree distribution, and a Monte Carlo Markov Chain method for sampling them. We also show that the state space of simple graphs with a fixed degree distribution is connected via endpoint switches. We empirically evaluate the mixing time of this Markov Chain by using experiments based on the autocorrelation of each edge. Copyright © 2011 by SIAM.
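As a minimal illustration of the joint degree distribution defined in the abstract above (the probability that a randomly selected edge joins nodes of degree k and l), the following Python sketch tabulates it from an undirected edge list. The function name and the example path graph are hypothetical and are not taken from the paper; this is not the authors' construction or sampling algorithm.

```python
from collections import Counter, defaultdict


def joint_degree_distribution(edges):
    """Return {(k, l): probability} with k <= l for an undirected simple graph.

    `edges` is a list of (u, v) pairs; each undirected edge appears once.
    """
    # First pass: compute the degree of every node.
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # Second pass: count edges by the (sorted) degrees of their endpoints.
    counts = Counter()
    for u, v in edges:
        k, l = sorted((degree[u], degree[v]))
        counts[(k, l)] += 1

    total = len(edges)
    return {pair: c / total for pair, c in counts.items()}


# Hypothetical example: the path a-b-c-d has degrees 1, 2, 2, 1.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
jdd = joint_degree_distribution(edges)
# (1, 2) has probability 2/3 and (2, 2) has probability 1/3.
```

Storing each degree pair in sorted order treats (k, l) and (l, k) as the same entry, which is one common convention for undirected graphs.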

The inhibiting bisection problem

Pinar, Ali P.

Given a graph where each vertex is assigned a generation or consumption volume, we try to bisect the graph so that each part has a significant generation/consumption mismatch and the cutsize of the bisection is small. Our motivation comes from the vulnerability analysis of distribution systems such as the electric power system. We show that the constrained version of the problem, where we place either the cutsize or the mismatch significance as a constraint and optimize the other, is NP-complete, and we provide an integer programming formulation. We also propose an alternative relaxed formulation that can trade off the two objectives, and we show that this formulation can be solved in polynomial time by a maximum flow solver. Our experiments with benchmark electric power systems validate the effectiveness of our methods.

Constructing and sampling graphs with a given joint degree distribution

Stanton, Isabelle S.; Pinar, Ali P.

One of the most influential recent results in network analysis is that many natural networks exhibit a power-law or log-normal degree distribution. This has inspired numerous generative models that match this property. However, more recent work has shown that while these generative models do have the right degree distribution, they are not good models for real life networks due to their differences on other important metrics like conductance. We believe this is, in part, because many of these real-world networks have very different joint degree distributions, i.e. the probability that a randomly selected edge will be between nodes of degree k and l. Assortativity is a sufficient statistic of the joint degree distribution, and it has been previously noted that social networks tend to be assortative, while biological and technological networks tend to be disassortative. We suggest that understanding the relationship between network structure and the joint degree distribution of graphs is an interesting avenue of further research. Important tools for such studies are algorithms that can generate random instances of graphs with the same joint degree distribution. This is the main topic of this paper, and we study the problem from both a theoretical and a practical perspective. We provide an algorithm for constructing simple graphs from a given joint degree distribution, and a Monte Carlo Markov Chain method for sampling them. We also show that the state space of simple graphs with a fixed degree distribution is connected via endpoint switches. We empirically evaluate the mixing time of this Markov Chain by using experiments based on the autocorrelation of each edge. These experiments show that our Markov Chain mixes quickly on real graphs, allowing for utilization of our techniques in practice.

Clustering of graphs with multiple edge types

Pinar, Ali P.; Rocklin, Matthew D.

We study clustering on graphs with multiple edge types. Our main motivation is that similarities between objects can be measured by many different metrics. For instance, similarity between two papers can be based on common authors, where they are published, keyword similarity, citations, etc. As such, a graph with multiple edge types is a more accurate model for describing similarities between objects. Each edge type (metric) provides only partial information about the data; recovering full information requires aggregation of all the similarity metrics. Clustering becomes much more challenging in this context, since in addition to the difficulties of the traditional clustering problem, we have to deal with a space of clusterings. We generalize the concept of clustering in single-edge graphs to multi-edged graphs and investigate problems such as: Can we find a clustering that remains good even if we change the relative weights of the metrics? How can we describe the space of clusterings efficiently? Can we find unexpected clusterings (a good clustering that is distant from all given clusterings)? Given the ground-truth clustering, can we recover how the weights for edge types were aggregated?
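To make the idea of changing the relative weights of metrics concrete, here is a minimal Python sketch that collapses several edge types into one weighted graph. It assumes a simple weighted sum of per-type similarities; the abstract does not say which aggregation the authors use, and the function name, layer names, and numbers are hypothetical.

```python
def aggregate_edge_types(layers, weights):
    """Combine per-type edge similarities into a single weighted graph.

    `layers` maps an edge-type name to {(u, v): similarity}.
    `weights` maps the same names to a nonnegative relative weight.
    Assumption: aggregation is a plain weighted sum over edge types.
    """
    combined = {}
    for name, layer in layers.items():
        w = weights[name]
        for (u, v), similarity in layer.items():
            # Normalize the key so (u, v) and (v, u) are the same edge.
            key = (u, v) if u <= v else (v, u)
            combined[key] = combined.get(key, 0.0) + w * similarity
    return combined


# Hypothetical example: three papers linked by two similarity metrics.
layers = {
    "coauthor": {("p1", "p2"): 1.0, ("p2", "p3"): 0.5},
    "citation": {("p2", "p1"): 0.5, ("p1", "p3"): 1.0},
}
weights = {"coauthor": 0.5, "citation": 0.5}
combined = aggregate_edge_types(layers, weights)
```

Re-running the aggregation with different `weights` and clustering each result is one simple way to probe whether a clustering stays good as the relative weights change.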

Scalable methods for representing, characterizing, and generating large graphs

Pinar, Ali P.

Goal - design methods to characterize and identify a low-dimensional representation of graphs. Impact - enabling predictive simulation; monitoring dynamics on graphs; and sampling and recovering network structure from limited observations. Areas to explore are: (1) Enabling technologies - develop novel algorithms and tailor existing ones for complex networks; (2) Modeling and generation - identify the right parameters for graph representation and develop algorithms to compute these parameters and generate graphs from them; and (3) Comparison - given two graphs, how do we tell whether they are similar? Some conclusions are: (1) A bad metric can make anything look good; (2) A metric that is based on edge-by-edge prediction will suffer from the skewed distribution of present and absent edges; (3) The dominant signal is the sparsity; edges only add noise on top of that signal, and the real signal, the structure of the graph, is often lost behind the dominant one; and (4) Proposed alternative: comparison based on a carefully chosen set of features. It is more efficient, it is sensitive to the selection of features, and finding an independent set of features is an important area; keep an eye on us for some important results.

Compressively sensed complex networks

Pinar, Ali P.; Dunlavy, Daniel D.

The aim of this project is to develop low-dimensional parametric (deterministic) models of complex networks, to use compressive sensing (CS) and multiscale analysis to do so, and to exploit the structure of complex networks (some are self-similar under coarsening). CS provides a new way of sampling and reconstructing networks. The approach is based on a multiresolution decomposition of the adjacency matrix and its efficient sampling. It requires preprocessing the adjacency matrix to make it 'blocky', which is the biggest (combinatorial) algorithmic challenge. The current CS reconstruction algorithm makes no use of the structure of a graph; it is very general (and so not very efficient or customized). Other model-based CS techniques exist but have not yet been adapted to networks. An obvious starting point for future work is to increase the efficiency of reconstruction.

LDRD final report : massive multithreading applied to national infrastructure and informatics

Barrett, Brian B.; Hendrickson, Bruce A.; Laviolette, Randall A.; Leung, Vitus J.; Mackey, Greg; Murphy, Richard C.; Phillips, Cynthia A.; Pinar, Ali P.

Large relational datasets such as national-scale social networks and power grids present different computational challenges than do physical simulations. Sandia's distributed-memory supercomputers are well suited for solving problems concerning the latter, but not the former. The reason is that problems such as pattern recognition and knowledge discovery on large networks are dominated by memory latency and not by computation. Furthermore, most memory requests in these applications are very small, and when the datasets are large, most requests miss the cache. The result is extremely low utilization. We are unlikely to be able to grow out of this problem with conventional architectures. As the power density of microprocessors has approached that of a nuclear reactor in the past two years, we have seen a leveling of Moore's Law. Building larger and larger microprocessor-based supercomputers is not a solution for informatics and network infrastructure problems since the additional processors are utilized to only a tiny fraction of their capacity. An alternative solution is to use the paradigm of massive multithreading with a large shared memory. There is only one instance of this paradigm today: the Cray MTA-2. The proposal team has unique experience with and access to this machine. The XMT, which is now being delivered, is a Red Storm machine with up to 8192 multithreaded 'Threadstorm' processors and 128 TB of shared memory. For many years, the XMT will be the only way to address very large graph problems efficiently, and future generations of supercomputers will include multithreaded processors. Roughly 10 MTA processors can process a simple short paths problem in the time taken by the Gordon Bell Prize-nominated distributed memory code on 32,000 processors of Blue Gene/Light.
We have developed algorithms and open-source software for the XMT, and have modified that software to run some of these algorithms on other multithreaded platforms such as the Sun Niagara and Opteron multi-core chips.
