Publications Search

Gabert, Kasimir G.; Pinar, Ali P.; Çatalyürek, Ümit V.

Finding dense regions of graphs is fundamental in graph mining. We focus on the computation of dense hierarchies and regions with graph nuclei - -a generalization of k-cores and trusses. Static computation of nuclei, namely through variants of 'peeling', are easy to understand and implement. However, many practically important graphs undergo continuous change. Dynamic algorithms, maintaining nucleus computations on dynamic graph streams, are nuanced and require significant effort to port between nuclei, e.g., from k-cores to trusses. We propose a unifying framework to maintain nuclei in dynamic graph streams. First, we show no dynamic algorithm can asymptotically beat re-computation, highlighting the need to experimentally understand variability. Next, we prove equivalence between k-cores on a special hypergraph and nuclei. Our algorithm splits the problem into maintaining the special hypergraph and maintaining k-cores on it. We implement our algorithm and experimentally demonstrate improvements up to 108 x over re-computation. We show algorithmic improvements on k-cores apply to trusses and outperform truss-specific implementations.

More Details

TYPE Conference Paper YEAR 2021

Scopus OSTI DOI

Proceedings of the 2020 SIAM International Conference on Data Mining, SDM 2020

Laishram, Ricky; Sariyüce, Ahmet E.; Eliassi-Rad, Tina; Pinar, Ali P.; Soundarajan, Sucheta

In many online social networking platforms, the participation of an individual is motivated by the participation of others. If an individual chooses to leave a platform, this may produce a cascade in which that person’s friends then choose to leave, causing their friends to leave, and so on. In some cases, it may be possible to incentivize key individuals to stay active within the network, thus preventing such a cascade. This problem is modeled using the anchored k-core of a network, which, for a network G and set of anchor nodes A, is the maximal subgraph of G in which every node has a total of at least k neighbors between the subgraph and anchors. In this work, we propose Residual Core Maximization (RCM), a novel algorithm for finding b anchor nodes so that the size of the anchored k-core is maximized. We perform a comprehensive experimental evaluation on numerous real-world networks and compare RCM to various baselines. We observe that RCM is more effective and efficient than the state-of-the-art methods: on average, RCM produces anchored k-cores that are 1.65 times larger than those produced by the baseline algorithm, and is approximately 500 times faster on average.

More Details

TYPE Conference Poster YEAR 2020

Scopus OSTI DOI

SECURE: An Evidence-based Approach to Cyber Experimentation

Proceedings - 2019 Resilience Week, RWS 2019

Pinar, Ali P.; Benz, Zachary O.; Castillo, Anya; Hart, Bill; Swiler, Laura P.; Tarman, Thomas D.

Securing cyber systems is of paramount importance, but rigorous, evidence-based techniques to support decision makers for high-consequence decisions have been missing. The need for bringing rigor into cybersecurity is well-recognized, but little progress has been made over the last decades. We introduce a new project, SECURE, that aims to bring more rigor into cyber experimentation. The core idea is to follow the footsteps of computational science and engineering and expand similar capabilities to support rigorous cyber experimentation. In this paper, we review the cyber experimentation process, present the research areas that underlie our effort, discuss the underlying research challenges, and report on our progress to date. This paper is based on work in progress, and we expect to have more complete results for the conference.

More Details

TYPE Conference Poster YEAR 2019

Scopus OSTI DOI

Computational Optimization and Applications

Cheng, Jianqiang; Chen, Richard L.; Najm, H.N.; Pinar, Ali P.; Safta, Cosmin S.; Watson, Jean-Paul W.

Increasing penetration levels of renewables have transformed how power systems are operated. High levels of uncertainty in production make it increasingly difficulty to guarantee operational feasibility; instead, constraints may only be satisfied with high probability. We present a chance-constrained economic dispatch model that efficiently integrates energy storage and high renewable penetration to satisfy renewable portfolio requirements. Specifically, we require that wind energy contribute at least a prespecified proportion of the total demand and that the scheduled wind energy is deliverable with high probability. We develop an approximate partial sample average approximation (PSAA) framework to enable efficient solution of large-scale chance-constrained economic dispatch problems. Computational experiments on the IEEE-24 bus system show that the proposed PSAA approach is more accurate, closer to the prescribed satisfaction tolerance, and approximately 100 times faster than standard sample average approximation. Finally, the improved efficiency of our PSAA approach enables solution of a larger WECC-240 test system in minutes.

More Details

TYPE Journal Article YEAR 2018

Scopus OSTI DOI

Counter-Adversarial Node Labeling

Kegelmeyer, William P.; Wendt, Jeremy D.; Pinar, Ali P.; Anderson-Bergman, Clifford I.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Unsupervised Learning Through Randomized Algorithms for High-Volume High-Velocity Data (ULTRA-HV)

Pinar, Ali P.; Kolda, Tamara G.; Carlberg, Kevin T.; Ballard, Grey B.; Mahoney, Michael M.

Through long-term investments in computing, algorithms, facilities, and instrumentation, DOE is an established leader in massive-scale, high-fidelity simulations, as well as science-leading experimentation. In both cases, DOE is generating more data than it can analyze and the problem is intensifying quickly. The need for advanced algorithms that can automatically convert the abundance of data into a wealth of useful information by discovering hidden structures is well recognized. Such efforts however, are hindered by the massive volume of the data and its high velocity. Here, the challenge is developing unsupervised learning methods to discover hidden structure in high-volume, high-velocity data.

More Details

TYPE Other Report YEAR 2018

OSTI DOI

Predictive Fidelity of Machine Learning Methods Applied to Scientific Simulations

Debusschere, Bert D.; Templeton, Jeremy A.; Safta, Cosmin S.; Sargsyan, Khachik S.; Pinar, Ali P.; Najm, H.N.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Predictive Fidelity of Machine Learning Methods Applied to Scientific Simulations

Debusschere, Bert D.; Templeton, Jeremy A.; Safta, Cosmin S.; Sargsyan, Khachik S.; Pinar, Ali P.; Najm, H.N.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Journal of Complex Networks

Aksoy, Sinan G.; Kolda, Tamara G.; Pinar, Ali P.

Network science is a powerful tool for analyzing complex systems in fields ranging from sociology to engineering to biology. This article is focused on generative models of large-scale bipartite graphs, also known as two-way graphs or two-mode networks. We propose two generative models that can be easily tuned to reproduce the characteristics of real-world networks, not just qualitatively but quantitatively. The characteristics we consider are the degree distributions and the metamorphosis coefficient. The metamorphosis coefficient, a bipartite analogue of the clustering coefficient, is the proportion of length-three paths that participate in length-four cycles. Having a high metamorphosis coefficient is a necessary condition for close-knit community structure. We define edge, node and degreewise metamorphosis coefficients, enabling a more detailed understanding of the bipartite connectivity that is not explained by degree distribution alone. Our first model, bipartite Chung-Lu, is able to reproduce real-world degree distributions, and our second model, bipartite block two-level Erdös-Rényi, reproduces both the degree distributions as well as the degreewise metamorphosis coefficients. We demonstrate the effectiveness of these models on several real-world data sets.

More Details

TYPE Other Report YEAR 2017

Scopus OSTI DOI

Predictive Fidelity Interpretability and Resilience of Machine Learning Methods Applied to Scientific Simulations

Debusschere, Bert D.; Pinar, Ali P.; Sargsyan, Khachik S.; Templeton, Jeremy A.; Najm, H.N.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

A Community and Node Attribute-Corrected Stochastic Blockmodel

Altenburger, Kristen M.; Kegelmeyer, William P.; Pinar, Ali P.; Wendt, Jeremy D.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Bounded-Degree Approximations of Stochastic Networks

IEEE Transactions on Molecular, Biological and Multi-Scale Communications

Pinar, Ali P.; Quinn, Christopher Q.; Kiyavash, Negar K.

We propose algorithms to approximate directed information graphs. Directed information graphs are probabilistic graphical models that depict causal dependencies between stochastic processes in a network. The proposed algorithms identify optimal and near-optimal approximations in terms of Kullback-Leibler divergence. The user-chosen sparsity trades off the quality of the approximation against visual conciseness and computational tractability. One class of approximations contains graphs with speci ed in-degrees. Another class additionally requires that the graph is connected. For both classes, we propose algorithms to identify the optimal approximations and also near-optimal approximations, using a novel relaxation of submodularity. We also propose algorithms to identify the r-best approximations among these classes, enabling robust decision making.

More Details

TYPE Journal Article YEAR 2017

OSTI DOI

Counter Adversarial Community Detection

Kegelmeyer, William P.; Wendt, Jeremy D.; Pinar, Ali P.; Altenburger, Kristen M.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Counter-Adversarial Community Detection

Kegelmeyer, William P.; Wendt, Jeremy D.; Pinar, Ali P.; Altenburger, Kristen M.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Sarıyüce, Ahmet E.; Pinar, Ali P.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Sparse approximations of directed information graphs

IEEE International Symposium on Information Theory - Proceedings

Quinn, Christopher J.; Pinar, Ali P.; Gao, Jing; Su, Lu

Given a network of agents interacting over time, which few interactions best characterize the dynamics of the whole network? We propose an algorithm that finds the optimal sparse approximation of a network. The user controls the level of sparsity by specifying the total number of edges. The networks are represented using directed information graphs, a graphical model that depicts causal influences between agents in a network. Goodness of approximation is measured with Kullback-Leibler divergence. The algorithm finds the best approximation with no assumptions on the topology or the class of the joint distribution.

More Details

TYPE Conference Poster YEAR 2016

Scopus OSTI

A Sparse Quadrature Approach for Stochastic Optimization in Power Grid Models

Safta, Cosmin S.; Chen, Richard L.; Najm, H.N.; Pinar, Ali P.; Watson, Jean-Paul W.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Sampling Algorithms for Network Analysis

Pinar, Ali P.

Abstract not provided.

More Details

TYPE Presentation YEAR 2016

OSTI

MaxReach: Reducing Network Incompleteness through Node Probes

Soundarajan, Sucheta S.; Eliassi-Rad, Tina E.; Gallagher, Brian G.; Pinar, Ali P.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI DOI

Models and Measurements of Big Graphs

Pinar, Ali P.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Conference Record - Asilomar Conference on Signals, Systems and Computers

Jha, Madhav; Pinar, Ali P.; Seshadhri, C.

Graphs in the real-world are often temporal and can be represented as a "stream" of edges. Estimating the number of triangles in a graph observed as a stream of edges is a fundamental problem in data mining. Our goal is to design a single pass space-efficient streaming algorithm for estimating triangle counts. While there are numerous algorithms for this problem, they all (implicitly or explicitly) assume that the stream does not contain duplicate edges. However, real graph streams are rife with duplicate edges. The work around is typically an extra unaccounted pass (storing all the edges!) just to "clean up" the data. Furthermore, previous work tends to aggregate all edges to construct a graph, discarding the temporal information. It will be much more informative to investigate temporal windows, especially multiple time windows simultaneously. Can we estimate triangle counts for multiple time windows in a single pass even when the stream contains repeated edges? In this work, we give the first algorithm for estimating the triangle count of a multigraph stream of edges over arbitrary time windows. We build on existing "wedge sampling" work for triangle counting. Duplicate edges create significant biasing issues for small space streaming algorithms, which we provably resolve through a subtle debiasing mechanism. Moreover, our algorithm seamlessly handles multiple time windows. The final result is theoretically provable and has excellent performance in practice. Our algorithm discovers fascinating transitivity and triangle trends in real-world temporal graphs.

More Details

TYPE Conference Poster YEAR 2016

Scopus OSTI

ESCAPE: Efficiently Counting All 5-Vertex Subgraphs

Pinar, Ali P.; Seshadhri, C.S.; Vaidyanathan, vishal V.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI DOI

Counter-Adversarial Community Detection: Initial Investigations

Kegelmeyer, William P.; Wendt, Jeremy D.; Pinar, Ali P.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Diamond sampling for approximate maximum all-pairs dot-product (MAD) search

Proceedings - IEEE International Conference on Data Mining, ICDM

Journal of Complex Networks

Ray, Jaideep R.; Pinar, Ali P.; Comandur, Seshadhri C.

Abstract not provided.

More Details

TYPE Journal Article YEAR 2014

OSTI

Modeling Analyzing and Generating Networks

Pinar, Ali P.

Abstract not provided.

More Details

TYPE Presentation YEAR 2014

OSTI

Statistically significant relational data mining :

Berry, Jonathan W.; Leung, Vitus J.; Phillips, Cynthia A.; Pinar, Ali P.; Robinson, David G.

This report summarizes the work performed under the project (3z(BStatitically significant relational data mining.(3y (BThe goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs. Our goal was to develop better algorithms and better ways to evaluate algorithm quality. We concetrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second are statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model. Statisticians favor these models for graph analysis. The new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.

More Details

TYPE SAND Report YEAR 2014

OSTI DOI