Publications / SAND Report

Pattern analysis of directed graphs using DEDICOM: an application to Enron email

Bader, Brett W.; Kolda, Tamara G.

DEDICOM is a linear algebra model for analyzing intrinsically asymmetric relationships, such as trade among nations or the exchange of emails among individuals. DEDICOM decomposes a complex pattern of observed relations among objects into a sum of simpler patterns of inferred relations among latent components of the objects. Three-way DEDICOM is a higher-order extension of the model that incorporates a third mode of the data, such as time, giving it stronger uniqueness properties and consequently enhancing interpretability of solutions. In this paper, we present algorithms for computing these decompositions on large, sparse data as well as a variant for computing an asymmetric nonnegative factorization. When we apply these techniques to adjacency arrays arising from directed graphs with edges labeled by time, we obtain a smaller graph on latent semantic dimensions and gain additional information about their changing relationships over time. We demonstrate these techniques on the Enron email corpus to learn about the social networks and their transient behavior. The mixture of roles assigned to individuals by DEDICOM showed strong correspondence with known job classifications and revealed the patterns of communication between these roles. Changes in the communication pattern over time, e.g., between top executives and the legal department, were also apparent in the solutions.
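
As a rough illustration of the model form only (not the fitting algorithms the paper presents), the sketch below assumes the standard DEDICOM formulation: a two-way model X ≈ A R Aᵀ, and a three-way model X_k ≈ A D_k R D_k Aᵀ for each slice k (for example, each time period). Here A holds the loadings of the n objects on p latent components, R is a generally asymmetric p-by-p matrix of relations among the components, and each D_k is a diagonal matrix weighting the components in slice k. The function names, array shapes, and synthetic data are assumptions made for this example and do not come from the paper.

```python
import numpy as np


def dedicom_reconstruct(A, R, D=None):
    """Build the DEDICOM model array from its factors.

    A : (n, p) loadings of n objects on p latent components
    R : (p, p) asymmetric relations among the components
    D : optional (m, p) array; row k holds the diagonal of D_k

    Returns A R A^T with shape (n, n) when D is None, otherwise the
    stacked slices A D_k R D_k A^T with shape (m, n, n).
    """
    if D is None:
        return A @ R @ A.T
    # Scale the columns of A by each slice's weights: shape (m, n, p).
    AD = A[None, :, :] * D[:, None, :]
    # X[k, i, j] = sum over p, q of AD[k, i, p] * R[p, q] * AD[k, j, q]
    return np.einsum("kip,pq,kjq->kij", AD, R, AD)


def relative_error(X, X_hat):
    """Frobenius-norm misfit between data X and a model X_hat."""
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)


# Synthetic example (hypothetical sizes): 6 objects, 2 roles, 3 time slices.
rng = np.random.default_rng(0)
n, p, m = 6, 2, 3
A = rng.random((n, p))
R = np.array([[0.1, 0.9],
              [0.2, 0.3]])  # asymmetric: role 0 -> role 1 differs from 1 -> 0
D = rng.random((m, p))

X = dedicom_reconstruct(A, R, D)      # three-way model, shape (m, n, n)
X2 = dedicom_reconstruct(A, R)        # two-way model, shape (n, n)
```

Because R need not be symmetric, each reconstructed slice can differ from its transpose, which is how the model represents directed relations such as who sends email to whom. A fitting procedure would search for A, R, and D that minimize a misfit such as `relative_error` against observed data; that search is not shown here.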