Publications

Publications / SAND Report

Formulas for robust, one-pass parallel computation of covariances and arbitrary-order statistical moments

Pebay, Philippe P.

We present a formula for the pairwise update of arbitrary-order centered statistical moments. This formula is of particular interest to compute such moments in parallel for large-scale, distributed data sets. As a corollary, we indicate a specialization of this formula for incremental updates, of particular interest to streaming implementations. Finally, we provide pairwise and incremental update formulas for the covariance. Centered statistical moments are one of the most widely used tools in descriptive statistics. It is therefore essential for statistical analysis packages that robust and efficient algorithms be devised and implemented. However, robustness and speed of execution, in this context as well as in others, tend to be orthogonal. For instance, it is well known1 that algorithms for calculating centered statistical moments that utilize sum of powers for the sake of execution speed (one-pass algorithms) lead to unacceptable numerical instability.