Publications

Publications / Conference

Baler: Deterministic, lossless log message clustering tool

Taerat, Narate; Brandt, Jim; Gentile, Ann C.; Wong, Matthew H.; Leangsuksun, Chokchai

The rate of failures in HPC systems continues to increase as the number of components comprising the systems increases. System logs are one of the valuable information sources that can be used to analyze system failures and their root causes. However, system log files are usually too large and complex to analyze manually. There are some existing log clustering tools that seek to help analysts in exploring these logs; however, they fail to satisfy our needs with respect to scalability, usability and quality of results. Thus, we have developed a log clustering tool to better address these needs. In this paper we present our novel approach and initial experimental results. © Springer-Verlag 2011.