SPINE: A Scalable Log Parser with Feedback Guidance (ESEC/FSE 2022 - Research Papers)

Mon 14 - Fri 18 November 2022 Singapore

Who

Xuheng Wang, Xu Zhang, Liqun Li, Shilin He, Hongyu Zhang, Yudong Liu, Lingling Zheng, Yu Kang, Qingwei Lin, Yingnong Dang, Saravanakumar Rajmohan, Dongmei Zhang

Track

ESEC/FSE 2022 Research Papers

Abstract

Log parsing, which extracts log templates and parameters, is a critical prerequisite step for automated log analysis techniques. Though existing log parsers have achieved promising accuracy on public log datasets, they still face many challenges when applied in industry. By studying the characteristics of real-world log data and analyzing the limitations of existing log parsers, we identify two problems. First, it is non-trivial to scale a log parser to a vast number of logs, especially in real-world scenarios where the log data is extremely imbalanced. Second, existing log parsers overlook the importance of user feedback, which is imperative for fine-tuning a parser as log data continuously evolves. To overcome these challenges, we propose SPINE, a highly scalable log parser with user feedback guidance. Building on our log parser, which is equipped with initial grouping and progressive clustering, we propose a novel log data scheduling algorithm to improve the efficiency of parallelization on large-scale imbalanced log data. In addition, we introduce user feedback so that the parser can adapt quickly to evolving logs. We evaluated SPINE on 16 public log datasets. SPINE achieves more than 0.90 parsing accuracy on average with the highest parsing efficiency, outperforming the state-of-the-art log parsers. We also evaluated SPINE in the production environment of Microsoft, where SPINE can parse 30 million logs in less than 8 minutes with 16 executors, achieving near real-time performance. Furthermore, our evaluations show that SPINE consistently achieves good accuracy under log evolution with a moderate amount of user feedback.
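
For readers unfamiliar with the task, the sketch below illustrates what "extracting log templates and parameters" means in general. It is not SPINE's algorithm: the rule used here (any whitespace-separated token containing a digit is treated as a parameter) is a simplifying assumption, and the function name `parse_log`, the `<*>` placeholder, and the sample log lines are hypothetical.

```python
from collections import defaultdict

PLACEHOLDER = "<*>"  # assumed wildcard marker for a parameter position


def parse_log(line):
    """Split one log line into (template, parameters).

    Assumption for illustration only: a token that contains a digit
    is variable (a parameter); every other token is constant text.
    """
    template_tokens, params = [], []
    for token in line.split():
        if any(ch.isdigit() for ch in token):
            template_tokens.append(PLACEHOLDER)
            params.append(token)
        else:
            template_tokens.append(token)
    return " ".join(template_tokens), params


def group_by_template(lines):
    """Group the parameter lists of log lines under their shared template."""
    groups = defaultdict(list)
    for line in lines:
        template, params = parse_log(line)
        groups[template].append(params)
    return dict(groups)


# Hypothetical sample logs, not taken from any dataset in the paper.
sample_logs = [
    "Connection from 10.0.0.5 closed after 42 ms",
    "Connection from 10.0.0.9 closed after 7 ms",
    "Worker started",
]
groups = group_by_template(sample_logs)
for template, param_lists in groups.items():
    print(template, param_lists)
```

In this toy example the first two lines collapse into one template with two parameter lists, while the third line forms its own template. Real parsers, including the one described in the abstract, use more robust grouping and clustering than this digit heuristic.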

DOI

https://doi.org/10.1145/3540250.3549176

Xuheng Wang

Tsinghua University

China

Xu Zhang

Microsoft Research

China

Liqun Li

Microsoft Research

China

Shilin He

Microsoft Research

China

Hongyu Zhang

University of Newcastle

Australia

Yudong Liu

Microsoft Research

China

Lingling Zheng

Microsoft Azure

United States

Yu Kang

Microsoft Research

China

Qingwei Lin

Microsoft Research

China

Yingnong Dang

Microsoft Azure

United States

Saravanakumar Rajmohan

Microsoft 365

United States

Dongmei Zhang

Microsoft Research

China