Research Brief | Computing+ Finance Dr Zhipeng Gao: ACWRecommender: A Tool for Validating Actionable Warnings with Weak Supervision

Source: Shanghai Institute for Advanced Study English website

Static analysis tools have been widely used by software developers and companies in recent years to detect potential bugs. For example, Infer, a static analyzer developed by Facebook, detects code defects (e.g., null pointer exceptions, memory leaks, race conditions) in Android and iOS applications, including Facebook, WhatsApp, and Instagram. Static analysis tools have gained popularity among developers because their analysis is lightweight and computationally cheap. However, they face two main challenges: first, their false alarm rate is high (up to 90%); second, developers often suffer from information overload when using them, which can cause them to overlook real code defects while getting caught up in irrelevant false alarms.

To address this, researchers have introduced the concept of the actionable warning. Specifically, a warning is considered actionable if it appears in one software version and disappears in a subsequent version; otherwise, it is considered a false alarm. However, existing work has two shortcomings: (1) most warning elimination algorithms rely on hand-crafted features, which depend on expert design and specialized domain knowledge, and (2) previous research focuses mainly on detecting actionable warnings while ignoring the fact that not all actionable warnings stem from software defects.
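
The labeling rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the StaticWarning record, its fields, and the label_warnings helper are hypothetical names introduced here, and the sketch assumes that a warning can be matched across two versions by file, rule, and enclosing function. Real pipelines also have to cope with renamed files and moved code, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticWarning:
    """Hypothetical record for one warning reported by a static analyzer."""
    file: str
    rule: str       # bug type reported by the analyzer
    function: str   # enclosing function, used instead of a fragile line number


def label_warnings(old_version, new_version):
    """Label every warning reported on the older version.

    A warning that no longer appears in the newer version is labeled
    'actionable'; a warning that is still present is labeled 'false_alarm'.
    """
    still_present = set(new_version)
    return {
        w: ("false_alarm" if w in still_present else "actionable")
        for w in old_version
    }


if __name__ == "__main__":
    leak = StaticWarning("io.c", "MEMORY_LEAK", "load_config")
    deref = StaticWarning("parse.c", "NULL_DEREFERENCE", "parse_header")
    # The null dereference warning disappears in the newer version.
    for warning, label in label_warnings([leak, deref], [leak]).items():
        print(warning.rule, label)
```

Under this rule, a disappearing warning only indicates that the code changed, not that a bug was fixed, which is the gap the second shortcoming points to.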

ACWRecommender framework

Here the authors built ACWRecommender, a model for eliminating static analysis tool warnings that validates, scores, and ranks static warnings via weakly supervised learning. The method is a two-stage framework consisting of a coarse-grained detection stage and a fine-grained reranking stage. To validate the feasibility of the model, the authors statically analyzed the top 500 C projects on GitHub, collecting 538 actionable warnings and 30,590 false positives, and then assigned a weak label to each actionable warning by semantic and structural matching. In the first stage, a detector built on a pre-trained model predicts whether a warning is actionable; in the second stage, the model is fine-tuned with the weak labels to prioritize actionable warnings that are likely to be software defects. The authors conducted extensive experiments on both tasks: on the first-stage actionable warning detection task, the F1-score of ACWRecommender improved by 91.7% over the baseline models, and on the second-stage warning validation ranking task, it substantially outperformed the baseline models in terms of nDCG and MRR. The tool was also validated in a real-world development scenario: the study used ACWRecommender to recommend 24 actionable warnings to GitHub developers, 22 of which were identified as software defects, further demonstrating its utility.
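
For readers unfamiliar with the two ranking metrics mentioned above, the following Python sketch shows one common way to compute MRR and nDCG for ranked lists of warnings. It is an illustration only: the function names are chosen here, and the sketch assumes linear gain with a log2 rank discount, which is a widespread convention but not necessarily the exact variant used in the paper. Each input list holds relevance scores in ranked order, where a positive score marks a warning that is a real defect.

```python
import math


def reciprocal_rank(relevances):
    """Return 1/rank of the first relevant item, or 0.0 if there is none."""
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(ranked_lists):
    """Average the reciprocal rank over several ranked lists."""
    return sum(reciprocal_rank(r) for r in ranked_lists) / len(ranked_lists)


def dcg(relevances):
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Two hypothetical ranked lists: 1 = real defect, 0 = not a defect.
    rankings = [[1, 0, 0], [0, 1, 0]]
    print(mean_reciprocal_rank(rankings))
    print([ndcg(r) for r in rankings])
```

Both metrics reward a ranker for placing real defects near the top of the list, which matches the goal of the second stage: developers should see the most likely defects first.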

The project was funded by the Starry Night Fund of the Shanghai Institute for Advanced Study, Zhejiang University, and by the Science and Technology Commission of Shanghai Municipality. It was presented in the Industry Challenge Track of the 38th IEEE/ACM International Conference on Automated Software Engineering. For details, see https://conf.researchr.org/details/ase-2023/ase-2023-industry-challenge/11/ACWRecommender-A-Tool-for-Validating-Actionable-Warnings-with-Weak-Supervision.