A practical algorithm for learning disjunctive abstraction heuristics in static program analysis
- Authors
- Jeon, Donghoon; Jeon, Minseok; Oh, Hakjoo
- Issue Date
- 7월-2021
- Publisher
- ELSEVIER
- Keywords
- Data-driven static analysis; Learning algorithm
- Citation
- INFORMATION AND SOFTWARE TECHNOLOGY, v.135
- Indexed
- SCIE
SCOPUS
- Journal Title
- INFORMATION AND SOFTWARE TECHNOLOGY
- Volume
- 135
- URI
- https://scholar.korea.ac.kr/handle/2021.sw.korea/137265
- DOI
- 10.1016/j.infsof.2021.106564
- ISSN
- 0950-5849
- Abstract
- Context: The precision and cost of static analysis are determined by abstraction heuristics (e.g., strategies for abstracting calling contexts, heap locations, etc.), but manually designing effective abstraction heuristics requires a huge amount of engineering effort and domain knowledge. Recently, data-driven static analysis has emerged to address this challenge by learning such heuristics automatically from a set of training programs. Objective: We present a practical algorithm for learning disjunctive abstraction heuristics in data-driven static analysis. We build on a recently proposed approach that can learn nontrivial program properties by disjunctive boolean functions. However, the existing approach is practically limited as it assumes that the most precise abstraction is cheap for the training programs; the algorithm is inapplicable if the most precise abstraction is not scalable. The objective of this paper is to mitigate this limitation. Method: Our algorithm overcomes the limitation with two new ideas. It systematically decomposes the learning problem into feasible subproblems, and it can search through the abstraction space from the coarse to fine-grained abstractions. With this approach, our algorithm is able to learn heuristics when static analysis with the most precise abstraction is not scalable over the training programs. Results: We show our approach is effective and generally applicable. We applied our approach to a context sensitive points-to analysis for Java and a flow-sensitive interval analysis for C. Experimental results show that our algorithm is efficient. For example, our algorithm can learn heuristics for 3-object-sensitive analysis for which the existing learning algorithm is too expensive to learn any useful heuristics. Conclusion: Our algorithm makes a state-of-the-art technique for data-driven static analysis more practical.
- Files in This Item
- There are no files associated with this item.
- Appears in
Collections - Graduate School > Department of Computer Science and Engineering > 1. Journal Articles
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.