Constraint-Based Analysis in the Presence of Uncertainty and Imprecision

158
Опубликовано 6 сентября 2016, 18:34
In all sound program analysis systems there is some mechanism for modeling non-deterministic choice, which is typically used to correctly capture effects of the program's environment (Did malloc return a valid pointer or null? Which thread did the scheduler pick?) and imprecision in the analysis (Which element was read by the array reference? Is a certain key in the hash map?). In constraint-based analyses it is common to model such decisions as an unconstrained variable ranging over the possible outcomes (e.g., malloc returned either null or a pointer, but not both). Unfortunately, this very useful idea leads to both theoretical and practical difficulties. On the theoretical side, each choice should introduce a fresh variable, leading to a potentially unbounded number of variables in recursive procedures; practically, even in relatively small, non-recursive problems the number of such variables imposes real limits on scalability. In this talk we present a solution to both the theoretical and practical problems. Given a (recursive) formula F with variables representing non-deterministic choices, we eliminate such variables in favor of two formulas capturing upper and lower bounds on F: the strongest necessary and weakest sufficient conditions under which the original formula was satisfiable and valid, respectively. These formulas are satisfiable/valid exactly when F is, so there is no loss of information in the transformation, and in practice these formulas are much smaller than F. We apply this idea to SAT-based path- and context-sensitive program analysis and demonstrate the practicality of the approach by analyzing programs as large as the entire Linux kernel.
автотехномузыкадетское