SEDiff: Scope-Aware Differential Fuzzing to Test Internal Function Models in Symbolic Execution
Symbolic execution has become a foundational program analysis technique.
Performing symbolic execution unavoidably encounters internal functions (e.g., library functions) that provide basic operations such as string processing.
Many symbolic execution engines construct internal function models that abstract function behaviors for scalability and compatibility concerns.
Due to the high complexity of constructing the models,
developers intentionally summarize only partial behaviors of a function, namely modeled functionalities, in the models.
The correctness of the internal function models is critical because
it would impact all applications of symbolic execution, e.g., bug detection and model checking.
A naive solution to testing the correctness of internal function models is to cross-check whether the behaviors of the models comply with their corresponding original function implementations.
However, such a solution would mostly detect overwhelming inconsistencies concerning the unmodeled functionalities, which are out of the scope of models and thus considered false reports.
We argue that a reasonable testing approach should target only the functionalities that developers intend to model.
While being necessary, automatically identifying the modeled functionalities, i.e., the scope, is a significant challenge.
In this paper, we propose a scope-aware differential testing framework, SEDiff, to tackle this problem.
We design a novel algorithm to automatically map the modeled functionalities to the code in the original implementations.
SEDiff then applies scope-aware grey-box differential fuzzing to relevant code in the original implementations.
It also equips a new scope-aware input generator and a tailored bug checker that efficiently and correctly detect erroneous inconsistencies.
We extensively evaluated SEDiff on several popular real-world symbolic execution engines targeting binary, web and kernel.
Our manual investigation shows that SEDiff precisely identifies the modeled functionalities and detects
46 new bugs in the internal function models used in the symbolic execution engines.