Compilers are heavily used software. Despite careful development, bugs are inevitable due to the
scale of the codebase and the complexity of the optimizing transformations. Several approaches
have been developed for compiler correctness. The classical method is to prove, mathematically,
that for every input program, the resulting optimized program is semantically equivalent to the
input program. An alternative approach is termed Translation Validation. Rather than verifying
the translator once for all, each instance of the translation is verified separately.
The open-source nature of today’s compilers opens up an alternative mechanism for translation
validation: a compiler optimization can be designed (or later augmented) to produce a proof
justifying the correctness of the translation. This witness for correctness may then be validated
by generating verification conditions that are discharged by an SMT solver. Thus, a single,
typically straightforward, simulation checker covers multiple transformations, eliminating the
need for individualized, complex heuristics.
There are, however, two important limitations to the TV methods. One is the lack of support for
inter-procedural optimizations (IPOs). Secondly, the overhead of carrying out the semantic
checks is often quite substantial. To address these limitations, we propose a mechanism that
replaces semantic checks with syntactic checks, to the extent possible.
In the thesis, I first formally define our program model and translation correctness. Upon that, I
introduce the new translation validation framework, witnessing with meta-rules and shadowing.
The meta-rule is optimization specific and describes what makes this optimization correct.
Shadowing ensures that the translation is correct after a series of compiler passes.
Then, I apply the framework to the three LLVM IPOs: Inter-procedural Constant Propagation,
Inlining, and Tail Recursion Elimination, formulate what the meta-rule is for each optimization
and prove that each meta-rule is semantic-based sound.
At last, I show the experimental results of our implementation. We have applied the mechanism to
the three LLVM IPOs. Results show that the overheads of hint-generation and validation are
lower than 4%, on average. In the absence of an existing semantic inter-procedural TV tool, we
conducted experiments to obtain an approximate lower bound on the overhead of semantic
checking. The results suggest that semantically checking these transformations may incur as
much as 40 times the syntactic checking overhead.