16 Oct 2020
10:00 Master's Defense Fully remote
Title
DrPin: A dynamic binary instrumenter for multiple processor architectures
Student
Luís Fernando Antonioli
Advisor
Rodolfo Jardim de Azevedo
Brief summary
The complexity of programs is increasing at an unprecedented rate, and the tools used to develop them have kept pace with this evolution. Modern applications depend heavily on dynamically loaded libraries, and some even generate code during execution. Static analysis tools used to debug and understand applications are therefore no longer sufficient to give a complete view of an application. As a result, dynamic analysis tools (those that operate at run time) are being adopted and integrated into the development and study of modern applications. Among these, tools that operate directly on the program's binary are particularly useful when an application relies on numerous dynamically loaded libraries whose source code may not be available. Building tools that manipulate and instrument binary code during execution is difficult and error-prone: a small error can make the program under analysis deviate completely from its intended behavior. For this reason, Dynamic Binary Instrumentation (DBI) frameworks have become increasingly popular. These frameworks provide a means of creating dynamic binary analysis tools with little effort. Among them, Pin 2 has been by far the most popular and the easiest to use. However, since the release of the Linux kernel 4 series, it has been unsupported. In this work, we study the challenges encountered when creating a new DBI framework (DrPin) that aims to be fully compatible with the Pin 2 API while also supporting multiple architectures (x86-64, x86, ARM, AArch64) and modern Linux systems. Currently, DrPin supports a total of 83 functions of the Pin 2 API, which makes it capable of running several pintools originally written for Pin 2 without any modifications.
Comparing the performance of DrPin with Pin 2 for a simple tool that counts the number of instructions executed, we observed that, on the SPECint 2006 benchmark, DrPin is on average only 10% slower than Pin and 11.6 times slower than native execution. We also explored the ecosystem around dynamic binary instrumentation frameworks. Specifically, we study and extend a technique that uses dynamic binary analysis tools, built with the help of DBI frameworks, to predict the performance of a given architecture when running a specific program or benchmark, without the need to run the entire program or benchmark. In particular, we have extended the SimPoint methodology to further reduce the time required to obtain such predictions. We show that, by exploiting similarities in program behavior across different inputs, we can further reduce the time needed to obtain results from simulating entire benchmarks. Specifically for SPECint 2006, we show that the number of SimPoints (directly proportional to the simulation time) can be reduced by an average of 32%, losing only 0.06% of accuracy when compared to the original technique.
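As a rough illustration of the idea behind the original SimPoint methodology mentioned above (not the extension proposed in this work), the sketch below clusters per-interval basic block vectors, keeps one representative interval per cluster, and estimates a whole-program metric from those representatives alone. All function names, the toy vectors, and the CPI values are hypothetical, and the plain k-means used here is only a stand-in for the clustering a real SimPoint tool would perform.

```python
import math

def normalize(bbv):
    """Scale a basic block vector so its entries sum to 1."""
    total = sum(bbv)
    return [x / total for x in bbv] if total else list(bbv)

def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda j: distance(point, centroids[j]))

def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Pick the point farthest from its nearest already-chosen centroid.
        far = max(points, key=lambda p: min(distance(p, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    # Final assignment against the last centroids.
    return [nearest(p, centroids) for p in points], centroids

def pick_simpoints(bbvs, k):
    """Return {representative interval index: weight of its cluster}."""
    points = [normalize(v) for v in bbvs]
    labels, centroids = kmeans(points, k)
    chosen = {}
    for j in range(k):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if not members:
            continue
        rep = min(members, key=lambda i: distance(points[i], centroids[j]))
        chosen[rep] = len(members) / len(points)
    return chosen

def estimate_metric(simpoints, metric_per_interval):
    """Weighted estimate using only the representative intervals."""
    return sum(w * metric_per_interval[i] for i, w in simpoints.items())

# Hypothetical data: six execution intervals, three basic blocks each.
bbvs = [[10, 0, 0], [9, 1, 0], [0, 0, 10], [0, 1, 9], [10, 0, 0], [0, 0, 10]]
# Hypothetical cycles-per-instruction for each interval.
cpi = [1.0, 1.1, 2.0, 2.1, 1.0, 2.0]

simpoints = pick_simpoints(bbvs, k=2)
estimate = estimate_metric(simpoints, cpi)
print(simpoints, estimate)
```

In this toy setting only two of the six intervals would need detailed simulation, which is why the number of SimPoints is directly proportional to the simulation time.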
Examination Board
Members:
Rodolfo Jardim de Azevedo IC / UNICAMP
Fernando Magno Quintão Pereira DCC / UFMG
Guido Costa Souza de Araújo IC / UNICAMP
Alternates:
Sandro Rigo IC / UNICAMP
Luiz Cláudio Villar dos Santos INE / UFSC