Project Stage 1
Building GCC on AArch64 and Exploring Compilation Dumps
Introduction
Welcome to my journey through Project Stage 1 for the SPO600 course, where we dive into building the GNU Compiler Collection (GCC) from source. This semester, our task is to create a proof-of-concept for the function-pruning component of Automatic Function Multi-Versioning (AFMV) on AArch64 systems.
In this post, I’ll guide you through my experience with setting up GCC on an AArch64 server and exploring the diagnostic output, or “dumps,” generated during its compilation passes. Building a compiler from source was a first for me, and it provided essential insights into the build process and initial compiler optimizations.
Why Build GCC?
Compiling GCC from source is crucial for working directly with the compiler’s code. It lets us configure custom builds, test changes, and understand the system we’re developing for in a hands-on way. Setting up a local GCC build on the aarch64-002 server was my first step in getting comfortable with this complex toolchain.
Steps to Build GCC
1. Cloning the GCC Repository
The first step in building GCC was to obtain its source code. I created a project directory to keep everything organized and then cloned the main GCC repository from gcc.gnu.org.
Cloning took a while since GCC is a large codebase with a long history, but having a local copy makes it easier to experiment with custom builds.
2. Setting Up the Build Directory
To keep the build files separate from the source code, I created a dedicated build directory.
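As a rough sketch of this step, the build directory can sit next to the source checkout rather than inside it. The path below is a hypothetical placeholder, not necessarily the one used on the server.

```shell
# Hypothetical location for the build tree; adjust as needed.
BUILD_DIR="$HOME/gcc-build-001"

# -p creates missing parent directories and is harmless if the directory exists.
mkdir -p "$BUILD_DIR"
cd "$BUILD_DIR"
```

Building outside the source tree means the whole build can be discarded by deleting this one directory.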
3. Configuring the Build
With the source and build directories ready, I configured the build with the configure script, specifying an installation path with the --prefix option. This step ensures that GCC installs to a local directory, so it doesn’t interfere with system-wide software.
The configuration process verifies that the necessary libraries and tools are available. Thankfully, the server already had all dependencies installed, which kept things running smoothly.
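A minimal sketch of the configure call, meant to be run from inside the build directory. SRC_DIR and PREFIX are hypothetical paths, and the guard is only there to avoid a confusing error when the source has not been cloned yet.

```shell
SRC_DIR="$HOME/gcc-project/gcc"   # hypothetical path to the cloned source
PREFIX="$HOME/gcc-test-001"       # hypothetical install directory

if [ -x "$SRC_DIR/configure" ]; then
  # --prefix sets where `make install` will later place the compiler.
  "$SRC_DIR/configure" --prefix="$PREFIX"
else
  echo "configure script not found in $SRC_DIR" >&2
fi
```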
4. Compiling GCC
Next came the build process itself, which I initiated with make. GCC is a massive project, so building it can take anywhere from minutes to hours. I used the following command to compile with parallel jobs, log output, and track build time:
Here:
-j 24: Runs 24 parallel jobs, based on the server’s core count. Running multiple jobs can speed up the process.
time: Measures the build time.
tee build.log: Records output and errors to build.log for future reference.
On the aarch64-002 server, the build completed in just under two hours.
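To illustrate how the logging part of that pipeline behaves, here is a small sketch that swaps make for a stand-in function (fake_build is hypothetical). It shows why stderr has to be merged into stdout for tee to record errors as well as normal output.

```shell
# Stand-in for the real build: one line on stdout, one on stderr.
fake_build() {
  echo "compiling example.o"
  echo "warning: example diagnostic" >&2
}

# 2>&1 redirects stderr into stdout before the pipe, so tee sees both streams.
# tee prints everything to the terminal and also writes it to build.log.
fake_build 2>&1 | tee build.log
```

For the real build, the stand-in would be replaced by make -j 24, and in bash the time keyword can be placed in front of the pipeline to report how long it took.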
5. Installing GCC
Once the build was complete, I installed GCC to the specified directory.
6. Setting Up the Environment
To use my new GCC build instead of the system’s default version, I updated my PATH variable to prioritize the newly installed compiler.
This change ensures that the shell searches this directory first when I call gcc, allowing me to test my build without affecting the system’s compiler.
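A minimal sketch of the PATH change, assuming the compiler was installed under a hypothetical prefix of $HOME/gcc-test-001.

```shell
PREFIX="$HOME/gcc-test-001"   # hypothetical install directory

# Prepend so this bin directory is searched before the system directories.
export PATH="$PREFIX/bin:$PATH"

# Show the first PATH entry and which gcc (if any) the shell now resolves.
echo "first PATH entry: ${PATH%%:*}"
command -v gcc || echo "no gcc found on PATH"
```

The change only lasts for the current shell session unless the export line is added to a startup file such as ~/.bashrc.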
7. Verifying the Installation
Finally, I confirmed that the correct GCC version was active.
The output showed GCC version 15.0.0 (experimental), confirming that I had successfully built and installed the new compiler.
Producing and Analyzing Compilation Dumps
With GCC up and running, my next step was to explore how it handles code during different compilation stages. Using the -fdump-tree-all and -fdump-rtl-all options, I compiled a small test program and analyzed the resulting dump files. This experience helped me understand the intermediate steps GCC uses to transform source code into optimized machine code.
Creating the Test Program
To generate useful dumps, I wrote a basic program in C, test.c.
Compiling with Dump Options
Using -fdump-tree-all and -fdump-rtl-all, I generated a series of dump files for each stage in GCC’s compilation process.
This command created several files with names like test.c.004t.gimple.dump and test.c.156r.expand.dump, which I opened to explore the optimizations applied to the code.
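The following sketch shows one way to reproduce this kind of run. The C program is a hypothetical stand-in, since the original test.c is not reproduced here, and the pass numbers in the dump file names vary between GCC versions, so the listing uses wildcards rather than exact names.

```shell
# Hypothetical stand-in for test.c: a small loop plus a call from main.
cat > test.c <<'EOF'
#include <stdio.h>

/* Sum the integers from 1 to n. */
static int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}

int main(void) {
    printf("%d\n", sum_to(10));
    return 0;
}
EOF

if command -v gcc >/dev/null 2>&1; then
  # -c stops after producing test.o; the dump files are written alongside it.
  gcc -c -fdump-tree-all -fdump-rtl-all test.c
  # List the GIMPLE and expand dumps, whatever pass numbers this GCC assigns.
  ls test.c.*gimple* test.c.*expand*
else
  echo "gcc not found on PATH" >&2
fi
```

Running this in an empty scratch directory keeps the many generated dump files from cluttering a working tree.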
Key Insights from the Dumps
The dump files provided a clear view of how GCC transforms code through different stages of compilation:
GIMPLE Representation: In the 004t.gimple.dump file, I observed my code in GIMPLE form, an intermediate representation that GCC uses to apply general optimizations. Here, the code was broken down into simple statements with explicit temporaries, without altering its functionality.
RTL Representation: In the 156r.expand.dump file, I found the Register Transfer Language (RTL) form, which is closer to machine code. This view shows how GCC expresses operations in terms of registers and memory. At this stage the registers are still pseudo-registers, and hardware registers are assigned by later passes before the final code is generated.
Overall, these dumps revealed how GCC progressively refines and optimizes a simple function through various stages, bringing it closer to an efficient, compiled form.
Reflection
Building GCC and exploring its dump files was a valuable learning experience. I gained a clearer understanding of the build process, from source configuration to managing dependencies. Producing and analyzing dumps introduced me to GCC’s intermediate representations and showed how the compiler makes code more efficient at each stage.
This journey has given me a solid foundation in working with GCC, and I look forward to further exploring function versioning and pruning for the AFMV project.