Project Stage 1

Building GCC on AArch64 and Exploring Compilation Dumps

Introduction

Welcome to my journey through Project Stage 1 for the SPO600 course, where we dive into building the GNU Compiler Collection (GCC) from source. This semester, our task is to create a proof-of-concept for the function-pruning component of Automatic Function Multi-Versioning (AFMV) on AArch64 systems.

In this post, I’ll guide you through my experience with setting up GCC on an AArch64 server and exploring the diagnostic output, or “dumps,” generated during its compilation passes. Building a compiler from source was a first for me, and it provided essential insights into the build process and initial compiler optimizations.

Why Build GCC?

Compiling GCC from source is crucial for working directly with the compiler’s code. It lets us configure custom builds, test changes, and understand the system we’re developing for in a hands-on way. Setting up a local GCC build on the aarch64-002 server was my first step in getting comfortable with this complex toolchain.


Steps to Build GCC

1. Cloning the GCC Repository

The first step in building GCC was to obtain its source code. I created a project directory to keep everything organized and then cloned the main GCC repository from gcc.gnu.org:

mkdir ~/gcc-project
cd ~/gcc-project
git clone git://gcc.gnu.org/git/gcc.git

Cloning took a while since GCC is a large codebase with many files and contributors, but this setup will make it easier to experiment with custom builds.

2. Setting Up the Build Directory

To keep the build files separate from the source code, I created a dedicated build directory:

mkdir ~/gcc-build-001
cd ~/gcc-build-001

3. Configuring the Build

With the source and build directories ready, I configured the build with the configure script, specifying an installation path with the --prefix option. This step ensures that GCC installs to a local directory, so it doesn’t interfere with system-wide software:

~/gcc-project/gcc/configure --prefix=$HOME/gcc-test-001

The configuration process verifies that the necessary libraries and tools are available. Thankfully, the server already had all dependencies installed, which kept things running smoothly.
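Beyond --prefix, the configure script accepts many options that can shrink the build considerably. As a sketch of my own (these particular options are suggestions, not part of the build described above, and assume the source tree from step 1 lives in ~/gcc-project/gcc), a leaner development configuration might look like:

```
# --enable-languages=c,c++  build only the C and C++ front ends
# --disable-multilib        build native AArch64 libraries only
# --disable-bootstrap       single-stage build (faster; fine for development)
~/gcc-project/gcc/configure \
    --prefix=$HOME/gcc-test-001 \
    --enable-languages=c,c++ \
    --disable-multilib \
    --disable-bootstrap
```

Skipping the three-stage bootstrap and the extra front ends is usually the biggest time-saver when you only need a compiler to experiment with.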

4. Compiling GCC

Next came the build process itself, which I initiated with make. GCC is a massive project, so building it can take anywhere from minutes to hours. I used the following command to compile with parallel jobs, log output, and track build time:

time make -j 24 |& tee build.log

Here:

  • -j 24: Runs 24 parallel jobs to match the server’s core count; parallelism shortens the build dramatically.
  • time: Measures how long the build takes.
  • |& tee build.log: The |& operator pipes both standard output and standard error into tee, which displays them on screen and records them in build.log for future reference.

On the aarch64-002 server, the build completed in just under two hours.
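The job count needn’t be hard-coded, either. A portable variation (my own tweak, not the exact command I ran) reads the core count with nproc:

```shell
# nproc reports the number of available CPU cores,
# so the same command works on machines of any size.
jobs=$(nproc)
echo "building with $jobs parallel jobs"
# then: time make -j "$jobs" |& tee build.log
```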

5. Installing GCC

Once the build was complete, I installed GCC to the specified directory:

make install

6. Setting Up the Environment

To use my new GCC build instead of the system’s default version, I updated my PATH variable to prioritize the newly installed compiler:

export PATH=$HOME/gcc-test-001/bin:$PATH

This change ensures that my system searches this directory first when I call gcc, allowing me to test my build without affecting the system’s compiler.
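A quick sanity check (my own addition, not part of the original steps) confirms that the new directory really is searched first:

```shell
export PATH="$HOME/gcc-test-001/bin:$PATH"
# The first colon-separated entry of PATH should now be the new bin directory.
case ":$PATH:" in
  ":$HOME/gcc-test-001/bin:"*) echo "custom gcc directory is first on PATH" ;;
  *)                           echo "custom gcc directory is NOT first" ;;
esac
```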

7. Verifying the Installation

Finally, I confirmed that the correct GCC version was active:

gcc --version

The output showed GCC version 15.0.0 (experimental), confirming that I had successfully built and installed the new compiler.


Producing and Analyzing Compilation Dumps

With GCC up and running, my next step was to explore how it handles code during different compilation stages. Using -fdump-tree-all and -fdump-rtl-all options, I compiled a small test program and analyzed the resulting dump files. This experience helped me understand the intermediate steps GCC uses to transform source code into optimized machine code.

Creating the Test Program

To generate useful dumps, I wrote a basic program in C, test.c:

#include <stdio.h>

int add(int a, int b) {
    return a + b;
}

int main() {
    printf("Result: %d\n", add(3, 4));
    return 0;
}

Compiling with Dump Options

Using -fdump-tree-all and -fdump-rtl-all, I generated a series of dump files for each stage in GCC’s compilation process:

gcc -O2 -fdump-tree-all -fdump-rtl-all test.c

This command created a large number of dump files, one per compilation pass, with names like test.c.004t.gimple and test.c.156r.expand (the exact pass numbers vary between GCC versions), which I opened to explore the optimizations applied to the code.


Key Insights from the Dumps

The dump files provided a clear view of how GCC transforms code through different stages of compilation:

  • GIMPLE Representation: In the test.c.004t.gimple dump, I observed my code in GIMPLE form, a simplified three-address intermediate representation that GCC uses for its high-level, target-independent optimizations. Here, the code was broken into simple statements for efficient analysis without altering its behaviour.

  • RTL Representation: In the test.c.156r.expand dump, I found the Register Transfer Language (RTL) form, which is much closer to machine code. This view shows how GCC models individual machine operations and, in later passes, assigns registers and applies final target-specific optimizations before emitting the executable code.

Overall, these dumps revealed how GCC progressively refines and optimizes a simple function through various stages, bringing it closer to an efficient, compiled form.


Reflection

Building GCC and exploring its dump files was a valuable learning experience. I gained a clearer understanding of the build process, from source configuration to managing dependencies. Producing and analyzing dumps introduced me to GCC’s intermediate representations and showed how the compiler makes code more efficient at each stage.

This journey has given me a solid foundation in working with GCC, and I look forward to further exploring function versioning and pruning for the AFMV project.
