The Birth of C
the bootstrap
A brief recap on how coding in assembly works
Overview:
hello.c → [C compiler] → hello.s → [assembler] → hello.o → [linker] → hello (executable)
Here’s helloworld in assembly (so we start from hello.s):
You may also want to watch this nice video about going the other way:
To create the C programming language, Dennis Ritchie also had to write a C compiler for the PDP-11, which at the time offered only an assembler. The first C compiler therefore had to be written in assembly and assembled into a compiler binary (see the steps below), which could then be used to compile and run tiny C programs.
The Assembly-to-Binary Process:
Compiler: Ritchie's C compiler (written in PDP-11 assembly) read C source code and generated PDP-11 assembly language as its output, not binary.
Assembler: The generated assembly code was then fed to the PDP-11 assembler, which converted the assembly mnemonics into machine-code object files.
Linker: Finally, a linker (like ld) combined the object files with libraries to produce the final executable binary.
(Ritchie reciting the assembly to Kernighan for the first C Compiler)
This approach was quite forward-thinking. Many early compilers generated machine code directly, but emitting assembly made the compiler more portable and easier to debug. When Ritchie and Steve Johnson later ported Unix and C to the Interdata 8/32 (using Johnson's Portable C Compiler), the main compiler work was in the code generation part, which had to produce Interdata assembly language; the rest of the toolchain concept remained the same.
This multi-stage approach became the standard model that compilers still follow today (though with more intermediate representations and optimization passes).
The development process was bootstrapped in stages:
Starting with assembly language: Initially, Ritchie wrote a simple C compiler in PDP-11 assembly language. This first compiler could handle only a subset of the C language.
Self-hosting bootstrap: Once he had a basic C compiler working, he rewrote it in C itself. This is called "self-hosting": the compiler is written in the very language it compiles. The process involved:
Writing a minimal C compiler in assembly
Using that compiler to compile a more complete C compiler written in C
Iteratively improving the compiler using each previous version
Gradual expansion: The C language and its compiler evolved together. Ritchie would add new language features and simultaneously extend the compiler to handle them.
This bootstrapping approach was crucial for portability. Once the C compiler was written in C, it could be ported to new machines by first creating a minimal cross-compiler or by hand-translating key parts to the target machine's assembly language, then using that to compile the full compiler on the new platform.
This self-hosting characteristic of C made the Unix-to-Interdata port possible: both the operating system and the tools needed to maintain it could be brought to the new hardware platform.
Ritchie wrote a recursive descent parser for a subset of the C language. Beyond the parser, Ritchie had to solve several fundamental compiler problems in that first assembly-language C compiler:
Code Generation
This was perhaps the most complex part. Ritchie had to:
Map C language constructs to PDP-11 assembly instructions
Handle register allocation (deciding which variables go in which registers)
Generate efficient instruction sequences for expressions, control flow, and function calls
Implement the PDP-11 calling conventions for function parameters and return values
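To make the code generation step concrete, here is a minimal sketch of a generator that walks an expression tree and emits assembly text. It is not Ritchie's code: the Node type, the gen and append functions, and the stack-machine strategy are all assumptions chosen for brevity, and the output is PDP-11-flavored text that has not been checked against a real assembler.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical expression tree: a leaf holds a constant,
   an interior node holds '+' or '-' and two children. */
typedef struct Node {
    char op;                      /* '+', '-', or 0 for a constant leaf */
    int value;                    /* used only when op == 0 */
    const struct Node *lhs, *rhs;
} Node;

/* Append one line of assembly text to the output buffer. */
static void append(char *out, size_t cap, const char *line)
{
    size_t used = strlen(out);
    assert(used + strlen(line) < cap);   /* sketch: no truncation handling */
    strcpy(out + used, line);
}

/* Emit code stack-machine style: every subtree leaves its
   result on top of the stack. */
static void gen(const Node *n, char *out, size_t cap)
{
    if (n->op == 0) {
        char line[32];
        snprintf(line, sizeof line, "\tmov\t$%d,-(sp)\n", n->value);
        append(out, cap, line);          /* push the constant */
        return;
    }
    gen(n->lhs, out, cap);               /* left operand is pushed first */
    gen(n->rhs, out, cap);               /* right operand ends up on top */
    /* Pop the right operand and combine it into the left one. */
    append(out, cap, n->op == '+' ? "\tadd\t(sp)+,(sp)\n"
                                  : "\tsub\t(sp)+,(sp)\n");
}
```

Calling gen on the tree for 1 + 2 with an empty buffer yields two mov lines followed by one add line. A real compiler would keep values in registers where it can instead of pushing everything, which is where register allocation comes in.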
Symbol Table Management
Track variable names, types, and storage locations (register vs memory)
Handle scope rules (local vs global variables)
Manage function declarations and definitions
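A symbol table with scope rules can be sketched in a few lines. The version below is an illustration only; the SymTab layout, the fixed MAX_SYMS capacity, and the function names are assumptions, not a reconstruction of the original compiler's data structures.

```c
#include <assert.h>
#include <string.h>

enum { MAX_SYMS = 64 };           /* arbitrary capacity for the sketch */

typedef struct {
    const char *name;
    const char *type;             /* e.g. "int" or "char" */
    int depth;                    /* 0 = global, 1 and up = nested blocks */
} Symbol;

typedef struct {
    Symbol syms[MAX_SYMS];
    int count;                    /* number of live symbols */
    int depth;                    /* current nesting depth */
} SymTab;

static void enter_scope(SymTab *t) { t->depth++; }

/* Leaving a block discards every symbol declared inside it. */
static void leave_scope(SymTab *t)
{
    assert(t->depth > 0);
    while (t->count > 0 && t->syms[t->count - 1].depth == t->depth)
        t->count--;
    t->depth--;
}

static void declare(SymTab *t, const char *name, const char *type)
{
    assert(t->count < MAX_SYMS);
    t->syms[t->count++] = (Symbol){ name, type, t->depth };
}

/* Search newest-first so an inner declaration shadows an outer one. */
static const Symbol *lookup(const SymTab *t, const char *name)
{
    for (int i = t->count - 1; i >= 0; i--)
        if (strcmp(t->syms[i].name, name) == 0)
            return &t->syms[i];
    return NULL;
}
```

Declaring x globally, entering a scope, and declaring x again makes lookup return the inner x until leave_scope is called, after which the global one is visible again.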
Type System Implementation
Handle C's basic types (char, int, pointer, array)
Implement type checking and type conversions
Calculate sizes and alignments for different data types
Handle pointer arithmetic correctly
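The pointer arithmetic rule is easy to state in code: adding n to a pointer moves it by n times the size of the pointed-to type. The sketch below uses hypothetical helper names (scaled_offset, int_pointer_step_matches) and checks the rule against whatever compiler builds it, so the concrete byte counts depend on the machine (an int was 2 bytes on the PDP-11 and is usually 4 today).

```c
#include <assert.h>
#include <stddef.h>

/* Byte distance a compiler must add for `p + n` when p points
   to elements that are elem_size bytes wide. */
static size_t scaled_offset(size_t n, size_t elem_size)
{
    assert(elem_size > 0);
    return n * elem_size;
}

/* Compare the rule with what the host compiler does for int pointers.
   Returns 1 when they agree. */
static int int_pointer_step_matches(void)
{
    int a[4] = {0};
    int *p = a;
    ptrdiff_t bytes = (char *)(p + 3) - (char *)p;   /* distance in bytes */
    return (size_t)bytes == scaled_offset(3, sizeof(int));
}
```

For example, scaled_offset(3, 2) gives the 6-byte step that p + 3 needs when the elements are 2 bytes wide.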
Memory Layout and Storage Allocation
Decide where to place variables in memory (stack or static storage; heap allocation is left to library routines rather than the compiler)
Implement stack frame management for function calls
Handle local variable storage and cleanup
Manage string literals and constants
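Stack frame management mostly comes down to giving each local variable an offset from a frame pointer. Here is a small sketch of that bookkeeping. The Local struct and layout_frame function are hypothetical, and the layout (negative offsets growing downward, every slot rounded up to the alignment) is one simple choice, not the documented PDP-11 frame format.

```c
#include <assert.h>

typedef struct {
    const char *name;
    int size;       /* size in bytes */
    int offset;     /* filled in: offset from the frame pointer */
} Local;

/* Assign each local a negative offset from the frame pointer and
   return the total number of bytes the frame needs. */
static int layout_frame(Local *locals, int n, int align)
{
    assert(align > 0);
    int used = 0;
    for (int i = 0; i < n; i++) {
        used += locals[i].size;
        used = (used + align - 1) / align * align;   /* round up to alignment */
        locals[i].offset = -used;
    }
    return used;
}
```

With a 1-byte char and two 2-byte variables at an alignment of 2 (mirroring the PDP-11's 2-byte word), the locals land at offsets -2, -4, and -6, and the frame needs 6 bytes. The function prologue would subtract that size from the stack pointer, and the epilogue would restore it, which is the "cleanup" of local storage.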
Expression Evaluation
Implement operator precedence and associativity
Handle complex expressions with proper order of evaluation
Generate efficient code for arithmetic, logical, and comparison operations
Deal with side effects in expressions
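Operator precedence and associativity fall out naturally from a recursive descent parser: one function per precedence level, with a loop for left-associative operators. The sketch below evaluates the expression directly instead of generating code, to keep it short. All names (cur, parse_primary, parse_term, parse_expr, evaluate) are made up for this example, and it covers only integer +, -, *, / and parentheses, with no error recovery and no check for division by zero.

```c
#include <assert.h>
#include <ctype.h>

static const char *cur;           /* current position in the input text */

static int parse_expr(void);

static void skip_spaces(void)
{
    while (isspace((unsigned char)*cur))
        cur++;
}

/* primary: a decimal number or a parenthesized expression */
static int parse_primary(void)
{
    skip_spaces();
    if (*cur == '(') {
        cur++;
        int inner = parse_expr();
        skip_spaces();
        assert(*cur == ')');      /* sketch: malformed input just aborts */
        cur++;
        return inner;
    }
    assert(isdigit((unsigned char)*cur));
    int v = 0;
    while (isdigit((unsigned char)*cur))
        v = v * 10 + (*cur++ - '0');
    return v;
}

/* term: primaries joined by * or /, which bind tighter than + and - */
static int parse_term(void)
{
    int v = parse_primary();
    for (;;) {
        skip_spaces();
        if (*cur == '*')      { cur++; v *= parse_primary(); }
        else if (*cur == '/') { cur++; v /= parse_primary(); }
        else return v;
    }
}

/* expr: terms joined by + or -; the loop makes them left-associative */
static int parse_expr(void)
{
    int v = parse_term();
    for (;;) {
        skip_spaces();
        if (*cur == '+')      { cur++; v += parse_term(); }
        else if (*cur == '-') { cur++; v -= parse_term(); }
        else return v;
    }
}

static int evaluate(const char *text)
{
    cur = text;
    return parse_expr();
}
```

Because parse_expr calls parse_term for each operand, evaluate("2 + 3 * 4") multiplies before adding, and the loop in parse_expr makes "10 - 4 - 3" group as (10 - 4) - 3.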
Control Flow Implementation
Generate jump instructions for if/else statements
Implement loops (while, for) with proper branching
Handle break and continue statements
Manage nested control structures
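Control flow compiles down to labels and jumps. As a rough sketch, here is how a generator might emit a while loop. The gen_while function, the label counter parameter, and the mnemonics jump and jump_if_false are placeholders invented for this example rather than real PDP-11 instructions; the condition and body are passed in as already-generated text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append pseudo-assembly for `while (cond) body` to out.
   next_label hands out fresh label numbers so nested loops
   never reuse a label. */
static void gen_while(const char *cond, const char *body,
                      int *next_label, char *out, size_t cap)
{
    int top = (*next_label)++;    /* loop head: re-test the condition */
    int end = (*next_label)++;    /* loop exit */
    size_t used = strlen(out);
    assert(used < cap);
    snprintf(out + used, cap - used,
             "L%d:\n"
             "\t%s\n"
             "\tjump_if_false L%d\n"
             "\t%s\n"
             "\tjump L%d\n"
             "L%d:\n",
             top, cond, end, body, top, end);
}
```

In this scheme a break inside the body would become a jump to the end label and a continue would become a jump to the top label, so the generator has to remember both labels for the innermost enclosing loop.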
The remarkable thing is that Ritchie designed these solutions to be simple and regular, which made both the language and the compiler easier to understand and port to other machines.
Refer: https://en.wikipedia.org/wiki/Portable_C_Compiler
Also, many of the ideas in C came from an earlier language called B, which was developed by Ken Thompson with contributions from Ritchie.




