LLVM IR and Tooling Purpose Guide agents through the LLVM IR pipeline: generating IR, running optimisation passes with opt , lowering to assembly with llc , and inspecting IR for debugging or performance work. Triggers "Show me the LLVM IR for this function" "How do I run an LLVM optimisation pass?" "What does this LLVM IR instruction mean?" "How do I write a custom LLVM pass?" "Why isn't auto-vectorisation happening in LLVM?" Workflow 1. Generate LLVM IR
Emit textual IR (.ll)
clang -O0 -emit-llvm -S src.c -o src.ll
Emit bitcode (.bc)
clang -O2 -emit-llvm -c src.c -o src.bc
Disassemble bitcode to text
llvm-dis src.bc -o src.ll 2. Run optimisation passes with opt
Apply a specific pass
opt -passes = 'mem2reg,instcombine,simplifycfg' src.ll -S -o out.ll
Standard optimisation pipelines
opt
-passes
=
'default
List available passes
opt --print-passes 2
&1 | less
Print IR before and after a pass
opt -passes = 'instcombine' --print-before = instcombine --print-after = instcombine src.ll -S -o out.ll 2
&1 | less 3. Lower IR to assembly with llc
Compile IR to object file
llc -filetype = obj src.ll -o src.o
Compile to assembly
llc -filetype = asm -masm-syntax = intel src.ll -o src.s
Target a specific CPU
llc -mcpu = skylake -mattr = +avx2 src.ll -o src.s
Show available targets
llc --version 4. Inspect IR Key IR constructs to understand: Construct Meaning alloca Stack allocation (pre-SSA; mem2reg promotes to registers) load / store Memory access getelementptr (GEP) Pointer arithmetic / field access phi SSA φ-node: merges values from predecessor blocks call / invoke Function call ( invoke has exception edges) icmp / fcmp Integer/float comparison br Branch (conditional or unconditional) ret Return bitcast Reinterpret bits (no-op in codegen) ptrtoint / inttoptr Pointer↔integer (avoid where possible) 5. Key passes Pass Effect mem2reg Promote alloca to SSA registers instcombine Instruction combining / peephole simplifycfg CFG cleanup, dead block removal loop-vectorize Auto-vectorisation slp-vectorize Superword-level parallelism (straight-line vectorisation) inline Function inlining gvn Global value numbering (common subexpression elimination) licm Loop-invariant code motion loop-unroll Loop unrolling argpromotion Promote pointer args to values sroa Scalar Replacement of Aggregates 6. Debugging missed optimisations
Why was a loop not vectorised?
clang -O2 -Rpass-missed = loop-vectorize -Rpass-analysis = loop-vectorize src.c
Dump pass pipeline
clang -O2 -mllvm -debug-pass = Structure src.c -o /dev/null 2
&1 | less
Print IR after each pass (very verbose)
opt
-passes
=
'default
&1 | less 7. Useful llvm tools Tool Purpose llvm-dis Bitcode → textual IR llvm-as Textual IR → bitcode llvm-link Link multiple bitcode files llvm-lto Standalone LTO llvm-nm Symbols in bitcode/object llvm-objdump Disassemble objects llvm-profdata Merge/show PGO profiles llvm-cov Coverage reporting llvm-mca Machine code analyser (throughput/latency) For binutils equivalents, see skills/binaries/binutils .