CodeQL Static Analysis

When to Use CodeQL

Ideal scenarios:

- Source code access with ability to build (for compiled languages)
- Open-source projects or GitHub Advanced Security license
- Need for interprocedural data flow and taint tracking
- Finding complex vulnerabilities requiring AST/CFG analysis
- Comprehensive security audits where analysis time is not critical

Consider Semgrep instead when:

- No build capability for compiled languages
- Licensing constraints
- Need fast, lightweight pattern matching
- Simple, single-file analysis is sufficient

Why Interprocedural Analysis Matters

Simple grep/pattern tools only see one function at a time. Real vulnerabilities often span multiple functions:

HTTP Handler → Input Parser → Business Logic → Database Query
     ↓              ↓               ↓               ↓
   source       transforms        passes        sink (SQL)

CodeQL tracks data flow across all these steps. A tainted input in the handler can be traced through 5+ function calls to find where it reaches a dangerous sink.

Pattern-based tools miss this because they can't connect request.param in file A to db.execute(query) in file B.
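A minimal sketch of the four-stage path above, written as plain Python. Every function name here is hypothetical and chosen only to mirror the diagram. No single function looks dangerous in isolation; the problem only appears when the value is followed from the first function to the last.

```python
# Hypothetical request path mirroring the diagram above.
# All function names are illustrative, not taken from a real codebase.

def http_handler(params: dict) -> str:
    """Source: untrusted data enters the program here."""
    return input_parser(params["user_id"])


def input_parser(raw: str) -> str:
    """Transform: trimming whitespace does not neutralise the payload."""
    return business_logic(raw.strip())


def business_logic(user_id: str) -> str:
    """Pass-through: the value is forwarded without validation."""
    return build_query(user_id)


def build_query(user_id: str) -> str:
    """Sink: the value is concatenated into SQL text."""
    return "SELECT * FROM users WHERE id = " + user_id


# The attacker-controlled value survives all four hops.
query = http_handler({"user_id": " 1 OR 1=1 "})
print(query)
```

A per-function pattern match sees a dictionary lookup in one place and a string concatenation in another; connecting the two requires following the calls in between.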

When NOT to Use

Do NOT use this skill for:

- Projects that cannot be built (CodeQL requires successful compilation for compiled languages)
- Quick pattern searches (use Semgrep or grep for speed)
- Non-security code quality checks (use linters instead)
- Projects without source code access

Environment Check

Check if CodeQL is installed

command -v codeql >/dev/null 2>&1 && echo "CodeQL: installed" || echo "CodeQL: NOT installed (run install steps below)"
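The same check can be scripted, for example as a preflight step in an audit tool. The sketch below is one way to do it with the Python standard library; the function name `codeql_status` is an assumption made for illustration.

```python
import shutil


def codeql_status(binary: str = "codeql") -> str:
    """Report whether `binary` is on PATH (hypothetical helper)."""
    path = shutil.which(binary)  # returns None when the binary is not found
    if path is None:
        return f"{binary}: NOT installed"
    return f"{binary}: installed at {path}"


print(codeql_status())
```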

Installation

CodeQL CLI

macOS/Linux (Homebrew)

brew install --cask codeql

Update

brew upgrade codeql

Manual: Download bundle from https://github.com/github/codeql-action/releases

Trail of Bits Queries (Optional)

Install public ToB security queries for additional coverage:

Download ToB query packs

codeql pack download trailofbits/cpp-queries trailofbits/go-queries

Verify installation

codeql resolve qlpacks | grep trailofbits

Core Workflow

1. Create Database

codeql database create codeql.db --language=<LANG> [--command='<BUILD>'] --source-root=.
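When database creation is driven from a script, it can help to assemble the command template above as an argument list rather than a shell string, so that a build command containing spaces needs no extra quoting. This is a sketch under that assumption; `database_create_argv` is a hypothetical helper, and only the flags shown in the template are used.

```python
import shlex


def database_create_argv(db_path, language, build_command=None, source_root="."):
    """Build the argv for `codeql database create` (hypothetical helper)."""
    argv = ["codeql", "database", "create", db_path, f"--language={language}"]
    if build_command:
        # Only compiled languages need an explicit build command.
        argv.append(f"--command={build_command}")
    argv.append(f"--source-root={source_root}")
    return argv


# Interpreted language: no build command.
print(shlex.join(database_create_argv("codeql.db", "python")))

# Compiled language: pass the build command through unchanged.
print(shlex.join(database_create_argv("codeql.db", "cpp", build_command="make -j8")))
```

The returned list can be handed to `subprocess.run(argv, check=True)` when the CLI is installed.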

| Language | --language= | Build Required |
| --- | --- | --- |
| Python | python | No |
| JavaScript/TypeScript | javascript | No |
| Go | go | No |
| Ruby | ruby | No |
| Rust | rust | Yes (--command='cargo build') |
| Java/Kotlin | java | Yes (--command='./gradlew build') |
| C/C++ | cpp | Yes (--command='make -j8') |
| C# | csharp | Yes (--command='dotnet build') |
| Swift | swift | Yes (macOS only) |

2. Run Analysis

List available query packs

codeql resolve qlpacks

Run security queries:

SARIF output (recommended)

codeql database analyze codeql.db \
  --format=sarif-latest \
  --output=results.sarif \
  -- codeql/python-queries:codeql-suites/python-security-extended.qls

CSV output

codeql database analyze codeql.db \
  --format=csv \
  --output=results.csv \
  -- codeql/javascript-queries

With Trail of Bits queries (if installed):

codeql database analyze codeql.db \
  --format=sarif-latest \
  --output=results.sarif \
  -- trailofbits/go-queries
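SARIF is JSON, so the results file can be triaged with a few lines of code. The sketch below walks the standard SARIF 2.1.0 layout (`runs` → `results` → `locations`) and prints one line per finding. The helper name `summarize_sarif` and the sample data are assumptions for illustration; real output carries many more fields.

```python
import json


def summarize_sarif(sarif: dict) -> list:
    """Return (rule id, file, line, message) tuples from a SARIF document."""
    findings = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # A result may have no location; fall back to an empty one.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            uri = physical.get("artifactLocation", {}).get("uri", "?")
            line = physical.get("region", {}).get("startLine")
            message = result.get("message", {}).get("text", "")
            findings.append((result.get("ruleId", "?"), uri, line, message))
    return findings


# Made-up sample standing in for the contents of results.sarif.
sample = json.loads("""
{"version": "2.1.0", "runs": [{"results": [{
  "ruleId": "py/sql-injection",
  "message": {"text": "SQL injection from user input."},
  "locations": [{"physicalLocation": {
    "artifactLocation": {"uri": "app/db.py"},
    "region": {"startLine": 42}}}]
}]}]}
""")

for rule, uri, line, message in summarize_sarif(sample):
    print(f"{rule} {uri}:{line} {message}")
```

To read a real file, replace the sample with `json.load(open("results.sarif"))`.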

Writing Custom Queries

Query Structure

CodeQL uses SQL-like syntax: from Type x where P(x) select f(x)

Basic Template

/**
 * @name Find SQL injection vulnerabilities
 * @description Identifies potential SQL injection from user input
 * @kind path-problem
 * @problem.severity error
 * @security-severity 9.0
 * @precision high
 * @id py/sql-injection
 * @tags security
 *       external/cwe/cwe-089
 */

import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking

module SqlInjectionConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) {
    // Define taint sources (user input).
    // Placeholder: matches every node; replace with a real source definition.
    exists(source)
  }

  predicate isSink(DataFlow::Node sink) {
    // Define dangerous sinks (SQL execution).
    // Placeholder: matches every node; replace with a real sink definition.
    exists(sink)
  }
}

// Instantiate global taint tracking with the configuration above.
module SqlInjectionFlow = TaintTracking::Global<SqlInjectionConfig>;

// Needed for @kind path-problem so that results include path edges.
import SqlInjectionFlow::PathGraph

from SqlInjectionFlow::PathNode source, SqlInjectionFlow::PathNode sink
where SqlInjectionFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "SQL injection from $@.", source.getNode(), "user input"

Query Metadata

| Field | Description | Values |
| --- | --- | --- |
| @kind | Query type | problem, path-problem |
| @problem.severity | Issue severity | error, warning, recommendation |
| @security-severity | CVSS score | 0.0 - 10.0 |
| @precision | Confidence | very-high, high, medium, low |

Key Language Features

// Predicates
predicate isUserInput(DataFlow::Node node) {
  exists(Call c |
    c.getFunc().(Attribute).getName() = "get" and
    node.asExpr() = c
  )
}

// Transitive closure: + (one or more), * (zero or more)
node.getASuccessor+()

// Quantification
exists(Variable v | v.getName() = "password")
forall(Call c | c.getTarget().hasName("dangerous") | hasCheck(c))
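For readers new to the closure operators, the difference between `+` and `*` can be illustrated outside QL. The Python sketch below computes both over a small hand-written successor graph; the graph and the helper names are assumptions for illustration and say nothing about how CodeQL evaluates closures internally.

```python
def successors_plus(graph: dict, start: str) -> set:
    """Nodes reachable in one or more steps, like getASuccessor+()."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def successors_star(graph: dict, start: str) -> set:
    """Nodes reachable in zero or more steps, like getASuccessor*()."""
    return {start} | successors_plus(graph, start)


# Hypothetical control-flow successors: a -> b -> c
graph = {"a": ["b"], "b": ["c"], "c": []}

print(sorted(successors_plus(graph, "a")))  # excludes "a" itself
print(sorted(successors_star(graph, "a")))  # includes "a" itself
```

The only difference is whether the starting node counts as reachable from itself without taking a step; in a graph with a cycle back to the start, `+` would include it as well.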

Creating Query Packs

codeql pack init myorg/security-queries

Structure:

myorg-security-queries/
├── qlpack.yml
├── src/
│   └── SqlInjection.ql
└── test/
    └── SqlInjectionTest.expected

qlpack.yml:

name: myorg/security-queries
version: 1.0.0
dependencies:
  codeql/python-all: "*"

CI/CD Integration (GitHub Actions)

name: CodeQL Analysis

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 0 * * 1'  # Weekly

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      actions: read
      contents: read
      security-events: write

    strategy:
      matrix:
        language: ['python', 'javascript']

    steps:
      - uses: actions/checkout@v4

      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
          queries: security-extended,security-and-quality
          # Add custom queries/packs:
          # queries: security-extended,./codeql/custom-queries
          # packs: trailofbits/python-queries

      - uses: github/codeql-action/autobuild@v3

      - uses: github/codeql-action/analyze@v3
        with:
          category: "/language:${{ matrix.language }}"

Testing Queries

codeql test run test/

Test file format:

def vulnerable():
    user_input = request.args.get("q")  # Source
    cursor.execute("SELECT * FROM users WHERE id = " + user_input)  # Alert: sql-injection

def safe():
    user_input = request.args.get("q")
    cursor.execute("SELECT * FROM users WHERE id = ?", (user_input,))  # OK
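The test file above references `request` and `cursor` without defining them, which is fine for CodeQL because the code is analyzed rather than executed. To see why the two functions behave differently at run time, here is a self-contained sketch using an in-memory SQLite database. The table contents and the payload are made up for illustration, and `user_input` is passed as a parameter instead of being read from a request.

```python
import sqlite3

# In-memory database with two made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])


def vulnerable(user_input: str) -> list:
    # Concatenation: the input becomes part of the SQL text itself.
    return conn.execute(
        "SELECT name FROM users WHERE id = " + user_input
    ).fetchall()


def safe(user_input: str) -> list:
    # Parameter binding: the input is treated as a single value.
    return conn.execute(
        "SELECT name FROM users WHERE id = ?", (user_input,)
    ).fetchall()


payload = "1 OR 1=1"
leaked = vulnerable(payload)  # the OR clause makes the filter match every row
bound = safe(payload)         # no id equals the literal string "1 OR 1=1"
print(len(leaked), len(bound))
```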

Troubleshooting

| Issue | Solution |
| --- | --- |
| Database creation fails | Clean build environment, verify build command works independently |
| Slow analysis | Use --threads, narrow query scope, check query complexity |
| Missing results | Check file exclusions, verify source files were parsed |
| Out of memory | Set CODEQL_RAM=48000 environment variable (48 GB) |
| CMake source path issues | Adjust --source-root to point to actual source location |

Rationalizations to Reject

| Shortcut | Why It's Wrong |
| --- | --- |
| "No findings means the code is secure" | CodeQL only finds patterns it has queries for; novel vulnerabilities won't be detected |
| "This code path looks safe" | Complex data flow can hide vulnerabilities across 5+ function calls; trace the full path |
| "Small change, low risk" | Small changes can introduce critical bugs; run full analysis on every change |
| "Tests pass so it's safe" | Tests prove behavior, not absence of vulnerabilities; they test expected paths, not attacker paths |
| "The query didn't flag it" | Default query suites don't cover everything; check if custom queries are needed for your domain |

Resources

- Docs: https://codeql.github.com/docs/
- Query Help: https://codeql.github.com/codeql-query-help/
- Security Lab: https://securitylab.github.com/
- Trail of Bits Queries: https://github.com/trailofbits/codeql-queries
- VSCode Extension: "CodeQL" for query development
