# Deploying Airflow

This skill covers deploying Airflow DAGs and projects to production, whether using Astro (Astronomer's managed platform) or open-source Airflow on Docker Compose or Kubernetes.

**Choosing a path:** Astro is a good fit for managed operations and faster CI/CD. For open-source, use Docker Compose for development and the Helm chart for production.

## Astro (Astronomer)

Astro provides CLI commands and GitHub integration for deploying Airflow projects.

### Deploy Commands

| Command | What It Does |
|---------|--------------|
| `astro deploy` | Full project deploy: builds the Docker image and deploys DAGs |
| `astro deploy --dags` | DAG-only deploy: pushes only DAG files (fast, no image build) |
| `astro deploy --image` | Image-only deploy: pushes only the Docker image (for multi-repo CI/CD) |
| `astro deploy --dbt` | dbt project deploy: deploys a dbt project to run alongside Airflow |

### Full Project Deploy

Builds a Docker image from your Astro project and deploys everything (DAGs, plugins, requirements, packages):

```bash
astro deploy
```

Use this when you've changed `requirements.txt`, `Dockerfile`, `packages.txt`, plugins, or any non-DAG file.

### DAG-Only Deploy

Pushes only files in the `dags/` directory without rebuilding the Docker image:

```bash
astro deploy --dags
```

This is significantly faster than a full deploy since it skips the image build. Use this when you've only changed DAG files and haven't modified dependencies or configuration.

### Image-Only Deploy

Pushes only the Docker image without updating DAGs:

```bash
astro deploy --image
```

This is useful in multi-repo setups where DAGs are deployed separately from the image, or in CI/CD pipelines that manage image and DAG deploys independently.

### dbt Project Deploy

Deploys a dbt project to run with Cosmos on an Astro deployment:

```bash
astro deploy --dbt
```

### GitHub Integration

Astro supports branch-to-deployment mapping for automated deploys:

- Map branches to specific deployments (e.g., `main` -> production, `develop` -> staging)
- Pushes to mapped branches trigger automatic deploys
- Supports DAG-only deploys on merge for faster iteration

Configure this in the Astro UI under **Deployment Settings > CI/CD**.
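
If you drive deploys from your own CI rather than (or alongside) the built-in Git integration, a minimal GitHub Actions sketch might look like the following. The secret names (`ASTRO_API_TOKEN`, `ASTRO_DEPLOYMENT_ID`) and the CLI install step are assumptions; adapt them to your workspace and pin versions as needed.

```yaml
# .github/workflows/deploy-to-astro.yaml (sketch, not an official template)
name: Deploy to Astro

on:
  push:
    branches: [main]  # map other branches to other deployments as needed

jobs:
  deploy:
    runs-on: ubuntu-latest
    env:
      # Workspace or Deployment API token stored as a repo secret (assumed name)
      ASTRO_API_TOKEN: ${{ secrets.ASTRO_API_TOKEN }}
    steps:
      - uses: actions/checkout@v4

      # Install the Astro CLI (see Astronomer docs for current install options)
      - name: Install Astro CLI
        run: curl -sSL https://install.astronomer.io | sudo bash -s

      # Full deploy on pushes to main; swap in `astro deploy --dags` for a DAG-only pipeline
      - name: Deploy
        run: astro deploy ${{ secrets.ASTRO_DEPLOYMENT_ID }}
```

Astronomer also publishes a reusable GitHub Action for deploys; check the Astro documentation for its current name and inputs.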

### CI/CD Patterns

Common CI/CD strategies on Astro:

- **DAG-only on feature branches**: use `astro deploy --dags` for fast iteration during development
- **Full deploy on main**: use `astro deploy` on merge to main for production releases
- **Separate image and DAG pipelines**: use `--image` and `--dags` in separate CI jobs for independent release cycles

### Deploy Queue

When multiple deploys are triggered in quick succession, Astro processes them sequentially in a deploy queue. Each deploy completes before the next one starts.

### Reference

- Astro Deploy Documentation

## Open-Source: Docker Compose

Deploy Airflow using the official Docker Compose setup. This is recommended for learning and exploration; for production, use Kubernetes with the Helm chart (see below).

### Prerequisites

- Docker and Docker Compose v2.14.0+
- The official `apache/airflow` Docker image

### Quick Start

Download the official Airflow 3 Docker Compose file:

```bash
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/stable/docker-compose.yaml'
```

This sets up the full Airflow 3 architecture:

| Service | Purpose |
|---------|---------|
| `airflow-apiserver` | REST API and UI (port 8080) |
| `airflow-scheduler` | Schedules DAG runs |
| `airflow-dag-processor` | Parses and processes DAG files |
| `airflow-worker` | Executes tasks (CeleryExecutor) |
| `airflow-triggerer` | Handles deferrable/async tasks |
| `postgres` | Metadata database |
| `redis` | Celery message broker |

### Minimal Setup

For a simpler setup with LocalExecutor (no Celery/Redis), create a `docker-compose.yaml`:

```yaml
x-airflow-common: &airflow-common
  image: apache/airflow:3  # Use the latest Airflow 3.x release
  environment: &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: LocalExecutor
    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
    AIRFLOW__CORE__DAGS_FOLDER: /opt/airflow/dags
  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
  depends_on:
    postgres:
      condition: service_healthy

services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: airflow
      POSTGRES_PASSWORD: airflow
      POSTGRES_DB: airflow
    volumes:
      - postgres-db-volume:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "airflow"]
      interval: 10s
      retries: 5
      start_period: 5s

  airflow-init:
    <<: *airflow-common
    entrypoint: /bin/bash
    command:
      - -c
      - |
        airflow db migrate
        airflow users create \
          --username admin \
          --firstname Admin \
          --lastname User \
          --role Admin \
          --email admin@example.com \
          --password admin
    depends_on:
      postgres:
        condition: service_healthy

  airflow-apiserver:
    <<: *airflow-common
    command: airflow api-server
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s

  airflow-scheduler:
    <<: *airflow-common
    command: airflow scheduler

  airflow-dag-processor:
    <<: *airflow-common
    command: airflow dag-processor

  airflow-triggerer:
    <<: *airflow-common
    command: airflow triggerer

volumes:
  postgres-db-volume:
```

**Airflow 3 architecture note:** The webserver has been replaced by the API server (`airflow api-server`), and the DAG processor now runs as a standalone process separate from the scheduler.

### Common Operations

```bash
# Start all services
docker compose up -d

# Stop all services
docker compose down

# View logs
docker compose logs -f airflow-scheduler

# Restart after requirements change
docker compose down && docker compose up -d --build

# Run a one-off Airflow CLI command
docker compose exec airflow-apiserver airflow dags list
```

### Installing Python Packages

Add packages to `requirements.txt` and rebuild:

```bash
# Add to requirements.txt, then:
docker compose down
docker compose up -d --build
```

Or use a custom Dockerfile:

```dockerfile
# Pin to a specific version (e.g., 3.1.7) for reproducibility
FROM apache/airflow:3
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
```

Update `docker-compose.yaml` to build from the Dockerfile:

```yaml
x-airflow-common: &airflow-common
  build:
    context: .
    dockerfile: Dockerfile
  # ... rest of config
```
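
For quick local experiments only, the official image can also install extra packages at container startup via the `_PIP_ADDITIONAL_REQUIREMENTS` environment variable (the official compose file exposes it). Packages are reinstalled on every container start, so use a real image build for anything beyond throwaway testing. A sketch with placeholder package names:

```yaml
x-airflow-common: &airflow-common
  image: apache/airflow:3
  environment:
    # Installed at every container start; placeholder packages, not for production
    _PIP_ADDITIONAL_REQUIREMENTS: "pandas requests"
```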

### Environment Variables

Configure Airflow settings via environment variables in `docker-compose.yaml`:

```yaml
environment:
  # Core settings
  AIRFLOW__CORE__EXECUTOR: LocalExecutor
  AIRFLOW__CORE__PARALLELISM: 32
  AIRFLOW__CORE__MAX_ACTIVE_TASKS_PER_DAG: 16

  AIRFLOW__EMAIL__EMAIL_BACKEND: airflow.utils.email.send_email_smtp
  AIRFLOW__SMTP__SMTP_HOST: smtp.example.com

  # Connections (as URI)
  AIRFLOW_CONN_MY_DB: postgresql://user:pass@host:5432/db
```

## Open-Source: Kubernetes (Helm Chart)

Deploy Airflow on Kubernetes using the official Apache Airflow Helm chart.

### Prerequisites

- A Kubernetes cluster
- `kubectl` configured
- `helm` installed

### Installation

```bash
# Add the Airflow Helm repo
helm repo add apache-airflow https://airflow.apache.org
helm repo update

# Install with default values
helm install airflow apache-airflow/airflow \
  --namespace airflow \
  --create-namespace

# Install with custom values
helm install airflow apache-airflow/airflow \
  --namespace airflow \
  --create-namespace \
  -f values.yaml
```

### Key `values.yaml` Configuration

```yaml
# Executor type
executor: KubernetesExecutor  # or CeleryExecutor, LocalExecutor

# Airflow image (pin to your desired version)
defaultAirflowRepository: apache/airflow
defaultAirflowTag: "3"  # Or pin: "3.1.7"

# Git-sync for DAGs (recommended for production)
dags:
  gitSync:
    enabled: true
    repo: https://github.com/your-org/your-dags.git
    branch: main
    subPath: dags
    wait: 60  # seconds between syncs

# API server (replaces webserver in Airflow 3)
apiServer:
  resources:
    requests:
      cpu: "250m"
      memory: "512Mi"
    limits:
      cpu: "500m"
      memory: "1Gi"
  replicas: 1

# Scheduler
scheduler:
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "1000m"
      memory: "2Gi"

# Standalone DAG processor
dagProcessor:
  enabled: true
  resources:
    requests:
      cpu: "250m"
      memory: "512Mi"
    limits:
      cpu: "500m"
      memory: "1Gi"

# Triggerer (for deferrable tasks)
triggerer:
  resources:
    requests:
      cpu: "250m"
      memory: "512Mi"
    limits:
      cpu: "500m"
      memory: "1Gi"

# Worker resources (CeleryExecutor only)
workers:
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "2000m"
      memory: "4Gi"
  replicas: 2

# Log persistence
logs:
  persistence:
    enabled: true
    size: 10Gi

# PostgreSQL (built-in)
postgresql:
  enabled: true

# Or use an external database:
# postgresql:
#   enabled: false
# data:
#   metadataConnection:
#     user: airflow
#     pass: airflow
#     host: your-rds-host.amazonaws.com
#     port: 5432
#     db: airflow
```

### Upgrading

```bash
# Upgrade with new values
helm upgrade airflow apache-airflow/airflow \
  --namespace airflow \
  -f values.yaml

# Upgrade to a new Airflow version
helm upgrade airflow apache-airflow/airflow \
  --namespace airflow \
  --set defaultAirflowTag="3.1.7"
```
" - DAG Deployment Strategies on Kubernetes
- Git-sync
- (recommended): DAGs are synced from a Git repository automatically
- Persistent Volume
-
- Mount a shared PV containing DAGs
- Baked into image
- Include DAGs in a custom Docker image Useful Commands
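
For the baked-into-image and Persistent Volume options, a hedged `values.yaml` sketch follows; the image repository and tag are placeholders for an image you build and push yourself with your `dags/` folder copied into `/opt/airflow/dags`:

```yaml
# values.yaml (sketch): DAGs baked into a custom image
# registry.example.com/airflow-with-dags and its tag are placeholders
defaultAirflowRepository: registry.example.com/airflow-with-dags
defaultAirflowTag: "2024.06.01"

dags:
  gitSync:
    enabled: false
  # Persistent Volume alternative: mount DAGs from a shared volume instead
  # persistence:
  #   enabled: true
  #   size: 5Gi
```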

### Useful Commands

```bash
# Check pod status
kubectl get pods -n airflow

# View scheduler logs
kubectl logs -f deployment/airflow-scheduler -n airflow

# Port-forward the API server
kubectl port-forward svc/airflow-apiserver 8080:8080 -n airflow

# Run a one-off CLI command
kubectl exec -it deployment/airflow-scheduler -n airflow -- airflow dags list
```