
Airflow

Develop, test, and deploy Airflow DAGs using the skale airflow commands.

Initialize a project

skale airflow init

Scaffolds a new Airflow 3 project in the current directory:

.
├── Dockerfile            # FROM ghcr.io/skaledata/airflow:<version> + comments
├── README.md             # project layout, CLI commands, deploy workflow
├── requirements.txt      # pip deps, auto-installed via ONBUILD
├── packages.txt          # apt deps, auto-installed via ONBUILD
├── dags/example_dag.py   # example DAG to get you started
├── plugins/              # project-specific Airflow plugins
├── tests/                # DAG tests (pytest)
├── .gitignore
└── .dockerignore

The Dockerfile is a single FROM line. The SkaleData Airflow base image pre-installs everything you need and automatically picks up requirements.txt and packages.txt from the root of the build context, so no COPY / RUN boilerplate is required.

The layout matches Astronomer’s Astro Runtime, so anyone migrating from astro finds a familiar shape.
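As an illustration of what could live in the scaffolded tests/ directory, here is a minimal sketch of a pytest check that every Python file under dags/ at least parses. The file name and the helper find_syntax_errors are hypothetical and are not part of the generated project. The check uses only the standard library, so it catches syntax errors but not import or DAG-definition errors.

```python
# tests/test_dag_syntax.py (hypothetical file name, not generated by the CLI)
import ast
from pathlib import Path


def find_syntax_errors(dags_dir):
    """Return {file path: error message} for every .py file that fails to parse."""
    errors = {}
    for path in sorted(Path(dags_dir).rglob("*.py")):
        try:
            ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError as exc:
            errors[str(path)] = f"line {exc.lineno}: {exc.msg}"
    return errors


def test_dags_parse():
    # Assumes pytest is run from the project root, where dags/ lives.
    assert find_syntax_errors("dags") == {}
```

A fuller test would typically load the DAGs through Airflow itself, which requires running inside an environment where Airflow is installed.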

Local development

Start Airflow locally

skale airflow start

Builds the Docker image from your Dockerfile, starts all services (api-server, scheduler, dag-processor, triggerer, postgres), and waits for the api-server to be healthy.
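The "wait until healthy" step can be pictured as a simple polling loop. The sketch below is not the CLI's implementation. It assumes the api-server is reachable on localhost:8080 and exposes Airflow 3's health endpoint at /api/v2/monitor/health; both are assumptions to adjust for your setup. The names http_ok and wait_until_healthy are hypothetical.

```python
import time
import urllib.error
import urllib.request

# Assumption: the api-server is published on localhost:8080 and exposes
# Airflow 3's health endpoint; adjust both to match your setup.
HEALTH_URL = "http://localhost:8080/api/v2/monitor/health"


def http_ok(url=HEALTH_URL, timeout=2.0):
    """Return True if the URL answers with HTTP 200, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(check=http_ok, attempts=30, delay=2.0, sleep=time.sleep):
    """Call check() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(1, attempts + 1):
        if check():
            return attempt  # number of tries it took
        if attempt < attempts:
            sleep(delay)
    raise TimeoutError(f"not healthy after {attempts} attempts")
```

A script that needs Airflow to be up before running integration tests could call wait_until_healthy() right after skale airflow start returns, as an extra safety net.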

If your project is bound to a cluster (via --cluster or .skaledata.yaml), the local environment is configured with the same secrets backend and cloud credentials as your deployed instance.

# Start with cluster credentials
skale airflow start --cluster analytics-prod

Stop Airflow

skale airflow stop

Gracefully stops all containers. Preserves volumes and data — use skale airflow start to resume.

Restart Airflow

skale airflow restart

Stops and restarts all containers without rebuilding. Useful after config changes.

Destroy local environment

skale airflow kill

Stops and removes all containers, networks, and volumes. Deletes your local Postgres data. Use skale airflow init + start to start fresh.

Open a shell

# Default: scheduler container
skale airflow bash

# Specific container
skale airflow bash api-server

Valid containers: scheduler, api-server, dag-processor, triggerer, postgres.

Run Airflow CLI commands

skale airflow run dags list
skale airflow run tasks test my_dag my_task 2024-01-01

Executes an Airflow CLI command inside the scheduler container.
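If you want to call these commands from a Python script (for example, a local smoke test), a thin wrapper around subprocess is usually enough. This is a sketch only: airflow_run_argv and airflow_run are hypothetical helpers, and it assumes the skale binary is on PATH and that the local environment has already been started.

```python
import subprocess


def airflow_run_argv(*airflow_args):
    """Build the argv for `skale airflow run <airflow_args...>`."""
    if not airflow_args:
        raise ValueError("pass at least one Airflow CLI argument, e.g. 'dags', 'list'")
    return ["skale", "airflow", "run", *airflow_args]


def airflow_run(*airflow_args):
    """Run the command and return its stdout; raises CalledProcessError on a non-zero exit."""
    result = subprocess.run(
        airflow_run_argv(*airflow_args),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For instance, airflow_run("dags", "list") runs the first command shown above and returns whatever it prints.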

Deploying

Full deploy

skale airflow deploy --cluster analytics-prod

Builds the Docker image, pushes it to the cluster’s container registry, and triggers a rolling deploy. The first time you run this, pass --cluster — the binding is saved to .skaledata.yaml so subsequent deploys just need:

skale airflow deploy

DAG-only deploy

skale airflow deploy --dag-only

Uploads your dags/ folder to cloud storage (GCS / S3 / Azure Blob). The Airflow scheduler picks up changes within 30 seconds via a sync sidecar. No image build, no downtime.

Deploy flags

Flag            Description
--cluster <id>  Target cluster (saved to .skaledata.yaml after first use)
--app <name>    Airflow instance name (for clusters with multiple Airflows)
--tag <tag>     Image tag (defaults to git SHA)
--force-image   Force a full image build even if only DAGs changed
--dag-only      Upload DAGs only, skip image build
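In a custom deploy script, the choice between a full deploy and a DAG-only deploy can be made from the list of changed files, for example the output of git diff --name-only. The sketch below is a hypothetical helper, not a description of how the CLI itself decides. It assumes paths are relative to the project root and use forward slashes.

```python
from pathlib import PurePosixPath


def only_dags_changed(changed_paths, dags_dir="dags"):
    """True if there is at least one change and every change is under dags_dir."""
    paths = [PurePosixPath(p) for p in changed_paths]
    return bool(paths) and all(p.parts[:1] == (dags_dir,) for p in paths)


def deploy_argv(changed_paths):
    """Pick between the full and DAG-only deploy commands."""
    argv = ["skale", "airflow", "deploy"]
    if only_dags_changed(changed_paths):
        argv.append("--dag-only")
    return argv
```

Falling back to a full deploy when the change list is empty or touches anything outside dags/ is the conservative choice: a change to requirements.txt or the Dockerfile needs a new image.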

Refresh credentials

skale airflow refresh

Re-mints short-lived cloud credentials for the secrets backend. Whether running containers need a restart depends on the cloud provider:

  • GCP / Azure: Running containers pick up the new credential file automatically
  • AWS: Containers are restarted to pick up the new environment variables

Requires the project to be bound to a deployed instance (.skaledata.yaml).

CI/CD

Use --dag-only with an API key for automated DAG deployments:

# .github/workflows/deploy-dags.yml
name: Deploy DAGs
on:
  push:
    branches: [main]
    paths: ['dags/**']
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install CLI
        run: curl -fsSL https://get.skaledata.com | bash
      - name: Deploy DAGs
        env:
          SKALE_API_KEY: ${{ secrets.SKALE_API_KEY }}
        run: skale airflow deploy --dag-only