Airflow Image
SkaleData publishes a maintained Airflow base image at
ghcr.io/skaledata/airflow that’s used
by every managed Airflow cluster by default. It’s apache/airflow with the
skaledata-airflow-plugins
package pre-installed and registered as an Airflow plugin.
What’s in it
- apache/airflow (the official image)
- apache-airflow-providers-airbyte
- skaledata-airflow-plugins, SkaleData’s plugin package, currently shipping:
  - A bearer-auth shim for the Airbyte provider that talks to SkaleData-managed Airbyte
    via your sdk_* API key instead of the upstream OAuth2 /applications/token flow
If you’re running the stock image (you haven’t published a custom one), you get this on your next chart upgrade — no action required.
Talking to your SkaleData-managed Airbyte from Airflow
SkaleData ships Airbyte with global.auth.enabled: false and validates your API key at
the ingress. The upstream apache-airflow-providers-airbyte operator can’t authenticate
against that setup: it supports only OAuth2 client credentials or no auth. The shim
ships in this image, so bearer auth works without installing anything extra.
1. Create an Airflow connection
In the Airflow UI, go to Admin → Connections → +:
| Field | Value |
|---|---|
| Conn Id | airbyte_default |
| Conn Type | Airbyte |
| Host | https://<cluster>.skaledata.run/api/public/v1/ |
| Password | sdk_... (your SkaleData API key) |
| Login / Token URL | leave blank |
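If you would rather define the connection outside the UI, Airflow also reads connections from AIRFLOW_CONN_<CONN_ID> environment variables, which accept a JSON document. The sketch below builds such a value from the same fields as the table. The helper name airbyte_conn_env and the sdk_example key are illustrative, the conn_type string "airbyte" is assumed to be the identifier the upstream provider registers, and how environment variables are injected into a managed cluster is not covered here.

```python
import json


def airbyte_conn_env(host: str, api_key: str, conn_id: str = "airbyte_default") -> tuple[str, str]:
    """Return (env var name, JSON value) describing an Airflow connection."""
    name = f"AIRFLOW_CONN_{conn_id.upper()}"
    value = json.dumps(
        {
            "conn_type": "airbyte",  # assumed provider identifier
            "host": host,
            "password": api_key,  # the SkaleData API key goes in Password
            # Login / Token URL are intentionally left out (blank in the UI).
        }
    )
    return name, value


# Placeholder host and key; substitute your own values.
name, value = airbyte_conn_env("https://<cluster>.skaledata.run/api/public/v1/", "sdk_example")
print(f"{name}={value}")
```

Keeping the key in a secret store and exporting the variable at deploy time avoids committing it alongside your DAGs.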
2. Use it in a DAG
SkaleData’s extensions live under the skale.providers.* namespace, which
mirrors Airflow’s own airflow.providers.* layout. Class names match upstream
exactly — the namespace already says who owns them.
from datetime import datetime

from airflow.decorators import dag
from skale.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator


@dag(start_date=datetime(2026, 1, 1), schedule=None, catchup=False)
def run_airbyte_sync():
    AirbyteTriggerSyncOperator(
        task_id="sync_postgres_to_warehouse",
        airbyte_conn_id="airbyte_default",
        connection_id="<your-airbyte-connection-uuid>",
        deferrable=True,
    )


run_airbyte_sync()

This is a drop-in for the upstream
airflow.providers.airbyte.operators.airbyte.AirbyteTriggerSyncOperator —
same arguments, same async/deferrable semantics. Migrating an existing DAG
is a one-line import swap.
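For a repository with many DAG files, the import swap can be scripted. The following is a minimal sketch, not a SkaleData-provided tool: swap_airbyte_imports is a hypothetical helper that rewrites only import statements and leaves strings and comments elsewhere in the file alone.

```python
import re

UPSTREAM = "airflow.providers.airbyte"
SKALE = "skale.providers.airbyte"

# Match the upstream package only where it follows "from" or "import"
# at the start of a line (leading indentation allowed).
_IMPORT_LINE = re.compile(
    rf"^(\s*(?:from|import)\s+){re.escape(UPSTREAM)}\b",
    re.MULTILINE,
)


def swap_airbyte_imports(source: str) -> str:
    """Return DAG source with upstream Airbyte imports pointed at skale.providers."""
    return _IMPORT_LINE.sub(rf"\g<1>{SKALE}", source)


old = "from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator\n"
print(swap_airbyte_imports(old), end="")
```

Review the resulting diff before committing; imports built dynamically or split across lines would not be caught by this pattern.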
The hook and trigger are also exposed at the matching paths in case you need them directly:
from skale.providers.airbyte.hooks.airbyte import AirbyteHook
from skale.providers.airbyte.triggers.airbyte import AirbyteSyncTrigger

Maintaining your own Airflow image on top
If you publish a custom image (typical pattern when you need extra providers, OS packages, or DAG code), swap the base:
- FROM apache/airflow:3.2.2-python3.12
+ FROM ghcr.io/skaledata/airflow:3.2.2
# everything else stays the same
COPY dags/ /opt/airflow/dags/

The plugins are pre-installed and Airflow discovers them at startup via entry points.
Automatic dependency installation
The image automatically picks up two files if they sit next to your Dockerfile:
| File | What happens |
|---|---|
| requirements.txt | pip install -r requirements.txt (no constraints; you pin what you want) |
| packages.txt | every line is apt-get install’d |
Both files are optional — leave one out and that step is skipped. This is the same convention Astronomer’s Astro Runtime uses, so migrating from Astro needs no config changes.
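To make the "optional file, skipped step" rule concrete, here is a small illustration of which commands a given build directory implies. It is not the image's actual build logic: plan_install_steps is a hypothetical helper, and the ordering (apt before pip), the -y flag, and the handling of blank lines are assumptions.

```python
import tempfile
from pathlib import Path


def plan_install_steps(build_dir: str) -> list[str]:
    """List the install commands implied by the optional files in build_dir."""
    root = Path(build_dir)
    steps = []

    packages = root / "packages.txt"
    if packages.is_file():
        # Assumption: one package name per line, blank lines ignored.
        names = [ln.strip() for ln in packages.read_text().splitlines() if ln.strip()]
        if names:
            steps.append("apt-get install -y " + " ".join(names))

    requirements = root / "requirements.txt"
    if requirements.is_file():
        # No --constraint flag: versions come only from the file itself.
        steps.append("pip install -r requirements.txt")

    return steps


# Demo: a directory holding only packages.txt yields only the apt step.
with tempfile.TemporaryDirectory() as demo:
    Path(demo, "packages.txt").write_text("curl\njq\n")
    print(plan_install_steps(demo))
```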
A minimal customer Dockerfile is a single line:
FROM ghcr.io/skaledata/airflow:3.2.2

Layout next to it:
.
├── Dockerfile → FROM ghcr.io/skaledata/airflow:3.2.2
├── requirements.txt → pip-installable lines (e.g. dbt-postgres==1.10.0)
├── packages.txt → apt package names (e.g. curl, jq)
└── dags/ → your DAGs (you still COPY these into your Dockerfile)

Constraints handling: customer requirements.txt is installed without
--constraint. You can pin newer provider releases than the
Apache Airflow constraints file
knows about — pip resolves freely against what you ask for. The base image’s
own Airflow install is still pinned with constraints at build time, so the
platform layer stays internally consistent. If you pin a dep that breaks
something Airflow needs, you’ll see the failure at build time (the pip
resolver errors there); the platform image itself is unaffected.
This matches the behaviour of Astronomer’s astro-runtime, so customers migrating from Astro should see no surprises.
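Because nothing constrains your requirements.txt, any line without an exact pin resolves to whatever pip picks at build time. A pre-build check along these lines can flag such lines. This is a rough sketch with a hypothetical helper name; it treats only "==" as a pin and does not handle URL requirements that contain "#".

```python
def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that carry no exact '==' pin."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options (-r, -e, ...)
            continue
        if "==" not in line:
            loose.append(line)
    return loose


sample = """\
dbt-postgres==1.10.0
apache-airflow-providers-airbyte
requests>=2.31  # a range, not a pin
"""
print(unpinned_requirements(sample))
```

Running it in CI before the image build gives an earlier signal than waiting for a resolver error.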
Versioning
Image tags pin to upstream Airflow versions. The Python version is an internal build-time pin (currently 3.12) and intentionally not part of the customer-facing tag.
| Tag | Airflow | Notes |
|---|---|---|
| 3.2.2 | 3.2.2 | Mutable: always the latest plugin for this Airflow |
| 3.2.2-<sha7> | 3.2.2 | Immutable: pin against this in prod for reproducibility |
A plugin-only update re-publishes the mutable 3.2.2 tag and produces a new
immutable 3.2.2-<sha7> tag. The Airflow version doesn’t move.
See the release log for what changed in each release.
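The tag scheme is simple enough to check mechanically, for example to fail a deploy when production references a mutable tag. The sketch below assumes that <sha7> is seven lowercase hexadecimal characters; the tag 3.2.2-abc1234 is made up for illustration.

```python
import re
from typing import NamedTuple, Optional


class ImageTag(NamedTuple):
    airflow_version: str
    sha7: Optional[str]

    @property
    def immutable(self) -> bool:
        # Only tags carrying a build SHA are immutable.
        return self.sha7 is not None


# <airflow version>, optionally followed by -<sha7> (assumed 7 hex chars).
_TAG = re.compile(r"^(\d+\.\d+\.\d+)(?:-([0-9a-f]{7}))?$")


def parse_tag(tag: str) -> ImageTag:
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"unrecognised tag: {tag!r}")
    return ImageTag(match.group(1), match.group(2))


for tag in ("3.2.2", "3.2.2-abc1234"):
    parsed = parse_tag(tag)
    print(tag, "immutable" if parsed.immutable else "mutable")
```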
Pinning to a specific build
The mutable tag is fine for most users: you get plugin fixes automatically on the next pod restart. For stricter reproducibility, pin to the SHA:
FROM ghcr.io/skaledata/airflow:3.2.2-<sha7>

Look up the current SHA on the GHCR package page.