DockerSwarmOperator


Execute a command as an ephemeral docker swarm service. Example use-case - Using Docker Swarm orchestration to make one-time scripts highly available.

Last Updated: Jan. 3, 2023

Access Instructions

Install the Docker provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.

Parameters

image (required): Docker image from which to create the container. If the image tag is omitted, "latest" will be used.
api_version: Remote API version. Set to auto to automatically detect the server's version.
auto_remove: Automatically remove the container on the daemon side when the container's process exits. Defaults to False.
command: Command to be run in the container. (templated)
docker_url: URL of the host running the docker daemon. Defaults to unix://var/run/docker.sock.
environment: Environment variables to set in the container. (templated)
force_pull: Pull the docker image on every run. Defaults to False.
mem_limit: Maximum amount of memory the container can use. Either a float value, which represents the limit in bytes, or a string like 128m or 1g.
tls_ca_cert: Path to a PEM-encoded certificate authority to secure the docker connection.
tls_client_cert: Path to the PEM-encoded certificate used to authenticate the docker client.
tls_client_key: Path to the PEM-encoded key used to authenticate the docker client.
tls_hostname: Hostname to match against the docker server certificate, or False to disable the check.
tls_ssl_version: Version of SSL to use when communicating with the docker daemon.
tmp_dir: Mount point inside the container for a temporary directory created on the host by the operator. The path is also made available via the environment variable AIRFLOW_TMP_DIR inside the container.
user: Default user inside the docker container.
docker_conn_id: The Docker connection ID.
tty: Allocate a pseudo-TTY to the container of this service. This must be set to see the logs of the Docker container/service.
enable_logging: Show the application's logs in the operator's logs. Supported only if the Docker engine is using the json-file or journald logging drivers. The tty parameter should be set to use this with Python applications.
configs: List of docker configs to be exposed to the containers of the swarm service. The configs are ConfigReference objects as per the docker create_service API (https://docker-py.readthedocs.io/en/stable/services.html#docker.models.services.ServiceCollection.create).
secrets: List of docker secrets to be exposed to the containers of the swarm service. The secrets are SecretReference objects as per the docker create_service API (https://docker-py.readthedocs.io/en/stable/services.html#docker.models.services.ServiceCollection.create).
mode: Indicates whether the service should be deployed as a replicated or global service, and associated parameters.
networks: List of network names, IDs, or NetworkAttachmentConfig objects to attach the service to.
placement: Placement instructions for the scheduler. If a list is passed, it is assumed to be a list of constraints as part of a Placement object.

Documentation

A temporary directory is created on the host and mounted into a container to allow storing files that together exceed the default disk size of 10GB in a container. The path to the mounted directory can be accessed via the environment variable AIRFLOW_TMP_DIR.
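For example, a command can write its large intermediate output under AIRFLOW_TMP_DIR so the data lands in the host-mounted temporary directory rather than the container's writable layer. The filename and sizes below are illustrative assumptions.

```python
# The shell expands $AIRFLOW_TMP_DIR inside the container, where the operator
# sets it to the mounted host directory.
command = (
    "sh -c 'dd if=/dev/zero of=$AIRFLOW_TMP_DIR/scratch.bin bs=1M count=100'"
)
# Pass this string as the command parameter of DockerSwarmOperator.
```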

If logging in to a private registry is required before pulling the image, a Docker connection needs to be configured in Airflow and the connection ID provided via the docker_conn_id parameter.
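One way to define such a connection is through an environment variable that Airflow reads at runtime. A hedged sketch: the connection ID "docker_registry", the credentials, and the registry host are all illustrative assumptions.

```python
import os

# Airflow resolves connection ids from AIRFLOW_CONN_<ID> environment variables;
# the connection could equally be created in the Airflow UI or via the CLI.
os.environ["AIRFLOW_CONN_DOCKER_REGISTRY"] = (
    "docker://myuser:mypassword@registry.example.com"
)

# Then reference it from the operator:
# DockerSwarmOperator(..., docker_conn_id="docker_registry", force_pull=True)
```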