Container Image Lifecycle
flyte-sdk provides a fluent, immutable API for defining and building container images. Instead of writing and maintaining separate Dockerfiles, you define your environment directly in Python code. This keeps your task dependencies versioned alongside your code and lets flyte-sdk skip redundant builds: each image definition is hashed, and an image that already exists in the registry is reused rather than rebuilt.
The Fluent Image API
Defining an image in flyte-sdk follows a two-step pattern: you start with a base constructor and then chain customization methods. Each with_* method returns a new, immutable Image instance, allowing you to reuse base definitions across different tasks.
import flyte
# Define a base image for data science tasks
base_ds_image = (
    flyte.Image.from_debian_base(python_version=(3, 11))
    .with_apt_packages("curl", "git")
    .with_pip_packages("pandas", "numpy")
)
# Extend it for a specific ML task
ml_image = base_ds_image.with_pip_packages("scikit-learn", "xgboost")
Internally, the Image class (found in src/flyte/_image.py) prevents direct instantiation via a _guard token in its __post_init__ method. You must use one of the from_* class methods to begin. Every customization call invokes clone(), which creates a new Image object and appends a new Layer to the internal _layers tuple.
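The guard-and-clone pattern described above can be illustrated with a minimal, self-contained sketch. This is not the code in src/flyte/_image.py: the names MiniImage, Layer, from_base, and _GUARD are hypothetical, and the real Image class has many more fields. The sketch only shows how a frozen dataclass can reject direct construction and return a new object for every customization.

```python
from dataclasses import dataclass, field, replace
from typing import Optional, Tuple

# Private sentinel (hypothetical): only the class's own constructors pass it.
_GUARD = object()


@dataclass(frozen=True)
class Layer:
    kind: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class MiniImage:
    base: str
    _layers: Tuple[Layer, ...] = ()
    _guard: Optional[object] = field(default=None, repr=False, compare=False)

    def __post_init__(self) -> None:
        # Direct calls such as MiniImage(base="...") do not carry the token.
        if self._guard is not _GUARD:
            raise TypeError("Use MiniImage.from_base(...) to create an image")

    @classmethod
    def from_base(cls, base: str) -> "MiniImage":
        return cls(base=base, _guard=_GUARD)

    def clone(self, layer: Layer) -> "MiniImage":
        # replace() builds a new frozen instance; the original is untouched.
        return replace(self, _layers=self._layers + (layer,))

    def with_pip_packages(self, *packages: str) -> "MiniImage":
        return self.clone(Layer("pip", packages))


base = MiniImage.from_base("python:3.12-slim-bookworm")
ml = base.with_pip_packages("scikit-learn", "xgboost")
print(len(base._layers), len(ml._layers))
```

Because every with_* call returns a fresh object, base can be shared between tasks without one task's additions leaking into another's image.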
Base Constructors
flyte-sdk offers several starting points depending on your project's requirements.
Debian-based Images
The most common starting point is from_debian_base(). This uses a pre-built, optimized Debian image provided by the Flyte team.
image = flyte.Image.from_debian_base(
    python_version=(3, 12),
    install_flyte=True,
)
This method calls _get_default_image_for, which selects a base image like python:3.12-slim-bookworm and sets up a default flytekit user and working directory.
Custom Dockerfiles
If you have complex system requirements that the fluent API doesn't cover, you can point to an existing Dockerfile using from_dockerfile().
from pathlib import Path
image = flyte.Image.from_dockerfile(
    file=Path(__file__).parent / "Dockerfile",
    registry="ghcr.io/my-org",
    name="custom-app",
    platform=("linux/amd64", "linux/arm64"),
)
Note that images created via from_dockerfile set extendable=False. Because flyte-sdk does not parse the contents of the Dockerfile, it cannot safely layer additional with_* customizations on top of it.
UV Scripts
For small scripts or notebooks, you can use from_uv_script(). This parses the inline dependency metadata (PEP 723) at the top of a Python file to build the environment.
image = flyte.Image.from_uv_script(
    script="my_script.py",
    name="script-image",
    registry="ghcr.io/my-org",
)
Modern Python Tooling
flyte-sdk has first-class support for uv and Poetry, allowing you to mirror your local development environment exactly in the container.
UV Projects
The with_uv_project() method uses your pyproject.toml and uv.lock files to install dependencies.
image = flyte.Image.from_debian_base().with_uv_project(
    pyproject_file="pyproject.toml",
    project_install_mode="dependencies_only",
)
By default, this copies only the configuration files to keep the image layer small. If you set project_install_mode="install_project", it will copy the entire directory and install your local package into the image.
Poetry Projects
Similarly, with_poetry_project() handles Poetry-managed environments.
image = flyte.Image.from_debian_base().with_poetry_project(
    pyproject_file="pyproject.toml",
    poetry_lock="poetry.lock",
)
Internally, these methods use the UVProject and PoetryProject layer classes to generate the appropriate build commands.
Managing Source Code
How you include your source code in the image significantly impacts build times and deployment speed.
Static Source Inclusion
Use with_source_folder() or with_source_file() to copy files into the image at build time.
from pathlib import Path

import flyte

image = (
    flyte.Image.from_debian_base()
    .with_source_folder(src=Path("./src"), dst="/app/src")
    .with_source_file(src=Path("config.yaml"), dst="/app/config.yaml")
)
These methods use CopyConfig layers. Because the contents of these files are hashed into the image's unique identifier (via _get_hash_digest), any change to the source code will trigger a full image rebuild.
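The effect of hashing file contents can be seen in a small standalone sketch. It does not reproduce _get_hash_digest; digest_folder is a hypothetical helper that assumes the digest covers relative paths and file bytes in sorted order. The point is only that editing a single file yields a different digest, and therefore a different image identifier.

```python
import hashlib
import tempfile
from pathlib import Path


def digest_folder(root: Path) -> str:
    """Hash relative paths and file bytes in a stable (sorted) order."""
    hasher = hashlib.md5()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        hasher.update(path.relative_to(root).as_posix().encode())
        hasher.update(path.read_bytes())
    return hasher.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp)
    (src / "main.py").write_text("print('v1')\n")
    before = digest_folder(src)

    # A one-character source change is enough to alter the digest.
    (src / "main.py").write_text("print('v2')\n")
    after = digest_folder(src)

print(before != after)
```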
Code Bundles for Fast Iteration
To avoid rebuilding the image for every code change, use with_code_bundle().
image = flyte.Image.from_debian_base().with_code_bundle(copy_style="all")
This marks the image as containing the application code. When you register your Flyte workflow, flyte-sdk can use "Fast Registration," where the code is zipped and uploaded separately, allowing the same container image to be reused across many code iterations.
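The idea behind separating code from the image can be sketched as follows. This is a simplified model under stated assumptions, not flyte-sdk's implementation: image_tag, bundle_code, and bundle_names are hypothetical helpers, and the gzip-compressed tar archive is chosen only for illustration. The image tag depends solely on the environment specification, so two code versions share one image while each gets its own bundle.

```python
import hashlib
import io
import json
import tarfile
from typing import Dict, List


def image_tag(spec: Dict[str, object]) -> str:
    """Tag derived from the environment spec only; source code is excluded."""
    return hashlib.md5(json.dumps(spec, sort_keys=True).encode()).hexdigest()


def bundle_code(files: Dict[str, str]) -> bytes:
    """Pack source files into an in-memory archive that is uploaded separately."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, text in sorted(files.items()):
            data = text.encode()
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def bundle_names(bundle: bytes) -> List[str]:
    """List the file names stored in a bundle."""
    with tarfile.open(fileobj=io.BytesIO(bundle), mode="r:gz") as tar:
        return tar.getnames()


spec = {"base": "python:3.12-slim-bookworm", "pip": ["pandas", "numpy"]}

tag_v1 = image_tag(spec)
bundle_v1 = bundle_code({"main.py": "print('v1')\n"})

# The code changes, the environment does not: same image, new bundle.
tag_v2 = image_tag(spec)
bundle_v2 = bundle_code({"main.py": "print('v2')\n"})

print(tag_v1 == tag_v2, bundle_names(bundle_v2))
```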
Build Orchestration
The ImageBuildEngine (in src/flyte/_internal/imagebuild/image_builder.py) manages the lifecycle of the build.
Existence Checks and Hashing
Before starting a build, the engine calls image_exists(). It calculates an MD5 hash of the image specification using _get_hash_digest(), which includes:
- The base image URI.
- The contents of the Dockerfile (if applicable).
- The configuration and file contents of every added layer (e.g., PipPackages, CopyConfig).
If an image with that hash already exists in the registry, the engine skips the build entirely.
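The check-then-build flow can be sketched with an in-memory stand-in for the registry. Everything here is hypothetical (ImageSpec, ensure_built, and modelling the registry as a set of tags); the real engine queries an actual container registry through image_exists(). The sketch shows only the decision logic: identical specifications produce the same digest and are built once.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class ImageSpec:
    base_uri: str
    layers: Tuple[str, ...] = ()
    dockerfile_text: Optional[str] = None

    def digest(self) -> str:
        """Combine the base URI, Dockerfile text (if any), and every layer."""
        hasher = hashlib.md5()
        hasher.update(self.base_uri.encode())
        if self.dockerfile_text is not None:
            hasher.update(self.dockerfile_text.encode())
        for layer in self.layers:
            hasher.update(layer.encode())
        return hasher.hexdigest()


def ensure_built(spec: ImageSpec, registry: Set[str]) -> str:
    """Build only when no image with this digest is in the registry."""
    tag = spec.digest()
    if tag in registry:
        return "skipped"
    registry.add(tag)  # stand-in for the real build and push
    return "built"


registry: Set[str] = set()
spec = ImageSpec("python:3.12-slim-bookworm", ("pip:pandas", "pip:numpy"))

first = ensure_built(spec, registry)
second = ensure_built(spec, registry)
changed = ensure_built(
    ImageSpec(spec.base_uri, spec.layers + ("pip:xgboost",)), registry
)
print(first, second, changed)
```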
Local vs. Remote Builders
You can configure which builder to use via the image_builder setting in your Flyte configuration.
- Local Builder: Uses the local Docker or Podman daemon to build the image.
- Remote Builder: Kicks off a Flyte job to build the image in the cloud, which is useful if you don't have Docker installed locally or need to build multi-arch images.
The build() method returns an ImageBuild object:
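import asyncio
from flyte.extend import ImageBuildEngine

async def main() -> None:
    # Manually trigger a build; my_image is an Image defined as shown earlier
    build_result = await ImageBuildEngine.build(my_image)
    print(f"Image is ready at: {build_result.uri}")

asyncio.run(main())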
If using the remote builder, build_result.remote_run will contain a reference to the Flyte Run that performed the build.