Flyte SDK: Authoring Production-Ready Python Workflows

The Flyte SDK is a Python library for defining robust, scalable, and reproducible data and machine learning workflows. Write your code as decorated Python functions, and Flyte handles the rest: containerization, execution, and data passing between steps.

Why Does This Exist?

Turning a Python script into a production-ready workflow is hard. You start with a script on your laptop, but then you need to:

Package dependencies: Manually create and manage requirements.txt files and Docker containers.
Manage resources: Figure out how to get a GPU for your training step or more memory for data processing.
Handle failures: Add complex retry logic and figure out how to resume a pipeline from a failed step without starting over.
Orchestrate execution: Chain scripts together, passing data between them, often with brittle, custom-built solutions.

The Flyte SDK solves this by letting you define your infrastructure needs in your Python code. It automates the tedious parts of productionizing your code so you can focus on writing your logic.

The Blueprint Analogy: Core Concepts

Think of the Flyte SDK as providing blueprints for building self-contained, production-ready computational steps. You write a standard Python function, define an Environment blueprint, and decorate the function with that environment's task decorator (@env.task).

@task: A decorator, applied through an Environment as @env.task, that turns any Python function into a fundamental unit of work in Flyte. It's the smallest, indivisible step in a workflow.
Environment: A blueprint that defines a task's execution context. It specifies the container image, Python packages, resource requirements (like CPU, memory, or GPUs), and secrets.
Workflows: A workflow is just a Python function that calls one or more tasks. The SDK automatically understands the dependencies between tasks, forming a directed acyclic graph (DAG).
Plugins: The core SDK is kept small and fast. Specialized integrations with tools like Spark, Dask, BigQuery, and WandB are provided through optional plugins.
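
To make the DAG idea concrete, here is a minimal sketch in plain Python with no Flyte imports. The step names are hypothetical, and the standard-library graphlib module stands in for whatever dependency tracking the SDK performs internally. It only illustrates how a set of dependencies determines a valid execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a step, each value is the set of
# steps whose outputs it consumes.
dag = {
    "load": set(),
    "clean": {"load"},
    "train": {"clean"},
    "report": {"clean", "train"},
}

# A topological sort yields an order in which every step runs only
# after all of its dependencies have finished.
order = list(TopologicalSorter(dag).static_order())
print(order)
```

Because every step here depends on the one before it, only one ordering is valid. In a wider graph, independent steps could run in parallel.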

How It Works

When you run your code with Flyte, the SDK takes care of the operational details:

Define: You write a Python function and decorate it with @env.task, where env is an Environment object.
Specify: In the Environment, you declare dependencies (with_pip_packages(...)), resource needs (resources=Resources(...)), and other settings.
Compose: You create a workflow by calling tasks from other Python functions. The SDK tracks the data flow between them.
Package: Flyte automatically builds a container image for your Environment using the dependencies you specified. No Dockerfile writing required.
Execute: Flyte runs each task in its own container with the specified resources, handling input serialization and output passing automatically.
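
The Define, Specify, and Compose steps can be sketched with a toy stand-in. ToyEnvironment below is a hypothetical class written for this illustration; it is not the Flyte SDK's Environment and does no packaging or remote execution. It only shows the decorator pattern behind @env.task: the environment object registers the function and hands it back, so it remains callable as ordinary Python.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyEnvironment:
    """Stand-in for an environment blueprint; NOT the Flyte SDK class."""

    name: str
    pip_packages: tuple[str, ...] = ()
    tasks: dict[str, Callable] = field(default_factory=dict)

    def task(self, fn: Callable) -> Callable:
        # Register the function under this environment and return it
        # unchanged, so it can still be called like a regular function.
        self.tasks[fn.__name__] = fn
        return fn


# Specify: settings live on the environment, not in the function body.
env = ToyEnvironment(name="toy", pip_packages=("pandas",))


# Define: decorate plain functions with the environment's task decorator.
@env.task
def double(x: int) -> int:
    return 2 * x


@env.task
def add_one(x: int) -> int:
    return x + 1


# Compose: the output of one task becomes the input of the next.
def pipeline(x: int) -> int:
    return add_one(double(x))


result = pipeline(5)
print(result, sorted(env.tasks))
```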

What Can I Build With It?

Here are a few simple examples to give you a feel for the SDK.

A Simple "Hello, World"

This is the most basic task. It runs in a default Python environment.

# hello.py
from flyte import Environment

# Define a simple environment
basic_env = Environment(name="basic")

@basic_env.task
def say_hello(name: str = "world") -> str:
    """A simple task that returns a greeting."""
    return f"Hello, {name}!"

if __name__ == "__main__":
    # Run the task like a regular Python function
    print(say_hello(name="developer"))

A Task with Dependencies

Need a package like pandas? Just add it to your environment's blueprint.

# dataframe_task.py
from flyte import Environment
import pandas as pd

# Define an environment with pandas installed
pandas_env = Environment(
    name="pandas_env",
).with_pip_packages("pandas")

@pandas_env.task
def create_dataframe() -> pd.DataFrame:
    """A task that creates and returns a pandas DataFrame."""
    data = {'col1': [1, 2], 'col2': [3, 4]}
    return pd.DataFrame(data=data)

if __name__ == "__main__":
    df = create_dataframe()
    print(df)

A GPU-Accelerated Task

Requesting a GPU is as simple as adding it to your environment's resources.

# gpu_task.py
from flyte import GPU, Environment, Resources

# Define an environment that requests a GPU
gpu_env = Environment(
    name="gpu_env",
    resources=Resources(gpu=GPU(device="nvidia-tesla-t4", quantity=1)),
).with_pip_packages("torch")

@gpu_env.task
def check_gpu() -> str:
    """A task that checks for GPU availability using torch."""
    import torch
    if torch.cuda.is_available():
        return f"GPU is available: {torch.cuda.get_device_name(0)}"
    return "GPU not available."

if __name__ == "__main__":
    # Note: Running locally requires torch to be installed. Without a local
    # GPU, the task still runs and returns "GPU not available."
    # When deployed, Flyte will schedule it on a GPU-enabled node.
    print(check_gpu())

When to Use It (and When Not To)

Use the Flyte SDK when you need to:

Run Python code that is resource-intensive (e.g., needs a GPU or lots of memory).
Orchestrate multi-step data or ML pipelines that must be reliable and reproducible.
Automate running tasks on a schedule (e.g., daily or hourly).
Version and share computational tasks with a team.
Cache the results of expensive computations to avoid re-running them.
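
As an illustration of the caching idea in the last item, here is a minimal sketch in plain Python. It is not how Flyte computes cache keys or stores results. The function names, the "v1" version string, and the in-memory dictionary are all assumptions made for the example. The point is that a result can be reused when the task name, version, and inputs are identical.

```python
import hashlib
import json

_cache: dict[str, int] = {}
calls = 0  # counts how often the expensive body actually runs


def cache_key(task_name: str, version: str, inputs: dict) -> str:
    """Derive a stable key from the task identity and its inputs."""
    payload = json.dumps(
        {"task": task_name, "version": version, "inputs": inputs},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def expensive_square(x: int) -> int:
    """Pretend-expensive computation that is skipped on a cache hit."""
    global calls
    key = cache_key("expensive_square", "v1", {"x": x})
    if key not in _cache:
        calls += 1
        _cache[key] = x * x
    return _cache[key]


first = expensive_square(12)
second = expensive_square(12)  # served from the cache
print(first, second, calls)
```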

The Flyte SDK might be overkill if:

You have a simple script that runs quickly on a single machine.
You are building a traditional, long-running web service (like a FastAPI or Flask app).
You just need a lightweight alternative to cron for simple, non-containerized jobs.

Integrations

Languages: Requires Python >= 3.10.
Containerization: Requires a running Docker daemon for building images and for running tasks in isolated containers locally. Calling a task as a regular Python function does not need Docker.
Storage: Uses fsspec for abstracting I/O, enabling support for local storage, S3, GCS, and more out of the box.
Ecosystem: Designed to be extensible via plugins. A rich ecosystem of plugins exists for tools like Spark, Dask, BigQuery, Snowflake, WandB, and more.

Getting Started

Install dependencies:

# Requires uv, a fast Python package installer
pip install uv
# Run from the project root: uv sync installs the dependencies declared in pyproject.toml
uv sync

Write your first task: Create a file my_first_task.py:

from flyte import Environment

env = Environment(name="my_env")

@env.task
def greet(name: str) -> str:
    return f"Welcome to Flyte, {name}!"

if __name__ == "__main__":
    # Run it locally
    print(greet(name="friend"))

Run it:
```
uv run python my_first_task.py
```

Limitations & Assumptions

Docker Dependency: To build images and run tasks in isolated environments locally, you must have a Docker daemon installed and running.
Backend Required for Scale: The SDK is the authoring layer. To run workflows at scale, with parallelism, scheduling, and distributed execution, you need to connect to a Flyte backend cluster.
Serialization: Data is passed between tasks by serializing Python objects (using cloudpickle and other tools). Most standard types are supported, but complex, non-serializable objects may require custom type transformers.
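
To show what "non-serializable" means in practice, here is a small sketch using the standard-library pickle module as a stand-in for cloudpickle. The record contents are made up for the example. Plain data round-trips cleanly, while an object that wraps operating-system state, such as a thread lock, cannot be pickled.

```python
import pickle
import threading

# A dict of standard types survives a serialize/deserialize round trip.
record = {"rows": 3, "columns": ["a", "b"], "mean": 1.5}
restored = pickle.loads(pickle.dumps(record))

# A lock wraps OS-level state, so pickle refuses to serialize it.
try:
    pickle.dumps(threading.Lock())
    lock_is_picklable = True
except TypeError:
    lock_is_picklable = False

print(restored == record, lock_is_picklable)
```

Objects like this are the ones that would need a custom type transformer, or should be created inside the task instead of being passed between tasks.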

Common Questions

Do I need to know Docker to use this? No. You don't need to write Dockerfiles. The SDK builds container images for you based on your Environment definition. You just need to have Docker installed.

How do tasks pass data to each other? The SDK automatically handles serializing outputs from one task and deserializing them as inputs for the next. For large data like dataframes or files, it uses remote storage (like S3) and passes lightweight references between tasks.
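
The reference-passing idea can be sketched in plain Python. This is an illustration only, not the SDK's mechanism: a temporary directory stands in for remote storage such as S3, and the produce and consume functions and the numbers.json file name are hypothetical. The producer writes the payload once and hands over a short string; the consumer loads the data only when it needs it.

```python
import json
import tempfile
from pathlib import Path


def produce(storage_dir: Path) -> str:
    """Write a large payload to storage and return only its location."""
    target = storage_dir / "numbers.json"
    target.write_text(json.dumps(list(range(1000))))
    return str(target)


def consume(reference: str) -> int:
    """Load the payload from the reference and compute on it."""
    numbers = json.loads(Path(reference).read_text())
    return sum(numbers)


# A temporary directory stands in for remote storage such as S3.
with tempfile.TemporaryDirectory() as storage:
    reference = produce(Path(storage))
    total = consume(reference)

print(total)
```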

What's the difference between an Image and an Environment? An Environment is a high-level blueprint for your task's execution context. An Image is a more detailed component within an Environment that defines the container image. For most use cases, you will only need to interact with the Environment object.

Can I run tasks without a Flyte cluster? Yes. You can run any task as a regular Python function for local development and testing. The SDK will execute it directly in your local Python environment.

How are secrets handled? You can request secrets using flyte.Secret in your Environment definition. At runtime, the Flyte backend securely injects these secrets as environment variables or mounted files into your task's container.
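
From inside a task, consuming an injected secret might look like the sketch below. The read_secret helper, the DEMO_API_KEY name, and the /etc/secrets mount path are all assumptions made for illustration. The actual variable names and mount locations depend on how your Flyte backend is configured.

```python
import os
from pathlib import Path


def read_secret(name: str, mount_dir: str = "/etc/secrets") -> str:
    """Look up a secret as an environment variable, then as a mounted file.

    Both the lookup order and the default mount path are hypothetical.
    """
    value = os.environ.get(name)
    if value is not None:
        return value
    path = Path(mount_dir) / name
    if path.is_file():
        return path.read_text().strip()
    raise KeyError(f"secret {name!r} not found")


# Simulate the backend injecting a secret as an environment variable.
os.environ["DEMO_API_KEY"] = "not-a-real-key"
api_key = read_secret("DEMO_API_KEY")
print(api_key)
```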

What are plugins for? Plugins keep the core SDK small and fast while allowing for a rich ecosystem of integrations. They provide pre-built tasks and type transformers for tools like Spark, Dask, BigQuery, and WandB, so you don't have to build the integrations yourself.
