Skip to main content

Advanced Python Types: Tuples, TypedDicts, and Enums

Flyte-sdk leverages modern Python typing to handle complex data structures like fixed-size sequences, structured dictionaries, and categorical values. By using standard Python types, flyte-sdk ensures type safety, IDE autocompletion, and automatic serialization via Pydantic.

Typed Tuples

When you need to pass or return a fixed-size sequence of values with specific types, use Python's tuple[T1, T2, ...] syntax. flyte-sdk requires explicit typing for tuples; bare tuple (untyped) is not supported.

import flyte

env = flyte.TaskEnvironment(name="tuple_example")

@env.task
async def create_tuple() -> tuple[int, str, float]:
"""Create and return a typed tuple."""
return (42, "hello", 3.14)

@env.task
async def process_tuple(data: tuple[int, str, float]) -> str:
"""Process a typed tuple input."""
num, text, decimal = data
return f"Number: {num}, Text: {text}, Decimal: {decimal}"

As seen in examples/basics/types/tuple_types.py, you can also nest tuples or include dataclasses within them:

@env.task
async def nested_tuple() -> tuple[tuple[int, int], str]:
return ((1, 2), "nested")

NamedTuples for Readable Outputs

For tasks that return multiple values, NamedTuple is often preferred over plain tuples because it provides named access to fields and better IDE support.

from typing import NamedTuple
import flyte

env = flyte.TaskEnvironment(name="namedtuple_example")

class ModelMetrics(NamedTuple):
accuracy: float
precision: float
recall: float
f1_score: float = 0.0 # Supports default values

@env.task
async def calculate_metrics() -> ModelMetrics:
return ModelMetrics(accuracy=0.95, precision=0.92, recall=0.93, f1_score=0.94)

@env.task
async def report_metrics(metrics: ModelMetrics) -> str:
# Access fields by name
return f"Accuracy: {metrics.accuracy:.2%}, F1: {metrics.f1_score:.2%}"

Structured Data with TypedDict

TypedDict allows you to define the structure of dictionaries used in your workflows. This is particularly useful for JSON-like data or configurations.

Optional Fields with NotRequired

In flyte-sdk, fields marked as NotRequired are truly absent from the dictionary if not provided, rather than being set to None. This behavior is critical for logic that uses the in operator to check for field existence.

from typing import List, TypedDict
from typing_extensions import NotRequired
import flyte

class ToolCall(TypedDict):
name: str
args: dict

class AIResponse(TypedDict):
content: str
role: str
tool_calls: NotRequired[List[ToolCall]]

@env.task
async def process_ai_response(response: AIResponse) -> bool:
# This check works correctly because NotRequired fields are absent if not provided
has_tool_calls = "tool_calls" in response
return has_tool_calls

Recursive and Complex TypedDicts

flyte-sdk supports self-referential (recursive) TypedDict structures, which are useful for representing trees or nested hierarchies.

class TreeNode(TypedDict):
value: str
children: NotRequired[List[TreeNode]]

@env.task
async def create_tree() -> TreeNode:
return TreeNode(
value="root",
children=[
TreeNode(value="child1"),
TreeNode(value="child2", children=[TreeNode(value="grandchild")]),
],
)

You can also nest complex Flyte types like File, Dir, or DataFrame inside a TypedDict, as demonstrated in examples/basics/types/typeddict_types_complex.py:

from flyte.io import DataFrame, File

class DatasetPackage(TypedDict):
metadata_file: File
data: DataFrame
version: str

Categorical Inputs with Enums

Use Python's enum.Enum to define a fixed set of valid values for a task input. flyte-sdk handles the serialization of these values, typically using their string representation.

import enum
import flyte

class Status(enum.Enum):
PENDING = "pending"
COMPLETED = "completed"
FAILED = "failed"

@env.task
def update_status(status: Status) -> str:
# Access both .name and .value
return f"Processing status: {status.value} ({status.name})"

Troubleshooting and Gotchas

Bare Tuples

If you use a bare tuple without type parameters (e.g., def task(data: tuple):), flyte-sdk will fail to serialize the data. Always provide explicit types like tuple[int, ...] or use a List[int] if the size is dynamic.

Cache Persistence for Recursive Types

When using self-referential TypedDicts (like the TreeNode example), flyte-sdk implements internal fixes to ensure that these recursive structures are correctly handled during cache persistence. If you encounter issues with deep nesting in cached tasks, ensure you are using the latest version of flyte-sdk.

NotRequired vs Optional

In a TypedDict, Optional[T] means the key must exist but its value can be None. NotRequired[T] means the key may be missing entirely. flyte-sdk respects this distinction, which is important for downstream tasks that validate dictionary keys.