Compute Resources & Hardware Accelerators

In flyte-sdk, you declare compute requirements with the Resources class, which covers CPU, memory, ephemeral disk, shared memory, and hardware accelerators. You can set these requirements on a TaskEnvironment, where they apply to the tasks defined in it, or adjust them for a single call with the .override() method.

Basic Resource Allocation

You can define CPU and memory requirements using integers, floats, or Kubernetes-style strings. To handle varying workloads, you can also provide a tuple to specify a (request, limit) range.

import flyte

# Define resources in a TaskEnvironment
env = flyte.TaskEnvironment(
    name="compute-env",
    resources=flyte.Resources(
        cpu="500m",  # 0.5 cores
        memory="1Gi",  # 1 GiB memory
        disk="10Gi",  # 10 GiB ephemeral storage
    ),
)

@env.task
async def process_data(x: int) -> int:
    return x + 1

# Override resources for a specific call
async def dynamic_task():
    # Request 1 CPU (limit 2) and 2Gi memory (limit 4Gi)
    await process_data.override(
        resources=flyte.Resources(
            cpu=(1, 2),
            memory=("2Gi", "4Gi"),
        )
    )(x=10)
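
To make the accepted value forms concrete, the following is a minimal sketch of how Kubernetes-style quantities and (request, limit) tuples can be interpreted. It uses only the standard library and does not reproduce flyte-sdk's internal parsing. The names parse_quantity and split_range are hypothetical, and returning None as the limit for a single value is an assumption made for illustration.

```python
from decimal import Decimal

# Binary (power-of-two) and decimal suffixes used by Kubernetes quantities.
_SUFFIXES = {
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30, "Ti": Decimal(2) ** 40,
    "m": Decimal("0.001"), "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6, "G": Decimal(10) ** 9, "T": Decimal(10) ** 12,
}


def parse_quantity(value):
    """Convert an int, float, or Kubernetes-style string to a Decimal."""
    if isinstance(value, (int, float)):
        return Decimal(str(value))
    # Check two-character suffixes before one-character ones ("Mi" before "M").
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if value.endswith(suffix):
            return Decimal(value[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(value)


def split_range(value):
    """Return (request, limit). Hypothetical helper: a single value is
    returned with a limit of None, meaning no explicit limit was given."""
    if isinstance(value, tuple):
        request, limit = value
        if parse_quantity(request) > parse_quantity(limit):
            raise ValueError("request must not exceed limit")
        return request, limit
    return value, None
```

With this sketch, parse_quantity("500m") is half a core and parse_quantity("1Gi") is 2**30 bytes. A check like the one in split_range is a cheap way to catch a swapped (limit, request) pair before a task is submitted.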

Hardware Accelerators

flyte-sdk supports a wide range of hardware accelerators, including NVIDIA GPUs, Google Cloud TPUs, AWS Neuron, AMD GPUs, and Habana Gaudi.

Simple GPU Allocation

For standard GPU requests, you can use a formatted string "<type>:<quantity>" or a simple integer for any available GPU.

# Request 1 NVIDIA T4 GPU
res_t4 = flyte.Resources(gpu="T4:1")

# Request 2 NVIDIA A100 GPUs
res_a100 = flyte.Resources(gpu="A100:2")

# Request 1 of any available GPU
res_any = flyte.Resources(gpu=1)
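
The two shorthand forms above can be read as a (device, quantity) pair. The sketch below shows one way to split them. It is an illustration under stated assumptions, not flyte-sdk's own parser: parse_gpu_spec is a hypothetical name, and requiring the colon in the string form is an assumption based on the "<type>:<quantity>" pattern.

```python
def parse_gpu_spec(spec):
    """Split a GPU shorthand into (device, quantity).

    An int means "any available GPU", so the device is None.
    """
    if isinstance(spec, int):
        return None, spec
    device, _, count = spec.partition(":")
    if not device or not count:
        raise ValueError(f"expected '<type>:<quantity>', got {spec!r}")
    return device, int(count)
```

For example, parse_gpu_spec("A100:2") yields the device name "A100" with a quantity of 2, while parse_gpu_spec(1) yields no device name and a quantity of 1.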

Advanced GPU Configuration (MIG)

For NVIDIA GPUs that support Multi-Instance GPU (MIG), use the GPU helper to specify a partition. This is supported for A100, A100 80G, H100, and H200.

# Request a 1g.5gb partition on an A100
gpu_config = flyte.GPU(device="A100", quantity=1, partition="1g.5gb")
resources = flyte.Resources(gpu=gpu_config)

TPU Slices

To use Google Cloud TPUs, use the TPU helper to specify the device type and the slice topology (partition).

# Request a V5P TPU with a 2x2x1 topology
tpu_config = flyte.TPU(device="V5P", partition="2x2x1")
resources = flyte.Resources(gpu=tpu_config)

Other Accelerators

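flyte-sdk provides dedicated helpers for other specialized hardware. As with GPU and TPU, the resulting device configuration is passed through the gpu argument of Resources: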

# AWS Neuron (Inferentia/Trainium)
neuron_res = flyte.Resources(gpu=flyte.Neuron(device="Trn1"))

# AMD GPUs
amd_res = flyte.Resources(gpu=flyte.AMD_GPU(device="MI300X"))

# Habana Gaudi
gaudi_res = flyte.Resources(gpu=flyte.HABANA_GAUDI(device="Gaudi1"))

Shared Memory

For tasks that require high-performance inter-process communication or large data loading (common in deep learning), you can configure shared memory (/dev/shm).

# Set shared memory to a specific size
res_shm = flyte.Resources(shm="16Gi")

# Automatically set shared memory to the maximum available on the node
res_auto = flyte.Resources(shm="auto")
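
To confirm from inside a running task that the shared memory setting took effect, you can inspect the mounted filesystem directly. The helper below is a hypothetical convenience built on the standard library's shutil.disk_usage. It is not part of flyte-sdk, and it assumes a Linux container where /dev/shm is mounted.

```python
import shutil


def shm_capacity_gib(path="/dev/shm"):
    """Return the total size of the filesystem mounted at `path`, in GiB."""
    return shutil.disk_usage(path).total / 2**30
```

Logging shm_capacity_gib() at the start of a task makes a too-small /dev/shm visible before a data loader fails on it.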

Advanced Customization with PodTemplate

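When standard Resources are insufficient, PodTemplate lets you customize the underlying Kubernetes Pod specification directly. This is useful for adding environment variables, image pull secrets, or custom labels. In the example below, primary_container_name matches the name of the container in the pod spec, which identifies the container that runs the task.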

from kubernetes.client import V1Container, V1EnvVar, V1LocalObjectReference, V1PodSpec
import flyte

pod_template = flyte.PodTemplate(
    primary_container_name="primary",
    labels={"team": "ml-platform"},
    annotations={"description": "high-memory-worker"},
    pod_spec=V1PodSpec(
        containers=[
            V1Container(
                name="primary",
                env=[V1EnvVar(name="DATASET_VERSION", value="v2")],
            )
        ],
        image_pull_secrets=[V1LocalObjectReference(name="my-registry-key")],
    ),
)

env = flyte.TaskEnvironment(
    name="custom-pod-env",
    pod_template=pod_template,
)

Troubleshooting

Resource Ranges and Singular Values

Resources accepts (request, limit) tuples, but some internal flyte-sdk operations require a single value. If you encounter a ValueError stating that a value "can not be a list or tuple", pass a single int, float, or str in that context instead of a range.
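
If the same value may be used in both kinds of context, one option is to collapse a range before passing it on. The sketch below is a hypothetical pair of helpers, not flyte-sdk code: require_single mirrors the kind of check that produces the error (the exact message in flyte-sdk may differ), and request_of keeps only the request side of a range.

```python
def require_single(name, value):
    """Reject (request, limit) ranges where only one value makes sense."""
    if isinstance(value, (list, tuple)):
        raise ValueError(f"{name} can not be a list or tuple, got {value!r}")
    return value


def request_of(value):
    """Collapse a (request, limit) range to its request; pass single values through."""
    return value[0] if isinstance(value, (list, tuple)) else value
```

Calling require_single("memory", request_of(("2Gi", "4Gi"))) returns "2Gi", whereas passing the tuple directly raises the ValueError.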

GPU Validation

  • Quantity: The quantity for any Device (GPU, TPU, etc.) must be at least 1. Passing 0 or negative values will trigger a ValueError.
  • Partition Validation: flyte-sdk validates partitions against specific device types. For example, 1g.5gb is valid for A100 but will be rejected for T4. Similarly, TPU topologies like 2x2x1 are validated against the specific TPU version (e.g., V5P).
  • Device Types: When using the string format (e.g., "T4:1"), the device name must match one of the supported types defined in flyte.Accelerators.
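
The quantity and partition rules above can be approximated with a small pre-flight check. This is a sketch under stated assumptions: check_device and KNOWN_PARTITIONS are hypothetical names, and the table contains only the device and partition pairs mentioned in this section, not the full set that flyte-sdk accepts.

```python
# Illustrative table only: it holds just the pairs named in this section.
KNOWN_PARTITIONS = {
    "A100": {"1g.5gb"},
    "V5P": {"2x2x1"},
}


def check_device(device, quantity=1, partition=None):
    """Validate a device request and return it as a tuple."""
    if quantity < 1:
        raise ValueError(f"quantity must be at least 1, got {quantity}")
    allowed = KNOWN_PARTITIONS.get(device, set())
    if partition is not None and partition not in allowed:
        raise ValueError(f"partition {partition!r} is not valid for {device}")
    return device, quantity, partition
```

Here check_device("A100", 1, "1g.5gb") passes, while the same partition on "T4" or a quantity of 0 raises ValueError, matching the behavior described in the list.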