
AI Agents & Automated Coding

The flyte-sdk provides tools for building LLM-powered agents that can autonomously generate, test, and execute code within isolated sandboxes. This is primarily handled by the AutoCoderAgent, which supports iterative "generate-test-fix" loops to ensure the produced code is functional before it is ever run on production data.

In this tutorial, you will build an agent that analyzes a CSV file with an unknown schema, generates Python code to process it, and executes that code in a secure Flyte sandbox.

Prerequisites

To use the AI coding agents, you need the following installed:

  • flyte-sdk
  • flyteplugins-codegen
  • claude-agent-sdk (required if using the claude backend)

You also need an API key for your chosen LLM provider (e.g., ANTHROPIC_API_KEY or OPENAI_API_KEY) available in your environment.

Step 1: Initialize the AutoCoderAgent

The AutoCoderAgent is the main entry point for automated code generation. You can choose between two backends:

  • litellm (default): Uses an iterative loop to generate code, write tests, and fix errors based on test output.
  • claude: Uses the Claude Agent SDK for an autonomous agent that can read/write files and execute commands in a workspace.

import flyte
from flyteplugins.codegen import AutoCoderAgent

agent = AutoCoderAgent(
    name="sales-data-processor",
    backend="claude",
    model="claude-sonnet-4-5-20250929",
    resources=flyte.Resources(cpu=1, memory="1Gi"),
)

Step 2: Provide Data Context for Accurate Generation

One of the most powerful features of the AutoCoderAgent is its ability to extract context from sample data. When you provide samples, the agent uses extract_data_context to analyze the files or DataFrames, inferring schemas (using Pandera) and statistical summaries. This context is injected into the LLM prompt so the generated code understands the data structure it will process.

from flyte.io import File
import pandas as pd

# You can provide a local file or a DataFrame as a sample
sample_df = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-02"],
    "revenue": [150.00, 200.50],
    "units": [10, 15],
})

# The agent will use this to understand column names and types
samples = {"sales_csv": sample_df}
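
To make the idea of a data context more concrete, here is a minimal, standard-library sketch of the kind of summary that could be injected into a prompt. It is not the plugin's extract_data_context, which the text above says relies on Pandera. The helper names infer_type and summarize_sample and the output format are assumptions made for illustration only.

```python
from statistics import mean


def infer_type(values):
    """Return a coarse type name for a column of Python values."""
    if all(isinstance(v, bool) for v in values):
        return "bool"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "int"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "float"
    return "str"


def summarize_sample(columns):
    """Build a plain-text schema and summary for a dict of column -> values.

    Hypothetical helper: it illustrates the idea, not the plugin's real output.
    """
    lines = []
    for name, values in columns.items():
        kind = infer_type(values)
        line = f"{name}: {kind}, n={len(values)}"
        if kind in ("int", "float"):
            line += f", min={min(values)}, max={max(values)}, mean={mean(values)}"
        lines.append(line)
    return "\n".join(lines)


# Same sample values as the DataFrame above, held in plain lists.
sample = {
    "date": ["2024-01-01", "2024-01-02"],
    "revenue": [150.00, 200.50],
    "units": [10, 15],
}
context = summarize_sample(sample)
print(context)
```

A summary like this gives the model column names, types, and value ranges without sending the full dataset.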

Step 3: Generate and Validate Code

Call agent.generate() to start the coding process. The agent will:

  1. Analyze the samples to build a data context.
  2. Generate a CodePlan.
  3. Write the Python solution and corresponding tests.
  4. Execute tests in a Flyte sandbox.
  5. Iterate (up to max_iterations) if tests fail.

result = await agent.generate.aio(
    prompt="Calculate the total revenue and average units sold per day.",
    samples=samples,
    outputs={
        "total_revenue": float,
        "avg_units": float,
    },
)

if result.success:
    print(f"Code generated successfully in {result.attempts} attempts.")
    print(f"Detected packages: {result.detected_packages}")

The result object is a CodeGenEvalResult containing the final solution.code, the tests used for validation, and the image built to run the code.
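
The iterate-until-tests-pass behaviour described in this step can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the plugin's implementation: fake_llm stands in for a model call, run_tests uses exec in-process rather than a Flyte sandbox, and LoopResult is a hypothetical type that only mimics the success and attempts fields mentioned above.

```python
from dataclasses import dataclass


@dataclass
class LoopResult:
    success: bool
    attempts: int
    code: str


def run_tests(code):
    """Execute candidate code plus one check; return an error message or None."""
    namespace = {}
    try:
        exec(code, namespace)  # a real system would run this in an isolated sandbox
        assert namespace["total"]([1.5, 2.5]) == 4.0
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def generate_test_fix(generate, max_iterations=10):
    """Call generate(feedback) until the tests pass or attempts run out."""
    feedback = None
    code = ""
    for attempt in range(1, max_iterations + 1):
        code = generate(feedback)
        feedback = run_tests(code)
        if feedback is None:
            return LoopResult(True, attempt, code)
    return LoopResult(False, max_iterations, code)


def fake_llm(feedback):
    """Stand-in for a model call: returns buggy code first, then a fix."""
    if feedback is None:
        return "def total(xs):\n    return sum(xs) + 1\n"
    return "def total(xs):\n    return sum(xs)\n"


loop_result = generate_test_fix(fake_llm, max_iterations=3)
print(loop_result.success, loop_result.attempts)
```

The test output from each failed attempt is what drives the next one, so a smaller max_iterations bounds cost at the risk of returning an unsuccessful result.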

Step 4: Execute on Real Data

Once the code is validated, you can run it on real, production-scale data using result.run(). This executes the generated code in a fresh sandbox using the same environment (image and packages) that was validated during the generation phase.

# Run on a real production file
real_data = await File.from_local("production_sales_2024.csv")

total_revenue, avg_units = await result.run.aio(
    sales_csv=real_data,
)

print(f"Results: Revenue=${total_revenue}, Avg Units={avg_units}")

Step 5: Integrate into a Flyte Workflow

For production pipelines, you can convert the generation result into a reusable Flyte task using result.as_task(). This allows you to bake the generated logic into a larger workflow.

@flyte.workflow
async def sales_workflow(data: File) -> tuple[float, float]:
    # ... logic to decide when to trigger the agent ...

    # Create a task from the agent's successful result
    processing_task = result.as_task(name="process_final_report")

    # Execute the task like any other Flyte task; it returns the declared
    # outputs (total_revenue, avg_units), as in Step 4
    return await processing_task(sales_csv=data)

Sandbox Best Practices

When the AutoCoderAgent generates code, it follows a specific filesystem layout required by the Flyte sandbox:

  • Inputs: Input files and variables are mapped to /var/inputs/. For example, if you defined an input named sales_csv, the code should look for it at /var/inputs/sales_csv.
  • Outputs: The code must write its results to /var/outputs/. If you defined an output named total_revenue, the code should write that value to /var/outputs/total_revenue.

The AutoCoderAgent automatically handles this mapping when you use result.run() or result.as_task(), but the LLM is prompted to respect these paths during code generation to ensure compatibility.
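
As a rough illustration of this layout, the sketch below shows what a generated script could look like for the sales example. It rests on several assumptions: that each output is written as plain text to a file named after the output, that the input arrives as a CSV file at the input's name, and that the directories can be parameterized. The function name process is hypothetical. Temporary directories stand in for /var/inputs and /var/outputs so the sketch runs outside a sandbox.

```python
import csv
import tempfile
from pathlib import Path


def process(inputs_dir, outputs_dir):
    """Read inputs_dir/sales_csv and write one file per declared output."""
    inputs_dir, outputs_dir = Path(inputs_dir), Path(outputs_dir)
    with open(inputs_dir / "sales_csv", newline="") as fh:
        rows = list(csv.DictReader(fh))

    total_revenue = sum(float(r["revenue"]) for r in rows)
    avg_units = sum(float(r["units"]) for r in rows) / len(rows)

    # Assumed serialization: each output value as plain text in its own file.
    outputs_dir.mkdir(parents=True, exist_ok=True)
    (outputs_dir / "total_revenue").write_text(str(total_revenue))
    (outputs_dir / "avg_units").write_text(str(avg_units))


# In the sandbox these would be /var/inputs and /var/outputs.
with tempfile.TemporaryDirectory() as root:
    inputs, outputs = Path(root) / "inputs", Path(root) / "outputs"
    inputs.mkdir()
    (inputs / "sales_csv").write_text(
        "date,revenue,units\n2024-01-01,150.00,10\n2024-01-02,200.50,15\n"
    )
    process(inputs, outputs)
    written = {p.name: p.read_text() for p in outputs.iterdir()}

print(written)
```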

Configuration Summary

Parameter         Description
backend           "litellm" (iterative) or "claude" (autonomous agent).
max_iterations    Max attempts to fix code if tests fail (default: 10).
base_packages     Packages to always include in the sandbox (e.g., ["pandas", "numpy"]).
skip_tests        If True, skips the validation phase (ignored by claude backend).
block_network     If True, prevents the sandbox from accessing the internet.

Next Steps

  • Explore Self-Healing Pipelines: Use the agent within a try/except block in a Flyte task to automatically generate a fix when a standard processing script fails due to a schema change.
  • Model Prefetching: Use the image_config parameter in AutoCoderAgent to provide a base image that already contains large model weights or heavy dependencies to speed up sandbox initialization.
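
The self-healing pattern from the first bullet can be sketched without any Flyte dependencies. In this hedged example, generate_fix is a hypothetical stand-in for a call to the agent. It returns a hard-coded replacement instead of generated code, and the renamed column gross_revenue is invented for the demonstration.

```python
def standard_process(rows):
    """Fixed script that expects a 'revenue' column."""
    return sum(float(r["revenue"]) for r in rows)


def generate_fix(rows, error):
    """Hypothetical stand-in for asking an agent to regenerate the logic.

    A real pipeline would pass the error and fresh samples to the agent;
    here the 'fix' simply adapts to a renamed column.
    """
    def patched(data):
        return sum(float(r["gross_revenue"]) for r in data)

    return patched


def process_with_fallback(rows):
    """Run the standard script; on a schema error, fall back to a generated fix."""
    try:
        return standard_process(rows), "standard"
    except KeyError as exc:
        fixed = generate_fix(rows, exc)
        return fixed(rows), "regenerated"


old_schema = [{"revenue": "150.00"}, {"revenue": "200.50"}]
new_schema = [{"gross_revenue": "150.00"}, {"gross_revenue": "200.50"}]

print(process_with_fallback(old_schema))
print(process_with_fallback(new_schema))
```

Catching only the exception type that signals a schema change (KeyError here) keeps unrelated failures visible instead of silently triggering regeneration.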