Automated Python Code Generation

The AutoCoderAgent in flyte-sdk allows you to generate, test, and execute Python code using natural language prompts. It automates the entire lifecycle of code generation: it analyzes your sample data to infer schemas, generates a solution using an LLM, builds an isolated sandbox image with the necessary dependencies, and runs a test-fix loop until the code is correct.

In this tutorial, you will build a pipeline that generates code to analyze sensor data, validates it in a sandbox, and then executes it as a reusable Flyte task.

Prerequisites

To use the AutoCoderAgent, you need:

  • The flyte-sdk installed with the codegen plugin.
  • An LLM API key (e.g., OPENAI_API_KEY or ANTHROPIC_API_KEY) configured as a Flyte secret.
  • A Flyte environment configured to support sandboxes.

Step 1: Initialize the AutoCoderAgent

The AutoCoderAgent requires a model name and resource specifications for the sandbox where the generated code will run.

import flyte
from flyteplugins.codegen import AutoCoderAgent

# Initialize the agent with a model and resource limits for the sandbox
agent = AutoCoderAgent(
    name="sensor-analysis-agent",
    model="gpt-4.1",
    resources=flyte.Resources(cpu=1, memory="1Gi"),
    base_packages=["pandas", "numpy"],
    max_iterations=5,
)

The base_packages list ensures that specific libraries are always available in the generated environment, while max_iterations limits how many times the agent will attempt to fix the code if tests fail.
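To make the role of max_iterations concrete, here is a minimal, library-free sketch of a bounded test-fix loop. It is not the AutoCoderAgent implementation: run_fix_loop, fake_generate, and fake_run_tests are hypothetical names, and the "LLM" is a stub that returns a corrected draft once it receives test feedback.

```python
from typing import Callable, Optional, Tuple


def run_fix_loop(
    generate: Callable[[str, Optional[str]], str],
    run_tests: Callable[[str], Optional[str]],
    prompt: str,
    max_iterations: int = 5,
) -> Tuple[bool, str, int]:
    """Generate code, test it, and retry with feedback up to max_iterations times.

    Returns (success, last_code, attempts).
    """
    feedback: Optional[str] = None
    code = ""
    for attempt in range(1, max_iterations + 1):
        code = generate(prompt, feedback)
        feedback = run_tests(code)  # None means every test passed
        if feedback is None:
            return True, code, attempt
    return False, code, max_iterations


# Hypothetical stand-in for the LLM: the first draft has an off-by-one bug,
# and any draft produced after feedback is correct.
def fake_generate(prompt: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "def mean(xs): return sum(xs) / (len(xs) - 1)"
    return "def mean(xs): return sum(xs) / len(xs)"


# Hypothetical stand-in for the sandboxed test run.
def fake_run_tests(code: str) -> Optional[str]:
    namespace: dict = {}
    exec(code, namespace)
    got = namespace["mean"]([1, 2, 3])
    return None if got == 2 else f"mean([1, 2, 3]) returned {got}, expected 2"


success, code, attempts = run_fix_loop(fake_generate, fake_run_tests, "compute the mean")
print(success, attempts)
```

With the stubs above, the loop succeeds on the second attempt. With max_iterations=1 it would stop after the buggy first draft and report failure, which is the trade-off the parameter controls.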

Step 2: Define the Task Environment

Any Flyte task that uses AutoCoderAgent must depend on the sandbox_environment. This ensures the task has the necessary permissions and infrastructure to build and run isolated images.

from flyte.sandbox import sandbox_environment

env = flyte.TaskEnvironment(
    name="codegen-env",
    secrets=[
        flyte.Secret(key="my_openai_key", as_env_var="OPENAI_API_KEY"),
    ],
    depends_on=[sandbox_environment],
)
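Because the secret above is mapped to the OPENAI_API_KEY environment variable, a task can fail fast when the mapping is missing instead of failing later inside an LLM call. The helper below is a hypothetical sketch using only the standard library; it assumes the secret is visible to the task process under the name given in as_env_var.

```python
import os


def require_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, or raise a descriptive error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; check the secret mapping on the task environment"
        )
    return key


# Typical use at the top of a task body (hypothetical):
#     require_api_key()
```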

Step 3: Generate Code from Samples and Constraints

Use the generate method to create code. By providing samples, the agent automatically extracts data schemas and statistics to help the LLM understand the data structure. You can also provide constraints to enforce business logic.

import pandas as pd
from flyte.io import File

@env.task
async def generate_analysis(prompt: str, data: pd.DataFrame):
    # Generate code based on the prompt and sample data
    result = await agent.generate.aio(
        prompt=prompt,
        samples={"readings": data},
        constraints=[
            "Temperature values must be between -40 and 60 Celsius",
            "Output report must have one row per unique sensor_id",
        ],
        outputs={
            "report": File,
            "total_anomalies": int,
        },
    )

    if not result.success:
        raise RuntimeError(f"Code generation failed: {result.error}")

    return result

The outputs dictionary maps each output name to the type the generated script is expected to return. flyte-sdk supports str, int, float, bool, File, and Dir as sandbox outputs.
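To illustrate what "extracting schemas and statistics" from samples can mean, the sketch below summarizes a list of sample rows using only the standard library. It is an assumption about the general idea, not the agent's actual extraction logic; summarize_samples and the humidity column are hypothetical.

```python
from typing import Any, Dict, List


def summarize_samples(rows: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Collect per-column type names, counts, null counts, and numeric ranges."""
    summary: Dict[str, Dict[str, Any]] = {}
    for row in rows:
        for column, value in row.items():
            info = summary.setdefault(column, {"types": set(), "count": 0, "nulls": 0})
            if value is None:
                info["nulls"] += 1
                continue
            info["count"] += 1
            info["types"].add(type(value).__name__)
            # Track min/max for numeric values only (bool is excluded on purpose).
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                info["min"] = min(info.get("min", value), value)
                info["max"] = max(info.get("max", value), value)
    for info in summary.values():
        info["types"] = sorted(info["types"])  # make the output deterministic
    return summary


readings = [
    {"sensor_id": "a1", "temperature": 21.5, "humidity": 40},
    {"sensor_id": "a2", "temperature": -3.0, "humidity": None},
    {"sensor_id": "a1", "temperature": 22.1, "humidity": 42},
]
schema = summarize_samples(readings)
print(schema["temperature"])
```

A compact summary like this gives a model column names, types, and value ranges without sending the full dataset in the prompt.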

Step 4: Execute the Generated Code

Once you have a CodeGenEvalResult, you can execute the code immediately or convert it into a reusable Flyte task.

Option A: Immediate Execution

Use result.run() for one-off executions. This runs the generated code in the sandbox using the original samples as inputs.

# Inside a task...
report, total_anomalies = await result.run.aio()

Option B: Create a Reusable Task

Use result.as_task() to create a standard Flyte task from the generated code. This is ideal for production pipelines where you want to reuse the validated code on new data.

# Inside a task...
analysis_task = result.as_task(
    name="run_sensor_analysis",
    resources=flyte.Resources(cpu=1, memory="512Mi"),
)

# Execute the new task with different data
report, total_anomalies = await analysis_task.aio(
    readings=new_data_file,
)
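When a validated task is reused on new data, it can still be worth re-checking the Step 3 constraints against the report it produces. The sketch below shows one possible reading of those two constraints applied to report rows that have already been loaded into memory. find_violations and the mean_temperature column are hypothetical and not part of flyte-sdk.

```python
from collections import Counter
from typing import Any, Dict, List


def find_violations(report_rows: List[Dict[str, Any]]) -> List[str]:
    """Check the temperature range and the one-row-per-sensor rule."""
    violations: List[str] = []
    for row in report_rows:
        temp = row["mean_temperature"]
        if not (-40 <= temp <= 60):
            violations.append(f"{row['sensor_id']}: temperature {temp} outside [-40, 60]")
    # Every sensor_id should appear exactly once in the report.
    counts = Counter(row["sensor_id"] for row in report_rows)
    for sensor_id, n in sorted(counts.items()):
        if n > 1:
            violations.append(f"{sensor_id}: {n} rows, expected 1")
    return violations


good = [
    {"sensor_id": "a1", "mean_temperature": 21.8},
    {"sensor_id": "a2", "mean_temperature": -3.0},
]
bad = good + [{"sensor_id": "a1", "mean_temperature": 75.0}]

print(find_violations(good))
print(find_violations(bad))
```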

Advanced Configuration

Using the Claude Agent Backend

By default, AutoCoderAgent uses a structured iteration loop via LiteLLM. You can switch to an autonomous agent mode by setting backend="claude". This requires an ANTHROPIC_API_KEY.

agent = AutoCoderAgent(
    model="claude-3-5-sonnet-20240620",
    backend="claude",
    agent_max_turns=20,
)

Tuning LLM Parameters

You can pass provider-specific parameters (such as temperature, top_p, or max_tokens) through litellm_params.

agent = AutoCoderAgent(
    model="gpt-4",
    litellm_params={
        "temperature": 0.2,
        "max_tokens": 4096,
    },
)

Complete Example Result

When the generation process finishes, the CodeGenEvalResult contains metadata about the run, which you can use for logging or debugging:

print(f"Success: {result.success}")
print(f"Attempts: {result.attempts}")
print(f"Detected Packages: {result.detected_packages}")
print(f"Input Tokens: {result.total_input_tokens}")
print(f"Output Tokens: {result.total_output_tokens}")
print(f"Generated Code:\n{result.solution}")
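For structured logging, the same fields can be folded into a single JSON record. The sketch below uses a hypothetical FakeResult dataclass that mirrors only the attribute names printed above; it is a stand-in, not the real CodeGenEvalResult class, and it assumes that str(result.solution) yields the generated source.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class FakeResult:
    """Hypothetical stand-in exposing the attributes printed above."""
    success: bool
    attempts: int
    detected_packages: List[str]
    total_input_tokens: int
    total_output_tokens: int
    solution: str


def result_to_log_record(result) -> str:
    """Serialize run metadata to one JSON line suitable for log aggregation."""
    record = {
        "success": result.success,
        "attempts": result.attempts,
        "detected_packages": sorted(result.detected_packages),
        "total_tokens": result.total_input_tokens + result.total_output_tokens,
        "solution_lines": len(str(result.solution).splitlines()),
    }
    return json.dumps(record, sort_keys=True)


demo = FakeResult(
    success=True,
    attempts=2,
    detected_packages=["pandas", "numpy"],
    total_input_tokens=1200,
    total_output_tokens=300,
    solution="import pandas as pd\nprint('ok')\n",
)
line = result_to_log_record(demo)
print(line)
```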

The AutoCoderAgent ensures that the result.solution is not just syntactically correct, but has actually passed execution tests within the built sandbox image before it is returned to you.