Automated Python Code Generation
The AutoCoderAgent in flyte-sdk lets you generate, test, and execute Python code from natural language prompts. It automates the entire lifecycle of code generation: it analyzes your sample data to infer schemas, generates a solution using an LLM, builds an isolated sandbox image with the necessary dependencies, and runs a test-fix loop until the tests pass or the iteration limit is reached.
In this tutorial, you will build a pipeline that generates code to analyze sensor data, validates it in a sandbox, and then executes it as a reusable Flyte task.
Prerequisites
To use the AutoCoderAgent, you need:
- The flyte-sdk installed with the codegen plugin.
- An LLM API key (e.g., OPENAI_API_KEY or ANTHROPIC_API_KEY) configured as a Flyte secret.
- A Flyte environment configured to support sandboxes.
Step 1: Initialize the AutoCoderAgent
The AutoCoderAgent requires a model name and resource specifications for the sandbox where the generated code will run.
import flyte
from flyteplugins.codegen import AutoCoderAgent

# Initialize the agent with a model and resource limits for the sandbox
agent = AutoCoderAgent(
    name="sensor-analysis-agent",
    model="gpt-4.1",
    resources=flyte.Resources(cpu=1, memory="1Gi"),
    base_packages=["pandas", "numpy"],
    max_iterations=5,
)
The base_packages list ensures that specific libraries are always available in the generated environment, while max_iterations limits how many times the agent will attempt to fix the code if tests fail.
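To make the role of max_iterations concrete, here is a minimal sketch of a bounded test-fix loop in plain Python. It is not the AutoCoderAgent implementation: `run_test_fix_loop`, `generate`, and `run_tests` are hypothetical names, and the two callables stand in for the LLM call and the sandbox test run.

```python
from typing import Callable, Optional, Tuple


def run_test_fix_loop(
    generate: Callable[[Optional[str]], str],
    run_tests: Callable[[str], Optional[str]],
    max_iterations: int = 5,
) -> Tuple[bool, str, int]:
    """Regenerate code until the tests pass or the attempt budget runs out.

    `generate` receives the previous error (None on the first attempt) and
    returns candidate code. `run_tests` returns None on success or an error
    message on failure. Both are hypothetical stand-ins.
    """
    error: Optional[str] = None
    code = ""
    for attempt in range(1, max_iterations + 1):
        code = generate(error)    # feed the last failure back to the generator
        error = run_tests(code)   # None means every test passed
        if error is None:
            return True, code, attempt
    # Budget exhausted: report failure along with the last candidate
    return False, code, max_iterations
```

The loop returns a success flag, the last candidate, and the number of attempts used, which loosely mirrors the success and attempts fields shown later in this tutorial.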
Step 2: Define the Task Environment
Any Flyte task that uses AutoCoderAgent must depend on the sandbox_environment. This ensures the task has the necessary permissions and infrastructure to build and run isolated images.
from flyte.sandbox import sandbox_environment

env = flyte.TaskEnvironment(
    name="codegen-env",
    secrets=[
        flyte.Secret(key="my_openai_key", as_env_var="OPENAI_API_KEY"),
    ],
    depends_on=[sandbox_environment],
)
Step 3: Generate Code from Samples and Constraints
Use the generate method to create code. By providing samples, the agent automatically extracts data schemas and statistics to help the LLM understand the data structure. You can also provide constraints to enforce business logic.
import pandas as pd
from flyte.io import File

@env.task
async def generate_analysis(prompt: str, data: pd.DataFrame):
    # Generate code based on the prompt and sample data
    result = await agent.generate.aio(
        prompt=prompt,
        samples={"readings": data},
        constraints=[
            "Temperature values must be between -40 and 60 Celsius",
            "Output report must have one row per unique sensor_id",
        ],
        outputs={
            "report": File,
            "total_anomalies": int,
        },
    )
    if not result.success:
        raise RuntimeError(f"Code generation failed: {result.error}")
    return result
The outputs dictionary defines the expected return types of the generated script. flyte-sdk supports str, int, float, bool, File, and Dir as sandbox outputs.
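To give a feel for what "extracting schemas and statistics" from a sample can involve, the sketch below builds a small per-column summary with pandas. It only illustrates the idea: `summarize_sample` is a hypothetical helper, the `temperature` column name is assumed, and the summary the agent actually sends to the LLM may look quite different.

```python
import pandas as pd


def summarize_sample(df: pd.DataFrame) -> dict:
    """Build a small schema-and-statistics summary of a sample DataFrame."""
    summary = {"rows": len(df), "columns": {}}
    for name in df.columns:
        col = df[name]
        info = {"dtype": str(col.dtype), "nulls": int(col.isna().sum())}
        if pd.api.types.is_numeric_dtype(col):
            # Numeric columns: record the observed value range
            info["min"] = float(col.min())
            info["max"] = float(col.max())
        else:
            # Other columns: record how many distinct values appear
            info["unique"] = int(col.nunique())
        summary["columns"][name] = info
    return summary


# Hypothetical sensor sample, similar in spirit to the "readings" input
readings = pd.DataFrame(
    {"sensor_id": ["a", "a", "b"], "temperature": [21.5, 22.0, 19.0]}
)
print(summarize_sample(readings))
```

A compact summary like this lets a prompt describe the data without embedding the data itself.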
Step 4: Execute the Generated Code
Once you have a CodeGenEvalResult, you can execute the code immediately or convert it into a reusable Flyte task.
Option A: Immediate Execution
Use result.run() for one-off executions. This runs the generated code in the sandbox using the original samples as inputs.
# Inside a task...
# Outputs come back in the order declared in `outputs` in Step 3
report, total_anomalies = await result.run.aio()
Option B: Create a Reusable Task
Use result.as_task() to create a standard Flyte task from the generated code. This is ideal for production pipelines where you want to reuse the validated code on new data.
# Inside a task...
analysis_task = result.as_task(
    name="run_sensor_analysis",
    resources=flyte.Resources(cpu=1, memory="512Mi"),
)

# Execute the new task with different data
report, total_anomalies = await analysis_task.aio(
    readings=new_data_file,
)
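When you reuse generated code on new data, it can be worth re-checking the Step 3 constraints yourself rather than relying only on the tests that ran during generation. The sketch below does this in plain Python over rows loaded as dictionaries. `check_constraints` is a hypothetical helper and not part of flyte-sdk, and the `temperature` column name is an assumption (`sensor_id` comes from the constraint text).

```python
from collections import Counter
from typing import List, Mapping, Sequence


def check_constraints(
    readings: Sequence[Mapping],
    report_rows: Sequence[Mapping],
) -> List[str]:
    """Return human-readable violations of the two Step 3 constraints."""
    violations = []

    # Constraint 1: temperature values must lie between -40 and 60 Celsius
    for i, row in enumerate(readings):
        temp = row["temperature"]
        if not -40 <= temp <= 60:
            violations.append(f"reading {i}: temperature {temp} outside [-40, 60]")

    # Constraint 2: the report has exactly one row per unique sensor_id
    expected = {row["sensor_id"] for row in readings}
    counts = Counter(row["sensor_id"] for row in report_rows)
    for sensor_id in sorted(expected):
        if counts[sensor_id] != 1:
            violations.append(
                f"sensor {sensor_id}: {counts[sensor_id]} report rows, expected 1"
            )
    for sensor_id in sorted(set(counts) - expected):
        violations.append(f"sensor {sensor_id}: in report but not in readings")

    return violations
```

An empty list means both constraints hold; anything else can be logged or turned into a task failure.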
Advanced Configuration
Using the Claude Agent Backend
By default, AutoCoderAgent uses a structured iteration loop via LiteLLM. You can switch to an autonomous agent mode by setting backend="claude". This requires an ANTHROPIC_API_KEY.
agent = AutoCoderAgent(
    model="claude-3-5-sonnet-20240620",
    backend="claude",
    agent_max_turns=20,
)
Tuning LLM Parameters
You can pass provider-specific parameters (like temperature or top_p) through litellm_params.
agent = AutoCoderAgent(
    model="gpt-4",
    litellm_params={
        "temperature": 0.2,
        "max_tokens": 4096,
    },
)
Complete Example Result
When the generation process finishes, the CodeGenEvalResult contains metadata about the run, which you can use for logging or debugging:
print(f"Success: {result.success}")
print(f"Attempts: {result.attempts}")
print(f"Detected Packages: {result.detected_packages}")
print(f"Input Tokens: {result.total_input_tokens}")
print(f"Output Tokens: {result.total_output_tokens}")
print(f"Generated Code:\n{result.solution}")
When result.success is true, result.solution is not just syntactically correct: it has passed execution tests within the built sandbox image before being returned to you.
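For structured logging, the same metadata can be collected into a dictionary instead of printed line by line. This is a minimal sketch that uses only the attribute names shown above. It assumes the token counts are integers, detected_packages is a list of strings, and solution is a string. `summarize_result` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in for a real CodeGenEvalResult.

```python
from types import SimpleNamespace


def summarize_result(result) -> dict:
    """Collect run metadata into a flat dict suitable for structured logs."""
    return {
        "success": result.success,
        "attempts": result.attempts,
        "packages": sorted(result.detected_packages),
        "total_tokens": result.total_input_tokens + result.total_output_tokens,
        "solution_lines": len(result.solution.splitlines()),
    }


# Stand-in object with made-up values, used only to exercise the helper
fake_result = SimpleNamespace(
    success=True,
    attempts=2,
    detected_packages=["pandas", "numpy"],
    total_input_tokens=1200,
    total_output_tokens=350,
    solution="import pandas as pd\nprint('ok')\n",
)
print(summarize_result(fake_result))
```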