AgentFlow · Trilogy

The instrument panel for AI work.

You've put agents into production. Your dashboards already say they ran. The harder questions — are they working, are they getting better or quietly degrading, who proposed what and who approved it, where should the next dollar go — your dashboards don't answer.

AgentFlow is built around two ideas, in this order:

A pipeline orchestrator that doesn't touch your agent code. Your agents are yours. AgentFlow launches them in their own containers, integrates with the humans and external systems that approve their work, and gives you fleet-wide visibility into which pipelines and gates are healthy. This is what AgentFlow is today.
A measurement layer (beads) that ties cost, telemetry, and outcome to each run. Today's beads carry attribution and acceptance; richer telemetry (cost, tokens, models) is in flight as a per-run AI-gateway integration. Section 2 separates what's shipped from what's coming.

Read Section 1 to understand the orchestration layer (works today). Read Section 2 to understand beads — today's attribution + tomorrow's per-run cost.

What AgentFlow runs for you

AgentFlow's first job is to be a pipeline orchestrator. You define a sequence of work; AgentFlow executes it as a Step Functions state machine, manages the human and external decision points along the way, and records what was decided.

The four primitives:

App

one tenant workspace (e.g. cost-opt, cloudsense) — its own AWS account, GitHub repo, S3 bucket, EventBridge, pipelines, agents, RBAC

Pipeline

YAML → Step Functions state machine — triggers (schedule | event | cross-account-event), steps + gates + on_failure handler

Step

Container

where the actual code runs: function/skill → framework's CodeRunner / SkillRunner Lambda · workspace → your Fargate task (your image) · lambda → your Lambda ARN · external_runner → your registered runtime (AgentFlow doesn't launch; observes)

Gate

App contains Pipelines. Pipelines are Step Functions state machines whose nodes are Steps. Each Step launches into a Container — that's where your code runs. Pipelines end at (or contain) Gates. AgentFlow does not enter the container.
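The containment described above can be pictured as a small data model. This is only an illustrative sketch: the class and field names are hypothetical and are not AgentFlow's actual schema. The step kinds are the ones listed on this page.

```python
from dataclasses import dataclass, field
from typing import List

# Step kinds named on this page; everything else here is a hypothetical model.
STEP_KINDS = {"function", "skill", "workspace", "lambda", "external_runner"}


@dataclass
class Step:
    id: str
    kind: str  # selects the container shape the code runs in

    def __post_init__(self) -> None:
        if self.kind not in STEP_KINDS:
            raise ValueError(f"unknown step kind: {self.kind}")

    @property
    def launched_by_agentflow(self) -> bool:
        # external_runner steps are observed, not launched
        return self.kind != "external_runner"


@dataclass
class Gate:
    id: str
    kind: str  # "llm_gate" or "external_gate"


@dataclass
class Pipeline:
    name: str
    steps: List[Step] = field(default_factory=list)
    gates: List[Gate] = field(default_factory=list)


@dataclass
class App:
    app_id: str
    pipelines: List[Pipeline] = field(default_factory=list)
```

For example, `App("cost-opt", [Pipeline("nightly", [Step("scan", "workspace")])])` models one app with one pipeline whose single step runs in a Fargate task.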

What AgentFlow does NOT do

AgentFlow does not interfere with how you build your apps or your agents.

No SDK to import inside your agent code
No framework conventions inside the container
No context to propagate, no headers to thread
Your agents can be LangGraph, CrewAI, raw OpenAI SDK, custom Python, anything
Step kinds match natural deployment shapes: function / skill for Python that runs in our pooled runners; workspace for whatever Docker image you ship to Fargate; lambda for an ARN you hand us; external_runner for runtimes outside our launch path that report back

AgentFlow is the launcher and the recorder. It launches the container, waits for the step to finish, routes the gate to the right human or system, records the decision, then advances the state machine. Your code stays yours.

How decisions get made

Pipelines incorporate gates — points where a decision happens (approve, reject, hold). Two kinds today:

`llm_gate` — LLM-as-decider

A Bedrock-backed Lambda evaluates the step's output against a decision_schema. Useful for rule checks expressed in natural language that have a clear yes/no answer ("does this PR contain a breaking change?"). The decision lands in DecisionV2.

yaml
- id: cp1-rerun-decider
  kind: llm_gate
  on_event: cost-opt.cp1.proposed
  scope: ticket
  model: anthropic.claude-3-haiku-20240307-v1:0
  decision_schema: { ref: cost_opt.schemas.RerunDecision }
  decided_by: cost-opt-cp1-rerun-decider:v1
  on_rerun: { ... }
  on_park: { transition_to: Awaiting-CP1 }
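
To make the role of decision_schema concrete, here is a minimal sketch of checking a model's reply before acting on it. The shape of RerunDecision is an assumption: the fields `decision` and `reason`, and the allowed values `rerun` and `park`, are inferred only from the on_rerun and on_park handlers in the example above and may not match the real cost_opt.schemas.RerunDecision.

```python
import json
from dataclasses import dataclass

# Assumed from the on_rerun / on_park handlers; not the real schema.
ALLOWED_DECISIONS = ("rerun", "park")


@dataclass(frozen=True)
class RerunDecision:
    decision: str
    reason: str


def parse_decision(raw: str) -> RerunDecision:
    """Parse the model's reply as JSON and reject anything outside the schema."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("decision must be a JSON object")
    decision = data.get("decision")
    if decision not in ALLOWED_DECISIONS:
        raise ValueError(f"decision must be one of {ALLOWED_DECISIONS}, got {decision!r}")
    return RerunDecision(decision=decision, reason=str(data.get("reason", "")))


def handler_for(decision: RerunDecision) -> str:
    """Name of the gate handler key that would fire for this decision."""
    return f"on_{decision.decision}"
```

Validating the reply against a closed set of values is what lets the gate route to a handler; a reply outside the schema is treated as an error rather than guessed at.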

`external_gate` — human or external system as decider

Three sources:

yaml
# Jira: a state transition in a Jira issue resolves the gate
- id: production-approval
  kind: external_gate
  source: jira
  config:
    actor_field: assignee
    state_to_decision:
      Approved: approved
      Rejected: rejected
    timeout_seconds: 86400
  on_approved: { ... }
  on_rejected: { ... }

# github_label: applying a label resolves the gate
- id: pr-review
  kind: external_gate
  source: github_label
  config:
    actor_field: actor
    label_to_decision:
      lgtm: approved
      blocked: rejected

# webhook: any system that POSTs to AgentFlow's resume endpoint
- id: custom-approval
  kind: external_gate
  source: webhook
  config: { ... }

The gate's underlying mechanism is a Step Functions task token — the pipeline pauses at the gate state until the source system reports a decision via the configured resolver. The decision lands in DecisionV2 regardless of source. The pipeline cares about the decision, not where the human was when they made it.
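A rough sketch of what a resolver for the Jira source might do with that task token. The functions `decide` and `resume_gate`, and the shape of the output payload, are hypothetical. `send_task_success(taskToken=..., output=...)` is the Step Functions API call as exposed by boto3; whether AgentFlow reports rejections this way or through a failure path is not stated here, so treat that as an assumption. A recording stand-in replaces the real client so the sketch runs without AWS access.

```python
import json
from typing import Optional


def decide(config: dict, jira_status: str) -> Optional[str]:
    """Map a Jira status to a gate decision; None means keep waiting."""
    return config["state_to_decision"].get(jira_status)


def resume_gate(sfn_client, task_token: str, decision: str, actor: str) -> None:
    """Resume the paused state machine, passing the decision as task output."""
    sfn_client.send_task_success(
        taskToken=task_token,
        output=json.dumps({"decision": decision, "decided_by": actor}),
    )


class RecordingClient:
    """Stand-in for boto3.client('stepfunctions') that records calls."""

    def __init__(self) -> None:
        self.calls = []

    def send_task_success(self, **kwargs) -> None:
        self.calls.append(kwargs)


if __name__ == "__main__":
    config = {"state_to_decision": {"Approved": "approved", "Rejected": "rejected"}}
    client = RecordingClient()
    decision = decide(config, "Approved")
    if decision is not None:
        resume_gate(client, "example-token", decision, "alice")
    print(client.calls)
```

A status that is not in state_to_decision returns None, so the pipeline simply stays paused until a mapped status arrives or the gate times out.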

The Inbox — a cross-app view of pending decisions

For human-resolved gates that don't have an external system attached, AgentFlow exposes an Inbox: a console view that aggregates, across all apps, the tickets sitting in Awaiting-* or Pending-* stages. It is backed by the list_pending_decisions MCP tool, which queries each app's data layer for stage-pending tickets.

The Inbox is a view, not a gate kind. Operators use it to see "what needs me, across every product, right now."
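The aggregation behind such a view can be sketched in a few lines. The ticket shape (`id`, `stage`) and the function name are assumptions for illustration; the real list_pending_decisions tool may filter and return data differently.

```python
from typing import Dict, List

# Stage prefixes the Inbox is described as collecting.
PENDING_PREFIXES = ("Awaiting-", "Pending-")


def pending_decisions(tickets_by_app: Dict[str, List[dict]]) -> List[dict]:
    """Collect tickets in a pending stage from every app, tagged with app_id."""
    inbox = []
    for app_id, tickets in tickets_by_app.items():
        for ticket in tickets:
            # str.startswith accepts a tuple of prefixes
            if ticket["stage"].startswith(PENDING_PREFIXES):
                inbox.append({"app_id": app_id, **ticket})
    return inbox
```

Because the result is computed from ticket stages, nothing about a pipeline changes when a ticket appears in or leaves the Inbox.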

How to use it

You can run AgentFlow as just an orchestrator. No bead enrichment needed. Five steps:

Register an app

MCP tool: registry.register_app (required: display_name; optional: app_id, description, admins, bu, product, tags). Creates a draft AppV2 row. The caller becomes the immutable creator (created_by); admins manage downstream.

Self-deploy your app's CFN stack

Each app deploys its own Step Functions, Lambda functions, IAM roles, etc. into its own AWS account, using SAM, CDK, or whatever tooling you already use.

Finalize registration

MCP tool: registry.register_app_resources. It passes the deployed resource details back to AgentFlow (account_id, region, data_bucket, event_bus_arn, mcp_cross_account_role_arn). Status flips to registered.

Author pipelines as YAML

File location is your choice; canonical pattern is apps/<app_id>/pipelines/<name>.pipeline.yaml. The YAML is parsed by framework/backend/dsl/parser.py; required top-level fields are name and steps. Each pipeline becomes a Step Functions state machine on your next deploy and gets registered via registry.register_pipeline.
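A pre-commit check for the two required top-level fields could look like the sketch below. It works on an already-parsed document (a dict) and is not the logic in framework/backend/dsl/parser.py, whose behavior is not described here. The rule that steps must be a non-empty list is an added assumption.

```python
REQUIRED_TOP_LEVEL = ("name", "steps")


def validate_pipeline(doc: dict) -> list:
    """Return a list of problems; an empty list means the basic shape is fine."""
    problems = [
        f"missing required field: {key}"
        for key in REQUIRED_TOP_LEVEL
        if key not in doc
    ]
    if "steps" in doc:
        steps = doc["steps"]
        # Assumption: a pipeline with no steps is not useful.
        if not isinstance(steps, list) or not steps:
            problems.append("steps must be a non-empty list")
    return problems
```

Returning every problem at once, rather than raising on the first, keeps the feedback loop short when authoring a new pipeline file.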

Run, resolve gates, watch

Start pipelines from the triggers: block (cron or EventBridge), from the MCP execute_pipeline tool, or from a webhook. Resolve external gates wherever your humans are. Watch the Inbox + run history in the console.
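The two registration calls in the steps above can be sketched as MCP requests. The envelope follows the MCP `tools/call` request shape (JSON-RPC 2.0); the helper names are hypothetical. Treating all five resource fields as required is an assumption, and how the second call identifies the app is not stated on this page, so the sketch leaves that out.

```python
import itertools

_request_ids = itertools.count(1)

# Optional fields for registry.register_app, as listed in step 1.
OPTIONAL_APP_FIELDS = {"app_id", "description", "admins", "bu", "product", "tags"}

# Fields passed to registry.register_app_resources, as listed in step 3.
RESOURCE_FIELDS = (
    "account_id",
    "region",
    "data_bucket",
    "event_bus_arn",
    "mcp_cross_account_role_arn",
)


def tool_call(name: str, arguments: dict) -> dict:
    """Wrap a tool invocation in an MCP tools/call JSON-RPC request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def register_app(display_name: str, **optional) -> dict:
    """Build the step 1 request; display_name is the only required field."""
    unknown = set(optional) - OPTIONAL_APP_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return tool_call("registry.register_app", {"display_name": display_name, **optional})


def register_app_resources(**resources) -> dict:
    """Build the step 3 request; assumes every resource field is required."""
    missing = [f for f in RESOURCE_FIELDS if f not in resources]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return tool_call("registry.register_app_resources", resources)
```

For instance, `register_app("Cost Optimizer", app_id="cost-opt")` produces the request body an MCP client would send for step 1.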

That's the orchestrator surface. No beads required.

Closing

Today, AgentFlow is a pipeline orchestrator that integrates with your gates and records attribution against a bead axis. That alone replaces a lot of duct tape and gives you fleet-wide visibility into which agents are healthy.

In flight, the bead axis grows into per-run cost + LLM telemetry through a per-run AI gateway virtual key, with no instrumentation required inside the agent. Three tracked issues (319z, ls6a, aqs3) carry the work; this page will be updated as each ships.

The contract is stable in either world. Today's beads carry identity + attribution + acceptance; tomorrow's beads carry the same plus telemetry; the same bead_id and the same MCP tools read both. Adopt the orchestrator now; the measurement layer ships behind it without breaking anything you've already integrated.