The ComputeRouter is unified managed compute for agents. Your agent runs fast and cheap; you don’t pick hosts, regions, or interconnect topologies. The Router handles compute provisioning, lifecycle, idle suspension, fail-over, and cost attribution per agent run. It’s the third axis of Copass’s three-Router architecture:
- AgentRouter — who runs (the agent runtime)
- ContextRouter — what they know (the data and retrieval layer)
- ComputeRouter — where they run (the silicon under the runtime)
What you can do
- Run open-weights agent runtimes (Hermes today) on managed compute without provisioning servers yourself.
- Pick a compute provider per agent (Daytona, Fly Sprites) — or let cost-aware routing pick per call.
- Get long-lived sandboxes that auto-suspend on idle and resume on next invocation. No cold-start tax on every turn.
- Audit compute time per (user_id, sandbox_id, agent_id, run_id) through the same credit ledger as token usage.
- Fail over between providers automatically when one is degraded.
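The shared-ledger audit model (two rows per run — token cost plus compute cost — joined on run_id) can be sketched as follows. The row shape and function names are assumptions for illustration, not the actual ledger schema:

```typescript
// Hypothetical ledger rows: one token-cost row and one compute-cost row per run.
type LedgerRow = {
  runId: string;
  userId: string;
  kind: "tokens" | "compute";
  credits: number;
};

// Join the rows for a run to get its total attributed cost.
function runCost(ledger: LedgerRow[], runId: string): number {
  return ledger
    .filter((row) => row.runId === runId)
    .reduce((sum, row) => sum + row.credits, 0);
}

const ledger: LedgerRow[] = [
  { runId: "run_1", userId: "u_1", kind: "tokens", credits: 12 },
  { runId: "run_1", userId: "u_1", kind: "compute", credits: 3 },
  { runId: "run_2", userId: "u_1", kind: "tokens", credits: 5 },
];

console.log(runCost(ledger, "run_1")); // → 15
```

Because both rows carry the same run_id, compute time and token usage reconcile to a single per-run total without a second billing system.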
Supported compute providers
| Provider | Best for | Key trade-off |
|---|---|---|
| Daytona | Latency-sensitive workloads, generous free tier, fast warm-start (~30 ms). Archive-resume model fits long-lived per-(user, sandbox) instances. | Slightly higher per-hour cost than Sprites at scale. |
| Fly Sprites | Cost-sensitive workloads, L3 egress allowlists, multi-region. The default for high-throughput agents. | Slightly higher cold-start latency than Daytona. |
Cost-aware routing
When you don’t pin a provider, ComputeRouter picks per call by policy: latency-aware routing favors warm Daytona instances and spills to Sprites when caps are hit.
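A minimal sketch of what such a latency-aware pick could look like. The function name, input shape, and cap semantics are assumptions, not the actual routing code:

```typescript
type Provider = "daytona" | "fly_sprites";

interface RoutingState {
  warmDaytonaSandbox: boolean; // a warm Daytona instance exists for this (user, sandbox)
  daytonaActive: number;       // currently active Daytona sandboxes for this user
  daytonaCap: number;          // per-user Daytona concurrency cap
}

// Latency-aware policy: favor a warm Daytona instance,
// spill to Sprites once the Daytona cap is hit.
function pickProvider(state: RoutingState): Provider {
  const underCap = state.daytonaActive < state.daytonaCap;
  if (state.warmDaytonaSandbox && underCap) return "daytona";
  return "fly_sprites";
}
```

The spill behavior is what keeps the policy cost-aware: Daytona’s warm-start advantage is used while it’s free to use, and overflow lands on the cheaper per-hour provider.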
Why this exists
Two breaks in the existing agent runtime model push toward a separate compute axis:
- Open-weights agents need real compute. Hermes (Nous Research, MIT-licensed, OpenAI-compatible HTTP API, SQLite-on-disk memory) needs to run on a host you provision. Vendor-managed agent platforms don’t help here — you own the runtime, you own the silicon.
- The compute axis is orthogonal to the runtime axis. Hermes-on-Daytona vs. Hermes-on-Sprites is a compute decision; Hermes vs. some-future-open-weights-agent is a runtime decision. Treating them as one axis would force a combinatorial enum that breaks every time you add either.
You select the runtime on backend; you select the compute on compute_provider. Two orthogonal fields, picked independently.
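The two orthogonal fields can be illustrated with a small type sketch. The field names follow the text above; the value sets are assumptions:

```typescript
// Runtime and compute are independent axes: any backend can pair with
// any compute_provider, with no combinatorial enum to maintain.
type Backend = "hermes"; // runtime axis (more runtimes later)
type ComputeProvider = "daytona" | "fly_sprites"; // compute axis

interface AgentConfig {
  backend: Backend;
  compute_provider?: ComputeProvider; // omit to let cost-aware routing pick
}

const pinned: AgentConfig = { backend: "hermes", compute_provider: "daytona" };
const routed: AgentConfig = { backend: "hermes" }; // ComputeRouter picks per call
```

Adding a runtime or a provider grows one union by one member; nothing else changes.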
What ComputeRouter handles
| Concern | Behavior |
|---|---|
| Provisioning | Spins up a compute sandbox per (user_id, sandbox_id). Long-lived; auto-stops on idle (default 15 min); archives after a window (default 7 days); resumes on next invocation. |
| Cost attribution | Every sandbox carries {user_id, sandbox_id, agent_id, run_id} metadata tags. Compute time bills to the right account through the credit ledger — two rows per run (token cost + compute cost), joined on run_id. |
| Auth | Per-turn DEK injection — credentials never bake into the sandbox image. Authorization: Bearer rotates per call. |
| Egress | Sandbox network is locked down to the Copass MCP host. No general internet. L3 allowlists on Sprites. |
| Kill-switch | Inherits the platform-wide gate (global pause, per-user pause, per-provider pause). Pause the user, pause the platform, pause Daytona but not Sprites — all supported. |
| Concurrency caps | Hard per-user limits prevent runaway sandboxes. Max-runtime ceilings prevent silent cost accumulation. |
| Region selection | Provider chooses the region nearest the calling user, or pins explicitly when compliance requires. |
| Fallback | Primary provider unhealthy? ComputeRouter fails over to the secondary automatically. |
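The provisioning lifecycle in the table above (auto-stop on idle, archive after a window, resume on invocation) can be sketched as a small state machine. The timings mirror the stated defaults; everything else is illustrative:

```typescript
type SandboxState = "running" | "stopped" | "archived";

const IDLE_STOP_MS = 15 * 60 * 1000;        // default: stop after 15 min idle
const ARCHIVE_MS = 7 * 24 * 60 * 60 * 1000; // default: archive after 7 days

// Given time since the last invocation, which state should the sandbox be in?
function expectedState(idleMs: number): SandboxState {
  if (idleMs >= ARCHIVE_MS) return "archived";
  if (idleMs >= IDLE_STOP_MS) return "stopped";
  return "running";
}

// Any invocation resumes the sandbox, whatever state it was in.
function onInvoke(_current: SandboxState): SandboxState {
  return "running";
}
```

This is why there is no cold-start tax on every turn: a stopped or archived sandbox resumes in place rather than being re-provisioned from scratch.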
When a run starts, AgentRouter reads the agent’s compute_provider and asks ComputeRouter to ensure a session.
Three paths
Via the Concierge
- “Create a Hermes agent that runs on the cheapest compute available.”
- “Show me my active compute sessions.”
- “Pause my Hermes agents on Daytona — keep Sprites running.”
Via the CLI
Via the SDK
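The original SDK snippet isn’t present here; a minimal sketch of what such a call could look like follows. Every name below — the client shape, request fields, and result fields — is an assumption, mocked inline so the sketch is self-contained:

```typescript
// Mock stand-in for the SDK client; real client names may differ.
interface RunRequest {
  agent: string;
  compute_provider?: "daytona" | "fly_sprites";
  input: string;
}

interface RunResult {
  run_id: string;
  provider: "daytona" | "fly_sprites";
  output: string;
}

const router = {
  async run(req: RunRequest): Promise<RunResult> {
    // The real SDK would ensure a compute session first;
    // here we just echo to keep the sketch runnable.
    return {
      run_id: "run_demo",
      provider: req.compute_provider ?? "fly_sprites",
      output: `handled: ${req.input}`,
    };
  },
};

router
  .run({ agent: "hermes", compute_provider: "daytona", input: "hello" })
  .then((res) => console.log(res.provider, res.output)); // → daytona handled: hello
```

Leaving compute_provider off the request is what hands the per-call pick to cost-aware routing.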
How an agent run uses ComputeRouter
From the caller’s perspective, a run on managed compute is the same router.run({...}) call as any other AgentRouter invocation.
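The per-run mechanics described on this page — ensure (or resume) a session keyed on (user_id, sandbox_id), then inject a fresh per-turn credential — can be sketched as follows. All names here are hypothetical:

```typescript
interface Session {
  sandboxId: string;
  provider: "daytona" | "fly_sprites";
}

// Ensure a long-lived session for this (user_id, sandbox_id): resume if
// one exists, create otherwise. Mocked with an in-memory map.
const sessions = new Map<string, Session>();
function ensureSession(userId: string, sandboxId: string): Session {
  const key = `${userId}:${sandboxId}`;
  const existing = sessions.get(key);
  if (existing) return existing; // resume the suspended sandbox
  const created: Session = { sandboxId, provider: "daytona" };
  sessions.set(key, created);
  return created;
}

// Per-turn DEK injection: a fresh bearer credential every call,
// so nothing long-lived is baked into the sandbox image.
let turn = 0;
function mintBearer(): string {
  turn += 1;
  return `bearer-${turn}`;
}

const first = ensureSession("u_1", "sb_1");
const second = ensureSession("u_1", "sb_1"); // same session resumed, not re-provisioned
```

The session map is what makes the sandbox long-lived across turns, while the rotating bearer keeps each turn’s credential disposable.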
Common patterns
Latency-sensitive customer chat
Pin compute_provider: 'daytona' for an agent that handles live customer conversations. Daytona’s warm-start budget keeps response times tight.
Throughput-heavy backend processing
Pin compute_provider: 'fly_sprites' for batch-style agents. Lower per-hour cost; cold-start latency doesn’t matter when the workload runs on a queue.
Mixed workload, policy-driven
compute_policy: 'cost-aware' lets ComputeRouter pick per call. Latency-tagged turns route to Daytona; everything else hits Sprites.
Compliance routing
Region-pin via compute_region when data residency requires a specific geography. Both providers support multi-region.
What it explicitly is not
- Not a generic GPU compute marketplace. ComputeRouter is for agent runtimes — Hermes today, peers later. ML training jobs, batch inference at scale, custom CUDA kernels — not the target. Buyers who need raw GPU SKU access go direct to the providers.
- Not a sandbox for untrusted user code. That’s E2B’s niche. Hermes is our code; the isolation requirement is per-tenant data isolation, not adversarial-code isolation.
- Not a replacement for vendor-managed agent compute. Anthropic and Google still run their own compute when you select provider: 'anthropic' or provider: 'google' on AgentRouter. ComputeRouter only enters the picture when the agent runtime is one we self-host.
- Not user-OAuth into the providers. Compute provider accounts are platform-wide, with metadata tags for cost attribution. You don’t bring your own Daytona or Fly account.
Next steps
- AgentRouter — Providers — provider matrix including the Hermes / Daytona / Sprites paths.
- AgentRouter — Build an agent — full agent creation flow including the compute_provider and compute_policy fields.
- ANS overview — the addressing layer that ties an agent’s address to its compute session.
- Portable Context — why context survives provider and compute swaps.

