Fleet
Warm pools of pre-initialized cloud nodes — eliminate cold-start latency for machines and jobs.
Outpost Fleets are named pools of compute nodes that stay warm between jobs. Instead of waiting 1–3 minutes for instance provisioning every time you launch a machine or submit a job, a fleet keeps a configurable minimum number of nodes initialized and ready. Work targeted at a fleet starts in seconds.
Key features
- Warm nodes — keep `min_nodes` instances booted and idle at all times. Jobs start immediately instead of waiting for provisioning.
- Auto-scaling — the fleet scales up to `max_nodes` under load and scales back down during idle periods.
- Idle timeout — nodes above `min_nodes` are terminated after a configurable idle period, controlling standby cost.
- Spot support — run warm spot instances at 60–90% reduced cost. Outpost replaces preempted nodes automatically.
- Labels — tag fleet nodes for job targeting and organization.
Quick start
How it works
- Create — define a fleet with hardware requirements, min/max node counts, and idle timeout.
- Warm up — Outpost provisions `min_nodes` instances and keeps them ready.
- Submit work — jobs targeting the fleet are assigned to idle nodes immediately.
- Scale — fleet scales up when demand exceeds the pool, and scales down when nodes become idle past the timeout.
Node lifecycle
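The lifecycle — warm up to `min_nodes`, scale toward demand up to `max_nodes`, then terminate surplus nodes once they sit idle past the timeout — can be sketched as a small simulation. The `Fleet` and `Node` classes and the tick-based idle accounting below are illustrative assumptions, not Outpost's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    idle_ticks: int = 0  # ticks since this node last ran a job


@dataclass
class Fleet:
    min_nodes: int
    max_nodes: int
    idle_timeout: int  # ticks a node may sit idle before termination
    nodes: list = field(default_factory=list)

    def warm_up(self) -> None:
        # Provision the warm pool so the first jobs start immediately.
        while len(self.nodes) < self.min_nodes:
            self.nodes.append(Node())

    def tick(self, demand: int) -> None:
        # Scale up toward demand, capped at max_nodes; excess work queues.
        while len(self.nodes) < min(demand, self.max_nodes):
            self.nodes.append(Node())
        # Busy nodes reset their idle clock; the rest age by one tick.
        for i, node in enumerate(self.nodes):
            node.idle_ticks = 0 if i < demand else node.idle_ticks + 1
        # Terminate nodes idle past the timeout, never dropping below min_nodes.
        keep = []
        for node in sorted(self.nodes, key=lambda n: n.idle_ticks):
            if node.idle_ticks < self.idle_timeout or len(keep) < self.min_nodes:
                keep.append(node)
        self.nodes = keep
```

Running a fleet with `min_nodes=2, max_nodes=5, idle_timeout=3` through a demand spike and then a quiet period shows the pool grow to 5 and settle back to the 2-node warm floor.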
`min_nodes` vs `max_nodes`
| Setting | Effect |
|---|---|
| `min_nodes: 0` | No warm nodes. Fleet exists but has no standing cost. Nodes provision on demand. |
| `min_nodes: 2` | 2 nodes always warm. First 2 concurrent jobs start immediately. |
| `max_nodes: 10` | Fleet auto-scales up to 10 nodes under load. Additional work is queued. |
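The table's sizing behavior reduces to a clamp: the fleet targets the demand level, floored at `min_nodes` and capped at `max_nodes`. `target_size` below is an illustrative helper, not an Outpost API.

```python
def target_size(demand: int, min_nodes: int, max_nodes: int) -> int:
    """Nodes the fleet aims to run for a given level of concurrent work:
    never below min_nodes (the warm floor), never above max_nodes (the
    scaling ceiling). Work beyond max_nodes queues."""
    return max(min_nodes, min(demand, max_nodes))


# target_size(0, 2, 10)  -> 2   (warm floor, no demand)
# target_size(5, 2, 10)  -> 5   (tracks demand)
# target_size(25, 0, 10) -> 10  (ceiling; 15 jobs queue)
```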
Spot fleets
Set `spot: true` to keep warm spot instances. Outpost automatically replaces preempted nodes to maintain `min_nodes`. Avoid spot fleets for latency-sensitive workloads where a cold start during replacement is unacceptable.
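The replacement behavior can be sketched as: drop preempted nodes from the pool, then provision fresh spot nodes until the pool is back at `min_nodes`. The function and node-naming scheme below are illustrative assumptions, not Outpost internals.

```python
import itertools

_ids = itertools.count(1)  # illustrative ID source for replacement nodes


def maintain_min_nodes(pool, preempted, min_nodes):
    """Remove preempted spot nodes and provision replacements until the
    warm pool is back at min_nodes. Replacements cold-start, which is
    why latency-sensitive work should avoid spot fleets."""
    pool = set(pool) - set(preempted)
    while len(pool) < min_nodes:
        pool.add(f"spot-{next(_ids)}")
    return pool
```

For example, if one of three warm spot nodes is preempted, the pool returns to three nodes, with the surviving two untouched and one fresh replacement.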
Use cases
- Batch training pipelines — keep GPU nodes warm for recurring training jobs to avoid provisioning delays.
- Inference burst capacity — pre-warm a pool of GPU nodes for burst inference demand.
- CI/CD compute — maintain a pool of test runners that start immediately on commit.
- Multi-tenant compute — share a warm pool across team members in a namespace.
Next steps
- Create a fleet — API reference
- List fleets — API reference
- Delete a fleet — API reference