Command Palette
Search for a command to run

Create a service

Deploy a new production service with auto-scaling and load balancing

POST
/auth/v1/seed/{namespace}/services

Path parameters


namespace
type: string
required
The namespace (user or organization) to deploy the service in.

Body parameters


name
type: string
required
A unique name for the service within the namespace. Must be 1-64 characters. Allowed characters: alphanumeric, hyphens, underscores.
cloud
type: string
required
Cloud provider to deploy on. One of `aws`, `azure`, `gcp`.
region
type: string
required
Cloud region to deploy in (e.g., `us-east-1`, `westus2`, `europe-west1`).
zone
type: string
Availability zone within the region.
replicas
type: integer
Number of replicas to deploy.
default: 1
image
type: string
Container image to deploy (e.g., `ghcr.io/acme/llama-server:v2.0`).
port
type: integer
Port the service listens on inside the container.
default: 80
gpus
type: string
GPU configuration in `GPU_TYPE:COUNT` format. For example, `"A100:4"`, `"H100:8"`.
cpus
type: string
CPU count. Specify an exact value like `"4"` or a minimum with `"4+"`.
memory
type: string
Memory in GB. Specify an exact value like `"16"` or a minimum with `"16+"`.
disk_size_gb
type: integer
Ephemeral disk size in gigabytes.
disk_tier
type: string
Disk performance tier. One of `standard`, `high`, `ultra`.
default: standard
spot
type: boolean
Use spot (AWS), preemptible (GCP), or low-priority (Azure) instances for reduced cost.
default: false
instance_type
type: string
Specific cloud instance type to use, overriding automatic selection from CPU/GPU/memory requirements.
ports
type: integer[]
Additional ports to expose on the service.
labels
type: object
Cloud provider labels/tags as key-value pairs.
envs
type: object
Environment variables as key-value pairs.
command
type: string
Override the container entrypoint command.
health_check_path
type: string
HTTP path for health checks. The service must return a 200 on this path.
autostop
type: integer
Number of idle minutes before the service is automatically stopped. Useful for development services.
min_replicas
type: integer
Minimum number of replicas for autoscaling. Set to `0` to enable scale-to-zero.
max_replicas
type: integer
Maximum number of replicas for autoscaling.
volumes
type: string[]
Persistent volumes in `name:/mount/path:sizeGB` format. For example, `"model-cache:/models:100"`.
isolation
type: string
Workload isolation level. One of `container`, `process`, `microvm`.
default: container
registry_id
type: string
Registry credential ID for pulling private container images.

Previous Delete a job

Next List services