Jobs

Create, monitor, and manage batch compute jobs

Jobs are batch workloads that run a command to completion and then terminate automatically. They support spot instances, multi-node execution, and progress tracking. Read operations use `/{namespace}/jobs` (list) and `/{namespace}/job/{name}` (single job); write operations go through `/auth/v1/seed/`.

List jobs

GET `/{namespace}/jobs`

Retrieve all jobs in a namespace.

Path parameters


namespace
type: string
required
The namespace (user or organization) to list jobs for.

Query parameters


status
type: string
Filter by status. One of `queued`, `running`, `succeeded`, `failed`, `cancelled`, `timed_out`.
page
type: integer
Page number for pagination.
default: 1
per_page
type: integer
Number of results per page. Maximum `100`.
default: 20
sort
type: string
Sort field. One of `created_at`, `started_at`, `completed_at`, `name`.
default: created_at
order
type: string
Sort order. One of `asc`, `desc`.
default: desc

Request

curl "https://outpost.run/acme/jobs?status=running" -H "Authorization: Bearer <access_token>"

Response 200

{
  "data": [
    {
      "id": "job_6e5d4c3b2a1f",
      "name": "finetune-llama-3-r1",
      "namespace": "acme",
      "status": "running",
      "gpu": "A100-80GB",
      "gpu_count": 4,
      "region": "us-east-1",
      "progress": "Epoch 2/3 - Step 4200/6300",
      "cost_per_hour": "12.80",
      "runtime_seconds": 14400,
      "created_at": "2026-03-18T10:00:00Z",
      "started_at": "2026-03-18T10:01:15Z",
      "completed_at": null
    }
  ],
  "pagination": {
    "page": 1,
    "per_page": 20,
    "total": 1,
    "total_pages": 1
  }
}
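The `pagination` object makes it possible to walk through every page of results. The following is a minimal sketch, not an official client: `list_jobs_url`, `iter_jobs`, and the `fetch` callback are hypothetical names, and `fetch` is assumed to perform the authenticated GET and return the decoded JSON body.

```python
from typing import Callable, Iterator
from urllib.parse import urlencode

BASE_URL = "https://outpost.run"  # host taken from the curl examples


def list_jobs_url(namespace: str, page: int = 1, per_page: int = 20, **filters: str) -> str:
    """Build the list-jobs URL with its query parameters."""
    query = {"page": page, "per_page": per_page, **filters}
    return f"{BASE_URL}/{namespace}/jobs?{urlencode(query)}"


def iter_jobs(namespace: str, fetch: Callable[[str], dict], **filters: str) -> Iterator[dict]:
    """Yield every job across all pages.

    `fetch` is a hypothetical caller-supplied function that sends the
    authenticated GET request and returns the parsed JSON response.
    """
    page = 1
    while True:
        body = fetch(list_jobs_url(namespace, page=page, **filters))
        yield from body["data"]
        # Stop once the last page reported by the server has been read.
        if page >= body["pagination"]["total_pages"]:
            break
        page += 1
```

Passing the HTTP call in as `fetch` keeps the paging logic independent of any particular HTTP library.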

Get a job

GET `/{namespace}/job/{name}`

Retrieve details for a single job, including runtime metrics and artifact information.

Path parameters


namespace
type: string
required
The namespace (user or organization) the job belongs to.
name
type: string
required
The job name (e.g., `finetune-llama-3-r1`).

Request

curl https://outpost.run/acme/job/finetune-llama-3-r1 -H "Authorization: Bearer <access_token>"

Response 200

{
  "id": "job_6e5d4c3b2a1f",
  "name": "finetune-llama-3-r1",
  "namespace": "acme",
  "status": "succeeded",
  "command": "python train.py --model meta-llama/Meta-Llama-3.1-8B --dataset acme/instructions-v3 --epochs 3 --lr 2e-5",
  "repo": "acme/training-pipeline",
  "branch": "main",
  "commit_sha": "f6e5d4c3b2a1f6e5d4c3b2a1f6e5d4c3b2a1f6e5",
  "gpu": "A100-80GB",
  "gpu_count": 4,
  "cpu": 64,
  "memory_gb": 256,
  "disk_gb": 500,
  "region": "us-east-1",
  "timeout_minutes": 4320,
  "retries": 1,
  "retry_count": 0,
  "output_path": "/workspace/output/checkpoints",
  "exit_code": 0,
  "cost_per_hour": "12.80",
  "total_cost": "76.80",
  "runtime_seconds": 21600,
  "artifacts": {
    "path": "/workspace/output/checkpoints",
    "size_bytes": 16106127360,
    "files": 12,
    "download_url": "https://artifacts.outpost.run/acme/job_6e5d4c3b2a1f/checkpoints"
  },
  "created_at": "2026-03-18T10:00:00Z",
  "started_at": "2026-03-18T10:01:15Z",
  "completed_at": "2026-03-18T16:01:15Z"
}
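In this example, `total_cost` equals `cost_per_hour` multiplied by the runtime in hours (21600 seconds is 6 hours, and 6 × 12.80 = 76.80). The sketch below reproduces that arithmetic with `Decimal`, since the API returns monetary values as strings. The formula and the rounding mode are assumptions inferred from the example values, not a documented billing rule, and `estimate_total_cost` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP


def estimate_total_cost(cost_per_hour: str, runtime_seconds: int) -> str:
    """Estimate total cost as hourly rate times runtime in hours.

    Assumption: per-second proration rounded to cents; the actual
    billing rules may differ.
    """
    hours = Decimal(runtime_seconds) / Decimal(3600)
    total = Decimal(cost_per_hour) * hours
    return str(total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))


print(estimate_total_cost("12.80", 21600))  # prints 76.80
```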

Get job logs

GET `/{namespace}/job/{name}/logs`

Stream or retrieve the logs for a job.

Path parameters


namespace
type: string
required
The namespace (user or organization) the job belongs to.
name
type: string
required
The job name.

Query parameters


tail
type: integer
Number of most recent log lines to return. Maximum `10000`.
default: 100
since
type: string
Return logs after this ISO 8601 timestamp. For example, `2026-03-18T12:00:00Z`.
stream
type: boolean
If `true`, the response uses `text/event-stream` to stream logs in real time. Only available for running jobs.
default: false

Request

curl "https://outpost.run/acme/job/finetune-llama-3-r1/logs?tail=50" -H "Authorization: Bearer <access_token>"

Response 200

{
  "name": "finetune-llama-3-r1",
  "namespace": "acme",
  "lines": [
    { "timestamp": "2026-03-18T15:58:00Z", "message": "[Epoch 3/3] Step 6280/6300 | Loss: 0.0312 | LR: 1.2e-6" },
    { "timestamp": "2026-03-18T15:59:00Z", "message": "[Epoch 3/3] Step 6290/6300 | Loss: 0.0298 | LR: 6.0e-7" },
    { "timestamp": "2026-03-18T15:59:30Z", "message": "[Epoch 3/3] Step 6300/6300 | Loss: 0.0285 | LR: 0.0" },
    { "timestamp": "2026-03-18T15:59:45Z", "message": "Training complete. Final loss: 0.0285" },
    { "timestamp": "2026-03-18T16:00:00Z", "message": "Saving checkpoint to /workspace/output/checkpoints/final..." },
    { "timestamp": "2026-03-18T16:00:30Z", "message": "Checkpoint saved. Total size: 15.0 GB" },
    { "timestamp": "2026-03-18T16:01:00Z", "message": "Uploading artifacts..." },
    { "timestamp": "2026-03-18T16:01:15Z", "message": "Done. Artifacts uploaded to acme/job_6e5d4c3b2a1f/checkpoints" }
  ],
  "has_more": true
}

Streaming logs in real time

For running jobs, you can stream logs using Server-Sent Events:

curl -N "https://outpost.run/acme/job/finetune-llama-3-r1/logs?stream=true" -H "Authorization: Bearer <access_token>" -H "Accept: text/event-stream"
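A client consuming this stream has to split it into events. The sketch below parses the generic `text/event-stream` framing (events separated by blank lines, payload carried on `data:` lines). What each payload contains is not documented here, so the function returns it as raw text; `iter_sse_data` is a hypothetical helper name.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each event in a text/event-stream body."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One leading space after the colon is not part of the payload.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event, id, retry) are ignored in this sketch.
```

The function accepts any iterable of text lines, so it can be fed from an HTTP response decoded line by line or from a list in a test.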

Get job analytics

GET `/{namespace}/job/{name}/analytics`

Retrieve analytics and metrics for a job.

Path parameters


namespace
type: string
required
The namespace (user or organization) the job belongs to.
name
type: string
required
The job name.

Request

curl https://outpost.run/acme/job/finetune-llama-3-r1/analytics -H "Authorization: Bearer <access_token>"

Response 200

{
  "name": "finetune-llama-3-r1",
  "namespace": "acme",
  "gpu_utilization_percent": 94,
  "cpu_utilization_percent": 67,
  "memory_used_gb": 210,
  "disk_used_gb": 380,
  "runtime_seconds": 21600
}
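Memory and disk usage are reported in absolute gigabytes, so comparing them with the job's allocation (`memory_gb` and `disk_gb` from the job details) shows how much headroom was left. This is an illustrative sketch using the example values from this page; `utilization` is a hypothetical helper.

```python
def utilization(used: float, allocated: float) -> float:
    """Return usage as a percentage of the allocation, rounded to one decimal."""
    if allocated <= 0:
        raise ValueError("allocated must be positive")
    return round(100 * used / allocated, 1)


# Values copied from the example responses on this page.
job = {"memory_gb": 256, "disk_gb": 500}
analytics = {"memory_used_gb": 210, "disk_used_gb": 380}

memory_pct = utilization(analytics["memory_used_gb"], job["memory_gb"])  # 82.0
disk_pct = utilization(analytics["disk_used_gb"], job["disk_gb"])        # 76.0
```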

Create a job

POST `/auth/v1/seed/{namespace}/jobs`

Submit a new batch job for execution. Jobs run to completion and then terminate.

Path parameters


namespace
type: string
required
The namespace (user or organization) to run the job in.

Body parameters


name
type: string
required
A human-readable name for the job. Must be unique within the namespace. Allowed characters: alphanumeric, hyphens, underscores.
command
type: string
required
The command to execute. This is run inside the container as the entrypoint.
repo
type: string
Full repository name (namespace/repo) to clone into the job's working directory. Either `repo` or `image` is required.
image
type: string
Container image to run. Either `repo` or `image` is required.
branch
type: string
Branch to clone when using `repo`.
default: main
gpu
type: string
required
GPU type for the job. One of `A100-40GB`, `A100-80GB`, `H100-80GB`, `L4`, `T4`, `none`.
gpu_count
type: integer
Number of GPUs. Must be `1`, `2`, `4`, or `8`.
default: 1
cpu
type: integer
Number of vCPUs. One of `2`, `4`, `8`, `16`, `32`, `64`, `96`.
default: 8
memory_gb
type: integer
Memory in gigabytes.
default: 32
disk_gb
type: integer
Ephemeral disk size in gigabytes. Data is discarded when the job completes.
default: 100
env
type: object
Environment variables as key-value pairs.
region
type: string
Deployment region. One of `us-east-1`, `us-west-2`, `eu-west-1`, `eu-central-1`, `ap-northeast-1`.
default: us-east-1
timeout_minutes
type: integer
Maximum runtime in minutes before the job is killed. Default is 24 hours. Maximum is `10080` (7 days).
default: 1440
retries
type: integer
Number of times to retry the job on failure. Maximum `5`.
default: 0
output_path
type: string
Path inside the container to persist as job artifacts. Contents are uploaded to your namespace's artifact storage on completion.
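The constraints listed above can be checked client-side before a request is sent, which avoids a round trip for obviously invalid payloads. The following is a sketch under stated assumptions: `validate_job` is a hypothetical helper, "alphanumeric" is read as ASCII letters and digits, and a minimum `timeout_minutes` of 1 is assumed. The server remains the authority, for example on valid GPU/CPU/memory combinations.

```python
import re
from typing import List

GPU_TYPES = ("A100-40GB", "A100-80GB", "H100-80GB", "L4", "T4", "none")
GPU_COUNTS = (1, 2, 4, 8)
CPU_SIZES = (2, 4, 8, 16, 32, 64, 96)
REGIONS = ("us-east-1", "us-west-2", "eu-west-1", "eu-central-1", "ap-northeast-1")
NAME_RE = re.compile(r"[A-Za-z0-9_-]+")  # assumption: ASCII alphanumerics only


def _int_in_range(value, low: int, high: int) -> bool:
    return isinstance(value, int) and low <= value <= high


def validate_job(payload: dict) -> List[str]:
    """Return a list of problems found in a create-job payload (empty if none)."""
    errors = []
    name = payload.get("name")
    if not (isinstance(name, str) and NAME_RE.fullmatch(name)):
        errors.append("name: required; use letters, digits, hyphens, underscores")
    if not payload.get("command"):
        errors.append("command: required")
    if not (payload.get("repo") or payload.get("image")):
        errors.append("either 'repo' or 'image' must be provided")
    if payload.get("gpu") not in GPU_TYPES:
        errors.append("gpu: required; must be one of " + ", ".join(GPU_TYPES))
    # Optional fields fall back to the documented defaults.
    if payload.get("gpu_count", 1) not in GPU_COUNTS:
        errors.append("gpu_count: must be 1, 2, 4, or 8")
    if payload.get("cpu", 8) not in CPU_SIZES:
        errors.append("cpu: unsupported vCPU count")
    if payload.get("region", "us-east-1") not in REGIONS:
        errors.append("region: unsupported region")
    if not _int_in_range(payload.get("timeout_minutes", 1440), 1, 10080):
        errors.append("timeout_minutes: must be between 1 and 10080")
    if not _int_in_range(payload.get("retries", 0), 0, 5):
        errors.append("retries: must be between 0 and 5")
    return errors
```

Returning all problems at once, rather than raising on the first, lets a caller report every issue in a single pass.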

Request

curl -X POST https://outpost.run/auth/v1/seed/acme/jobs \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "finetune-llama-3-r1",
    "command": "python train.py --model meta-llama/Meta-Llama-3.1-8B --dataset acme/instructions-v3 --epochs 3 --lr 2e-5",
    "repo": "acme/training-pipeline",
    "branch": "main",
    "gpu": "A100-80GB",
    "gpu_count": 4,
    "cpu": 64,
    "memory_gb": 256,
    "disk_gb": 500,
    "env": {
      "WANDB_PROJECT": "llama-finetune",
      "WANDB_API_KEY": "wk_abc123",
      "HF_TOKEN": "hf_xyz789"
    },
    "region": "us-east-1",
    "timeout_minutes": 4320,
    "retries": 1,
    "output_path": "/workspace/output/checkpoints"
  }'

Response 201

{
  "id": "job_6e5d4c3b2a1f",
  "name": "finetune-llama-3-r1",
  "namespace": "acme",
  "status": "queued",
  "command": "python train.py --model meta-llama/Meta-Llama-3.1-8B --dataset acme/instructions-v3 --epochs 3 --lr 2e-5",
  "repo": "acme/training-pipeline",
  "branch": "main",
  "commit_sha": "f6e5d4c3b2a1f6e5d4c3b2a1f6e5d4c3b2a1f6e5",
  "gpu": "A100-80GB",
  "gpu_count": 4,
  "cpu": 64,
  "memory_gb": 256,
  "disk_gb": 500,
  "region": "us-east-1",
  "timeout_minutes": 4320,
  "retries": 1,
  "retry_count": 0,
  "output_path": "/workspace/output/checkpoints",
  "cost_per_hour": "12.80",
  "created_at": "2026-03-18T10:00:00Z",
  "started_at": null,
  "completed_at": null
}

You can also submit jobs from a container image using an API key:

curl -X POST https://outpost.run/auth/v1/seed/acme/jobs \
  -H 'Authorization: API-Key my-key$sk_live_a1b2c3d4e5f6g7h8i9j0' \
  -H "Content-Type: application/json" \
  -d '{
    "name": "batch-embeddings",
    "command": "python embed.py --input /data/corpus.jsonl --output /workspace/output/embeddings",
    "image": "ghcr.io/acme/embedding-batch:v1.3.0",
    "gpu": "L4",
    "cpu": 16,
    "memory_gb": 64,
    "disk_gb": 200,
    "timeout_minutes": 360,
    "output_path": "/workspace/output/embeddings"
  }'

The `Authorization` header is single-quoted because the API key contains a `$`, which the shell would otherwise expand as a variable inside double quotes.
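The same request can be assembled in Python with the standard library. This sketch only builds the request object and supports both authentication schemes shown in the curl examples; `build_create_job_request` is a hypothetical helper, not part of an official SDK.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://outpost.run"  # host taken from the curl examples


def build_create_job_request(
    namespace: str,
    payload: dict,
    *,
    access_token: Optional[str] = None,
    api_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a job."""
    if (access_token is None) == (api_key is None):
        raise ValueError("provide exactly one of access_token or api_key")
    if access_token is not None:
        auth = f"Bearer {access_token}"
    else:
        auth = f"API-Key {api_key}"
    return urllib.request.Request(
        url=f"{BASE_URL}/auth/v1/seed/{namespace}/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": auth, "Content-Type": "application/json"},
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to send it. Because the key is an ordinary Python string, the `$` it contains needs no escaping.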

Cancel a job

POST `/auth/v1/seed/{namespace}/jobs/{name}/cancel`

Cancel a queued or running job. Running jobs receive a SIGTERM followed by a SIGKILL after a 30-second grace period.

Path parameters


namespace
type: string
required
The namespace (user or organization) the job belongs to.
name
type: string
required
The job name.

Request

curl -X POST https://outpost.run/auth/v1/seed/acme/jobs/finetune-llama-3-r1/cancel -H "Authorization: Bearer <access_token>"

Response 200

{
  "id": "job_6e5d4c3b2a1f",
  "name": "finetune-llama-3-r1",
  "namespace": "acme",
  "status": "cancelled",
  "message": "Job cancellation initiated. Running processes will be terminated.",
  "runtime_seconds": 14400,
  "total_cost": "51.20",
  "completed_at": "2026-03-18T14:01:15Z"
}
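Because a running job receives SIGTERM before SIGKILL, the job's own code can use the 30-second grace period to save its state. The sketch below shows one way a Python job script might do that. The `train` loop and `save_checkpoint` callback are hypothetical placeholders, and it assumes the signal is delivered to the Python process itself, which depends on how the command is launched.

```python
import signal
import threading

# Set by the signal handler and polled by the work loop.
stop_event = threading.Event()


def handle_sigterm(signum, frame):
    """Record that termination was requested; the loop does the cleanup."""
    stop_event.set()


# Must be called from the main thread of the job script.
signal.signal(signal.SIGTERM, handle_sigterm)


def train(total_steps: int, save_checkpoint) -> int:
    """Run up to `total_steps` steps, stopping early on SIGTERM.

    Returns the number of the last step completed.
    """
    for step in range(1, total_steps + 1):
        # ... one unit of work goes here ...
        if stop_event.is_set():
            # Keep this fast: SIGKILL follows after the grace period.
            save_checkpoint(step)
            return step
    return total_steps
```

Keeping the handler down to setting a flag, and doing the actual cleanup in the main loop, avoids running slow or non-reentrant code inside the signal handler.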

Delete a job

DELETE `/auth/v1/seed/{namespace}/jobs/{name}`

Delete a job record. Uploaded artifacts are kept unless `delete_artifacts` is set to `true`. Only completed, failed, or cancelled jobs can be deleted.

Path parameters


namespace
type: string
required
The namespace (user or organization) the job belongs to.
name
type: string
required
The job name.

Query parameters


delete_artifacts
type: boolean
Also delete uploaded artifacts from storage. This action cannot be undone.
default: false

Request

curl -X DELETE "https://outpost.run/auth/v1/seed/acme/jobs/finetune-llama-3-r1?delete_artifacts=true" -H "Authorization: Bearer <access_token>"

Response 204

Returns an empty response body on success.


Error responses

All job endpoints may return the following errors:

| Status | Description |
| --- | --- |
| 400 | Bad request: invalid parameters or configuration |
| 401 | Unauthorized: missing or invalid credentials |
| 403 | Forbidden: insufficient permissions |
| 404 | Not found: job does not exist |
| 409 | Conflict: job is in an incompatible state (e.g., already completed) |
| 422 | Unprocessable entity: invalid GPU/CPU/memory combination |
| 429 | Rate limit exceeded |
| 500 | Internal server error: Seed orchestration failure |

Example error body:

{
  "error": {
    "code": "invalid_request",
    "message": "Either 'repo' or 'image' must be provided.",
    "request_id": "req_7b8c9d0e1f2a"
  }
}
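A client can turn an error response into a structured value and decide whether a retry makes sense. In the sketch below, `ApiError` and `parse_error` are hypothetical names, and treating 429 and 500 as retryable is an assumption rather than documented behavior.

```python
import json
from typing import NamedTuple

RETRYABLE_STATUSES = (429, 500)  # assumption: transient conditions worth retrying


class ApiError(NamedTuple):
    status: int
    code: str
    message: str
    request_id: str
    retryable: bool


def parse_error(status: int, body: str) -> ApiError:
    """Parse an error response body, tolerating missing or malformed JSON."""
    try:
        error = json.loads(body).get("error", {})
    except (json.JSONDecodeError, AttributeError):
        error = {}
    if not isinstance(error, dict):
        error = {}
    return ApiError(
        status=status,
        code=error.get("code", "unknown"),
        message=error.get("message", ""),
        request_id=error.get("request_id", ""),
        retryable=status in RETRYABLE_STATUSES,
    )
```

Including `request_id` in logs or support requests makes a specific failure easier to trace.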
