What is Strait?
Strait is an open-source platform that runs your background jobs, orchestrates multi-step workflows, and manages AI agent pipelines. You define what needs to happen. Strait handles the retries, scheduling, dependencies, and monitoring. One Go binary. One PostgreSQL database. No Redis required. No RabbitMQ. No SQS. Just deploy and start running jobs.
Why teams switch to Strait
Most teams start with a simple queue and a retry loop. Then they need scheduling. Then workflow dependencies. Then approval gates. Then cost tracking for AI agents. Before long, they’re maintaining five different systems that don’t talk to each other. Strait replaces that patchwork with one system:
- Jobs fail gracefully. Every run follows a 13-state lifecycle. Retries use exponential backoff, fixed delays, or custom sequences. Exhausted runs go to a dead letter queue for review.
- Workflows run as DAGs. Define step dependencies, approval gates, sub-workflows, event waits, and sleep delays. Strait validates the graph and runs it.
- Everything is observable. See exactly where every run is, why it failed, how long it took, and what it cost. Real-time streaming, not polling.
- Five SDKs, same architecture. TypeScript, Python, Go, Ruby, and Rust. Pick your language, define your jobs, run them anywhere.
- Self-host or use managed. Deploy on your infrastructure with just PostgreSQL, or use the managed platform at app.strait.dev.
Key Capabilities
13-State FSM
Robust lifecycle management across 13 states, including queued, executing, completed, failed, timed_out, and dead_letter, ensures every job run is tracked correctly.
Workflow DAGs
Directed Acyclic Graphs with fan-in/fan-out, step conditions, template variables, output transforms, human approval gates, and durable event waits.
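To make graph validation concrete, here is a minimal sketch of cycle detection and fan-out/fan-in ordering using Python's standard `graphlib`. It is not Strait's workflow engine: the function name `execution_waves` and the step names in the usage note are hypothetical, and the sketch assumes each step simply lists the steps it depends on.

```python
from graphlib import CycleError, TopologicalSorter


def execution_waves(steps: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into waves that could run in parallel.

    `steps` maps each step name to the set of steps it depends on.
    Raises ValueError for unknown dependencies and CycleError for cycles.
    """
    # Reject dependencies that point at steps which were never defined.
    unknown = {dep for deps in steps.values() for dep in deps} - steps.keys()
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    sorter = TopologicalSorter(steps)
    sorter.prepare()  # raises CycleError if the graph is not acyclic

    waves = []
    while sorter.is_active():
        # Every step returned here has all of its dependencies completed.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For a graph where `clean` and `enrich` both depend on `extract`, and `load` depends on both, the function returns three waves: the fan-out steps share the middle wave, and the fan-in step runs only after both parents finish.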
Smart Retry
Exponential, linear, fixed, or custom per-attempt delays with ±20% jitter. Prevents thundering herd and handles transient failures gracefully.
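The four strategies and the ±20% jitter can be sketched as a small delay calculator. This is an illustration under stated assumptions, not Strait's implementation: the function name `retry_delay`, the doubling factor for exponential backoff, the `max_delay` cap, and the parameter names are all hypothetical.

```python
import random


def retry_delay(attempt, strategy="exponential", base=1.0, max_delay=300.0,
                custom=None, jitter=0.2, rng=random):
    """Return the delay in seconds before retry number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")

    if strategy == "exponential":
        delay = base * 2 ** (attempt - 1)   # base, 2*base, 4*base, ...
    elif strategy == "linear":
        delay = base * attempt              # base, 2*base, 3*base, ...
    elif strategy == "fixed":
        delay = base
    elif strategy == "custom":
        if not custom:
            raise ValueError("custom strategy needs a list of delays")
        # Reuse the last entry once the attempts outnumber the list.
        delay = custom[min(attempt, len(custom)) - 1]
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    delay = min(delay, max_delay)
    # Spread retries over [delay * (1 - jitter), delay * (1 + jitter)] so that
    # runs which failed together do not all retry at the same instant.
    return delay * rng.uniform(1 - jitter, 1 + jitter)
```

With the defaults, the third exponential attempt lands somewhere between roughly 3.2 and 4.8 seconds. Passing `jitter=0.0` makes the result deterministic, which is convenient in tests.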
Cost Budgets
Track AI model usage with micro-USD precision. Enforce per-run and daily project limits to control costs.
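Micro-USD precision usually means storing costs as integers, where 1 USD equals 1,000,000 micro-USD, so that sums never accumulate floating-point error. The sketch below shows that idea together with a per-run and daily project limit check. The function names and parameters are hypothetical, and how Strait actually stores and enforces budgets is not shown here.

```python
from decimal import Decimal

MICRO_PER_USD = 1_000_000  # 1 USD = 1,000,000 micro-USD


def to_micro_usd(usd: str) -> int:
    """Convert a decimal USD string to integer micro-USD.

    Decimal avoids binary float error; any fraction smaller than one
    micro-USD is truncated.
    """
    return int(Decimal(usd) * MICRO_PER_USD)


def within_budget(cost: int, run_spent: int, run_limit: int,
                  project_spent_today: int, daily_limit: int) -> bool:
    """Return True if spending `cost` keeps both limits intact.

    All amounts are integer micro-USD.
    """
    return (run_spent + cost <= run_limit
            and project_spent_today + cost <= daily_limit)
```

Checking both limits before the spend, rather than after, matches the idea of rejecting work that would exceed a budget instead of reporting the overrun afterwards.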
Event Triggers
Pause execution and wait for external events—approvals, webhooks, third-party callbacks—for days or weeks without holding goroutines. Durable, database-backed waits with timeout support.
Real-Time CDC
Postgres WAL change capture via Sequin. No polling required—your applications react instantly when jobs, workflows, or runs change.
SDK Endpoints
Specialized endpoints for job executors—logging, heartbeats, progress updates, checkpoints, continuation, and child job spawning.
Webhooks
HMAC-SHA256 signed webhooks with automatic retries and dead letter queue on delivery failure.
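On the receiving side, an HMAC-SHA256 signature is checked by recomputing it over the raw request body with the shared secret and comparing in constant time. The sketch below assumes the signature arrives hex-encoded; the encoding, the header that carries it, and the function names are assumptions, so check the webhook documentation for the actual format.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return the hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check a received signature against the one computed locally."""
    expected = sign_payload(secret, body)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode("utf-8"),
                               signature_hex.encode("utf-8"))
```

Verify against the exact bytes received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the signature.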
Health Scoring
Aggregate metrics over configurable time windows. Success rate, timeout rate, crash rate, and latency stability—at-a-glance job reliability.
Architecture Overview
api: Handles HTTP requests, job management, and triggering. Scale horizontally for API throughput.
worker: Runs executor, scheduler, and background maintenance. Scale horizontally for job processing throughput.
all: Combined mode for development or small deployments. Single binary, single process.
Why Strait?
Zero External Dependencies
No RabbitMQ. No SQS. No Kafka. PostgreSQL handles queuing with SELECT FOR UPDATE SKIP LOCKED, so concurrent workers claim jobs without blocking one another and without extra operational overhead. The single binary includes everything, with no runtime dependencies to install.
Production-Grade Concurrency
Go goroutines provide parallel job execution without external coordination. A worker pool with bounded backpressure prevents memory exhaustion during traffic spikes. Structured concurrency patterns (sourcegraph/conc) ensure panic recovery and graceful shutdown.
Built for AI Workloads
SDK endpoints designed for AI agents—logging, heartbeats, progress checkpoints, continuation for long-running workflows, and child job spawning. Cost budgets track token usage with micro-USD precision. Debug bundles aggregate execution data for troubleshooting.
Workflow Orchestration
Complex DAGs with step conditions, output transforms, template variables, and human approval gates. Atomic fan-in handles concurrent parent completions safely. Sub-workflows enable arbitrary nesting depth for multi-stage pipelines.
Observability First
OpenTelemetry tracing links job runs across API server, worker, and external endpoints. Prometheus metrics expose queue depth, throughput, and latency. Structured JSON logging enables log aggregation. Real-time SSE streaming via Redis.
Use Cases
Strait fits these patterns:
- Background Jobs: Scheduled data imports, report generation, cache warming, cleanup tasks, and recurring maintenance operations.
- Webhook Consumers: Process events from external services with retries, dead letter queue, and delivery guarantees.
- AI Agent Workflows: Multi-step AI pipelines with human approval gates, conditional execution, and sub-workflow nesting. Cost tracking per run and per project.
- Batch Processing: Bulk job triggering with configurable batch sizes, priority ordering, and idempotency deduplication.
- Data Pipelines: ETL workflows with fan-out parallel steps, transform stages, and aggregation.
- Cron Jobs: Standard 5-field cron expressions with timezone support and execution windows.
Getting Started
Quick Start
Get Strait running in minutes. Clone the repository, start the infrastructure with Docker Compose, and trigger your first job.
Architecture
Deep dive into internals. Learn about queue mechanics, FSM states, workflow engine, and technology choices.
SDK Reference
Official SDKs for TypeScript, Python, Go, Ruby, and Rust with full feature parity. Authoring DSL, composition helpers, and typed errors.
CLI Reference
Complete CLI documentation. 48+ commands organized by category with examples and shell completion.
API Reference
REST API endpoints for job management, triggering, workflow orchestration, and SDK interactions.
Concepts
Core domain concepts you’ll encounter:
- Jobs
- Runs
- Workflows
- Event Triggers
- Environments
Jobs define the template for recurring tasks—endpoint URL, timeout, retry strategy, cron schedule, and cost budgets. Runs are execution instances of jobs.
Guides
Step-by-step guides for common tasks:
Authentication
Internal secret auth for API endpoints and JWT run token auth for the SDK. API key management with system keychain storage.
Deployment
Docker deployment, Fly.io configuration, horizontal scaling strategies, and production readiness checklist.
Security
SSRF protection, rate limiting, encryption at rest, and secure webhook delivery.
Cost Budgets
Per-run and daily project limits. AI model usage tracking with micro-USD precision. Budget enforcement before execution.
Development
Contributing to Strait or running it locally:
Contributing
Set up the development environment, code style, commit conventions, and PR guidelines.
Testing
Unit tests, integration tests with testcontainers, E2E tests, fuzz testing, and benchmarks.
Database Schema
Complete table definitions, indexes, and relationships for the PostgreSQL schema.
What’s Next?
Ready to dive deeper?
- Learn about the queue mechanics and how SKIP LOCKED works
- Understand the workflow engine and DAG execution
- Explore retry strategies — exponential, linear, fixed, and custom
- Set up webhooks with HMAC signing for event delivery
- Build durable workflows with event triggers — wait for external events without holding goroutines
- Configure cost budgets for AI workloads
Explore the Docs
Quick Start
Run your first job in under 10 minutes.
Architecture
Understand how Strait works under the hood.
SDKs
Official client libraries for 5 languages.
API Reference
Complete REST API documentation.