Detailed reference of the PostgreSQL database schema and migration system.
Strait uses PostgreSQL 18 as its primary data store. All primary keys are UUIDv7 stored as TEXT. Schema is managed by versioned SQL migrations (currently through the 000089_* series) that run automatically on startup.
Core Tables
| Table | Description |
|---|---|
jobs | Job definitions (name, slug, endpoint URL, retry config, cron, TTL, retry strategy, environment link) |
job_runs | Execution instances with 13-state FSM, payload, result, error, execution trace, debug mode, continuation lineage |
run_events | Structured log entries per run (type, level, message, data) |
job_versions | Auto-snapshot of job config on every update |
api_keys | Per-project API keys (SHA-256 hashed, revocable, rotatable with grace windows) |
audit_events | Immutable audit trail for security-sensitive and control-plane actions |
project_roles / project_member_roles | RBAC role definitions and project user-role bindings |
resource_policies / tag_policies | Fine-grained policy grants by resource and by tag match |
known_actors | Synced actor identities used for attribution and audit context |
webhook_deliveries | Webhook delivery tracking and dead letter queue (job runs + event triggers) |
event_triggers | Durable event waits for workflow steps and job runs |
project_quotas | Per-project quotas (max jobs, runs, concurrency, cost limits) |
job_run_idempotency | Global deduplication tracking for job runs across partitions |
webhook_subscriptions | Project-level webhook subscriptions for multiple event types |
Workflow Tables
| Table | Description |
|---|---|
workflows | Workflow DAG definitions (name, slug, project, version) |
workflow_versions | Immutable workflow snapshots by version number |
workflow_version_steps | Version-pinned step snapshots used by running workflow runs |
workflow_steps | Current mutable step definitions (dependencies, conditions, retries, concurrency/resource hints) |
workflow_runs | Workflow execution instances (status, payload, parent lineage, version pin) |
workflow_step_runs | Per-step runtime state (status, deps counters, output/error, timing) |
workflow_step_decisions | Explainability log for scheduler/condition decisions during progression |
workflow_policies | Project-level DAG governance controls (max_fan_out, max_depth, forbidden types, approval requirement) |
workflow_run_labels | Key-value labels on workflow runs |
workflow_step_approvals | Step approval tracking (approvers, status, timestamps) |
Core Engine Tables
| Table | Description |
|---|---|
run_usage | AI model usage tracking per run (provider, model, tokens, cost in micro-USD) |
run_checkpoints | SDK-driven run state checkpoints |
run_tool_calls | Tool call recording with input/output and duration |
run_outputs | Structured outputs with optional schema validation |
environments | Named execution environments with JSONB variable storage |
endpoint_circuit_state | Circuit breaker state per endpoint URL (state, failures, opened_at, last_failure_at) |
endpoint_health_scores | EWMA-based health scoring per endpoint (score, success_rate, timeout_rate, latency) |
job_slos | Service level objectives per job (metric, target, window) |
job_slo_evaluations | Point-in-time SLO evaluation results (current_value, budget_remaining) |
pricing_catalog | Static pricing table for AI model cost calculation |
job_secrets | Encrypted secrets scoped to job and environment |
job_groups | Logical grouping of jobs (full CRUD with name, description, project scope) |
job_dependencies | Inter-job dependency definitions |
audit_events | Immutable audit trail for security-sensitive operations (migration 000057) |
batch_operations | Tracks bulk trigger and bulk cancel batch operation status and results |
log_drains | External log drain endpoint configuration for streaming run events |
event_sources | Inbound event source definitions with type and configuration |
event_subscriptions | Bindings between event sources and jobs with optional filter expressions |
Key Columns
| Table.Column | Type | Description |
|---|---|---|
jobs.retry_strategy | TEXT | Retry strategy: exponential, linear, fixed, custom |
jobs.retry_delays_secs | INT[] | Custom per-attempt delays in seconds (for custom strategy) |
jobs.environment_id | TEXT | FK to environments table for endpoint override |
job_runs.execution_trace | JSONB | Timing breakdown (queue_wait, dequeue, connect, ttfb, transfer, total) |
job_runs.debug_mode | BOOLEAN | Whether debug diagnostics are enabled for this run |
job_runs.continuation_of | TEXT | FK to parent run for continuation lineage |
job_runs.lineage_depth | INT | Depth in continuation chain |
workflow_steps.concurrency_key | TEXT | Step-level serialization key across runnable steps |
workflow_steps.resource_class | TEXT | Scheduler capacity bucket (small, medium, large) |
workflow_runs.parent_step_run_id | TEXT | FK linking sub-workflow runs back to parent step run |
workflow_step_decisions.decision_type | TEXT | Decision class (scheduler, concurrency, resource, condition) |
workflow_step_decisions.details | JSONB | Machine-readable context used by explain endpoint |
workflow_policies.max_fan_out | INT | Per-step direct dependent cap enforced on create/update/trigger |
workflow_policies.max_depth | INT | Maximum DAG depth enforced by policy checks |
project_quotas.max_cost_per_run_microusd | BIGINT | Per-run cost limit in micro-USD |
project_quotas.max_daily_cost_microusd | BIGINT | Daily project cost limit in micro-USD |
webhook_deliveries.workflow_run_id | TEXT | FK to workflow_runs for workflow-triggered webhooks |
webhook_deliveries.event_type | TEXT | Type of event being delivered (e.g., run.completed) |
audit_events.resource_type | TEXT | Type of resource affected (e.g., job, workflow) |
audit_events.changes | JSONB | Detailed diff of changes made during the event |
webhook_subscriptions.event_types | TEXT[] | Array of event types this subscription listens to |
Relationships
- Jobs & Runs: A
jobhas manyjob_runs. Each run is linked to a specificjob_version. - Workflows: A
workflowhas manyworkflow_versions, and each version has manyworkflow_version_stepsused to materializeworkflow_step_runsat trigger time. - Workflow Governance:
workflow_policiesis one row per project and is enforced at workflow create/update/trigger boundaries. - Workflow Explainability:
workflow_step_decisionsstores progression decisions per run and powers/v1/workflow-runs/{id}/explain. - Environments:
jobsandjob_secretsare linked toenvironmentsfor configuration isolation.
Event Trigger Tables
event_triggers
id TEXT PRIMARY KEY -- UUIDv7
event_key TEXT NOT NULL UNIQUE -- Globally unique caller-controlled key
project_id TEXT NOT NULL
source_type TEXT NOT NULL -- 'workflow_step' or 'job_run'
trigger_type TEXT NOT NULL DEFAULT 'event' -- 'event' or 'sleep'
workflow_run_id TEXT -- FK to workflow_runs
workflow_step_run_id TEXT -- FK to workflow_step_runs
job_run_id TEXT -- FK to job_runs
status TEXT NOT NULL DEFAULT 'waiting' -- waiting, received, timed_out, canceled
timeout_secs INT NOT NULL DEFAULT 3600
request_payload JSONB
response_payload JSONB
requested_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
received_at TIMESTAMPTZ
expires_at TIMESTAMPTZ
error TEXT
notify_url TEXT
notify_status TEXT
event_emit_key TEXT
sent_by TEXT
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()Key Indexes
idx_event_triggers_status_expires:(status, expires_at) WHERE status = 'waiting'-- Reaper timeout lookupidx_event_triggers_project_status:(project_id, status)-- List queriesidx_event_triggers_event_key_prefix:(event_key text_pattern_ops)-- Prefix matching for batch resolutionidx_event_triggers_step_run:(workflow_step_run_id) WHERE workflow_step_run_id IS NOT NULL-- Step-to-trigger lookup
Performance Indexes
idx_workflow_runs_status_expires:(status, expires_at) WHERE expires_at IS NOT NULL-- Reaper timeout lookupidx_workflow_runs_status_finished:(status, finished_at) WHERE finished_at IS NOT NULL-- Retention deletionidx_step_runs_workflow_run_status:(workflow_run_id, status)-- Step run queries by run and statusidx_step_decisions_step_run_id:(step_run_id)-- CASCADE deletes on decisionsidx_job_runs_active_by_job:(job_id) WHERE status IN ('queued', 'dequeued', 'executing')-- Active run counting
webhook_deliveries Extensions
The webhook_deliveries table (migration 000009, extended in 000061 and 000062) is shared between job run webhooks and event trigger webhooks:
id TEXT PRIMARY KEY
run_id TEXT -- FK to job_runs (nullable)
job_id TEXT -- FK to jobs (nullable)
event_trigger_id TEXT -- FK to event_triggers (nullable)
workflow_run_id TEXT -- FK to workflow_runs (nullable)
subscription_id TEXT -- FK to webhook_subscriptions (nullable)
webhook_url TEXT NOT NULL
status TEXT NOT NULL DEFAULT 'pending' -- pending, delivering, delivered, failed, dead
attempts INT NOT NULL DEFAULT 0
max_attempts INT NOT NULL DEFAULT 5
last_status_code INT
last_error TEXT
next_retry_at TIMESTAMPTZ
delivered_at TIMESTAMPTZ
payload JSONB
event_type TEXT
sequence INT NOT NULL DEFAULT 0
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()Key indexes:
idx_webhook_deliveries_pending:(next_retry_at ASC) WHERE status = 'pending'-- Delivery worker queueidx_webhook_deliveries_run_seq:(run_id, sequence) WHERE run_id IS NOT NULL-- Ordered delivery per runidx_webhook_deliveries_event_trigger:(event_trigger_id) WHERE event_trigger_id IS NOT NULLidx_webhook_deliveries_status_created:(status, created_at)
endpoint_circuit_state
The circuit breaker state table (migration 000022) tracks per-endpoint failure state:
endpoint_url TEXT PRIMARY KEY
state TEXT NOT NULL DEFAULT 'closed' -- closed, open, half_open
failures INT NOT NULL DEFAULT 0
opened_at TIMESTAMPTZ
last_failure_at TIMESTAMPTZ
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()endpoint_health_scores
The endpoint health scoring table (migration 000120) tracks continuous health metrics per endpoint using EWMA:
endpoint_url TEXT PRIMARY KEY
health_score DOUBLE PRECISION NOT NULL DEFAULT 100.0
success_rate DOUBLE PRECISION NOT NULL DEFAULT 1.0
timeout_rate DOUBLE PRECISION NOT NULL DEFAULT 0.0
latency_score DOUBLE PRECISION NOT NULL DEFAULT 1.0
total_requests BIGINT NOT NULL DEFAULT 0
last_latency_ms DOUBLE PRECISION NOT NULL DEFAULT 0.0
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()job_slos and job_slo_evaluations
SLO definitions and evaluation history (migration 000122):
-- job_slos
id TEXT PRIMARY KEY
job_id TEXT NOT NULL REFERENCES jobs(id) ON DELETE CASCADE
project_id TEXT NOT NULL
metric TEXT NOT NULL -- success_rate, p95_latency_secs, p99_latency_secs
target DOUBLE PRECISION NOT NULL
window_hours INT NOT NULL -- 24, 168, 720
UNIQUE(job_id, metric, window_hours)
-- job_slo_evaluations
id TEXT PRIMARY KEY
slo_id TEXT NOT NULL REFERENCES job_slos(id) ON DELETE CASCADE
current_value DOUBLE PRECISION NOT NULL
budget_remaining DOUBLE PRECISION NOT NULL
evaluated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()job_runs Key Indexes
idx_runs_queue_covering:(created_at ASC) INCLUDE (job_id, priority, scheduled_at, next_retry_at) WHERE status = 'queued'-- Covering index for dequeueidx_runs_status_project_covering:(project_id, status, created_at DESC) INCLUDE (job_id, priority, scheduled_at, next_retry_at)-- Covering index for list queriesidx_runs_heartbeat:(heartbeat_at) WHERE status = 'executing'-- Stale run detectionidx_runs_expires:(expires_at) WHERE expires_at IS NOT NULL AND status IN ('delayed', 'queued')-- Expiration reaperidx_job_runs_parent_run_id:(parent_run_id) WHERE parent_run_id IS NOT NULL-- Parent-child lookupidx_runs_metadata: GIN index onmetadata(migration 000016) -- JSONB filteringidx_jobs_tags: GIN index ontags(migration 000054) -- Tag-based filtering
FK Constraints on job_runs
Important for partitioning considerations:
job_runs.parent_run_id-- Self-referential FK dropped in migration 000066 to support range partitioning. Integrity is enforced at the application layer.run_events.run_id-- No ON DELETE clauserun_checkpoints.run_id-- ON DELETE CASCADErun_usage.run_id-- ON DELETE CASCADErun_tool_calls.run_id-- ON DELETE CASCADErun_outputs.run_id-- ON DELETE CASCADE
Miscellaneous Indexes
idx_idempotency_expires:(expires_at) WHERE expires_at IS NOT NULLonjob_run_idempotencyidx_audit_events_resource_lookup:(resource_type, resource_id)onaudit_eventsidx_webhook_subscriptions_project_active:(project_id, created_at DESC) WHERE active = TRUEonwebhook_subscriptions
Migrations
We use golang-migrate for schema evolution.
Numbering Convention
Migrations follow a sequential 6-digit numbering scheme:
000001_*.up.sql through the latest migration in the repository.
Recent Migrations
| Migration | Description |
|---|---|
000062_webhook_deliveries_extend | Extends webhook deliveries with workflow run context, subscriptions, and unified delivery worker columns. |
000063_autovacuum_tuning | Optimizes autovacuum settings for high-churn tables (job_runs, webhook_deliveries, run_events). |
000064_uuidv7_defaults | Adds native uuidv7() defaults for primary keys using the pg_uuidv7 extension. |
000065_job_run_idempotency | Introduces a dedicated table for global job run idempotency tracking across partitions. |
000066_job_runs_partitioning | Converts job_runs to a range-partitioned table by created_at with automated partition management. |
000067_add_performance_indexes | Adds workflow_runs and workflow_step_runs performance indexes. |
000068_covering_indexes | Adds covering indexes (INCLUDE clause) to job_runs and webhook_deliveries for index-only scans. |
000069_audit_events_extend | Extends audit events with resource context, changes, and client metadata (IP, User-Agent). |
000070_webhook_subscriptions | Adds webhook subscriptions table for multi-event project-level webhook management. |
000071_performance_indexes_phase2 | Phase 2 performance indexes for job_runs and workflows. |
000072_add_webhook_retry_policy | Adds retry policy configuration to webhook deliveries. |
000073_recreate_idempotency_index | Recreates idempotency index with corrected partial filter. |
000074_workflow_parent_step_run_id | Adds parent_step_run_id to workflow_runs for sub-workflow lineage. |
000075_workflow_step_concurrency_key | Adds concurrency_key to workflow_steps and workflow_version_steps. |
000076_workflow_step_decisions | Adds explainability decision log table for DAG runtime. |
000077_workflow_step_resource_class | Adds resource_class scheduling bucket to workflow step tables. |
000078_workflow_policies | Adds per-project DAG governance policy table. |
000079_queue_notify_trigger | Adds pg_notify trigger to wake queue workers on new job insertion. |
000080_dequeue_active_runs_index | Adds partial index on job_runs for active run counting during dequeue. |
000081_step_decisions_index | Adds indexes on workflow step decisions table. |
000082_batch_tracking | Creates batch_operations table; adds batch_id to job_runs. |
000083_job_default_run_metadata | Adds default_run_metadata JSONB to jobs and job_versions. |
000084_payload_gin_index | Adds GIN index on job_runs.payload for JSONB containment queries. |
000086_job_run_concurrency_key | Adds max_concurrency_per_key to jobs; concurrency_key to job_runs with partial index. |
000087_named_rate_limit_keys | Adds rate_limit_keys JSONB to jobs and job_versions. |
000088_log_drains | Creates log_drains table with auth config and level filtering. |
000089_event_sources_v2 | Creates event_sources and event_subscriptions tables. |
Adding New Migrations
- Create a new pair of SQL files in the
apps/strait/migrations/directory using the next sequence number:000090_your_feature.up.sql000090_your_feature.down.sql
- Ensure the migration is idempotent.
- The application will automatically detect and apply the new migration on the next startup.
Never modify an existing migration file that has already been merged. Always create a new migration to apply changes.