Skip to content

Data Model

Orchestra uses a PostgreSQL database to persist workshop templates and instance records. The Kubernetes CRD is the live runtime representation; the database holds history, ownership, and utilization data that the cluster does not track.

erDiagram
workshops {
uuid id PK
string name
string slug UK
string description
string image
string default_duration
int port
jsonb env
string[] args
jsonb resources
jsonb storage
jsonb ingress
string[] tags
bool is_active
string created_by
timestamptz created_at
timestamptz updated_at
}
workshop_instances {
uuid id PK
uuid workshop_id FK
string k8s_name
string namespace
string owner_email
string phase
string url
string duration_requested
timestamptz launched_at
timestamptz expires_at
timestamptz terminated_at
timestamptz created_at
timestamptz updated_at
}
instance_events {
uuid id PK
uuid instance_id FK
string phase
timestamptz recorded_at
}
workshops ||--o{ workshop_instances : "launched as"
workshop_instances ||--o{ instance_events : "transitions through"

Stores the reusable template configuration created and managed by admins.

ColumnTypeNotes
idUUID PK
namestringHuman display name
slugstring UNIQUEk8s-safe prefix; used to generate instance names
descriptionstring nullable
imagestringDocker image (e.g. rocker/rstudio:latest)
default_durationstringe.g. "4h"
portintPort the app listens on in-container (default 8787)
envJSONB nullableExtra env vars for the app container (name -> value)
argsstring[] nullableContainer args, replacing the image’s default CMD
resourcesJSONB{cpu, memory, cpuRequest, memoryRequest, ephemeralStorage, ephemeralStorageRequest}
storageJSONB nullable{size, storageClass}
ingressJSONB nullableIngress overrides ({host, annotations})
tagsstring[]Catalog tags; defaults to empty list
is_activeboolfalse = archived; not launchable
created_bystringEmail of the admin who created it

One row per launch. Created when a user launches from a template; updated on each API read as phase/url are synced from the live k8s CRD.

ColumnTypeNotes
idUUID PK
workshop_idUUID FKworkshops.id (NOT NULL)
k8s_namestringCRD name in k8s, auto-generated as {slug}-{6 chars}
namespacestringKubernetes namespace
owner_emailstringUser who launched the instance
phasestringMirrors k8s status.phase
urlstring nullablePopulated by operator via ingress
duration_requestedstringDuration override or template default
launched_attimestamptzWhen the DB record and CRD were created
expires_attimestamptz nullableSet when k8s operator writes expiry to status
terminated_attimestamptz nullableSet on DELETE or when CRD is gone from k8s

Append-only log of phase transitions. Each row records the phase the instance entered and when. Used to compute time-in-phase utilization.

ColumnTypeNotes
idUUID PK
instance_idUUID FKworkshop_instances.id
phasestringPhase transitioned INTO
recorded_attimestamptzWall-clock time of the transition

Phase, URL, and expiry are written by the Kubernetes operator into the CRD status field. Rather than running a background sync daemon, the API syncs on-demand:

  1. GET /instances/{name} loads the DB record then calls k8s.get_workshop().
  2. The synced fields are phase (← status.phase), url (← status.url), and expires_at (← status.expiresAt). If phase changed, an InstanceEvent is appended; any changed field updates the DB row.
  3. DELETE /instances/{name} deletes the k8s CRD and sets terminated_at, appending a final Terminating event.

This gives eventual consistency with a single read latency per API call and no additional infrastructure.

Utilization is derived from instance_events at query time — no aggregation table is maintained. For a given instance the API walks consecutive event pairs:

time_in_phase[events[i].phase] += events[i+1].recorded_at - events[i].recorded_at

For the last (open) event the upper bound is terminated_at (if set) or now(). active_seconds is the sum of seconds in Ready and Running phases.

See GET /instances/{name}/utilization and GET /templates/{id}/stats.