Orchestrator instances

CRUD, lifecycle load/unload, pool tiers, and domain validation.

Intended audience: Stakeholders, Business analysts, Solution architects, Developers, Testers

Learning outcomes by role

Stakeholders

  • Explain orchestrator instances as billable or capacity-consuming units per org.

Business analysts

  • Define CRUD stories for create, update, delete, and tier selection.

Solution architects

  • Connect instance records to pool tiers and external orchestration backends.

Developers

  • Use orchestrator APIs and validation rules from cadence.api.orchestrator.

Testers

  • Cover domain validation errors, quota limits, and concurrent updates.

An orchestrator instance is one configured AI runtime for an org: framework, mode, plugins, and pool tier (how eagerly it stays in memory). Org users can manage only the instances in their own org, and creation is subject to quotas. Instance records are persisted in Postgres and can optionally be loaded into OrchestratorPool. The HTTP routes live under /api/orgs/{org_id}/orchestrators and cover CRUD, lifecycle, and plugin attachment.
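As a rough illustration of the fields named here, the sketch below models an instance declaration and rejects unknown tiers. The class and function names (PoolTier, OrchestratorInstanceSpec, parse_spec), the plugin_ids field, and the placeholder framework and mode values are assumptions made for this example; they are not the project's actual models or validation rules.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class PoolTier(str, Enum):
    """Tier values taken from the documentation: hot, warm, cold."""
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


@dataclass(frozen=True)
class OrchestratorInstanceSpec:
    """Hypothetical shape of an instance declaration (not the real model)."""
    org_id: str
    framework_type: str
    mode: str
    tier: PoolTier
    config: dict[str, Any] = field(default_factory=dict)
    plugin_ids: tuple[str, ...] = ()  # assumed field name for attached plugins


def parse_spec(org_id: str, payload: dict[str, Any]) -> OrchestratorInstanceSpec:
    """Build a spec from a request payload, raising ValueError on bad input."""
    missing = [k for k in ("framework_type", "mode", "tier") if not payload.get(k)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    try:
        tier = PoolTier(payload["tier"])
    except ValueError:
        allowed = ", ".join(t.value for t in PoolTier)
        raise ValueError(f"tier must be one of: {allowed}") from None
    return OrchestratorInstanceSpec(
        org_id=org_id,
        framework_type=payload["framework_type"],
        mode=payload["mode"],
        tier=tier,
        config=dict(payload.get("config") or {}),
        plugin_ids=tuple(payload.get("plugin_ids") or ()),
    )


if __name__ == "__main__":
    # Placeholder values; real framework and mode names come from the platform.
    spec = parse_spec(
        "org-1",
        {"framework_type": "example-framework", "mode": "example-mode", "tier": "hot"},
    )
    print(spec.tier.value)
```

Using an enum for the tier keeps the allowed values in one place, so a test for domain validation errors can assert on a single, predictable message.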

  • Declare framework (framework_type), mode, config, and tier (hot / warm / cold) for an org.
  • Load runtimes into the process pool (or unload them) for low-latency chat.
  • Publish events (when RabbitMQ is up) so other workers or the pool can react.
  1. Create: POST /api/orgs/{org_id}/orchestrators, guarded by roles_allowed(ORG_ORCHESTRATORS_WRITE). org_context ensures the caller is a member of the org. SettingsService.validate_orchestrator_config checks framework, mode, and config against the tenant's LLM and plugin constraints.
  2. Persist: The domain OrchestratorService and its repositories write the instance; an optional publish_after emits an event.
  3. Load: Lifecycle routes call OrchestratorPool.create_instance (or unload) with the resolved config and plugin artifacts.
  4. Chat: Clients call POST /api/chat/completion with the X-ORG-ID and X-INSTANCE-ID headers (see Chat and engine).
  • The permissions cadence:org:orchestrators:read, cadence:org:orchestrators:write, and cadence:org:orchestrators:lifecycle are defined in the permissions module. Routes use roles_allowed with the appropriate constant.
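From the client side, the create and chat steps above might be prepared as follows. This is a sketch only: the two paths and the X-ORG-ID / X-INSTANCE-ID headers come from this page, while BASE_URL, the PreparedRequest helper, and both JSON body shapes are assumptions. Authentication is omitted because this page does not describe it.

```python
import json
from dataclasses import dataclass
from typing import Any
from urllib.parse import quote

BASE_URL = "https://cadence.example.internal"  # hypothetical host


@dataclass(frozen=True)
class PreparedRequest:
    """Plain description of an HTTP call; nothing is sent over the network."""
    method: str
    url: str
    headers: dict[str, str]
    body: bytes


def create_instance_request(org_id: str, spec: dict[str, Any]) -> PreparedRequest:
    """Step 1: POST /api/orgs/{org_id}/orchestrators (body shape is assumed)."""
    return PreparedRequest(
        method="POST",
        url=f"{BASE_URL}/api/orgs/{quote(org_id, safe='')}/orchestrators",
        headers={"Content-Type": "application/json"},
        body=json.dumps(spec).encode("utf-8"),
    )


def chat_request(org_id: str, instance_id: str, message: str) -> PreparedRequest:
    """Step 4: POST /api/chat/completion, scoped by the two X- headers."""
    return PreparedRequest(
        method="POST",
        url=f"{BASE_URL}/api/chat/completion",
        headers={
            "Content-Type": "application/json",
            "X-ORG-ID": org_id,
            "X-INSTANCE-ID": instance_id,
        },
        # The {"message": ...} body is a placeholder, not the documented schema.
        body=json.dumps({"message": message}).encode("utf-8"),
    )


if __name__ == "__main__":
    req = chat_request("org-1", "inst-42", "hello")
    print(req.method, req.url)
    print(sorted(req.headers))
```

Keeping request construction separate from sending makes it easy for testers to assert on paths and headers without a running server.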

During application lifespan startup, load_hot_tier_instances preloads the active hot-tier instances once the plugin catalog sync has finished.
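The preload described above can be pictured with the minimal sketch below. It is not the real load_hot_tier_instances: InstanceRecord, InMemoryPool, and preload_hot_tier are hypothetical stand-ins, the actual OrchestratorPool.create_instance signature is not shown on this page, and the real code is probably asynchronous. Collecting failures instead of aborting startup is a design assumption of this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class InstanceRecord:
    """Hypothetical persisted row; field names are assumptions."""
    instance_id: str
    tier: str
    is_active: bool


class InMemoryPool:
    """Stand-in for OrchestratorPool that only tracks loaded instance ids."""

    def __init__(self) -> None:
        self.loaded: list[str] = []

    def create_instance(self, record: InstanceRecord) -> None:
        self.loaded.append(record.instance_id)


def preload_hot_tier(records: Iterable[InstanceRecord], pool: InMemoryPool) -> list[str]:
    """Load every active hot-tier record and return the ids that failed."""
    failed: list[str] = []
    for record in records:
        if record.tier != "hot" or not record.is_active:
            continue  # warm and cold instances are deferred
        try:
            pool.create_instance(record)
        except Exception:
            # Assumed policy: one bad instance should not block the others.
            failed.append(record.instance_id)
    return failed


if __name__ == "__main__":
    pool = InMemoryPool()
    rows = [
        InstanceRecord("a", "hot", True),
        InstanceRecord("b", "warm", True),
        InstanceRecord("c", "hot", False),
    ]
    print(preload_hot_tier(rows, pool), pool.loaded)
```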

  • Pool capacity: A large number of hot instances increases memory use and startup time; the warm and cold tiers defer loading.
  • Events: If RabbitMQ is unavailable, event_publisher may be None; handlers should tolerate a missing publisher (see get_event_publisher in shared dependencies).
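One way a handler could tolerate a missing publisher is sketched below. The Publisher callable type, the publish_if_available helper, and the event name are hypothetical; the real event_publisher object and its methods are not described on this page.

```python
from typing import Any, Callable, Optional

# Assumed shape: a callable taking an event type and a payload.
Publisher = Callable[[str, dict[str, Any]], None]


def publish_if_available(
    publisher: Optional[Publisher],
    event_type: str,
    payload: dict[str, Any],
) -> bool:
    """Publish when a publisher exists; return False when it is None."""
    if publisher is None:
        # RabbitMQ is down or not configured: skip quietly, do not fail the request.
        return False
    publisher(event_type, payload)
    return True


if __name__ == "__main__":
    sent: list[tuple[str, dict[str, Any]]] = []

    def record_event(event_type: str, payload: dict[str, Any]) -> None:
        sent.append((event_type, payload))

    # "orchestrator.created" is a placeholder event name.
    print(publish_if_available(None, "orchestrator.created", {"id": "inst-42"}))
    print(publish_if_available(record_event, "orchestrator.created", {"id": "inst-42"}))
    print(len(sent))
```

Returning a boolean lets callers log or count skipped events without treating an absent broker as an error.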