Orchestrator instances
CRUD, lifecycle load/unload, pool tiers, and domain validation.
Intended audience: Stakeholders, Business analysts, Solution architects, Developers, Testers
Learning outcomes by role
Stakeholders
- Explain orchestrator instances as billable or capacity-consuming units per org.
Business analysts
- Define CRUD stories for create, update, delete, and tier selection.
Solution architects
- Connect instance records to pool tiers and external orchestration backends.
Developers
- Use orchestrator APIs and validation rules from cadence.api.orchestrator.
Testers
- Cover domain validation errors, quota limits, and concurrent updates.
An orchestrator instance is one configured AI runtime for an org: framework, mode, plugins, and pool tier (how eagerly it stays in memory). Org users manage instances only inside their org; creation respects quotas. Data is persisted in Postgres and optionally loaded in OrchestratorPool; HTTP lives under /api/orgs/{org_id}/orchestrators (CRUD, lifecycle, and plugin attachment routes).
Purpose
- Declare framework (framework_type), mode, config, and tier (hot/warm/cold) for an org.
- Load / unload runtimes into the process pool for low-latency chat.
- Publish events (when RabbitMQ is up) so other workers or the pool can react.
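As a rough illustration of what an instance declares, the sketch below models the fields named above as a Python dataclass. Only framework_type, mode, config, and the hot/warm/cold tiers come from this page; the class names, the default tier, the preload_at_startup helper, and the example values are assumptions, not the actual cadence models.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class PoolTier(str, Enum):
    """Tiers named on this page; the enum itself is hypothetical."""
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


@dataclass
class OrchestratorInstanceSpec:
    """Hypothetical shape of one instance declaration for an org."""
    org_id: str
    framework_type: str
    mode: str
    tier: PoolTier = PoolTier.COLD  # default tier is an assumption
    config: dict[str, Any] = field(default_factory=dict)

    def preload_at_startup(self) -> bool:
        # Per this page, hot-tier instances are preloaded; warm/cold defer load.
        return self.tier is PoolTier.HOT


spec = OrchestratorInstanceSpec(
    org_id="org-123",                    # placeholder value
    framework_type="example-framework",  # placeholder, not a real framework name
    mode="supervisor",
    tier=PoolTier("hot"),
)
```

Modeling the tier as an enum rather than a free string means an unknown tier value fails at construction time instead of at load time.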
Typical flow
- Create: POST /api/orgs/{org_id}/orchestrators with roles_allowed(ORG_ORCHESTRATORS_WRITE). org_context ensures membership. SettingsService.validate_orchestrator_config checks framework/mode/config against tenant LLM and plugin constraints.
- Persist: Domain OrchestratorService / repositories write the instance; optional publish_after emits an event.
- Load: Lifecycle routes call OrchestratorPool.create_instance (or unload) with resolved config and plugin artifacts.
- Chat: Clients call POST /api/chat/completion with X-ORG-ID + X-INSTANCE-ID (see Chat and engine).
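From a client's point of view, the Create and Chat steps above could be sketched as follows. The two routes and the X-ORG-ID / X-INSTANCE-ID header names are taken from this page; the base URL, the payload fields, and the PlannedRequest helper are hypothetical, and authentication headers are omitted. The sketch only describes the requests and sends nothing.

```python
import json
from dataclasses import dataclass, field


@dataclass
class PlannedRequest:
    """Plain description of an HTTP call; nothing is sent."""
    method: str
    url: str
    headers: dict[str, str] = field(default_factory=dict)
    body: str = ""


def plan_create(base_url: str, org_id: str, payload: dict) -> PlannedRequest:
    # Route from this page: POST /api/orgs/{org_id}/orchestrators
    return PlannedRequest(
        method="POST",
        url=f"{base_url}/api/orgs/{org_id}/orchestrators",
        headers={"Content-Type": "application/json"},
        body=json.dumps(payload),
    )


def plan_chat(base_url: str, org_id: str, instance_id: str, payload: dict) -> PlannedRequest:
    # Route and header names from this page; the payload shape is assumed.
    return PlannedRequest(
        method="POST",
        url=f"{base_url}/api/chat/completion",
        headers={
            "Content-Type": "application/json",
            "X-ORG-ID": org_id,
            "X-INSTANCE-ID": instance_id,
        },
        body=json.dumps(payload),
    )


create = plan_create(
    "http://localhost:8000",  # placeholder base URL
    "org-123",
    {"framework_type": "example-framework", "mode": "supervisor", "tier": "hot"},
)
chat = plan_chat("http://localhost:8000", "org-123", "inst-456", {"message": "hello"})
```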
Permissions
cadence:org:orchestrators:read, :write, and :lifecycle are defined in the permissions module. Routes use roles_allowed with the appropriate constant.
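A minimal sketch of how the three permission strings might be held as constants and checked. ORG_ORCHESTRATORS_WRITE is named on this page; the READ and LIFECYCLE constant names are assumed by analogy. is_allowed is a hypothetical stand-in, not the roles_allowed dependency, and it treats each permission as independent (no wildcard or hierarchy), which may differ from the real module.

```python
ORG_ORCHESTRATORS_READ = "cadence:org:orchestrators:read"            # name assumed
ORG_ORCHESTRATORS_WRITE = "cadence:org:orchestrators:write"          # name from this page
ORG_ORCHESTRATORS_LIFECYCLE = "cadence:org:orchestrators:lifecycle"  # name assumed


def is_allowed(granted: set[str], required: str) -> bool:
    """Stand-in check: exact string match against the caller's permissions."""
    return required in granted


viewer = {ORG_ORCHESTRATORS_READ}
operator = {ORG_ORCHESTRATORS_READ, ORG_ORCHESTRATORS_LIFECYCLE}
```

Under this assumption, a viewer can list instances but cannot create or load them, and an operator can load or unload without being able to edit configuration.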
Startup
load_hot_tier_instances in the application lifespan preloads active hot-tier instances after plugin catalog sync.
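The ordering described here (catalog sync first, then hot-tier preload) can be sketched with a framework-free async context manager. Only the name load_hot_tier_instances and the ordering come from this page; its signature, the sync_plugin_catalog helper, and the instance dictionaries are assumptions made for illustration.

```python
import asyncio
from contextlib import asynccontextmanager

calls: list[str] = []  # records the order of startup steps for illustration


async def sync_plugin_catalog() -> None:
    # Hypothetical stand-in for the plugin catalog sync.
    calls.append("sync_plugin_catalog")


async def load_hot_tier_instances(instances: list[dict]) -> list[str]:
    # Stand-in with an assumed signature: select only active hot-tier instances.
    calls.append("load_hot_tier_instances")
    return [i["id"] for i in instances if i["active"] and i["tier"] == "hot"]


@asynccontextmanager
async def lifespan(instances: list[dict]):
    await sync_plugin_catalog()                        # catalog first
    loaded = await load_hot_tier_instances(instances)  # then preload
    yield loaded                                       # the app would serve requests here


async def main() -> list[str]:
    instances = [
        {"id": "a", "tier": "hot", "active": True},
        {"id": "b", "tier": "warm", "active": True},   # deferred: not hot
        {"id": "c", "tier": "hot", "active": False},   # skipped: not active
    ]
    async with lifespan(instances) as loaded:
        return loaded


loaded_ids = asyncio.run(main())
```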
Limitations
- Pool capacity: Large numbers of hot instances increase memory and startup time; warm/cold defer load.
- Events: If RabbitMQ is unavailable, event_publisher may be None; handlers should tolerate missing publish (see get_event_publisher in shared dependencies).
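One way a handler could tolerate a missing publisher is to guard the call and report whether anything was sent. This is a sketch under assumptions: the EventPublisher protocol, its synchronous publish method, the event shape, and the publish_if_available helper are all hypothetical; the real publisher returned by get_event_publisher may be asynchronous and expose a different interface.

```python
from typing import Any, Optional, Protocol


class EventPublisher(Protocol):
    """Assumed minimal interface; not the actual cadence publisher."""
    def publish(self, event: dict[str, Any]) -> None: ...


def publish_if_available(
    event_publisher: Optional[EventPublisher], event: dict[str, Any]
) -> bool:
    """Publish when a publisher exists; return False instead of raising when it is None."""
    if event_publisher is None:
        return False
    event_publisher.publish(event)
    return True


class InMemoryPublisher:
    """Test double that collects events in a list."""
    def __init__(self) -> None:
        self.events: list[dict[str, Any]] = []

    def publish(self, event: dict[str, Any]) -> None:
        self.events.append(event)


event = {"type": "orchestrator.created", "instance_id": "inst-456"}  # hypothetical shape
publisher = InMemoryPublisher()
sent = publish_if_available(publisher, event)
skipped = publish_if_available(None, event)
```

Returning a boolean lets the caller log or count skipped events without treating a RabbitMQ outage as a request failure.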
Related pages
- Chat and engine: Completion routes and headers.
- Hot-reload and pool: Tiers, load/unload, and 202 semantics.
- Orchestration modes: Supervisor vs grounded and mode_config.