AI Apps
Orchestrator CRUD, lifecycle load/unload, pool tiers, and domain validation.
Intended audience: Stakeholders, Business analysts, Solution architects, Developers, Testers
Learning outcomes by role
Stakeholders
- Explain AI Apps as billable or capacity-consuming units per org.
Business analysts
- Define CRUD stories for create, update, delete, and tier selection.
Solution architects
- Connect instance records to pool tiers and external orchestration backends.
Developers
- Use orchestrator APIs and validation rules from cadence.api.orchestrator.
Testers
- Cover domain validation errors, quota limits, and concurrent updates.
An AI App (historically called an orchestrator instance in the API) is one configured AI runtime for an org. It declares the framework (langgraph or openai_agents), the mode (for example supervisor, coordinator, handoff, or grounded — see Orchestration backends), which plugins are active, and how eagerly it should stay resident in memory (pool tier). The record lives in PostgreSQL; the running runtime lives in OrchestratorPool. HTTP routes are under /api/orgs/{org_id}/orchestrators.
How instances are created and used
Creating an instance writes a configuration record but does not start a runtime. The pool only holds the instance in memory after an explicit load call, which publishes an async event that a worker processes. Once loaded, the instance handles chat requests; when unloaded it returns 503 to callers.
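The split between the stored record and the resident runtime can be pictured with a small in-memory model. This is a sketch only: PoolSketch and its methods are hypothetical stand-ins, not the real OrchestratorPool API, and the asynchronous worker step is collapsed into a direct call.

```python
from dataclasses import dataclass, field


@dataclass
class PoolSketch:
    """Hypothetical model of record-versus-runtime state (not the real pool)."""

    records: set[str] = field(default_factory=set)  # stands in for PostgreSQL rows
    loaded: set[str] = field(default_factory=set)   # runtimes resident in memory

    def create(self, instance_id: str) -> None:
        # Creating an instance writes the record only; no runtime starts.
        self.records.add(instance_id)

    def load(self, instance_id: str) -> None:
        # In the real system a worker does this after consuming a load event.
        if instance_id not in self.records:
            raise KeyError(instance_id)
        self.loaded.add(instance_id)

    def unload(self, instance_id: str) -> None:
        self.loaded.discard(instance_id)

    def chat_status(self, instance_id: str) -> int:
        # 200 while the runtime is resident, 503 otherwise.
        return 200 if instance_id in self.loaded else 503
```

In this model a freshly created instance answers 503 until load is called, which mirrors the behaviour described above.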
SettingsService.validate_orchestrator_config runs before any record is written. It checks that the requested framework and mode are a valid combination, that the LLM configuration resolves correctly for this org, and that all referenced plugins exist in the org’s catalog. A failed validation returns 422 before anything is persisted.
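The checks described above could look roughly like the following. This is a minimal sketch, not the body of SettingsService.validate_orchestrator_config: the function name, the config keys (framework, mode, plugins), and the shape of the compatibility table are assumptions, and LLM configuration resolution is left out. The table of valid framework and mode pairs is passed in by the caller because the source does not list which combinations are accepted.

```python
def validate_config(
    config: dict,
    valid_modes: dict[str, set[str]],
    catalog: set[str],
) -> list[str]:
    """Return a list of validation errors; an empty list means the config passed.

    Hypothetical helper: key names and error wording are illustrative only.
    """
    errors: list[str] = []
    framework = config.get("framework")
    mode = config.get("mode")

    # Framework and mode must form a known combination.
    if framework not in valid_modes:
        errors.append(f"unknown framework: {framework!r}")
    elif mode not in valid_modes[framework]:
        errors.append(f"mode {mode!r} is not valid for framework {framework!r}")

    # Every referenced plugin must exist in the org's catalog.
    missing = sorted(set(config.get("plugins", [])) - catalog)
    if missing:
        errors.append(f"plugins not in org catalog: {', '.join(missing)}")

    return errors
```

A caller would run this before any write and answer 422 when the returned list is non-empty, so that nothing is persisted for an invalid request.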
Typical flow
- Create: POST /api/orgs/{org_id}/orchestrators with roles_allowed(ORG_ORCHESTRATORS_WRITE). org_context enforces membership. SettingsService.validate_orchestrator_config checks framework, mode, config, and plugin references.
- Persist: OrchestratorService and its repositories write the instance record; publish_after optionally emits a creation event to the message bus.
- Load: Call POST /api/orgs/{org_id}/orchestrators/{instance_id}/load. The handler validates org ownership and publishes a load event. The worker calls OrchestratorPool.create_instance and the instance becomes available for chat. The HTTP response returns 202 Accepted before the worker finishes; see Hot-reload and AI App pool.
- Chat: Clients call POST /api/chat/completion with X-ORG-ID and X-INSTANCE-ID. See Chat and engine.
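From a client's point of view, the flow above comes down to three HTTP calls. The sketch below only describes those calls as data so it can be handed to any HTTP library. The paths and the X-ORG-ID and X-INSTANCE-ID headers come from the steps above; the HttpCall type, the function names, and the request body shapes are assumptions, and authentication headers are omitted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class HttpCall:
    """Transport-agnostic description of one request (hypothetical helper type)."""

    method: str
    path: str
    headers: dict = field(default_factory=dict)
    body: Optional[dict] = None


def create_call(org_id: str, config: dict) -> HttpCall:
    # Step 1: create the instance record. The body shape is an assumption.
    return HttpCall("POST", f"/api/orgs/{org_id}/orchestrators", body=config)


def load_call(org_id: str, instance_id: str) -> HttpCall:
    # Step 3: request a load. Expect 202 Accepted, not a ready instance.
    return HttpCall("POST", f"/api/orgs/{org_id}/orchestrators/{instance_id}/load")


def chat_call(org_id: str, instance_id: str, message: str) -> HttpCall:
    # Step 4: chat is addressed through headers rather than the URL path.
    headers = {"X-ORG-ID": org_id, "X-INSTANCE-ID": instance_id}
    # The {"message": ...} payload is illustrative, not the documented schema.
    return HttpCall("POST", "/api/chat/completion", headers=headers, body={"message": message})
```

Keeping the calls as plain data makes it easy for testers to assert on paths and headers without a running server.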
Permissions
Routes use roles_allowed with these permission constants:
| Operation | Permission |
|---|---|
| Read instance list and details | cadence:org:orchestrators:read |
| Create, update, delete | cadence:org:orchestrators:write |
| Load, unload, reload | cadence:org:orchestrators:lifecycle |
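For test planning, the table can be written down as a lookup from operation to required permission. This is an illustrative sketch, not how roles_allowed is implemented: the operation names and the is_allowed helper are hypothetical, and only the permission strings are taken from the table.

```python
# Permission strings copied from the table above.
READ = "cadence:org:orchestrators:read"
WRITE = "cadence:org:orchestrators:write"
LIFECYCLE = "cadence:org:orchestrators:lifecycle"

# Hypothetical operation names mapped to the permission each one requires.
REQUIRED_PERMISSION = {
    "list": READ,
    "get": READ,
    "create": WRITE,
    "update": WRITE,
    "delete": WRITE,
    "load": LIFECYCLE,
    "unload": LIFECYCLE,
    "reload": LIFECYCLE,
}


def is_allowed(operation: str, granted: set[str]) -> bool:
    """Return True when the caller's granted permissions cover the operation."""
    return REQUIRED_PERMISSION[operation] in granted
```

Note that in this reading write access does not imply lifecycle access: a caller who can create an AI App may still be unable to load it.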
Startup preloading
During application startup, load_hot_tier_instances runs after plugin catalog sync. It loads all active AI Apps whose tier is hot so they are ready before the first chat request arrives. Instances on the demand tier are not preloaded; they load into the demand pool on first use (or via an explicit load call).
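The selection rule for preloading is a simple filter. The sketch below is not the body of load_hot_tier_instances; the AppRecord type and its field names are assumptions chosen to match the description above.

```python
from dataclasses import dataclass


@dataclass
class AppRecord:
    """Hypothetical view of an AI App record; field names are assumptions."""

    instance_id: str
    tier: str      # "hot" or "demand"
    active: bool


def select_for_preload(records: list[AppRecord]) -> list[str]:
    # Only active apps on the hot tier are loaded at startup;
    # demand-tier apps wait for first use or an explicit load call.
    return [r.instance_id for r in records if r.active and r.tier == "hot"]
```

For example, given one active hot app, one inactive hot app, and one active demand app, only the first would be preloaded.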
Limitations
Section titled “Limitations”- Pool capacity — Large numbers of
hotinstances increase memory and startup time. Usedemandtier for low-traffic AI Apps and promote tohotbefore expected load spikes. - Async load —
202 Acceptedmeans the event was published, not that the instance is ready. Poll pool stats or wait before sending chat requests. - RabbitMQ optional — If the message bus is unavailable,
event_publishermay beNone. Load and unload handlers tolerate this and write the record without broadcasting.
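Because a load request is acknowledged before the instance is ready, callers typically need a bounded wait. The helper below is one possible sketch: wait_until_loaded is a hypothetical name, and the is_loaded callable is supplied by the caller (for instance, a function that reads pool stats, whose endpoint and response shape are not specified here).

```python
import time
from typing import Callable


def wait_until_loaded(
    is_loaded: Callable[[], bool],
    timeout: float = 30.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_loaded until it returns True or the timeout elapses.

    Returns True if the instance became ready, False on timeout.
    sleep and clock are injectable so tests can run without real delays.
    """
    deadline = clock() + timeout
    while True:
        if is_loaded():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The readiness check always runs at least once, so a timeout of zero still reports an instance that is already loaded.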