Lifecycle
Two lifecycles matter: the agent lifecycle (how an agent is created, tuned, and retired) and the run lifecycle (how a single execution proceeds).
Agent lifecycle
Create Agent
→ Draft
→ Test on sample task
→ Enable
→ Available for scheduling
→ Disable if failing
→ Clone / tune
Statuses
| Status | Meaning |
|---|---|
| draft | created, not yet validated |
| enabled | available for scheduling |
| disabled | manually paused |
| quarantined | auto-paused after repeated failures |
| archived | retired |
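The status table can be mirrored in code as an enumeration. The following is a minimal sketch, not Nexus's actual model: the class name `AgentStatus` and the helper `is_schedulable` are hypothetical, and the only rule it encodes is the one stated in the table (an enabled agent is available for scheduling).

```python
from enum import Enum


class AgentStatus(str, Enum):
    """Agent statuses from the table above (class name is hypothetical)."""

    DRAFT = "draft"              # created, not yet validated
    ENABLED = "enabled"          # available for scheduling
    DISABLED = "disabled"        # manually paused
    QUARANTINED = "quarantined"  # auto-paused after repeated failures
    ARCHIVED = "archived"        # retired


def is_schedulable(status: AgentStatus) -> bool:
    """Return True only for enabled agents; every other status is skipped."""
    return status is AgentStatus.ENABLED
```

Because the enum subclasses `str`, a value read from storage can be parsed directly, for example `AgentStatus("quarantined")`.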
Quarantine policy
Nexus automatically quarantines an agent when either of the following conditions holds:
success_rate < 50% over last 10 runs
OR 3 consecutive critical failures
→ status = quarantined
→ no new tasks assigned
A quarantined agent stays visible in the UI with its failure history so an operator can inspect, tune skills/permissions, and re-enable.
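The quarantine rule above could be evaluated roughly as follows. This is a sketch under stated assumptions: `RunResult` and `should_quarantine` are hypothetical names, the history is ordered oldest first, "3 consecutive critical failures" is read as the three most recent runs, and the success-rate rule is only applied once a full window of 10 runs exists. The source does not specify those last two points.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RunResult:
    """Outcome of one run (hypothetical shape)."""

    succeeded: bool
    critical: bool = False  # True when a failure was classed as critical


def should_quarantine(
    history: Sequence[RunResult],
    window: int = 10,
    min_success_rate: float = 0.5,
    max_consecutive_critical: int = 3,
) -> bool:
    """Apply the two quarantine conditions to a history ordered oldest first."""
    # Condition 2: the most recent N runs are all critical failures.
    tail = history[-max_consecutive_critical:]
    if len(tail) == max_consecutive_critical and all(
        (not r.succeeded) and r.critical for r in tail
    ):
        return True

    # Condition 1: success rate below the threshold over the last `window` runs.
    # Assumption: do not judge the rate until a full window is available.
    recent = history[-window:]
    if len(recent) < window:
        return False
    success_rate = sum(r.succeeded for r in recent) / window
    return success_rate < min_success_rate
```

Note that the comparison is strict: an agent with exactly 5 successes in its last 10 runs has a 50% success rate and is not quarantined by the first condition.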
Run lifecycle
A single agent run is realized as a Kubernetes Job and proceeds as follows:
1. Nexus loads agent definition from MongoDB
2. Nexus loads attached skills
3. Nexus searches memory for relevant context
4. Nexus builds runtime prompt/config
5. Nexus creates Kubernetes Job
6. Agent runs in isolated pod
7. Agent reports events back to Nexus
8. Nexus updates internal board
9. Nexus syncs to Plane
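Steps 4 and 5 can be illustrated by building a Job manifest from the runtime config. This is only a sketch: the function name, the label keys, the `AGENT_RUNTIME_CONFIG` environment variable, and the choice of `backoffLimit: 0` (which assumes retries are handled by Nexus rather than by Kubernetes) are all assumptions, not part of the documented design. The `batch/v1` Job structure itself is standard Kubernetes, where a Job's pod template must use a `restartPolicy` of `Never` or `OnFailure`.

```python
import json


def build_job_manifest(
    agent_id: str, run_id: str, image: str, runtime_config: dict
) -> dict:
    """Return a batch/v1 Job manifest for one agent run (hypothetical layout)."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": f"agent-run-{run_id}",
            # Label keys are illustrative; they let events be traced to a run.
            "labels": {"nexus/agent-id": agent_id, "nexus/run-id": run_id},
        },
        "spec": {
            # Assumption: Nexus decides on retries, so Kubernetes does not retry.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "agent",
                            "image": image,
                            # Pass the built prompt/config to the pod as JSON.
                            "env": [
                                {
                                    "name": "AGENT_RUNTIME_CONFIG",
                                    "value": json.dumps(runtime_config),
                                }
                            ],
                        }
                    ],
                }
            },
        },
    }
```

The resulting dictionary can be serialized and submitted with any Kubernetes client. Large configs would more likely be mounted from a ConfigMap or fetched by the agent at startup than passed in an environment variable.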
Task statuses
draft → planned → ready_for_agent → leased → running
→ waiting_review → (blocked) → done
leased means a worker has claimed the task for a specific run (a distributed
lease prevents double-dispatch). waiting_review means a permission gate or a
review step needs a human.
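The status chain and the lease can be sketched as a small transition table. Everything here is illustrative: `ALLOWED`, `advance`, `Task`, and `claim` are hypothetical names, the table encodes only the literal chain shown above (with `blocked` as an optional stop between `waiting_review` and `done`), and the in-memory check stands in for what would need to be an atomic conditional update in a real distributed lease.

```python
from dataclasses import dataclass
from typing import Optional

# Literal reading of the chain above; a real system likely allows more edges.
ALLOWED = {
    "draft": {"planned"},
    "planned": {"ready_for_agent"},
    "ready_for_agent": {"leased"},
    "leased": {"running"},
    "running": {"waiting_review"},
    "waiting_review": {"blocked", "done"},
    "blocked": {"done"},
    "done": set(),
}


def advance(current: str, target: str) -> str:
    """Return the new status, or raise if the transition is not allowed."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target


@dataclass
class Task:
    status: str = "draft"
    leased_by: Optional[str] = None  # run that currently holds the lease


def claim(task: Task, run_id: str) -> bool:
    """Lease a ready task for one run; refuse if it is already claimed."""
    if task.status != "ready_for_agent":
        return False
    task.status = advance(task.status, "leased")
    task.leased_by = run_id
    return True
```

With this sketch, a second `claim` on the same task returns `False` because the status has already moved to `leased`, which is the double-dispatch protection described above.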
Related
- Agents · Permissions
- Runtime flows — the detailed task and run flows