
Lifecycle

Two lifecycles matter: the agent lifecycle (how an agent is created, tuned, and retired) and the run lifecycle (how a single execution proceeds).

Agent lifecycle

Create Agent
→ Draft
→ Test on sample task
→ Enable
→ Available for scheduling
→ Disable if failing
→ Clone / tune

Statuses

Status        Meaning
draft         created, not yet validated
enabled       available for scheduling
disabled      manually paused
quarantined   auto-paused after repeated failures
archived      retired

Quarantine policy

Nexus automatically quarantines an agent when either of the following conditions holds:

success_rate < 50% over last 10 runs
OR 3 consecutive critical failures
→ status = quarantined
→ no new tasks assigned

A quarantined agent stays visible in the UI with its failure history so an operator can inspect, tune skills/permissions, and re-enable.
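
The quarantine rule above can be sketched as a small predicate over an agent's run history. This is a minimal illustration, not Nexus's actual implementation: the RunResult record shape, the should_quarantine name, and the choice to apply the success-rate rule only once a full 10-run window exists are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class AgentStatus(str, Enum):
    DRAFT = "draft"
    ENABLED = "enabled"
    DISABLED = "disabled"
    QUARANTINED = "quarantined"
    ARCHIVED = "archived"


@dataclass(frozen=True)
class RunResult:
    # Hypothetical record shape; the real run schema is not shown here.
    succeeded: bool
    critical: bool = False  # True when a failure was classified as critical


WINDOW = 10              # "last 10 runs"
MIN_SUCCESS_RATE = 0.5   # "success_rate < 50%"
MAX_CRITICAL_STREAK = 3  # "3 consecutive critical failures"


def should_quarantine(history: Sequence[RunResult]) -> bool:
    """Return True if either quarantine condition holds.

    `history` is ordered oldest to newest.
    """
    recent = history[-WINDOW:]
    # Assumption: the rate rule is only evaluated once a full window exists.
    low_rate = (
        len(recent) == WINDOW
        and sum(r.succeeded for r in recent) / WINDOW < MIN_SUCCESS_RATE
    )
    tail = history[-MAX_CRITICAL_STREAK:]
    critical_streak = len(tail) == MAX_CRITICAL_STREAK and all(
        (not r.succeeded) and r.critical for r in tail
    )
    return low_rate or critical_streak


def next_status(current: AgentStatus, history: Sequence[RunResult]) -> AgentStatus:
    # Only an enabled agent can be auto-paused; other statuses are left alone.
    if current is AgentStatus.ENABLED and should_quarantine(history):
        return AgentStatus.QUARANTINED
    return current
```

Note that exactly 5 successes in the last 10 runs does not trigger the rule, because the threshold is strictly below 50%.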

Run lifecycle

A single agent run is realized as a Kubernetes Job and proceeds as follows:

1. Nexus loads agent definition from MongoDB
2. Nexus loads attached skills
3. Nexus searches memory for relevant context
4. Nexus builds runtime prompt/config
5. Nexus creates Kubernetes Job
6. Agent runs in isolated pod
7. Agent reports events back to Nexus
8. Nexus updates internal board
9. Nexus syncs to Plane
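
As an illustration of steps 4 and 5, the sketch below assembles a Kubernetes batch/v1 Job manifest as a plain dictionary. The apiVersion, kind, backoffLimit, and restartPolicy fields are standard Job fields. The image name, the label keys, the AGENT_RUN_CONFIG environment variable, and the build_job_manifest helper are hypothetical placeholders, since the real runtime configuration format is not described here.

```python
import json
from typing import Any


def build_job_manifest(
    agent_id: str, run_id: str, runtime_config: dict[str, Any]
) -> dict[str, Any]:
    """Build a batch/v1 Job manifest for one agent run.

    Assumes run_id is already a valid DNS label (lowercase alphanumerics
    and '-'), because it is embedded in the Job name.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": f"agent-run-{run_id}",
            # Hypothetical label keys, used to find the run's pod later.
            "labels": {"agent-id": agent_id, "run-id": run_id},
        },
        "spec": {
            # One attempt per run; retries would be scheduled as new runs.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "agent",
                            # Placeholder image, not a real registry path.
                            "image": "example.invalid/agent-runtime:latest",
                            "env": [
                                {
                                    # Hypothetical variable carrying the
                                    # prompt/config built in step 4.
                                    "name": "AGENT_RUN_CONFIG",
                                    "value": json.dumps(
                                        runtime_config, sort_keys=True
                                    ),
                                }
                            ],
                        }
                    ],
                }
            },
        },
    }
```

The resulting dictionary can be serialized to JSON or YAML and submitted with whatever Kubernetes client the deployment uses. Setting backoffLimit to 0 is a design choice for this sketch: it keeps each run a single attempt.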

Task statuses

draft → planned → ready_for_agent → leased → running
→ waiting_review → (blocked) → done

leased means a worker has claimed the task for a specific run (a distributed lease prevents double-dispatch). waiting_review means a permission gate or a review step needs a human.
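
To show how a lease prevents double-dispatch, here is a minimal in-memory sketch. It is a stand-in for a real distributed lease, which would need an atomic compare-and-set in a shared store. The LeaseTable class, its method names, and the time-to-live expiry are assumptions made for illustration; the lease mechanism Nexus actually uses is not specified in this section.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Lease:
    run_id: str
    expires_at: float


class LeaseTable:
    """In-memory stand-in for a distributed lease store (illustration only)."""

    def __init__(
        self,
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._leases: Dict[str, Lease] = {}

    def claim(self, task_id: str, run_id: str) -> bool:
        """Try to claim task_id for run_id; return False if it is already held."""
        now = self._clock()
        current = self._leases.get(task_id)
        if current is not None and current.expires_at > now:
            return False  # another run holds a live lease: no double-dispatch
        # Assumption: an expired lease may be taken over by a new run.
        self._leases[task_id] = Lease(run_id, now + self._ttl)
        return True

    def release(self, task_id: str, run_id: str) -> None:
        """Release the lease, but only if run_id is the current holder."""
        current = self._leases.get(task_id)
        if current is not None and current.run_id == run_id:
            del self._leases[task_id]
```

In this reading, a task moves from ready_for_agent to leased only when claim returns True, so two workers racing for the same task cannot both start a run.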