Operations
How to run Nexus locally and deploy it on Kubernetes via GitOps.
Prerequisites
- Rust (stable): rustup default stable
- Node 22+ and pnpm (for nexus-ui)
- MongoDB 6+ reachable on the network
- NATS with JetStream enabled
- A Kubernetes cluster + kubectl for agent runs
- A Telegram bot token (optional, for the gateway)
- A Plane workspace + API key (optional, for board sync)
Run locally
MongoDB + NATS
docker run -d --name nexus-mongo -p 27017:27017 mongo:6
docker run -d --name nexus-nats -p 4222:4222 nats:latest -js
Nexus Core
cd apps/nexus-core
cp .env.example .env
cargo run --release
Minimum settings:
MONGODB_URL=mongodb://localhost:27017
MONGODB_DB=nexus
NATS_URL=nats://localhost:4222
KUBE_NAMESPACE=nexus-agents
JWT_SIGNING_KEY=<random string>
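One way to produce the random string for JWT_SIGNING_KEY is to hex-encode bytes from the kernel's random source. This is a minimal sketch: the 32-byte length and the hex encoding are assumptions, since the settings above only ask for a random string.

```shell
# Read 32 random bytes from /dev/urandom and hex-encode them (64 hex characters).
# Length and encoding are assumptions; the settings only require "a random string".
JWT_SIGNING_KEY="$(head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n')"

# Print the value in .env form so the line can be pasted into apps/nexus-core/.env.
printf 'JWT_SIGNING_KEY=%s\n' "$JWT_SIGNING_KEY"
```

Generate a separate key per environment rather than reusing a local one.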
Verify:
curl http://localhost:8080/healthz
open http://localhost:8080/swagger-ui/
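With cargo run --release, the first build can take a while, so a single curl may run before Core is listening. The helper below is a hypothetical sketch (the function name, attempt count, and delay are not part of Nexus) that polls the health endpoint until it answers without an HTTP error.

```shell
# Hypothetical helper: poll a URL until curl succeeds or the attempts run out.
# Usage: wait_for_healthz URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_healthz() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-1}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP status >= 400; -s silences progress output.
    if curl -fs -o /dev/null "$url" 2>/dev/null; then
      echo "ready after $i attempt(s): $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after $attempts attempt(s): $url" >&2
  return 1
}
```

For example, `wait_for_healthz http://localhost:8080/healthz 60 2` waits up to roughly two minutes before giving up.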
Worker
cd apps/nexus-worker
cargo run --release
UI
The admin UI is a separate repo (nexus-ui):
cd ../nexus-ui # the standalone Next.js repo
cp .env.example .env.local
pnpm install
pnpm dev # http://localhost:3000
Settings in .env.local:
NEXUS_CORE_URL=http://localhost:8080
NEXUS_ADMIN_TOKEN=dev-token
AUTH_SECRET=at-least-16-chars-long-secret
Deploy on Kubernetes (GitOps)
Deployment lives in nexus-gitops: Helm charts, per-environment values, Argo CD apps / Flux manifests, and the MongoDB + NATS manifests.
# Apply the Argo CD app for an environment
kubectl apply -f argocd/nexus-dev.yaml
Namespaces
Run agent jobs in a dedicated namespace (e.g. nexus-agents), separate
from the control plane (nexus). Core's service account can create Jobs only in
the agent namespace.
Scoped RBAC
verbs: [create, get, list, watch, delete]
resources: [jobs, pods, pods/log]
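The verbs and resources above could be expressed as a namespaced Role bound to Core's service account. The sketch below writes such a manifest to a file. The Role name (nexus-agent-runner), the service account name (nexus-core), and the file name are assumptions for illustration; the namespaces follow the example names above. Jobs live in the batch API group and pods in the core group, so the single verb list is repeated in two rules.

```shell
# Write a Role + RoleBinding for the agent namespace (names are hypothetical).
cat > nexus-agent-rbac.yaml <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: nexus-agent-runner        # hypothetical name
  namespace: nexus-agents         # agent namespace from the example above
rules:
  - apiGroups: ["batch"]          # Jobs are in the batch API group
    resources: ["jobs"]
    verbs: ["create", "get", "list", "watch", "delete"]
  - apiGroups: [""]               # pods and pods/log are in the core group
    resources: ["pods", "pods/log"]
    verbs: ["create", "get", "list", "watch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: nexus-agent-runner
  namespace: nexus-agents
subjects:
  - kind: ServiceAccount
    name: nexus-core              # hypothetical service account name
    namespace: nexus              # control-plane namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: nexus-agent-runner
EOF
```

After `kubectl apply -f nexus-agent-rbac.yaml`, running `kubectl auth can-i create jobs --namespace nexus-agents --as system:serviceaccount:nexus:nexus-core` should answer yes, and the same check against the nexus namespace should answer no. Using a Role rather than a ClusterRole is what keeps the permission confined to one namespace.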
Observability
Nexus components export Prometheus metrics on :9100:
nexus_agent_runs_total
nexus_agent_run_failures_total
nexus_task_duration_seconds
nexus_plane_sync_errors_total
nexus_llm_tokens_total
nexus_llm_cost_usd_total
Traces via OpenTelemetry (opentelemetry-otlp). Recommended dashboards: run
throughput/failures, task duration, LLM token/cost, Plane sync errors, NATS
consumer lag, MongoDB pool/latency.
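The counters above can also drive alerts. As a rough sketch, the snippet below writes a Prometheus rule file with two alerts built from the listed metric names. The alert names, thresholds, windows, and file name are illustrative assumptions and should be tuned to real traffic.

```shell
# Write an example Prometheus alerting rule file (thresholds are assumptions).
cat > nexus-alerts.yaml <<'EOF'
groups:
  - name: nexus
    rules:
      - alert: NexusAgentRunFailureRateHigh   # hypothetical alert name
        # Share of failed runs over the last 5 minutes, across all label sets.
        expr: |
          sum(rate(nexus_agent_run_failures_total[5m]))
            / sum(rate(nexus_agent_runs_total[5m])) > 0.10
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "More than 10% of agent runs are failing"
      - alert: NexusPlaneSyncErrors           # hypothetical alert name
        # Any Plane sync error in the last 15 minutes.
        expr: sum(increase(nexus_plane_sync_errors_total[15m])) > 0
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Plane board sync is reporting errors"
EOF
```

If promtool is available, `promtool check rules nexus-alerts.yaml` validates the file before it goes into the GitOps repo.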
Backups
- MongoDB: replica set (3 nodes), daily mongodump, encrypted, off-site with object-lock, quarterly restore drills.
- NATS JetStream: file storage with replicas; streams are the durable event log.
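The daily mongodump step might look like the helper below, which a cron job or Kubernetes CronJob could call. This is only a sketch: the function name, archive naming, and output directory are assumptions, and encryption and the off-site upload are deliberately left out because they depend on the storage provider.

```shell
# Hypothetical helper: dump the database named in the URI to a dated gzip archive.
# Usage: nexus_mongo_backup MONGODB_URI OUTPUT_DIR
nexus_mongo_backup() {
  uri="$1"
  out_dir="$2"
  # UTC timestamp keeps archive names sortable and unambiguous.
  stamp="$(date -u +%Y%m%dT%H%M%SZ)"
  archive="$out_dir/nexus-$stamp.archive.gz"
  mkdir -p "$out_dir" || return 1
  # --archive writes a single file; --gzip compresses it.
  mongodump --uri="$uri" --gzip --archive="$archive" || return 1
  # Print the archive path so a later step can encrypt and upload it.
  printf '%s\n' "$archive"
}
```

For example, `nexus_mongo_backup "mongodb://localhost:27017/nexus" /var/backups/nexus` prints the path of the new archive. A restore drill would feed the same file to `mongorestore --gzip --archive=FILE` against a scratch instance.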
Secrets
- .env files never committed.
- Production secrets from a secrets manager (Vault / cloud secret manager / sealed-secrets in GitOps).
- LLM provider keys, Plane API key, Telegram token, JWT signing key — all from the secret manager, rotated on a schedule.
Production checklist
- MongoDB replica set + backups tested via restore
- NATS JetStream durable streams configured
- Agent namespace + scoped RBAC applied
- Metrics + alerting deployed and verified
- Secrets in a manager, rotation scheduled
- Approval gates configured for sensitive actions
- Quarantine policy thresholds set
- Plane integration (if used) rate-limited + webhook HMAC verified
- Telegram allowlist locked down