Operations

How to run Nexus locally and deploy it on Kubernetes via GitOps.

Prerequisites

  • Rust (stable) — rustup default stable
  • Node 22+ and pnpm (for nexus-ui)
  • MongoDB 6+ reachable on the network
  • NATS with JetStream enabled
  • A Kubernetes cluster + kubectl for agent runs
  • A Telegram bot token (optional, for the gateway)
  • A Plane workspace + API key (optional, for board sync)

Run locally

MongoDB + NATS

docker run -d --name nexus-mongo -p 27017:27017 mongo:6
docker run -d --name nexus-nats -p 4222:4222 nats:latest -js

Nexus Core

cd apps/nexus-core
cp .env.example .env
cargo run --release

Minimum settings:

MONGODB_URL=mongodb://localhost:27017
MONGODB_DB=nexus
NATS_URL=nats://localhost:4222
KUBE_NAMESPACE=nexus-agents
JWT_SIGNING_KEY=<random string>
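JWT_SIGNING_KEY only needs to be an unpredictable string. A minimal sketch of one way to generate it, assuming a typical Linux or macOS shell with /dev/urandom available; the helper name gen_signing_key is illustrative and not part of Nexus:

```shell
# Hypothetical helper (not shipped with Nexus): print 32 random bytes
# as 64 lowercase hex characters, followed by a newline.
gen_signing_key() {
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
  echo
}
```

The output of gen_signing_key can be pasted into .env as the value of JWT_SIGNING_KEY. Use a different key per environment.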

Verify:

curl http://localhost:8080/healthz
open http://localhost:8080/swagger-ui/ # macOS; on Linux use xdg-open
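Core may take a few seconds to start, so a script that runs the health check right after cargo run can fail spuriously. A small polling sketch, assuming curl is installed; the function name wait_for_http and the default of 30 attempts are illustrative choices:

```shell
# Hypothetical helper: poll a URL once per second until curl gets a
# response without an HTTP error (status below 400), or attempts run out.
wait_for_http() {
  url="$1"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP 4xx/5xx; -sS hides progress
    # output but still prints errors.
    if curl -fsS -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, wait_for_http http://localhost:8080/healthz 60 returns 0 as soon as the endpoint responds and 1 if it never does within 60 tries.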

Worker

cd apps/nexus-worker
cargo run --release

UI

The admin UI is a separate repo (nexus-ui):

cd ../nexus-ui # the standalone Next.js repo
cp .env.example .env.local
pnpm install
pnpm dev # http://localhost:3000

Minimum settings in .env.local:

NEXUS_CORE_URL=http://localhost:8080
NEXUS_ADMIN_TOKEN=dev-token
AUTH_SECRET=at-least-16-chars-long-secret

Deploy on Kubernetes (GitOps)

Deployment lives in nexus-gitops: Helm charts, per-environment values, Argo CD apps / Flux manifests, and the MongoDB + NATS manifests.

# Apply the Argo CD app for an environment
kubectl apply -f argocd/nexus-dev.yaml

Namespaces

Run agent jobs in a dedicated namespace (e.g. nexus-agents), separate from the control plane (nexus). Core's service account can create Jobs only in the agent namespace.

Scoped RBAC

verbs: [create, get, list, watch, delete]
resources: [jobs, pods, pods/log]

See Security & permissions.
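One way to confirm the scoping is to ask the API server what Core's service account may do. The sketch below uses kubectl auth can-i with impersonation. It assumes the service account is named nexus-core and lives in the nexus namespace (both names are guesses, so adjust them), and that the caller is allowed to impersonate service accounts. It checks Jobs only, not pods or pods/log.

```shell
# Assumptions: Core's service account is "nexus-core" in namespace "nexus",
# and agents run in "nexus-agents". Adjust to match your install.
check_agent_rbac() {
  core_ns="${1:-nexus}"
  agent_ns="${2:-nexus-agents}"
  sa="system:serviceaccount:${core_ns}:${3:-nexus-core}"

  # Every verb from the scoped rule should be allowed in the agent namespace.
  for verb in create get list watch delete; do
    if ! kubectl auth can-i "$verb" jobs.batch --as="$sa" -n "$agent_ns" >/dev/null 2>&1; then
      echo "denied: $verb jobs in $agent_ns"
      return 1
    fi
  done

  # Creating Jobs in the control-plane namespace should be denied.
  if kubectl auth can-i create jobs.batch --as="$sa" -n "$core_ns" >/dev/null 2>&1; then
    echo "too broad: create jobs allowed in $core_ns"
    return 1
  fi
  echo "ok"
}
```

Because stderr is suppressed, a kubectl error (for example, missing impersonation rights) is reported the same way as a denial. Rerun the failing kubectl auth can-i command by hand to see the cause.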

Observability

Nexus components export Prometheus metrics on :9100:

nexus_agent_runs_total
nexus_agent_run_failures_total
nexus_task_duration_seconds
nexus_plane_sync_errors_total
nexus_llm_tokens_total
nexus_llm_cost_usd_total
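A quick smoke test for the metrics endpoint is to scrape it once and list any of the names above that are absent. This sketch reads Prometheus text format on stdin; the function name is illustrative, and the /metrics path in the usage note is the conventional default rather than something confirmed for Nexus.

```shell
# Hypothetical helper: read a Prometheus text-format scrape on stdin and
# print each expected metric family that has no sample line.
missing_nexus_metrics() {
  body="$(cat)"
  for name in \
    nexus_agent_runs_total \
    nexus_agent_run_failures_total \
    nexus_task_duration_seconds \
    nexus_plane_sync_errors_total \
    nexus_llm_tokens_total \
    nexus_llm_cost_usd_total
  do
    # Prefix match, so suffixed series such as _bucket, _sum and _count
    # also count; "# HELP" and "# TYPE" lines are ignored by the ^ anchor.
    printf '%s\n' "$body" | grep -q "^${name}" || echo "$name"
  done
}
```

Typical use would be curl -s http://localhost:9100/metrics | missing_nexus_metrics, where empty output means every family was found. A metric that has not been observed yet (for example, no failures so far) may legitimately be missing from a fresh process.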

Traces are exported via OpenTelemetry (opentelemetry-otlp). Recommended dashboards: run throughput/failures, task duration, LLM token/cost, Plane sync errors, NATS consumer lag, MongoDB pool/latency.

Backups

  • MongoDB: replica set (3 nodes), daily mongodump, encrypted, off-site with object-lock, quarterly restore drills.
  • NATS JetStream: file storage with replicas; streams are the durable event log.
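The daily mongodump step could look roughly like the sketch below. It reuses the MONGODB_URL and MONGODB_DB settings from the local setup and writes a timestamped, gzip-compressed archive. The function name and file naming scheme are illustrative, and encryption and off-site upload are deliberately left out because they depend on your storage provider.

```shell
# Hypothetical helper: dump one database into a timestamped gzip archive
# and print the archive path on success.
backup_mongo() {
  uri="${MONGODB_URL:-mongodb://localhost:27017}"
  db="${MONGODB_DB:-nexus}"
  out_dir="${1:-.}"
  stamp="$(date -u +%Y%m%dT%H%M%SZ)"
  archive="${out_dir}/${db}-${stamp}.archive.gz"
  # --archive writes a single file; --gzip compresses it.
  mongodump --uri="$uri" --db="$db" --gzip --archive="$archive" && echo "$archive"
}
```

Calling backup_mongo /var/backups prints the path of the new archive. For a restore drill, mongorestore accepts the same --gzip and --archive options and can be pointed at a scratch instance.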

Secrets

  • .env files are never committed.
  • Production secrets come from a secrets manager (Vault / cloud secrets manager / sealed-secrets in GitOps).
  • LLM provider keys, Plane API key, Telegram token, and JWT signing key all come from the secrets manager and are rotated on a schedule.

Production checklist

  • MongoDB replica set + backups tested via restore
  • NATS JetStream durable streams configured
  • Agent namespace + scoped RBAC applied
  • Metrics + alerting deployed and verified
  • Secrets in a manager, rotation scheduled
  • Approval gates configured for sensitive actions
  • Quarantine policy thresholds set
  • Plane integration (if used) rate-limited + webhook HMAC verified
  • Telegram allowlist locked down