
Troubleshooting the Dev Environment

Start every triage with the same three commands:

kilter status    # pod states — look for CrashLoopBackOff, ImagePullBackOff
kilter logs <service>
kilter env       # verify connection strings and ports

Symptom → fix

Symptom | Likely cause | Fix
Cluster dead after reboot | KIND containers stop on host reboot | kilter up (it detects the dead cluster and recreates it)
Pod in CrashLoopBackOff (postgres) | Data directory corruption | kilter db reset
Pod in CrashLoopBackOff (app) | Missing env vars | Check kilter env and your .env file
Pod in CrashLoopBackOff (app, schema errors in logs) | Schema drift: code expects tables that don't exist | kilter db migrate
ImagePullBackOff | KIND can't pull private images | Use a public image; check kilter catalog for defaults
Port already in use | Another process holds a kilter-allocated port | Find it via kilter env, kill the process, then kilter down && kilter up
"Connection refused" app → service | Service pod down, or app using localhost instead of cluster DNS | kilter status + kilter logs <service>; in-cluster URLs are <service>.<namespace>.svc:<port>
Code edits don't reach the pod | Tilt live_update sync rules miss your directory | See "Edits not syncing into the pod" below
Login/Keto 500s after DB reset | Stale Ory nid cache | See "Ory 500s after postgres restart or kilter db reset" below
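
For the "Connection refused" row, a quick way to spot the localhost mistake is to scan your .env file. The following is a minimal sketch, assuming connection strings are stored as KEY=value lines and are read by code running inside a pod; the function names and the sample service name are hypothetical and not part of kilter:

```shell
# Hypothetical helper: list .env entries that point at localhost or 127.0.0.1.
# Code running inside a pod should use <service>.<namespace>.svc:<port> instead.
find_localhost_urls() {
  env_file="${1:-.env}"
  # Ignore anything after a '#', then print matching lines with line numbers.
  grep -nE '^[^#]*(localhost|127\.0\.0\.1)' "$env_file"
}

# Build the in-cluster address from service, namespace, and port.
cluster_addr() {
  printf '%s.%s.svc:%s\n' "$1" "$2" "$3"
}
```

For example, cluster_addr postgres demo-dev 5432 prints postgres.demo-dev.svc:5432. Values that host-side tools use on purpose may legitimately stay on localhost, so treat the output as a list of candidates rather than errors.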

Edits not syncing into the pod

This fails silently: Tilt runs fine and the pod stays up, but edits in an unsynced directory never reach the container. Files that are in the sync list (like package.json) still hot-reload, which masks the bug.

Diagnose by comparing mtimes: edit a file, then stat it on the host and inside the pod (kubectl exec ... -- stat /app/<file>, using the project kubeconfig at ~/.cache/kilter/<name>/kubeconfig). A recent mtime on the host combined with a stale mtime in the pod confirms that the file is not being synced.
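
The comparison can be scripted. This is a rough sketch rather than a kilter feature: sync_verdict is a hypothetical helper, the 5-second tolerance is an arbitrary allowance for sync delay and clock skew, and the commented usage assumes the <name>-dev namespace used elsewhere on this page and a stat in the pod that supports -c (GNU coreutils or BusyBox).

```shell
# Hypothetical helper: compare a file's host mtime with its mtime in the pod.
# Both arguments are epoch seconds. Prints "stale" when the pod copy is older
# than the host copy by more than the tolerance, otherwise "synced".
sync_verdict() {
  host_mtime="$1"; pod_mtime="$2"; tolerance="${3:-5}"
  if [ $((host_mtime - pod_mtime)) -gt "$tolerance" ]; then
    echo "stale"
  else
    echo "synced"
  fi
}

# Usage sketch; <name>, <pod>, <host-path>, and <file> are placeholders.
# export KUBECONFIG=~/.cache/kilter/<name>/kubeconfig
# host=$(stat -c %Y <host-path>)     # GNU stat; on macOS use: stat -f %m
# pod=$(kubectl exec <pod> -n <name>-dev -- stat -c %Y /app/<file>)
# sync_verdict "$host" "$pod"
```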

Fix: re-run kilter render so the generator emits sync rules matching your layout. If you've ejected the Tiltfile, splice the missing sync() lines in yourself, and confirm every top-level source directory appears in live_update.
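
For an ejected Tiltfile, the "every top-level source directory" check can be approximated with grep. The sketch below is an assumption-laden illustration: unsynced_dirs is a hypothetical helper, it expects rules written as sync('./<dir>', ...) or sync("<dir>/...", ...), and the skip list of non-source directories is only an example.

```shell
# Hypothetical check: print top-level directories that have no sync() rule.
# Arguments: project root (default .), Tiltfile path (default <root>/Tiltfile).
unsynced_dirs() {
  project_root="${1:-.}"
  tiltfile="${2:-$project_root/Tiltfile}"
  for path in "$project_root"/*/; do
    [ -d "$path" ] || continue
    dir=$(basename "$path")
    # Example skip list; adjust to whatever is not source in your project.
    case "$dir" in node_modules|dist|build) continue ;; esac
    # Match sync( followed by a quoted path that starts with the directory name.
    if ! grep -Eq "sync\([[:space:]]*['\"](\./)?$dir(/|['\"])" "$tiltfile"; then
      echo "$dir"
    fi
  done
}
```

An empty result means every checked directory has at least one matching rule; it does not prove the rule's destination path is correct.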

Ory 500s after postgres restart or kilter db reset

Ory services (Kratos, Keto, Hydra) cache a network ID (nid) in memory. A DB reset invalidates it and every Ory call starts returning 500 (foreign-key errors on nid in the logs).

Liveness probes auto-restart the pods within about 90 seconds. For immediate recovery:

export KUBECONFIG=~/.cache/kilter/<name>/kubeconfig
kubectl rollout restart deployment ory-kratos ory-keto ory-hydra -n <name>-dev
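
To know when the restart has finished before retrying a login, you can wait on each deployment with kubectl rollout status. A minimal sketch, assuming KUBECONFIG is already exported as above; wait_for_ory is a hypothetical wrapper and the 120-second timeout is an arbitrary choice:

```shell
# Hypothetical helper: block until the restarted Ory deployments are ready.
# Argument: the project <name>; the namespace is derived as <name>-dev.
wait_for_ory() {
  namespace="$1-dev"
  for deploy in ory-kratos ory-keto ory-hydra; do
    # rollout status exits non-zero if the rollout does not finish in time.
    kubectl rollout status "deployment/$deploy" -n "$namespace" --timeout=120s || return 1
  done
}
```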

Restart vs stop vs destroy

Goal | Command | What survives
One stuck pod | kilter restart <service> | Everything else
Pause Tilt briefly | kilter down | Cluster and pods keep running
End of day, free RAM | kilter stop | State preserved on disk; kilter up resumes in ~30s
Corrupted state, start over | kilter destroy && kilter up | Nothing; full recreate

Destroy is the last resort

kilter destroy loses all cluster state including the database. Try kilter restart, kilter db reset, or the Ory rollout-restart first — most "everything is broken" states have a narrower fix.

Disk filling up

Tilt loads a fresh app image into KIND on every rebuild, so the node's disk usage grows until it fills the host disk. Reclaim space without tearing down:

kilter prune            # prunes kilter-built images, keeps recent builds
kilter prune --dry-run  # preview first
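
If you would rather catch this before the disk is full, a small check can trigger the dry run for you. This is a sketch under assumptions: both functions are hypothetical, the 80% threshold is arbitrary, and checking / only makes sense where container storage lives on the host filesystem (typically a Linux host; Docker Desktop keeps images inside its own VM).

```shell
# Hypothetical helper: print the used-space percentage for the filesystem
# that holds the given path (default /).
disk_used_pct() {
  # df -P prints one POSIX-format line per filesystem; field 5 is capacity.
  df -P "${1:-/}" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Preview a prune once usage reaches the threshold (default 80 percent).
prune_preview_if_full() {
  threshold="${1:-80}"; path="${2:-/}"
  used=$(disk_used_pct "$path")
  if [ "$used" -ge "$threshold" ]; then
    echo "Disk at ${used}% on $path; previewing prune"
    kilter prune --dry-run
  fi
}
```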

Still stuck: go direct

export KUBECONFIG=~/.cache/kilter/<name>/kubeconfig
kubectl describe pod <pod> -n <name>-dev          # events, restart reasons
kubectl get events -n <name>-dev --sort-by=.lastTimestamp
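
For a pod in CrashLoopBackOff, the current container has often just restarted and has little to say, so the previous instance is usually more informative. A minimal sketch using standard kubectl options; crash_report is a hypothetical wrapper and it assumes KUBECONFIG is exported as above:

```shell
# Hypothetical helper: show why a pod's last container exited and what it logged.
# Arguments: pod name, project <name> (namespace is derived as <name>-dev).
crash_report() {
  pod="$1"; namespace="$2-dev"
  # Termination reason of the previous container instance (e.g. OOMKilled, Error).
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
  echo
  # Last 50 log lines from the previous container instance.
  kubectl logs "$pod" -n "$namespace" --previous --tail=50
}
```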