Once OpenClaw is running production-ready on a rented Mac mini M4 in Canada, the next problem stops being “does it boot?” and becomes “can we upgrade it on a Tuesday morning without paging anyone?” This handbook is the controlled-change companion: how to pin dependencies so versions move when you decide, how to wire Gateway health probes and rolling restarts instead of kill -9, how to set log disk watermarks before APFS bites, and how to separate real remote link jitter from local faults across APAC and North America operators. We close with mid-to-high tier sizing cases so the runbook lines up with the box you actually rent. For deeper triage of token, daemon, and log paths, keep our 7×24 Gateway LaunchDaemon and log error table open in a tab; for a baseline production install, see the Canada M4 production HowTo on Node disk and Channels renewal.
1. Freeze dependencies so upgrades are a decision, not a surprise
Controlled upgrades start before the upgrade. On the M4, lock node, npm, and any global CLI to a single LTS line per host, and commit a working lockfile (package-lock.json, pnpm-lock.yaml, or yarn.lock) into the repo that drives launchd. Avoid npm install -g foo@latest in onboarding scripts; replace it with explicit majors ([email protected]) or full pins so an idle weekend does not silently pull a breaking minor on next reboot. Treat Homebrew the same way: snapshot installed formulae with brew bundle dump after a known-good run and check the Brewfile in alongside the OpenClaw config.
1.1 Two-environment promotion
Keep two profiles on the same Mac if you can spare 50–100 GB: a current live workspace and a candidate directory that mirrors the next pinned set. Run the candidate against a copy of the Channels token in dry-run for an hour before swapping the launchd plist’s WorkingDirectory. This makes upgrades reversible: if the new pin misbehaves, you flip back in seconds without re-installing anything.
git diff, you are not upgrading—you are gambling.
2. Gateway health probes and rolling restarts
OpenClaw’s Gateway on TCP 18789 is the heartbeat your operators feel. Don’t guess at health from CPU graphs; add a tiny probe loop that hits http://127.0.0.1:18789/health with the right token, expects a JSON body within ~500 ms, and writes OK / FAIL with a UTC timestamp to a rotating file. Wire that probe into launchd through a StartInterval job, not into your operator’s shell history.
For rolling restarts, never combine “deploy code” and “restart Gateway” in one step. The safe order is: drain in-flight webhooks (set the channel adapter to maintenance for ~30 s), launchctl bootout the old service, validate the new binary path with which and file, then launchctl bootstrap. Watch the next three probe samples before declaring success in chat. If you run two M4 boxes for redundancy, restart one at a time and only when its peer’s last 5 probes are all OK.
curl -sS -m 2 -H "Authorization: Bearer $OPENCLAW_TOKEN" \ http://127.0.0.1:18789/health \ | jq -e '.status=="ok"' && echo "OK $(date -u +%FT%TZ)" \ || echo "FAIL $(date -u +%FT%TZ)"
3. Log disk watermarks before APFS bites
APFS gets twitchy when free space drops below ~15%, and OpenClaw chat queues plus Node debug logs can fill 50 GB faster than weekly maintenance can clean it. Define three watermarks and act on them automatically:
| Watermark | Free space | Action |
|---|---|---|
| Green | > 25% | Normal rotation; daily compress of yesterday’s logs. |
| Yellow | 15–25% | Page on-call, prune npm caches, drop logs older than 14 days. |
| Red | < 15% | Stop new builds, archive logs to object storage, freeze new channel onboarding until green. |
Use newsyslog or a tiny launchd job, not hand rotation. Keep workspaces and Gateway logs on the data volume you sized for the machine; the boot slice should never be the disk that decides Gateway latency.
4. Remote link jitter vs. real faults: a triage map
Trans-Pacific SSH and hotel Wi-Fi produce uneven typing and occasional UI stalls; they should not move loopback metrics on the Mac. Before you blame OpenClaw, run the same three probes in order: loopback on the Mac, forwarded localhost from your laptop, then the public URL if you expose direct mode. Divergence between probes one and two implicates SSH forwards or bind addresses; divergence between two and three implicates DNS, TLS, or edge firewall rules.
5. Mid-to-high tier M4 sizing cases
The runbook above is identical across tiers, but headroom is not. Two patterns we see most often:
- M4 / 24 GB / 512 GB — mid tier. One Gateway, two or three Channels, ~30 active operators. Holds 60–90 days of compressed logs comfortably and supports a single rolling-restart slot. Good fit for teams who upgrade monthly and only need one APAC time window.
- M4 Pro / 48 GB / 1 TB — high tier. Two Gateway processes (active/standby), heavier Channels mix, parallel build sandboxes. Watermarks rarely turn yellow; you can afford a candidate workspace permanently. Recommended once you serve both NA daytime and APAC overnight from the same node.
6. FAQ: symptom → first action
| What you see | First action |
|---|---|
| Gateway flapped after “tiny” npm upgrade | Diff the lockfile, roll back to the candidate workspace, file the upgrade as a real change. |
Health probe says OK but operators report stalls |
Network path only—measure SSH RTT and jitter, do not bounce Gateway. |
| Disk hits red watermark overnight | Confirm log rotation ran, archive last 14 days, prune node_modules duplicates and APFS snapshots. |
| Rolling restart left two listeners on 18789 | lsof -iTCP:18789 -sTCP:LISTEN, launchctl bootout the orphan, then re-validate token alignment. |
7. Summary
A controlled-change handbook turns OpenClaw upgrades on a 2026 Canada Mac mini M4 from a stress event into a quiet rotation: pinned dependencies, scripted health probes, drained rolling restarts, watermark-based log policies, and a remote-jitter triage that tells you when to leave the Gateway alone. Pick the M4 tier that gives you headroom for a candidate workspace, document the watermark thresholds in the same place as the lockfile, and the next upgrade is a Tuesday morning—not an incident.