Verifying the connection
Four signals show that the agent is talking to the orchestrator correctly: the dashboard state, the agent's logs, the heartbeat cadence, and the first backup round-trip. When one of them is off, this page tells you where to look.
Signal 1: the dashboard Agents row
Open Agents in the dashboard. A healthy agent shows:
- Name as you set it during Add Agent.
- Green dot next to the name. The dashboard marks an agent green if its most recent heartbeat was within the last three heartbeat intervals (default three minutes).
- Last seen reading a short relative time: "just now", "1m ago", "2m ago".
- Agent version string matching what you installed.
If any of those is wrong:
- Red or grey dot: the agent has not sent a heartbeat in at least nine minutes. Check the agent's systemd status.
- "Last seen: never": setup completed, but the service was not started. Run sudo systemctl enable --now restorable.service.
- Version string stale after an upgrade: the service needs a restart (systemctl restart restorable).
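The green-dot rule described above can be sketched as a small shell function. This is only an illustration of the rule as this page states it, not the dashboard's actual code: the function name dot_state, its arguments, and its output strings are hypothetical, and it reports "not-green" rather than red or grey because this page does not say how those two states differ.

```shell
# dot_state AGE_SECONDS [INTERVAL_SECONDS]
# Hypothetical helper: prints "green" when the last heartbeat is no older
# than three heartbeat intervals, and "not-green" otherwise.
dot_state() {
    age=$1
    interval=${2:-60}            # default heartbeat interval: 60 seconds
    window=$((interval * 3))     # three intervals (three minutes by default)
    if [ "$age" -le "$window" ]; then
        echo green
    else
        echo not-green
    fi
}
```

For example, dot_state 45 prints green, and dot_state 181 prints not-green.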
Signal 2: the agent's journalctl
sudo journalctl -u restorable.service -f
A healthy start looks like this:
systemd[1]: Started Restorable agent.
restorable[1234]: ✓ Agent running in orchestrator-driven mode
restorable[1234]: Agent ID: agent_01hz...
restorable[1234]: Orchestrator: https://app.restorable.app
restorable[1234]: Heartbeat: 1m0s
Every sixty seconds thereafter the agent logs a heartbeat line. These lines are hidden at info level; --verbose shows them. A heartbeat failure is logged loudly and looks like:
restorable[1234]: heartbeat: Post "https://app.restorable.app/v1/heartbeat": dial tcp: ...
Common causes: orchestrator unreachable (DNS, firewall, TLS), token revoked in the dashboard, clock skew beyond the 5-minute tolerance. See Troubleshooting below.
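As a rough triage aid, the common causes listed above can be mapped from a failing log line with a shell case statement. This is a sketch under assumptions: the function name heartbeat_cause is hypothetical, and the exact wording the agent uses for 401 and 403 failures in its log is assumed to match the error titles in the Troubleshooting section.

```shell
# heartbeat_cause LOG_LINE
# Hypothetical helper: guesses the failing layer from one heartbeat error line.
# The matched substrings are assumptions based on the errors named on this page.
heartbeat_cause() {
    case "$1" in
        *x509*)               echo tls ;;      # CA bundle old or missing
        *"401 Unauthorized"*) echo auth ;;     # token rotated or agent revoked
        *"403 Forbidden"*)    echo clock ;;    # clock skew beyond tolerance
        *"dial tcp"*)         echo network ;;  # DNS or firewall
        *)                    echo unknown ;;
    esac
}

# Possible usage: classify the most recent heartbeat error in the journal.
# heartbeat_cause "$(sudo journalctl -u restorable.service -n 30 | grep 'heartbeat:' | tail -n 1)"
```

The order of the patterns matters only in that the more specific strings are checked before the generic dial tcp match.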
Signal 3: the heartbeat cadence in the dashboard
Open the agent's detail view. The Events timeline shows heartbeats as dots on a minute-grained axis. A healthy cadence looks like one heartbeat every sixty seconds, with at most a 2-3 second jitter.
Patterns that indicate trouble:
- Gaps of exactly sixty seconds (a single missing dot). One heartbeat failed and the retry succeeded. Transient; log it and move on.
- Gaps of several minutes. Network flapping or the orchestrator was unreachable. The agent kept running and queued nothing (heartbeats are fire-and-forget); the dashboard just did not hear.
- Clustered pairs. Two agents talking to the orchestrator with the same agent_id. Do not run two instances of the same agent. If you are moving the agent to a new host, revoke and re-register.
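If you have heartbeat times as epoch seconds, one per line in ascending order, the patterns above can be flagged mechanically. The sketch below is a heuristic, not a dashboard feature: the function name heartbeat_gaps is hypothetical, this page does not describe a way to export the Events timeline, and the defaults (60-second interval, 3-second jitter) are taken from the healthy cadence described in this section. Note that it measures the spacing between consecutive heartbeats, so a single missed heartbeat shows up as a spacing of about 120 seconds.

```shell
# heartbeat_gaps [INTERVAL_SECONDS] [JITTER_SECONDS]
# Hypothetical helper: reads ascending epoch-second timestamps on stdin and
# prints a line for every spacing that falls outside interval +/- jitter.
heartbeat_gaps() {
    interval=${1:-60}
    jitter=${2:-3}
    awk -v iv="$interval" -v j="$jitter" '
        NR > 1 {
            d = $1 - prev
            if (d > iv + j)      print "gap " d "s after " prev   # missed heartbeat(s)
            else if (d < iv - j) print "pair " d "s after " prev  # possible duplicate agent
        }
        { prev = $1 }
    '
}
```

For the input 0, 60, 121, 241, 243 this reports a 120-second gap after 121 and a 2-second pair after 241.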
Signal 4: the first backup round-trip
The definitive test of end-to-end connectivity. From the dashboard, open a source and click Run backup now. You should see, within 60 seconds:
- Dashboard event: command_delivered: run_backup.
- Agent log: running command cmd_01hz... (run_backup), followed by the dump command output.
- Dashboard event: backup_completed.
- The new row in the Backups table for that source, with a size and SHA-256.
If the command is delivered but backup_completed never arrives, the agent hit a real failure. See Add a Postgres source, the "Common failures" section.
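To confirm the agent-side half of the round-trip from a terminal, you can filter the journal for the command line the agent logs. This is a minimal sketch: the function name backup_command_ids is hypothetical, and the pattern assumes the log line has exactly the shape shown in the list above (running command, a cmd_ identifier, then run_backup in parentheses).

```shell
# backup_command_ids
# Hypothetical helper: reads agent log lines on stdin and prints the command ID
# of every run_backup command the agent reports running.
backup_command_ids() {
    sed -n 's/.*running command \(cmd_[^ ]*\) (run_backup).*/\1/p'
}

# Possible usage, right after clicking Run backup now:
# sudo journalctl -u restorable.service --since "2 minutes ago" | backup_command_ids
```

An empty result after 60 seconds suggests the command never reached the agent, which points back at the heartbeat signals above rather than at the backup itself.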
Troubleshooting by symptom
Dashboard shows "Waiting for first heartbeat"
Setup completed, but either the service never started or the first heartbeat failed. Run:
sudo systemctl status restorable.service
sudo journalctl -u restorable.service -n 30
The first line of journal output after service start tells you which layer failed: config parsing, orchestrator URL resolution, TLS, or authentication.
Error: 401 Unauthorized on heartbeat
The agent's bearer token was invalidated on the orchestrator side. Either you rotated the token in the dashboard and did not propagate the new value, or the agent was revoked. Re-run setup with a fresh auth key.
Error: 403 Forbidden with a clock warning
Your host's clock is off by more than five minutes from the server's. Install chrony (or confirm systemd-timesyncd is active) and re-check.
systemctl status systemd-timesyncd
timedatectl
Error: x509: certificate signed by unknown authority
Your host's CA bundle is old or missing.
sudo apt install --reinstall ca-certificates
# or, on Fedora/RHEL-family hosts:
sudo dnf install ca-certificates
sudo update-ca-trust
The service keeps restarting
Systemd's Restart=on-failure loops the agent if it exits non-zero. Look at the start logs to see the cause:
sudo journalctl -u restorable.service --since "5 minutes ago"
A config validation error prints its one-line failure at startup and the service exits. Fix the YAML and run sudo systemctl restart restorable.service. See the config.yaml reference for the list of validation rules.
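To check whether the service really is looping rather than having failed once, systemd's own restart counter can help. The sketch below reads the NRestarts property that systemctl show reports; the function name restart_count_exceeds and the default threshold of 3 are assumptions chosen for illustration, not values from the agent.

```shell
# restart_count_exceeds [THRESHOLD]
# Hypothetical helper: reads `systemctl show -p NRestarts` output on stdin and
# succeeds (exit 0) when the restart count is at or above the threshold.
restart_count_exceeds() {
    threshold=${1:-3}                    # assumed threshold, adjust to taste
    n=$(sed -n 's/^NRestarts=//p')       # extract the number after "NRestarts="
    [ "${n:-0}" -ge "$threshold" ]       # missing property counts as 0
}

# Possible usage:
# systemctl show restorable.service -p NRestarts | restart_count_exceeds && echo "restart loop likely"
```

If the check succeeds, the journalctl command above is the place to find the one-line failure that triggers each restart.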
Still stuck
Email simon@hackerman.co with the agent ID and the journalctl output since service start. Early-customer support is handled by hand.