I wanted a Hermes Agent running like a normal service.
Not a local terminal experiment. Not a container I had to babysit. Not something that worked once and then broke the next time CapRover recreated the app.
The target shape was simple:
- Hermes runs on CapRover,
- all state survives image upgrades,
- Codex is available inside the container,
- ChatGPT/Codex auth survives restarts,
- the dashboard is reachable behind basic auth,
- and redeploying the image does not require redoing setup by hand.
That shape is achievable, but there are a few gotchas worth avoiding.
The deployment model
Hermes’ Docker model is clean: the image is mostly stateless, and user data lives in /opt/data.
That directory is the important part. It contains things like:
- config.yaml,
- .env,
- auth files,
- sessions,
- skills,
- logs,
- generated Slack manifests,
- and any tool state that needs to persist.
So the CapRover app should have persistent storage mounted at:
/opt/data
In my case the host path was:
/captain/data/hermes/data
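Before running any setup inside the container, it can help to confirm that the data directory is actually there and writable. The following is a minimal sketch, not part of Hermes or CapRover; check_data_dir is a hypothetical helper name.

```shell
# Sketch: confirm the persistent directory exists and is writable.
# check_data_dir is a hypothetical helper, not a Hermes or CapRover command.
check_data_dir() {
  dir="${1:-/opt/data}"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  if [ ! -w "$dir" ]; then
    echo "not writable: $dir"
    return 1
  fi
  echo "ok: $dir"
}
```

Inside the container this would be called as check_data_dir /opt/data. It only checks existence and write access for the current user; it does not prove the directory is backed by the host path.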
The first version used the upstream image directly:
nousresearch/hermes-agent:latest
That works for a basic deployment, but I quickly wanted two small additions:
- the hermes command should be available directly on PATH,
- the Codex CLI should be present after every restart and redeploy.
That meant a thin wrapper image was the right abstraction.
The wrapper image
The wrapper image keeps Hermes itself upstream, while adding the operational conveniences I need.
FROM nousresearch/hermes-agent:latest
USER root
RUN npm install -g @openai/codex \
    && ln -sf /opt/hermes/.venv/bin/hermes /usr/local/bin/hermes \
    && npm cache clean --force
COPY docker/hermes-entrypoint.sh /usr/local/bin/hermes-entrypoint.sh
RUN chmod +x /usr/local/bin/hermes-entrypoint.sh
ENV PATH="/opt/hermes/.venv/bin:/usr/local/bin:${PATH}" \
    HOME="/opt/data/home"
ENTRYPOINT ["/usr/local/bin/hermes-entrypoint.sh"]
The important pieces:
- @openai/codex is installed at build time, not manually inside a live container.
- hermes is symlinked into /usr/local/bin.
- HOME=/opt/data/home, so tool auth like ~/.codex persists on the mounted volume.
- A wrapper entrypoint handles permission normalization before delegating to Hermes’ official entrypoint.
This lets upgrades stay boring.
Rebuild the image from the latest upstream base, redeploy, keep /opt/data untouched.
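After a redeploy, a quick smoke test can confirm that both commands still resolve on PATH. This is a sketch under the assumption that you run it in a shell inside the container; missing_tools is a hypothetical helper name.

```shell
# Sketch: print the name of every command that does not resolve on PATH.
# missing_tools is a hypothetical helper; no output means everything was found.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}
```

With the wrapper image, missing_tools hermes codex should print nothing. If it prints a name, the image that is running is probably not the one you think you deployed.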
The entrypoint permission fix
This was the gotcha that actually mattered.
CapRover docker exec sessions often land you in the container as root.
If you run setup commands there, files like these can become root-owned:
/opt/data/auth.json
/opt/data/config.yaml
But the official Hermes entrypoint drops privileges before running the gateway. That is good. Running the gateway as root would create more problems later.
The failure mode is subtle:
- hermes -z works when you test it as root,
- Slack connects,
- the gateway starts,
- but Slack requests fail with provider/auth errors.
The logs tell the truth:
Permission denied: /opt/data/auth.json
No Codex credentials stored
No inference provider configured
The fix is to normalize the persistent volume before the official entrypoint drops privileges:
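#!/usr/bin/env bash
set -euo pipefail
# Only root can fix ownership; skip the block if we are already unprivileged.
if [ "$(id -u)" = "0" ]; then
  mkdir -p /opt/data /opt/data/home
  # Hand the volume to the hermes user if that account exists in the image.
  if getent passwd hermes >/dev/null 2>&1; then
    chown -R hermes:hermes /opt/data || true
  fi
  # Fallback: make everything readable even when chown is not permitted.
  chmod -R u+rwX,go+rX /opt/data || true
fi
exec /opt/hermes/docker/entrypoint.sh "$@"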
That extra chmod is intentionally boring.
If ownership changes are restricted by the mount, the gateway can still read config and auth.
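To see what that chmod mode string does, here is a small sketch that applies it to a throwaway temp directory instead of /opt/data. The function name demo_normalize and the file names inside the temp directory are illustrative only.

```shell
# Sketch: apply `chmod -R u+rwX,go+rX` to a private file and directory
# in a temp dir, then print the resulting mode strings.
demo_normalize() {
  root="$(mktemp -d)"
  mkdir "$root/home"
  : > "$root/auth.json"
  chmod 700 "$root/home"
  chmod 600 "$root/auth.json"

  chmod -R u+rwX,go+rX "$root"

  # First ten characters of `ls -ld` are the file type plus mode bits.
  ls -ld "$root/auth.json" | cut -c1-10
  ls -ld "$root/home" | cut -c1-10
  rm -rf "$root"
}
```

Running demo_normalize prints -rw-r--r-- for the file and drwxr-xr-x for the directory. The capital X adds execute only to directories (and to files that were already executable), so plain files become readable without becoming executable.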
The command CapRover should run
Another easy trap: if the container starts the default interactive CLI, it exits immediately because there is no TTY.
The logs look like this:
Warning: Input is not a terminal (fd=0).
Goodbye!
Then CapRover keeps restarting it.
For gateway mode, run:
gateway run
If you are setting a CapRover service override manually, make sure it still goes through the entrypoint. The safe shape is:
/opt/hermes/docker/entrypoint.sh gateway run
With the wrapper image above, the default entrypoint can also receive:
gateway run
The key is: do not accidentally boot the interactive CLI in a non-interactive container.
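One way to make that mistake harder is a guard in the wrapper entrypoint. The sketch below is hypothetical and not part of the entrypoint shown earlier: pick_args is an invented helper that falls back to gateway mode when no arguments are given and stdin is not a terminal.

```shell
# Hypothetical guard: choose the arguments to hand to the entrypoint.
# With no arguments and no terminal on stdin, fall back to gateway mode
# instead of letting the interactive CLI start and exit.
pick_args() {
  if [ "$#" -eq 0 ] && [ ! -t 0 ]; then
    printf '%s\n' "gateway run"
  else
    printf '%s\n' "$*"
  fi
}
```

Explicit arguments always win, so a CapRover command override still behaves as written. Whether a silent fallback is preferable to a loud failure is a design choice; failing fast with a clear message would be an equally reasonable guard.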
CapRover environment variables
For the dashboard and API server, I used:
API_SERVER_ENABLED=true
API_SERVER_HOST=0.0.0.0
API_SERVER_KEY=...
API_SERVER_CORS_ORIGINS=https://hermes.example.com
HERMES_DASHBOARD=1
HERMES_DASHBOARD_HOST=0.0.0.0
HERMES_DASHBOARD_PORT=9119
HERMES_HEADLESS=1
The app exposed port 9119 because the dashboard was the public surface.
The dashboard itself was also behind CapRover nginx basic auth. That is separate from Hermes’ own configuration. If the browser prompts for credentials before the dashboard loads, that is nginx basic auth, not Hermes auth.
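A forgotten variable is easy to miss in the CapRover UI, so a small check inside the container can be useful. This is a sketch; missing_env is a hypothetical helper, and the variable names passed to it are the ones listed above.

```shell
# Sketch: print the names of variables that are unset or empty.
# missing_env is a hypothetical helper; callers pass fixed variable names.
missing_env() {
  for name in "$@"; do
    # Indirect lookup; acceptable here because names are hard-coded by the caller.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}
```

For this deployment the call would be missing_env API_SERVER_ENABLED API_SERVER_HOST API_SERVER_KEY HERMES_DASHBOARD HERMES_DASHBOARD_HOST HERMES_DASHBOARD_PORT HERMES_HEADLESS, and no output means everything is set.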
Configuring Codex
Once the container is running with persistent /opt/data, exec into it and authenticate Hermes with Codex.
hermes auth add codex-oauth
Then choose Codex as the inference provider:
hermes model
Pick OpenAI Codex and the model your subscription exposes.
Then test from inside the container:
hermes -z "Reply with exactly: codex-ok"
If that returns:
codex-ok
Codex works for that shell.
But do not stop there.
The gateway runs as the non-root Hermes user, so verify that Slack or gateway-triggered requests can read the same auth/config files.
If Slack says “No Codex credentials stored” while hermes -z works in your root shell, you almost certainly have a file ownership problem under /opt/data.
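A direct way to confirm that suspicion is to list anything under the volume that the gateway user does not own. The sketch below assumes the unprivileged account is named hermes, as in the entrypoint script; not_owned_by is a hypothetical helper name.

```shell
# Sketch: list entries under a directory that are NOT owned by the given user.
# not_owned_by is a hypothetical helper; the user may be a name or numeric uid.
not_owned_by() {
  dir="$1"
  user="$2"
  find "$dir" ! -user "$user" -print
}
```

Running not_owned_by /opt/data hermes from a root shell should print nothing on a healthy volume. Any path it does print, such as an auth or config file, is a candidate for the permission failure described above.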
Build and deploy automation
I put the wrapper image in the agent control repo and added a GitHub Actions workflow that:
- builds the wrapper image,
- pushes it to GHCR,
- deploys the immutable SHA-tagged image to CapRover.
The image tags look like:
ghcr.io/gregagi/hermes-agent:latest
ghcr.io/gregagi/hermes-agent:sha-<commit-sha>
Deploying the SHA tag matters. If something breaks, you know exactly what is running.
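As an illustration of the tagging scheme, the immutable reference can be derived from a commit SHA like this. It is a sketch: sha_image_ref is a hypothetical helper, and whether the workflow uses the full or abbreviated SHA is not something this post pins down.

```shell
# Sketch: build the SHA-tagged image reference from a commit SHA.
# sha_image_ref is a hypothetical helper; the repository name matches the tags in this post.
sha_image_ref() {
  case "$1" in
    ''|*[!0-9a-f]*)
      echo "not a lowercase hex SHA: $1" >&2
      return 1
      ;;
  esac
  printf 'ghcr.io/gregagi/hermes-agent:sha-%s\n' "$1"
}
```

In a checkout, sha_image_ref "$(git rev-parse HEAD)" gives the reference to deploy, and the same string is what you would look for in CapRover when checking what is running.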
The workflow needs these repo variables/secrets:
CAPROVER_HERMES_URL
CAPROVER_HERMES_APP_NAME
CAPROVER_HERMES_APP_TOKEN
I also allowed a CapRover password secret as a fallback before a per-app token exists. App tokens are better long-term.
One small workflow gotcha: include every file that affects the image in the path trigger.
I initially triggered only on Dockerfile.hermes, then changed the entrypoint script and wondered why the fix had not deployed.
The workflow should include all of these:
paths:
  - Dockerfile.hermes
  - docker/hermes-entrypoint.sh
  - .github/workflows/deploy-hermes-image.yml
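The same filter can be mirrored locally to check whether a commit should have triggered a rebuild. This is a sketch that hard-codes the three paths from the trigger; affects_image is a hypothetical helper, and it has to be kept in sync with the workflow by hand.

```shell
# Sketch: decide whether a changed file should trigger an image rebuild.
# affects_image is a hypothetical helper mirroring the workflow's path filter.
affects_image() {
  case "$1" in
    Dockerfile.hermes|docker/hermes-entrypoint.sh|.github/workflows/deploy-hermes-image.yml)
      return 0
      ;;
    *)
      return 1
      ;;
  esac
}
```

Feeding it the output of git diff --name-only HEAD~1 HEAD, one file at a time, shows quickly whether the last commit touched anything the image depends on.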
How I verified the deployment
The checks that mattered were:
- the CapRover app deployed the new image,
- the container did not restart-loop,
- logs showed gateway mode, not interactive mode,
- the dashboard started,
- Hermes could answer through Codex,
- Slack requests used the same credentials the CLI test used.
Good logs look like this:
Dropping root privileges
Starting hermes dashboard on 0.0.0.0:9119
Hermes Gateway Starting...
Bad logs look like this:
Warning: Input is not a terminal
Goodbye!
or:
Permission denied: /opt/data/auth.json
No Codex credentials stored
Those two failures point to different fixes:
- interactive CLI failure → fix the container command,
- auth/config permission failure → fix /opt/data ownership/permissions.
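That triage can be scripted against captured logs. The sketch below matches only the log lines quoted in this post; classify_log is a hypothetical helper, and real logs may contain other failure patterns it does not know about.

```shell
# Sketch: read container logs on stdin and name the likely fix.
# classify_log is a hypothetical helper; it matches the log lines quoted in this post.
classify_log() {
  log="$(cat)"
  if printf '%s\n' "$log" | grep -q 'Input is not a terminal'; then
    echo "fix the container command"
  elif printf '%s\n' "$log" | grep -q -e 'Permission denied: /opt/data' -e 'No Codex credentials stored'; then
    echo "fix /opt/data ownership/permissions"
  else
    echo "no known failure pattern"
  fi
}
```

Pipe the app's log output into classify_log to get a one-line verdict. The interactive-CLI check comes first because a container that never reaches gateway mode cannot produce a meaningful auth error.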
The final shape
The deployment I would repeat is:
- official Hermes image as the base,
- tiny wrapper image for Codex and PATH ergonomics,
- persistent /opt/data,
- HOME=/opt/data/home,
- gateway mode as the container command,
- entrypoint-level volume permission normalization,
- GitHub Actions rebuild and deploy to CapRover.
That keeps the setup easy to upgrade.
When a new Hermes version lands, rebuild the wrapper image.
The state stays in /opt/data.
The image stays disposable.
The agent keeps working.