Single-node Vault deployments are a liability. When that server goes down, your entire infrastructure loses access to secrets — databases can’t connect, applications can’t authenticate, deployments halt. Vault’s integrated Raft storage solves this without external dependencies like Consul. Three nodes, automatic leader election, survives any single failure.
Why Raft Over External Storage
Historically, Vault HA required an external storage backend — Consul, etcd, or DynamoDB. That meant managing additional infrastructure and troubleshooting two distributed systems instead of one. Integrated Raft eliminates that entirely.
Integrated Raft
No external storage dependencies. Built-in leader election, strong consistency, lower operational overhead.
How Raft Works
Quorum-based consensus — a write commits once a majority of nodes (N/2 + 1, using integer division) agree, so a cluster of N nodes tolerates (N-1)/2 failures: 1 of 3, 2 of 5. Followers forward writes to the leader.
Architecture
Node 1 — On-Premises
Low-latency local access. Participates in quorum but is not the primary leader candidate.
Node 2 — Cloud
Primary leader candidate. Geographic redundancy.
Node 3 — Cloud
Follower and failover candidate. Completes quorum.
3 nodes means you can lose any 1 and keep running. With 2 nodes you need both for quorum — losing one stops writes entirely. Always use odd numbers: 3, 5, or 7.
TLS Certificates
Raft cluster communication requires mutual TLS. Certificates must include both serverAuth and clientAuth in Extended Key Usage. Missing clientAuth causes “tls: bad certificate” errors during cluster formation.
Step 1 — Create the CA
openssl genrsa -out rootCA.key 4096
openssl req -x509 -new -nodes \
-key rootCA.key \
-sha256 -days 3650 \
-out rootCA.crt \
-subj "/C=US/ST=State/L=City/O=Your Organization/CN=Vault Root CA"
Keep rootCA.key secure — it signs everything. rootCA.crt gets distributed to all nodes.
Step 2 — Generate Node Certificates
If you’re using HAProxy, all three node certificates must include the HAProxy hostname (e.g. vault.example.com) in their SANs. This lets clients connect through HAProxy without certificate errors.
#!/bin/bash
# gen-cert.sh — usage: ./gen-cert.sh <hostname> <ip>
set -euo pipefail
HOSTNAME=$1
IP=$2
cat > cert.cnf <<EOF
[ req ]
default_bits = 2048
default_md = sha256
distinguished_name = req_distinguished_name
req_extensions = v3_req
prompt = no
[ req_distinguished_name ]
CN = ${HOSTNAME}
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = critical, digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth, clientAuth
subjectAltName = @alt_names
[ alt_names ]
DNS.1 = ${HOSTNAME}
DNS.2 = vault.example.com
IP.1 = ${IP}
EOF
openssl req -new -newkey rsa:2048 -nodes \
-keyout ${HOSTNAME}-key.pem \
-out ${HOSTNAME}.csr \
-config cert.cnf
# -extfile and -extensions are both required
# Without them, clientAuth won't be included
openssl x509 -req \
-in ${HOSTNAME}.csr \
-CA rootCA.crt -CAkey rootCA.key -CAcreateserial \
-out ${HOSTNAME}.pem \
-days 365 -sha256 \
-extfile cert.cnf -extensions v3_req
rm ${HOSTNAME}.csr cert.cnf
./gen-cert.sh vault1.example.com 192.168.1.10
./gen-cert.sh vault2.example.com 10.0.1.20
./gen-cert.sh vault3.example.com 10.0.1.30
Verify each cert shows both TLS Web Server Authentication and TLS Web Client Authentication in Extended Key Usage.
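One way to check is with openssl x509 -ext. The sketch below runs against a throwaway self-signed cert so it works standalone; on your hosts, point -in at each node cert instead (e.g. vault1.example.com.pem). Requires OpenSSL 1.1.1+ for -addext/-ext:

```shell
# Generate a throwaway cert carrying both EKUs as a stand-in for a node cert
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout demo-key.pem -out demo.pem -days 1 \
  -subj "/CN=demo" \
  -addext "extendedKeyUsage=serverAuth,clientAuth"

# Inspect the Extended Key Usage extension — it must list both
# "TLS Web Server Authentication" and "TLS Web Client Authentication"
openssl x509 -in demo.pem -noout -ext extendedKeyUsage
```

If either usage is missing from a node cert, regenerate it before attempting cluster formation.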
Docker Compose Setup
Directory Structure
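Each node's working directory is assumed to look roughly like this; the names follow the files used in the rest of this section:

```
.
├── Dockerfile
├── docker-compose.yml
├── rootCA.crt                      # baked into the image by the Dockerfile
├── config/
│   └── config.hcl
├── certs/
│   ├── vault1.example.com.pem      # this node's cert (from gen-cert.sh)
│   └── vault1.example.com-key.pem
└── data/                           # Raft storage, created by Vault
```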
Dockerfile
FROM hashicorp/vault:latest
COPY rootCA.crt /usr/local/share/ca-certificates/rootCA.crt
RUN apk add --no-cache ca-certificates && \
update-ca-certificates
Embedding the CA into the container ensures TLS verification works between nodes without any tls_skip_verify workarounds.
docker-compose.yml
version: "3.8"
services:
  vault:
    build: .
    container_name: vault
    restart: unless-stopped
    cap_add:
      - IPC_LOCK
    environment:
      VAULT_ADDR: "https://vault1.example.com:8200"
      VAULT_API_ADDR: "https://vault1.example.com:8200"
    ports:
      - "8200:8200"
      - "8201:8201"
    volumes:
      - ./config:/vault/config:ro
      - ./data:/vault/data:rw
      - ./certs:/certs:ro
    entrypoint: ["vault", "server", "-config=/vault/config/config.hcl"]
config.hcl
storage "raft" {
  path    = "/vault/data"
  node_id = "vault1"

  retry_join {
    leader_api_addr = "https://vault2.example.com:8200"
  }
  retry_join {
    leader_api_addr = "https://vault3.example.com:8200"
  }
}

listener "tcp" {
  address = "0.0.0.0:8200"
  # gen-cert.sh output for this node, copied into ./certs
  tls_cert_file = "/certs/vault1.example.com.pem"
  tls_key_file  = "/certs/vault1.example.com-key.pem"
}

api_addr      = "https://vault1.example.com:8200"
cluster_addr  = "https://vault1.example.com:8201"
disable_mlock = true
ui            = true
Adjust node_id, api_addr, cluster_addr, and cert paths for each node. The retry_join blocks point to the other two nodes and handle automatic cluster formation on restart.
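As a sketch of what that looks like in practice, Node 2's config.hcl differs from Node 1's only in these values (the cert paths in the listener block change the same way):

```hcl
# config.hcl on Node 2 — per-node values only; everything else is identical
storage "raft" {
  path    = "/vault/data"
  node_id = "vault2"

  retry_join {
    leader_api_addr = "https://vault1.example.com:8200"
  }
  retry_join {
    leader_api_addr = "https://vault3.example.com:8200"
  }
}

api_addr     = "https://vault2.example.com:8200"
cluster_addr = "https://vault2.example.com:8201"
```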
Cluster Initialization
With Raft storage, join before unsealing. This is different from file-based storage. Nodes need to establish Raft cluster membership while sealed.
Step 1 — Initialize the Leader
# On Node 2 (will be initial leader)
docker-compose up -d
docker exec vault vault operator init
Save all five unseal keys and the root token somewhere secure — this is the only time they’re shown. Then unseal:
docker exec vault vault operator unseal <key1>
docker exec vault vault operator unseal <key2>
docker exec vault vault operator unseal <key3>
docker exec vault vault status
Step 2 — Join Follower Nodes
# On Node 1 and Node 3
docker-compose up -d
# Join while still sealed
docker exec vault vault operator raft join https://vault2.example.com:8200
# Then unseal with the same keys from initialization
docker exec vault vault operator unseal <key1>
docker exec vault vault operator unseal <key2>
docker exec vault vault operator unseal <key3>
Step 3 — Verify
export VAULT_ADDR=https://vault2.example.com:8200
export VAULT_TOKEN=<root_token>
vault operator raft list-peers
# Node      Address                    State       Voter
# ----      -------                    -----       -----
# vault2    vault2.example.com:8201    leader      true
# vault1    vault1.example.com:8201    follower    true
# vault3    vault3.example.com:8201    follower    true
HAProxy Load Balancer
Vault handles leader election internally, but your applications need a single endpoint that always resolves to a healthy node. HAProxy does health checks every 5 seconds and routes around failures automatically.
frontend vault_frontend
    bind *:8200
    mode tcp
    default_backend vault_backend

backend vault_backend
    mode tcp
    balance roundrobin
    option tcp-check
    tcp-check connect
    server vault1 192.168.1.10:8200 check inter 5s fall 3 rise 2
    server vault2 10.0.1.20:8200 check inter 5s fall 3 rise 2
    server vault3 10.0.1.30:8200 check inter 5s fall 3 rise 2
fall 3 with inter 5s means a node is marked down after 15 seconds of failures. rise 2 means it comes back after 10 seconds of successful checks. Tune these to your tolerance.
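The plain TCP check only confirms the port is open, so sealed or standby nodes still receive traffic. Vault's /v1/sys/health endpoint can drive the check instead: it returns 200 only on the unsealed active node, while standbys return 429. A sketch, assuming HAProxy 2.x; verify none skips backend cert verification for brevity:

```
backend vault_backend
    mode tcp
    # HTTP health check over TLS against Vault's health endpoint;
    # only the active, unsealed node answers 200
    option httpchk GET /v1/sys/health
    http-check expect status 200
    server vault1 192.168.1.10:8200 check check-ssl verify none inter 5s fall 3 rise 2
    server vault2 10.0.1.20:8200 check check-ssl verify none inter 5s fall 3 rise 2
    server vault3 10.0.1.30:8200 check check-ssl verify none inter 5s fall 3 rise 2
```

In production, replace verify none with verify required ca-file pointing at rootCA.crt.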
Point vault.example.com at your HAProxy IP. Applications use one address and never need to know which node is the current leader.
Testing Failover
Leader Failure
# Check current leader
vault operator raft list-peers
# Stop the leader
docker-compose down # on vault2
# Watch election on another node
watch -n 1 'vault operator raft list-peers'
# HAProxy endpoint stays up throughout
vault status -address=https://vault.example.com:8200
Follower Failure
# Stop a follower
docker-compose down # on vault3
# Cluster still has quorum (2/3)
vault kv get secret/test # still works
Operations
Backups
# Manual snapshot
vault operator raft snapshot save backup-$(date +%Y%m%d).snap
# Restore
vault operator raft snapshot restore -force backup.snap
# Automated daily
cat > /etc/cron.daily/vault-backup <<'EOF'
#!/bin/bash
export VAULT_ADDR=https://vault.example.com:8200
export VAULT_TOKEN=<token>
vault operator raft snapshot save /backup/vault/vault-$(date +%Y%m%d).snap
find /backup/vault -name "vault-*.snap" -mtime +30 -delete
EOF
chmod +x /etc/cron.daily/vault-backup
Monitoring
# Prometheus metrics (requires a token unless the listener enables
# unauthenticated metrics access)
curl -H "X-Vault-Token: $VAULT_TOKEN" \
  "https://vault.example.com:8200/v1/sys/metrics?format=prometheus"
# Key metrics
# vault_core_unsealed — should be 1
# vault_core_active — 1 on leader, 0 on followers
# vault_raft_peers — should equal node count
# vault_raft_leader — should be 1
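The Prometheus endpoint only returns data when telemetry is enabled in config.hcl. A minimal stanza (the retention value here is a typical choice, not a requirement):

```hcl
telemetry {
  # How long Prometheus-format metrics are retained between scrapes
  prometheus_retention_time = "30s"
  # Drop the hostname prefix so metric names are stable across nodes
  disable_hostname = true
}
```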
Vault starts sealed after any restart. With Shamir seal, every node must be unsealed manually — after a full cluster restart, each node needs its threshold of unseal keys before the cluster can elect a leader and serve requests, and an individually restarted node stays out of service until someone unseals it.
For production, consider migrating to auto-unseal with a cloud KMS — AWS KMS, Azure Key Vault, or GCP Cloud KMS. Migration is done with vault operator unseal -migrate.
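With AWS KMS, for example, auto-unseal is one stanza in config.hcl. A sketch; the region and key alias are placeholders:

```hcl
seal "awskms" {
  region     = "us-east-1"           # placeholder — your KMS region
  kms_key_id = "alias/vault-unseal"  # placeholder — your KMS key alias
}
```

After adding the stanza, restart each node and run vault operator unseal -migrate with the existing Shamir keys to complete the migration.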
Security
Treat the unseal keys and root token as your most sensitive secrets. Never commit them to git. Store them in a password manager and distribute them across key holders — no single person should hold all five. Losing the unseal keys means permanent data loss with no recovery path.
- Don’t use the root token for daily operations — create admin tokens with vault token create -policy=admin -period=768h
- Use service-specific policies for applications
- Enable audit logging: vault audit enable file file_path=/vault/logs/audit.log
Key Points
- Integrated Raft removes the need for an external storage backend entirely
- 3 nodes survive any single failure — always use odd numbers
- Certificates must include both serverAuth and clientAuth — missing the latter breaks cluster formation
- Join before unsealing with Raft storage — order matters
- HAProxy gives applications a single stable endpoint regardless of which node is leader
- Plan for auto-unseal in production so a server restart doesn’t require manual intervention
