LiteLLM requires multiple sensitive credentials: OpenAI and Anthropic API keys, a database password, and a master key for its own API. The obvious approach is to hand-write them into Kubernetes Secret manifests and move on. This post does it properly instead: HashiCorp Vault is the source of truth for every credential, External Secrets Operator (ESO) syncs them into Kubernetes, and no LiteLLM credential lives in your manifests.
This guide assumes a running Kubernetes cluster with an ingress controller, a Vault instance with the Kubernetes auth method available, and a PostgreSQL database. We’re focused on the LiteLLM deployment — not standing up these foundations.
Vault Configuration
Enable a KV v2 secrets engine scoped to LiteLLM, populate it with your credentials, then create a policy that grants read-only access to those paths. The policy is what you’ll attach to the External Secrets Operator token.
# Enable KV secrets engine
vault secrets enable -path=litellm kv-v2
# Store credentials
vault kv put litellm/api-keys \
  master-key="sk-your-secure-master-key" \
  salt-key="sk-your-secure-salt-key" \
  openai-key="sk-your-openai-api-key" \
  anthropic-key="sk-your-anthropic-api-key"
vault kv put litellm/database \
  url="postgresql://litellm:password@postgres:5432/litellm?sslmode=require"
# Create read-only policy for ESO
vault policy write eso-litellm-policy - <<EOF
path "litellm/data/api-keys" {
  capabilities = ["read"]
}
path "litellm/data/database" {
  capabilities = ["read"]
}
EOF
# Create token for ESO (1-year TTL; Vault caps this at its max lease TTL, 768h by default)
vault token create -policy=eso-litellm-policy -ttl=8760h -display-name="eso-litellm"
This uses token-based auth for simplicity. Kubernetes authentication — where Vault validates the pod’s service account token directly — is more secure and removes the need to manage a long-lived token. That’ll be covered separately.
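The sk-your-secure-... values in the vault kv put command are placeholders. A minimal sketch of generating random replacements, assuming a Unix-like shell with /dev/urandom available; gen_key is a hypothetical helper, and the 32-byte length is an arbitrary choice rather than a LiteLLM requirement.

```shell
# Sketch: generate random LiteLLM keys instead of hand-typed placeholders.
# gen_key is an illustrative helper, not part of Vault or LiteLLM.
gen_key() {
  # 32 random bytes, hex-encoded (64 characters), with the "sk-" prefix
  # used by the placeholder values in this guide
  printf 'sk-%s\n' "$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"
}

MASTER_KEY="$(gen_key)"
SALT_KEY="$(gen_key)"
```

Pass "$MASTER_KEY" and "$SALT_KEY" to vault kv put in place of the placeholders. LiteLLM's documentation advises against changing the salt key once models have been added, so generate it once and keep it.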
External Secrets Operator
ESO is the bridge between Vault and Kubernetes. It watches your ExternalSecret resources, fetches the referenced values from Vault, and creates or updates the corresponding Kubernetes Secret objects automatically — including on rotation.
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets external-secrets/external-secrets \
  --namespace external-secrets-system --create-namespace
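Before creating any stores, it can help to confirm the operator actually came up. A minimal sketch, assuming the chart was installed into external-secrets-system as above; eso_ready is a hypothetical helper name and the 120-second timeout is an arbitrary choice.

```shell
# Sketch: check that ESO's CRDs are registered and its deployments are available.
# eso_ready is an illustrative helper, not part of ESO or kubectl.
eso_ready() {
  # First command: the two CRDs the rest of this guide relies on.
  # Second command: block until every deployment in the ESO namespace
  # reports the Available condition.
  kubectl get crd \
    secretstores.external-secrets.io \
    externalsecrets.external-secrets.io &&
  kubectl wait --for=condition=Available deployment --all \
    --namespace external-secrets-system --timeout=120s
}

# Usage:
#   eso_ready && echo "ESO is ready"
```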
SecretStore
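The SecretStore tells ESO where Vault lives and how to authenticate. Create the token secret first, then reference it in the store. The token Secret is the one object here that carries a credential, so apply it by hand and keep it out of version control. Both objects assume the ai-services namespace already exists.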
apiVersion: v1
kind: Secret
metadata:
  name: vault-token
  namespace: ai-services
type: Opaque
stringData:
  token: your-vault-token-here
---
apiVersion: external-secrets.io/v1
kind: SecretStore
metadata:
  name: vault-backend
  namespace: ai-services
spec:
  provider:
    vault:
      server: "https://vault.example.com:8200"
      path: "litellm"  # mount path of the KV v2 engine enabled earlier
      version: "v2"
      auth:
        tokenSecretRef:
          name: "vault-token"
          key: "token"
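As an alternative to the vault-token Secret manifest, the same Secret can be created imperatively so the token never lands in a YAML file. A minimal sketch, assuming VAULT_TOKEN holds the token printed by vault token create and that the ai-services namespace exists; create_vault_token_secret is a hypothetical helper name.

```shell
# Sketch: create the vault-token Secret from an environment variable.
# create_vault_token_secret is an illustrative helper, not a kubectl feature.
create_vault_token_secret() {
  # Abort with a message if the token was not provided
  : "${VAULT_TOKEN:?set VAULT_TOKEN to the token created for ESO}"
  kubectl create secret generic vault-token \
    --namespace ai-services \
    --from-literal=token="$VAULT_TOKEN"
}

# Usage:
#   VAULT_TOKEN="..." create_vault_token_secret
```

The Secret name and key match what the SecretStore's tokenSecretRef expects, so the store definition needs no changes.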
ExternalSecret
Each entry in data maps a Vault key to a Kubernetes secret key. The refreshInterval controls how often ESO re-reads Vault. At 15 seconds, rotations are picked up almost immediately at the cost of frequent Vault reads, so lengthen it if your credentials rarely change.
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: litellm-secrets
  namespace: ai-services
spec:
  refreshInterval: 15s
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: litellm-secrets
    creationPolicy: Owner
  data:
    - secretKey: LITELLM_MASTER_KEY
      remoteRef:
        key: litellm/api-keys
        property: master-key
    - secretKey: LITELLM_SALT_KEY
      remoteRef:
        key: litellm/api-keys
        property: salt-key
    - secretKey: OPENAI_API_KEY
      remoteRef:
        key: litellm/api-keys
        property: openai-key
    - secretKey: ANTHROPIC_API_KEY
      remoteRef:
        key: litellm/api-keys
        property: anthropic-key
    - secretKey: DATABASE_URL
      remoteRef:
        key: litellm/database
        property: url
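Once the ExternalSecret is applied, ESO should create a litellm-secrets Secret containing the five keys listed in data. A minimal sketch for confirming that without printing any secret values; missing_keys and EXPECTED_KEYS are hypothetical names introduced for illustration.

```shell
# Sketch: list which expected keys are missing from the synced Secret.
# missing_keys is an illustrative helper; the key names come from the
# ExternalSecret in this guide.
EXPECTED_KEYS="LITELLM_MASTER_KEY LITELLM_SALT_KEY OPENAI_API_KEY ANTHROPIC_API_KEY DATABASE_URL"

missing_keys() {
  # Fetch only the key names (not the values) stored in litellm-secrets
  present="$(kubectl get secret litellm-secrets --namespace ai-services \
    -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}')" || return 1
  for key in $EXPECTED_KEYS; do
    # -x matches the whole line, -F treats the key as a literal string
    printf '%s\n' "$present" | grep -qxF "$key" || printf '%s\n' "$key"
  done
}

# Usage: prints nothing when every key has been synced
#   missing_keys
```

If a key is reported missing, the ExternalSecret's status conditions (kubectl describe externalsecret litellm-secrets -n ai-services) usually say which Vault path or property could not be read.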
LiteLLM Deployment
The deployment pulls every credential from the litellm-secrets object ESO manages. STORE_MODEL_IN_DB enables PostgreSQL persistence and RUN_MIGRATION handles schema setup on first boot.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: litellm
  namespace: ai-services
spec:
  replicas: 2
  selector:
    matchLabels:
      app: litellm
  template:
    metadata:
      labels:
        app: litellm
    spec:
      containers:
        - name: litellm
          image: ghcr.io/berriai/litellm:main-stable
          ports:
            - containerPort: 4000
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: litellm-secrets
                  key: DATABASE_URL
            - name: STORE_MODEL_IN_DB
              value: "True"
            - name: RUN_MIGRATION
              value: "True"
            - name: LITELLM_MASTER_KEY
              valueFrom:
                secretKeyRef:
                  name: litellm-secrets
                  key: LITELLM_MASTER_KEY
            - name: LITELLM_SALT_KEY
              valueFrom:
                secretKeyRef:
                  name: litellm-secrets
                  key: LITELLM_SALT_KEY
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: litellm-secrets
                  key: OPENAI_API_KEY
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: litellm-secrets
                  key: ANTHROPIC_API_KEY
          livenessProbe:
            httpGet:
              path: /health/liveliness
              port: 4000
            initialDelaySeconds: 40
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /health/readiness
              port: 4000
            initialDelaySeconds: 10
            periodSeconds: 5
          resources:
            requests:
              memory: "512Mi"
              cpu: "200m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
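One consequence of injecting credentials as environment variables: a container reads them once at startup, so a value rotated in Vault reaches the litellm-secrets Secret but not the pods that are already running. A minimal sketch of applying a rotation, assuming the deployment name and namespace used above; restart_litellm is a hypothetical helper and the 180-second timeout is arbitrary.

```shell
# Sketch: roll the LiteLLM pods so they pick up refreshed Secret values.
# restart_litellm is an illustrative helper, not part of LiteLLM or ESO.
restart_litellm() {
  # Trigger a rolling restart, then wait for the new pods to become ready
  kubectl rollout restart deployment/litellm --namespace ai-services &&
  kubectl rollout status deployment/litellm --namespace ai-services --timeout=180s
}

# Usage, after ESO has refreshed the Secret:
#   restart_litellm
```

With two replicas and a readiness probe in place, the rolling restart should replace pods one at a time rather than dropping traffic.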
Service & Ingress
apiVersion: v1
kind: Service
metadata:
  name: litellm-service
  namespace: ai-services
spec:
  selector:
    app: litellm
  ports:
    - port: 4000
      targetPort: 4000
      name: http
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: litellm-ingress
  namespace: ai-services
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  tls:
    - hosts:
        - litellm.example.com
      secretName: litellm-tls
  rules:
    - host: litellm.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: litellm-service
                port:
                  number: 4000
Verification
# Confirm ESO synced the secrets from Vault
kubectl get externalsecret -n ai-services
kubectl describe externalsecret litellm-secrets -n ai-services
# Check pods are running
kubectl get pods -n ai-services
kubectl logs -n ai-services deployment/litellm
# Test health endpoint (port-forward blocks, so run curl from a second terminal)
kubectl port-forward -n ai-services svc/litellm-service 4000:4000
curl http://localhost:4000/health/liveliness
# Test the API (the model name must match one configured in LiteLLM)
curl -X POST https://litellm.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Key Points
- Vault’s KV v2 engine keeps all credentials out of your manifests and version control
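- ESO handles sync automatically: rotate a secret in Vault and the Kubernetes Secret is updated within the refresh interval (running pods read environment variables only at startup, so restart the deployment to apply the new value)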
- The policy grants read-only access to specific paths, so ESO can’t touch anything outside litellm/
- Token auth works but Kubernetes auth is better for production because it eliminates long-lived tokens entirely
- RUN_MIGRATION only needs to be True on first boot; safe to leave on, but you can disable it after the schema is initialized
