LiteLLM + HashiCorp Vault
AI Gateway with Dynamic Secrets in Kubernetes
Deploy LiteLLM as an AI proxy using HashiCorp Vault for dynamic secrets management, PostgreSQL persistence, and Kubernetes orchestration for production-grade AI infrastructure.
Why HashiCorp Vault?
While deploying LiteLLM in my Kubernetes infrastructure, I realized this was the perfect opportunity to showcase HashiCorp Vault integration. LiteLLM requires multiple sensitive credentials – OpenAI API keys, Anthropic tokens, database passwords, and master keys – making it an ideal candidate for demonstrating proper secrets management rather than stuffing everything into plain Kubernetes secrets.
The Problem: AI applications often need multiple high-value API keys that should be rotated regularly, audited, and access-controlled. Plain Kubernetes Secrets give you none of that — no rotation, no audit trail, and only coarse-grained access control.
HashiCorp Vault provides dynamic secrets, automatic rotation, detailed audit logs, and policy-based access control. Since I was deploying LiteLLM anyway, it was a natural fit for putting all of that into practice.
Prerequisites
Required Infrastructure
This guide assumes you have existing infrastructure components. We’ll focus on LiteLLM deployment rather than setting up these foundational systems.
⚓ Kubernetes Cluster
Running cluster with ingress controller (Traefik/Nginx) and persistent storage. We’ll use the ‘ai-services’ namespace.
🔐 HashiCorp Vault
Running Vault instance with Kubernetes authentication configured. Access to create policies and manage secrets paths.
🗄️ PostgreSQL Database
PostgreSQL instance for LiteLLM data persistence. Could be managed service or self-hosted cluster with TLS enabled.
Vault Configuration
Vault Setup Commands
First, create the secrets engine, store your credentials, and configure External Secrets Operator authentication.
# Step 1: Enable KV secrets engine for LiteLLM
vault secrets enable -path=litellm kv-v2
# Step 2: Create secrets for LiteLLM
vault kv put litellm/api-keys \
  master-key="sk-your-secure-master-key" \
  salt-key="sk-your-secure-salt-key" \
  openai-key="sk-your-openai-api-key" \
  anthropic-key="sk-your-anthropic-api-key"
vault kv put litellm/database \
  url="postgresql://litellm:password@postgres:5432/litellm?sslmode=require"
# Step 3: Create policy for External Secrets Operator
vault policy write eso-litellm-policy - <<EOF
# Read-only access to the LiteLLM KV v2 mount for External Secrets Operator
path "litellm/data/*" {
  capabilities = ["read"]
}
path "litellm/metadata/*" {
  capabilities = ["read", "list"]
}
EOF
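The database URL stored above must be a valid PostgreSQL connection string, since LiteLLM hands it straight to its database layer. A quick sanity check with Python's standard library (the credentials below are placeholders, mirroring the example value, not real secrets):

```python
from urllib.parse import urlparse, parse_qs

# Same format as the value stored under litellm/database (placeholder password)
url = "postgresql://litellm:password@postgres:5432/litellm?sslmode=require"

parsed = urlparse(url)
params = parse_qs(parsed.query)

assert parsed.scheme == "postgresql"     # LiteLLM expects postgresql://
assert parsed.hostname == "postgres"     # database host
assert parsed.port == 5432
assert parsed.path == "/litellm"         # database name
assert params["sslmode"] == ["require"]  # TLS enforced, matching the prerequisites
print("connection string OK")
```

Running a check like this before writing the secret into Vault catches malformed URLs early, instead of at pod startup.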
External Secrets Operator
Dynamic Secrets Integration
External Secrets Operator (ESO) synchronizes secrets from Vault to Kubernetes, providing automatic updates and rotation without redeploying pods.
# Install External Secrets Operator
helm repo add external-secrets https://charts.external-secrets.io
helm install external-secrets external-secrets/external-secrets \
  --namespace external-secrets-system --create-namespace
Service Account and SecretStore
# Vault Token Secret (create this first)
apiVersion: v1
kind: Secret
metadata:
  name: vault-token
  namespace: ai-services
type: Opaque
stringData:
  token: your-vault-token-here
---
# SecretStore with token authentication.
# No `path` is set: each ExternalSecret remoteRef key carries the full
# path including the `litellm` mount (e.g. litellm/api-keys).
apiVersion: external-secrets.io/v1
kind: SecretStore
metadata:
  name: vault-backend
  namespace: ai-services
spec:
  provider:
    vault:
      server: "https://vault.example.com:8200"
      version: "v2"
      auth:
        tokenSecretRef:
          name: "vault-token"
          key: "token"
Authentication Method
This setup uses token-based authentication for simplicity. While Kubernetes authentication provides better security through service account integration, token auth is easier to set up and manage for initial deployments. I'll cover Kubernetes authentication setup in a separate post.
ExternalSecret Resource
# ExternalSecret for LiteLLM credentials
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: litellm-secrets
  namespace: ai-services
spec:
  refreshInterval: 15s
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: litellm-secrets
    creationPolicy: Owner
  data:
    # API Keys from Vault
    - secretKey: LITELLM_MASTER_KEY
      remoteRef:
        key: litellm/api-keys
        property: master-key
    - secretKey: LITELLM_SALT_KEY
      remoteRef:
        key: litellm/api-keys
        property: salt-key
    - secretKey: OPENAI_API_KEY
      remoteRef:
        key: litellm/api-keys
        property: openai-key
    - secretKey: ANTHROPIC_API_KEY
      remoteRef:
        key: litellm/api-keys
        property: anthropic-key
    # Database connection from Vault
    - secretKey: DATABASE_URL
      remoteRef:
        key: litellm/database
        property: url
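Conceptually, ESO resolves each `remoteRef` against Vault and writes the value under the corresponding `secretKey` in the target Secret, base64-encoded as Kubernetes requires. The sketch below illustrates that mapping with placeholder values — it is a simplified stand-in, not ESO's actual implementation:

```python
import base64

# Simplified stand-in for the two Vault KV v2 secrets created earlier (placeholders)
vault_kv = {
    "litellm/api-keys": {"master-key": "sk-master", "openai-key": "sk-openai"},
    "litellm/database": {"url": "postgresql://litellm:pw@postgres:5432/litellm"},
}

# A subset of the `data` mappings from the ExternalSecret above
mappings = [
    {"secretKey": "LITELLM_MASTER_KEY", "key": "litellm/api-keys", "property": "master-key"},
    {"secretKey": "OPENAI_API_KEY", "key": "litellm/api-keys", "property": "openai-key"},
    {"secretKey": "DATABASE_URL", "key": "litellm/database", "property": "url"},
]

def render_secret_data(kv, data_mappings):
    """Build the Kubernetes Secret `data` field: base64 values keyed by secretKey."""
    out = {}
    for m in data_mappings:
        value = kv[m["key"]][m["property"]]          # look up property in the KV entry
        out[m["secretKey"]] = base64.b64encode(value.encode()).decode()
    return out

secret_data = render_secret_data(vault_kv, mappings)
assert base64.b64decode(secret_data["OPENAI_API_KEY"]).decode() == "sk-openai"
```

On every `refreshInterval` tick, ESO repeats this resolution and updates the `litellm-secrets` Secret if the Vault values changed — which is what makes rotation possible without redeploying.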
LiteLLM Deployment
Production-Ready Setup
The deployment uses secrets from Vault, includes health checks, and configures PostgreSQL persistence with automatic database migration.
# LiteLLM Deployment with Vault secrets
apiVersion: apps/v1
kind: Deployment
metadata:
  name: litellm
  namespace: ai-services
spec:
  replicas: 2
  selector:
    matchLabels:
      app: litellm
  template:
    metadata:
      labels:
        app: litellm
    spec:
      containers:
        - name: litellm
          image: ghcr.io/berriai/litellm:main-stable
          ports:
            - containerPort: 4000
          env:
            # Database configuration
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: litellm-secrets
                  key: DATABASE_URL
            - name: STORE_MODEL_IN_DB
              value: "True"
            - name: RUN_MIGRATION
              value: "True"
            # Authentication keys from Vault
            - name: LITELLM_MASTER_KEY
              valueFrom:
                secretKeyRef:
                  name: litellm-secrets
                  key: LITELLM_MASTER_KEY
            - name: LITELLM_SALT_KEY
              valueFrom:
                secretKeyRef:
                  name: litellm-secrets
                  key: LITELLM_SALT_KEY
            # AI Provider API keys from Vault
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: litellm-secrets
                  key: OPENAI_API_KEY
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: litellm-secrets
                  key: ANTHROPIC_API_KEY
          # Health checks
          livenessProbe:
            httpGet:
              path: /health/liveliness
              port: 4000
            initialDelaySeconds: 40
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /health/readiness
              port: 4000
            initialDelaySeconds: 10
            periodSeconds: 5
          # Resource limits
          resources:
            requests:
              memory: "512Mi"
              cpu: "200m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
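The probe settings above mean the kubelet waits 40 seconds before the first liveness check (giving migrations time to run), then repeats every 30 seconds; readiness starts at 10 seconds and repeats every 5. A quick sketch of the approximate schedule:

```python
def probe_times(initial_delay_s, period_s, count):
    """Approximate start times (seconds after container start) of the first probes."""
    return [initial_delay_s + i * period_s for i in range(count)]

liveness = probe_times(40, 30, 3)   # first three liveness checks: [40, 70, 100]
readiness = probe_times(10, 5, 3)   # first three readiness checks: [10, 15, 20]
```

If your database migrations take longer than 40 seconds on first boot, raise `initialDelaySeconds` on the liveness probe so the pod isn't killed mid-migration.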
Service & Ingress
Service Configuration
# LiteLLM Service
apiVersion: v1
kind: Service
metadata:
  name: litellm-service
  namespace: ai-services
spec:
  selector:
    app: litellm
  ports:
    - port: 4000
      targetPort: 4000
      name: http
  type: ClusterIP
Ingress with TLS
# LiteLLM Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: litellm-ingress
  namespace: ai-services
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - litellm.example.com
      secretName: litellm-tls
  rules:
    - host: litellm.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: litellm-service
                port:
                  number: 4000
Verification & Testing
Deployment Validation
Verify that your LiteLLM deployment is running correctly with Vault secrets integration and database connectivity.
# Check ExternalSecret synchronization
kubectl get externalsecret -n ai-services
kubectl describe externalsecret litellm-secrets -n ai-services
# Verify secrets are created
kubectl get secrets -n ai-services
kubectl describe secret litellm-secrets -n ai-services
# Check LiteLLM pods
kubectl get pods -n ai-services
kubectl logs -n ai-services deployment/litellm
# Test health endpoints
kubectl port-forward -n ai-services svc/litellm-service 4000:4000
curl http://localhost:4000/health
# Test via ingress
curl https://litellm.example.com/health
curl https://litellm.example.com/docs
API Testing
Test the LiteLLM API with your master key:
curl -X POST https://litellm.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
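The same request can be scripted. The helper below only assembles the OpenAI-compatible payload that the LiteLLM gateway expects — the URL and key are placeholders, and the actual HTTP call is left to whichever client you prefer:

```python
import json

def chat_completion_request(base_url, master_key, model, user_message):
    """Assemble an OpenAI-compatible chat request for the LiteLLM gateway."""
    return {
        "url": f"{base_url}/v1/chat/completions",
        "headers": {
            "Authorization": f"Bearer {master_key}",  # master or virtual key
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
        }),
    }

req = chat_completion_request("https://litellm.example.com", "YOUR_MASTER_KEY",
                              "gpt-3.5-turbo", "Hello!")
```

Because LiteLLM speaks the OpenAI wire format, any OpenAI-compatible SDK can also be pointed at the gateway by overriding its base URL and API key.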
Security Benefits
Dynamic Secrets
- Automatic secret rotation without redeployment
- Short-lived tokens with configurable TTL
- Centralized secrets management
- No secrets hardcoded in manifests
Access Control
- Policy-based access to secrets
- Service account authentication
- Namespace isolation
- Principle of least privilege
Audit & Compliance
- Complete audit trail of secret access
- Encryption at rest and in transit
- Integration with monitoring systems
- Compliance with security standards
Secure AI Infrastructure
You now have LiteLLM running in Kubernetes with proper secrets management via HashiCorp Vault. This setup gives you dynamic secrets, automatic rotation, policy-based access control, and comprehensive audit trails: everything you need for production AI workloads handling sensitive API keys and user data.
The External Secrets Operator bridge keeps your secrets synchronized while maintaining Kubernetes-native workflows. Your AI infrastructure is now ready for production deployment with proper security controls.
🔒 Secure • 📈 Scalable • 🚀 Production-Ready
