Secrets Management: Vault, Kubernetes Secrets, and Env Vars

Learn how to securely manage secrets, API keys, and credentials across microservices using HashiCorp Vault, Kubernetes Secrets, and best practices.

Published: March 24, 2026. Reading time: 35 min. Author: GeekWorkBench.


Every microservice needs credentials. Database passwords, API keys, TLS certificates, encryption keys. The list adds up fast. In a monolith, you could probably get away with a config file or some environment variables on each server. That stops working when you have dozens of services scaling dynamically across multiple clouds.

This post covers what secrets actually are in microservices, the difference between static and dynamic approaches, how Kubernetes handles them, and why HashiCorp Vault has become essential infrastructure for production deployments.

Introduction

A secret is any credential or sensitive data that authenticates or authorizes access to something. API keys, database passwords, OAuth tokens, SSH keys, TLS certificates.

The tricky part in microservices is that services communicate with each other and with external systems. Each connection typically requires authentication. A payment service needs database credentials. An API gateway needs to validate JWT tokens. Service-to-service calls need mTLS certificates.

In a monolith, you might have put these in a config.yaml on each server. With hundreds of services scaling dynamically across multiple clouds, that breaks down.

Core Concepts

Not all secrets are the same. The distinction between static and dynamic secrets affects both security and operational complexity.

Static Secrets

Static secrets are long-lived credentials that do not change unless someone manually rotates them. Database passwords, API keys, static TLS certificates fit here. These secrets persist for weeks, months, or even years.

The rotation problem is real. If a database password never changes and someone obtains it through a breach, they have permanent access until you manually update it. Most organizations update static secrets infrequently because rotation requires coordinating changes across multiple services and environments, a painful process that often introduces downtime risk.

Dynamic Secrets

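Dynamic secrets are generated on demand with short lifespans, typically minutes to hours. A service requests temporary credentials and they expire before an attacker can do much with them. Vault generates them natively. Cloud platforms offer comparable short-lived credentials (AWS STS, for example), while AWS Secrets Manager and GCP Secret Manager primarily store static secrets and help schedule their rotation.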

Kubernetes Secrets

Kubernetes has a built-in Secrets resource, but it has limitations that catch many teams off guard.

How Kubernetes Secrets Work

Kubernetes Secrets use base64 encoding. Creating a secret looks like this:

apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  username: cG9zdGdyZXM=
  password: c2VjcmV0cGFzc3dvcmQ=

The values are base64 encoded, not encrypted. Anyone who can read Secret objects in the namespace, or create pods that mount them (developers, CI/CD pipelines), can decode those values trivially. Run echo "cG9zdGdyZXM=" | base64 -d and you get the plaintext username.
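To make the point concrete, here is a minimal Python sketch that decodes the two values from the manifest above. The helper name decode_secret_data is illustrative and not part of any Kubernetes client library.

```python
import base64

# Values copied from the db-credentials manifest above.
secret_data = {
    "username": "cG9zdGdyZXM=",
    "password": "c2VjcmV0cGFzc3dvcmQ=",
}


def decode_secret_data(data: dict[str, str]) -> dict[str, str]:
    """Decode every base64 value in a Secret's data map to plaintext."""
    return {key: base64.b64decode(value).decode("utf-8") for key, value in data.items()}


decoded = decode_secret_data(secret_data)
print(decoded)
```

No key or password is needed for the decoding step, which is why base64 offers no confidentiality on its own.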

By default, Kubernetes stores secrets in etcd unencrypted. Encryption at rest requires explicit configuration on the API server. Many managed Kubernetes services enable this by default. Self-hosted clusters often do not.

The Real Limitation

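Kubernetes Secrets are not really secrets. They are essentially ConfigMaps with base64 encoding. Anyone with read access to Secrets in a namespace can read them. RBAC can restrict get requests to named Secrets, but a list or watch grant exposes every Secret in the namespace, so fine-grained control is hard to get right. There is no built-in secret rotation either.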

Teams use Kubernetes Secrets because they are built-in and simple, then hit problems when they need audit logs, automatic rotation, or integration with external secret stores.

For small deployments or early-stage projects, Kubernetes Secrets might be enough. For production systems handling sensitive data, you will probably need something more robust.

HashiCorp Vault

Vault is a dedicated secrets management tool that addresses the gaps in Kubernetes Secrets. It gives you centralized secret storage, dynamic secrets, encryption as a service, and detailed audit logs.

Vault Architecture

Vault uses an architecture that separates concerns cleanly:

graph TD
    subgraph Clients
        S1[Service A]
        S2[Service B]
        S3[Service C]
    end
    subgraph Vault
        SA[Storage Backend]
        LS[Logical System]
        PL[Plugin Layer]
    end
    S1 -->|Request secrets| LS
    S2 -->|Request secrets| LS
    S3 -->|Request secrets| LS
    LS -->|Persist| SA
    LS -->|Execute| PL
    SA[(Storage<br/>etcd/Consul/S3)]

The storage backend holds encrypted data. The logical system handles the API and secret engines. The plugin layer extends functionality for things like PKI certificates or cloud provider integrations.

Secret Engines

Vault mounts components called secrets engines at logical paths. Each engine handles a specific type of secret:

Key-Value (KV): Static secrets stored at a path. Simple key-value storage for passwords, API keys, or any credential.
Database: Generates dynamic database credentials on demand. Supports PostgreSQL, MySQL, MongoDB, and many others.
PKI: Generates TLS certificates automatically. Handles certificate lifecycle including issuance, renewal, and revocation.
AWS: Generates temporary IAM credentials for AWS resources.
Kubernetes: Issues Kubernetes service account tokens and can optionally create the service accounts, roles, and role bindings behind them.

Authentication Methods

Services do not just retrieve secrets - they authenticate to Vault first. Vault supports multiple authentication methods:

Kubernetes Service Account: Uses Kubernetes service account tokens to authenticate. The most common approach for services running in Kubernetes.
AppRole: A role-based authentication method for machines or applications.
JWT/OIDC: For services using JWT-based authentication.
TLS Certificates: For services with valid client certificates.

The typical flow for a Kubernetes workload looks like this:

Pod starts and gets its service account token mounted
Pod authenticates to Vault using its Kubernetes service account
Vault validates the token with Kubernetes
Vault returns the requested secrets
Pod uses secrets to connect to databases or other services
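The login step in this flow is a single HTTP call. The Python sketch below builds that request without sending it, assuming the Kubernetes auth method is mounted at its default path (auth/kubernetes). The Vault address, role name, and JWT value are placeholders, and the helper names are illustrative.

```python
import json
import urllib.request

# Default path where Kubernetes mounts the pod's service account token.
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def build_login_request(vault_addr: str, role: str, jwt: str) -> urllib.request.Request:
    """Build the POST that exchanges a service account JWT for a Vault token."""
    body = json.dumps({"role": role, "jwt": jwt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{vault_addr.rstrip('/')}/v1/auth/kubernetes/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_client_token(response_body: bytes) -> str:
    """Pull the Vault token out of a successful login response."""
    return json.loads(response_body)["auth"]["client_token"]


# Placeholder address and role; in a pod the JWT would be read from TOKEN_PATH.
request = build_login_request("https://vault.example.com:8200", "myapp-role", "example-jwt")
print(request.get_method(), request.full_url)
```

In a real pod you would pass the request to urllib.request.urlopen and feed the response body to extract_client_token. Most teams let Vault Agent or an official client library handle this exchange rather than hand-rolling it.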

Dynamic Secrets in Action

The database secret engine demonstrates Vault’s power. Instead of sharing a static database password across all services, each service gets its own temporary credentials:

# A service requests database credentials
vault read database/creds/myapp-role

# Vault returns temporary credentials
Key                Value
---                -----
lease_id           database/creds/myapp-role/xyz123
lease_duration     1h
username           v-token-myapp-role-xyz123
password           A1a-xxxxxxxxxxxxx

The service uses these credentials to connect to the database. After one hour, Vault revokes the credentials automatically. The next time the service needs database access, it requests new credentials from Vault.

This approach means no service ever knows the master database password. Compromise of one service only exposes temporary credentials that expire quickly.
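A service consuming these credentials needs to know when they stop working. The Python sketch below parses a response shaped like the HTTP API's reply for database/creds and converts the lease duration into an absolute expiry. The DatabaseLease type and the sample body are illustrative and mirror the CLI output above.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DatabaseLease:
    lease_id: str
    username: str
    password: str
    expires_at: datetime


def parse_database_lease(response_body: str, issued_at: datetime) -> DatabaseLease:
    """Turn a database/creds response into credentials with an absolute expiry."""
    payload = json.loads(response_body)
    return DatabaseLease(
        lease_id=payload["lease_id"],
        username=payload["data"]["username"],
        password=payload["data"]["password"],
        # The HTTP API reports lease_duration in seconds.
        expires_at=issued_at + timedelta(seconds=payload["lease_duration"]),
    )


# Sample body mirroring the CLI output above (1h = 3600 seconds).
sample = json.dumps({
    "lease_id": "database/creds/myapp-role/xyz123",
    "lease_duration": 3600,
    "renewable": True,
    "data": {"username": "v-token-myapp-role-xyz123", "password": "A1a-xxxxxxxxxxxxx"},
})
issued = datetime(2026, 3, 24, 10, 0, tzinfo=timezone.utc)
lease = parse_database_lease(sample, issued)
print(lease.username, lease.expires_at.isoformat())
```

Storing the expiry as an absolute timestamp makes it easy to decide later whether a cached credential is still safe to use.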

Service Accounts and Workload Identity

Modern microservices should not rely on shared static credentials. Workload identity provides cryptographically verifiable identity for services running in cloud environments or Kubernetes.

SPIFFE (Secure Production Identity Framework for Everyone) defines a standard for workload identity. SPIFFE IDs are URIs that uniquely identify workloads:

spiffe://example.com/ns/payment/sa/payment-service

This SPIFFE ID can be embedded in X.509 certificates or JWTs, letting services prove their identity without shared secrets. Service mesh solutions like Istio and Linkerd automatically issue SPIFFE certificates to workloads.
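As a rough illustration of the ID format, the Python sketch below splits a SPIFFE ID into its trust domain and workload path. It checks structure only. Real verification means validating the certificate or JWT that carries the ID, which this sketch does not attempt. The function names are illustrative.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id: str) -> tuple[str, str]:
    """Split a SPIFFE ID into (trust_domain, workload_path)."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe":
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    if not parts.netloc:
        raise ValueError("SPIFFE ID is missing a trust domain")
    if parts.query or parts.fragment:
        raise ValueError("SPIFFE IDs must not carry a query or fragment")
    return parts.netloc, parts.path


def is_expected_workload(spiffe_id: str, trust_domain: str, path: str) -> bool:
    """Check a presented identity against the one a caller is allowed to have."""
    return parse_spiffe_id(spiffe_id) == (trust_domain, path)


trust_domain, path = parse_spiffe_id("spiffe://example.com/ns/payment/sa/payment-service")
print(trust_domain, path)
```

An authorization rule can then compare the parsed trust domain and path against an allow-list instead of comparing passwords.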

Workload identity shifts the security model from “protect the secrets” to “verify the identity”. Instead of obsessing about database passwords being leaked, you verify that the service presenting credentials is actually the payment service.

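Vault can authenticate workloads by their SPIFFE identity, though not through the Kubernetes auth method, which validates service account tokens. X.509 SVIDs work with the TLS certificate auth method and JWT SVIDs with the JWT/OIDC auth method, so services receive Vault secrets based on their identity.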

Secret Rotation Strategies

Rotation is where many secret management strategies fall apart. A good rotation strategy should be automated, have minimal blast radius, and not require downtime.

Rotation Patterns

Time-based rotation: Secrets expire after a fixed period. Services get new credentials before expiration. Simple approach but requires services to handle credential refresh gracefully.

Event-based rotation: Rotation triggers on specific events - a suspected compromise, an employee leaving, a compliance requirement. This requires coordination across services but ensures secrets do not persist indefinitely after a security incident.

Usage-based rotation: With dynamic secrets from Vault, each request issues a fresh credential bound to a lease, so rotation happens as a side effect of normal use. Services always have fresh credentials without manual intervention.

Implementing Rotation

The key to successful rotation is designing services to handle credential refresh:

Cache credentials locally
Before using a credential, check if it is about to expire
If expiring soon, request new credentials proactively
Handle authentication failures gracefully and retry with fresh credentials
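The four steps above can be sketched in Python as a small cache. This is one possible shape under stated assumptions: the fetch callable stands in for whatever call your service makes to Vault, the 300-second margin is arbitrary, and PermissionError stands in for your database driver's authentication error.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Credential:
    username: str
    password: str
    expires_at: float  # seconds on the same clock the cache uses


class CredentialCache:
    """Caches one credential and refreshes it shortly before it expires."""

    def __init__(self, fetch: Callable[[], Credential], refresh_margin: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._refresh_margin = refresh_margin
        self._clock = clock
        self._current: Optional[Credential] = None

    def get(self) -> Credential:
        # Refresh when nothing is cached or expiry is within the margin.
        if self._current is None or self._current.expires_at - self._clock() <= self._refresh_margin:
            self._current = self._fetch()
        return self._current

    def invalidate(self) -> None:
        """Drop the cached credential, e.g. after an authentication failure."""
        self._current = None


def with_fresh_credentials(cache: CredentialCache, operation):
    """Run operation(credential); on PermissionError retry once with new credentials."""
    try:
        return operation(cache.get())
    except PermissionError:
        cache.invalidate()
        return operation(cache.get())


def fetch_from_vault_stub() -> Credential:
    # Hypothetical stand-in for a real Vault call; returns a one-hour credential.
    return Credential("v-token-example", "not-a-real-password", time.monotonic() + 3600)


cache = CredentialCache(fetch_from_vault_stub)
print(cache.get().username)
```

Injecting the clock and the fetch function keeps the refresh logic testable without a running Vault server.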

Many teams use a sidecar container that handles credential refresh. The sidecar communicates with Vault, stores credentials in a shared volume, and the main container reads from that volume. This separation lets you update the sidecar logic without touching your application.
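On the application side of the sidecar pattern, the main container has to notice when the file on the shared volume changes. The Python sketch below polls the file and reports new contents by comparing digests. The class name is illustrative, and the file path is whatever your sidecar is configured to write.

```python
import hashlib
from pathlib import Path
from typing import Optional


class SecretFileWatcher:
    """Detects when a credential file on a shared volume has new contents."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._last_digest: Optional[str] = None

    def poll(self) -> Optional[str]:
        """Return the new secret if the file changed since the last poll, else None."""
        content = self._path.read_bytes()
        digest = hashlib.sha256(content).hexdigest()
        if digest == self._last_digest:
            return None
        self._last_digest = digest
        return content.decode("utf-8").strip()
```

A service would call poll() on a timer and, when it returns a value, drain existing connections and reconnect with the new credential. Comparing digests rather than modification times avoids reconnecting when the sidecar rewrites an unchanged value.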

External Secrets Operators

External Secrets Operators (ESO) integrate external secret stores like Vault, AWS Secrets Manager, or GCP Secret Manager with Kubernetes. Instead of manually copying secrets into Kubernetes, you reference external secrets in your Kubernetes manifests.

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: db-credentials
    creationPolicy: Owner
  data:
    - secretKey: password
      remoteRef:
        key: secret/myapp/db-password
        property: password

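When you apply this manifest, the operator syncs the secret from Vault into a Kubernetes Secret. The operator does not rotate the secret itself. It re-reads Vault on every refresh interval, so a value rotated in Vault reaches the cluster within that window.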

This gives you the best of both worlds: secrets live in your centralized secrets store with full audit logs and access control, while your applications keep using standard Kubernetes Secrets.

The External Secrets Operator supports multiple backends: Vault, AWS Secrets Manager, Azure Key Vault, GCP Secret Manager, and more.

CI/CD and GitOps Integration

Secrets management does not stop at runtime. CI/CD pipelines need access to secrets to deploy applications. GitOps workflows need secrets to synchronize configuration.

Injecting Secrets at Deploy Time

The pattern that works well: secrets never live in git, but CI/CD pipelines retrieve them at deploy time. Your deployment process looks like this:

CI/CD pipeline triggers on a git commit
Pipeline authenticates to Vault or your cloud secrets manager
Pipeline retrieves the required secrets
Secrets are injected into the pod at deploy time (via dynamic secrets or ESO)
Application runs with short-lived credentials

Many teams use Kubernetes service account JWTs for this. The CI/CD runner has a service account with permissions to request specific secrets from Vault.

GitOps Considerations

In a GitOps model, your desired state lives in git. The problem is reconciling the fact that secrets cannot live in git.

The solutions:

External Secrets Operator: Reference external secrets in your git manifests. The operator syncs actual values from your secrets store.
Sealed Secrets: Encrypt secrets before committing to git. Only the cluster can decrypt them.
Vault Agent: Run a vault agent sidecar that handles authentication and secret retrieval.

The External Secrets Operator approach has become popular because it keeps git manifests clean and relies on established external secrets stores for the actual sensitive values.

When to Use / When Not to Use

Scenario	Use This Approach	Notes
Production microservices with dynamic scaling	Vault Dynamic Secrets	Short-lived credentials work well with scaling
Small project or early-stage startup	Kubernetes Secrets	Simpler to start, but plan migration to Vault
CI/CD pipeline credentials	Vault or Cloud Secrets Manager	Centralized audit trail for pipeline access
Multi-cloud or hybrid environment	Vault	Cloud-agnostic, works across AWS/GCP/Azure
Database credentials for PostgreSQL/MySQL	Vault Database Secrets Engine	Dynamic credentials per service
TLS certificates	Vault PKI or cert-manager	cert-manager is simpler for Kubernetes-native; Vault for unified secrets
External third-party API keys	Vault KV with rotation	Static but with automated rotation
Kubernetes-native only, simple needs	Kubernetes Secrets + ESO	External Secrets Operator syncs from external stores
Legacy VMs outside Kubernetes	Vault Agent	Runs as a host daemon; Vault works outside K8s too

Trade-off Analysis

Aspect	Kubernetes Secrets	HashiCorp Vault
Encryption at rest	Requires explicit config (etcd encryption)	Encrypted by default
Dynamic secrets	No native support	Database, AWS, GCP, PKI engines
Secret rotation	Manual or via external tools	Native rotation engines
Access control	RBAC only (namespace, SA)	Fine-grained policies per path
Audit logging	API server audit logs (if configured)	Comprehensive audit log
High availability	Built into K8s control plane	Requires HA configuration
Learning curve	Low	Higher
Cost	Included with K8s	Open source free; Enterprise for advanced features

When Not to Use Plain Kubernetes Secrets

Secrets containing sensitive data: base64 encoding is not encryption; anyone with cluster access can decode
Production systems with compliance requirements: No record of who read which Secret unless API server audit logging is configured
Multi-team clusters where separation is needed: RBAC is coarse-grained
When you need dynamic credentials: K8s Secrets are static by nature

Best Practices

Managing secrets well requires consistent discipline across your organization. These practices matter most.

Never Commit Secrets to Version Control

This should go without saying, but it keeps happening. API keys, database passwords, and private keys should never appear in git repositories. Not even in private repositories.

Use tools like git-secrets, Talisman, or pre-commit hooks to scan commits for accidentally committed secrets. Treat any secret that appears in version control as compromised. Rotate it immediately.
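To show what such scanners do under the hood, here is a deliberately small Python sketch that flags two well-known patterns: AWS access key IDs and PEM private key headers. The pattern set and function name are illustrative. Dedicated tools ship far larger rule sets and should be preferred in practice.

```python
import re

# Illustrative patterns only; dedicated scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
sample = "region = us-east-1\naws_key = AKIAIOSFODNN7EXAMPLE\n"
print(find_secrets(sample))
```

A pre-commit hook could run a check like this over staged content and block the commit when the result is non-empty.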

A .gitignore file should exclude any file that might contain secrets. Be particularly careful with .env files, config maps, and any file with “secret” or “credential” in the name.

Least Privilege Access

Services should only have access to the secrets they need. A logging service does not need database credentials. An API service does not need access to your payment processing secrets.

In Vault, define policies that restrict which paths each service can read. In Kubernetes, use RBAC to limit who can read Secrets in each namespace.

Audit access regularly. Who is actually reading secrets in production? Are there services that request secrets they no longer use? Periodic reviews catch accumulation of unnecessary access.

Audit Everything

You need to know who accessed what, when, and from where. Vault provides comprehensive audit logging including:

Every authentication attempt (success and failure)
Every secret read
Every secret created or updated
Client identity and source IP for each request

Cloud providers offer similar logging for their secrets managers. Enable those logs and ship them to your SIEM or log aggregation system.

Audit logs serve multiple purposes: detecting unauthorized access, troubleshooting operational issues, and meeting compliance requirements.

Use Short-Lived Credentials

Dynamic secrets with short lifespans reduce risk significantly. If a credential expires in 15 minutes, an attacker has a narrow window to use it. Static credentials that last a year give attackers plenty of time.

Where dynamic secrets are not possible, use the shortest acceptable lifetime for static secrets. Review password policies and certificate expiration periods. Do you really need that API key to last forever?

Production Failure Scenarios

Failure Scenarios and Mitigations

Scenario: Vault Unavailable

Symptoms: Services cannot retrieve secrets. Applications fail to start or start with stale cached credentials. Dynamic credentials stop working.

Diagnosis:

# Check Vault pods
kubectl get pods -n vault -l app=vault

# Check Vault status
kubectl exec -n vault vault-0 -- vault status

# Check for seal status
kubectl exec -n vault vault-0 -- vault status | grep Sealed

# Test Vault API
kubectl exec -n vault vault-0 -- curl -s http://localhost:8200/v1/sys/health

Mitigation:

If Vault is sealed, unseal it using stored unseal keys (auto-unseal is preferred for production)
If Vault pods are down, check resource constraints (OOMKilled) or scheduling issues
Services with cached secrets may continue working if cache is still valid
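If Vault is overloaded, note that open source HA standbys forward requests to the active node and read-serving performance standbys require Vault Enterprise; reduce request volume with client-side caching (Vault Agent) or give the active node more resources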

Prevention:

Use auto-unseal with cloud KMS (AWS KMS, GCP Cloud KMS, Azure Key Vault)
Run Vault in HA mode with multiple pods
Configure pod anti-affinity for Vault servers
Monitor Vault health and set alerts for sealed/unavailable status

Scenario: Kubernetes Service Account Token Not Validating Against Vault

Symptoms: Services cannot authenticate to Vault using Kubernetes auth method. Logs show “permission denied” or “invalid token” errors.

Diagnosis:

# Check if service account token is mounted
kubectl exec -it <pod> -- cat /var/run/secrets/kubernetes.io/serviceaccount/token

# Check Vault auth method status
kubectl exec -n vault vault-0 -- vault auth list

# Check the role exists
kubectl exec -n vault vault-0 -- vault read auth/kubernetes/role/<role-name>

# Test token validation locally
kubectl exec -n vault vault-0 -- vault write auth/kubernetes/login \
  role=<role-name> jwt=<token-from-pod>

Mitigation:

Verify the service account exists and has the correct name
Check if the Vault role is configured with correct service account name and namespace
Verify the cluster’s CA cert is configured in Vault’s Kubernetes auth method
If token was regenerated, the old token is invalid; restart the pod to get a fresh token

Prevention:

Keep service accounts stable (recreating one invalidates tokens issued for it) and have clients re-read the mounted token file instead of caching it at startup
Monitor Vault auth method configuration for unexpected changes
Test Kubernetes auth method monthly

Scenario: ESO Sync Failing

Symptoms: ExternalSecret resource exists but Kubernetes Secret is not created. ESO operator logs show errors.

Diagnosis:

# Check ESO operator logs
kubectl logs -n external-secrets deployment/external-secrets-operator

# Check ExternalSecret status
kubectl get externalsecret -n <namespace> -o yaml

# Check secret store connectivity
kubectl get secretstore -n <namespace>

# Verify ESO can reach Vault
kubectl exec -n external-secrets deployment/external-secrets-operator -- curl -s https://vault.example.com/v1/sys/health

Mitigation:

Verify SecretStore CRD is correctly configured with Vault address and auth method
Check ESO operator has ClusterRole permissions to read ExternalSecret and SecretStore resources
If ESO pod restarted recently, wait for reconciliation
Delete and recreate ExternalSecret to force re-sync

Prevention:

Monitor ExternalSecret status with Prometheus metrics
Set alerts for ESO reconciliation errors
Use ESO with Vault’s Kubernetes auth method for automatic credential management

Scenario: Database Credentials Not Rotating

Symptoms: Vault shows one set of credentials issued but database shows different user active. Dynamic credentials not being renewed.

Diagnosis:

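# List active lease IDs for the role
# (reading database/creds/myapp-role here would issue new credentials)
vault list sys/leases/lookup/database/creds/myapp-role

# Check current database users
# For PostgreSQL: SELECT * FROM pg_user;

# Inspect one lease's TTL and renewal status
vault lease lookup database/creds/myapp-role/<lease-id>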

# Check Vault logs for renewal errors
kubectl logs -n vault vault-0 | grep renew

Mitigation:

If lease expired, Vault automatically revokes credentials; new credentials will be issued on next request
Services must request new credentials when lease expires; check service implementation
If using ESO, verify the refresh interval is configured correctly
Database might have reached max connections due to accumulated stale credentials

Prevention:

Implement credential refresh logic in services before lease expires
Monitor lease expiration metrics
Set database max connections with auto-cleanup for abandoned connections
Test renewal process in staging

Observability Hooks

Metrics to Capture

Metric	What It Tells You	Alert Threshold
`vault_secret_access_total`	Secret read rate by path	Unexpected paths accessed
`vault_lease_renewal_success_total`	Dynamic credential renewal success	<99.5%
`vault_lease_expiration_seconds`	Time until dynamic credentials expire	<300s warning, <60s critical
`vault_auth_method_failure_total`	Auth failures by method (k8s, AppRole)	>1% failure rate
`eso_sync_success_total`	External Secrets Operator sync success	<99%
`eso_sync_error_total`	ESO sync errors by type	Any increase

Logs to Collect

From Vault (structured logging):

{
  "event": "secret_accessed",
  "client_namespace": "default",
  "client_service_account": "payment-service",
  "secret_path": "secret/myapp/db-password",
  "access_result": "allowed|denied",
  "remote_addr": "10.0.0.5",
  "timestamp": "2026-03-24T10:30:00Z"
}

{
  "event": "lease_renewed",
  "lease_id": "database/creds/myapp-role/xyz123",
  "ttl_seconds": 3600,
  "renewed_by": "spire-agent",
  "timestamp": "2026-03-24T10:00:00Z"
}

{
  "event": "auth_failure",
  "auth_method": "kubernetes",
  "client_namespace": "default",
  "client_service_account": "unknown-sa",
  "failure_reason": "invalid_token|role_not_found|token_expired",
  "timestamp": "2026-03-24T10:30:00Z"
}

Key log fields: client identity (namespace, SA), secret path, access result, failure reason, source IP.

Traces to Capture

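Enable Vault telemetry with a Prometheus sink. Metric names differ between Vault versions and exporters, so treat the names in this section as placeholders to map onto what your deployment emits. Key metrics to track: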

vault.secret.gen.success: Dynamic secret generation success
vault.expire.lease.expiration: Leases approaching expiration
vault.auth.kubernetes.fail: Kubernetes auth failures

Dashboards to Build

Vault Health: Active leases, auth success rate, storage utilization, seal status
Secret Access Patterns: Read rate by path, top accessed secrets, denied access attempts
Dynamic Credential Lifecycle: Active credentials by type, expiration heatmap, renewal success rate
ESO Operations: Sync success rate, error breakdown, reconciliation latency

Alerting Rules

# Vault unavailable
- alert: VaultUnavailable
  expr: up{job="vault"} == 0
  labels:
    severity: critical
  annotations:
    summary: "Vault is unavailable"

# Vault sealed
- alert: VaultSealed
  expr: vault_core_unsealed == 0
  labels:
    severity: critical
  annotations:
    summary: "Vault is sealed"

# Auth failures
- alert: VaultAuthFailureRate
  expr: rate(vault_auth_method_failure_total[5m]) > 0.01
  labels:
    severity: warning
  annotations:
    summary: "Vault auth failures above 0.01 per second over 5m"

# Lease expiring soon
- alert: DynamicCredentialExpiring
  expr: vault_lease_expiration_seconds < 300
  labels:
    severity: warning
  annotations:
    summary: "Dynamic credential expiring in {{ $value }} seconds"

# ESO sync failures
- alert: ESOSyncFailure
  expr: rate(eso_sync_error_total[5m]) > 0
  labels:
    severity: warning
  annotations:
    summary: "External Secrets Operator sync failures detected"

Interview Questions

1. You discover a service account key JSON file was accidentally committed to a GitHub repository. What do you do?

Treat this as a critical security incident. First, revoke the key immediately using gcloud iam service-accounts keys delete (for GCP) or the equivalent for AWS/Azure. Do not wait: the key is compromised the moment it is public. Second, check Cloud Audit Logs (or CloudTrail on AWS) for all usage of that key since the commit timestamp to determine whether it was actually exploited. Third, revoke any other existing keys for that service account and issue a new one only where a key is still required. Fourth, move any CI/CD pipelines that used the old key to Workload Identity so they no longer need a key file. Finally, run a retrospective: why was the key committed? Was a pre-commit hook missing, was .gitignore misconfigured, or was it an IDE accident?

2. What is the difference between ESO (External Secrets Operator) and using Vault's Kubernetes auth method directly?

ESO syncs secrets from external secret stores (AWS Secrets Manager, GCP Secret Manager, HashiCorp Vault) into Kubernetes Secrets objects. Your application reads Kubernetes Secrets as env vars or volume mounts — no application code changes needed. Vault's Kubernetes auth method lets pods authenticate to Vault directly using their Kubernetes service account and retrieve secrets at runtime, without persisting secrets into Kubernetes Secrets. ESO is simpler for existing applications that expect Kubernetes Secrets. Vault's direct auth is better for security (secrets never exist as Kubernetes objects) and for short-lived workloads where dynamic secret generation per pod is needed. Both are better than storing static credentials in Kubernetes Secrets.

3. How do you rotate a database password for a production service without downtime?

Use a dual-update approach: update the secret in Vault or your secrets manager, then update the Kubernetes secret using ESO or kubectl patch. The application needs to support credential rotation without restart — implement graceful reload: when the application detects the secret has changed (via a Kubernetes watch on the secret, a SIGUSR1 signal, or a periodic file check), it closes existing connections and reconnects with the new credentials. For applications that do not support dynamic reload, use connection pooling via a proxy like PgBouncer where the proxy holds the database credentials and the application connects through the proxy. This way you rotate the database password at the proxy level without touching application configs.

4. A developer says "we do not need Vault, we can just use Kubernetes Secrets." How do you respond?

Kubernetes Secrets are base64-encoded by default, not encrypted — anyone with API access or etcd access can read them. They are also not auditable, have no secret versioning, no dynamic secret generation, and no automatic rotation built in. Kubernetes Secrets are fine for non-sensitive, low-risk credentials like a staging environment's public API key. For production with real credentials, service account keys, database passwords, and TLS certificates, a dedicated secrets manager (Vault, AWS Secrets Manager, GCP Secret Manager) provides encryption at rest, audit logging, access policies, secret versioning, automatic rotation, and short-lived credentials. The right answer depends on the sensitivity of what you are protecting.

5. Explain the difference between static and dynamic secrets. When would you use each?

Static secrets are long-lived credentials that persist for weeks, months, or years — database passwords, API keys, static TLS certificates. They require manual rotation and if compromised give attackers permanent access. Dynamic secrets are generated on-demand with short lifespans (minutes to hours) — Vault database credentials, temporary IAM roles, short-lived certificates. Dynamic secrets eliminate the rotation problem since credentials constantly refresh, and even if intercepted they expire before meaningful damage. Use static secrets for things that genuinely need to persist (like an external API key you cannot control rotation for). Use dynamic secrets for anything internal where you control the system — database access, cloud credentials, service-to-service authentication.

6. How does Vault's Kubernetes authentication method work end-to-end?

The flow: (1) Pod starts with a Kubernetes service account token mounted at /var/run/secrets/kubernetes.io/serviceaccount/token. (2) Pod's application calls Vault's Kubernetes auth endpoint with its service account token and the role name it wants. (3) Vault's Kubernetes auth method validates the token against the Kubernetes TokenReview API — verifying the token is valid, not expired, and bound to the correct service account and namespace. (4) Vault returns a Vault token with policies scoped to the requested role. (5) Pod uses the Vault token to read secrets from specific paths. (6) When the pod terminates, its Kubernetes token is automatically revoked, and any Vault leases it held are eventually expired. This ties identity at the Kubernetes layer directly to Vault access, with no static credentials to manage.

7. What is the blast radius problem with shared static credentials, and how do dynamic secrets solve it?

With shared static credentials (e.g., one database password used by 50 microservices), compromise of any single service exposes the credential to the attacker, giving them direct access to the shared resource (the database). The blast radius equals all services using that credential. Dynamic secrets solve this by giving each service its own unique credentials. If one service is compromised, only those specific credentials are exposed: the attacker can reach the database only with that role's privileges and only until the lease expires or is revoked. Vault's database secret engine creates per-lease database usernames like v-token-myapp-role-xyz123, so even if one service is compromised, the attacker gets a short-lived credential for only that one database role, not the master database password.

8. Your team is migrating from Kubernetes Secrets to Vault. What is your migration strategy?

Phased approach: (1) Deploy Vault and configure the Kubernetes auth method alongside existing Kubernetes Secrets — do not change anything yet. (2) Deploy External Secrets Operator as a bridge — ESO syncs Vault secrets into Kubernetes Secrets, so applications keep using K8s Secrets without code changes. (3) Migrate one team or one non-critical service at a time to read directly from Vault via the Kubernetes auth method. (4) Once confident, deprecate ESO syncing for migrated services and remove static credentials from Kubernetes Secrets. (5) Finally, audit that no static credentials remain in etcd or git. Key principle: do not try to migrate everything at once — maintain backward compatibility during the transition.

9. What is SPIFFE and how does it relate to secrets management?

SPIFFE (Secure Production Identity Framework for Everyone) defines a standard for workload identity: cryptographic identity for services in dynamic environments. SPIFFE IDs are URIs like spiffe://example.com/ns/payment/sa/payment-service that uniquely identify a workload. These IDs are embedded in X.509 certificates or JWTs, allowing services to prove their identity without shared secrets. SPIFFE shifts security from "protect the secrets" to "verify the identity": instead of obsessing about database passwords being leaked, you verify the service presenting credentials is actually the payment service. Service meshes like Istio and Linkerd automatically issue SPIFFE certificates. Vault can consume SPIFFE identities through its TLS certificate auth method (X.509 SVIDs) or its JWT/OIDC auth method (JWT SVIDs), so services receive Vault secrets scoped to their identity. The Kubernetes auth method is separate and validates service account tokens. This enables passwordless authentication between services.

10. Describe a scenario where ESO sync fails and how you diagnose it.

Common causes: (1) SecretStore misconfiguration — wrong Vault address, expired Kubernetes auth token, or incorrect role mapping. (2) ESO operator lacks ClusterRole permissions to read ExternalSecret and SecretStore CRDs. (3) Vault lease expired and ESO cannot renew — ESO needs its own Vault token with appropriate policies. (4) Network policy blocking ESO pod from reaching Vault. Diagnosis steps: check ESO operator logs (kubectl logs -n external-secrets deployment/external-secrets-operator), verify ExternalSecret status (kubectl get externalsecret -n namespace -o yaml), check SecretStore connectivity (kubectl get secretstore -n namespace), and test Vault reachability from ESO pod (kubectl exec -n external-secrets deployment/external-secrets-operator -- curl -s https://vault.example.com/v1/sys/health). Prevention: use ESO with Vault's Kubernetes auth method so ESO gets automatic credential rotation, and monitor ExternalSecret status with Prometheus metrics.

11. How would you design a secret rotation strategy for a service that cannot handle frequent credential changes?

The key is to decouple the rotation cycle from the connection lifecycle. Implement a grace period approach: rotate the secret in Vault but keep the old credential valid for a transition window (e.g., 24-48 hours). Your service picks up the new credential on its next restart or next credential check, while existing connections continue using the old credential until they naturally terminate. For services that hold long-lived connections, use a connection proxy (like PgBouncer for PostgreSQL): the proxy holds the database credentials and your service connects through it, so you rotate the password at the proxy level without touching application configs. Another pattern is a credential refresh signal, using SIGHUP or a periodic file check, where the service detects that the secret changed, gracefully drains existing connections, and reconnects with the new credentials. The worst approach is forcing credential rotation on a running service without graceful reload: you'll get connection failures and downtime.
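The periodic file check can be as small as the following sketch. It assumes the secret is delivered as a file (for example a mounted volume or a Vault Agent template output). The class name is hypothetical, and the drain-and-reconnect step is left to the caller.

```python
import hashlib
from pathlib import Path


class SecretFileWatcher:
    """Detects when the contents of a secret file change between checks."""

    def __init__(self, path):
        self.path = Path(path)
        self._digest = self._read_digest()

    def _read_digest(self) -> str:
        # Compare content hashes rather than mtimes: mounted secret volumes
        # are often swapped via symlinks, which makes timestamps unreliable.
        return hashlib.sha256(self.path.read_bytes()).hexdigest()

    def changed(self) -> bool:
        """Return True exactly once per content change."""
        digest = self._read_digest()
        if digest != self._digest:
            self._digest = digest
            return True
        return False
```

A service would call `changed()` on a timer (or from a SIGHUP handler) and, when it returns true, drain its connection pool and reconnect using the new file contents.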

12. What are the security implications of storing secrets in environment variables versus Kubernetes Secrets?

Both are weak options for production secrets. Kubernetes Secrets are base64-encoded, and unless encryption at rest is configured they are stored unencrypted in etcd, so anyone with sufficient API access or direct etcd access can decode them. Env vars are worse because they leak more easily: they get dumped into logs and crash reports, are readable through /proc/<pid>/environ (and tools such as ps e) by the same user or root, are inherited by every child process, and can be accidentally printed in error messages. A container running as non-root can still read /proc/self/environ to get its own env vars. The only advantage of env vars is simplicity, and that's not a security advantage. For production: enable encryption at rest for Secrets (ideally with a KMS provider), use RBAC so only authorized service accounts can read secrets, and prefer direct Vault integration (pods authenticate to Vault and receive secrets at runtime) over persisting secrets in Kubernetes Secrets at all. If you must use Kubernetes Secrets, treat etcd encryption and tight RBAC as the minimum.
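The inheritance problem is easy to demonstrate. This sketch sets a fake credential in the current process and shows that a child process sees it without being told, then shows the child started with a scrubbed environment. The variable name and value are placeholders.

```python
import os
import subprocess
import sys


def child_sees(var_name: str, env=None) -> str:
    """Spawn a child interpreter and return the value it sees for var_name."""
    code = f"import os; print(os.environ.get({var_name!r}, ''))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
        env=env,  # None means "inherit the parent's environment"
    )
    return result.stdout.strip()


os.environ["DEMO_DB_PASSWORD"] = "not-a-real-password"

# Default behaviour: the child inherits the secret.
inherited = child_sees("DEMO_DB_PASSWORD")

# Mitigation: pass an explicit environment with the secret removed.
scrubbed_env = {k: v for k, v in os.environ.items() if k != "DEMO_DB_PASSWORD"}
scrubbed = child_sees("DEMO_DB_PASSWORD", env=scrubbed_env)
```

Any helper binary, shell-out, or crash reporter your service launches gets the same view as `inherited` unless the environment is scrubbed explicitly.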

13. Explain the difference between Vault's KV v1 and KV v2 secret engines. Why does it matter?

KV v1 is a simple key-value store: write a secret, read it back, delete it. No versioning, no metadata. KV v2 adds versioning, metadata, and check-and-set (CAS) operations for atomic updates. The practical difference: with KV v1, if two clients update a secret at the same time, one of those updates is silently lost. With KV v2, you can pass the cas parameter with the version you last read, and the write fails if the version has changed since then. KV v2 also lets you recover from accidental deletes (old versions are preserved until destroyed) and attach metadata to secrets. The other key difference is paths: the KV v2 API inserts data/ after the mount point, so a secret written as secret/myapp/db-creds is read and authorized at secret/data/myapp/db-creds. Policies must use the data/ form. Most production deployments use KV v2. If you are still on KV v1, either re-create the secrets under a new KV v2 mount or upgrade the existing mount in place with the vault kv enable-versioning command, then update policies and client paths to match.
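The lost-update problem and the check-and-set fix can be shown with an in-memory stand-in. This is a simulation of the versioning semantics only, not Vault's API: the class and method names are invented, and the only behavior borrowed from KV v2 is that a write carrying cas=N succeeds only when the current version is N (with 0 meaning "the secret must not exist yet").

```python
class VersionedSecret:
    """In-memory stand-in for a single KV v2 secret (not Vault itself)."""

    def __init__(self):
        self.versions = []  # versions[0] holds version 1

    @property
    def current_version(self) -> int:
        return len(self.versions)

    def write(self, data: dict, cas=None) -> int:
        """Store a new version; with cas set, fail if someone wrote in between."""
        if cas is not None and cas != self.current_version:
            raise ValueError(
                f"check-and-set failed: expected version {cas}, "
                f"found {self.current_version}"
            )
        self.versions.append(dict(data))
        return self.current_version

    def read(self, version=None) -> dict:
        """Read the latest version, or an older one that is still preserved."""
        version = self.current_version if version is None else version
        return dict(self.versions[version - 1])


secret = VersionedSecret()
secret.write({"password": "v1"}, cas=0)      # create only if absent

seen_by_a = secret.current_version           # both writers read version 1
seen_by_b = secret.current_version
secret.write({"password": "from-a"}, cas=seen_by_a)

try:
    secret.write({"password": "from-b"}, cas=seen_by_b)
    second_write_failed = False
except ValueError:
    second_write_failed = True               # writer B must re-read and retry
```

Writer B's update is rejected instead of silently overwriting writer A's, and `secret.read(1)` still returns the original version.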

14. How does Vault's lease and renewal system work, and why is it important for secret management?

Vault attaches a lease to every dynamic secret and service token it issues. A database credential might have a 1-hour lease, a Vault token might have a 24-hour TTL. When a lease expires, Vault revokes the secret: the credential is removed from the database, the token stops working. Your application is responsible for renewing leases before they expire (vault lease renew for secrets, vault token renew for tokens) or requesting fresh credentials when the old ones are revoked. This is the key difference from static secrets: if you put a static password in a KV store and never touch it, it stays valid forever. With leases, Vault enforces expiration even if you forget about a secret. This is powerful for security, but it means your application needs a renewal loop or an agent that runs one for it. If you fail to renew and the lease expires, your service loses access to the credential. Vault also exposes lease telemetry (for example the vault.expire.lease_expiration metric), so you can monitor expirations and alert on them.
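The timing logic of a renewal loop is simple enough to sketch without a Vault client. The two-thirds threshold below is an assumed convention for illustration, not something the text above prescribes, and the function names are hypothetical. The actual renewal call is left out.

```python
def renewal_deadline(issued_at: float, lease_duration: float, fraction: float = 2 / 3) -> float:
    """Timestamp after which the lease should be renewed.

    Renewing at a fraction of the lease leaves headroom for retries
    before the lease actually expires.
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return issued_at + lease_duration * fraction


def should_renew(issued_at: float, lease_duration: float, now: float) -> bool:
    """True once the renewal deadline has passed."""
    return now >= renewal_deadline(issued_at, lease_duration)


def is_expired(issued_at: float, lease_duration: float, now: float) -> bool:
    """True once the lease itself has run out and the credential is revoked."""
    return now >= issued_at + lease_duration
```

A worker thread would check `should_renew` periodically, call the renewal endpoint when it turns true, and reset `issued_at` and `lease_duration` from the response. If `is_expired` is ever true, the service has to request a new credential rather than renew.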

15. What is the principle of least privilege in the context of Vault policies, and how do you implement it?

Least privilege means each service gets exactly the permissions it needs and nothing more. A payment service reading database credentials should not also be able to read the credentials for the logging service or the email service. In Vault, policies are attached to tokens and define which paths can be read, written, updated, or deleted. A good policy structure: create a policy per service or per team and grant read access only to the specific secret paths that service needs. Vault denies anything a policy does not explicitly grant, so there is nothing else to add. Example: payment-service-policy might allow read on secret/data/myapp/db-creds and grant nothing on secret/data/myapp/admin-creds. Test policies before applying them in production: use the vault policy read and vault token capabilities commands to verify what a token can actually access. Review policies quarterly, because permissions that accumulate over time are how services end up with more access than they need.
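The deny-by-default behavior can be modeled in a few lines. This is a deliberately simplified stand-in for Vault's policy evaluation: it supports only exact paths and a trailing `*` prefix, picks the longest matching prefix, and treats an explicit deny as overriding everything on that path. The policy contents are the hypothetical paths from the example above.

```python
def capabilities(policy: dict[str, set[str]], path: str) -> set[str]:
    """Return the capabilities a policy grants on a path (empty set = denied)."""
    if path in policy:
        granted = policy[path]
    else:
        # Trailing-star rules act as prefix matches; the longest prefix wins.
        prefixes = [
            rule[:-1] for rule in policy
            if rule.endswith("*") and path.startswith(rule[:-1])
        ]
        if not prefixes:
            return set()  # nothing matched: denied by default
        granted = policy[max(prefixes, key=len) + "*"]
    return set() if "deny" in granted else set(granted)


# Hypothetical policy for the payment service.
payment_service_policy = {
    "secret/data/myapp/db-creds": {"read"},
}

db_caps = capabilities(payment_service_policy, "secret/data/myapp/db-creds")
admin_caps = capabilities(payment_service_policy, "secret/data/myapp/admin-creds")
```

`db_caps` contains only `read`, and `admin_caps` is empty even though no deny rule was written. Against a real Vault, vault token capabilities gives the authoritative answer.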

16. Your company is adopting GitOps. How do you handle secrets in a GitOps workflow without committing them to git?

This is the fundamental GitOps secret problem: your desired state lives in git, but secrets cannot live in git. Three practical solutions, in order of preference: (1) External Secrets Operator: reference external secrets in your git manifests (kind: ExternalSecret), and ESO syncs the actual values from your secrets store at runtime. Your git repo has the reference, not the secret. (2) Sealed Secrets: encrypt secrets with a public key whose private half only the cluster holds, and commit the encrypted blobs to git. The cluster's sealed-secrets controller decrypts them. (3) Vault Agent Sidecar: run a Vault Agent in each pod that handles authentication to Vault and writes secrets to a shared volume. Your git manifests reference paths in Vault, not actual values. The Vault Agent sidecar approach is the most secure but also the most complex to set up. ESO is the most practical for most teams because it works with many external secrets stores (Vault, AWS, GCP, Azure) and keeps git manifests clean.
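Whichever option you pick, it helps to have a guard that rejects plaintext Secret manifests before they reach the repository. The following is a rough sketch of such a check, suitable for a pre-commit hook or CI step. It uses a line-based regular expression rather than a YAML parser, so treat it as a first filter only. The file names and contents are made up.

```python
import re

# Matches a top-level "kind: Secret" line, but not ExternalSecret or SealedSecret.
PLAINTEXT_SECRET = re.compile(r"^kind:\s*Secret\s*$", re.MULTILINE)


def plaintext_secret_files(files: dict[str, str]) -> list[str]:
    """Return the names of manifests that declare a raw Kubernetes Secret."""
    return sorted(
        name for name, text in files.items()
        if PLAINTEXT_SECRET.search(text)
    )


# Hypothetical repository contents.
repo = {
    "db-external.yaml": "apiVersion: external-secrets.io/v1beta1\nkind: ExternalSecret\n",
    "db-sealed.yaml": "apiVersion: bitnami.com/v1alpha1\nkind: SealedSecret\n",
    "oops.yaml": "apiVersion: v1\nkind: Secret\ndata:\n  password: cGFzc3dvcmQ=\n",
}

offenders = plaintext_secret_files(repo)
```

A hook would exit non-zero when `offenders` is non-empty, so the reference-only manifests pass and the raw Secret is blocked.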

17. What happens to secrets when a Kubernetes pod is deleted? How do you ensure orphaned secrets do not persist?

When a pod is deleted, its containers and their process environments are torn down, and mounted Secret volumes (which live in memory-backed storage on the node) are removed with the pod. Any secret data that was read into memory is gone. However, Kubernetes Secret objects themselves persist in etcd until explicitly deleted. If you created a Kubernetes Secret via ESO sync, that Secret remains in the cluster after the pod is gone, and can be mounted by other pods that reference the same Secret name. The cleanup strategy depends on how the secret was created: (1) ESO-managed secrets: use creationPolicy: Owner, which means the Kubernetes garbage collector deletes the Secret when the ExternalSecret resource is deleted. (2) Manually created secrets: you must delete them explicitly. (3) Vault dynamic credentials: the lease is tied to the Vault token that requested it, so when that token expires or is revoked, Vault revokes the lease and the database credential stops working. Deleting a pod does not revoke its token by itself, so keep token TTLs short or have the pod (or its Vault Agent) revoke the token on shutdown. Once Vault revokes the lease, a leaked copy of the credential is useless.
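For the manually created case, a periodic audit can list Secrets that nothing owns and no pod references. The sketch below works on dictionaries shaped like kubectl get secrets -o json and kubectl get pods -o json items. It checks only volumes, envFrom, and env references in regular containers, so init containers, image pull secrets, and references from other resource types are not covered. The function names are hypothetical.

```python
def referenced_secret_names(pods: list[dict]) -> set[str]:
    """Collect Secret names that pods use via volumes, envFrom, or env."""
    names = set()
    for pod in pods:
        spec = pod.get("spec", {})
        for volume in spec.get("volumes", []):
            if "secret" in volume:
                names.add(volume["secret"]["secretName"])
        for container in spec.get("containers", []):
            for source in container.get("envFrom", []):
                if "secretRef" in source:
                    names.add(source["secretRef"]["name"])
            for env in container.get("env", []):
                ref = env.get("valueFrom", {}).get("secretKeyRef")
                if ref:
                    names.add(ref["name"])
    return names


def orphan_candidates(secrets: list[dict], pods: list[dict]) -> list[str]:
    """Secrets with no owner reference that no running pod refers to."""
    in_use = referenced_secret_names(pods)
    return sorted(
        s["metadata"]["name"]
        for s in secrets
        if not s["metadata"].get("ownerReferences")
        and s["metadata"]["name"] not in in_use
    )
```

Treat the result as a list of candidates for review rather than something to delete automatically, given the reference types the sketch ignores.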

18. How do you monitor secret management health in production? What metrics and alerts should you have?

Vault and ESO expose Prometheus metrics. Enable telemetry and ship them to your monitoring system. The metric and alert names below are illustrative labels for what to track; map them to the names your Vault and ESO versions actually export (for example, Vault's vault.core.unsealed and vault.expire.num_leases gauges, or ESO's externalsecret_sync_calls_error counter). Key metrics: vault_secret_access_total (who is reading what), vault_lease_renewal_success_total (are dynamic credentials being renewed), vault_lease_expiration_seconds (leases approaching expiration), vault_auth_method_failure_total (auth failures by method), eso_sync_success_total and eso_sync_error_total (ESO sync health). Alerts you must have: vault_sealed (critical, because a sealed Vault cannot serve any secrets), vault_unavailable (critical), dynamic_credential_expiring_soon (warning when a lease expires within 5 minutes, critical within 1 minute), auth_failure_rate_above_1_percent (possible attack or misconfiguration), eso_sync_errors_increasing (ESO is not keeping secrets up to date). Dashboards to build: Vault health (seal status, active leases, storage), secret access patterns (top accessed secrets, denied access attempts), dynamic credential lifecycle (expiration heatmap, renewal success rate), ESO operations (sync success rate, error breakdown). Send Vault audit logs to your SIEM: each secret read, auth success or failure, and lease renewal should be logged with client identity, source IP, and timestamp.
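The two-level expiry alert described above reduces to a threshold function. This sketch encodes the 5-minute warning and 1-minute critical thresholds from the text. The function name and the "ok" label are assumptions, and in practice the same logic would usually live in an alerting rule rather than application code.

```python
WARNING_SECONDS = 5 * 60   # lease expires within 5 minutes
CRITICAL_SECONDS = 60      # lease expires within 1 minute


def lease_alert_severity(seconds_remaining: float) -> str:
    """Classify a lease by how close it is to expiring."""
    if seconds_remaining <= CRITICAL_SECONDS:
        return "critical"
    if seconds_remaining <= WARNING_SECONDS:
        return "warning"
    return "ok"


def worst_severity(leases: dict[str, float]) -> str:
    """Roll several leases (name -> seconds remaining) up into one status."""
    order = {"ok": 0, "warning": 1, "critical": 2}
    return max(
        (lease_alert_severity(s) for s in leases.values()),
        key=order.__getitem__,
        default="ok",
    )
```

A healthy renewal loop should keep every lease in the "ok" band, so even a warning is a sign that renewal is failing, not just that time is passing.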

19. What is the difference between Vault's built-in Kubernetes auth method and using ESO with Vault's KV store?

Vault Kubernetes auth: pods authenticate directly to Vault using their Kubernetes service account token. Vault validates the token with Kubernetes' TokenReview API, then returns a Vault token scoped to a specific role. The pod reads secrets directly from Vault using that token. Secrets never persist as Kubernetes objects. Best for: short-lived workloads, services that need dynamic secrets (database credentials per pod), security-critical applications where you want secrets ephemeral. ESO with Vault KV: ESO syncs secrets from Vault's KV store into Kubernetes Secrets objects on a schedule. Your application reads Kubernetes Secrets as env vars or volume mounts. Best for: existing applications that expect Kubernetes Secrets, gradual migrations from Kubernetes Secrets to Vault, applications you cannot modify to add Vault SDK. The trade-offs: Kubernetes auth is more secure (secrets never hit etcd, no persistent Kubernetes Secret objects) but requires application code changes or a sidecar. ESO is easier to adopt (no code changes) but secrets persist in Kubernetes Secrets and etcd.
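The direct-authentication path boils down to one HTTP request. The sketch below builds (but does not send) the login request a pod would make, assuming the auth method is mounted at its default path, kubernetes, and that the role name exists in Vault. The Vault address, role, and token value are placeholders. In a pod, the JWT would be read from the mounted service account token file.

```python
import json
import urllib.request

# Default location of the projected service account token inside a pod.
SA_TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def build_login_request(vault_addr: str, role: str, jwt: str) -> urllib.request.Request:
    """Build the Kubernetes-auth login request for a Vault server."""
    body = json.dumps({"role": role, "jwt": jwt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{vault_addr.rstrip('/')}/v1/auth/kubernetes/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values for illustration only.
request = build_login_request(
    "https://vault.example.com", "payment-service", "<service-account-jwt>"
)
```

Sending the request with `urllib.request.urlopen(request)` returns JSON whose `auth.client_token` field is the Vault token the pod then uses to read secrets. With ESO, the operator performs this exchange on the application's behalf and writes the result into a Kubernetes Secret.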

20. A junior developer asks you why they cannot just use a shared service account for all microservices instead of creating separate service accounts. How do you explain the problem?

Shared service accounts break the security model completely. If all microservices use the same service account, then Vault policies cannot differentiate between them: the payment-service and the logging-service get Vault tokens with the same permissions. Compromise one service and you get access to everything. Beyond Vault: with a shared SA, a breach of any single service gives an attacker the ability to impersonate every other service in the cluster. Kubernetes RBAC cannot isolate workloads. Audit logs cannot distinguish which service accessed what. Service mesh mTLS still encrypts traffic, but identity-based authorization stops working because all services present the same identity. The blast radius of any single vulnerability becomes the entire system. The correct model: one service account per service (or per logical group of related services). This is how you get fine-grained security: the payment-service Vault role can only read the payment-service secrets, the logging-service Vault role can only read the logging-service secrets. If the logging service is compromised, the attacker gets nothing beyond what logging could already reach. This is not over-engineering. It is the foundation of zero-trust architecture in microservices.

Conclusion

Secrets management connects to several other patterns worth understanding.

mTLS and Service Mesh handles encryption and authentication of service-to-service communication. Service meshes often leverage workload identity to issue short-lived certificates automatically.

Kubernetes provides the container orchestration layer where many secrets management solutions run. Understanding Kubernetes RBAC and service accounts is foundational.

GitOps changes how you think about configuration management. Secrets become part of the reconciliation loop without living in git.

Service Identity and SPIFFE provides the cryptographic identity layer that enables passwordless authentication and short-lived credentials.

Secrets management in microservices is a solved problem, but it requires intentional design. Start with Kubernetes Secrets for simple use cases, but plan to evolve toward a dedicated secrets manager as your system grows.

HashiCorp Vault has become the de facto standard for secrets management in Kubernetes environments. Its dynamic secrets, fine-grained policies, and comprehensive audit logging address the gaps in native Kubernetes Secrets.

The shift from static shared credentials to short-lived dynamic credentials improves your security posture. Even if an attacker obtains a credential, it is only usable until its lease expires, which shrinks the window for damage from months to minutes or hours.

Invest the time to implement proper secrets management early. Retrofitting it into an existing system with dozens of services is painful. Building it in from the start is straightforward and pays dividends in reduced risk and easier compliance audits.
