Cloud Security: IAM, Network Isolation, and Encryption
Implement defense-in-depth security for cloud infrastructure—identity and access management, network isolation, encryption, and security monitoring.
Cloud security requires rethinking the assumption that your network perimeter is safe. In cloud environments, the network is potentially hostile by default. Any resource with a public IP or membership in a security group with public access is exposed. Security must be layered: identity, network, and data encryption work together for defense in depth.
This post covers the core security practices that apply regardless of which cloud provider you use. The examples use AWS, but the concepts translate to Azure, GCP, and other providers with different service names.
Introduction
Cloud security operates on a fundamentally different model than on-premises security. On-premises, your network perimeter is a hard boundary—firewalls, VLANs, and physical access controls keep threats out. In the cloud, that perimeter is soft. Any resource with a public IP is potentially reachable from the internet, and any misconfigured security group can expose sensitive services.
The shift requires thinking about security as a series of layers rather than a single hard shell. Identity—who can access what—matters as much as network controls. Encryption protects data at rest and in transit. Monitoring and logging give you visibility into what is happening so you can detect and respond to threats.
This guide covers identity and access management (IAM), network isolation using VPCs and security groups, encryption strategies, and the cloud-native security services that provide visibility across your infrastructure.
When to Use
Cloud-Native Security Services vs. Third-Party
Choose cloud-native security services when you are already invested in a single cloud provider and want integrated visibility without additional vendor complexity. Security Hub, GuardDuty, and CloudTrail work together out of the box on AWS.
Choose third-party security tools when you run a multi-cloud environment and want unified visibility across providers, or when you need capabilities that cloud-native tools do not cover—advanced threat hunting, specific compliance frameworks, or integration with on-premises security tools.
VPC Endpoints vs. NAT Gateway
Use VPC endpoints when resources in private subnets need to reach AWS services like S3 or DynamoDB without traffic leaving the AWS network. For this traffic, endpoints are typically faster, cheaper, and more secure than routing through NAT; gateway endpoints for S3 and DynamoDB carry no additional charge.
Use NAT gateway when private instances need outbound internet access—for patching, downloading packages, or calling external APIs. NAT gateway does not allow inbound connections from the internet; it only handles outbound from private subnets.
Customer-Managed KMS Keys vs. Cloud-Managed Keys
Use customer-managed KMS keys when you need control over key rotation, key policies, or cross-account access to keys. Customer-managed keys cost money but give you audit trails and policy control.
Use cloud-managed keys when you do not need those controls and want to minimize cost and operational overhead. Cloud-managed keys have no monthly fee and rotate automatically, but you cannot change their key policies or control rotation.
IAM Best Practices
Identity and Access Management (IAM) is the foundation of cloud security. Every request to a cloud API requires authentication and authorization. IAM policies determine what identities can do what operations on which resources.
The cardinal rule is least privilege: grant only the permissions required for a task, and nothing more. This applies to human users, service accounts, and compute workloads.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "S3ReadOnlyForApplication",
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:ListBucket"],
"Resource": ["arn:aws:s3:::my-app-bucket", "arn:aws:s3:::my-app-bucket/*"]
}
]
}
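A policy like the one above can be checked mechanically for wildcard grants before it is attached. The following is a minimal sketch in Python, not an official linter: `find_broad_statements` is a hypothetical helper, and the rule it applies (flag `*`, `service:*`, or a bare `*` resource) is an assumption about what counts as too broad.

```python
import json

# The read-only policy from the text, as a JSON document.
POLICY_JSON = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3ReadOnlyForApplication",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::my-app-bucket", "arn:aws:s3:::my-app-bucket/*"]
    }
  ]
}
"""


def as_list(value):
    """IAM accepts a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def find_broad_statements(policy):
    """Return the Sid (or a positional label) of Allow statements using bare wildcards."""
    flagged = []
    for index, stmt in enumerate(as_list(policy["Statement"])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = as_list(stmt.get("Action", []))
        resources = as_list(stmt.get("Resource", []))
        # "*" or "service:*" grants every action; "*" alone covers every resource.
        broad_action = any(a == "*" or a.endswith(":*") for a in actions)
        broad_resource = any(r == "*" for r in resources)
        if broad_action or broad_resource:
            flagged.append(stmt.get("Sid", f"statement-{index}"))
    return flagged


policy = json.loads(POLICY_JSON)
print(find_broad_statements(policy))
```

The scoped policy produces an empty list, while a statement with `"Action": "*"` would be reported. A check of this kind fits naturally into CI alongside IaC scanning.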
Avoid attaching policies directly to users. Instead, create groups for roles, add users to groups, and attach policies to groups. This makes permission management systematic rather than ad hoc.
# Create a group
aws iam create-group --group-name developers
# Attach a policy to the group
aws iam attach-group-policy \
--group-name developers \
--policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess
# Add a user to the group
aws iam add-user-to-group \
--group-name developers \
--user-name alice
Regularly audit IAM configurations. AWS IAM Access Analyzer, Microsoft Entra access reviews, and GCP Policy Analyzer can identify permissions that grant external access or violate least privilege. Remove unused access keys, deactivate old credentials, and rotate secrets on a schedule.
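The key-rotation part of that audit is easy to automate. This sketch flags active access keys past a maximum age; the dictionaries mirror the `AccessKeyId`, `Status`, and `CreateDate` fields that `aws iam list-access-keys` returns, but the sample data, the helper name `stale_access_keys`, and the 90-day threshold are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def stale_access_keys(keys, now, max_age_days=90):
    """Return IDs of active keys created more than max_age_days before `now`."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        key["AccessKeyId"]
        for key in keys
        if key["Status"] == "Active" and key["CreateDate"] < cutoff
    ]


# Hypothetical key metadata; real data would come from the IAM API.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
keys = [
    {"AccessKeyId": "AKIAEXAMPLEOLD", "Status": "Active",
     "CreateDate": datetime(2023, 1, 15, tzinfo=timezone.utc)},
    {"AccessKeyId": "AKIAEXAMPLENEW", "Status": "Active",
     "CreateDate": datetime(2024, 5, 1, tzinfo=timezone.utc)},
    {"AccessKeyId": "AKIAEXAMPLEOFF", "Status": "Inactive",
     "CreateDate": datetime(2022, 3, 1, tzinfo=timezone.utc)},
]

print(stale_access_keys(keys, now))
```

Passing `now` explicitly keeps the function deterministic, which makes it straightforward to test.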
Service Accounts and Workload Identity
Human users are not the only identities in cloud environments. Compute workloads—EC2 instances, containers, Lambda functions—need permissions to access other AWS services. The question is how those workloads authenticate.
Embedding long-lived credentials in code, configuration files, or environment variables is risky. Such credentials persist beyond the workload lifecycle and can be exfiltrated from logs, images, or process environments.
Workload identity is the solution. Instead of storing credentials, workloads assume a role using short-lived tokens. The role permissions are scoped to what the workload actually needs.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "ec2.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
For Kubernetes workloads, cloud providers offer mechanisms that map Kubernetes service accounts to cloud IAM roles, such as IAM roles for service accounts (IRSA) on EKS. This lets you give a Kubernetes service account specific IAM permissions without managing cloud credentials.
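# Kubernetes service account mapped to an IAM role (EKS annotation shown;
# the account ID and role name are placeholders)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: production
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/my-app-role
---
# Namespaced RBAC role: read-only access to secrets in production
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: my-app-role
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list"]
---
# Bind the role to the service account in the same namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-app-role-binding
  namespace: production
subjects:
  - kind: ServiceAccount
    name: my-app
    namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: my-app-role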
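On EKS, the IAM side of this mapping is a trust policy that lets the cluster's OIDC provider assume the role on behalf of one specific service account. The sketch below builds such a policy in Python. The account ID and OIDC provider ID are placeholders, and `irsa_trust_policy` is a hypothetical helper, not part of any AWS SDK.

```python
import json


def irsa_trust_policy(account_id, oidc_provider, namespace, service_account):
    """Build a trust policy limited to one Kubernetes service account."""
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Only tokens issued for this exact service account qualify.
                        f"{oidc_provider}:sub": subject,
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


# Placeholder values for illustration only.
trust = irsa_trust_policy(
    account_id="123456789012",
    oidc_provider="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE0123456789",
    namespace="production",
    service_account="my-app",
)
print(json.dumps(trust, indent=2))
```

Pinning the `sub` claim to a single namespace and service account is what keeps other pods in the cluster from assuming the same role.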
VPC and Network Isolation
Network isolation in cloud environments uses virtual private clouds (VPCs) with subnet segmentation. The principle is straightforward: nothing should be directly accessible from the internet unless intentionally exposed.
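# Availability zones used to spread the subnets
data "aws_availability_zones" "available" {
  state = "available"
}
# VPC with public and private subnets
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true
}
# Internet gateway: required by the public subnets and the NAT gateway
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}
# Public subnets for load balancers
resource "aws_subnet" "public" {
  count                   = 2
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true
  tags = {
    Type = "Public"
  }
}
# Private subnets for application servers
resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index + 10)
  availability_zone = data.aws_availability_zones.available.names[count.index]
  tags = {
    Type = "Private"
  }
}
# Public route table: default route to the internet gateway
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}
resource "aws_route_table_association" "public" {
  count          = 2
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}
# NAT gateway for outbound traffic from private subnets
resource "aws_eip" "nat" {
  domain = "vpc"
}
resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public[0].id
  depends_on    = [aws_internet_gateway.main]
}
# Private route table: default route to the NAT gateway
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main.id
  }
}
resource "aws_route_table_association" "private" {
  count          = 2
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private.id
}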
Application servers sit in private subnets and cannot be reached directly from the internet. Load balancers in public subnets route traffic to application servers. Database and cache servers sit in private subnets with no internet access at all.
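It helps to know which address ranges the `cidrsubnet` calls in the Terraform above actually produce, since security group rules often refer to them. This short Python sketch reproduces the calculation with the standard `ipaddress` module; the `cidrsubnet` function here is a re-implementation for illustration, not Terraform's own code.

```python
import ipaddress
from itertools import islice


def cidrsubnet(prefix, newbits, netnum):
    """Return the netnum-th subnet obtained by extending the prefix by newbits."""
    network = ipaddress.ip_network(prefix)
    subnets = network.subnets(prefixlen_diff=newbits)
    return str(next(islice(subnets, netnum, None)))


VPC_CIDR = "10.0.0.0/16"
public = [cidrsubnet(VPC_CIDR, 8, index) for index in range(2)]
private = [cidrsubnet(VPC_CIDR, 8, index + 10) for index in range(2)]

print("public: ", public)
print("private:", private)
```

With a /16 VPC and 8 new bits, the public subnets come out as 10.0.0.0/24 and 10.0.1.0/24, and the private subnets as 10.0.10.0/24 and 10.0.11.0/24. The offset of 10 leaves room to add more public subnets later without renumbering.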
Security groups act as instance-level firewalls. They are stateful: allowing inbound traffic automatically allows outbound response traffic.
# Security group for web servers
resource "aws_security_group" "web" {
  name        = "web-servers"
  description = "Security group for web servers"
  vpc_id      = aws_vpc.main.id
  # Allow inbound HTTP only from the public subnets that host the load balancer
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = aws_subnet.public[*].cidr_block
  }
  # Allow all outbound traffic (reaches the internet via the NAT gateway)
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
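Ingress rules are worth auditing automatically, because a single rule open to the whole internet undoes the subnet design. The sketch below scans a list of rules for sensitive ports exposed to `0.0.0.0/0` or `::/0`. The rule dictionaries are a simplified, hypothetical shape loosely modeled on the Terraform `ingress` block, and the set of sensitive ports is an assumption.

```python
ANYWHERE = {"0.0.0.0/0", "::/0"}
SENSITIVE_PORTS = {22, 3306, 5432, 6379}  # assumed list: SSH, MySQL, PostgreSQL, Redis


def internet_exposed(rules, sensitive_ports=SENSITIVE_PORTS):
    """Return (rule name, exposed sensitive ports) for rules open to the internet."""
    findings = []
    for rule in rules:
        if not ANYWHERE & set(rule["cidr_blocks"]):
            continue  # source is restricted to specific networks
        if rule["protocol"] == "-1":
            exposed = sorted(sensitive_ports)  # "-1" means all protocols and ports
        else:
            exposed = sorted(
                port for port in sensitive_ports
                if rule["from_port"] <= port <= rule["to_port"]
            )
        if exposed:
            findings.append((rule["name"], exposed))
    return findings


rules = [
    {"name": "web-from-lb", "protocol": "tcp", "from_port": 80, "to_port": 80,
     "cidr_blocks": ["10.0.1.0/24"]},
    {"name": "https-public", "protocol": "tcp", "from_port": 443, "to_port": 443,
     "cidr_blocks": ["0.0.0.0/0"]},
    {"name": "db-open", "protocol": "tcp", "from_port": 5432, "to_port": 5432,
     "cidr_blocks": ["0.0.0.0/0"]},
]

print(internet_exposed(rules))
```

Public HTTPS is intentionally not flagged here; only the database port open to the internet is reported.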
Encryption at Rest and in Transit
Encrypt data wherever it lives. Cloud providers offer encryption at rest by default for most services, using KMS keys you control or provider-managed keys.
# S3 bucket with default KMS encryption
resource "aws_s3_bucket" "data" {
  bucket = "my-sensitive-data"
}
# AWS provider v4 and later configure bucket encryption in its own resource
resource "aws_s3_bucket_server_side_encryption_configuration" "data" {
  bucket = aws_s3_bucket.data.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.data.arn
    }
  }
}
# KMS key with restricted usage
resource "aws_kms_key" "data" {
  description             = "KMS key for sensitive data"
  deletion_window_in_days = 30
  enable_key_rotation     = true
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        # Lets IAM policies in this account manage the key
        Sid    = "Enable IAM User Permissions"
        Effect = "Allow"
        Principal = {
          AWS = "arn:aws:iam::123456789012:root" # placeholder 12-digit account ID
        }
        Action   = "kms:*"
        Resource = "*"
      },
      {
        Sid    = "Allow use by application"
        Effect = "Allow"
        Principal = {
          AWS = "arn:aws:iam::123456789012:role/my-app-role" # placeholder application role
        }
        # GenerateDataKey is needed for S3 uploads that use SSE-KMS
        Action   = ["kms:Encrypt", "kms:Decrypt", "kms:GenerateDataKey"]
        Resource = "*"
      }
    ]
  })
}
TLS encrypts data in transit. Force HTTPS on all public endpoints. Use TLS for connections between services, especially when they cross network boundaries. Certificate management can be automated with services like AWS Certificate Manager or Let’s Encrypt.
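On the client side, enforcing TLS between services comes down to verifying certificates and refusing old protocol versions. Here is a minimal sketch using Python's standard `ssl` module; the choice of TLS 1.2 as the floor is an assumption, so adjust it to your own policy.

```python
import ssl


def strict_client_context():
    """Create a client TLS context that verifies certificates and rejects TLS < 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = strict_client_context()
print(context.verify_mode, context.check_hostname, context.minimum_version)
```

The resulting context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=context)`, so every outbound call from a service shares the same TLS settings.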
Security Groups and Firewall Rules
Security groups should be as restrictive as possible. They deny all inbound traffic by default, so add only the specific ports and sources you need.
# Database security group - minimal access
resource "aws_security_group" "database" {
  name        = "database"
  description = "Security group for RDS instance"
  vpc_id      = aws_vpc.main.id
  # Only the application tier may connect, referenced by security group
  # rather than by CIDR (aws_security_group.app is assumed to be defined elsewhere)
  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
  }
  # No egress block: Terraform removes the default allow-all egress rule
}
Network ACLs provide subnet-level filtering as a secondary control. Security groups handle instance-level filtering. Use both together: NACLs for subnet-wide rules like blocking a specific IP range, security groups for instance-specific access control.
VPC endpoint policies restrict which actions are allowed through VPC endpoints. Without endpoints, traffic to S3 and DynamoDB goes to the services' public endpoints through a NAT gateway or internet gateway. Endpoints keep that traffic on private routes, but the default endpoint policy allows full access, so write explicit policies to control what passes through.
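An endpoint policy uses the same JSON grammar as other resource policies. As a sketch of the idea, the Python below builds a policy that lets traffic through the endpoint reach only one bucket with read actions. The bucket name and the helper `s3_endpoint_policy` are illustrative assumptions.

```python
import json


def s3_endpoint_policy(bucket, actions=("s3:GetObject", "s3:ListBucket")):
    """Build a VPC endpoint policy that allows only the given actions on one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowAppBucketOnly",
                "Effect": "Allow",
                "Principal": "*",
                "Action": list(actions),
                "Resource": [
                    f"arn:aws:s3:::{bucket}",      # bucket-level actions such as ListBucket
                    f"arn:aws:s3:::{bucket}/*",    # object-level actions such as GetObject
                ],
            }
        ],
    }


endpoint_policy = s3_endpoint_policy("my-app-bucket")
print(json.dumps(endpoint_policy, indent=2))
```

Because anything not allowed is implicitly denied, requests through the endpoint to any other bucket fail even if the caller's IAM role would permit them.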
Cloud-Native Security Services
Each major cloud provider offers security services that layer on top of basic IAM and networking.
AWS Security Hub aggregates findings from GuardDuty, Inspector, and Macie. Microsoft Defender for Cloud (formerly Azure Security Center) and GCP Security Command Center play similar roles. These services provide centralized visibility and compliance monitoring across your cloud footprint.
Cloud-native firewalls and WAFs filter traffic at the edge. AWS WAF works with CloudFront and Application Load Balancers, Azure WAF with Application Gateway, and GCP Cloud Armor with Cloud CDN and load balancers. If you expose any HTTP services, treat a WAF as required: public HTTP endpoints are among the first things attackers probe.
Logging and monitoring make incident response possible. CloudTrail records API activity in your account (management events by default; data events such as S3 object reads must be enabled separately), VPC Flow Logs capture metadata about network connections, and GuardDuty uses machine learning and threat intelligence to flag anomalies. Route these to a SIEM or analytics platform. Without them, you are blind to what is happening in your environment.
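Once CloudTrail records reach an analytics pipeline, even a simple filter is useful. The sketch below flags API calls whose source address falls outside a list of trusted networks. It relies on the `eventTime`, `eventName`, and `sourceIPAddress` fields of CloudTrail records; the sample records, the trusted CIDRs, and the helper name are made up for illustration.

```python
import ipaddress


def calls_from_unknown_networks(records, trusted_cidrs):
    """Return (time, event, source) for calls made from outside the trusted networks."""
    trusted = [ipaddress.ip_network(cidr) for cidr in trusted_cidrs]
    flagged = []
    for record in records:
        source = record.get("sourceIPAddress", "")
        try:
            address = ipaddress.ip_address(source)
        except ValueError:
            # Calls made by AWS services carry a hostname instead of an IP address.
            continue
        if not any(address in network for network in trusted):
            flagged.append((record["eventTime"], record["eventName"], source))
    return flagged


# Hypothetical records using documentation-reserved IP ranges.
records = [
    {"eventTime": "2024-05-01T10:00:00Z", "eventName": "ListBuckets",
     "sourceIPAddress": "10.0.10.15"},
    {"eventTime": "2024-05-01T10:05:00Z", "eventName": "GetSecretValue",
     "sourceIPAddress": "198.51.100.7"},
    {"eventTime": "2024-05-01T10:06:00Z", "eventName": "Decrypt",
     "sourceIPAddress": "s3.amazonaws.com"},
]

print(calls_from_unknown_networks(records, ["10.0.0.0/16", "203.0.113.0/24"]))
```

A real pipeline would add context such as the calling identity and region, but the same allow-list comparison sits at the core of many SIEM rules.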
Defense-in-Depth Architecture
flowchart TD
A[Internet Traffic] --> B[WAF / Cloud Firewall]
B --> C[Load Balancer]
C --> D[Security Groups]
D --> E[Application Tier]
E --> F[Database Tier]
F --> G[KMS Encryption]
A --> H[IAM Authentication]
H --> E
E --> I[VPC Endpoints]
I --> J[S3 / DynamoDB]
Trade-off Analysis
| Security Control | Complexity | Security Benefit | Best For |
|---|---|---|---|
| Customer-managed KMS keys | High | Full audit and rotation control | Regulated workloads, cross-account access |
| Cloud-managed KMS keys | Low | Automatic rotation, no cost | Development, non-sensitive workloads |
| VPC endpoints | Medium | Traffic stays internal, lower cost | Private access to S3, DynamoDB from private subnets |
| NAT gateway for private traffic | Medium | Outbound-only internet for private subnets | Patching, external API calls from private instances |
| Security groups | Low | Instance-level stateful firewall | Primary network isolation for compute |
| NACLs | Medium | Subnet-level stateless filtering | Broad subnet rules, blocking specific CIDRs |
| IAM roles over user credentials | Low | Short-lived tokens, no credential management | All compute workloads |
Production Failure Scenarios
| Failure | Impact | Mitigation |
|---|---|---|
| IAM role trust policy misconfiguration locking out resources | Resources cannot assume roles, deployments fail | Use AWS Access Analyzer before deploying, test trust policies in dev |
| KMS key scheduled for deletion while data still depends on it | Encrypted data becomes unrecoverable once the waiting period ends | Use the full 30-day deletion window, disable keys before deleting them, never delete keys that protect production data |
| Security group overly restrictive blocking legitimate traffic | Application cannot connect to dependencies, outages | Always test security group changes in staging first, use descriptive names |
| VPC endpoint policy denying required S3 access | Application cannot read from S3, deployments fail | Explicitly list required actions in endpoint policy, test after changes |
| CloudTrail not enabled for all regions | Attack activity in disabled regions goes unlogged | Enable CloudTrail across all regions, aggregate to single bucket |
Cloud Security Observability
What to monitor:
CloudTrail records API activity. Enable it in all regions and route logs to a centralized bucket with S3 Object Lock to prevent tampering.
GuardDuty monitors for compromised workloads. Review findings daily and route alerts to your security team’s notification channel.
Security Hub aggregates findings from GuardDuty, Inspector, and Macie into a unified view. Enable all integrated services for complete coverage.
VPC Flow Logs record source and destination IPs, ports, and bytes transferred. Use Flow Logs to detect lateral movement and unusual traffic patterns.
Key commands and queries:
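# List recent CloudTrail events
aws cloudtrail lookup-events --max-results 10
# Get high-severity GuardDuty findings (severity 7 and above)
aws guardduty list-findings \
  --detector-id abc123 \
  --finding-criteria '{"Criterion": {"severity": {"GreaterThanOrEqual": 7}}}'
# Query VPC Flow Logs for port 22 access over the last hour.
# start-query returns a queryId; pass it to get-query-results.
aws logs start-query \
  --log-group-name /aws/vpc/flow-logs \
  --start-time $(($(date +%s) - 3600)) \
  --end-time $(date +%s) \
  --query-string 'fields srcAddr, dstAddr, dstPort, action | filter dstPort = 22 | limit 20'
aws logs get-query-results --query-id QUERY_ID
# Check IAM Access Analyzer findings (the analyzer ARN is a placeholder)
aws accessanalyzer list-findings \
  --analyzer-arn arn:aws:access-analyzer:us-east-1:123456789012:analyzer/my-analyzer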
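Flow log records delivered to S3 arrive as space-separated text, so the same port 22 check can run offline. This sketch parses records in the default (version 2) flow log format and lists TCP connection attempts to port 22. The sample lines and helper names are invented for illustration.

```python
# Field order of the default VPC Flow Logs format (version 2).
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def parse_flow_record(line):
    """Split one default-format flow log line into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def ssh_attempts(lines):
    """Return (source, destination, action) for TCP traffic aimed at port 22."""
    hits = []
    for line in lines:
        record = parse_flow_record(line)
        if record["log_status"] != "OK":
            continue  # NODATA and SKIPDATA rows hold "-" placeholders
        if record["protocol"] == "6" and record["dstport"] == "22":
            hits.append((record["srcaddr"], record["dstaddr"], record["action"]))
    return hits


sample = [
    "2 123456789012 eni-0a1b2c3d 203.0.113.12 10.0.10.5 49152 22 6 20 4249 1418530010 1418530070 REJECT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.20 10.0.10.5 41000 80 6 12 3100 1418530010 1418530070 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d - - - - - - - 1431280876 1431280934 - NODATA",
]

print(ssh_attempts(sample))
```

Protocol `6` is TCP. Custom flow log formats reorder or add fields, so `FIELDS` would need to match whatever format the log was created with.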
Common Pitfalls / Anti-Patterns
Using AWS root account for daily operations. The root account has full permissions and cannot be restricted by IAM policies. Use root account only for initial setup, then switch to IAM users and roles for everything else.
Over-permissive IAM roles. Granting *:* or AdministratorAccess to workloads because it is faster than scoping permissions defeats the purpose of least privilege. Start with minimal permissions and add only what the workload actually needs.
Leaving security groups open to 0.0.0.0/0. Allowing all inbound traffic to a database or cache port from anywhere on the internet is a common breach vector. Security groups should restrict access to known CIDRs or specific security groups.
Not enabling encryption by default. Some services allow creating unencrypted resources by default. Block unencrypted resources with service control policies where the API exposes a suitable condition, and use AWS Config rules to detect and remediate any that slip through.
Forgetting to rotate access keys. Long-lived access keys on service accounts are a common exfiltration target. Rotate keys regularly, use short-lived credentials via IAM roles wherever possible.
Trade-off Analysis (Tools)
| Security Tool | Preventative vs Detective | CI/CD vs Runtime | Cost |
|---|---|---|---|
| Cloud-native (GuardDuty, Security Hub, Defender) | Detective | Runtime | Pay per usage |
| CSPM (Prisma Cloud, Wiz) | Both | Runtime | Per-asset pricing |
| SAST / IaC scanning | Preventative | CI/CD | Tool cost |
| Secret scanning (Gitleaks, TruffleHog) | Preventative | CI/CD | Free / paid tiers |
| Runtime security (Falco, Sysdig) | Detective | Runtime | Infrastructure + license |
| SIEM (Splunk, Elastic) | Detective | Runtime | High (licensing + storage) |
Real-world Failure Scenarios
| Company / Context | Failure | Consequence | Lesson Learned |
|---|---|---|---|
| Target breach (2013) | Stolen HVAC vendor credentials used to pivot through the network to POS systems | About 40 million payment cards and up to 70 million customer records exposed | Segment networks; vendor access should never reach POS systems, even with valid credentials |
| Capital One breach (2019) | SSRF through a misconfigured WAF exposed EC2 instance role credentials; the role had overly broad S3 permissions | About 100 million customer records exposed | Scope role permissions to specific buckets; require IMDSv2; audit role policies regularly |
| Toyota data exposure (2019) | S3 bucket public; CloudTrail not enabled for region | Customer data accessible; attack undetected | Enable CloudTrail everywhere; Block public S3 access by default |
| Meow attacks (2020) | Elasticsearch and MongoDB instances exposed to the internet with no authentication | Thousands of databases wiped, with no ransom demand | Keep data stores off the public internet; require authentication on all data stores |
| SolarWinds supply chain attack (2020) | Software build process compromised; malicious update pushed | About 18,000 customers installed the trojanized update; a smaller subset was actively compromised | Verify software supply chain integrity; sign releases; monitor for anomalous build behavior |
Interview Questions
Least privilege means granting exactly the permissions needed for a task and nothing more. It applies to human users, service accounts, and compute workloads.
Implementation starts with understanding what permissions your identities actually need rather than defaulting to broad policies. Use IAM Access Analyzer to identify external access and policy simulators to test policies before deployment. Create groups for roles, attach policies to groups, and add users to groups rather than attaching policies directly to users. Regularly audit unused access keys and deactivate old credentials.
VPC endpoints let resources in private subnets reach AWS services like S3 or DynamoDB without traffic leaving the AWS network. They're typically faster, cheaper, and more secure than NAT for this use case.
NAT gateways handle outbound internet access for private instances: patching, downloading packages, calling external APIs. They don't allow inbound connections from the internet.
Use VPC endpoints for private access to AWS services. Use NAT gateways when private instances need to reach the internet outbound. If you're routing S3 traffic through NAT, you're paying unnecessary NAT data processing charges and adding latency.
Workload identities let compute workloads like EC2 instances, containers, or Lambda functions assume IAM roles using short-lived tokens instead of storing long-lived access keys. The role permissions are scoped to what the workload actually needs.
Long-lived credentials embedded in code or environment variables persist beyond the workload lifecycle and can be exfiltrated from logs, images, or process environments. Workload identity removes static credentials from code entirely: Azure uses managed identities, AWS uses IAM roles (delivered through instance profiles, task roles, or token federation), and GCP uses workload identity federation.
Start at the edge: WAF or Cloud Firewall filtering malicious traffic before it reaches your load balancer. The load balancer sits in public subnets, routing to application servers in private subnets. Security groups on application servers allow only traffic from the load balancer.
Database servers sit in private subnets with no internet access at all, reachable only from application tier security groups. KMS encrypts data at rest. IAM roles handle authentication for any AWS service access.
For Kubernetes workloads, network policies restrict pod-to-pod communication, and service mesh adds mTLS between services. VPC endpoints keep traffic to S3 and DynamoDB internal.
Database security groups should have no inbound rules from 0.0.0.0/0 ever. The only inbound access comes from the application tier security group via security group references.
Configure the database security group to accept inbound PostgreSQL or MySQL traffic only from the application tier security group ID. This means the rule looks like: port 5432, source security group sg-xxxxxxxx. When application servers scale, they automatically get database access. When database servers scale, they inherit the same restrictions.
Outbound rules should be minimal—typically only to the application tier or specific external services the database needs to reach.
Customer-managed KMS keys give you control over key rotation, key policies, and cross-account access. You can inspect key policies and define exactly who can use the key. They cost money but provide audit trails.
Cloud-managed keys are free and automatic: AWS handles rotation and policies. You can view their key policies but cannot modify them. They're fine for development and non-sensitive workloads.
Choose customer-managed keys for regulated workloads, production data requiring compliance controls, and scenarios where you need cross-account access to keys. Choose cloud-managed keys for development, test, and non-sensitive data where you want to minimize operational overhead.
Layer several controls. First, scope the role's permissions narrowly—if it only needs read access to specific prefixes, grant only s3:GetObject on those prefixes, not s3:*.
Second, use VPC endpoints with endpoint policies that restrict S3 actions to specific buckets. Without an endpoint, traffic to S3 goes to the public S3 endpoint through a NAT or internet gateway, where no endpoint policy applies. With an endpoint policy, you can deny actions like PutObject if the role shouldn't be writing data.
Third, enable S3 Block Public Access at the account level. Fourth, use CloudTrail with S3 data events to detect unusual GetObject patterns—large volumes of downloads from an unusual location or at unusual times.
SCPs are guardrails in AWS Organizations that restrict what actions IAM users and roles can perform, regardless of individual IAM permissions. They don't grant permissions—they deny actions that would otherwise be allowed.
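Use SCPs to enforce organizational security standards. For example: deny s3:PutObject requests that omit server-side encryption, deny disabling CloudTrail or GuardDuty, deny access to unapproved regions, or deny delete operations on specific resources unless the caller authenticated with MFA. Checks that depend on resource state, such as access key age or security group rules open to 0.0.0.0/0, are better handled by AWS Config rules, because SCPs only evaluate the API request itself.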
SCPs apply to all accounts in an OU, which makes them powerful for enforcing security baselines across your entire AWS footprint.
For Kubernetes services, implement mTLS through a service mesh like Istio or Linkerd. The service mesh handles certificate rotation automatically—all pod-to-pod traffic is encrypted without application code changes.
For non-Kubernetes services, use TLS for all network communication. Either terminate TLS at the load balancer and use internal encryption between services, or implement TLS end-to-end. Certificate management can be automated with services like AWS Certificate Manager or Let's Encrypt with cert-manager for Kubernetes.
The key requirement is enforcing TLS everywhere—not just on public endpoints. Internal services should also use TLS, especially when crossing network boundaries like availability zones.
If a security group allows all inbound traffic (0.0.0.0/0 for all ports), any resource with that security group is exposed to the internet. Port scanners will find it within hours.
Detection methods: AWS Config rules can detect overly permissive security groups and alert or auto-remediate. GuardDuty can flag unusual inbound traffic patterns. VPC Flow Logs capture all traffic including the permissive rule's effect. Security Hub aggregates these findings.
Prevention: Use AWS Config rules with remediation to automatically close security groups that open ports to 0.0.0.0/0. Set up AWS Security Hub to alert on configuration changes. Never allow 0.0.0.0/0 for database or cache ports—these should only accept traffic from specific security groups or CIDRs you control.
Enforce encryption at rest through multiple layers. First, set default encryption on every bucket; S3 has applied SSE-S3 to new objects automatically since January 2023, so configure SSE-KMS explicitly where you need control over the key. Second, use S3 bucket policies that reject uploads without the required encryption header. Third, use AWS Config rules with remediation to detect and automatically fix buckets that drift from the standard.
For existing buckets, run a scan and remediate any unencrypted ones. Use S3 Inventory to track encryption status across all buckets. For KMS encryption, ensure the policy restricts key usage to specific roles and services that need access.
Automate with a bucket creation workflow that applies default encryption settings. A Lambda function triggered by CreateBucket events can apply encryption settings to new buckets automatically.
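The bucket policy layer can be sketched as follows. This Python builds a policy that denies `s3:PutObject` unless the request asks for SSE-KMS through the `s3:x-amz-server-side-encryption` condition key. The bucket name and helper name are placeholders for illustration.

```python
import json


def require_kms_encryption_policy(bucket):
    """Build a bucket policy that rejects uploads not requesting SSE-KMS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUploadsWithoutKmsEncryption",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            }
        ],
    }


bucket_policy = require_kms_encryption_policy("my-sensitive-data")
print(json.dumps(bucket_policy, indent=2))
```

One design consequence to keep in mind: a policy like this also rejects uploads that omit the encryption header and rely on the bucket's default encryption, so clients must set the header explicitly.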
The cloud provider (AWS, Azure, GCP) is responsible for the security OF the cloud: physical data centers, hardware, networking infrastructure, hypervisor. You are responsible for security IN the cloud: data, identity, access management, application code, operating systems, network configuration.
The split varies by service type. For IaaS (EC2, VPC), you manage the OS and applications. For PaaS (RDS, Lambda), the provider manages the platform. For SaaS, most controls are provider-managed.
This means you cannot rely on the cloud provider to secure your data or configurations. Even on a "secure" cloud platform, misconfigured IAM, open security groups, and unencrypted data are your responsibility.
Detection starts with CloudTrail with S3 data events enabled. Set up Athena queries or GuardDuty to flag unusual GetObject patterns: large volumes from unusual IPs, at unusual times, or to unfamiliar geographic locations.
Response steps: immediately revoke the role's active sessions by attaching a deny policy conditioned on aws:TokenIssueTime (the "Revoke active sessions" option in the IAM console), which blocks temporary credentials issued before that time. Isolate the affected resources. Identify what data was accessed using CloudTrail logs. Identify the initial compromise vector (was it a GitHub secret, an exposed credential, a phishing attack?).
Prevent exfiltration by using VPC endpoints with endpoint policies to restrict S3 access to specific buckets. Enable S3 Block Public Access. Use service control policies to prevent overly permissive access to S3 from any account.
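For the response step, one way to cut off temporary credentials that a compromised role has already issued is a deny policy keyed on `aws:TokenIssueTime`. The sketch below generates such a policy in Python; the helper name and the cutoff timestamp are illustrative assumptions.

```python
import json
from datetime import datetime, timezone


def revoke_sessions_policy(issued_before):
    """Build a policy denying all actions for sessions issued before the given time.

    `issued_before` must be a timezone-aware datetime.
    """
    if issued_before.tzinfo is None:
        raise ValueError("issued_before must be timezone-aware")
    timestamp = issued_before.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RevokeOlderSessions",
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    "DateLessThan": {"aws:TokenIssueTime": timestamp}
                },
            }
        ],
    }


cutoff = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
revocation = revoke_sessions_policy(cutoff)
print(json.dumps(revocation, indent=2))
```

Attached to the role as an inline policy, this leaves newly issued sessions working while invalidating anything the attacker obtained earlier. The underlying vulnerability still has to be fixed, or the attacker can simply request new credentials.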
Security Hub is a centralized security findings aggregator. It collects findings from GuardDuty (threat detection), Inspector (vulnerability scanning), Macie (data classification), and Config (compliance monitoring). It normalizes findings into a common format and provides a unified dashboard across your AWS account or organization.
Enable Security Hub in each account and region. Use Security Hub standards (CIS AWS Foundations, PCI DSS) to check compliance against benchmarks. Route findings to a SIEM or ticketing system for response. Use automated remediation playbooks for common findings.
Without Security Hub, you would need to check each service individually and correlate findings manually. Security Hub automates this correlation and provides a single view of your security posture.
Federation is appropriate when external users (contractors, partners, customers) need access without creating permanent IAM users. Use AWS IAM Identity Center (formerly SSO) or SAML-based federation with your identity provider (Okta, Azure AD, Google Workspace).
For external users, create an IAM role with trust policy allowing your identity provider to assume it. Scope permissions to exactly what external users need. Set session duration to limit exposure. Enable MFA at the identity provider level, not just in AWS.
For programmatic access, use STS to issue short-lived credentials rather than access keys. Set external ID if a third party needs cross-account access. Regularly audit which federated identities are active and remove unused ones.
SCPs are guardrails at the AWS Organization level that restrict what actions IAM users and roles can perform, regardless of their individual permissions. They do not grant permissions — they only deny actions that would otherwise be allowed. IAM policies grant permissions to users and roles within an account.
Use SCPs to enforce organizational security standards across all accounts. Example SCPs: deny s3:PutObject requests that omit server-side encryption, deny disabling CloudTrail or GuardDuty, require MFA for delete operations on production resources, block access to specific regions. Checks that depend on resource state, such as security group rules open to 0.0.0.0/0, fit AWS Config rules better than SCPs.
SCP hierarchy: root OU SCPs apply to all child OUs. Explicit deny in any SCP overrides any allow. If an SCP denies an action, no IAM policy within that account can permit it.
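The interaction between SCPs and IAM policies can be modeled in a few lines. The following Python is a deliberately simplified sketch of the evaluation order: it matches only action names with wildcards and ignores resources, conditions, permission boundaries, and resource policies. The function names and sample policies are assumptions for illustration.

```python
from fnmatch import fnmatchcase


def matches(action, patterns):
    """True if the action matches any pattern; IAM action names are case-insensitive."""
    return any(fnmatchcase(action.lower(), pattern.lower()) for pattern in patterns)


def is_allowed(action, scp_levels, iam_allow, iam_deny=()):
    """Evaluate an action against SCPs (root to account) and then IAM.

    scp_levels is a list of (allow_patterns, deny_patterns), one entry per level.
    """
    for allow, deny in scp_levels:
        if matches(action, deny):
            return False  # explicit deny at any level wins
        if not matches(action, allow):
            return False  # every level must allow the action
    if matches(action, iam_deny):
        return False
    # SCPs never grant access; an IAM allow is still required.
    return matches(action, iam_allow)


scp_levels = [
    (["*"], ["cloudtrail:StopLogging"]),     # root: allow all, protect logging
    (["s3:*", "ec2:*", "cloudtrail:*"], []), # OU: limit usable services
]

print(is_allowed("cloudtrail:StopLogging", scp_levels, iam_allow=["*"]))
print(is_allowed("s3:GetObject", scp_levels, iam_allow=["s3:Get*"]))
print(is_allowed("iam:CreateUser", scp_levels, iam_allow=["*"]))
```

Only the second call returns `True`: the first hits an explicit SCP deny, and the third is not allowed at the OU level, even though the IAM policy grants everything.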
On EKS, use IAM roles for service accounts (IRSA) to assign AWS permissions to Kubernetes service accounts. Create an IAM role with a trust policy allowing the EKS cluster's OIDC provider to assume it, scoped to specific service account names in specific namespaces.
Within Kubernetes, use RBAC to grant permissions to service accounts, not to users. Network policies restrict pod-to-pod communication — a compromised pod cannot reach other pods unless explicitly allowed. Limit what pods can do via Pod Security Standards.
Avoid giving pods cluster-admin or using default service accounts. Each workload should have its own service account with minimal RBAC permissions. Enable EKS Secrets encryption for sensitive data in etcd.
VPC Flow Logs capture metadata about IP traffic through VPC network interfaces: source/destination IP, ports, bytes, action (accept/reject), timestamp. They do not record packet contents. Enable Flow Logs at the VPC level to cover all subnets.
Security monitoring use cases: detect port scans (many connection attempts to different ports from the same IP), identify lateral movement (unusual traffic between subnets), find data exfiltration (large outbound transfers to unfamiliar IPs), catch overly open security groups (accepted traffic from sources that should have been rejected).
Route Flow Logs to CloudWatch Logs or S3, then analyze with Athena or a SIEM. Set up CloudWatch alarms on unusual traffic patterns. Store logs with Object Lock to prevent tampering.
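The port scan use case reduces to counting distinct destination ports per source among rejected connections. Here is a minimal Python sketch of that heuristic. The records are assumed to be already parsed into dictionaries with `srcaddr`, `dstport`, and `action` keys, and the threshold of 10 ports is an arbitrary assumption to tune against real traffic.

```python
from collections import defaultdict


def likely_port_scanners(records, min_ports=10):
    """Return sources that were rejected on at least min_ports distinct ports."""
    ports_by_source = defaultdict(set)
    for record in records:
        if record["action"] == "REJECT":
            ports_by_source[record["srcaddr"]].add(record["dstport"])
    return sorted(
        source for source, ports in ports_by_source.items()
        if len(ports) >= min_ports
    )


# Hypothetical parsed records: one source probing ports 20-34, one normal client.
records = [
    {"srcaddr": "203.0.113.50", "dstport": port, "action": "REJECT"}
    for port in range(20, 35)
]
records += [
    {"srcaddr": "10.0.1.20", "dstport": 443, "action": "ACCEPT"},
    {"srcaddr": "10.0.1.20", "dstport": 22, "action": "REJECT"},
]

print(likely_port_scanners(records))
```

Production detection would also window the counts by time, since the same number of rejected ports over a month means something different than over a minute.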
Enable automatic key rotation for KMS keys — AWS rotates the key material annually without any code changes or downtime. The old key material is retained so decryption of data encrypted with old keys continues to work.
For manual rotation (if you need rotation more frequently or on a specific schedule): create a new KMS key, update applications to use the new key for encryption, ensure existing data can still be decrypted with old key, then disable the old key. Do not delete old keys until all data encrypted with them is either re-encrypted or no longer needed.
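Use envelope encryption: a data key encrypts the data, and the KMS key encrypts the data key. With automatic rotation nothing needs re-encrypting, because KMS keeps the older key material. When you move to a new key manually, you only re-encrypt the data keys, not the data itself.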
Detection: CloudTrail logs, GuardDuty findings, or unusual network traffic in VPC Flow Logs trigger investigation. Confirm the breach through CloudTrail lookups for suspicious API calls (DescribeInstances, GetSecretValue, ListBuckets).
Containment: immediately isolate affected resources by revoking IAM credentials, tightening security groups, or adding network ACL blocks. Preserve evidence: do not delete logs or stop instances that might contain artifacts.
Investigation: use CloudTrail to identify what actions were taken, from which IP, using which credentials. Check whether data was exfiltrated via S3 or other services. Identify the initial access vector — was it an exposed access key, a compromised service account, or a misconfigured resource?
Recovery: rotate all credentials that might be compromised. Remove any backdoors added by attacker (new IAM users, security group rules, cron jobs). Restore from known-good backups if data was modified.
Post-incident: document timeline, root cause, and remediation. Update detection rules to catch similar attacks faster. Notify affected customers if data was exposed.
Further Reading
- AWS IAM Documentation — Identity and Access Management best practices and policy types
- Amazon VPC Documentation — Virtual private cloud networking, endpoints, and security
- AWS KMS Documentation — Key management, encryption, and key policies
- AWS Security Hub — Centralized security findings and compliance monitoring
- Azure Security Documentation — Microsoft security best practices across Azure services
- Google Cloud Security — GCP security documentation and best practices
Conclusion
Key Takeaways
- Defense in depth means layering IAM, network isolation, and encryption—not relying on any single control
- Least privilege is the cardinal rule: grant only the permissions needed, nothing more
- IAM roles with short-lived tokens beat long-lived credentials embedded in code or environment variables
- VPC endpoints keep traffic internal and avoid NAT gateway costs for private resource access
- CloudTrail, GuardDuty, and Security Hub provide the monitoring foundation for any AWS environment
Cloud Security Checklist
# 1. Enable CloudTrail in all regions (create-trail does not start logging by itself)
aws cloudtrail create-trail --name my-trail --is-multi-region-trail --s3-bucket-name my-cloudtrail-bucket
aws cloudtrail start-logging --name my-trail
# 2. Enable GuardDuty in the current region
aws guardduty create-detector --enable
# 3. Enforce encryption on S3 buckets
aws s3api put-bucket-encryption \
--bucket my-bucket \
--server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'
# 4. Create IAM group with minimal permissions
aws iam create-group --group-name readonly
aws iam attach-group-policy --group-name readonly --policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess
# 5. Block public access to S3 buckets
aws s3api put-public-access-block \
--bucket my-bucket \
--public-access-block-configuration 'BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true'
For more on managing cloud infrastructure, see our post on Cost Optimization.