AWS Core Services for DevOps: EC2, ECS, EKS, S3, Lambda

Navigate essential AWS services for DevOps workloads—compute (EC2, ECS, EKS), storage (S3), serverless (Lambda), and foundational networking.

published: reading time: 23 min read author: GeekWorkBench

AWS forms the backbone of many enterprise cloud strategies. Understanding its core services means understanding how compute, storage, networking, and serverless components fit together. This post covers the essential services for deploying and operating applications on AWS.

The services covered here appear in virtually every AWS architecture. Even if you plan to use higher-level services, knowing how the underlying components work helps you make better architectural decisions and debug problems when they arise.

Introduction

EC2 vs. ECS vs. EKS vs. Lambda

Choose EC2 when you need full control over the operating system, require specific hardware configurations, or run legacy applications that cannot be containerized. EC2 gives you the most flexibility at the cost of the most operational overhead.

Choose ECS when you want container orchestration without the complexity of Kubernetes. ECS integrates tightly with AWS services like ALB, CloudWatch, and IAM, making it a natural fit for teams already invested in AWS. Use Fargate launch type when you want serverless containers—AWS manages the EC2 fleet for you.

Choose EKS when your team knows Kubernetes and wants portability across cloud providers, or when you need Kubernetes-specific features like custom controllers, complex pod scheduling, or a broad ecosystem of third-party tools.

Choose Lambda when your workload is event-driven, short-running, or bursty. Lambda handles scaling automatically and charges only for execution time. If your function runs for hours continuously, EC2 or ECS is likely cheaper.
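The four rules of thumb above can be sketched as a small decision helper. This is only an illustrative heuristic: the `Workload` fields and the order of the checks are assumptions made for this example, not an AWS-defined procedure.

```python
from dataclasses import dataclass

LAMBDA_MAX_SECONDS = 900  # Lambda's 15-minute execution limit


@dataclass
class Workload:
    """Coarse workload traits (hypothetical fields chosen for this sketch)."""
    needs_os_control: bool = False       # custom OS, special hardware, legacy app
    containerized: bool = False
    team_knows_kubernetes: bool = False
    event_driven: bool = False
    max_runtime_seconds: int = 0         # longest single execution


def choose_compute(w: Workload) -> str:
    """Return a starting-point recommendation, not a final answer."""
    if w.needs_os_control:
        return "EC2"
    if w.event_driven and w.max_runtime_seconds <= LAMBDA_MAX_SECONDS:
        return "Lambda"
    if w.containerized and w.team_knows_kubernetes:
        return "EKS"
    if w.containerized:
        return "ECS"
    return "EC2"


print(choose_compute(Workload(event_driven=True, max_runtime_seconds=30)))
print(choose_compute(Workload(containerized=True)))
```

Real decisions also weigh cost, team size, and existing tooling, so treat the output as a first guess to challenge.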

S3 Storage Class Selection

Use S3 Standard for frequently accessed data—hot storage for active workloads. Use S3 IA (Infrequent Access) for data that is accessed less than once per month but needs rapid access when needed. Use S3 Glacier for archival data that you need to retain but rarely access, with retrieval times of minutes to hours depending on the tier.
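As a rough sketch, the selection rule above can be written as a function. The once-per-month cutoff comes from the text; the function name and parameters are hypothetical, and the return values are the S3 API storage class identifiers.

```python
def suggest_storage_class(reads_per_month: float, archival: bool = False) -> str:
    """Map an access pattern to an S3 storage class using the rule of thumb above."""
    if archival:
        # Retained but rarely read; retrieval takes minutes to hours.
        return "GLACIER"
    if reads_per_month < 1:
        # Infrequent access, but still needs rapid retrieval.
        return "STANDARD_IA"
    return "STANDARD"


print(suggest_storage_class(reads_per_month=40))
print(suggest_storage_class(reads_per_month=0.2))
print(suggest_storage_class(reads_per_month=0, archival=True))
```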

AWS Multi-Account Architecture

AWS resources are deployed to specific geographic regions, and regions are independent of each other. Each region has multiple availability zones (AZs), each made up of one or more physically separate data centers with independent power, networking, and cooling. Deploying across multiple AZs protects against single-datacenter failures.

# List available regions
aws ec2 describe-regions --output table

# Get current region
aws configure get region

Account structure shapes your AWS environment. Organizations use consolidated billing to manage multiple accounts under a single payer. Common patterns include separate accounts per environment (dev, staging, production), per team, or per application domain.

Organization
├── Management Account (billing, SCPs)
├── Security Account (GuardDuty, Security Hub)
├── Dev Account
├── Staging Account
└── Production Account

Service Control Policies (SCPs) at the organization level restrict what can be done in member accounts. This enforces guardrails without managing IAM in every account.
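To make the guardrail idea concrete, here is a minimal sketch that builds one common kind of SCP: a deny on requests outside approved regions. The helper name, the `Sid`, and the list of exempted global services are assumptions for illustration; review them against your own requirements before attaching anything like this.

```python
import json


def region_guardrail_scp(allowed_regions):
    """Build an SCP document that denies requests outside the given regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",  # hypothetical name
                "Effect": "Deny",
                # Global services are exempted; this list is an illustrative assumption.
                "NotAction": ["iam:*", "organizations:*", "sts:*", "support:*"],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": list(allowed_regions)}
                },
            }
        ],
    }


policy = region_guardrail_scp(["us-east-1", "eu-west-1"])
print(json.dumps(policy, indent=2))
```

Because SCPs only restrict, member accounts still need IAM policies that grant access; the SCP sets the outer boundary.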

flowchart TD
    A[AWS Organization] --> B[Management Account]
    A --> C[Security Account]
    A --> D[Dev Account]
    A --> E[Staging Account]
    A --> F[Production Account]
    C --> G[GuardDuty]
    C --> H[Security Hub]
    D --> I[Dev VPC]
    E --> J[Staging VPC]
    F --> K[Production VPC]
    K --> L[ALB]
    L --> M[EKS Cluster]
    L --> N[ECS Tasks]
    M --> O[S3 Artifacts]
    N --> O

EC2 Instance Types and ASGs

EC2 provides virtual machines in the cloud. Instance types determine the CPU, memory, storage, and networking capacity. The naming pattern is family, generation, and size—for example, t3.micro is a burstable general purpose instance, third generation, micro size.
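The naming pattern can be illustrated with a small parser. This is a simplified sketch: the regular expression covers common names such as t3.micro or m6i.large, and a few special families that do not follow the pattern will be rejected.

```python
import re
from typing import NamedTuple


class InstanceType(NamedTuple):
    family: str       # e.g. "t", "m", "c", "r"
    generation: int   # e.g. 3, 5, 6
    attributes: str   # optional suffix, e.g. "i" (Intel) or "g" (Graviton)
    size: str         # e.g. "micro", "large", "24xlarge"


# letters, digits, optional attribute letters, a dot, then the size
_PATTERN = re.compile(r"^([a-z]+)(\d+)([a-z-]*)\.([a-z0-9-]+)$")


def parse_instance_type(name: str) -> InstanceType:
    """Split an instance type name into family, generation, attributes, and size."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type: {name!r}")
    family, generation, attributes, size = match.groups()
    return InstanceType(family, int(generation), attributes, size)


print(parse_instance_type("t3.micro"))
print(parse_instance_type("m6i.large"))
```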

# Launch an EC2 instance
aws ec2 run-instances \
  --image-id ami-0c55b159cbfafe1f0 \
  --instance-type t3.micro \
  --key-name my-key-pair \
  --security-group-ids sg-0123456789abcdef0 \
  --subnet-id subnet-0123456789abcdef0

# Describe instance status
aws ec2 describe-instance-status --instance-ids i-0abcdef1234567890

Auto Scaling Groups (ASGs) automatically adjust capacity based on demand. You define minimum, maximum, and desired capacity, along with scaling policies that trigger adjustments based on metrics like CPU utilization or request count.

# ASG CloudFormation snippet
AutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    MinSize: 2
    MaxSize: 10
    DesiredCapacity: 2
    VPCZoneIdentifier:
      - !Ref PrivateSubnet1
      - !Ref PrivateSubnet2
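    # Launch templates replace the older launch configurations.
    # LaunchTemplate is assumed to be defined elsewhere in this template.
    LaunchTemplate:
      LaunchTemplateId: !Ref LaunchTemplate
      Version: !GetAtt LaunchTemplate.LatestVersionNumber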
    TargetGroupARNs:
      - !Ref TargetGroup
    HealthCheckType: ELB
    HealthCheckGracePeriod: 300

ASGs work with Elastic Load Balancers to distribute traffic across healthy instances. The load balancer performs health checks and removes unhealthy instances from the rotation automatically.
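To give a feel for how a metric-driven policy turns into a capacity number, here is a rough sketch of proportional scaling toward a target value, clamped to the group's minimum and maximum. It approximates the idea behind target tracking and is not AWS's exact algorithm; the function name and parameters are hypothetical.

```python
import math


def target_tracking_capacity(current_capacity: int, metric_value: float,
                             target_value: float, min_size: int, max_size: int) -> int:
    """Estimate the capacity needed to bring the metric back to its target."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    # Scale capacity in proportion to how far the metric is from the target.
    proposed = math.ceil(current_capacity * metric_value / target_value)
    # Never go below MinSize or above MaxSize.
    return max(min_size, min(max_size, proposed))


# 2 instances at 90% CPU with a 50% target, bounded by MinSize=2 and MaxSize=10
print(target_tracking_capacity(2, 90, 50, min_size=2, max_size=10))
```

The clamp is the part that matters operationally: if MaxSize is too low, the group stops scaling no matter what the metric says.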

ECS Task Definitions and Services

Amazon Elastic Container Service (ECS) manages Docker containers on a cluster of EC2 instances or on AWS Fargate serverless compute. Task definitions describe which containers to run and the CPU and memory each one needs.

{
  "family": "webapp",
  "containerDefinitions": [
    {
      "name": "webapp",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/webapp:latest",
      "memory": 512,
      "cpu": 256,
      "essential": true,
      "portMappings": [
        {
          "containerPort": 8080,
          "protocol": "tcp"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/webapp",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}

An ECS service maintains a desired count of task instances and automatically replaces failed tasks. It integrates with Application Load Balancers for traffic distribution and Auto Scaling for dynamic capacity adjustment.

# Register a new task definition revision
aws ecs register-task-definition --cli-input-json file://task-definition.json

# Update service to use new revision
aws ecs update-service \
  --cluster production \
  --service webapp \
  --task-definition webapp:2

Fargate removes the need to manage EC2 instances for container workloads. You specify CPU and memory requirements, and AWS handles the underlying infrastructure. This simplifies operations at the cost of less granular control over the compute environment.
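Fargate only accepts certain CPU and memory pairings, so it can help to validate a task size before registering it. The sketch below covers the combinations up to 4 vCPU; larger sizes exist and are left out here, and the helper name is hypothetical.

```python
# CPU units -> allowed memory values in MiB (subset: up to 4 vCPU)
FARGATE_MEMORY_MIB = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def is_valid_fargate_size(cpu_units: int, memory_mib: int) -> bool:
    """Return True if the CPU/memory pair is in the table above."""
    return memory_mib in FARGATE_MEMORY_MIB.get(cpu_units, [])


print(is_valid_fargate_size(256, 512))    # the size used in the task definition above
print(is_valid_fargate_size(256, 4096))   # too much memory for 0.25 vCPU
```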

EKS Cluster Management Basics

Amazon Elastic Kubernetes Service (EKS) provides a managed Kubernetes control plane. AWS runs the control plane nodes; you manage the worker nodes and workloads.

# Create an EKS cluster
aws eks create-cluster \
  --name production \
  --role-arn arn:aws:iam::123456789012:role/eks-cluster-role \
  --resources-vpc-config subnetIds=subnet-0123456789abcdef0,subnet-0123456789abcdef1,securityGroupIds=sg-0123456789abcdef0

# Update kubeconfig
aws eks update-kubeconfig --name production

# Verify cluster access
kubectl get svc

EKS manages the Kubernetes control plane across multiple AZs for high availability. Worker nodes join the cluster via a node group, which can be managed by AWS (EKS Managed Node Groups) or self-managed.

# Node group configuration
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: production
  region: us-east-1

managedNodeGroups:
  - name: compute
    instanceType: t3.medium
    desiredCapacity: 3
    minSize: 2
    maxSize: 10
    volumeSize: 50
    ssh:
      allow: true

Kubernetes deployments, services, and ingresses work the same on EKS as on any Kubernetes cluster. The main difference is how you configure IAM roles for service accounts (IRSA) for workload authentication to AWS services.
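The core of IRSA is an IAM role whose trust policy names one Kubernetes service account. A minimal sketch of that trust policy follows; the account ID, OIDC provider path, namespace, and service account name are placeholders chosen for illustration.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Build a trust policy that lets a single service account assume the role."""
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Restrict the role to exactly one service account.
                        f"{oidc_provider}:sub":
                            f"system:serviceaccount:{namespace}:{service_account}",
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


trust = irsa_trust_policy(
    account_id="123456789012",                                     # placeholder
    oidc_provider="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",   # placeholder
    namespace="default",
    service_account="webapp",
)
print(json.dumps(trust, indent=2))
```

The Kubernetes side is an annotation on the service account (eks.amazonaws.com/role-arn) that points at the role carrying this trust policy.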

S3 for Artifact Storage

Amazon S3 stores objects in buckets. For DevOps, S3 typically holds build artifacts, deployment packages, and infrastructure state. S3 integrates with everything on AWS through IAM policies and resource-based bucket policies.

# Create a bucket for artifacts
aws s3 mb s3://my-app-artifacts --region us-east-1

# Upload a build artifact
aws s3 cp ./dist/app.tar.gz s3://my-app-artifacts/prod/

# List bucket contents
aws s3 ls s3://my-app-artifacts/prod/

# Enable versioning for artifact history
aws s3api put-bucket-versioning \
  --bucket my-app-artifacts \
  --versioning-configuration Status=Enabled

Lifecycle policies automate archival and deletion. Move old artifacts to cheaper storage classes automatically, or delete artifacts older than a retention period.

{
  "Rules": [
    {
      "ID": "ArchiveOldArtifacts",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "prod/"
      },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "GLACIER"
        }
      ],
      "Expiration": {
        "Days": 365
      }
    }
  ]
}
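To sanity-check a rule like this before applying it, a small simulation can show where an object of a given age would end up. This is a simplified sketch: it assumes objects start in STANDARD, ignores the prefix filter, and treats day boundaries as exact, whereas S3 applies lifecycle actions asynchronously.

```python
def lifecycle_state(age_days: int, rule: dict) -> str:
    """Return the storage class (or "EXPIRED") for an object of the given age."""
    if rule.get("Status") != "Enabled":
        return "STANDARD"
    expiration_days = rule.get("Expiration", {}).get("Days")
    if expiration_days is not None and age_days >= expiration_days:
        return "EXPIRED"
    state = "STANDARD"
    # Apply transitions in order of age; the last one reached wins.
    for transition in sorted(rule.get("Transitions", []), key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            state = transition["StorageClass"]
    return state


rule = {
    "ID": "ArchiveOldArtifacts",
    "Status": "Enabled",
    "Filter": {"Prefix": "prod/"},
    "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
    "Expiration": {"Days": 365},
}

for age in (10, 45, 400):
    print(age, lifecycle_state(age, rule))
```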

Lambda for Serverless Workloads

AWS Lambda runs code in response to events without provisioning servers. You pay only for the compute time consumed—billed in milliseconds. Lambda is ideal for event-driven tasks, API backends, and background processing.
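To make the billing model concrete, here is a rough cost sketch. The two price constants are assumptions (x86 on-demand rates at the time of writing) and should be checked against current pricing for your region; the free tier, data transfer, and provisioned concurrency are ignored.

```python
PRICE_PER_GB_SECOND = 0.0000166667    # assumed rate; verify before relying on it
PRICE_PER_MILLION_REQUESTS = 0.20     # assumed rate; verify before relying on it


def lambda_monthly_cost(invocations: int, avg_duration_ms: float, memory_mb: int) -> float:
    """Estimate monthly Lambda cost in USD from invocations, duration, and memory."""
    # Compute is billed in GB-seconds: duration in seconds times memory in GB.
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    request_cost = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    return round(compute_cost + request_cost, 2)


# 1M requests per month, 500 ms average duration, 512 MB memory
print(lambda_monthly_cost(1_000_000, 500, 512))
```

Under these assumptions, compute time rather than the per-request charge dominates the bill, which is why shortening duration or right-sizing memory usually pays off first.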

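// Lambda handler for processing S3 uploads

// Placeholder for your own processing logic (hypothetical helper).
async function processUpload(bucket, key) {
  return { bucket, key, processed: true };
}

exports.handler = async (event) => {
  const results = [];

  // A single S3 event can carry several records, so handle each one.
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    // Object keys arrive URL-encoded, with "+" standing in for spaces.
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));

    console.log(`Processing file: ${bucket}/${key}`);
    results.push(await processUpload(bucket, key));
  }

  return {
    statusCode: 200,
    body: JSON.stringify({ results }),
  };
};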

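By default, Lambda functions run outside your VPC, in a network managed by AWS, with access to public AWS service endpoints and the internet. To access VPC resources like RDS databases, configure the function with VPC subnet and security group attachments. A VPC-attached function loses direct internet access unless its subnets route through a NAT gateway or use VPC endpoints.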

# Create a Lambda function
aws lambda create-function \
  --function-name my-processor \
  --runtime nodejs20.x \
  --role arn:aws:iam::123456789012:role/lambda-execution-role \
  --handler index.handler \
  --zip-file fileb://function.zip \
  --vpc-config SubnetIds=subnet-0123456789abcdef0,SecurityGroupIds=sg-0123456789abcdef0

For more on managing AWS costs, see our post on Cost Optimization which covers EC2, Lambda, and S3 cost optimization strategies.

For more on securing AWS workloads, see Cloud Security for IAM best practices, VPC design, and encryption patterns, and Network Security for security groups, NACLs, and VPC endpoint configuration.

Trade-off Analysis

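| Scenario | EC2 | ECS/Fargate | EKS | Lambda |
|---|---|---|---|---|
| Full OS control needed | Yes | No | No | No |
| Serverless containers | No | Fargate launch type | Via Fargate profiles | No |
| Kubernetes ecosystem | No | No | Yes | No |
| Pay-per-second billing | Yes (60-second minimum on Linux) | Yes | Worker nodes per second; control plane billed hourly | Yes (per millisecond) |
| Cold start latency | None (instances already running) | Seconds | Seconds | Milliseconds to seconds |
| Long-running workloads | Best choice | Good | Good | Poor (15 min max) |
| Stateful workloads | Best choice | Limited | Good | No |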

Production Failure Scenarios

| Failure | Impact | Mitigation |
|---|---|---|
| ASG fails to scale due to ELB health check misconfiguration | Traffic routed to unhealthy instances, requests fail | Use ELB health check type, test scale-in manually |
| ECS task stuck in PENDING due to insufficient resources | Service capacity drops, requests queued or dropped | Set task completion timeouts, monitor pending count |
| EKS node group upgrade fails midway | Pods evicted before new nodes ready, service disruption | Tune the node group's max-unavailable update setting, upgrade one node at a time |
| S3 bucket policy denies access unexpectedly | Application cannot read/write artifacts, deployments fail | Use IAM Access Analyzer, test bucket policies in dev first |
| Lambda VPC config causes cold start timeouts | Requests time out during scale-up | Initialize connections outside the handler, use provisioned concurrency |

AWS Observability Hooks

EC2 and ASG monitoring:

# Get EC2 instance metrics
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-0abcdef1234567890 \
  --start-time 2026-03-24T00:00:00 \
  --end-time 2026-03-25T00:00:00 \
  --period 3600 \
  --statistics Average

# Check ASG health status
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names my-asg \
  --query 'AutoScalingGroups[0].Instances[*].[InstanceId,HealthStatus,LifecycleState]'

ECS monitoring:

# Check service health and running task count
aws ecs describe-services \
  --cluster production \
  --services webapp \
  --query 'services[0].{runningCount:runningCount,desiredCount:desiredCount,pendingCount:pendingCount}'

EKS monitoring:

# Check node health and pod distribution
kubectl get nodes -o wide
kubectl get pods -o wide --all-namespaces | grep -v Running

# Get cluster control plane health
aws eks describe-cluster \
  --name production \
  --query 'cluster.{status:status,version:version,endpoint:endpoint}'

Key CloudWatch metrics to alert on:

| Service | Metric | Alert Threshold |
|---|---|---|
| EC2 | CPUUtilization | > 80% for 5 minutes |
| EC2 | StatusCheckFailed | any for 2 minutes |
| ASG | CPUUtilization | > 75% for 3 minutes |
| ECS | CPUUtilization | > 85% for 3 minutes |
| ECS | RunningTaskCount | < desired for 2 minutes |
| Lambda | Errors | > 0 for 5 minutes |
| Lambda | Duration | > 3000ms p99 |
| S3 | BucketSizeBytes | unexpected change |
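A threshold such as "> 80% for 5 minutes" means every datapoint in the window must breach, not just one. The sketch below shows that logic for one-minute datapoints; it is a simplified stand-in for how an alarm evaluates consecutive periods, and the function name is hypothetical.

```python
def breaches(datapoints, threshold, periods, comparison=">"):
    """Return True when the most recent `periods` datapoints all breach the threshold."""
    if periods <= 0 or len(datapoints) < periods:
        return False  # not enough data to decide
    recent = datapoints[-periods:]
    if comparison == ">":
        return all(value > threshold for value in recent)
    if comparison == "<":
        return all(value < threshold for value in recent)
    raise ValueError(f"unsupported comparison: {comparison!r}")


cpu = [70, 85, 90, 88, 91, 95]  # one datapoint per minute
print(breaches(cpu, threshold=80, periods=5))   # last five are all above 80
print(breaches(cpu, threshold=80, periods=6))   # the first datapoint is below 80
```

Requiring several consecutive breaches trades a slower alert for fewer false alarms from brief spikes.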

Common Pitfalls / Anti-Patterns

Using the default VPC. Every account gets its own default VPC in each region, and its default subnets are public: instances launched there receive public IP addresses unless you change the settings. Production workloads should use dedicated VPCs with explicit networking controls.

Attaching IAM policies directly to users instead of roles. Direct user policies create credential management nightmares and are harder to audit. Always use IAM roles for EC2, Lambda, and other compute services.

Not configuring ASG health checks properly. If your health check is too lenient, unhealthy instances stay in service. If it is too strict, instances get replaced during legitimate load spikes. Match health check type to your application needs.

Storing secrets in S3 object metadata or user data. Server-side encryption covers object data but not object metadata, and EC2 user data can be read by anything running on the instance. Use AWS Secrets Manager or Systems Manager Parameter Store instead.

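Using Lambda VPC config without considering cold starts. VPC-enabled Lambda functions once had to create an ENI during each cold start, which added several seconds. Since AWS changed Lambda VPC networking in 2019, ENIs are set up when the function is created or its VPC configuration changes, so the cold-start penalty is much smaller. What remains is that a VPC-attached function has no internet access without a NAT gateway or VPC endpoints. For latency-sensitive APIs, initialize connections outside the handler and consider provisioned concurrency.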

Capacity Estimation and Benchmark Data

Use these numbers for initial capacity planning. Actual performance varies by workload characteristics.

EC2 Instance Type Families

| Family | Best For | Instance Types | Network Performance |
|---|---|---|---|
| t3 | Burstable workloads, dev/test | t3.nano → t3.2xlarge | Up to 5 Gbps |
| m5 | General purpose, web servers | m5.large → m5.24xlarge | Up to 25 Gbps |
| m6i | General purpose (newer gen) | m6i.large → m6i.32xlarge | Up to 50 Gbps |
| c5 | Compute optimized, batch processing | c5.large → c5.24xlarge | Up to 25 Gbps |
| c6i | Compute optimized (newer gen) | c6i.large → c6i.32xlarge | Up to 50 Gbps |
| r5 | Memory optimized, databases | r5.large → r5.24xlarge | Up to 25 Gbps |
| r6i | Memory optimized (newer gen) | r6i.large → r6i.32xlarge | Up to 50 Gbps |

Lambda Performance Parameters

| Parameter | Value | Notes |
|---|---|---|
| Cold start (no VPC) | 100-500ms | Depends on runtime and package size |
| Cold start (with VPC) | Close to the non-VPC figure | Since 2019, ENIs are created at configuration time rather than per cold start; older guidance cites 1-10 seconds |
| Provisioned concurrency | ~50ms | Eliminates cold starts for warmed functions |
| Max execution duration | 900 seconds (15 min) | Configure timeout based on workload |
| Default memory | 128 MB | Increase memory to boost CPU proportionally |
| Max concurrent executions | 1,000 per region | Default account quota; request a limit increase if needed |

S3 Performance Targets

| Metric | Value |
|---|---|
| PUT/LIST/DELETE latency | 100-200ms (p99) |
| GET latency | 60-100ms (p99) |
| Max request rate per prefix | 3,500 PUT/COPY/POST/DELETE, 5,500 GET/HEAD per second |
| Multi-object delete | Up to 1,000 keys per request |
| Transfer acceleration | Adds 20-30% to upload speed for distant regions |
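Because the request-rate figures apply per prefix, a quick calculation shows how many prefixes a workload needs to spread its keys across. This is a back-of-the-envelope sketch that uses the per-prefix rates from the table; the function name is hypothetical.

```python
import math

GET_PER_PREFIX = 5500   # GET/HEAD requests per second per prefix
PUT_PER_PREFIX = 3500   # PUT/COPY/POST/DELETE requests per second per prefix


def prefixes_needed(get_rps: float, put_rps: float) -> int:
    """Return the minimum number of prefixes needed to sustain the given rates."""
    return max(
        1,
        math.ceil(get_rps / GET_PER_PREFIX),
        math.ceil(put_rps / PUT_PER_PREFIX),
    )


print(prefixes_needed(get_rps=20_000, put_rps=0))
print(prefixes_needed(get_rps=1_000, put_rps=8_000))
```

In practice this means designing key names so that hot objects do not all share one prefix.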

Service Limits for Planning

| Service | Default Limit | How to Increase |
|---|---|---|
| Lambda concurrent executions | 1,000 per region | AWS Support |
| API Gateway requests per second | 10,000 per region | AWS Support |
| EBS volumes per account | 5,000 | AWS Support |
| VPCs per region | 5 | AWS Support |
| ENIs per instance (varies by type) | 3-15 | Instance type dependent |

Interview Questions

1. How does an Auto Scaling Group determine when to scale out and what metrics does it use?

What to cover:

  • ASGs respond to CloudWatch metrics: CPU utilization, request count per target, and custom metrics such as memory (published by the CloudWatch agent)
  • Define min/max/desired capacity; ASG adjusts between min and max based on policies
  • Scaling policies: step scaling (add/remove instances in steps), target tracking (keep metric at value)
  • Health checks: ELB health checks mark unhealthy instances; ASG replaces them
  • Cooldown period prevents flapping; wait period before next scaling action
2. What is the difference between ECS with EC2 launch type and ECS with Fargate launch type?

What to cover:

  • EC2 launch type: you manage the EC2 fleet; more control over instance type and cost
  • Fargate: serverless; AWS manages the underlying nodes; you pay per task resource
  • Fargate removes SSH access and node-level customization
  • Fargate good for variable workloads; EC2 better for consistent high-throughput with reserved instances
  • Both use same task definitions and service scheduler; migration is straightforward
3. Walk through how you would set up IRSA (IAM Role for Service Accounts) for an EKS workload.

What to cover:

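  • Associate an IAM OIDC provider with the cluster, for example: eksctl utils associate-iam-oidc-provider --cluster production --approve
  • Create an IAM role whose trust policy allows sts:AssumeRoleWithWebIdentity for the service account, for example: eksctl create iamserviceaccount --cluster production --namespace default --name webapp --attach-policy-arn <policy-arn> --approve
  • Annotate the Kubernetes service account with the IAM role ARN (eks.amazonaws.com/role-arn); eksctl does this for you
  • The pod receives a projected service account token, and the AWS SDK exchanges it for temporary credentials via OIDC
  • No need to store keys in secrets; credentials rotate automatically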
4. How do you choose between S3 Standard, S3 IA, and S3 Glacier for artifact storage?

What to cover:

  • S3 Standard: frequently accessed (> once per month), immediate retrieval, highest storage cost
  • S3 IA: infrequent access (< once per month), lower storage cost, retrieval fees apply
  • S3 Glacier: archival, retrieval in minutes to hours, cheapest storage, access cost higher
  • Use lifecycle policies: move to IA after 30 days, Glacier after 90 days, delete after 365
  • Versioning + lifecycle = artifact history retained economically
5. What are the trade-offs between Lambda VPC config and Lambda outside VPC?

What to cover:

  • Lambda outside VPC: cold start 100-500ms, full AWS service access, internet access
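  • Lambda inside VPC: no direct internet access without a NAT gateway or VPC endpoints; the old ENI-per-cold-start penalty of several seconds largely disappeared after the 2019 networking change
  • VPC needed for: RDS, ElastiCache, private API Gateways, internal services
  • Solution: initialize connections outside the handler so warm invocations reuse them, and use provisioned concurrency for latency-sensitive paths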
  • Consider: do you actually need VPC access or can you use AWS services directly?
6. How does multi-account AWS organization help with security and billing?

What to cover:

  • SCP (Service Control Policies) enforce guardrails at organization level across all accounts
  • Separate accounts per environment: dev/staging/prod isolate blast radius
  • Separate accounts per team or application domain for clean IAM boundaries
  • Consolidated billing: one payer account, track costs by account/tag
  • Security account aggregates GuardDuty, Security Hub findings centrally
7. What is the difference between EKS managed node groups and self-managed node groups?

What to cover:

  • Managed node groups: AWS handles node provisioning, updates, and termination
  • Managed: you specify instance type and count; AWS handles lifecycle
  • Self-managed: you create AMIs, manage kubelet, handle upgrades manually
  • Managed node groups support SSH with key pair if needed
  • Use managed for baseline (custom AMIs are possible through a launch template); use self-managed when you need full control over bootstrap, kernel versions, or upgrade behavior
8. How do you secure S3 bucket access? Walk through the options.

What to cover:

  • Block public access: bucket settings override bucket policies
  • IAM policies: grant access to specific buckets/prefixes per role
  • Bucket policies: JSON policies attached to bucket, can grant cross-account access
  • Access Analyzer: checks bucket policy for external access risks
  • VPC endpoints: access from within VPC without internet
  • Encrypt: SSE-KMS with CMK for audit trail of encryption key usage
9. What happens when an ECS task fails its health check?

What to cover:

  • ECS service scheduler marks task as unhealthy after grace period
  • Unhealthy task is stopped and replaced; new task launches if capacity allows
  • Health check grace period gives time for application to initialize
  • If task is stuck in PENDING: not enough resources, image pull failures, or health check misconfiguration
  • Check: task definition health check, container port mappings, startup time
10. How do you estimate Lambda costs for a production API and what factors affect the bill?

What to cover:

  • Billed per invocation and per GB-second of execution time
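  • Duration: billed in 1 ms increments; optimize memory/CPU tradeoffs
  • Data transfer: VPC egress charges apply; provisioned concurrency has hourly cost
  • Cold starts: add latency; whether initialization time is billed depends on the runtime and configuration, so check current pricing
  • Estimate: 1M requests × 500ms × 512MB = 250,000 GB-seconds; at about $0.0000167 per GB-second plus $0.20 per million requests, roughly $4.37/month before the free tier (very rough)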
11. What are the key differences between Application Load Balancer (ALB) and Network Load Balancer (NLB) in AWS?

What to cover:

  • ALB operates at layer 7 (HTTP/HTTPS), NLB operates at layer 4 (TCP/UDP)
  • ALB supports path-based routing, host-based routing, and content-based routing
  • ALB terminates TLS and forwards decrypted traffic; NLB can pass encrypted traffic through on a TCP listener or terminate it on a TLS listener
  • NLB handles millions of requests per second with lower latency; ALB adds ~1-2ms latency
  • ALB integrates with ECS services for dynamic port mapping; NLB for high-throughput non-HTTP workloads
  • ALB health checks use HTTP or HTTPS; NLB health checks can use TCP, HTTP, or HTTPS
12. How does Amazon ECR integrate with ECS and EKS for container image management?

What to cover:

  • ECR stores container images in a managed registry backed by S3 for durability
  • ECS task definitions reference ECR image URLs: `123456789012.dkr.ecr.us-east-1.amazonaws.com/webapp:latest`
  • IAM policies control who can pull images from which repositories
  • Image scanning on push detects CVEs; blocking vulnerable images from deploying requires a pipeline gate that checks the scan findings
  • Lifecycle policies auto-expire old image versions to reduce storage costs
  • ECR works with both ECS and EKS—same registry, different pull authentication
13. What is the difference between IAM roles and IAM users for AWS access?

What to cover:

  • Users have permanent access keys (long-term credentials); roles provide temporary credentials
  • Roles are assumed by identities (users, services, applications) for specific tasks
  • For EC2, Lambda, ECS: use instance profiles or task roles—no need to store keys
  • IAM users are for human access; service roles are for machine-to-machine access
  • Roles prevent credential leakage—keys cannot be stolen if keys do not exist
  • Use IAM roles for federation: users assume a role to get temporary elevated access
14. How does ASG health check type (EC2 vs ELB) affect instance replacement behavior?

What to cover:

  • EC2 health check: marks instance unhealthy if the instance status or system status becomes impaired
  • ELB health check: marks instance unhealthy if the ELB reports the instance as failed via its health check
  • ELB health check is more application-aware—checks if your service responds, not just if EC2 is running
  • Using EC2 health check when the application can be unhealthy but EC2 is fine leads to traffic to bad instances
  • Using ELB health check when the app is fine but the ELB health check endpoint is wrong leads to unnecessary replacements
15. What are the trade-offs between S3 Standard and S3 Intelligent-Tiering?

What to cover:

  • S3 Standard: highest storage cost, immediate access, no retrieval fees
  • S3 Intelligent-Tiering: monitors access patterns, auto-moves objects to lower-cost tiers after 30 days of no access
  • Intelligent-Tiering has a small monthly monitoring fee per object but charges no retrieval fees when objects move back to the frequent access tier
  • Best for: unpredictable access patterns, applications where you do not know access frequency in advance
  • Not best for: predictable hot data (Standard is cheaper), data accessed very frequently
16. How does Lambda concurrency work and what happens when you hit the reserved concurrency limit?

What to cover:

  • Lambda scales automatically up to 1000 concurrent executions per region by default
  • Reserved concurrency: sets aside part of the account's concurrency for a function and also caps the function at that number
  • When reserved concurrency is exhausted, new invocations get throttled (429 Too Many Requests)
  • Provisioned concurrency: pre-warms instances to eliminate cold starts for a reserved allocation
  • Use reserved concurrency to prevent one function from consuming all regional capacity
  • Throttled invocations can be retried or routed to a dead-letter queue
17. What is the difference between ECS task definition revision and task definition family?

What to cover:

  • Family: the name of the task definition, like a versioned template
  • Revision: a specific version of the family (webapp:1, webapp:2, webapp:3)
  • When you register a new task definition, you specify family and get a new revision number
  • ECS service references a specific revision (webapp:2); updating the service picks up new revisions
  • Family groups related task definitions—webapp-service and webapp-worker might be separate families
18. How does VPC endpoint for S3 work and why would you use it?

What to cover:

  • VPC endpoint creates a private connection from your VPC to S3 without internet traversal
  • Without VPC endpoint, traffic to S3 goes through NAT gateway or internet gateway
  • The gateway endpoint for S3 is free (interface endpoints are billed hourly and per GB); NAT gateway has hourly cost plus data processing cost
  • Endpoint policy controls which S3 buckets can be accessed from the endpoint
  • Use VPC endpoints for: improved security (no internet exposure), cost reduction, lower latency
  • VPC endpoint for DynamoDB is separate from S3—create both for complete private AWS access
19. What is the purpose of EKS cluster endpoint access control (public vs private)?

What to cover:

  • Public endpoint: kubectl access from anywhere with authentication via AWS IAM
  • Private endpoint: kubectl access only from within the VPC—more secure for private clusters
  • Best practice: enable both, and restrict the public endpoint to an allowlist of CIDR blocks
  • Private endpoint uses VPC internal DNS to resolve the cluster endpoint address
  • For hybrid scenarios, public endpoint with restricted CIDR blocks is a common pattern
20. How does S3 request throughput scale, and does S3 offer provisioned throughput?

What to cover:

  • S3 has no burst or provisioned throughput modes; that distinction belongs to file systems such as EFS
  • S3 scales automatically to at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix
  • To go higher, spread keys across more prefixes and parallelize requests
  • While S3 scales up to a new request rate it can return 503 Slow Down; retry with exponential backoff
  • For large objects, use multipart upload and byte-range GETs to raise per-object throughput

Conclusion

Key Takeaways

  • EC2 gives full control but maximum operational burden; use for legacy workloads and specific hardware needs
  • ECS with Fargate removes EC2 management; best for teams wanting containers without Kubernetes complexity
  • EKS provides Kubernetes portability; best for multi-cloud strategies and teams with Kubernetes expertise
  • Lambda is ideal for event-driven, short-running workloads; not suitable for long processes or stateful operations
  • Multi-account AWS organizations enforce guardrails via SCPs and simplify billing tracking

AWS Onboarding Checklist

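# 1. Create organization and enable SCPs
aws organizations create-organization --feature-set ALL
# r-xxxx is a placeholder for your root ID (find it with: aws organizations list-roots)
aws organizations enable-policy-type \
  --root-id r-xxxx \
  --policy-type SERVICE_CONTROL_POLICY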

# 2. Set up VPC for production
aws ec2 create-vpc --cidr-block 10.0.0.0/16
aws ec2 create-subnet --vpc-id vpc-xxx --cidr-block 10.0.1.0/24

# 3. Create ECS cluster with Fargate
aws ecs create-cluster --cluster-name production --capacity-providers FARGATE

# 4. Set up CloudWatch alarms for critical metrics
aws cloudwatch put-metric-alarm \
  --alarm-name EC2-High-CPU \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --comparison-operator GreaterThanThreshold \
  --threshold 80 \
  --period 300 \
  --evaluation-periods 1

# 5. Enable S3 versioning on artifact bucket
aws s3api put-bucket-versioning \
  --bucket my-artifacts \
  --versioning-configuration Status=Enabled
