GitHub - aws-samples/sample-devops-agent-ecs-workshop

⚠️ Disclaimer: This repository includes intentional fault injection and stress test scenarios designed to demonstrate the AWS DevOps Agent's investigation capabilities. These scripts deliberately introduce issues such as memory leaks, network partitions, database stress, and service latency. Do not run these scripts in production environments. They are intended for learning and demonstration purposes only.

📦 Source Code: The source code for the Retail Store Sample Application can be found at: https://github.com/aws-containers/retail-store-sample-app

AWS DevOps Agent - ECS Troubleshooting Lab

A self-paced hands-on lab for learning ECS troubleshooting with AWS DevOps Agent

📚 Index

Section	Description
Overview	Lab introduction and learning objectives
Application Architecture	Microservices and infrastructure components
Quick Start	Deploy the infrastructure
AWS DevOps Agent Setup	Configure the DevOps Agent
Troubleshooting Labs	10 hands-on labs
Observability	CloudWatch monitoring setup
Cleanup	Destroy resources

⚠️ Important: Platform Requirements

The lab scripts (inject/fix) require a Linux/macOS bash environment.

Windows Users: The fault injection scripts are shell scripts that will not run natively on Windows. You have two options:

Recommended: Use AWS CloudShell - a browser-based shell with AWS CLI pre-installed

Alternative: Use WSL2 (Windows Subsystem for Linux), Git Bash, or SSH into a Linux EC2 instance

Terraform commands can be run from any terminal (Windows PowerShell, CMD, or Linux/macOS).

🚀 Ready to Deploy?

If you're familiar with ECS and just want to get started:

git clone https://github.com/aws-samples/sample-devops-agent-ecs-workshop.git
cd sample-devops-agent-ecs-workshop/terraform/ecs/default
terraform init && terraform apply

Skip to Deployment →

Overview

This lab provides a production-style Amazon ECS deployment environment for learning how to troubleshoot containerized applications using AWS DevOps Agent. You'll deploy a multi-service retail store application, inject real faults, and use the DevOps Agent to investigate and resolve issues.

Lab Information	Details
Duration	2-3 hours
Level	300 (Advanced)
Target Audience	DevOps Engineers, SREs, Platform Engineers
Prerequisites	Basic AWS knowledge, familiarity with containers
Cost	~$3-4/hour (remember to clean up!)

This project is intended for educational purposes only and not for production use.

What You'll Learn

Deploy a distributed microservices application to Amazon ECS using Terraform
Configure AWS DevOps Agent to monitor your ECS infrastructure
Execute chaos engineering experiments using fault injection
Use DevOps Agent to investigate incidents and identify root causes
Apply recommended mitigations to resolve issues

Application Architecture

The lab deploys the AWS Retail Store Sample Application, a fully functional e-commerce application consisting of 5 microservices:

Microservices

Service	Language	Description	Backend
UI	Java (Spring Boot)	Store frontend, serves web pages	Calls other services
Catalog	Go	Product catalog API	RDS MariaDB
Cart	Java (Spring Boot)	Shopping cart management	DynamoDB
Checkout	Node.js (NestJS)	Checkout orchestration	ElastiCache Redis
Orders	Java (Spring Boot)	Order processing	RDS MariaDB + Amazon MQ

Note: This lab uses pre-built container images from Amazon ECR. The application source code is available in the AWS Retail Store Sample App repository.

Infrastructure Components

Category	Components
Compute	ECS Cluster (Fargate), 5 ECS Services, Application Load Balancer
Data Stores	RDS MariaDB (Catalog, Orders), DynamoDB (Cart), ElastiCache Redis (Checkout), Amazon MQ (Orders)
Networking	VPC with public/private subnets, NAT Gateway, Security Groups, ECS Service Connect
Observability	CloudWatch Container Insights (Enhanced), CloudWatch Logs, Alarms, Dashboard

Resource Tagging

All resources are tagged with ecsdevopsagent=true to enable AWS DevOps Agent discovery. This tag is applied to:

ECS Cluster and Services
RDS Database instances
DynamoDB Tables
ElastiCache clusters
Application Load Balancer
CloudWatch Log Groups
IAM Roles
Security Groups

Quick Start

Prerequisites

Git - Installation guide
AWS CLI - Installed and configured with appropriate credentials (Installation guide)
Terraform >= 1.0 - Installation guide
Session Manager Plugin - Required for ECS Exec (Installation guide)
jq - JSON processor for lab scripts (Installation guide)
AWS Permissions - Administrator access recommended. The lab creates multiple AWS resources (ECS, RDS, DynamoDB, ElastiCache, Amazon MQ, VPC, IAM roles, etc.). Using limited permissions may result in deployment failures.
Bash Shell (for lab scripts) - macOS/Linux terminal, AWS CloudShell, WSL2, or Git Bash on Windows

Step 1: Clone the Repository

git clone https://github.com/aws-samples/sample-devops-agent-ecs-workshop.git
cd sample-devops-agent-ecs-workshop

Step 2: Deploy Infrastructure

# Navigate to Terraform directory
cd terraform/ecs/default

# Initialize Terraform
terraform init

# Preview changes (optional)
terraform plan

# Deploy (~15-20 minutes)
terraform apply
# Type 'yes' when prompted

Step 3: Verify Deployment

After Terraform completes, it displays output values including the application URL:

Outputs:

ecs_cluster_name = "retail-store-ecs-cluster"
ui_service_url = "http://retail-xxxxx.us-east-1.elb.amazonaws.com"

Verify the application is running:

Copy the ui_service_url from the Terraform output
Open it in your browser - you should see the Retail Store home page
Verify services in the ECS Console → Clusters → retail-store-ecs-cluster → Services

You should see all 5 services running with 1/1 tasks.

Optional: Verify via CLI (Linux/macOS/CloudShell only)

# Get application URL
APP_URL=$(terraform output -raw ui_service_url)
echo "Application URL: $APP_URL"

# Test the application
curl -I "$APP_URL"

# Verify all services are running
aws ecs describe-services \
  --cluster "$(terraform output -raw ecs_cluster_name)" \
  --services ui catalog carts checkout orders \
  --query 'services[*].[serviceName,runningCount,desiredCount]' \
  --output table
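If you want to script this check rather than read the table by eye, the sketch below flags only the services whose running count is below the desired count. It assumes the same query is run with `--output text`, which yields one whitespace-separated row per service; `find_unhealthy_services` is a hypothetical helper name, not part of the lab scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the lab scripts): reads rows of
# "name running desired" on stdin and prints the services whose running
# count is lower than the desired count.
find_unhealthy_services() {
  local name running desired
  while read -r name running desired || [ -n "$name" ]; do
    [ -z "$name" ] && continue
    if [ "$running" -lt "$desired" ]; then
      printf '%s %s/%s\n' "$name" "$running" "$desired"
    fi
  done
}

# Example wiring (run from terraform/ecs/default so terraform output resolves):
#   aws ecs describe-services \
#     --cluster "$(terraform output -raw ecs_cluster_name)" \
#     --services ui catalog carts checkout orders \
#     --query 'services[*].[serviceName,runningCount,desiredCount]' \
#     --output text | find_unhealthy_services
```

No output means every listed service has reached its desired task count.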

Step 4: Access the Application

Open the APP_URL in your browser. You should see the Retail Store home page.

Test the application by walking through each page:

Home Page - Featured products and categories
Catalog - Browse all products (powered by Catalog service)
Cart - Add/remove items (powered by Carts service)
Checkout - Complete your purchase (powered by Checkout service)
Orders - Order confirmation (powered by Orders service)

AWS DevOps Agent Setup

AWS DevOps Agent is a frontier AI agent that helps accelerate incident response and improve system reliability. It investigates incidents and identifies operational improvements like an experienced DevOps engineer.

Note: AWS DevOps Agent is currently in public preview and available in US East (N. Virginia) (us-east-1). The agent can monitor applications deployed in any AWS region.

What is an Agent Space?

An Agent Space is a logical container that defines the tools and infrastructure that AWS DevOps Agent has access to. It represents the boundary of what the agent can access and investigate during incident response.

The agent uses a dual-console architecture:

AWS Management Console - Administrators create and manage Agent Spaces, configure integrations, and set up access controls
DevOps Agent Web App - Operations teams use this for day-to-day incident response, investigations, and viewing recommendations

Step 1: Create an Agent Space

Navigate to the AWS DevOps Agent Console
Click Begin setup (or Create Agent Space if you have existing spaces)
Enter details:
- Name: retail-store-ecs-lab
- Description: Agent Space for ECS Troubleshooting Lab

Step 2: Configure IAM Roles

In Give this Agent Space AWS resource access, select Auto-create a new DevOps Agent role
Review the permissions that will be granted to the role
(Optional) Customize the role name if desired

Step 3: Configure Resource Discovery with Tags

Since this lab uses Terraform (not CloudFormation), you need to add a tag so the agent can discover your resources.

In the Include AWS tags section, click Add tag
Add tag: ecsdevopsagent = true

This tag enables the DevOps Agent to discover all lab resources including ECS cluster, services, RDS databases, DynamoDB tables, and related infrastructure.

Step 4: Enable Web App Access

In Enabling the Agent Space Web App, select Auto-create a new AWS DevOps Agent role
Review the permissions that will be granted
Leave other settings as default
Click Create

Step 5: Verify Setup

Wait 1-2 minutes for the Agent Space to be created
Click Admin access to open the Web App
Navigate to DevOps Center to view the discovered topology
Verify you can see the ECS cluster and services

You should see the following resources discovered:

ECS Cluster: retail-store-ecs-cluster
ECS Services: ui, catalog, carts, checkout, orders
RDS Instances: catalog-db, orders-db
DynamoDB Table: carts
ElastiCache: checkout-redis
Amazon MQ: RabbitMQ broker

Verify Resource Discovery (Optional - CLI)

Note: These commands require a bash shell (Linux/macOS/CloudShell)

# Verify ECS cluster tags
aws ecs describe-clusters --clusters retail-store-ecs-cluster \
  --query 'clusters[0].tags' --output table

# List all resources with the ecsdevopsagent tag
aws resourcegroupstaggingapi get-resources \
  --tag-filters Key=ecsdevopsagent,Values=true \
  --query 'ResourceTagMappingList[].ResourceARN' --output table

Starting an Investigation

From the DevOps Agent Web App:

Click Start Investigation

Enter a prompt describing what you want to investigate:

Check the health of my ECS services in the retail-store-ecs-cluster

Leave other options as default and click Start Investigating
The agent will analyze your infrastructure and provide insights

Safety Mechanisms

Mechanism	Description
Read-Only by Default	The agent only reads data; it does not modify resources
Scoped Access	Access is limited to resources within the Agent Space
Audit Logging	All agent actions are logged to CloudTrail
Human-in-the-Loop	Mitigation recommendations require human approval

Troubleshooting Labs

⚠️ Windows Users: The lab scripts require a bash shell environment. Use one of these options:

AWS CloudShell (Recommended) - Browser-based, no setup required

WSL2 (Windows Subsystem for Linux)

Git Bash (comes with Git for Windows)

SSH into a Linux EC2 instance

Before running lab scripts, ensure you have jq installed: jq --version

The labs are organized into two categories:

Configuration Labs (Labs 1-6)

These labs focus on common ECS misconfigurations that cause service failures:

Lab	Issue	Service	Difficulty
Lab 1	CloudWatch Logs Not Delivered	Catalog	Basic
Lab 2	Unable to Pull Secrets	Orders	Basic
Lab 3	Health Check Failures	UI	Basic
Lab 4	Security Group Blocked (Database Connectivity)	Catalog → RDS	Intermediate
Lab 5	Task Resource Limits (OOM)	Checkout	Intermediate
Lab 6	Service Connect Communication Broken	UI → Catalog	Intermediate

Performance Labs (Labs 7-10)

These labs inject real performance issues to simulate production incidents:

Lab	Issue	Service	Difficulty
Lab 7	CPU Stress	Catalog	Intermediate
Lab 8	DDoS Attack Simulation	UI/ALB	Advanced
Lab 9	DynamoDB Attack	Carts	Advanced
Lab 10	Auto-Scaling Not Working	Catalog	Advanced

Lab Workflow

Each lab follows a consistent pattern:

┌─────────────────────┐     ┌─────────────────────┐     ┌─────────────────────┐
│  1. Inject Fault    │────▶│  2. Observe Symptoms│────▶│  3. Start           │
│  (run inject script)│     │  (check app/metrics)│     │  Investigation      │
└─────────────────────┘     └─────────────────────┘     └──────────┬──────────┘
                                                                   │
                                                                   ▼
┌─────────────────────┐     ┌─────────────────────┐     ┌─────────────────────┐
│  6. Rollback Fault  │◀────│  5. Apply Fix       │◀────│  4. Agent Analyzes  │
│  (run rollback      │     │  (follow agent      │     │  & Identifies Root  │
│   script)           │     │   recommendations)  │     │  Cause              │
└─────────────────────┘     └─────────────────────┘     └─────────────────────┘

Lab 1: CloudWatch Logs Not Delivered

Scenario: The catalog service has stopped sending logs to CloudWatch. Without logs, you can't monitor the service's health or debug issues.

Inject:

./labs/lab1-logs-not-delivered/inject.sh

Symptoms:

Catalog service tasks failing to start
Service events showing ResourceInitializationError
No new logs appearing in CloudWatch

Investigation Prompts:

Why is the catalog service failing to start new tasks?

Check the ECS service events for the catalog service

Root Cause: Task definition references a non-existent CloudWatch log group.
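To confirm this root cause by hand, one option is to compare the log group named in the task definition with the log groups that actually exist. The sketch below is a minimal illustration; `log_group_exists` is a hypothetical helper, and the task definition family and log group values in the usage comments are placeholders you would need to fill in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the exact log group name passed as $1
# appears in the newline-separated list read from stdin.
log_group_exists() {
  grep -Fxq -- "$1"
}

# Example wiring (placeholders in angle brackets are assumptions):
#   # 1. Read the log group the task definition points at
#   aws ecs describe-task-definition --task-definition <task-definition-family> \
#     --query 'taskDefinition.containerDefinitions[].logConfiguration.options."awslogs-group"' \
#     --output text
#
#   # 2. Check whether that log group exists
#   aws logs describe-log-groups --log-group-name-prefix "<log-group-name>" \
#     --query 'logGroups[].logGroupName' --output text \
#     | tr '\t' '\n' | log_group_exists "<log-group-name>" \
#     && echo "log group exists" || echo "log group is missing"
```

The exact-match check matters because `--log-group-name-prefix` also returns groups that merely start with the same string.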

Fix:

./labs/lab1-logs-not-delivered/fix.sh

Lab 2: Unable to Pull Secrets

Scenario: The orders service can't start because it can't retrieve database credentials from Secrets Manager.

Inject:

./labs/lab2-secrets-access-denied/inject.sh

Symptoms:

Orders service tasks fail to start
Error: "unable to pull secrets or registry auth"
Customers cannot place orders

Investigation Prompts:

Why is the orders service failing to start?

What IAM permissions does the orders service task execution role have?

Root Cause: Task execution role is missing secretsmanager:GetSecretValue permission.

Fix:

./labs/lab2-secrets-access-denied/fix.sh

Lab 3: Health Check Failures

Scenario: The UI service tasks keep restarting every few minutes. Customers see intermittent 503 errors.

Inject:

./labs/lab3-health-check-failures/inject.sh

Symptoms:

Tasks continuously restart
Service never stabilizes
Service events show "unhealthy" messages

Investigation Prompts:

Why does the UI service keep restarting tasks?

What health check configuration is the UI service using?

Root Cause: Health check path is misconfigured (/wrong-health-endpoint instead of /actuator/health).

Fix:

./labs/lab3-health-check-failures/fix.sh

Lab 4: Security Group Blocked

Scenario: The product catalog stopped loading. The catalog service is running but returns errors when fetching products. Database connection timeouts appear in the logs.

Inject:

./labs/lab4-security-group-blocked/inject.sh

Symptoms:

Catalog returns errors
Service is running and healthy
Database connection timeouts in logs
RDS appears healthy

Investigation Prompts:

The catalog service can't connect to the database. What's wrong?

What security groups are attached to the catalog service and the RDS database?

Root Cause: RDS security group is missing ingress rule allowing traffic from catalog service on port 3306.
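A manual check of this root cause could look like the sketch below, which tests whether any ingress rule on the database security group admits a given port from a given source security group. `ingress_allows_port` is a hypothetical helper, the security group IDs are placeholders, and the helper only considers rules that reference a source security group (CIDR-based rules are ignored).

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads rows of "FromPort ToPort SourceGroupId" on stdin
# and succeeds if some row allows the port ($1) from the source group ($2).
# A FromPort of -1 is treated as "all ports" (all-traffic rules).
ingress_allows_port() {
  local port="$1" source_sg="$2"
  awk -v port="$port" -v sg="$source_sg" '
    $3 == sg && ($1 == -1 || ($1 <= port && port <= $2)) { found = 1 }
    END { exit !found }
  '
}

# Example wiring (security group IDs are placeholders):
#   aws ec2 describe-security-group-rules \
#     --filters "Name=group-id,Values=<rds-sg-id>" \
#     --query 'SecurityGroupRules[?IsEgress==`false`].[FromPort,ToPort,ReferencedGroupInfo.GroupId]' \
#     --output text | ingress_allows_port 3306 <catalog-sg-id> \
#     && echo "port 3306 is open to the catalog service" \
#     || echo "no matching ingress rule"
```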

Fix:

./labs/lab4-security-group-blocked/fix.sh

Lab 5: Task Resource Limits (OOM)

Scenario: The checkout service is crashing repeatedly. Tasks start but crash within seconds due to memory exhaustion.

Inject:

./labs/lab5-task-resource-limits/inject.sh

Symptoms:

Tasks crash shortly after starting
Container shows OutOfMemoryError: Container killed due to memory usage
Checkout unavailable - customers cannot complete purchases
Rapid task cycling as ECS keeps trying to start new tasks

Investigation Prompts:

Why is the checkout service crashing? The tasks keep restarting.

What is the exit code for the stopped checkout tasks? Is it an OOM kill?

Show me the memory configuration for the checkout service task definition

Root Cause: A memory-stress sidecar container is consuming more memory than the task limit allows, causing OOM kills.
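The exit-code prompt above relies on a Linux convention: a container killed by a signal reports 128 plus the signal number, so an OOM kill (SIGKILL, signal 9) shows up as exit code 137. A minimal sketch of that arithmetic follows; `describe_exit_code` is a hypothetical helper, and the task ARN in the usage comment is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: interprets a container exit code using the
# "128 + signal number" convention.
describe_exit_code() {
  local code="$1"
  if [ "$code" -gt 128 ]; then
    echo "killed by signal $((code - 128))"
  else
    echo "exited with status $code"
  fi
}

# Example wiring (the task ARN is a placeholder):
#   aws ecs list-tasks --cluster retail-store-ecs-cluster \
#     --service-name checkout --desired-status STOPPED
#   aws ecs describe-tasks --cluster retail-store-ecs-cluster --tasks <task-arn> \
#     --query 'tasks[].containers[].[name,exitCode,reason]' --output text
#
#   describe_exit_code 137   # prints "killed by signal 9"
```

Signal 9 alone does not prove an OOM kill, so it is worth reading the container `reason` field alongside the exit code.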

Fix:

./labs/lab5-task-resource-limits/fix.sh

Lab 6: Service Connect Broken

Scenario: The UI loads but the product catalog is empty. The catalog service appears healthy but the UI can't communicate with it.

Inject:

./labs/lab6-service-connect-broken/inject.sh

Symptoms:

UI loads but catalog is empty
Catalog service is healthy
UI logs show connection errors

Investigation Prompts:

The product catalog is empty but the catalog service looks healthy. What's wrong?

How does the UI service connect to the catalog service?

Root Cause: UI service environment variable points to wrong endpoint (http://catalog-broken instead of http://catalog).

Fix:

./labs/lab6-service-connect-broken/fix.sh

Lab 7: CPU Stress

Scenario: Users report the product catalog is loading slowly. Page load times increased from under 1 second to 5-10 seconds.

Inject:

./labs/lab7-cpu-stress/inject.sh

Symptoms:

Slow response times
High CPU in Container Insights
Service is running but slow

Investigation Prompts:

The catalog service is slow. Is there high CPU utilization?

Show me the CPU metrics for the catalog service from Container Insights

Root Cause: stress-ng process consuming CPU inside the container.

Rollback:

./labs/lab7-cpu-stress/rollback.sh
# Or wait 5 minutes for auto-rollback

Lab 8: DDoS Attack Simulation

Scenario: The retail application is under attack! Users are reporting extremely slow page loads and timeouts. ALB metrics show a massive spike in request count - far beyond normal traffic levels.

Inject:

./labs/lab8-ddos-simulation/inject.sh

Symptoms:

Slow page loads and timeouts
ALB RequestCount through the roof (~300 req/s attack traffic)
5XX errors increasing
Rogue ECS tasks running http-flood-attack

Investigation Prompts:

The retail app is extremely slow. Users are complaining about timeouts. What's happening?

We're seeing a massive traffic spike on the ALB. Is this a DDoS attack?

Root Cause: Rogue ECS tasks flooding the ALB with HTTP requests using curl and GNU parallel.

Rollback:

./labs/lab8-ddos-simulation/fix.sh

Lab 9: DynamoDB Attack

Scenario: The shopping cart service is completely broken. Users cannot add items to cart - all operations are failing with throttling errors. CloudWatch shows massive spikes in DynamoDB ThrottledRequests. This looks like a DDoS attack on the database!

Inject:

./labs/lab9-dynamodb-attack/inject.sh

Symptoms:

Cart operations failing with throttling errors
Massive ThrottledRequests spike in CloudWatch
Rogue ECS tasks running dynamodb-stress-attack
Service returning 500 errors

Investigation Prompts:

The carts service is completely broken. Users can't add items to cart. Check DynamoDB for issues.

DynamoDB is being throttled heavily. What's consuming all the read capacity?

Are there any suspicious ECS tasks running that might be attacking DynamoDB?

Root Cause: Rogue ECS tasks are flooding DynamoDB with scan requests. The table was also switched to a low provisioned capacity (5 RCU), which is easily overwhelmed.
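Some rough arithmetic shows why 5 RCU is so easy to overwhelm. A read consumes one read capacity unit per 4 KB read with strong consistency, or half that with eventual consistency, and a Scan is charged for the data it reads rather than the data it returns. The sketch below applies that rule; the function names are hypothetical, the 1 MB figure is only an illustrative scan size, and burst capacity is ignored.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read capacity units consumed by reading $1 bytes.
# $2 is "eventual" (default) or "strong". Results are rounded up.
scan_rcu() {
  local bytes="$1" consistency="${2:-eventual}"
  local units=$(( (bytes + 4095) / 4096 ))
  if [ "$consistency" = "eventual" ]; then
    units=$(( (units + 1) / 2 ))
  fi
  echo "$units"
}

# Hypothetical helper: seconds of sustained provisioned capacity ($2 RCU per
# second) needed to pay for $1 consumed units, rounded up.
seconds_to_absorb() {
  local units="$1" rcu="$2"
  echo $(( (units + rcu - 1) / rcu ))
}

# Example: an eventually consistent scan that reads 1 MB
#   scan_rcu 1048576          # prints 128
#   seconds_to_absorb 128 5   # prints 26
```

Under these assumptions, a single 1 MB scan uses about 26 seconds of the table's sustained read capacity, so a handful of concurrent scanning tasks leaves nothing for the carts service.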

Rollback:

./labs/lab9-dynamodb-attack/fix.sh

Lab 10: Auto-Scaling Not Working

Scenario: The catalog service is experiencing high CPU load during a traffic spike. Auto-scaling should kick in to add more tasks, but the service isn't scaling. Users are complaining about slow response times.

Inject:

./labs/lab10-autoscaling-broken/inject.sh

Symptoms:

High CPU utilization visible in CloudWatch metrics
CloudWatch alarm in ALARM state
Service does NOT scale out (stays at current task count)
Application becomes slow/unresponsive

Investigation Prompts:

Why isn't my ECS service scaling even though CPU is high?

Check the auto-scaling configuration for the catalog service

Show me the CloudWatch alarms for the catalog service. Are the alarm actions enabled?

Root Cause: CloudWatch alarm actions are disabled, so even though the alarm fires, it doesn't trigger the scaling policy.
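To spot this condition from the CLI, you can list alarms that are firing while their actions are disabled. The sketch below assumes the alarm list is fetched with `--output text` (tab-separated columns); `alarms_with_actions_disabled` is a hypothetical helper and the alarm name in the last comment is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads tab-separated rows of
# "AlarmName StateValue ActionsEnabled" on stdin and prints the names of
# alarms that are in ALARM state with actions disabled.
alarms_with_actions_disabled() {
  awk -F '\t' '$2 == "ALARM" && tolower($3) == "false" { print $1 }'
}

# Example wiring:
#   aws cloudwatch describe-alarms \
#     --query 'MetricAlarms[].[AlarmName,StateValue,ActionsEnabled]' \
#     --output text | alarms_with_actions_disabled
#
# Re-enabling actions by hand (the fix script may do this differently):
#   aws cloudwatch enable-alarm-actions --alarm-names "<alarm-name>"
```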

Fix:

./labs/lab10-autoscaling-broken/fix.sh

Fault Injection Scenarios

The labs/ directory contains all lab scripts, organized by lab number. The table below lists the scripts for the performance labs (Labs 7-10):

Lab	Inject Script	Fix Script	Target	Duration
Lab 7	`labs/lab7-cpu-stress/inject.sh`	`labs/lab7-cpu-stress/fix.sh`	catalog	Until fixed
Lab 8	`labs/lab8-ddos-simulation/inject.sh`	`labs/lab8-ddos-simulation/fix.sh`	ui/ALB	Until fixed
Lab 9	`labs/lab9-dynamodb-attack/inject.sh`	`labs/lab9-dynamodb-attack/fix.sh`	carts	Until fixed
Lab 10	`labs/lab10-autoscaling-broken/inject.sh`	`labs/lab10-autoscaling-broken/fix.sh`	catalog	Until fixed

Environment Variables

Variable	Default	Description
`CLUSTER_NAME`	`retail-store-ecs-cluster`	ECS cluster name
`SERVICE_NAME`	varies	Target ECS service
`AWS_REGION`	`us-east-1`	AWS region
`STRESS_DURATION`	`300`	Duration in seconds
`CPU_WORKERS`	`2`	Number of CPU stress workers
`MEMORY_PERCENT`	`80`	Target memory percentage
`LATENCY_MS`	`500`	Network latency in milliseconds
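These variables can be set in the environment before running a script. The sketch below shows one common way a script could resolve such settings with shell default expansion. It is an assumption about the mechanism rather than a copy of the lab scripts, and `resolve_settings` is a hypothetical function.

```shell
#!/usr/bin/env bash
# Hypothetical illustration: resolve each setting from the environment,
# falling back to the defaults listed in the table above.
resolve_settings() {
  echo "cluster=${CLUSTER_NAME:-retail-store-ecs-cluster}"
  echo "region=${AWS_REGION:-us-east-1}"
  echo "duration=${STRESS_DURATION:-300}"
  echo "cpu_workers=${CPU_WORKERS:-2}"
}

# One-off override for a single run, leaving your shell environment unchanged:
#   STRESS_DURATION=120 CPU_WORKERS=4 ./labs/lab7-cpu-stress/inject.sh
#
# Override for the rest of the session:
#   export AWS_REGION=us-west-2
```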

Observability

The deployment includes production-grade observability to enable effective troubleshooting with AWS DevOps Agent:

CloudWatch Container Insights (Enhanced)

Container Insights is enabled in enhanced mode, providing:

CPU and memory utilization per service and task
Network I/O metrics for traffic analysis
Running task counts for availability monitoring
Performance metrics at container level

CloudWatch Logs

All ECS tasks send application logs to CloudWatch Logs:

Each service has its own log stream for easy isolation
Configurable retention (default: 30 days)
ECS Exec session logging for audit trails
Optional KMS encryption

CloudWatch Alarms

When cloudwatch_alarms_enabled = true (default), pre-configured alarms monitor:

CPU utilization > 80% per service
Memory utilization > 80% per service
Running task count < 1 (service down)
ALB 5XX error spikes
ALB latency p95 > 2 seconds

CloudWatch Dashboard

A unified dashboard displays service health, resource utilization, ALB metrics, and error rates:

terraform output cloudwatch_dashboard_url

This observability stack provides AWS DevOps Agent with the data it needs to correlate symptoms, identify root causes, and recommend mitigations during incidents.

Cleanup

Important: Remember to destroy all resources to avoid ongoing charges!

Option 1: Use the Destroy Script (Recommended)

The destroy script handles all dependencies automatically, ensuring a clean one-shot destruction:

./scripts/destroy.sh

This script will:

Scale down and delete all ECS services
Delete Load Balancers
Delete VPC Endpoints (common blocker for subnet deletion)
Delete NAT Gateways
Clean up orphaned network interfaces
Remove any terraform state locks
Run terraform destroy

Option 2: Manual Destruction

If you prefer manual control:

Step 1: Restore Lab Configurations

If any lab faults are still active, revert them first:

# Run fix scripts for any active labs
./labs/lab1-logs-not-delivered/fix.sh
./labs/lab2-secrets-access-denied/fix.sh
# ... etc

Step 2: Destroy Infrastructure

cd terraform/ecs/default
terraform destroy
# Type 'yes' when prompted

Destruction takes ~10-15 minutes.

Troubleshooting Destroy Failures

If terraform destroy fails with DependencyViolation errors on subnets, there are likely resources still using them:

# Find what's blocking subnet deletion
aws ec2 describe-network-interfaces \
  --filters "Name=subnet-id,Values=<subnet-id>" \
  --query "NetworkInterfaces[*].{ID:NetworkInterfaceId,Type:InterfaceType,Description:Description}"

# Common blockers are VPC Endpoints - delete them first
aws ec2 describe-vpc-endpoints --filters "Name=vpc-id,Values=<vpc-id>" --query 'VpcEndpoints[*].VpcEndpointId'
aws ec2 delete-vpc-endpoints --vpc-endpoint-ids <endpoint-id>

# Then retry terraform destroy
terraform destroy

If you get a state lock error:

# Force unlock (use the lock ID from the error message)
terraform force-unlock <lock-id>

# Or remove the lock file for local state
rm -f .terraform.tfstate.lock.info

Step 3: Delete DevOps Agent Space (Optional)

Navigate to AWS DevOps Agent Console
Select your Agent Space
Click Delete and confirm

Contributing

See CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT-0 License. See LICENSE for details.
