Technology & Observability
34% of exam • Largest domain - Core AWS services
The Three Pillars of Observability
Metrics: CloudWatch
Logs: CloudWatch Logs
Traces: AWS X-Ray
Amazon CloudWatch
Complete monitoring and observability platform for AWS
Primary Use Case:
Performance monitoring and operational health - tracks metrics, collects logs, and raises alarms for AWS resources
📈 CloudWatch Metrics
- • Monitor AWS resource performance (CPU, network, disk I/O); memory and disk-space usage on EC2 are not reported by default and require the CloudWatch agent
- • Basic Monitoring: Free, every 5 minutes (default for EC2)
- • Detailed Monitoring: Paid, every 1 minute
- • Custom Metrics: Push your own application metrics (e.g., page load times, business KPIs)
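- • Metrics are retained for up to 15 months at progressively coarser granularity (1-minute data for 15 days, 5-minute data for 63 days, 1-hour data for 455 days)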
Example: Track EC2 CPU utilization over time to identify performance bottlenecks
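To make the custom-metrics bullet concrete, here is a minimal sketch of how an application might assemble a custom metric before publishing it. The namespace `MyApp/Frontend`, the metric name, and the dimension values are hypothetical. The dictionary follows the shape of a CloudWatch `PutMetricData` request, and the actual call is left as a comment so the example runs without AWS credentials.

```python
from datetime import datetime, timezone


def build_custom_metric(namespace, name, value, unit, dimensions):
    """Build keyword arguments shaped like a CloudWatch PutMetricData request."""
    return {
        "Namespace": namespace,
        "MetricData": [
            {
                "MetricName": name,
                # Dimensions are name/value pairs that identify the metric source.
                "Dimensions": [
                    {"Name": key, "Value": val}
                    for key, val in sorted(dimensions.items())
                ],
                "Timestamp": datetime.now(timezone.utc),
                "Value": float(value),
                "Unit": unit,
            }
        ],
    }


# Hypothetical application metric: page load time for a checkout page.
request = build_custom_metric(
    namespace="MyApp/Frontend",  # assumed custom namespace
    name="PageLoadTime",
    value=1.84,
    unit="Seconds",
    dimensions={"Page": "checkout", "Environment": "prod"},
)

# With boto3 installed and credentials configured, this could be sent as:
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(**request)
print(request["MetricData"][0]["MetricName"], request["MetricData"][0]["Value"])
```

Keeping the payload construction separate from the network call makes the metric shape easy to inspect and test locally.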
📝 CloudWatch Logs
- • Centralized log storage and analysis from multiple sources
- • Collect logs from EC2, Lambda, CloudTrail, Route 53, VPC Flow Logs
- • Log Groups: Organize logs by application/service
- • Log Streams: Sequences of log events from a single source (e.g., one instance or one application log file) within a group
- • Search and filter log data with CloudWatch Logs Insights (query language)
- • Set retention policies (1 day to 10 years, or indefinite)
Example: Centralize application error logs from 100 EC2 instances for troubleshooting
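The relationship between log groups, log streams, and searching can be sketched locally. The code below is only a simulation under stated assumptions: the stream names and log lines are invented, and `search_group` is a hypothetical helper, not an AWS API. The `INSIGHTS_QUERY` string shows roughly what a comparable CloudWatch Logs Insights query could look like.

```python
import re

# Hypothetical log group: one log stream per EC2 instance.
log_group = {
    "i-0aaa/app.log": [
        (1700000000, "INFO  request served in 120 ms"),
        (1700000060, "ERROR database connection timed out"),
    ],
    "i-0bbb/app.log": [
        (1700000030, "ERROR payment gateway returned 502"),
        (1700000090, "INFO  health check ok"),
    ],
}

# A comparable Logs Insights query (not executed here):
INSIGHTS_QUERY = (
    "fields @timestamp, @logStream, @message "
    "| filter @message like /ERROR/ "
    "| sort @timestamp desc "
    "| limit 20"
)


def search_group(group, pattern, limit=20):
    """Return (timestamp, stream, message) for matching events, newest first."""
    regex = re.compile(pattern)
    hits = [
        (ts, stream, msg)
        for stream, events in group.items()
        for ts, msg in events
        if regex.search(msg)
    ]
    return sorted(hits, reverse=True)[:limit]


errors = search_group(log_group, r"ERROR")
for ts, stream, msg in errors:
    print(ts, stream, msg)
```

The point of the sketch is that a single query spans every stream in the group, which is what makes centralized troubleshooting across many instances practical.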
🔔 CloudWatch Alarms
- • Trigger actions based on metric thresholds
- • Three states: OK, ALARM, INSUFFICIENT_DATA
- • Actions: Send SNS notifications, Auto Scaling, EC2 actions (stop, terminate, reboot, recover)
- • Set evaluation periods (e.g., "alert if CPU > 80% for 3 consecutive 5-minute periods")
Example: Alert DevOps team when application latency exceeds 2 seconds
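The evaluation-period rule ("CPU > 80% for 3 consecutive 5-minute periods") and the three alarm states can be illustrated with a small simulation. This is a simplified sketch, not CloudWatch's actual evaluation logic: it assumes every recent datapoint must breach the threshold and treats any missing datapoint as insufficient data, whereas the real service offers configurable missing-data handling. The function name `evaluate_alarm` is hypothetical.

```python
def evaluate_alarm(datapoints, threshold, evaluation_periods):
    """Return OK, ALARM, or INSUFFICIENT_DATA for the most recent periods.

    Simplified model: all of the last `evaluation_periods` datapoints must
    exceed the threshold; a missing datapoint (None) or too little history
    yields INSUFFICIENT_DATA.
    """
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods or any(p is None for p in recent):
        return "INSUFFICIENT_DATA"
    if all(p > threshold for p in recent):
        return "ALARM"
    return "OK"


# Each value is the average CPU percentage for one 5-minute period.
sustained_load = [55, 82, 85, 91]
brief_spike = [82, 85, 70]
new_instance = [85]

print(evaluate_alarm(sustained_load, threshold=80, evaluation_periods=3))
print(evaluate_alarm(brief_spike, threshold=80, evaluation_periods=3))
print(evaluate_alarm(new_instance, threshold=80, evaluation_periods=3))
```

Requiring several consecutive breaching periods is what keeps a short spike from paging the team.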
📊 CloudWatch Dashboards
- • Customizable home pages for monitoring
- • Visualize metrics across multiple services and accounts
- • Can be shared across accounts and regions
- • Real-time data visualization with graphs, numbers, and text widgets
- • Automatic refresh at a selectable interval (e.g., 1 min, 5 min, 15 min)
Example: Single pane of glass showing EC2, RDS, and Lambda metrics for production environment
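A dashboard is defined by a JSON body listing its widgets. As a rough sketch of the "single pane of glass" example, the code below assembles three metric widgets for EC2, RDS, and Lambda. The instance ID, database identifier, function name, and region are placeholders, and `metric_widget` is a hypothetical helper. The resulting JSON is the kind of document that could be passed as `DashboardBody` to CloudWatch's `PutDashboard` operation.

```python
import json


def metric_widget(title, namespace, metric, dim_name, dim_value, x, stat="Average"):
    """Build one metric widget; the dashboard grid is 24 units wide."""
    return {
        "type": "metric",
        "x": x,
        "y": 0,
        "width": 8,
        "height": 6,
        "properties": {
            "title": title,
            "metrics": [[namespace, metric, dim_name, dim_value]],
            "stat": stat,
            "period": 300,  # 5-minute datapoints
            "region": "us-east-1",  # assumed region
        },
    }


# Placeholder resource identifiers for a hypothetical production environment.
dashboard_body = {
    "widgets": [
        metric_widget("EC2 CPU", "AWS/EC2", "CPUUtilization",
                      "InstanceId", "i-0123456789abcdef0", x=0),
        metric_widget("RDS CPU", "AWS/RDS", "CPUUtilization",
                      "DBInstanceIdentifier", "prod-db", x=8),
        metric_widget("Lambda errors", "AWS/Lambda", "Errors",
                      "FunctionName", "checkout-handler", x=16, stat="Sum"),
    ]
}

body_json = json.dumps(dashboard_body, indent=2)
print(body_json)

# With boto3 and credentials, this could be published as:
#   boto3.client("cloudwatch").put_dashboard(
#       DashboardName="production-overview", DashboardBody=body_json)
```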
AWS CloudTrail
Auditing and compliance service - "Who did what, when, and from where?"
Primary Use Case:
Governance, compliance, and auditing - records API calls for security analysis and regulatory compliance
What CloudTrail Records (Event History)
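- • API calls made in your AWS account (Console, CLI, SDK, or AWS service); management events are recorded by default, while data events (e.g., S3 object-level operations) must be enabled on a trail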
- • Who: IAM user/role identity
- • When: Timestamp (UTC)
- • Where: Source IP address
- • What: Action performed (e.g., RunInstances, CreateBucket)
- • Result: Response elements and whether it succeeded or failed
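The who/when/where/what/result fields above map directly onto a CloudTrail event record, which is delivered as JSON. The sketch below parses one record and extracts those answers. The record is a trimmed, invented example (the account ID, user, and IP address are placeholders), and `summarize` is a hypothetical helper. The field names `userIdentity`, `eventTime`, `sourceIPAddress`, `eventSource`, `eventName`, and `errorCode` are standard CloudTrail record fields.

```python
import json

# Trimmed, invented CloudTrail record for illustration only.
sample_record = json.loads("""
{
  "eventTime": "2024-03-05T14:21:07Z",
  "eventSource": "s3.amazonaws.com",
  "eventName": "DeleteBucket",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.10",
  "userIdentity": {
    "type": "IAMUser",
    "arn": "arn:aws:iam::111122223333:user/alice",
    "userName": "alice"
  },
  "errorCode": "AccessDenied",
  "responseElements": null
}
""")


def summarize(record):
    """Answer who / when / where / what / result for one CloudTrail record."""
    identity = record.get("userIdentity", {})
    return {
        "who": identity.get("arn") or identity.get("type", "unknown"),
        "when": record["eventTime"],
        "where": record["sourceIPAddress"],
        "what": f'{record["eventSource"]}:{record["eventName"]}',
        # CloudTrail includes errorCode only when the call failed.
        "succeeded": "errorCode" not in record,
    }


summary = summarize(sample_record)
for question, answer in summary.items():
    print(f"{question}: {answer}")
```

In this invented record the delete attempt was denied, which is itself useful audit evidence: failed calls are recorded as well as successful ones.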
Key Features
- • Event History: Last 90 days viewable in console (free)
- • Trails: Continuous delivery of events to S3 for long-term storage (S3 storage is billed; additional trail copies and data events incur CloudTrail charges)
- • CloudTrail Insights: Detect unusual API activity (e.g., sudden spike in IAM actions)
- • Can log to CloudWatch Logs for real-time monitoring and alarms
Use Cases
- • Security Analysis: "Who deleted this S3 bucket?"
- • Compliance Auditing: Prove compliance with SOC 2, PCI-DSS, HIPAA
- • Troubleshooting: "Why did this EC2 instance stop?"
- • Change Tracking: Monitor resource configuration changes
- • Detect Unusual Activity: Identify unauthorized API calls
Exam Key Distinction:
CloudWatch = Performance monitoring (CPU, memory, latency)
CloudTrail = Auditing/logging API calls (who did what, when)
AWS Systems Manager
Unified interface for operational management of AWS and on-premises resources
Primary Use Case:
Manage EC2 and on-premises servers at scale - patching, configuration, remote access
🔐 Session Manager
Secure shell access to EC2 instances without SSH keys, bastion hosts, or opening port 22
- • Browser-based or CLI access
- • Session activity can be logged to S3 or CloudWatch Logs for auditing
- • No need to manage SSH keys
- • Works with IAM permissions
🔧 Patch Manager
Automate OS and software patching across EC2 and on-premises servers
- • Define maintenance windows for patching
- • Supports Windows, Linux, and macOS
- • Compliance reporting for patch status
- • Can auto-approve patches or require manual approval
▶️ Run Command
Execute commands or scripts on multiple instances remotely without SSH/RDP
- • Run pre-defined or custom commands
- • Target instances by tags or resource groups
- • View command execution status and output
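Targeting instances by tag is the part of Run Command that makes it scale, so a short sketch may help. The code below builds a request shaped like the Systems Manager `SendCommand` operation using the AWS-managed `AWS-RunShellScript` document. The tag key and value and the shell commands are hypothetical, and the actual call is shown only as a comment so the example runs offline.

```python
def build_run_command(tag_key, tag_value, commands):
    """Build keyword arguments shaped like an SSM SendCommand request."""
    return {
        "DocumentName": "AWS-RunShellScript",  # AWS-managed document for Linux shells
        # Target every managed instance carrying the given tag, no instance IDs needed.
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "Parameters": {"commands": list(commands)},
    }


# Hypothetical fleet: all instances tagged Environment=prod.
request = build_run_command(
    tag_key="Environment",
    tag_value="prod",
    commands=["uptime", "df -h /"],
)
print(request)

# With boto3 and credentials, this could be sent as:
#   response = boto3.client("ssm").send_command(**request)
#   command_id = response["Command"]["CommandId"]
```

Because the target is a tag rather than a list of hosts, instances launched later with the same tag are covered by the same request shape.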
🗄️ Parameter Store
Secure, hierarchical storage for configuration data, secrets, and license codes
- • Store plaintext or encrypted values (uses KMS)
- • Standard tier: Up to 10,000 parameters per account and Region at no additional charge
- • Version tracking and change notifications
- • Integrated with CloudFormation, EC2, Lambda
vs Secrets Manager: Parameter Store is free (basic), Secrets Manager has auto-rotation for DB credentials
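"Hierarchical storage" in Parameter Store means that parameter names look like paths and can be fetched by prefix. The following is a local simulation only: the parameter names and values are invented, and `get_by_path` is a hypothetical stand-in that mimics the idea behind the real `GetParametersByPath` operation rather than calling it.

```python
# Invented parameters, keyed by hierarchical name.
parameters = {
    "/myapp/prod/db/host": "prod-db.example.internal",
    "/myapp/prod/db/port": "5432",
    "/myapp/prod/feature/new-checkout": "true",
    "/myapp/dev/db/host": "localhost",
}


def get_by_path(store, path, recursive=True):
    """Return parameters under `path`; direct children only if not recursive."""
    prefix = path.rstrip("/") + "/"
    result = {}
    for name, value in store.items():
        if not name.startswith(prefix):
            continue
        remainder = name[len(prefix):]
        if recursive or "/" not in remainder:
            result[name] = value
    return result


prod_db = get_by_path(parameters, "/myapp/prod/db")
all_prod = get_by_path(parameters, "/myapp/prod")
direct_only = get_by_path(parameters, "/myapp/prod", recursive=False)

print(prod_db)
print(sorted(all_prod))
print(direct_only)
```

Organizing names as `/application/environment/component/key` lets one prefix query load a whole environment's configuration, and lets IAM policies grant access per path.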
Exam Tip:
"Patch EC2 instances," "manage fleet of servers," or "store configuration data" = Systems Manager
AWS Trusted Advisor
Real-time guidance and best practice recommendations to optimize your AWS environment
Primary Use Case:
Automated best practice checks across 5 categories to help reduce costs, improve performance, and enhance security
The 5 Categories of Trusted Advisor:
💰 Cost Optimization
Identify ways to reduce your AWS bill
- • Idle RDS DB instances
- • Unassociated Elastic IP addresses
- • Underutilized EC2 instances
- • Reserved Instance optimization
⚡ Performance
Improve speed and responsiveness
- • High utilization EC2 instances
- • CloudFront content delivery optimization
- • EBS provisioned IOPS optimization
- • Route 53 latency resource record sets
🔒 Security
Close security gaps and enable protective features
- • S3 buckets with open access permissions
- • Security groups with unrestricted access
- • IAM use (root account usage)
- • MFA on root account
🛡️ Fault Tolerance
Increase availability and redundancy
- • EBS snapshots
- • RDS Multi-AZ deployments
- • Auto Scaling group health checks
- • VPC availability zones
📊 Service Limits (Service Quotas)
Monitor usage and warn when approaching service limits
- • Check if you're close to limits for VPC, EC2, EBS, RDS, etc.
- • Proactively request limit increases before hitting caps
- • Prevent service disruptions from reaching quotas
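The idea behind a service-limit check is simple arithmetic: compare current usage with the quota and flag anything close to the cap. The sketch below assumes an 80% warning threshold and uses invented usage figures and illustrative quota values. `check_limits` is a hypothetical helper, not a Trusted Advisor API.

```python
def check_limits(usage, limits, warn_ratio=0.8):
    """Classify each quota as ok, approaching limit, or limit reached."""
    findings = {}
    for name, limit in limits.items():
        ratio = usage.get(name, 0) / limit
        if ratio >= 1.0:
            findings[name] = "limit reached"
        elif ratio >= warn_ratio:
            findings[name] = "approaching limit"
        else:
            findings[name] = "ok"
    return findings


# Illustrative quotas and invented usage for one Region.
limits = {"vpcs": 5, "elastic_ips": 5, "ebs_snapshots": 100000}
usage = {"vpcs": 4, "elastic_ips": 5, "ebs_snapshots": 1200}

findings = check_limits(usage, limits)
for quota, status in findings.items():
    print(f"{quota}: {status}")
```

A finding of "approaching limit" is the cue to request an increase before deployments start failing.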
Access Levels by Support Plan:
Basic (free) & Developer Support plans
Access to the 7 core checks only:
- • S3 bucket permissions
- • Security groups - unrestricted access
- • IAM use
- • MFA on root account
- • EBS public snapshots
- • RDS public snapshots
- • Service limits
Business & Enterprise Support (Paid)
Access to full set of checks (115+ checks)
- • All 7 core checks
- • Plus 100+ additional checks
- • Cost optimization recommendations
- • Performance improvements
- • CloudWatch alarms integration
Exam Tip:
"Best practice recommendations," "reduce cost," "improve security/performance," or "check service limits" = Trusted Advisor
Services for tracking resources, compliance, and operational health
🗂️ AWS Config
Track resource configuration history and compliance
- • Records configuration changes to AWS resources over time
- • Configuration History: "How was this resource configured 6 months ago?"
- • Compliance Auditing: Check if resources comply with your rules (e.g., "All S3 buckets must have encryption enabled")
- • Config Rules: Define desired configurations, get alerts on non-compliance
- • Resource Relationships: See dependencies (e.g., which security group is attached to which EC2 instance)
Exam Tip:
"Configuration history/audit" or "compliance with internal policies" = AWS Config
Example: "How did my S3 bucket configuration look 3 months ago before the data breach?"
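Both ideas in this section, configuration history and rule-based compliance, can be illustrated with a small local model. The history entries, dates, and setting names below are invented, and `config_as_of` and `encryption_rule` are hypothetical helpers. The sketch mirrors what AWS Config does conceptually and does not call the service.

```python
from datetime import date

# Invented configuration history for one S3 bucket, oldest first.
history = [
    (date(2024, 1, 10), {"encryption": "AES256", "public_access_blocked": True}),
    (date(2024, 4, 2), {"encryption": None, "public_access_blocked": True}),
    (date(2024, 6, 15), {"encryption": None, "public_access_blocked": False}),
]


def config_as_of(items, when):
    """Return the latest recorded configuration on or before `when`."""
    current = None
    for recorded_on, config in items:
        if recorded_on <= when:
            current = config
    return current


def encryption_rule(config):
    """Rule: the bucket must have encryption enabled."""
    return "COMPLIANT" if config.get("encryption") else "NON_COMPLIANT"


before = config_as_of(history, date(2024, 3, 1))
after = config_as_of(history, date(2024, 7, 1))

print("March:", before, encryption_rule(before))
print("July:", after, encryption_rule(after))
```

Comparing the two snapshots shows both what changed and when the resource fell out of compliance. Identifying who made the change is a separate question answered by CloudTrail.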
🏥 AWS Health Dashboard
View AWS service health and how it affects YOUR resources
Service health (formerly the Service Health Dashboard)
Public view of AWS service health across ALL regions
• General AWS status (status.aws.amazon.com)
Your account health (formerly the Personal Health Dashboard, PHD)
Shows issues affecting YOUR specific resources
• Personalized alerts and remediation guidance
- • Proactive notifications about AWS events impacting your account
- • Scheduled maintenance, service disruptions, security notifications
- • Integrates with EventBridge for automated responses
Exam Tip:
"AWS-side issues affecting my resources" = AWS Health Dashboard (Personal Health Dashboard)
Example: Get notified that AWS is performing maintenance on hardware hosting your EC2 instances
📡 Amazon EventBridge
Serverless event bus for building event-driven applications
- • React to AWS service events in real-time (e.g., EC2 state change, S3 object upload)
- • Route events to Lambda, SQS, SNS, Step Functions, etc.
- • Formerly called CloudWatch Events
Example: Automatically invoke Lambda when an EC2 instance is terminated
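Routing in EventBridge is driven by event patterns: a rule fires when an incoming event matches the pattern. The sketch below implements a deliberately simplified matcher that handles only exact-value lists and nested objects. Real EventBridge patterns support more operators (for example prefix and numeric matching), so `matches` is a hypothetical teaching aid. The pattern and event use the source and detail-type that EC2 state-change events carry, and the instance ID is a placeholder.

```python
def matches(pattern, event):
    """Simplified matcher: each pattern field lists the accepted values."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the corresponding event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Rule pattern: fire only when an EC2 instance reaches the terminated state.
pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["terminated"]},
}

terminated_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "terminated"},
}
stopped_event = {
    **terminated_event,
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}

print(matches(pattern, terminated_event))
print(matches(pattern, stopped_event))
```

Fields that the pattern does not mention, such as `instance-id` here, are ignored, which is why a single rule can cover every instance in the account.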
Know the differences - frequently tested on exam!
CloudWatch
What: Performance metrics & logs
Purpose: Monitor operational health
"Is my CPU high?"
CloudTrail
What: API call logs
Purpose: Auditing & compliance
"Who deleted this?"
AWS Config
What: Resource configuration history
Purpose: Compliance & change tracking
"How was this configured before?"
Health Dashboard
What: AWS service status
Purpose: AWS-side issues
"Is AWS having problems?"
Service Keyword Cheatsheet:
CloudWatch
Metrics, logs, alarms, dashboards
Keywords: "monitor performance," "CPU metrics," "application logs"
CloudTrail
API audit logs
Keywords: "who did what," "API calls," "compliance audit"
AWS Config
Configuration history & compliance
Keywords: "configuration changes," "compliance rules," "resource history"
Systems Manager
Patch management, operations
Keywords: "patch servers," "fleet management," "parameter store"
Trusted Advisor
Best practice recommendations (5 categories)
Keywords: "cost optimization," "security recommendations," "service limits"
Health Dashboard
AWS service health status
Keywords: "AWS outage," "service disruption," "maintenance notification"
Most Commonly Confused:
- • CloudWatch vs CloudTrail: Performance monitoring vs API auditing
- • Config vs CloudTrail: Resource configuration vs API calls
- • Parameter Store vs Secrets Manager: Free config storage vs paid secrets with auto-rotation
- • Trusted Advisor vs Health Dashboard: Best practice recommendations vs AWS service health