Technology & Observability

34% of exam • Largest domain - Core AWS services

👁️ Monitoring & Observability

The Three Pillars of Observability

Metrics

CloudWatch

Logs

CloudWatch Logs

Traces

AWS X-Ray

📊 Amazon CloudWatch

Complete monitoring and observability platform for AWS

Primary Use Case:

Performance monitoring and operational health - tracks metrics, logs, and creates alerts for AWS resources

📈 CloudWatch Metrics

  • Monitor AWS resource performance (CPU, disk, network; EC2 memory requires the CloudWatch agent)
  • Basic Monitoring: Free, every 5 minutes (default for EC2)
  • Detailed Monitoring: Paid, every 1 minute
  • Custom Metrics: Push your own application metrics (e.g., page load times, business KPIs)
  • Metrics are stored for up to 15 months (older data is aggregated to coarser resolution)

Example: Track EC2 CPU utilization over time to identify performance bottlenecks
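
A custom metric is a named datapoint that your application pushes to CloudWatch. The sketch below only builds the request payload. The function name `build_page_load_metric`, the `MyApp/Frontend` namespace, and the `Page` dimension are illustrative assumptions, not part of this guide. It assumes the payload would be sent with boto3's `put_metric_data`, as shown in the closing comment.

```python
from datetime import datetime, timezone


def build_page_load_metric(page: str, load_time_ms: float) -> dict:
    """Build keyword arguments for a CloudWatch PutMetricData request.

    The namespace, metric name, and dimension are hypothetical examples.
    """
    return {
        "Namespace": "MyApp/Frontend",  # custom namespaces must not start with "AWS/"
        "MetricData": [
            {
                "MetricName": "PageLoadTime",
                "Dimensions": [{"Name": "Page", "Value": page}],
                "Timestamp": datetime.now(timezone.utc),
                "Value": load_time_ms,
                "Unit": "Milliseconds",
            }
        ],
    }


request = build_page_load_metric("/checkout", 742.0)

# With AWS credentials configured, the request could be sent like this:
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(**request)
```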

📝 CloudWatch Logs

  • Centralized log storage and analysis from multiple sources
  • Collect logs from EC2 (via the CloudWatch agent), Lambda, CloudTrail, Route 53, VPC Flow Logs
  • Log Groups: Organize logs by application/service
  • Log Streams: Sequences of log events from a single source (e.g., one instance) within a group
  • Search and filter log data with CloudWatch Logs Insights (query language)
  • Set retention policies (1 day to 10 years, or never expire, which is the default)

Example: Centralize application error logs from 100 EC2 instances for troubleshooting
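
To give a feel for the Logs Insights query language, here is a minimal sketch that builds a "recent errors" query. The log group name and the helper `build_error_query` are hypothetical. The keys match the parameters of the CloudWatch Logs `StartQuery` API, but nothing is sent to AWS here.

```python
import time


def build_error_query(log_group: str, minutes: int = 60, limit: int = 20) -> dict:
    """Build keyword arguments for a CloudWatch Logs Insights StartQuery request."""
    now = int(time.time())
    return {
        "logGroupName": log_group,
        "startTime": now - minutes * 60,  # epoch seconds
        "endTime": now,
        "queryString": (
            "fields @timestamp, @logStream, @message "
            "| filter @message like /ERROR/ "
            "| sort @timestamp desc"
        ),
        "limit": limit,
    }


query = build_error_query("/myapp/production/app")  # hypothetical log group name
print(query["queryString"])

# With AWS credentials configured, the query could be started like this:
#   import boto3
#   boto3.client("logs").start_query(**query)
```

The query keeps only messages containing "ERROR" and sorts them newest first, which is the kind of search you would run across a fleet's centralized logs.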

🔔 CloudWatch Alarms

  • Trigger actions based on metric thresholds
  • Three states: OK, ALARM, INSUFFICIENT_DATA
  • Actions: Send SNS notifications, Auto Scaling, EC2 actions (stop, terminate, reboot, recover)
  • Set evaluation periods (e.g., "alert if CPU > 80% for 3 consecutive 5-minute periods")

Example: Alert DevOps team when application latency exceeds 2 seconds
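
The evaluation-period idea can be sketched in a few lines. This is a simplified model, not CloudWatch's actual algorithm: it assumes one averaged datapoint per period, treats any missing datapoint as INSUFFICIENT_DATA, and ignores the configurable missing-data handling and "M out of N" options that real alarms offer. The function name `alarm_state` is hypothetical.

```python
from typing import Optional, Sequence


def alarm_state(datapoints: Sequence[Optional[float]],
                threshold: float = 80.0,
                evaluation_periods: int = 3) -> str:
    """Return a simplified alarm state for the most recent periods.

    Each datapoint is one period's average (None means no data was reported).
    """
    recent = list(datapoints[-evaluation_periods:])
    if len(recent) < evaluation_periods or any(v is None for v in recent):
        return "INSUFFICIENT_DATA"
    if all(v > threshold for v in recent):
        return "ALARM"
    return "OK"


print(alarm_state([55.0, 91.0, 85.0, 88.0]))  # last three periods all above 80
print(alarm_state([91.0, 85.0, 60.0]))        # one period back under the threshold
print(alarm_state([91.0, None, 88.0]))        # a period with no data
```

The three calls print ALARM, OK, and INSUFFICIENT_DATA. A single spike does not trigger the alarm, because all three periods must breach the threshold.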

📊 CloudWatch Dashboards

  • Customizable home pages for monitoring
  • Visualize metrics across multiple services and accounts
  • Can be shared across accounts and regions
  • Real-time data visualization with graphs, numbers, and text widgets
  • Automatic refresh at a chosen interval (e.g., 1 min, 5 min, 15 min)

Example: Single pane of glass showing EC2, RDS, and Lambda metrics for production environment

🔍 AWS CloudTrail

Auditing and compliance service - "Who did what, when, and from where?"

Primary Use Case:

Governance, compliance, and auditing - records API calls for security analysis and regulatory compliance

What CloudTrail Records (Event History)

  • Management-event API calls made in your AWS account (Console, CLI, SDK, or AWS service); data events such as S3 object reads are recorded only if enabled on a trail
  • Who: IAM user/role identity
  • When: Timestamp (UTC)
  • Where: Source IP address
  • What: Action performed (e.g., RunInstances, CreateBucket)
  • Result: Response elements and whether it succeeded or failed
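
As a rough illustration, the who/when/where/what/result questions map onto fields of a CloudTrail event record. The record below is a trimmed, made-up example that uses CloudTrail's documented field names. The account ID, user, and IP address are placeholders, and `summarize` is a hypothetical helper.

```python
import json

# A trimmed, made-up record using CloudTrail's documented field names.
SAMPLE_RECORD = """
{
  "eventTime": "2024-01-15T09:30:00Z",
  "eventSource": "s3.amazonaws.com",
  "eventName": "DeleteBucket",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.10",
  "userIdentity": {"type": "IAMUser", "arn": "arn:aws:iam::111122223333:user/alice"},
  "errorCode": "AccessDenied"
}
"""


def summarize(record: dict) -> dict:
    """Map a CloudTrail record onto the who/when/where/what/result questions."""
    return {
        "who": record.get("userIdentity", {}).get("arn", "unknown"),
        "when": record.get("eventTime"),
        "where": record.get("sourceIPAddress"),
        "what": f'{record.get("eventSource")}:{record.get("eventName")}',
        # CloudTrail includes errorCode only when the call failed.
        "result": "failed" if "errorCode" in record else "succeeded",
    }


summary = summarize(json.loads(SAMPLE_RECORD))
for question, answer in summary.items():
    print(f"{question}: {answer}")
```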

Key Features

  • Event History: Last 90 days of management events viewable in console (free)
  • Trails: Continuous delivery of events to S3 for long-term storage (S3 storage charges apply; additional trails and data events are paid)
  • CloudTrail Insights: Detect unusual API activity (e.g., sudden spike in IAM actions)
  • Can log to CloudWatch Logs for real-time monitoring and alarms

Use Cases

  • Security Analysis: "Who deleted this S3 bucket?"
  • Compliance Auditing: Prove compliance with SOC 2, PCI-DSS, HIPAA
  • Troubleshooting: "Why did this EC2 instance stop?"
  • Change Tracking: Monitor resource configuration changes
  • Detect Unusual Activity: Identify unauthorized API calls

Exam Key Distinction:

CloudWatch = Performance monitoring (CPU, memory, latency)

CloudTrail = Auditing/logging API calls (who did what, when)

⚙️ AWS Systems Manager

Unified interface for operational management of AWS and on-premises resources

Primary Use Case:

Manage EC2 and on-premises servers at scale - patching, configuration, remote access

🔐 Session Manager

Secure shell access to EC2 instances without SSH keys, bastion hosts, or opening port 22

  • Browser-based or CLI access
  • All session activity logged to S3 or CloudWatch Logs
  • No need to manage SSH keys
  • Works with IAM permissions

🔧 Patch Manager

Automate OS and software patching across EC2 and on-premises servers

  • Define maintenance windows for patching
  • Supports Windows, Linux, and macOS
  • Compliance reporting for patch status
  • Can auto-approve patches or require manual approval

▶️ Run Command

Execute commands or scripts on multiple instances remotely without SSH/RDP

  • Run pre-defined or custom commands
  • Target instances by tags or resource groups
  • View command execution status and output
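
Targeting by tag might look like the sketch below. It only builds the arguments for the Systems Manager `SendCommand` API. The tag key/value, the `df -h` command, and the helper name `build_run_command` are illustrative assumptions. `AWS-RunShellScript` is the AWS-managed document for running shell commands on Linux instances.

```python
from typing import List


def build_run_command(tag_key: str, tag_value: str, commands: List[str]) -> dict:
    """Build keyword arguments for a Systems Manager SendCommand request."""
    return {
        "DocumentName": "AWS-RunShellScript",  # AWS-managed document for Linux shell commands
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "Parameters": {"commands": commands},
        "Comment": "Check disk usage on tagged instances",
    }


request = build_run_command("Environment", "production", ["df -h"])

# With AWS credentials configured, the command could be sent like this:
#   import boto3
#   boto3.client("ssm").send_command(**request)
```

Because the target is a tag rather than a list of instance IDs, the same request reaches every matching managed instance without any SSH or RDP connection.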

🗄️ Parameter Store

Secure, hierarchical storage for configuration data, secrets, and license codes

  • Store plaintext or encrypted values (uses KMS)
  • Free tier: Up to 10,000 standard parameters per account and region
  • Version tracking and change notifications
  • Integrated with CloudFormation, EC2, Lambda

vs Secrets Manager: Parameter Store's standard tier is free; Secrets Manager is paid and adds automatic rotation (e.g., for DB credentials)
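
"Hierarchical" means parameter names look like paths, so an application can load everything under one prefix. A minimal sketch, assuming a client that exposes boto3's `get_parameters_by_path` call: `FakeSSM` is a hypothetical stand-in so the example runs offline, and the parameter names and values are made up.

```python
def load_config(ssm_client, path: str) -> dict:
    """Fetch every parameter under a path and key it by the name's last segment.

    Pagination (NextToken) is left out to keep the sketch short.
    """
    response = ssm_client.get_parameters_by_path(
        Path=path, Recursive=True, WithDecryption=True
    )
    return {p["Name"].rsplit("/", 1)[-1]: p["Value"] for p in response["Parameters"]}


class FakeSSM:
    """Hypothetical stand-in for boto3.client("ssm") so the sketch runs offline."""

    def get_parameters_by_path(self, Path, Recursive, WithDecryption):
        stored = [
            {"Name": "/myapp/prod/db_host", "Type": "String", "Value": "db.internal"},
            {"Name": "/myapp/prod/db_password", "Type": "SecureString", "Value": "s3cret"},
            {"Name": "/myapp/dev/db_host", "Type": "String", "Value": "localhost"},
        ]
        return {"Parameters": [p for p in stored if p["Name"].startswith(Path + "/")]}


config = load_config(FakeSSM(), "/myapp/prod")
print(config)
```

Only the two `/myapp/prod` parameters come back. With a real client, `WithDecryption=True` is what asks Parameter Store to return SecureString values decrypted through KMS.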

Exam Tip:

"Patch EC2 instances," "manage fleet of servers," or "store configuration data" = Systems Manager

💼 AWS Trusted Advisor

Real-time guidance and best practice recommendations to optimize your AWS environment

Primary Use Case:

Automated best practice checks across 5 categories to help reduce costs, improve performance, and enhance security

The 5 Categories of Trusted Advisor:

💰 Cost Optimization

Identify ways to reduce your AWS bill

  • Idle RDS DB instances
  • Unassociated Elastic IP addresses
  • Underutilized EC2 instances
  • Reserved Instance optimization

⚡ Performance

Improve speed and responsiveness

  • High utilization EC2 instances
  • CloudFront content delivery optimization
  • EBS provisioned IOPS optimization
  • Route 53 latency resource record sets

🔒 Security

Close security gaps and enable protective features

  • S3 buckets with open access permissions
  • Security groups with unrestricted access
  • IAM use (root account usage)
  • MFA on root account

🛡️ Fault Tolerance

Increase availability and redundancy

  • EBS snapshots
  • RDS Multi-AZ deployments
  • Auto Scaling group health checks
  • VPC availability zones

📊 Service Limits (Service Quotas)

Monitor usage and warn when approaching service limits

  • Check if you're close to limits for VPC, EC2, EBS, RDS, etc.
  • Proactively request limit increases before hitting caps
  • Prevent service disruptions from reaching quotas
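
The "approaching a limit" idea can be sketched as a simple threshold check. The 80% warning and 100% critical thresholds below are an assumption based on how Trusted Advisor's service limit checks are commonly described, not something stated in this guide. The function name and the sample numbers are made up.

```python
def limit_status(usage: int, limit: int) -> str:
    """Classify usage against a quota: 'red' at or over the limit,
    'yellow' at 80% or more (assumed threshold), otherwise 'green'."""
    if usage >= limit:
        return "red"
    if usage * 100 >= limit * 80:  # integer math avoids float rounding at exactly 80%
        return "yellow"
    return "green"


# Hypothetical usage figures for illustration only.
print(limit_status(3, 5))  # 60% of the limit
print(limit_status(4, 5))  # 80% of the limit
print(limit_status(5, 5))  # at the limit
```

The three calls print green, yellow, and red. A yellow result is the signal to request a limit increase before the cap is reached.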

Access Levels by Support Plan:

Basic (Free) & Developer Support

Access to 7 core checks including:

  • S3 bucket permissions
  • Security groups - unrestricted access
  • IAM use
  • MFA on root account
  • EBS public snapshots
  • RDS public snapshots
  • Service limits

Business & Enterprise Support (Paid)

Access to full set of checks (115+ checks)

  • All 7 core checks
  • Plus 100+ additional checks
  • Cost optimization recommendations
  • Performance improvements
  • CloudWatch alarms integration

Exam Tip:

"Best practice recommendations," "reduce cost," "improve security/performance," or "check service limits" = Trusted Advisor

⚖️ Management & Governance Services

Services for tracking resources, compliance, and operational health

🗂️ AWS Config

Track resource configuration history and compliance

  • Records configuration changes to AWS resources over time
  • Configuration History: "How was this resource configured 6 months ago?"
  • Compliance Auditing: Check if resources comply with your rules (e.g., "All S3 buckets must have encryption enabled")
  • Config Rules: Define desired configurations, get alerts on non-compliance
  • Resource Relationships: See dependencies (e.g., which security group is attached to which EC2 instance)

Exam Tip:

"Configuration history/audit" or "compliance with internal policies" = AWS Config

Example: "How did my S3 bucket configuration look 3 months ago before the data breach?"
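
The compliance idea behind Config Rules can be sketched as a function that labels each resource. This is a simplified stand-in for the "All S3 buckets must have encryption enabled" rule mentioned above. The bucket records are hypothetical and far smaller than real AWS Config configuration items. `COMPLIANT` and `NON_COMPLIANT` are the verdict names AWS Config uses.

```python
from typing import Dict, List

# Hypothetical, simplified configuration items; real AWS Config items carry many more fields.
BUCKETS: List[dict] = [
    {"resourceId": "logs-bucket", "encryption": "aws:kms"},
    {"resourceId": "public-assets", "encryption": None},
]


def evaluate_encryption(items: List[dict]) -> Dict[str, str]:
    """Return a compliance verdict per bucket for an 'encryption required' rule."""
    return {
        item["resourceId"]: "COMPLIANT" if item.get("encryption") else "NON_COMPLIANT"
        for item in items
    }


print(evaluate_encryption(BUCKETS))
```

Here `logs-bucket` is reported as COMPLIANT and `public-assets` as NON_COMPLIANT, which is the kind of result a Config rule surfaces for follow-up.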

🏥 AWS Health Dashboard

View AWS service health and how it affects YOUR resources

Service Health Dashboard

Public view of AWS service health across ALL regions

• General AWS status (status.aws.amazon.com)

Personal Health Dashboard (PHD), now the "Your account health" view

Shows issues affecting YOUR specific resources

• Personalized alerts and remediation guidance

  • Proactive notifications about AWS events impacting your account
  • Scheduled maintenance, service disruptions, security notifications
  • Integrates with EventBridge for automated responses

Exam Tip:

"AWS-side issues affecting my resources" = AWS Health Dashboard (Personal Health Dashboard)

Example: Get notified that AWS is performing maintenance on hardware hosting your EC2 instances

📡 Amazon EventBridge

Serverless event bus for building event-driven applications

  • React to AWS service events in real-time (e.g., EC2 state change, S3 object upload)
  • Route events to Lambda, SQS, SNS, Step Functions, etc.
  • Formerly called CloudWatch Events

Example: Automatically invoke Lambda when an EC2 instance is terminated
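
An EventBridge rule selects events with an event pattern. The pattern below is the usual shape for "EC2 instance terminated". The `matches` function is a deliberately simplified, hypothetical matcher written for illustration. Real EventBridge patterns support prefix, numeric, and other operators that it ignores. The sample event and instance ID are made up.

```python
import json

# Event pattern for "an EC2 instance reached the terminated state".
PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["terminated"]},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: every pattern key must be present in the event, and the
    event's value must be one of the values listed in the pattern."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Made-up event for illustration; the instance ID is a placeholder.
event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "terminated"},
}

print(json.dumps(PATTERN))
print(matches(PATTERN, event))
```

A rule with this pattern and a Lambda function as its target would invoke the function only for termination events, not for every state change.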

🔄 CloudWatch vs CloudTrail vs Config vs Health Dashboard

Know the differences - frequently tested on exam!

CloudWatch

What: Performance metrics & logs

Purpose: Monitor operational health

"Is my CPU high?"

CloudTrail

What: API call logs

Purpose: Auditing & compliance

"Who deleted this?"

AWS Config

What: Resource configuration history

Purpose: Compliance & change tracking

"How was this configured before?"

Health Dashboard

What: AWS service status

Purpose: AWS-side issues

"Is AWS having problems?"

💡 Exam Tips - Quick Reference

Service Keyword Cheatsheet:

CloudWatch

Metrics, logs, alarms, dashboards

Keywords: "monitor performance," "CPU metrics," "application logs"

CloudTrail

API audit logs

Keywords: "who did what," "API calls," "compliance audit"

AWS Config

Configuration history & compliance

Keywords: "configuration changes," "compliance rules," "resource history"

Systems Manager

Patch management, operations

Keywords: "patch servers," "fleet management," "parameter store"

Trusted Advisor

Best practice recommendations (5 categories)

Keywords: "cost optimization," "security recommendations," "service limits"

Health Dashboard

AWS service health status

Keywords: "AWS outage," "service disruption," "maintenance notification"

Most Commonly Confused:

  • CloudWatch vs CloudTrail: Performance monitoring vs API auditing
  • Config vs CloudTrail: Resource configuration vs API calls
  • Parameter Store vs Secrets Manager: Free config storage vs paid secrets with auto-rotation
  • Trusted Advisor vs Health Dashboard: Best practice recommendations vs AWS service health