Operational Excellence

Ability to run and monitor systems
Continuously improve supporting processes and procedures
How code will be deployed, updated, operated
Logging, reduce defects, perform quick, safe fixes

manage and automate changes
respond to events
successfully manage daily operations

Best practice areas

Prepare -> Operate -> Evolve

Perform Operations as code

Define your entire workload - applications and infrastructure as Code
Update it with code
Implement operational procedures as code
Limit human error
Enable consistent responses to events
Annotated documentation can serve as input to your operations code
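
The points above can be illustrated with a minimal sketch of an operational procedure written as code. Everything here is hypothetical: the `Step` type, the `run_runbook` helper, and the disk-full runbook are invented for illustration, and the actions only return strings where real ones would call your own tooling.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One step of a hypothetical runbook: a name plus the action to run."""
    name: str
    action: Callable[[Dict[str, str]], str]


def run_runbook(steps: List[Step], event: Dict[str, str]) -> List[str]:
    """Run every step in order and return an audit log of what happened."""
    log = []
    for step in steps:
        result = step.action(event)
        log.append(f"{step.name}: {result}")
    return log


# Hypothetical response to a "disk almost full" event. The same steps run
# in the same order every time, which limits human error.
DISK_FULL_RUNBOOK = [
    Step("snapshot", lambda e: f"snapshot requested for {e['volume']}"),
    Step("expand", lambda e: f"resize requested for {e['volume']}"),
    Step("notify", lambda e: f"ticket opened for {e['owner']}"),
]
```

Because the procedure lives in version control, it can be reviewed and updated like any other code.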

Automate documentation

Create annotated documentation after every build

Make frequent small changes

Enable components to be updated regularly

Refine operation procedures frequently

Identify opportunities to improve

Anticipate failure

Identify potential sources of failure so they can be mitigated

Learn from all operational events and failures

Share across teams and the organization

Security

protect information, systems, assets, data, network and compute resources
identify who can do what
applies to both confidentiality and integrity
establish control to detect security events
protect data in transit and at rest
classify data into security levels
tools to identify and investigate security incidents
enable traceability

Best practice areas: Security foundations, Identity and access management (IAM), Detection, Infrastructure protection, Data protection, Incident response

Design principles

Implement a strong identity foundation

principle of least privilege
reduce reliance on long-term credentials
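
As a rough illustration of least privilege, the sketch below writes an IAM-style policy document as a Python dict and checks requests against it with a default-deny rule. The bucket name `example-reports` and the `is_allowed` helper are assumptions made for this example. The check is a toy and is not the real IAM evaluation logic, which also handles explicit Deny, conditions, and several policy types.

```python
from fnmatch import fnmatchcase

# An IAM-style policy document that grants a single read action on one
# (placeholder) bucket and nothing else.
READ_ONLY_REPORTS = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::example-reports/*"],
        }
    ],
}


def is_allowed(policy: dict, action: str, resource: str) -> bool:
    """Toy check: allow only if an Allow statement matches; otherwise deny."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        action_ok = any(fnmatchcase(action, a) for a in statement["Action"])
        resource_ok = any(fnmatchcase(resource, r) for r in statement["Resource"])
        if action_ok and resource_ok:
            return True
    return False  # anything not explicitly allowed is denied
```

Reading a report object is allowed, while deleting it or touching another bucket is denied because no statement grants it.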

Enable traceability

monitor, alert on, and audit actions and changes to your environment in real time
integrate logs and metrics with systems
automatically respond and take actions
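
One way to picture traceability with automatic response is a handler that records every event before deciding whether to act. The event names, the `RESPONSES` table, and `handle_event` are hypothetical; a real setup would forward events from your logging and monitoring services.

```python
import json
from typing import Callable, Dict, List, Optional

# Hypothetical mapping from an event type to an automatic response.
RESPONSES: Dict[str, Callable[[dict], str]] = {
    "security_group_opened": lambda e: f"revert rule on {e['resource']}",
    "root_login": lambda e: f"page on-call about {e['actor']}",
}


def handle_event(event: dict, audit_log: List[str]) -> Optional[str]:
    """Record every event, then return the automatic response, if any."""
    # Log first so the trail is complete even when no response is defined.
    audit_log.append(json.dumps(event, sort_keys=True))
    responder = RESPONSES.get(event["type"])
    return responder(event) if responder else None
```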

Apply security at all layers

edge network, VPC, subnet, load balancer

Automate security best practices

create secure architectures
version controlled templates

Protect data in transit and at rest

classify data into security levels
encryption, tokenization
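
Tokenization can be sketched as swapping a sensitive value for a random token that carries no information about the original. The `TokenVault` class below is a toy, in-memory illustration invented for these notes; a real vault would be a hardened, access-controlled service.

```python
import secrets
from typing import Dict


class TokenVault:
    """Toy in-memory tokenization vault (illustrative only)."""

    def __init__(self) -> None:
        self._by_token: Dict[str, str] = {}
        self._by_value: Dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        """Return a stable random token that stands in for the value."""
        if value not in self._by_value:
            token = "tok_" + secrets.token_hex(8)
            self._by_value[value] = token
            self._by_token[token] = value
        return self._by_value[value]

    def detokenize(self, token: str) -> str:
        """Look up the original value; only the vault can do this."""
        return self._by_token[token]
```

Downstream systems can store and join on the token, so fewer components ever see the sensitive value.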

Keep people away from data

reduce chances of human error
eliminate and reduce direct access or manual processing of data

Prepare for security events

have incident management process that aligns with organizational requirements
run incident response simulations
use tools with automation

Cost Optimization

Avoid unnecessary costs
Understand where money is spent
Select appropriate resources
Analyze spending trends over time
Scale if needed without overspending
iterative process, refined over the lifetime of the workload
consider using managed services where you can
decommissioning resources
using correct pricing models to reduce cost

Best Practice Areas

Cloud financial management

Expenditure and usage awareness

Cost-effective resources

Managing demand and supply

Optimize over time

Design Principles

Implement Cloud Financial Management

use knowledge building, tools, processes, programs, and resources to optimize cloud costs

Adopt a consumption model

Pay only for what you require
Increase or decrease usage based on business need
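
A quick worked example of the consumption model: a development environment that is only needed during a 40-hour work week does not have to run for all 168 hours. The sketch assumes a flat hourly rate, and the function names are made up for illustration.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def weekly_cost(hourly_rate: float, hours_running: float) -> float:
    """Cost of one resource billed at a flat hourly rate."""
    return hourly_rate * hours_running


def savings_fraction(hours_needed: float,
                     hours_always_on: float = HOURS_PER_WEEK) -> float:
    """Fraction saved by running only when needed instead of always on."""
    return 1 - hours_needed / hours_always_on
```

Under these assumptions, `savings_fraction(40)` is about 0.76, so stopping the environment outside working hours saves roughly three quarters of its cost.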

Measure overall efficiency

Measure business output vs associated costs of the workload
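
Measuring efficiency comes down to dividing the cost of the workload by the business output it delivers. The sketch below assumes output is counted in some unit you choose, such as orders processed; the function name and figures are illustrative.

```python
def cost_per_unit(workload_cost: float, units_delivered: int) -> float:
    """Cost of the workload per unit of business output."""
    if units_delivered <= 0:
        raise ValueError("units_delivered must be positive")
    return workload_cost / units_delivered


# Illustrative figures: total cost went up, but output grew faster,
# so the workload became more efficient.
last_month = cost_per_unit(1000.0, 40_000)   # 0.025 per unit
this_month = cost_per_unit(1200.0, 60_000)   # 0.02 per unit
```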

Stop spending money on undifferentiated heavy lifting

do not focus on building IT infrastructure, let AWS do that

Analyze and attribute expenditure

attribute cost and system usage to specific workload owners
measure ROI
workload owners can optimize and reduce cost
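
Attribution can be sketched as grouping billing line items by an owner tag. The record layout (`cost` and `tags` keys) and the `owner` tag name are assumptions for this example, not the format of any real billing export.

```python
from collections import defaultdict
from typing import Dict, Iterable

UNTAGGED = "untagged"


def cost_by_owner(line_items: Iterable[dict]) -> Dict[str, float]:
    """Sum cost per owner tag; items without an owner land in 'untagged'."""
    totals: Dict[str, float] = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get("owner", UNTAGGED)
        totals[owner] += item["cost"]
    return dict(totals)
```

A large `untagged` total is itself a useful signal, since that spend has no owner who can optimize it.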

Performance Efficiency

use resources sparingly
use IT and compute resources efficiently
select right resource types and sizes
monitor performance
maintain efficiency as demand changes and technologies evolve
gather data on all aspects and use it to select and configure resources
review your choices periodically
take advantage of new services
monitor for deviation from expected performance
use tradeoffs to improve performance: compression, relaxed consistency requirements, caching
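
The caching and relaxed-consistency tradeoff can be shown with a small time-to-live cache: reads get faster, at the price of returning data that may be up to `ttl_seconds` old. `TTLCache` is a sketch written for these notes, not a library class, and the injectable `clock` parameter exists only to make it easy to test.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Cache loader results for ttl_seconds; reads may be that stale."""

    def __init__(self, loader: Callable[[Hashable], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh enough: skip the slow call
        value = self._loader(key)  # missing or expired: reload and store
        self._entries[key] = (now, value)
        return value
```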

Best practice areas

Selection
Review
Monitoring
Tradeoffs

Design Principles

Democratize technologies

Expertise in technologies such as NoSQL databases and machine learning is not evenly available across the technical community, but these technologies are offered as services that your team can consume in the cloud

Go global in minutes

Deploy systems in multiple regions

Use serverless architectures

lower transactional costs
easier to operate
managed services at cloud scale

Experiment more often

comparative testing of different types of instances, storage, and configurations

Mechanical sympathy

use the technology approach that best aligns with your workload goals
example: consider data access patterns when you select a database technology

Sustainability

maximize efficiency
reduce waste
energy reduction
minimizing total resources required
selection of efficient programming language
adoption of modern algorithms
efficient data storage technique
minimize requirements for high-powered hardware
anticipate adoption of new, more efficient hardware and software
understanding your impact

Reliability

recover from failure
mitigate disruption
design distributed systems
recovery planning
recover quickly from disruptions
managing service quotas
data backups

Best practice areas

Foundations, Workload architecture (Well-planned)
Change management, Failure management (Monitoring)

Design Principles

Automatically recover from failure

monitor key performance indicators
configure systems to trigger automatic recovery when threshold is breached
automatic notification for failures
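
The three points above can be sketched as one check: when a monitored indicator stays above its threshold for several samples in a row, notify someone and trigger recovery. The function and its parameters are hypothetical; in practice the monitoring service would evaluate the threshold and invoke the recovery action.

```python
from typing import Callable, Sequence


def check_and_recover(samples: Sequence[float], threshold: float,
                      breaches_required: int,
                      recover: Callable[[], None],
                      notify: Callable[[str], None]) -> bool:
    """Recover when the last N samples all exceed the threshold."""
    if breaches_required < 1:
        raise ValueError("breaches_required must be at least 1")
    recent = list(samples)[-breaches_required:]
    breached = (len(recent) == breaches_required
                and all(value > threshold for value in recent))
    if breached:
        notify(f"threshold {threshold} breached {breaches_required} times in a row")
        recover()
    return breached
```

Requiring several consecutive breaches is a design choice that avoids reacting to a single noisy sample.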

Test recovery procedures

test how the system fails
validate recovery procedures

Scale horizontally to increase aggregate workload availability

replace one large resource with multiple smaller resources
distribute requests
minimize impact of single point of failure
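
A rough calculation shows why several smaller resources can be more available than one large one. The sketch assumes that replicas fail independently and that any single healthy replica can serve requests. Both are simplifications, so treat the result as an upper bound rather than a guarantee.

```python
def aggregate_availability(single: float, replicas: int) -> float:
    """Availability of N independent replicas when any one is enough.

    The workload is down only if every replica is down at the same time,
    which under independence has probability (1 - single) ** replicas.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1 - (1 - single) ** replicas
```

Under these assumptions, two replicas that are each 99% available give about 99.99% combined, and three give about 99.9999%.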

Stop guessing capacity

monitor demand and system usage
automate addition and removal of resources
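
Automated capacity adjustment can be sketched with a simple proportional rule: scale the instance count by the ratio of the observed metric to its target, then clamp to configured limits. This is a simplified illustration with made-up parameter names, not the exact behavior of any AWS scaling policy.

```python
import math


def desired_capacity(current: int, metric_value: float, target_value: float,
                     minimum: int, maximum: int) -> int:
    """Instance count that would bring the metric back toward its target."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    # Round up so the group is never left short of capacity.
    raw = math.ceil(current * metric_value / target_value)
    return max(minimum, min(maximum, raw))
```

For example, 4 instances at 90% average CPU with a 60% target gives 6 instances, while very low load falls back to the configured minimum.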

Manage change in automation

use automation to make changes to infrastructure

AWS Well-Architected Tool

compares your workloads against the latest AWS architectural best practices
delivers step by step action plan for improvement

How it works

You define your workload
You answer a series of questions across the six pillars
