AWS Topics for the Certified Solutions Architect
AWS Config conformance packs
AWS Config is a service that provides details on how your AWS resources are configured, how they relate to each other, and how they were configured in the past.
Features
- Specify the resource types you want AWS Config to record.
- S3 bucket to receive a configuration snapshot
- SNS to send configuration stream notifications
- Rules that you want AWS Config to use to evaluate compliance information
- Conformance packs, or a collection of AWS Config rules and remediation actions
- Aggregator to get a centralized view of your resource inventory and compliance - collects AWS Config configuration and compliance data from multiple AWS accounts and AWS Regions into a single account and Region.
- Write advanced queries by referring to the configuration schema of the AWS resource.
https://docs.aws.amazon.com/config/latest/developerguide/WhatIsConfig.html
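The advanced-query feature mentioned above takes a SQL-like expression written against the configuration schema of a resource type. A minimal sketch in Python (the field names follow the `AWS::EC2::Instance` schema; the instance type is illustrative, and the boto3 call is shown only in a comment):

```python
# Build an AWS Config advanced query (SQL-like syntax over the
# configuration schema of a resource type).
def ec2_instance_query(instance_type: str) -> str:
    """Return an advanced query selecting ID, AZ, and state of EC2
    instances of the given type."""
    return (
        "SELECT resourceId, availabilityZone, configuration.state.name "
        "WHERE resourceType = 'AWS::EC2::Instance' "
        f"AND configuration.instanceType = '{instance_type}'"
    )

# With boto3 (not imported here), the query would run as:
#   boto3.client("config").select_resource_config(
#       Expression=ec2_instance_query("t3.micro"))
```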
Ways to Use AWS Config
- notify you whenever resources are created, modified, or deleted
- evaluate the configuration settings of your AWS resources for compliance to rules and get notified of violations
- Auditing and Compliance with internal policies and best practices
- View how the resource you intend to modify is related to other resources and assess the impact of your change.
- Use the historical configurations of your resources provided by AWS Config to troubleshoot issues and to access the last known good configuration of a problem resource.
- Security Analysis - IAM policy at specific point in time, EC2 security groups configurations, etc
Conformance pack
A collection of AWS Config rules and remediation actions that can be deployed as a single entity in an account and a Region or across an organization in AWS Organizations.
Formats:
- A YAML template that contains the list of AWS Config managed or custom rules and remediation actions.
- Remediation actions are defined as AWS Systems Manager documents (SSM documents).
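As a sketch, a minimal conformance pack template containing a single AWS managed rule might look like the following (the logical resource name is illustrative; remediation actions would be added as `AWS::Config::RemediationConfiguration` resources):

```yaml
# Minimal conformance pack template (sketch): one AWS managed rule.
Resources:
  S3BucketVersioningEnabled:
    Type: AWS::Config::ConfigRule
    Properties:
      ConfigRuleName: s3-bucket-versioning-enabled
      Source:
        Owner: AWS
        SourceIdentifier: S3_BUCKET_VERSIONING_ENABLED
```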
AWS Security Hub
- Automates security best practice checks, aggregates security alerts into a single place and format, enables automated remediation, and helps you understand your overall security posture across all of your AWS accounts.
- Automated checks based on a collection of security controls curated by experts
- support for common frameworks like CIS, PCI DSS, and more.
- Integrates findings from other services like Config, Firewall Manager, etc
- Conduct Cloud Security Posture Management (CSPM)
- Security Orchestration, Automation, and Response (SOAR) workflows
- Integration with EventBridge.
- data ingestion into your Security Information and Event Management (SIEM), ticketing, and other tools by consolidating the integrations between AWS services and your downstream tooling and by normalizing your findings.
- Visualize your security findings to discover new insights
- Search, correlate, aggregate, and fine-tune diverse security findings by account and resource, and visualize findings in the Security Hub dashboard.
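The searching and fine-tuning above is driven by filters over the AWS Security Finding Format. A sketch of a filter for active, high-severity findings; the helper only builds the `Filters` payload, and the boto3 call is shown in a comment:

```python
# Build a Security Hub GetFindings filter for active, high-severity
# findings. Field names come from the AWS Security Finding Format.
def high_severity_filter() -> dict:
    return {
        "SeverityLabel": [{"Value": "HIGH", "Comparison": "EQUALS"}],
        "RecordState": [{"Value": "ACTIVE", "Comparison": "EQUALS"}],
    }

# With boto3:
#   boto3.client("securityhub").get_findings(Filters=high_severity_filter())
```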
AWS Managed Microsoft AD
- Run Microsoft Active Directory (AD) as a managed service.
- Highly available pair of domain controllers connected to your virtual private cloud (VPC), run in different Availability Zones in a Region of your choice.
- Run directory-aware workloads in the AWS Cloud, including Microsoft SharePoint and custom .NET and SQL Server-based applications.
- Configure a trust relationship between AWS Managed Microsoft AD in the AWS Cloud and your existing on-premises Microsoft Active Directory, providing users and groups with access to resources in either domain, using AWS IAM Identity Center.
- Connect your AWS resources with an existing on-premises Microsoft Active Directory.
- Manage users and groups
- Provide single sign-on to applications and services
- Create and apply group policy
- Enable multi-factor authentication by integrating with your existing RADIUS-based MFA infrastructure to provide an additional layer of security when users access AWS applications
- Securely connect to Amazon EC2 Linux and Windows instances
Amazon S3 Event Notifications
Receive notifications when certain events happen in your S3 bucket.
The configuration is stored in the notification subresource that's associated with a bucket.
Publish notifications for the following events:
- New object created events
- Object removal events
- Restore object events
- Reduced Redundancy Storage (RRS) object lost events
- Replication events
- S3 Lifecycle expiration events
- S3 Lifecycle transition events
- S3 Intelligent-Tiering automatic archival events
- Object tagging events
- Object ACL PUT events
Supported destinations:
- Amazon Simple Notification Service (Amazon SNS) topics
- Amazon Simple Queue Service (Amazon SQS) queues
- AWS Lambda function
- Amazon EventBridge
For more information, see Supported event destinations.
Amazon SQS FIFO (First-In-First-Out) queues aren't supported as an event destination.
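A sketch of routing `ObjectCreated` events under a key prefix to a standard SQS queue; the helper only builds the `NotificationConfiguration` payload (the queue ARN and bucket name are placeholders), and the boto3 call is shown in a comment:

```python
# Build an S3 notification configuration that routes object-created
# events under `prefix` to a standard SQS queue (FIFO queues are not
# supported as S3 event destinations).
def sqs_notification_config(queue_arn: str, prefix: str = "logs/") -> dict:
    return {
        "QueueConfigurations": [
            {
                "QueueArn": queue_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }

# With boto3:
#   boto3.client("s3").put_bucket_notification_configuration(
#       Bucket="my-bucket",
#       NotificationConfiguration=sqs_notification_config(queue_arn))
```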
AWS Glue
- Data integration service
- Discover, prepare, move, and integrate data from multiple sources.
- Tooling for authoring, running jobs, and implementing business workflows.
- Discover and connect to more than 70 diverse data sources and manage your data in a centralized data catalog.
- Visually create, run, and monitor extract, transform, and load (ETL) pipelines to load data into your data lakes.
- Search and query cataloged data using Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.
- Run ETL, ELT, and streaming workloads in one service.
- Integrates with AWS analytics services and Amazon S3 data lakes.
- Discover and organize data
- Automatically discover data – Use AWS Glue crawlers to automatically infer schema information and integrate it into your AWS Glue Data Catalog.
- Manage schemas and permissions – Validate and control access to your databases and tables.
- Connect to a wide variety of data sources – Tap into multiple data sources, both on premises and on AWS, using AWS Glue connections to build your data lake.
- Transform, prepare, and clean data for analysis
- Build complex ETL pipelines with simple job scheduling – Invoke AWS Glue jobs on a schedule, on demand, or based on an event.
- Clean and transform streaming data in transit – Enable continuous data consumption, and clean and transform it in transit. This makes it available for analysis in seconds in your target data store.
- Deduplicate and cleanse data with built-in machine learning – Clean and prepare your data for analysis without becoming a machine learning expert by using the FindMatches feature. This feature deduplicates and finds records that are imperfect matches for each other.
- Built-in job notebooks – AWS Glue job notebooks provide serverless notebooks with minimal setup in AWS Glue so you can get started quickly.
- Edit, debug, and test ETL code – With AWS Glue interactive sessions, you can interactively explore and prepare data. You can explore, experiment on, and process data interactively using the IDE or notebook of your choice.
- Define, detect, and remediate sensitive data – AWS Glue sensitive data detection lets you define, identify, and process sensitive data in your data pipeline and in your data lake.
- Automatically scale based on workload – Dynamically scale resources up and down based on workload. This assigns workers to jobs only when needed.
- Automate jobs with event-based triggers – Start crawlers or AWS Glue jobs with event-based triggers, and design a chain of dependent jobs and crawlers.
- Run and monitor jobs – Run AWS Glue jobs with your choice of engine, Spark or Ray. Monitor them with automated monitoring tools, AWS Glue job run insights, and AWS CloudTrail. Improve your monitoring of Spark-backed jobs with the Apache Spark UI.
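The scheduled-trigger mechanism above can be sketched as the arguments to `glue.create_trigger` (the job name is a placeholder; the cron syntax is the same as in EventBridge/CloudWatch Events):

```python
# Build arguments for glue.create_trigger: run a Glue job nightly at
# 02:00 UTC. Job and trigger names here are illustrative.
def nightly_trigger(job_name: str) -> dict:
    return {
        "Name": f"{job_name}-nightly",
        "Type": "SCHEDULED",
        "Schedule": "cron(0 2 * * ? *)",
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }

# With boto3:
#   boto3.client("glue").create_trigger(**nightly_trigger("sales-etl"))
```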
Amazon Athena
- interactive query service
- analyze data directly in Amazon Simple Storage Service (Amazon S3) using standard SQL.
- Point Athena at your data stored in Amazon S3 and begin using standard SQL to run ad-hoc queries and get results in seconds.
- Run data analytics using Apache Spark
- Submit Spark code for processing and receive the results directly.
- Simplified notebook experience in Amazon Athena console to develop Apache Spark applications using Python or Athena notebook APIs.
- Athena SQL and Apache Spark on Amazon Athena are serverless.
- Athena scales automatically, running queries in parallel, so results are fast even with large datasets and complex queries.
- Athena helps you analyze unstructured, semi-structured, and structured data stored in Amazon S3. Examples include CSV, JSON, or columnar data formats such as Apache Parquet and Apache ORC. You can use Athena to run ad-hoc queries using ANSI SQL, without the need to aggregate or load the data into Athena.
- Athena integrates with Amazon QuickSight for easy data visualization. You can use Athena to generate reports or to explore data with business intelligence tools or SQL clients connected with a JDBC or an ODBC driver. For more information, see What is Amazon QuickSight in the Amazon QuickSight User Guide and Connecting to Amazon Athena with ODBC and JDBC drivers.
- Athena integrates with the AWS Glue Data Catalog, which offers a persistent metadata store for your data in Amazon S3. This allows you to create tables and query data in Athena based on a central metadata store available throughout your Amazon Web Services account and integrated with the ETL and data discovery features of AWS Glue. For more information, see Integration with AWS Glue and What is AWS Glue in the AWS Glue Developer Guide.
- Amazon Athena makes it easy to run interactive queries against data directly in Amazon S3 without having to format data or manage infrastructure. For example, Athena is useful if you want to run a quick query on web logs to troubleshoot a performance issue on your site. With Athena, you can get started fast: you just define a table for your data and start querying using standard SQL.
- You should use Amazon Athena if you want to run interactive ad hoc SQL queries against data on Amazon S3, without having to manage any infrastructure or clusters. Amazon Athena provides the easiest way to run ad hoc queries for data in Amazon S3 without the need to set up or manage any servers.
- For a list of AWS services that Athena leverages or integrates with, see AWS service integrations with Athena.
- Query services like Amazon Athena, data warehouses like Amazon Redshift, and sophisticated data processing frameworks like Amazon EMR all address different needs and use cases.
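An ad hoc Athena query boils down to three things: the SQL, the database (typically from the Glue Data Catalog), and an S3 location for results. A sketch of the arguments to `athena.start_query_execution` (the output location is a placeholder URI you would own):

```python
# Build arguments for athena.start_query_execution.
def athena_query_params(sql: str, database: str, output_s3: str) -> dict:
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_s3},
    }

# With boto3:
#   boto3.client("athena").start_query_execution(
#       **athena_query_params("SELECT * FROM logs LIMIT 10",
#                             "default", "s3://my-results-bucket/athena/"))
```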
Amazon EMR
- Run Hadoop, Spark, and Presto
- Running SQL queries,
- Run a wide variety of scale-out data processing tasks for applications such as machine learning, graph analytics, data transformation, streaming data, and virtually anything you can code.
- Use custom code to process and analyze extremely large datasets with the latest big data processing frameworks such as Spark, Hadoop, Presto, or HBase.
- Full control over the configuration of your clusters and the software installed on them.
- Use Amazon Athena to query data that you process using Amazon EMR.
- If you use EMR and already have a Hive metastore, you can run your DDL statements on Amazon Athena and query your data immediately without affecting your Amazon EMR jobs.
Amazon Redshift
- A data warehouse
- Pull together data from many different sources – like inventory systems, financial systems, and retail sales systems – into a common format, and store it for long periods of time.
- If you need to build sophisticated business reports from historical data, then a data warehouse like Amazon Redshift is the best choice.
- The query engine in Amazon Redshift has been optimized to perform especially well on running complex queries that join large numbers of very large database tables. When you need to run queries against highly structured data with lots of joins across lots of very large tables, choose Amazon Redshift.
Apache Parquet and AWS Glue
- AWS Glue supports using the Parquet format.
- A performance-oriented, column-based data format.
- Use AWS Glue to read Parquet files from Amazon S3 and from streaming sources as well as write Parquet files to Amazon S3.
- read and write bzip and gzip archives containing Parquet files from S3.
- Configure compression behavior on the S3 connection parameters instead of in the configuration discussed on this page.
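A sketch of the keyword arguments a Glue (PySpark) job would pass to `glueContext.write_dynamic_frame.from_options` to emit Parquet to S3; as the note above says, compression is set on the S3 connection parameters rather than in a format option (the path is a placeholder):

```python
# Build the sink options for writing a Glue DynamicFrame to S3 as
# Parquet; compression goes on the S3 connection parameters.
def parquet_sink_options(path: str) -> dict:
    return {
        "connection_type": "s3",
        "connection_options": {"path": path, "compression": "snappy"},
        "format": "parquet",
    }

# Inside a Glue job (glueContext and frame exist in that runtime):
#   glueContext.write_dynamic_frame.from_options(
#       frame=frame, **parquet_sink_options("s3://my-lake/curated/"))
```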
AWS Global Accelerator
- improve the availability, performance, and security of your public applications.
- provides two global static public IPs that act as a fixed entry point to your application endpoints, such as Application Load Balancers, Network Load Balancers, Amazon Elastic Compute Cloud (EC2) instances, and elastic IPs.
- Use cases
- Use traffic dials to route traffic to the nearest Region or achieve fast failover across Regions.
- Accelerate API workloads by up to 60%, leveraging TCP termination at the edge.
- Global static IP
- Simplify allowlisting in enterprise firewalling and IoT use cases.
- Low-latency gaming and media workloads
- Use custom routing to deterministically route traffic to a fleet of EC2 instances.
Amazon Kinesis Data Streams
- serverless streaming data service that makes it easy to capture, process, and store data streams at any scale.
- Stream data from sources such as mobile and IoT devices into Kinesis Data Streams
- KDS then ingests and stores data streams for processing - clickstream, service logs, sensor data, in-app user events
- Use KDS with Lambda, Amazon Managed Service for Apache Flink, Spark on Amazon EMR, or EC2 to output into dashboards or real-time applications
- Great for real time analytics
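Producers write to a stream with `PutRecords`, choosing a partition key that controls shard placement and per-key ordering. A sketch of building that batch (the stream name and the `user_id` field are illustrative):

```python
import json

# Build arguments for kinesis.put_records: one record per event,
# using a per-event user ID (illustrative field) as the partition key
# so each user's events stay ordered within a shard.
def put_records_batch(stream: str, events: list[dict]) -> dict:
    return {
        "StreamName": stream,
        "Records": [
            {"Data": json.dumps(e).encode(), "PartitionKey": str(e["user_id"])}
            for e in events
        ],
    }

# With boto3:
#   boto3.client("kinesis").put_records(
#       **put_records_batch("clicks", [{"user_id": 1, "page": "/"}]))
```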
Amazon Kinesis Data Firehose
- Extract, transform, and load (ETL) service that reliably captures, transforms, and delivers streaming data to data lakes, data stores, and analytics services
- Lambda functions to transform the data
- Ingest from the AWS SDK, AWS services, and Kinesis Data Streams
- Built-in transformations are supported
- Write to destinations such as Amazon S3, Amazon Redshift, Amazon OpenSearch Service, and custom HTTP endpoints
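The Lambda-transform path can be sketched as the `ExtendedS3DestinationConfiguration` passed to `firehose.create_delivery_stream`: records flow through the Lambda processor before landing in S3 (all ARNs below are placeholders):

```python
# Build an ExtendedS3DestinationConfiguration for
# firehose.create_delivery_stream: deliver to S3 with a Lambda
# data-transformation processor.
def firehose_s3_with_lambda(bucket_arn: str, role_arn: str,
                            lambda_arn: str) -> dict:
    return {
        "RoleARN": role_arn,
        "BucketARN": bucket_arn,
        "ProcessingConfiguration": {
            "Enabled": True,
            "Processors": [
                {
                    "Type": "Lambda",
                    "Parameters": [
                        {"ParameterName": "LambdaArn",
                         "ParameterValue": lambda_arn}
                    ],
                }
            ],
        },
    }

# With boto3:
#   boto3.client("firehose").create_delivery_stream(
#       DeliveryStreamName="events-to-lake",
#       ExtendedS3DestinationConfiguration=firehose_s3_with_lambda(
#           bucket_arn, role_arn, lambda_arn))
```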
Amazon Redshift Spectrum
- query data directly from files on Amazon S3.
- you need an Amazon Redshift cluster and a SQL client that's connected to your cluster so that you can run SQL commands. The cluster and the data files in Amazon S3 must be in the same AWS Region.
- efficiently query and retrieve structured and semistructured data from files in Amazon S3 without having to load the data into Amazon Redshift tables.
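Spectrum is set up by mapping an external schema to a Glue Data Catalog database; after that, S3-backed tables are queried like local ones from a SQL client connected to the cluster. A sketch of building that DDL (the schema, database, and IAM role ARN are placeholders):

```python
# Build the CREATE EXTERNAL SCHEMA statement that maps a Glue Data
# Catalog database into Redshift so Spectrum can query S3 files in
# place. Run the returned SQL from a client connected to the cluster.
def spectrum_ddl(schema: str, glue_db: str, role_arn: str) -> str:
    return (
        f"CREATE EXTERNAL SCHEMA {schema} "
        f"FROM DATA CATALOG DATABASE '{glue_db}' "
        f"IAM_ROLE '{role_arn}'"
    )
```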