Design Resilient Architectures

Quiz & Concept reviews

EC2 Instances Placement Strategies

When you launch a new Amazon EC2 instance, the EC2 service attempts to place the instance in such a way that all of your instances are spread out across underlying hardware to minimize correlated failures. You can use placement groups to influence the placement of a group of interdependent instances to meet the needs of your workload. Depending on the type of workload, you can create a placement group using one of the following placement strategies:

There is no additional charge for using any of these placement strategies.

Cluster placement group

  • Instances run within a single Availability Zone (AZ).

  • A cluster placement group can span peered virtual private clouds (VPCs) in the same Region.

  • Best for applications that benefit from low network latency, high network throughput, or both.

Partition placement group

  • Uses logical segments called partitions; each partition has its own set of racks, and each rack has its own network and power source.

  • Each partition comprises multiple instances. The instances in a partition do not share racks with the instances in the other partitions, allowing you to contain the impact of a single hardware failure to only the associated partition.

  • Partitions can span multiple Availability Zones in the same Region.

  • A partition placement group can have a maximum of seven partitions per Availability Zone. The number of instances that can be launched into a partition is limited only by the limits of your account.

  • Partition placement groups are used to deploy large distributed and replicated workloads, such as HDFS, HBase, and Cassandra, across distinct racks.

  • When you launch instances into a partition placement group, Amazon EC2 tries to distribute the instances evenly across the number of partitions that you specify.

Spread placement group

  • A group of instances are each placed on distinct racks, with each rack having its own network and power source.

  • It can span multiple Availability Zones in the same Region.

  • A maximum of seven running instances per Availability Zone per group.

    • For example, to deploy 15 Amazon EC2 instances in a single spread placement group, a company needs at least 3 Availability Zones.

    • With a maximum of 7 running instances per AZ per group, 15 instances divided by 7 (rounded up) gives 3 Availability Zones.

  • Recommended for applications that have a small number of critical instances that should be kept separate from each other. This reduces the risk of simultaneous failures that might occur when instances share the same racks.
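The AZ arithmetic for spread groups, and the creation of a placement group itself, can be sketched with boto3. This is a minimal illustration: the group name is hypothetical, and the `create_group` call assumes AWS credentials and a default region are already configured.

```python
import math


def min_azs_for_spread(instance_count, max_per_az=7):
    """Spread placement groups allow at most 7 running instances
    per Availability Zone per group, so round up."""
    return math.ceil(instance_count / max_per_az)


def create_group(name, strategy):
    """Create a placement group; strategy must be 'cluster',
    'partition', or 'spread'."""
    import boto3  # assumes boto3 installed and credentials configured
    ec2 = boto3.client("ec2")
    ec2.create_placement_group(GroupName=name, Strategy=strategy)


# 15 instances at 7 per AZ -> 3 Availability Zones
azs_needed = min_azs_for_spread(15)
```

Calling `create_group("my-spread-group", "spread")` would then create the group; instances launched with that group name are placed on distinct racks.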

Amazon Kinesis Data Streams vs Amazon Kinesis Data Firehose

Kinesis Data Streams Functionality

Kinesis Data Firehose Functionality

Key Differences

Data Processing

  • Kinesis Data Streams: Offers custom data processing using Kinesis Data Analytics or external processing frameworks, providing more flexibility and control for complex processing needs.

  • Kinesis Data Firehose: Primarily focuses on data delivery, though it allows basic transformations and format conversions. It is a simpler, fully managed solution for direct data delivery.

Management

  • Kinesis Data Streams: Requires manual scaling and shard management.

  • Kinesis Data Firehose: Fully managed, with automatic scaling.

Data Storage

  • Kinesis Data Streams: Stores data for 24 hours by default, extendable up to 8,760 hours (365 days) using the IncreaseStreamRetentionPeriod parameter.

  • Kinesis Data Firehose: Does not store data; it delivers directly to the specified destinations.

Integrations

  • Kinesis Data Streams: Requires integration with other AWS services for data storage or analytics.

  • Kinesis Data Firehose: Integrates directly with services like S3, Redshift, and Elasticsearch for immediate data delivery.

Use Cases

  • Kinesis Data Streams: Best for applications requiring real-time analytics, complex processing, or temporary data storage.

  • Kinesis Data Firehose: Ideal for simple, real-time data delivery without the need for storage or complex processing.

Pricing

  • Kinesis Data Streams: Charged based on the number of shards and the data retention period.

  • Kinesis Data Firehose: Charged based on the amount of data ingested and transformed.

Practical Use Cases

  • Kinesis Data Streams: Real-time monitoring and analytics for large-scale applications; complex event processing in financial trading platforms; aggregating and analyzing high-volume IoT device data.

  • Kinesis Data Firehose: Streaming log data directly to S3 for later analysis; feeding data into Redshift for business intelligence; sending real-time application logs to Elasticsearch for operational insights.
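The producer-side difference between the two services can be sketched with boto3: a stream record needs a partition key (it determines the shard), while a Firehose record does not. The stream and delivery-stream names below are hypothetical, and the boto3 calls assume configured AWS credentials; the retention validator reflects the 24-to-8,760-hour range noted above.

```python
def validate_retention_hours(hours):
    """Kinesis Data Streams retention: 24 hours by default,
    extendable up to 8,760 hours (365 days)."""
    if not 24 <= hours <= 8760:
        raise ValueError("retention must be between 24 and 8760 hours")
    return hours


def put_to_stream(stream_name, data, partition_key):
    """Kinesis Data Streams: the partition key selects the shard."""
    import boto3  # assumes boto3 installed and credentials configured
    boto3.client("kinesis").put_record(
        StreamName=stream_name, Data=data, PartitionKey=partition_key
    )


def put_to_firehose(delivery_stream, data):
    """Kinesis Data Firehose: no shards or partition keys; Firehose
    buffers and delivers to the configured destination (e.g. S3)."""
    import boto3
    boto3.client("firehose").put_record(
        DeliveryStreamName=delivery_stream, Record={"Data": data}
    )
```

A typical call would be `put_to_stream("clickstream", b"...", partition_key="user-42")` versus `put_to_firehose("logs-to-s3", b"...")`.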

AWS Direct Connect lets you establish a dedicated network connection between your network and one of the AWS Direct Connect locations. Using industry-standard 802.1Q VLANs, this dedicated connection can be partitioned into multiple virtual interfaces. AWS Direct Connect does not involve the Internet; instead, it uses dedicated, private network connections between your network and Amazon VPC. AWS Direct Connect cannot be used to improve application resiliency to handle spikes in traffic.

AWS Global Accelerator vs Amazon Cloudfront

Amazon CloudFront

  • Delivers static assets (such as videos, images, and files) securely to devices around the globe with low latency.

  • Uses multiple sets of dynamically changing IP addresses.

  • Pricing is mainly based on data transfer out and HTTP requests.

  • Uses edge locations to cache content.

  • Designed to handle the HTTP protocol.

AWS Global Accelerator

  • Uses edge locations to find the optimal pathway from your users to your applications.

  • Provides a set of static IP addresses as a fixed entry point to your applications.

  • Charges a fixed hourly fee plus an incremental charge over your standard Data Transfer rates, called the Data Transfer-Premium fee (DT-Premium).

  • Uses edge locations to find an optimal pathway to the nearest regional endpoint.

  • Best for both HTTP and non-HTTP protocols such as TCP and UDP.
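The comparison above can be condensed into a rough rule of thumb. The helper below is illustrative only, not official AWS guidance: it encodes the two headline distinctions (CloudFront caches static HTTP content; Global Accelerator handles non-HTTP protocols and provides static entry IPs).

```python
def recommended_service(protocol, static_content):
    """Rough rule of thumb from the CloudFront vs Global Accelerator
    comparison (illustrative heuristic, not official guidance)."""
    if protocol in ("TCP", "UDP"):
        # CloudFront handles HTTP; non-HTTP traffic needs Global Accelerator.
        return "Global Accelerator"
    if static_content:
        # Static assets benefit from edge-location caching.
        return "CloudFront"
    # Dynamic HTTP apps needing fixed entry IPs and optimal routing.
    return "Global Accelerator"
```

For instance, a video library is `recommended_service("HTTP", static_content=True)`, while a multiplayer game server speaking UDP is `recommended_service("UDP", static_content=False)`.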

Amazon CloudFront Example:

Let us say you have a streaming website with thousands of videos in your repository. It is inefficient to serve these videos uniquely whenever a user requests them, as this leads to high bandwidth requirements and high CPU/memory/disk utilization, which in turn results in frequent downtimes, endless video buffering, and irritated users trying to load their favorite shows. Speeding up the website is as simple as offloading the videos, thumbnails, and any static assets from your server to Amazon S3, and using CloudFront to serve and cache these assets.

AWS Global Accelerator Example:

For example, you have a banking application that is deployed across multiple AWS Regions and low latency is a must. Global Accelerator will route the user to the nearest edge location and then route the request to the nearest regional endpoint where your application is hosted.

Amazon EMR is the industry-leading cloud big data platform for processing vast amounts of data using open-source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto. With EMR, you can run petabyte-scale analysis at less than half the cost of traditional on-premises solutions and over 3x faster than standard Apache Spark. For short-running jobs, you can spin up and spin down clusters and pay per second for the instances used. For long-running workloads, you can create highly available clusters that automatically scale to meet demand. Amazon EMR uses Hadoop, an open-source framework, to distribute your data and processing across a resizable cluster of Amazon EC2 instances.

Using EMR involves significant infrastructure management effort to set up and maintain the cluster. Additionally, this option involves a major development effort to write custom migration jobs to copy the database data into Redshift.
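The spin-up/spin-down pattern described above can be sketched with boto3's `run_job_flow`. This is a minimal sketch under stated assumptions: the release label, instance types, and counts are illustrative, and a real launch also needs IAM roles (e.g. JobFlowRole and ServiceRole) plus configured AWS credentials.

```python
def emr_cluster_config(name, release="emr-6.15.0", workers=3):
    """Build a minimal EMR cluster spec for a short-running job.
    Release label and instance types are illustrative assumptions."""
    return {
        "Name": name,
        "ReleaseLabel": release,
        "Applications": [{"Name": "Spark"}, {"Name": "Hive"}],
        "Instances": {
            "InstanceGroups": [
                {"InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": workers},
            ],
            # Short-running job: terminate the cluster when steps finish,
            # so you pay per second only while it runs.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
    }


def launch(config):
    import boto3  # assumes boto3 installed and credentials/roles configured
    return boto3.client("emr").run_job_flow(**config)


config = emr_cluster_config("demo-spark-job")
```

Setting `KeepJobFlowAliveWhenNoSteps` to `True` instead would keep the cluster alive for long-running workloads.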

The engineering team at an e-commerce company wants to migrate from Amazon Simple Queue Service (Amazon SQS) Standard queues to FIFO (First-In-First-Out) queues with batching.

As a solutions architect, which of the following steps would you have in the migration checklist? (Select three)

  • Delete the existing standard queue and recreate it as a FIFO (First-In-First-Out) queue

  • Make sure that the name of the FIFO (First-In-First-Out) queue ends with the .fifo suffix

  • Make sure that the throughput for the target FIFO (First-In-First-Out) queue does not exceed 3,000 messages per second
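Two of the checklist items above (the `.fifo` suffix and queue recreation, since an existing standard queue cannot be converted) can be sketched with boto3. The queue name is hypothetical, and the `create_fifo_queue` call assumes configured AWS credentials.

```python
def fifo_queue_name(base):
    """FIFO queue names must end with the .fifo suffix."""
    return base if base.endswith(".fifo") else base + ".fifo"


def create_fifo_queue(name):
    """Create a new FIFO queue; a standard queue cannot be converted
    in place, so the migration recreates it as FIFO."""
    import boto3  # assumes boto3 installed and credentials configured
    sqs = boto3.client("sqs")
    return sqs.create_queue(
        QueueName=fifo_queue_name(name),
        Attributes={
            "FifoQueue": "true",
            # Or supply an explicit MessageDeduplicationId on each send.
            "ContentBasedDeduplication": "true",
        },
    )
```

Messages sent to the resulting queue must include a `MessageGroupId`; with batching, throughput is limited to 3,000 messages per second per queue.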