This post was co-written by FactSet’s Cloud Infrastructure team (Gaurav Jain, Nathan Goodman, Geoff Wang, Daniel Cordes, and Sunu Joseph) and AWS Solutions Architects Amit Borulkar and Tarik Makota. FactSet’s goal for its cloud platform on AWS is high developer velocity alongside enterprise governance. They wanted application teams to have a frictionless […]| Amazon Web Services
In this post, we explore an efficient approach to managing encryption keys in a multi-tenant SaaS environment through centralization, addressing challenges like key proliferation, rising costs, and operational complexity across multiple AWS accounts and services. We demonstrate how implementing a centralized key management strategy using a single AWS KMS key per tenant can maintain security and compliance while reducing operational overhead as organizations scale.| AWS Architecture Blog
In this post, we explore how CommSec, Australia's leading online broker, transitioned from a multicloud environment to AWS as their sole cloud provider while implementing Amazon Application Recovery Controller (ARC) zonal shift to maintain high availability and operational resilience. The consolidation resulted in significant benefits including 25% base capacity reduction, two times faster deployments, and improved failover capabilities through ARC zonal shift, enabling CommSec to continue se...| AWS Architecture Blog
This two-part series shows how Karrot developed a new feature platform, which consists of three main components: feature serving, a stream ingestion pipeline, and a batch ingestion pipeline. This post starts by presenting our motivation, our requirements, and the solution architecture, focusing on feature serving.| AWS Architecture Blog
This two-part series shows how Karrot developed a new feature platform, which consists of three main components: feature serving, a stream ingestion pipeline, and a batch ingestion pipeline. This post covers the process of collecting features in real-time and batch ingestion into an online store, and the technical approaches for stable operation.| AWS Architecture Blog
In this post, we demonstrate how to deploy the DeepSeek-R1-Distill-Qwen-32B model using AWS Deep Learning Containers (DLCs) for vLLM on Amazon EKS, showcasing how these purpose-built containers simplify deployment of this powerful open source inference engine. This solution can help you solve the complex infrastructure challenges of deploying LLMs while maintaining performance and cost-efficiency.| Amazon Web Services
As cloud spending continues to surge, organizations must focus on strategic cloud optimization to maximize business value. This blog post explores key insights from MIT Technology Review's publication on cloud optimization, highlighting the importance of viewing optimization as a continuous process that encompasses all six AWS Well-Architected pillars.| AWS Architecture Blog
In this post, you’ll learn how Zapier has built their serverless architecture focusing on three key aspects: using Lambda functions to build isolated Zaps, operating over a hundred thousand Lambda functions through Zapier's control plane infrastructure, and enhancing security posture while reducing maintenance efforts by introducing automated function upgrades and cleanup workflows into their platform architecture.| AWS Architecture Blog
In this post, we discuss HashiCorp’s journey from manual, stress-inducing failover procedures to a streamlined, confident approach that fundamentally changed how they deliver on their enterprise-grade resilience promises.| Amazon Web Services
Internet routing today is handled by a routing protocol known as BGP (Border Gateway Protocol). Each network on the Internet is represented as an autonomous system (AS). An autonomous system has a globally unique autonomous system number (ASN), which is allocated by a Regional Internet Registry (RIR), which also handles […]| Amazon Web Services
In this post, we show you how to implement comprehensive monitoring for Amazon Elastic Kubernetes Service (Amazon EKS) workloads using AWS managed services. This solution demonstrates building an EKS platform that combines flexible compute options with enterprise-grade observability using AWS native services and OpenTelemetry.| AWS Architecture Blog
In this post, you'll learn how Scale to Win configured their network topology and AWS WAF to protect against DDoS events that reached peaks of over 2 million requests per second during the 2024 US presidential election campaign season. The post details how they implemented comprehensive DDoS protection by segmenting human and machine traffic, using tiered rate limits with CAPTCHA, and preventing CAPTCHA token reuse through AWS WAF Bot Control.| AWS Architecture Blog
AWS Transform for VMware is a service that tackles cloud migration challenges by significantly reducing manual effort and accelerating the migration of critical VMware workloads to AWS Cloud. In this post, we highlight its comprehensive capabilities, including streamlined discovery and assessment, intelligent network conversion, enhanced security and compliance, and orchestrated migration execution.| Amazon Web Services
In this post, you learn how you can use generative AI services on Amazon Web Services (AWS) to automate your sustainability reporting requirements, reduce manual effort, and improve accuracy. You do this by implementing an automated solution for extracting, processing, and validating data from corporate reports.| AWS Architecture Blog
In this post, we explore the Amazon Bedrock baseline architecture and how you can secure and control network access to your various Amazon Bedrock capabilities within AWS network services and tools. We discuss key design considerations, such as using Amazon VPC Lattice auth policies, Amazon Virtual Private Cloud (Amazon VPC) endpoints, and AWS Identity and Access Management (IAM) to restrict and monitor access to your Amazon Bedrock capabilities.| Amazon Web Services
This post demonstrates how the Issuer Solutions business of Global Payments, as a service provider, implemented cross-Region failover for an AWS PrivateLink backed service exposed to their customers. Their solution enables failover to a secondary Region without customer coordination, reducing Recovery Time Objective (RTO).| AWS Architecture Blog
In this post, we explore a unique scenario where an ISV, unable to provide a floating license option for cloud usage, worked with Stellantis to develop an alternative solution. This approach, implemented with the ISV’s permission, treats named user licenses as if they were floating, automatically assigning and removing them based on the state of user workbench instances.| AWS Architecture Blog
Organizations managing large audio and video archives face significant challenges in extracting value from their media content. Consider a radio network with thousands of broadcast hours across multiple stations and the challenges they face to efficiently verify ad placements, identify interview segments, and analyze programming patterns. In this post, we demonstrate how you can automatically transform unstructured media files into searchable, analyzable content.| Amazon Web Services
With the trend toward autonomous teams and microservice-style architectures, web frontend tiers are challenged to become more flexible and to integrate different components with independent architectures and technology stacks. Two scenarios are prominent: micro-frontends, where there is a single-page application and components within the page are owned by different teams; and web portals, where there […]| Amazon Web Services
In this post, we share how Pegasystems (Pega) built Launchpad, its new SaaS development platform, to solve a core challenge in multi-tenant environments: enabling secure customer customization. By running tenant code in isolated environments with AWS Lambda, Launchpad offers its customers a secure, scalable foundation, eliminating the need for bespoke code customizations.| AWS Architecture Blog
Cash App, a leading peer-to-peer payments and digital wallet service from Block, Inc., has implemented resilience improvements across the entire technology stack. In this post, we discuss how Cash App improved the resilience of its compute platform built on Amazon Elastic Kubernetes Service (Amazon EKS) by implementing a dual-cluster topology to reduce single points of failure. We also discuss how Cash App used AWS Fault Injection Service (AWS FIS) to conduct an Availability Zone power interr...| AWS Architecture Blog
In this post, we'll explore how to maximize the value of dashcam footage through best practices for implementing and managing Computer Vision systems in commercial fleet operations. We'll demonstrate how to build and deploy edge-based machine learning models that provide real-time alerts for distracted driving behaviors, while effectively collecting, processing, and analyzing footage to train these AI models.| AWS Architecture Blog
In this post, you will learn how Maya, an Amazon Web Services (AWS) customer and the Philippines’ leading fintech company and digital bank, built an API management platform to address the growing complexities of managing multiple APIs hosted on Amazon API Gateway.| AWS Architecture Blog
As part of the re:Invent 2023 keynote, Dr. Werner Vogels introduced the Frugal Architect mindset. This mindset emphasizes the importance of continuous learning, curiosity, and regular revision of architectural choices with a focus on cost and sustainability. Cost and sustainability should be treated as critical non-functional requirements, alongside factors like security, compliance, and performance. The […]| Amazon Web Services
AWS Architecture Blog| aws.amazon.com
In today’s fast-paced software as a service (SaaS) landscape, tenant portability is a critical capability for SaaS providers seeking to stay competitive. By enabling seamless movement between tiers, tenant portability allows businesses to adapt to changing needs. However, manual orchestration of portability requests can be a significant bottleneck, hindering scalability and requiring substantial resources. As […]| Amazon Web Services
The design of cloud workloads can be a complex task, where a perfect and universal solution doesn’t exist. We should balance all the different trade-offs and find an optimal solution based on our context. But how does it work in practice? Which guiding principles should we follow? Which are the most important areas we should […]| Amazon Web Services
This post was co-written with Shyam Narayan, a leader in the Accenture AWS Business Group, and Hui Yee Leong, a DevOps and platform engineer, both based in Australia. Hui and Shyam specialize in designing and implementing complex AWS transformation programs across a wide range of industries. Enterprises that operate out of multiple locations such as […]| Amazon Web Services
We are excited to announce the availability of an enhanced AWS Well-Architected Framework. In this update, you’ll find expanded guidance across all six pillars of the Framework: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, and Sustainability. In this release, we updated the implementation guidance for the new and existing best practices to be more prescriptive. This includes enhanced recommendations and steps […]| Amazon Web Services
In today’s digital world, businesses are increasingly turning to the cloud for its scalability, agility, and cost-effectiveness. Migrating your data center to the cloud can be a daunting task, but with the right approach and tools, it can be a successful journey. This Let’s Architect! blog post will guide you through the process of migrating […]| Amazon Web Services
ComfyUI is an open-source node-based workflow solution for Stable Diffusion. It offers the following advantages: significant performance optimization for SDXL model inference; high customizability, allowing users granular control; portable workflows that can be shared easily; and developer-friendliness. Due to these advantages, ComfyUI is increasingly being used by artistic creators. In this post, we will introduce […]| Amazon Web Services
AWS Regions provide fault isolation boundaries that prevent correlated failure and contain the impact from AWS service impairments to a single Region when they occur. You can use these fault boundaries to build multi-Region applications that consist of independent, fault-isolated replicas in each Region that limit shared fate scenarios. This allows you to build multi-Region […]| Amazon Web Services
Genomics workflows run on large pools of compute resources and take petabyte-scale datasets as inputs. Workflow runs can cost as much as hundreds of thousands of US dollars. Given this large scale, scientists want to estimate the projected cost of their genomics workflow runs before deciding to launch them. In Part 6 of this series, […]| Amazon Web Services
This post was co-written with Luke Sudgen, Lead DevOps Engineer Post Trade, and Padraig Murphy, Solutions Architect Post Trade, from London Stock Exchange Group. In this post, we’ll discuss some failure scenarios that were tested by London Stock Exchange Group (LSEG) Post Trade Technology teams during a chaos engineering event supported by AWS. Chaos engineering […]| Amazon Web Services
Enterprise document management systems (EDMS) manage the lifecycle and distribution of documents. They often rely on keyword-based search functionality. However, it becomes increasingly hard to discover documents as these repositories grow to tens of thousands of items. In this blog, we discuss how Amazon Web Services (AWS) built an intelligent search bot on top of […]| Amazon Web Services
Wind energy plays a crucial role in global decarbonization efforts by generating emission-free power from an abundant resource. In 2022, wind energy produced 2100 terawatt-hours (TWh) globally, or over 7% of global electricity, with expectations to reach 7400 TWh by 2030. Despite its potential, several challenges must be addressed to help meet grid decarbonization targets. […]| Amazon Web Services
A common challenge organizations face is how to gain confidence in and provide evidence for the continuous resilience of their workloads. Using modern chaos engineering principles can help in meeting this challenge, but the practice of chaos engineering can become complex. As a result, both the definition of the inputs and comprehension of the outputs […]| Amazon Web Services
Zurich Insurance Group is a leading multi-line global insurer operating in more than 200 territories. Headquartered in Zurich, Switzerland, their main business is life and property and casualty (P&C) insurance. In 2022, Zurich began a multi-year program to accelerate their digital transformation and innovation through migration of 1,000 workloads to AWS, including core insurance […]| Amazon Web Services
Update (May 2023): After 8 years, this solution continues to serve as a pillar for how Amazon builds remote client libraries for resilient systems. Most AWS SDKs now support exponential backoff and jitter as part of their retry behavior when using standard or adaptive modes. Consequently, this pattern can be leveraged without having to incorporate […]| Amazon Web Services
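For readers unfamiliar with the pattern named in the last summary, here is a minimal sketch of capped exponential backoff with randomized ("full") jitter. It is not code from the post or from any AWS SDK. The function name `backoff_delays` and the parameters `base`, `cap`, `attempts`, and `rng` are hypothetical and chosen only for illustration.

```python
import random


def backoff_delays(base=0.1, cap=5.0, attempts=5, rng=random.random):
    """Yield one sleep duration (seconds) per retry attempt.

    Hypothetical helper: the ceiling doubles with each attempt up to `cap`,
    and the actual delay is drawn uniformly from [0, ceiling) so that many
    clients retrying at once do not all wake up at the same moment.
    """
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        # rng() returns a float in [0.0, 1.0); scaling it spreads retries out.
        yield rng() * ceiling


if __name__ == "__main__":
    for i, delay in enumerate(backoff_delays()):
        print(f"retry {i}: sleep up to {delay:.3f}s")
```

A caller would sleep for each yielded duration between attempts and stop as soon as the request succeeds. Passing a custom `rng` makes the schedule deterministic for testing.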