In today’s fast-paced IT environment, monitoring and visualizing patching compliance across your infrastructure is crucial. Traditionally, creating comprehensive patching dashboards in Amazon QuickSight has been a manual, time-intensive process requiring multiple steps for each visual component. Amazon Q in QuickSight is an AI-powered assistant that enhances data analysis and visualization capabilities within Amazon QuickSight. This […]| AWS Cloud Operations Blog
Salesforce Commerce Cloud empowers thousands of retailers worldwide to create seamless shopping experiences. Behind these experiences lies a complex infrastructure that demands reliable monitoring at scale. As the platform evolved from static, first-party instances to dynamic cloud-based environments, the monitoring needs outgrew the self-managed Prometheus solution. This post details Salesforce’s Commerce Cloud journey from […]| AWS Cloud Operations Blog
In today’s cloud-driven landscape, development sandboxes have become enablers of innovation, offering safe environments for experimentation and testing. However, as organizations scale, these sandbox environments often grow increasingly complex and difficult to manage. Unchecked, this complexity can lead to escalating costs from abandoned resources, increased security risks, and diminished productivity—undermining the very benefits sandboxes are […]| AWS Cloud Operations Blog
With organizations increasingly recognizing governance as a strategic enabler rather than a compliance burden, this year’s Cloud Governance under AWS Cloud Ops track delivers cutting-edge sessions that bridge the gap between operational excellence and business innovation. The governance landscape is evolving rapidly, and this year’s sessions are organized around four critical themes that reflect the […]| AWS Cloud Operations Blog
Managing metrics collection at scale in complex cloud environments presents significant challenges for organizations, particularly when it comes to controlling costs and maintaining operational efficiency. As the volume of metrics grows exponentially with the expansion of container deployments and other cloud-native workloads, customers often struggle to balance comprehensive monitoring with resource optimization. This can lead […]| Amazon Web Services
AWS Organizations enables customers to centrally manage their AWS accounts. Many customers prefer to automate account creation, and they can use the CreateAccount API to build an account vending pipeline. This pipeline standardizes the deployment of policies, roles, and resources across new accounts while managing the complete lifecycle through eventual account closure. Through this […]| AWS Cloud Operations Blog
Modern architectures generate vast amounts of observability data across metrics, logs, and traces. When issues arise, teams spend hours—sometimes days—manually correlating information across multiple dashboards to identify root causes, directly impacting MTTR and productivity. Amazon CloudWatch Application Signals addresses this challenge by providing deep application visibility through automatic instrumentation, capturing key metrics like latency, error […]| AWS Cloud Operations Blog
AWS Config tracks configuration changes across your AWS resources and AWS Organizations. AWS Config uses the configuration recorder to detect changes and records them as configuration items (CIs). As your infrastructure grows and becomes more complex, choosing the appropriate recording frequency becomes critical for maintaining operational visibility, meeting compliance requirements, and supporting your security posture. Since the launch of the periodic recording […]| AWS Cloud Operations Blog
As organizations rapidly deploy large language models (LLMs) and generative AI agents to power increasingly intelligent workloads, they struggle to monitor and troubleshoot the complex interactions within their AI applications. Traditional monitoring tools fall short in providing visibility across components, leading developers and AI/ML engineers to manually correlate interaction logs or build custom […]| Amazon Web Services
Determining how to protect and recover an application can often be easier than determining how quickly your business needs that application recovered. Even so, establishing the correct recovery objective targets at an application level is a critical part of business continuity planning. This blog is intended to help customers as they establish or reevaluate recovery targets, […]| Amazon Web Services
In my previous blog, I shared how to evolve leadership for agentic AI using familiar mental models. As a CTO, I’ve been thinking about the corresponding architectural shifts required: We need to move from building predictable systems to developing autonomous capabilities that augment teams. Based on hands-on explorations and working with fellow technology leaders navigating […]| Amazon Web Services
Today, we are making it easier for you to manage the alternate contacts (billing, operations, and security) on your member accounts in AWS Organizations. You can now programmatically manage your account alternate contact information in addition to the existing experience in the AWS console. This launch ensures that the right individuals receive important AWS notifications […]| Amazon Web Services
Do you have thousands of Amazon CloudWatch alarms across AWS Regions and want to quickly identify which ones are low-value or misconfigured? Are you looking for ways to identify alarms that have been in the ‘ALARM’ or ‘INSUFFICIENT_DATA’ state for several days and need to be revisited? Do you need a cleanup mechanism […]| Amazon Web Services
The design of cloud workloads can be a complex task, where a perfect and universal solution doesn’t exist. We should balance all the different trade-offs and find an optimal solution based on our context. But how does it work in practice? Which guiding principles should we follow? Which are the most important areas we should […]| Amazon Web Services
With AI's rapid evolution, boards face multi-faceted risks requiring diverse oversight, technical expertise, agile risk-mitigation, clear values guiding deployment, and robust cybersecurity - proactively managing uncertainties while capturing AI's transformative potential.| Amazon Web Services