Datadog and Amazon Web Services (AWS) are deepening their partnership with a new Strategic Collaboration Agreement (SCA) and a suite of product enhancements aimed at improving AI observability, cloud security and operational reliability for enterprises running large-scale workloads on AWS.
Announced at AWS re:Invent, the updates reflect a broader industry shift as cloud-native applications and AI workloads push organisations to demand more unified monitoring, faster issue resolution and tighter security across distributed environments.
A Unified View for Cloud and AI Workloads
Datadog already integrates with more than 1,000 technologies, including 100 AWS-specific services. With enterprises increasingly adopting LLMs, agentic workflows, serverless systems and Kubernetes-based architectures, observability needs have become more complex and time-sensitive.
“Customers need end-to-end visibility across their stack — otherwise a single issue can trigger a cascade of failures,” said Sean Fernandez, CIO at ROLLER. Datadog’s consolidated view, he noted, reduces investigation time from hours to seconds, significantly improving uptime and resilience during cloud transformation.
These pressures are forcing organisations to modernise the way they monitor and secure AWS workloads, from model performance and inference-layer security to storage behaviour and multi-cloud risk.
New Capabilities: AI Observability, Cost Controls, Storage Insights and Automated Remediation
At re:Invent, Datadog showcased a broad set of new and preview features, including:
AI & LLM Observability
Monitoring and debugging for Amazon Bedrock Agents and the Strands Agents framework.
Visibility into agent workflows, model behaviour and inference issues in real time.
Cloud Storage Management
Detailed insights into Amazon S3 buckets and prefixes, enabling teams to identify waste, control object storage costs and prevent unexpected billing spikes.
Integrated Incident Automation
Datadog MCP Server integration with AWS DevOps Agent (Preview) automates queries for logs, traces and metrics during investigations.
Kiro IDE integration (Preview) brings contextual telemetry directly into developers’ workflows for faster debugging.
Expanded Serverless and Container Support
Full visibility into AWS Lambda Managed Instances and Amazon ECS Managed Instances.
Support for Amazon ECS Express Mode, improving troubleshooting for high-scale container environments.
AI-Augmented Remediation
Bits AI for serverless environments and Kubernetes clusters offers guided, evidence-based recommendations to close incidents faster.
Cloud Cost Optimisation
Automated recommendations for AWS Lambda and Amazon RDS, identifying misconfigurations, inefficient provisioning and unnecessary logging costs.
Data Pipeline Enhancements
Predefined packs for VPC, CloudTrail and CloudFront telemetry.
S3 log rehydration for quick historical log access and processing.
Together, these additions extend Datadog's coverage of AWS workloads and automate more of the investigation and remediation work, which matters most as organisations accelerate adoption of AI workloads on AWS.
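To make the prefix-level storage insight mentioned above more concrete, the following is a minimal Python sketch of the underlying idea: summing object sizes per key prefix to see where storage is concentrated. It does not use or represent Datadog's product or any AWS API. The function names (`bytes_by_prefix`, `largest_prefixes`) and the sample inventory are hypothetical and for illustration only.

```python
from collections import defaultdict


def bytes_by_prefix(objects, depth=1):
    """Sum object sizes per key prefix.

    objects: iterable of (key, size_in_bytes) pairs (hypothetical inventory data).
    depth: number of leading path segments that define a prefix.
    """
    totals = defaultdict(int)
    for key, size in objects:
        parts = key.split("/")
        # Keys with too few segments are counted against the bucket root.
        if len(parts) > depth:
            prefix = "/".join(parts[:depth]) + "/"
        else:
            prefix = "(root)"
        totals[prefix] += size
    return dict(totals)


def largest_prefixes(totals, top=3):
    """Return the `top` prefixes ordered by total size, largest first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top]


# Illustrative data only; a real analysis would read an inventory export.
inventory = [
    ("logs/2024/app.log", 500),
    ("logs/2025/app.log", 700),
    ("models/v1/weights.bin", 4000),
    ("readme.txt", 10),
]
totals = bytes_by_prefix(inventory)
top_two = largest_prefixes(totals, top=2)
```

In this sketch, raising `depth` groups by deeper prefixes (for example `logs/2024/`), which is the kind of drill-down a team might use to find the source of an unexpected storage bill.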
Strengthening the AWS–Datadog Partnership
Datadog’s new Strategic Collaboration Agreement reinforces multi-year joint engineering, co-selling and marketplace initiatives across regions and industries.
“As cloud-native and AI-first applications accelerate, observability and security across AWS environments are now central to reliability and cost optimisation,” said Jarrod Buckley, VP of Channels and Alliances at Datadog.
AWS echoed this sentiment:
“Observability is essential to help customers build with confidence and scale their AI initiatives,” said Chris Grusz, Managing Director, Technology Partnerships at AWS.
The strengthened partnership aims to help enterprises minimise operational risk, speed up modernisation and deploy generative AI with better insight into performance, cost and security.
A New Era of AI-Ready Observability
The message from re:Invent is clear: monitoring must evolve alongside AI.
With LLM pipelines, autonomous agents and multi-cloud deployments becoming mainstream, traditional observability tools are no longer enough.
Datadog’s latest offerings reflect the growing demand for:
Holistic visibility across heterogeneous systems
AI-first troubleshooting and automation
Integrated cost governance
Strong security for AI and cloud-native workloads
As enterprises scale AI and cloud adoption, this deeper AWS–Datadog collaboration positions both companies at the centre of next-generation observability and operational intelligence.
