
Datadog unveils open-source AI models for better observability
Datadog has announced the release of an open-weights foundation model and an accompanying benchmark through its AI research lab, aimed at advancing anomaly detection and analysis in machine learning and observability.
The two projects unveiled are Toto, an open-weights, zero-shot time series foundation model, and BOOM, described as the largest public benchmark of observability metrics to date. Both are intended to support the broader research community through open-source access and published findings.
Datadog AI Research is focused on bridging the gap between academic developments and practical implementation within the domains of cloud observability and security. The company states that it will regularly contribute to the research ecosystem by making research outputs and model artefacts accessible to researchers, developers and engineers worldwide.
Datadog describes Toto as the first open-source foundation model created specifically for observability. It is designed as a time series foundation model (TSFM) that learns from large datasets so it can adapt to numerous forecasting and anomaly detection tasks, similar in concept to large language models but applied to telemetry data. Its training relied exclusively on Datadog's internal telemetry metrics, which the company says enables it to outperform other TSFMs on a range of tasks.
According to Datadog, Toto's "zero-shot" design allows it to deliver immediate forecasts and anomaly detection without requiring tuning for each new data series. This quality, the company notes, is particularly valuable when monitoring systems generate billions of individual time series that change frequently. Existing TSFMs reportedly struggle with the complexities of telemetry data, and Datadog claims Toto improves performance on both observability-specific and general-purpose forecasting tasks.
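To make the idea of forecast-based anomaly detection without per-series tuning concrete, here is a minimal Python sketch. It does not use Toto or any Datadog API: the seasonal-naive forecaster is a deliberately simple stand-in for a pretrained model, and the function names, the `season` parameter and the `threshold` default are all assumptions chosen for illustration.

```python
from statistics import stdev

def seasonal_naive_forecast(history, horizon, season):
    """Stand-in forecaster (assumption, not Toto): repeat the last full season."""
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]

def flag_anomalies(history, observed, season, threshold=3.0):
    """Flag observed points that fall far outside the forecast.

    The tolerance band is threshold times the standard deviation of the
    in-sample seasonal residuals (history[t] - history[t - season]).
    """
    residuals = [history[t] - history[t - season]
                 for t in range(season, len(history))]
    scale = stdev(residuals) or 1e-9  # guard against a perfectly periodic history
    forecast = seasonal_naive_forecast(history, len(observed), season)
    return [abs(o - f) > threshold * scale for o, f in zip(observed, forecast)]

# Hypothetical metric: a 4-step pattern repeated 6 times with small jitter.
pattern = [10, 12, 15, 11]
history = [p + (0.5 if (i // 4) % 2 else -0.5)
           for i, p in enumerate(pattern * 6)]
observed = [10.4, 12.6, 40.0, 11.3]  # the third point is a sudden spike
print(flag_anomalies(history, observed, season=4))  # [False, False, True, False]
```

Nothing here is fitted to the individual series beyond a residual scale, which is the property the zero-shot claim is about. A foundation model would replace the stand-in forecaster with a single pretrained network shared across all series.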
BOOM, the second release from the lab, provides a new benchmark targeted specifically at observability metrics, which often display characteristics such as high sparsity, sudden spikes and cold-start scenarios that differ from standard time series data. The benchmark comprises 350 million observations across 2,807 real-world multivariate series, reflecting the scale and variety often encountered in production telemetry environments. Datadog submits that BOOM will serve as a resource for researchers attempting to advance forecasting models suitable for observability use cases.
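The characteristics named above can be measured on any single series. The following Python sketch is not part of BOOM and does not reflect how the benchmark was built. It assumes one way of quantifying sparsity (the fraction of missing or zero points) and spikes (points far above the median, in units of median absolute deviation). The helper names and the cutoff `k` are hypothetical.

```python
from statistics import median

def sparsity(series):
    """Fraction of points that are missing (None) or exactly zero."""
    empty = sum(1 for x in series if x is None or x == 0)
    return empty / len(series)

def spike_indices(series, k=5.0):
    """Indices of points more than k robust deviations above the median.

    Uses the median absolute deviation (MAD) so that the spikes
    themselves do not inflate the scale estimate.
    """
    values = [x for x in series if x is not None]
    med = median(values)
    mad = median(abs(x - med) for x in values) or 1e-9  # avoid division by zero
    return [i for i, x in enumerate(series)
            if x is not None and (x - med) / mad > k]

# Hypothetical metric with gaps, zeros and one sudden spike.
series = [5, 6, 5, 0, None, 6, 5, 90, 6, 5, 0, 5]
print(sparsity(series))       # 0.25
print(spike_indices(series))  # [7]
```

On very sparse series, where most points are zero, the MAD collapses to zero and this rule flags every non-zero point. That is one illustration of why such data is harder to handle than standard time series.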
"Today marks the launch of our first open-source foundation model and we expect to continuously release AI projects through Datadog AI Research," said Ameet Talwalkar, Chief Scientist at Datadog. "The lab offers an exciting opportunity to develop research ideas and build prototypes that will contribute to the community. We will also collaborate with applied AI teams to build tools that will solve customer problems and transform how engineers work."
Datadog has stated that the new lab will work closely with its product and engineering teams to ensure that the advances stemming from projects such as Toto and BOOM are translated into practical benefits for customers using its monitoring and security platform.
Both Toto and BOOM are immediately available for download under a permissive open-source licence. Datadog is encouraging researchers and members of the open-source software community to use and improve upon these tools to further observability forecasting research.
Recently, Datadog expanded its AI monitoring capacity with the acquisition of Metaplane, an AI-powered observability start-up. This acquisition is intended to bolster the company's capabilities in providing reliable and robust AI systems for business customers.

Related Articles


NZ Herald
13 hours ago
Why US$42b DataDog is going all in on AI
The enterprise software company DataDog is investing almost US$1b a year into artificial intelligence.


Scoop
13-06-2025
Datadog Expands Log Management Offering With New Long-Term Retention, Search And Data Residency Capabilities
Datadog, Inc. (NASDAQ: DDOG), the monitoring and security platform for cloud applications, today at DASH announced new capabilities in its log management suite, which are designed to help organisations optimise logging costs at scale and meet the stringent data retention, auditability and data residency requirements of regulated industries.
Logs are critical for threat detection, incident response and audit trails. However, lack of flexibility, high costs and data retention limitations remain roadblocks for security and compliance teams. Financial services, healthcare and insurance companies face similar challenges, having to comply with regulations and maintain full control over sensitive operational data, including their logs. Likewise, organisations operating under regional data residency laws or internal security policies are often required to store data within controlled environments, whether on-premises or in-region cloud infrastructure. These organisations need to remain compliant while having a scalable and efficient log management strategy. Traditional solutions, however, often introduce high costs, operational overhead and fragmented workflows.
At its DASH conference in 2023, Datadog launched Flex Logs, which has since become one of its fastest-growing products. Flex Logs decouples the costs of log storage from the costs of querying. It provides both short- and long-term log retention for a nominal monthly fee without sacrificing visibility, enabling streamlined correlation between all of an organisation's logs, metrics and traces.
To help companies meet data residency regulations, policies and preferences, while further optimising cost and efficiency, Datadog has launched new log management capabilities that build on the foundation set by Flex Logs.
Datadog's latest enhancements enable organisations to support modern SIEM and security workflows while maintaining full visibility, cost consciousness and operational efficiency:
Archive Search queries logs from customer-owned cold storage without requiring re-indexing. Archived logs can be searched the same way as logs under retention in the Log Explorer without introducing new tools or extra training. Datadog keeps the user experience consistent, regardless of the age of logs.
Flex Frozen is a new storage tier extending log retention to over seven years, eliminating the need for managing and securing external archives. Built for audit-heavy, compliance-driven environments, Flex Frozen simplifies data retention by keeping logs inside Datadog in order to reduce overhead, simplify reporting and analytics, and improve accessibility.
CloudPrem enables enterprises to deploy Datadog's indexing and search capabilities within their own infrastructure. Whether it's due to regional data residency laws or internal compliance mandates, customers can now keep their logs local while continuing to use the Datadog UI and workflows they trust.
'As compliance standards grow more complex and global data regulations tighten, organisations face mounting pressure to retain log data longer, search it faster and keep it where it belongs,' said Michael Whetten, VP of Product at Datadog. 'With today's launches, Datadog makes it easier to manage logs, control their costs and stay compliant without sacrificing performance, accessibility or the user experience.'


Scoop
12-06-2025
Datadog's Internal Developer Portal Boosts Engineering Autonomy And Helps Ship Code With Confidence
AUCKLAND – JUNE 12, 2025 – Datadog, Inc. (NASDAQ: DDOG), the monitoring and security platform for cloud applications, today at DASH launched its Internal Developer Portal (IDP), which is the first and only developer portal built on live observability data.
Engineering teams are under pressure to ship faster while still meeting stricter standards to keep their code reliable, secure, cost effective and compliant with legal, regulatory and company policies. Developers must navigate an expanding set of requirements, including code quality, testing, security scans, infrastructure configurations, observability and compliance. At the same time, they need to understand the systems and services their code depends on, who owns these services, and how they're performing in real time. As this complexity and cognitive load grow, developers increasingly rely on platform engineers to unblock them, which stretches resources for both teams and slows software delivery across the organisation.
Datadog IDP gives developers the autonomy to ship quickly and confidently while meeting production standards and keeping pace with constantly changing systems. Unlike static portals that rely on manual upkeep, Datadog IDP builds on its APM product suite to automatically map services and dependencies, and bring Datadog's real-time performance, service ownership and engineering knowledge together in one place. The new product enables developers to build, test, deploy and monitor software with self-service actions and built-in delivery guardrails, while providing platform engineers with scorecards to help them meet reliability, security and monitoring standards.
Datadog IDP accelerates incident response by bringing a live, centralised engineering knowledge base into every incident for faster triage, better decision making and improved coordination. Engineers can focus on solving issues, rather than searching for them across disparate systems, by leveraging these capabilities as part of Datadog's unified platform:
Software Catalog: A live system of record showing what software is running, who is responsible for it, and how it is performing across an organisation. This record is automatically synced to live telemetry collected in Datadog, so teams can easily find services, dependencies and their performance metrics, along with critical engineering context such as owners, on-call rotations, source code, runbooks, dashboards and documentation.
Self-Service Actions: Pre-built, pre-approved templates powered by Datadog's App Builder and Workflow Automation make it quick and easy for developers to accomplish tasks (like scaffolding a new service, provisioning infrastructure resources or triggering remediation actions) independently while meeting internal requirements.
Scorecards: A set of out-of-the-box and custom pass/fail rules that allow platform engineers and engineering managers to track compliance with reliability, security, observability, cost, and other standards across services and teams.
Engineering Reports: Out-of-the-box visibility into engineering reliability, software delivery performance and compliance with engineering standards, while offering actionable, personalised views for developers, team leads and executives.
'Datadog's IDP brings together both observed and declared system states, as well as existing systems of record. This combination shows not only developer intention but also what is actually in production. Whether developers onboard new teams or are tasked with complex projects such as migrating code from EC2 to Kubernetes, Datadog automatically provides visibility into their systems and reflects changes as they are being made, without stale metadata or manual syncing,' said Michael Whetten, VP of Product at Datadog. 'Datadog IDP empowers developers to collaborate more effectively and deliver software that meets their organisation's standards, at the pace that is expected from them.'
Datadog IDP's service ownership and other information are available across Datadog's unified platform. Status Pages, for example, leverages the same ownership metadata populated through IDP to make it easy to communicate incident scope and impact to stakeholders. And on-call engineers can now query service owners, recent changes and other critical information hands-free from IDP for faster investigations using a Voice Interface.
Datadog IDP was announced during the keynote at DASH, Datadog's annual conference. During DASH, Datadog also announced launches in AI Observability, Applied AI, AI Security and Log Management.
About Datadog
Datadog is the observability and security platform for cloud applications. Our SaaS platform integrates and automates infrastructure monitoring, application performance monitoring, log management, user experience monitoring, cloud security and many other capabilities to provide unified, real-time observability and security for our customers' entire technology stack. Datadog is used by organisations of all sizes and across a wide range of industries to enable digital transformation and cloud migration, drive collaboration among development, operations, security and business teams, accelerate time to market for applications, reduce time to problem resolution, secure applications and infrastructure, understand user behavior and track key business metrics.