13-06-2025
What AI Agents Are Getting Right – And Wrong
Agentic AI promises to transform IT operations, but many platforms still fall short. Here's what's working, what's hype and how CIOs can separate fad from reality.
For many CIOs and technology executives, AI's promise was straightforward: smarter, faster and more efficient IT operations. The technology was envisioned as a game-changer, capable of reducing operational costs, automating mundane tasks, enhancing system reliability and freeing up human resources for much more important work.
But ask them today, and you might hear frustration rather than enthusiasm. That's because the reality on the ground is starkly different from the optimistic projections that have headlined the news so far, primarily due to the complexities involved in effectively integrating AI into IT operations. It's a challenge that was front and center at Agentic AI Demo Day, where executives gathered to explore how autonomous agents can help streamline operations — but only if the underlying complexity is addressed first.
The global operational intelligence market was valued at $3.2 billion in 2024, according to IMARC Group, and is projected to reach $6.8 billion by 2033, growing at a CAGR of 8.8%. Even so, enterprises are still grappling with real-world barriers to implementing AI effectively in their IT operations. At the heart of it all is one major hurdle: untangling the operational complexity that prevents AI agents from delivering on their promise.
The big question, though, is: How can they move past this complexity and harness the true power of AI?
According to Andy Thurai, industry analyst at Field CTO, a major problem for enterprise IT today is that many organizations still run their IT operations through 'manual incident management processes,' a reality he described as 'shocking.' A 2024 report from the Uptime Institute found that nearly 60% of enterprises suffered major outages and downtimes tied to escalating IT complexity. One joint report by Splunk, a Cisco company, and global research institute Oxford Economics estimated the yearly global cost of such downtimes to be $400 billion.
That's a huge cost, and it shows why enterprises are now scrambling to fix long-standing inefficiencies in IT. In that scramble, many technical decision makers have bought into the AI hype and deployed AI tools that didn't fully solve their operational problems.
While traditional machine learning and GenAI tools have addressed specific operational tasks — like forecasting or summarization — they still fall short when it comes to cross-domain workflow automation and real-time system orchestration. 'AI tools have tackled the easy parts,' Thurai said during Fabrix's Agentic AI Demo Day. 'But they haven't solved the fundamental workflow problems at the heart of IT operations.'
Instead, many organizations have adopted fragmented point solutions that generate too much noise and too few insights.
'AI solutions promised to streamline operations, but instead, companies ended up with fragmented tools producing too much data and too few actionable insights,' Thurai explained during a recent webinar. He noted that one of the biggest pain points today is 'alert fatigue,' where IT teams are overwhelmed by excessive system alerts, diminishing their effectiveness and responsiveness.
Thurai's sentiment is rooted in facts, with a report by McKinsey noting that while 92% of companies plan to grow their AI investments over the next three years, just 1% of surveyed C-Suite leaders describe their organizations as 'AI mature' — meaning AI is fully embedded into their operations and driving positive business outcomes. Many organizations face data overload from numerous sources, increasing rather than reducing existing operational pressures.
As Thurai explained, this problem stems from the fact that modern enterprises rely heavily on intricate, microservices-based architectures. Systems at companies like Netflix, Uber and Amazon manage thousands of interdependent services simultaneously, dramatically increasing operational complexity. When incidents occur, traditional monitoring tools struggle to quickly pinpoint root causes, resulting in delayed resolutions that can cost millions in downtime and lost productivity.
To address these shortcomings, the industry is gradually shifting toward agentic AI (also called agentic AIOps when applied to IT environments): autonomous agents capable of independent action, reasoning and adaptive decision-making without constant human oversight. These agentic systems are particularly suited to IT operations precisely because they can operate independently, detecting and resolving incidents autonomously.
While much is still being understood about how these agentic systems behave at scale and experts continue to call for companies to prioritize safety in building or deploying AI agents, they could potentially mitigate human error, reduce response times and directly address IT departments' alert fatigue.
As Thurai noted in the webinar, organizations can achieve unprecedented efficiency, resilience and proactive management across their IT environments by orchestrating intelligent agents that can analyze, predict and act autonomously.
Companies are beginning to explore how these autonomous systems can be deployed effectively. For example, Fabrix, which offers a modern intelligence platform for the agentic AI era, recently showcased practical ways businesses can deploy AI agents for operational intelligence during its Agentic AI Demo Day. The company's platform enables businesses to build customized AI agents tailored to specific operational scenarios, from anomaly detection to real-time event management.
The anomaly detector agents demonstrated at the event can autonomously identify KPI deviations, automatically open trouble tickets and dynamically adjust system capacity. Event intelligence agents also showed capabilities in real-time alert correlation and executing closed-loop remediation.
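To make the idea concrete, here is a minimal sketch of what the first step of such an anomaly detector agent might look like: flag a KPI reading that deviates sharply from recent history and produce a ticket record for it. It is an illustration only, not Fabrix's implementation. The names `Ticket`, `detect_deviation` and `run_agent_step`, the z-score method and the threshold of 3 standard deviations are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Optional, Sequence


@dataclass
class Ticket:
    """Hypothetical trouble-ticket record; a real agent would call a ticketing API."""
    kpi: str
    value: float
    zscore: float


def detect_deviation(history: Sequence[float], latest: float,
                     threshold: float = 3.0) -> Optional[float]:
    """Return the z-score of `latest` if it lies more than `threshold`
    sample standard deviations from the mean of `history`, else None."""
    if len(history) < 2:
        return None  # not enough data to estimate spread
    sigma = stdev(history)
    if sigma == 0:
        return None  # flat history: z-score is undefined
    z = (latest - mean(history)) / sigma
    return z if abs(z) > threshold else None


def run_agent_step(kpi: str, history: Sequence[float], latest: float,
                   threshold: float = 3.0) -> Optional[Ticket]:
    """One pass of the assumed detect-then-ticket loop for a single KPI."""
    z = detect_deviation(history, latest, threshold)
    if z is None:
        return None
    return Ticket(kpi=kpi, value=latest, zscore=z)


if __name__ == "__main__":
    # Example readings are invented for illustration.
    print(run_agent_step("latency_ms", [100, 102, 98, 101, 99], 130))
```

A production agent would add the parts the demo emphasized, such as correlating related alerts and adjusting capacity, along with the guardrails and audit trails discussed later in this article.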
In practical terms, this means that AI agents, like those Fabrix demonstrated, have strong potential to significantly reduce operational costs and improve overall system reliability for organizations. However, the adoption of advanced autonomous systems isn't without hurdles.
Fabrix isn't the only player in this emerging space. Cisco has also introduced AI-native observability tools that incorporate agent-like behaviors to automate root cause analysis and AI observability. Similarly, Dynatrace is layering AI agents into its Davis AI engine to enhance multi-domain remediation across cloud-native environments. These developments reflect a broader move toward intelligent automation, though each vendor is taking a different route.
Still, these agentic systems remain in early phases. Critics note that many so-called agentic platforms are still rule-based at their core, lacking the true autonomy and reasoning needed to adapt across diverse workflows. Even Fabrix's approach, while promising, is still evolving and may require customization for complex enterprise environments.
As competition heats up, the key differentiator may not be the platform itself — but how well it balances adaptability, trust and enterprise-grade integration.
Thurai warned that without robust guardrails, autonomous AI could exhibit unpredictable, or 'stochastic,' behaviors. Companies must invest not only in agentic platforms but also in frameworks ensuring security, observability and ethical AI practices.
'Implementing guardrails and quality controls are essential,' Thurai said. 'Without proper oversight, you risk AI that doesn't just hallucinate — these systems can confidently produce inaccurate outcomes, leading to significant operational risks.'
That message was echoed by multiple speakers at the Agentic AI Demo Day, including Cisco and IBM executives, who emphasized the need for enterprise-grade controls like embedded testing, persona-based access governance and auditable AI execution paths. These capabilities, they argued, are non-negotiables for agentic platforms that aim to operate autonomously at enterprise scale.
Another significant challenge enterprises face is the severe shortage of skilled IT professionals. Korn Ferry predicts a global shortage of up to 85 million tech workers by 2030. This skill gap could further worsen already challenging operational issues, forcing enterprises to rely increasingly on automation and AI-driven solutions. Autonomous agents could be helpful in this regard, providing a critical lifeline that fills talent gaps and performs routine and even complex tasks previously managed by overstretched human teams.
For now, the road to fully autonomous AI operations remains under construction. Enterprises considering this journey must prepare carefully, matching their investments in agentic AI with a thorough understanding of the potential pitfalls of rushing to deploy AI, as well as a disciplined approach to implementing it.
Despite these challenges, the potential rewards — reduced downtime, increased operational efficiency and substantial cost savings — make agentic AI an investment worth serious consideration. But as Thurai noted, agentic AIOps — which describes the application of autonomous, decision-making AI agents within AI-powered IT operations — is still in the very early stages and only a few vendors offer it.
In the next year, he added, 'we'll probably see too much vendor snake oil coming out of the market saying, "Oh, we're an AI agent platform," when they really aren't.' The big message, according to Thurai, is that as we enter a new agentic era for AI applications, choosing the right vendor could be the deciding factor between scalable automation and another failed AI deployment.
'The major difference between choosing the right vendor and wrong vendor, especially in IT ops, is not just about the platform,' he said, 'but the capabilities that can and should be expandable by agents.'