Latest news with #Databricks

‘AI hallucinations are hard to remove completely': Naveen Rao, VP of AI, Databricks

The Hindu

4 days ago


Data and analytics firm Databricks was primed to ride the AI wave, as a data platform is crucial for training models. The company pioneered the 'lakehouse architecture', an open data management system that combines flexibility, cost efficiency and scalability. After a $10 billion funding round in December that valued the company at $62 billion, the AI company is being keenly watched by investors. But with the dramatic rise in valuations of AI firms comes the pressure to sell AI tools. In an exclusive interaction with The Hindu, Naveen Rao, VP of AI at Databricks, discussed the enterprise AI market, the AGI moment, and the hype around AI agents.

THB: AI adoption is in full swing even as hallucinations persist. Why is this the case?

Naveen Rao: There are a few reasons. We've gotten more mature on what our application areas are for AI. Basically, we have figured out the use cases where we can tolerate some error. Initially, people tried to apply them to all kinds of areas where precision was required, which was a mistake. Secondly, we are adding value in these areas, so the models have actually gotten better. When you put information into context, either through retrieval or just by entering all the required context into the prompt, we see fewer hallucinations. The models adhere to the information in the prompts with higher fidelity. Over time, we'll continue to work at driving down hallucinations. There can be other checks and balances, like having another model judge whether the output is good. That gives us higher precision. It's not 100%, but in many cases it's approaching north of 90-95% accuracy. But hallucinations are intrinsic to these AI models because LLMs are probabilistic, right? It's autoregressive, next-token prediction. So, it's very hard to remove completely.

THB: Databricks recently announced a $100 million partnership with Anthropic. How will this play out in the agentic AI market?

Naveen: We can't have AI agents in companies because there are errors, especially when they're doing multi-step tasks. I do think AI agents are definitely hyped. The definition of an AI agent has shifted to fit the narrative. The original intent of the word was to describe an entity that has agency, meaning it can act completely on its own. Now, it has evolved into multi-part systems where multiple LLMs work together to solve a task that usually makes humans faster or more efficient. Even if AI agents were perfectly behaved, we don't want something that can act completely on its own with no governance. We have built this into Databricks' governance layer for data, extended to GenAI. So, whenever an AI agent is built, it has certain access rights and entitlements, and it can't go willy-nilly everywhere. In the coming years, as we start to solve these problems, things will become reliable and accurate.

THB: Can we attain AGI with the current level of LLM advances?

Naveen: I mean, whether it's a route to AGI or not will be seen in the technology itself. I personally don't think it is. I don't think that autoregressive loss is the right way to make something that can truly understand causation in a system. Humans learn in a different way. We learn by coming up with a mechanistic understanding of how we can solve a task. So, one thing leads to another thing, then another. When I want to solve the same task again, I have a sense of the inputs that cause the transition to the next state. LLMs don't do this. Maybe we'll get there, but the current paradigm of very large pre-training on a huge corpus of unstructured data and then doing some sort of reinforcement learning to modify its behaviour is not what will lead to something that can truly act on its own.

THB: So, you don't believe that AGI is just around the corner?

Naveen: No, I think it's a much harder problem than a lot of people want to give it credit for, and we are not at a point where we're close. Yes, we have made huge progress towards autonomous systems that can understand natural language. LLMs have solved natural language, which is a big thing. I don't want to minimize what's been done or their economic impact. LLMs are very useful tools.

THB: How do you compare enterprise AI against the consumer AI market? Do you find it easier to navigate?

Naveen: Enterprises tend to be slower to adopt. They're generally very rational actors, whereas consumers are somewhat irrational. That makes it harder to go after consumers, because it's hard to understand exactly why they're going to buy. Over time the enterprise market will be bigger than the consumer market, I believe. There are really only a few different product surfaces. Search tools are a big one in the consumer segment, like Perplexity or ChatGPT. Image generation and others are really mostly for fun. But I don't know how much people pay for fun. Usually, the novelty wears off unless they're used for business purposes. Whereas in enterprises we see companies really trying to look for an ROI, so they're willing to invest a lot because it means something about the company versus their competitors.

THB: What is AI's killer app now?

Naveen: Right now, it's coding. AI tools tend to be effective when they're structured or their output is easily measured. With writing code, you can tell if the code compiles pretty easily. Coding agents hallucinate a lot, but most of their output is still useful.

THB: What do you think about the view that students should stop studying software engineering?

Naveen: I don't agree with it. Someone has to understand how these systems work. Even if code generation is automated, it doesn't mean that the physics of writing software goes away. Somebody will still have to work with the code and check it. We have to understand the basics. I think it's very poor advice to say we should stop studying computer science altogether.
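Rao's remark that hallucinations are intrinsic because LLMs do probabilistic next-token prediction can be illustrated with a toy sampler. The sketch below is not taken from the interview or from any Databricks product: the candidate tokens and their scores are invented, and a real model scores tens of thousands of tokens at every step. It shows only that when tokens are drawn in proportion to their probability, a less likely (and here, wrong) completion is still chosen some of the time.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits.values())
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


def sample_next_token(logits, rng):
    """Draw one token in proportion to its probability."""
    probs = softmax(logits)
    tokens = list(probs)
    weights = [probs[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical scores for the token after "The capital of Australia is".
logits = {"Canberra": 3.0, "Sydney": 1.5, "Melbourne": 0.5}
probs = softmax(logits)

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_next_token(logits, rng) for _ in range(1000)]
wrong_share = sum(tok != "Canberra" for tok in draws) / len(draws)

print(probs)
print(f"share of wrong completions: {wrong_share:.1%}")
```

Supplying more context, as Rao describes, amounts to shifting probability mass towards the correct token. It shrinks the wrong share without taking it to zero.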

PuppyGraph Announces New Native Integration to Support Databricks' Managed Iceberg Tables

Business Wire

13-06-2025


SAN FRANCISCO--(BUSINESS WIRE)--PuppyGraph, the first real-time, zero-ETL graph query engine, today announced native integration with Managed Iceberg Tables on the Databricks Data Intelligence Platform. This milestone allows organizations to run complex graph queries directly on Iceberg Tables governed by Unity Catalog, with no data movement and no ETL pipelines.

Databricks Managed Iceberg Tables, launching in Public Preview at this year's Data + AI Summit, offer full support for the Apache Iceberg™ REST Catalog API. This allows external engines, such as Apache Spark™, Apache Flink™, and Apache Kafka™, to interoperate seamlessly with tables governed by Unity Catalog. Managed Iceberg Tables provide automatic performance optimizations, which deliver cost-efficient storage and lightning-fast queries out of the box.

By combining PuppyGraph's in-place graph engine with the openness and scale of Managed Iceberg Tables, teams can now:

  • Query massive Iceberg datasets as a live graph, in real time
  • Use graph traversal to detect fraud, lateral movement, and network paths
  • Perform Root Cause Analysis on telemetry data using service relationship graphs
  • Eliminate the need for ETL into siloed graph databases
  • Scale analytics across petabytes with minimal operational overhead

Coinbase and CipherOwl are joint customers of Databricks and PuppyGraph. At the Data + AI Summit, both will share how graph analytics has powered their products and enabled real-time insights directly on managed lakehouses.

"This changes how graph analytics fits into the modern data stack," said Weimo Liu, CEO of PuppyGraph. "Databricks' new Iceberg capabilities provide a truly open, scalable foundation. With PuppyGraph, teams can ask complex relationship-driven questions without ever leaving their lakehouse."

To learn more about how PuppyGraph integrates with Apache Iceberg™ and the Databricks Data Intelligence Platform, visit or see the joint talk with Coinbase at Data + AI Summit 2025.

About PuppyGraph: PuppyGraph is the first and only real-time, zero-ETL graph query engine in the market, empowering data teams to query existing relational data stores as a unified graph model deployed in under 10 minutes, bypassing traditional graph databases' cost, latency, and maintenance hurdles. Capable of scaling with petabytes of data and executing complex 10-hop queries in seconds, PuppyGraph supports use cases from enhancing LLMs with knowledge graphs to fraud detection, cybersecurity and more. PuppyGraph is trusted by industry leaders, including Coinbase, Netskope, CipherOwl, Prevalent AI, Clarivate, and more. Learn more at and follow the company on LinkedIn, YouTube and X.
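The idea of querying relational rows as a graph, and of bounding a traversal by hop count, can be sketched in a few lines. This is an illustration only and says nothing about how PuppyGraph is implemented or queried: the `transfers` rows, the account names and the `reachable_within` helper are all hypothetical, and a plain breadth-first search over an in-memory adjacency list stands in for a distributed graph engine.

```python
from collections import deque

# Hypothetical rows from a relational "transfers" table: (from_account, to_account).
transfers = [
    ("acct_a", "acct_b"),
    ("acct_b", "acct_c"),
    ("acct_c", "acct_d"),
    ("acct_a", "acct_e"),
]


def build_adjacency(rows):
    """Treat each row as a directed edge, without copying it into a graph store."""
    adjacency = {}
    for src, dst in rows:
        adjacency.setdefault(src, []).append(dst)
    return adjacency


def reachable_within(adjacency, start, max_hops):
    """Return {node: hop_count} for nodes reachable from start in at most max_hops."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


adjacency = build_adjacency(transfers)
two_hop = reachable_within(adjacency, "acct_a", 2)
print(two_hop)
```

In a fraud-detection setting, the same bounded traversal would start from a flagged account and list everything it has moved funds to, directly or through intermediaries, within the chosen number of hops.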

Informatica Expands Partnership with Databricks as Launch Partner for Managed Iceberg Tables and OLTP Database Service at Data + AI Summit 2025

Business Wire

12-06-2025


REDWOOD CITY, Calif.--(BUSINESS WIRE)--Informatica (NYSE: INFA), an AI-powered enterprise cloud data management leader, today announced a significant expansion of its partnership with Databricks at the 2025 Data + AI Summit. Informatica is a launch partner for two major innovations from Databricks: Managed Iceberg Tables and Databricks Lakebase, a first-of-its-kind, modern database built for AI. Informatica also unveiled GenAI-focused enhancements to its Intelligent Data Management Cloud (IDMC) platform to accelerate data and AI at scale with Databricks. These announcements further strengthen Informatica's leadership in cloud data management and its deep integration with the Databricks Data Intelligence Platform.

Support/Launch Partner for Databricks' New Managed Iceberg Tables

As a launch partner for Databricks Managed Iceberg Tables, Informatica enables customers to ingest, cleanse, govern and transform Iceberg-format data at enterprise scale. This allows organizations to convert any data to Iceberg format and leverage open table formats with confidence while maintaining high-performance analytics and AI workloads on the Databricks Data Intelligence Platform.

Launch Partner/Connectivity for Databricks' New OLTP Database

Informatica is also a launch partner for Databricks Lakebase, a new fully managed, Postgres-compatible database that supports high-volume transactions. Informatica enables seamless data loading and transformation from over 300 sources into the Databricks PostgreSQL service. This will help customers support transactional database (OLTP) use cases by leveraging all their enterprise data assets within the Databricks Data Intelligence Platform, providing uniform enterprise data management for analytics, AI and now transactional workloads.

Accelerating GenAI Adoption with CAI for Mosaic AI

In addition, Informatica is introducing new capabilities aimed at accelerating the adoption of AI agents and GenAI on Databricks Mosaic AI, Databricks' suite of AI solutions that helps enterprises build and deploy quality AI agent systems. These include:

  • Mosaic AI connectors for Cloud Application Integration (CAI): Rapidly deploy AI agents that integrate enterprise data with Mosaic AI through a no-code interface.
  • GenAI Recipes for CAI: Pre-configured templates that simplify and speed up GenAI application development and deployment.

Enhanced Volume Support for Databricks Integration

New volume support for Informatica's Cloud Data Integration (CDI) and Cloud Data Ingestion and Replication (CDIR) allows Databricks customers to move and manage non-tabular datasets more efficiently via Unity Catalog, reinforcing Informatica's strength in no-code, governed data integration.

'As a launch partner for our Managed Iceberg Tables and Lakebase, Informatica is committed to supporting Databricks' goal of helping customers leverage open table formats,' said Roger Murff, VP of Technology Partners at Databricks. 'With Informatica's support for GenAI through Databricks Mosaic AI connectors and GenAI recipes, we're enabling enterprises to streamline AI initiatives that harness data intelligence and create real business impact.'

'Informatica continues to be at the leading edge of Generative AI, enabling our joint customers to build a data foundation of trusted, AI-ready data,' said Rik Tamm-Daniels, Group Vice President of Strategic Ecosystems and Technology at Informatica. 'As a launch partner, today's announcement showcases our ongoing commitment to innovating with Databricks to maximize customer value through deep product enhancement and partnership alignment.'

Join Informatica at DAIS 2025

Informatica invites attendees to visit booth #325 at the Databricks Data + AI Summit to explore how Informatica and Databricks are jointly driving the future of AI and enterprise data management.

About Informatica

Informatica (NYSE: INFA), a leader in AI-powered enterprise cloud data management, helps businesses unlock the full value of their data and AI. As data grows in complexity and volume, only Informatica's Intelligent Data Management Cloud™ delivers a complete, end-to-end platform with a suite of industry-leading, integrated solutions to connect, manage and unify data across any cloud, hybrid or multi-cloud environment. Powered by CLAIRE® AI, Informatica's platform integrates natively with all major cloud providers, data warehouses and analytics tools, giving organizations the freedom of choice, avoiding vendor lock-in and delivering better ROI by enabling access to governed data, simplifying operations and scaling with confidence. Trusted by 5,000+ customers in nearly 100 countries (including over 80 of the Fortune 100), Informatica is the backbone of platform-agnostic, cloud data-driven transformation.

Databricks says annualized revenue will reach $3.7 billion by next month

CNBC

12-06-2025


Databricks, a data analytics software vendor, said on Wednesday that it expects to generate $3.7 billion in annualized revenue by July, with year-over-year growth of 50%. CFO Dave Conte delivered the numbers at a briefing for investors and analysts tied to the company's Data and AI Summit in San Francisco on Wednesday. Growth in the October quarter was 60%, Databricks said in late 2024.

Databricks is one of the most highly valued tech startups, announcing in December that it raised $10 billion at a $62 billion valuation. Snowflake, its closest public market competitor, has a market cap of about $70 billion on annualized revenue of just over $4 billion, based on its latest quarter.

Conte didn't give any indication of when Databricks might file for an IPO. On Wednesday, fintech company Chime priced its IPO, and stablecoin issuer Circle started trading on the New York Stock Exchange last week.

Databricks had $2.6 billion in revenue in its fiscal year that ended in January, with a net retention rate exceeding 140%, unchanged from last year. In the first quarter of the new fiscal year, nearly 50 of Databricks' 15,000-plus customers were spending over $10 million annually, Conte said. "We want to combine good revenue growth and good product velocity with profitability," Conte said.

The company has roughly 8,000 employees. Earlier on Wednesday, Databricks CEO Ali Ghodsi said the company is hiring 3,000 people in 2025. Databricks was close to being free cash flow positive for the first time in the most recent fiscal year, Conte said.

In addition to Snowflake, competition also comes from cloud providers that sell their own data warehousing software. Also on Wednesday, Databricks announced a preview of Lakebase database software drawing on technology from its recent $1 billion acquisition of startup Neon. Lakebase stands to expand the size of Databricks' market opportunity, Conte said.

Databricks ranked third on CNBC's newly released 2025 Disruptor 50 list, behind only Anduril and OpenAI.
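The figures quoted above invite a rough comparison of valuation to annualized revenue. The short calculation below is a back-of-the-envelope sketch, not something the article reports. It rests on several assumptions: it pairs Databricks' December valuation with its July revenue target, rounds Snowflake's figures to the "about $70 billion" and "just over $4 billion" stated in the text, and reads the 50% growth rate as applying to the same annualized measure a year earlier.

```python
def revenue_multiple(valuation_bn, annualized_revenue_bn):
    """Valuation divided by annualized revenue, both in billions of dollars."""
    return valuation_bn / annualized_revenue_bn


# Figures as quoted in the article (in $ billions); Snowflake's are approximate.
databricks_multiple = revenue_multiple(62.0, 3.7)
snowflake_multiple = revenue_multiple(70.0, 4.0)

# If $3.7 billion reflects 50% year-over-year growth, the implied
# year-ago annualized revenue is 3.7 / 1.5.
implied_prior_year_bn = 3.7 / 1.5

print(f"Databricks: {databricks_multiple:.1f}x annualized revenue")
print(f"Snowflake:  {snowflake_multiple:.1f}x annualized revenue")
print(f"Implied year-ago annualized revenue: ${implied_prior_year_bn:.2f} billion")
```

On these assumptions the two companies trade at similar multiples, roughly 17 times annualized revenue, although the private valuation and the public market cap were set at different times and are not strictly comparable.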

Databricks launches Lakebase Postgres database for AI era

Techday NZ

12-06-2025


Databricks has launched Lakebase, a fully managed Postgres database designed specifically for artificial intelligence (AI) applications, and made it available in Public Preview. Lakebase integrates an operational database layer into Databricks' Data Intelligence Platform, with the goal of enabling developers and enterprises to build data applications and AI agents more efficiently on a single multi-cloud environment.

Purpose-built for AI workloads

Operational databases, commonly known as Online Transaction Processing (OLTP) systems, are fundamental to application development across industries. The market for these databases is estimated at over US$100 billion. However, many OLTP systems are based on architectures developed decades ago, which makes them challenging to manage, inflexible, and expensive. The current shift towards AI-driven applications has introduced new technical requirements, including the need for real-time data handling and scalable architecture that supports AI workloads at speed and scale.

Lakebase, which leverages Neon technology, delivers operational data to the lakehouse architecture, combining low-cost data storage with computing resources that automatically scale to meet workload requirements. This design allows for the convergence of operational and analytical systems, reducing latency for AI processes and offering enterprises current data for real-time decision-making.

"We've spent the past few years helping enterprises build AI apps and agents that can reason on their proprietary data with the Databricks Data Intelligence Platform," said Ali Ghodsi, Co-founder and CEO of Databricks. "Now, with Lakebase, we're creating a new category in the database market: a modern Postgres database, deeply integrated with the lakehouse and today's development stacks. As AI agents reshape how businesses operate, Fortune 500 companies are ready to replace outdated systems. With Lakebase, we're giving them a database built for the demands of the AI era."

Key features

Lakebase separates compute and storage, supporting independent scaling for diverse workloads. Its cloud-native architecture offers low latency (under 10 milliseconds), high concurrency (over 10,000 queries per second), and is designed for high-availability transactional operations. The service is built on Postgres, an open source database engine widely used by developers and supported by a rich ecosystem.

For AI workloads, Lakebase launches in under a second and operates on a consumption-based payment model, so users only pay for the resources they use. Branching capabilities allow developers to create copy-on-write database clones, supporting safe testing and experimentation by both humans and AI agents.

Lakebase automatically syncs data with lakehouse tables and provides an online feature store for machine learning model serving. It also integrates with other Databricks services, including Databricks Apps and Unity Catalog. The database is managed entirely by Databricks, with features such as encrypted data at rest, high availability, point-in-time recovery, and enterprise-grade compliance and security.

Market adoption and customer perspectives

According to the company, hundreds of enterprises participated in the Private Preview stage of Lakebase. Potential applications for the technology span sectors, from personalised product recommendations in retail to clinical trial workflow management in healthcare.

Jelle Van Etten, Head of Global Data Platform at Heineken, commented: "At Heineken, our goal is to become the best-connected brewer. To do that, we needed a way to unify all of our datasets to accelerate the path from data to value. Databricks has long been our foundation for analytics, creating insights such as product recommendations and supply chain enhancements. Our analytical data platform is now evolving to be an operational AI data platform and needs to deliver those insights to applications at low latency."

Anjan Kundavaram, Chief Product Officer at Fivetran, said: "Lakebase removes the operational burden of managing transactional databases. Our customers can focus on building applications instead of worrying about provisioning, tuning and scaling."

David Menninger, Executive Director at ISG Software Research, said: "Our research shows that the data and insights from analytical processes are the most critical data to enterprises' success. In order to act on that information, they must be able to incorporate it into operational processes via their business applications. These two worlds are no longer separate. By offering a Postgres-compatible, lakehouse-integrated system designed specifically for AI-native and analytical workloads, Databricks is giving customers a unified, developer-friendly stack that reduces complexity and accelerates innovation. This combination will help enterprises maximise the value they derive across their entire data estate, from storage to AI-enabled application deployment."

Integration and partner network

Lakebase is launching with support from a network of partners, including technology vendors and system integrators such as Accenture, Deloitte, Cloudflare, Informatica, Qlik, and Redis, among others. These partnerships are designed to ease data integration, enhance business intelligence, and support governance for customers as they adopt Lakebase as part of their operational infrastructure.

Lakebase is now available in Public Preview with further enhancements planned in the coming months. Customers can access the preview directly through their Databricks workspace.
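The copy-on-write branching mentioned under the key features can be pictured with a small in-memory analogy. The sketch below is not Lakebase's or Neon's implementation, which the article does not describe: it uses Python's standard `collections.ChainMap` as a stand-in, with invented order records, to show how a branch can be created without copying any rows and how writes on one branch stay invisible to the other.

```python
from collections import ChainMap

# Hypothetical table contents, keyed by primary key. After branching,
# this shared base layer is treated as read-only.
base = ChainMap({"order:1": "pending", "order:2": "shipped"})

# Branching copies no rows: each view is an empty dict layered over the base.
production = base.new_child()
experiment = base.new_child()

# Writes land only in the writer's own top layer.
experiment["order:1"] = "cancelled"
experiment["order:3"] = "pending"

print(production["order:1"])        # production still reads the shared value
print(experiment["order:1"])        # the experiment sees its own change
print("order:3" in production)      # new rows do not leak across branches
print(dict(experiment.maps[0]))     # only the changed keys are stored per branch
```

A storage engine applies the same principle at the level of pages rather than dictionary keys, which is why a branch can be cheap to create and safe to discard after a test or an agent-driven experiment.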
