Nvidia Dynamo — Next-Gen AI Inference Server For Enterprises

Forbes, 25-03-2025

Dynamo Inference Server
At the GTC 2025 conference, Nvidia introduced Dynamo, a new open-source AI inference server designed to serve the latest generation of large AI models at scale. Dynamo is the successor to Nvidia's widely used Triton Inference Server and represents a strategic leap in Nvidia's AI stack. It is built to orchestrate AI model inference across massive GPU fleets with high efficiency, enabling what Nvidia calls AI factories to generate insights and responses faster and at a lower cost.
This article provides a technical overview of Dynamo's architecture, its features and the value it offers enterprises.
At its core, Dynamo is a high-throughput, low-latency inference-serving framework for deploying generative AI and reasoning models in distributed environments. It integrates into Nvidia's full-stack AI platform as the operating system of AI factories, connecting advanced GPUs, networking, and software to enhance inference performance.
Nvidia's CEO Jensen Huang emphasized Dynamo's significance by comparing it to the dynamos of the Industrial Revolution—a catalyst that converts one form of energy into another—except here, it converts raw GPU compute into valuable AI model outputs at an unparalleled scale.
Dynamo aligns with Nvidia's strategy of providing end-to-end AI infrastructure. It has been built to complement Nvidia's new Blackwell GPU architecture and AI data center solutions. For example, Blackwell Ultra systems provide the immense compute and memory for AI reasoning, while Dynamo provides the intelligence to utilize those resources efficiently.
Dynamo is fully open source, continuing Nvidia's open approach to AI software. It supports popular AI frameworks and inference engines, including PyTorch, SGLang, Nvidia's TensorRT-LLM and vLLM. This broad compatibility means enterprises and startups can adopt Dynamo without rebuilding their models from scratch. It seamlessly integrates with existing AI workflows. Major cloud and technology providers like AWS, Google Cloud, Microsoft Azure, Dell, Meta and others are already planning to integrate or support Dynamo, underscoring its strategic importance across the industry.
Dynamo is designed from the ground up to serve the latest reasoning models, such as DeepSeek R1. Serving LLMs and highly capable reasoning models efficiently requires new approaches beyond what earlier inference servers provided.
Dynamo introduces several key innovations in its architecture to meet these needs:
Dynamic GPU Planner: Dynamically adds or removes GPU workers based on real-time demand, preventing over-provisioning or underutilization of hardware. In practice, this means if user requests spike, Dynamo can temporarily allocate more GPUs to handle the load, then scale back, optimizing utilization and cost.
LLM-Aware Smart Router: Intelligently routes incoming AI requests across a large GPU cluster to avoid redundant computations. It keeps track of what each GPU holds in its KV cache (the part of memory storing the keys and values for recent model context) and sends each query to the GPU node best primed to handle it. This context-aware routing avoids repeatedly recomputing the same content and frees up capacity for new requests.
Low-Latency Communication Library (NIXL): Provides state-of-the-art, accelerated GPU-to-GPU data transfer and messaging, abstracting away the complexity of moving data across thousands of nodes. By reducing communication overhead and latency, this layer ensures that splitting work across many GPUs doesn't become a bottleneck. It works across different interconnects and networking setups, so enterprises can benefit whether they use ultra-fast NVLink, InfiniBand, or Ethernet clusters.
Distributed Memory (KV) Manager: Offloads and reloads inference data (particularly 'keys and values' cache data from prior token generation) to lower-cost memory or storage tiers when appropriate. This means less critical data can reside in system memory or even on disk, cutting expensive GPU memory usage, yet be quickly retrieved when needed. The result is higher throughput and lower cost without impacting the user experience.
Disaggregated Serving: Traditional LLM serving performs all inference steps (from processing the prompt to generating the response) on the same GPU or node, which often leaves resources underutilized. Dynamo instead splits these steps into a prefill stage that interprets the input and a decode stage that produces the output tokens, and the two stages can run on different sets of GPUs.
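The routing idea in the list above can be illustrated with a small sketch. This is not Dynamo's API or its actual algorithm. The `Worker` class, the `route` and `handle` functions, and the tie-breaking rule are all hypothetical names and choices made for illustration. The sketch assumes each worker remembers the token sequences it has already processed, and it sends a new request to the worker whose cache shares the longest prefix with that request, preferring the least busy worker on a tie.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical GPU worker that remembers token sequences it has processed."""
    name: str
    cached_prefixes: list[tuple[int, ...]] = field(default_factory=list)
    active_requests: int = 0


def shared_prefix_len(a: tuple[int, ...], b: tuple[int, ...]) -> int:
    """Count how many leading tokens two sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def best_overlap(worker: Worker, tokens: tuple[int, ...]) -> int:
    """Longest prefix of `tokens` that is already cached on this worker."""
    return max(
        (shared_prefix_len(p, tokens) for p in worker.cached_prefixes),
        default=0,
    )


def route(workers: list[Worker], tokens: tuple[int, ...]) -> Worker:
    """Pick the worker with the most cached overlap; on a tie, the least busy one."""
    return max(workers, key=lambda w: (best_overlap(w, tokens), -w.active_requests))


def handle(workers: list[Worker], tokens: tuple[int, ...]) -> tuple[str, int]:
    """Route a request, then record it in the chosen worker's cache.

    Returns the worker name and the number of tokens whose work could be reused.
    """
    worker = route(workers, tokens)
    reused = best_overlap(worker, tokens)
    worker.cached_prefixes.append(tuple(tokens))
    worker.active_requests += 1
    return worker.name, reused


if __name__ == "__main__":
    pool = [Worker("gpu-0"), Worker("gpu-1")]
    system_prompt = (1, 2, 3, 4)  # tokens shared by many requests
    print(handle(pool, system_prompt + (10, 11)))  # nothing cached yet
    print(handle(pool, system_prompt + (20,)))     # shares the system prompt
    print(handle(pool, (99, 98)))                  # unrelated request
```

In this toy setup, the second request lands on the worker that already processed the shared system prompt, so four tokens of work are reused, while the unrelated third request goes to the idle worker. A production router would also have to account for cache eviction, memory pressure and request completion, none of which is modeled here.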
As AI reasoning models become mainstream, Dynamo represents a critical infrastructure layer for enterprises looking to deploy these capabilities efficiently. Dynamo revolutionizes inference economics by enhancing speed, scalability and affordability, allowing organizations to provide advanced AI experiences without a proportional rise in infrastructure costs.
For CXOs prioritizing AI initiatives, Dynamo offers a pathway to both immediate operational efficiencies and longer-term strategic advantages in an increasingly AI-driven competitive landscape.


