Latest news with #ForrestNorrod


Channel Post MEA
7 days ago
- Business
- Channel Post MEA
Oracle And AMD Collaborate To Deliver Breakthrough Performance In AI Workloads
Oracle and AMD have announced that AMD Instinct MI355X GPUs will be available on Oracle Cloud Infrastructure (OCI), giving customers more choice and more than 2X better price-performance for large-scale AI training and inference workloads compared to the previous generation. Oracle will offer zettascale AI clusters accelerated by the latest AMD Instinct processors, with up to 131,072 MI355X GPUs, to enable customers to build, train, and run inference on AI models at scale.

"To support customers that are running the most demanding AI workloads in the cloud, we are dedicated to providing the broadest AI infrastructure offerings," said Mahesh Thiagarajan, executive vice president, Oracle Cloud Infrastructure. "AMD Instinct GPUs, paired with OCI's performance, advanced networking, flexibility, security, and scale, will help our customers meet their inference and training needs for AI workloads and new agentic applications."

To support new AI applications that require larger and more complex datasets, customers need AI compute solutions that are specifically designed for large-scale AI training. The zettascale OCI Supercluster with AMD Instinct MI355X GPUs meets this need by providing a high-throughput, ultra-low-latency RDMA cluster network architecture for up to 131,072 MI355X GPUs. AMD Instinct MI355X delivers nearly triple the compute power and a 50 percent increase in high-bandwidth memory over the previous generation.

"AMD and Oracle have a shared history of providing customers with open solutions to accommodate high performance, efficiency, and greater system design flexibility," said Forrest Norrod, executive vice president and general manager, Data Center Solutions Business Group, AMD. "The latest generation of AMD Instinct GPUs and Pollara NICs on OCI will help support new use cases in inference, fine-tuning, and training, offering more choice to customers as AI adoption grows."

AMD Instinct MI355X Coming to OCI

AMD Instinct MI355X-powered shapes are designed with superior value, cloud flexibility, and open-source compatibility, making them ideal for customers running today's largest language models and AI workloads. With AMD Instinct MI355X on OCI, customers will be able to benefit from:

- Significant performance boost: Helps customers increase performance for AI deployments with up to 2.8X higher throughput. To enable AI innovation at scale, customers can expect faster results, lower latency, and the ability to run larger AI workloads.
- Larger, faster memory: Allows customers to execute large models entirely in memory, enhancing inference and training speeds for models that require high memory bandwidth. The new shapes offer 288 gigabytes of high-bandwidth memory 3 (HBM3) and up to eight terabytes per second of memory bandwidth.
- New FP4 support: Allows customers to deploy modern large language and generative AI models cost-effectively with support for the new 4-bit floating point compute (FP4) standard, enabling ultra-efficient, high-speed inference (a toy illustration follows this list).
- Dense, liquid-cooled design: Enables customers to maximize performance density at 125 kilowatts per rack for demanding AI workloads. With 64 GPUs per rack at 1,400 watts each, customers can expect faster training times with higher throughput and lower latency.
- Built for production-scale training and inference: Supports customers deploying new agentic applications with a faster time-to-first-token (TTFT) and high tokens-per-second throughput. Customers can expect improved price-performance for both training and inference workloads.
- Powerful head node: Assists customers in optimizing their GPU performance by enabling efficient job orchestration and data processing with an AMD Turin high-frequency CPU with up to three terabytes of system memory.
- Open-source stack: Enables customers to leverage flexible architectures and easily migrate their existing code with no vendor lock-in through AMD ROCm, an open software stack that includes popular programming models, tools, compilers, libraries, and runtimes for AI and HPC solution development on AMD GPUs.
- Network innovation with AMD Pollara: Provides customers with advanced RoCE functionality that enables innovative network fabric designs. Oracle will be the first to deploy AMD Pollara AI NICs on backend networks, providing advanced RoCE functions such as programmable congestion control and support for open industry standards from the Ultra Ethernet Consortium (UEC) for high-performance, low-latency networking.
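The FP4 point is the least self-explanatory item above. As a rough illustration only, not Oracle's or AMD's implementation, the sketch below quantizes a block of weights to an E2M1-style 4-bit grid (1 sign, 2 exponent, 1 mantissa bit, the element layout defined by the OCP Microscaling spec, which is the format FP4 hardware support is generally assumed to target here) with one shared scale per block:

```python
# Toy sketch of 4-bit floating-point weight quantization. This is NOT
# Oracle's or AMD's FP4 implementation; it assumes an E2M1-style layout
# with one shared scale per 32-value block.
import numpy as np

# The eight non-negative magnitudes representable in E2M1.
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4_block(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Map a block of weights onto the E2M1 grid with one shared scale."""
    scale = max(float(np.abs(x).max()), 1e-12) / E2M1_GRID[-1]
    scaled = x / scale
    # Snap each magnitude to the nearest representable grid point.
    idx = np.abs(np.abs(scaled)[:, None] - E2M1_GRID[None, :]).argmin(axis=1)
    return np.sign(scaled) * E2M1_GRID[idx], scale

weights = np.random.randn(32).astype(np.float32)  # one 32-element block
codes, scale = quantize_fp4_block(weights)
restored = codes * scale
print("max abs error:", float(np.abs(weights - restored).max()))
```

Each block stores 32 four-bit codes plus one scale, roughly a 4X size reduction versus FP16 weights at the price of a very coarse value grid; that trade is what makes the "cost-effectively" claim in the bullet concrete.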


Techday NZ
13-06-2025
- Business
- Techday NZ
Oracle unveils AMD-powered zettascale AI cluster for OCI cloud
Oracle has announced it will be one of the first hyperscale cloud providers to offer artificial intelligence (AI) supercomputing powered by AMD's Instinct MI355X GPUs on Oracle Cloud Infrastructure (OCI). The forthcoming zettascale AI cluster is designed to scale up to 131,072 MI355X GPUs, specifically architected to support high-performance, production-grade AI training, inference, and new agentic workloads. The cluster is expected to offer over double the price-performance compared to the previous generation of hardware.

Expanded AI capabilities

The new announcement highlights several key hardware and performance enhancements. The MI355X-powered cluster provides 2.8 times higher throughput for AI workloads. Each GPU features 288 GB of high-bandwidth memory (HBM3) and eight terabytes per second (TB/s) of memory bandwidth, allowing larger models to be executed entirely in memory and boosting both inference and training speeds. The GPUs also support the FP4 compute standard, a four-bit floating point format that enables more efficient, high-speed inference for large language and generative AI models.

The cluster's infrastructure includes dense, liquid-cooled racks, each housing 64 GPUs and consuming up to 125 kilowatts to maximise performance density for demanding AI workloads. This marks the first deployment of AMD's Pollara AI NICs to enhance RDMA networking, offering next-generation high-performance, low-latency connectivity.

Mahesh Thiagarajan, Executive Vice President, Oracle Cloud Infrastructure, said: "To support customers that are running the most demanding AI workloads in the cloud, we are dedicated to providing the broadest AI infrastructure offerings. AMD Instinct GPUs, paired with OCI's performance, advanced networking, flexibility, security, and scale, will help our customers meet their inference and training needs for AI workloads and new agentic applications."

The zettascale OCI Supercluster with AMD Instinct MI355X GPUs delivers a high-throughput, ultra-low-latency RDMA cluster network architecture for up to 131,072 MI355X GPUs. AMD claims the MI355X provides almost three times the compute power and a 50 percent increase in high-bandwidth memory over its predecessor.

Performance and flexibility

Forrest Norrod, Executive Vice President and General Manager, Data Center Solutions Business Group, AMD, commented on the partnership, stating: "AMD and Oracle have a shared history of providing customers with open solutions to accommodate high performance, efficiency, and greater system design flexibility. The latest generation of AMD Instinct GPUs and Pollara NICs on OCI will help support new use cases in inference, fine-tuning, and training, offering more choice to customers as AI adoption grows."

The Oracle platform aims to support customers running the largest language models and diverse AI workloads. OCI users leveraging the MI355X-powered shapes can expect significant performance increases of up to 2.8 times greater throughput, resulting in faster results, lower latency, and the capability to run larger models. AMD's Instinct MI355X provides substantial memory and bandwidth enhancements, designed to enable both fast training and efficient inference for demanding AI applications. The new support for the FP4 format allows for cost-effective deployment of modern AI models, enhancing speed and reducing hardware requirements.
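Those two memory figures support a useful back-of-the-envelope bound. In the memory-bound decode phase of LLM inference, generating each token requires streaming roughly all model weights from memory once, so per-GPU throughput is capped near bandwidth divided by weight size. A minimal sketch under those standard roofline assumptions (the 70B-parameter model is an illustrative choice, not a figure from the article):

```python
# Back-of-the-envelope roofline bound, not a vendor benchmark: in the
# memory-bound decode phase of LLM inference, generating one token reads
# roughly all weights once, so tokens/s per GPU is capped near
# bandwidth / weight bytes. Hardware figures are those cited above.
HBM_CAPACITY_GB = 288     # per-GPU HBM3 capacity
HBM_BANDWIDTH_TBS = 8.0   # per-GPU memory bandwidth

MODEL_PARAMS_B = 70       # illustrative 70B-parameter model (an assumption)

for fmt, bytes_per_param in [("FP16", 2.0), ("FP8", 1.0), ("FP4", 0.5)]:
    weight_gb = MODEL_PARAMS_B * bytes_per_param      # resident weights, GB
    ceiling = HBM_BANDWIDTH_TBS * 1000 / weight_gb    # tokens/s upper bound
    print(f"{fmt}: {weight_gb:.0f} GB of weights, "
          f"fits on one GPU: {weight_gb <= HBM_CAPACITY_GB}, "
          f"decode ceiling ~{ceiling:.0f} tok/s")
```

Halving bytes per parameter roughly doubles the decode ceiling and keeps larger models resident in a single GPU's memory, which is the mechanism behind pairing the FP4 point with "reducing hardware requirements".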
The dense, liquid-cooled infrastructure supports 64 GPUs per rack, each operating at up to 1,400 watts, and is engineered to optimise training times and throughput while reducing latency. A powerful head node, equipped with an AMD Turin high-frequency CPU and up to 3 TB of system memory, is included to help users maximise GPU performance via efficient job orchestration and data processing.

Open-source and network advances

AMD emphasises broad compatibility and customer flexibility through the inclusion of its open-source ROCm stack. This allows customers to use flexible architectures and reuse existing code without vendor lock-in, with ROCm encompassing popular programming models, tools, compilers, libraries, and runtimes for AI and high-performance computing development on AMD hardware.

Network infrastructure for the new supercluster will feature AMD's Pollara AI NICs, which provide advanced RDMA over Converged Ethernet (RoCE) features, programmable congestion control, and support for open standards from the Ultra Ethernet Consortium to facilitate low-latency, high-performance connectivity among large numbers of GPUs.

The new Oracle-AMD collaboration is expected to give organisations enhanced capacity to run complex AI models, speed up inference times, and scale up production-grade AI workloads economically and efficiently.
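On the ROCm point, the "reuse existing code" claim usually shows up in practice as device-agnostic framework code. A minimal sketch, assuming a ROCm build of PyTorch, which exposes AMD GPUs through the familiar torch.cuda namespace via HIP:

```python
# Device-agnostic PyTorch sketch. On ROCm builds of PyTorch the
# torch.cuda API is backed by AMD GPUs via HIP, so code written this
# way runs unchanged on NVIDIA or AMD hardware. Illustrative only.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(4096, 4096).to(device)
x = torch.randn(8, 4096, device=device)

with torch.no_grad():
    y = model(x)
print(y.shape, "ran on", device)
```

The same file runs on CUDA or ROCm installations without modification, which is the substance of the no-lock-in argument.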


Techday NZ
11-06-2025
- Science
- Techday NZ
AMD supercomputers lead Top500 rankings with record exaflops
El Capitan and Frontier, both powered by AMD processors and accelerators, have retained the top two positions on the latest Top500 list of the world's most powerful supercomputers.

Supercomputing leadership

The recently released Top500 rankings show that El Capitan, based at Lawrence Livermore National Laboratory, remains the fastest system globally, registering a High Performance Linpack (HPL) score of 1.742 exaflops. Frontier, situated at Oak Ridge National Laboratory, holds the second position with an HPL result of 1.353 exaflops. Both supercomputers were constructed by HPE and use AMD hardware at their core. El Capitan uses AMD Instinct MI300A accelerated processing units (APUs), which integrate CPU and GPU functionality within a single package, to support large-scale artificial intelligence and scientific workloads. Frontier pairs AMD EPYC CPUs with AMD Instinct MI250X GPUs for a variety of advanced computational research needs, including modelling in energy, climate, and next-generation artificial intelligence.

Broader AMD presence

AMD technologies now underpin 172 of the 500 systems on the latest Top500 list, more than a third of all the high-performance systems measured. Notably, 17 new systems joined the list this year running on AMD processors, five of which use the latest 5th Gen AMD EPYC architecture. The expanded presence spans institutions such as the University of Stuttgart's High-Performance Computing Center, where the Hunter system is powered by AMD Instinct MI300A APUs; the University of Hull's Viper supercomputer; and Italy's new EUROfusion Pitagora system at CINECA, powered by 5th Gen AMD EPYC CPUs.

Performance and efficiency

In addition to sheer computational power, AMD's showing on the Top500 list extends to energy efficiency. According to the most recent Green500 list, 12 of the 20 most energy-efficient supercomputers globally use AMD EPYC processors and AMD Instinct accelerators. El Capitan and Frontier ranked 26th and 32nd respectively on the Green500 index, reflecting strong performance per watt given their computing output. The picture was echoed in other benchmarks. On the HPL-MxP test, which measures the mixed-precision computing suited to artificial intelligence workloads, El Capitan debuted at the top with 16.7 exaflops, with Frontier in third place and LUMI, another AMD-based system, in fourth. On the HPCG (High-Performance Conjugate Gradient) test, a complementary benchmark for scientific applications, El Capitan posted the highest score of 17.4 petaflops, a result that highlights the memory bandwidth enabled by the Instinct MI300A architecture.

Institutional perspectives

"From El Capitan to Frontier, AMD continues to power the world's most advanced supercomputers, delivering record-breaking performance and leadership energy efficiency," said Forrest Norrod, Executive Vice President and General Manager, Data Center Solutions Group, AMD. "With the latest Top500 list, AMD not only holds the top two spots but now powers 172 of the world's fastest systems, more than ever before, underscoring our accelerating momentum and the trust HPC leaders place in our CPUs and GPUs to drive scientific discovery and AI innovation."
Rob Neely, Associate Director for Weapon Simulation and Computing at Lawrence Livermore National Laboratory, described the impact of El Capitan: "El Capitan is a transformative national resource that will dramatically expand the computational capabilities of the NNSA labs at Livermore, Los Alamos and Sandia in support of our national security and science missions. With AMD's advanced APU architecture, we can now perform simulations with the precision and confidence we set as a goal 15 years ago, when the path to exascale was difficult to foresee. As a bonus, this platform is a true 'two-fer': an HPC and AI powerhouse that will fundamentally reshape how we fulfill our mission."

Future direction

The distinction on the Top500 and Green500 lists coincides with a broader shift within high-performance computing, as artificial intelligence and traditional HPC workloads increasingly converge. AMD's presence in the sector demonstrates demand for scalable, efficient compute platforms amid growing power requirements for data-intensive scientific and industrial workloads. The results also reflect a portfolio spanning CPUs, GPUs, and APUs that accelerates developments in domains ranging from nuclear safety and climate modelling to training large language models and generative artificial intelligence inference.
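The spread between El Capitan's three benchmark scores quoted above is itself informative: HPL measures dense double-precision compute, HPL-MxP measures the mixed-precision compute AI workloads use, and HPCG stresses memory bandwidth. Checking the ratios from the reported figures:

```python
# Quick arithmetic on the benchmark figures quoted in the article.
hpl_exaflops = 1.742      # El Capitan, double-precision HPL
hpl_mxp_exaflops = 16.7   # El Capitan, mixed-precision HPL-MxP
hpcg_petaflops = 17.4     # El Capitan, HPCG

print(f"Mixed-precision speedup over FP64 HPL: "
      f"{hpl_mxp_exaflops / hpl_exaflops:.1f}x")
# HPCG stresses memory bandwidth rather than dense compute, which is why
# its score sits three orders of magnitude below HPL on the same machine.
print(f"HPCG as a fraction of HPL: "
      f"{hpcg_petaflops / (hpl_exaflops * 1000):.1%}")
```

The roughly 9.6x mixed-precision uplift on identical hardware is the mechanism behind the convergence of AI and traditional HPC workloads that the article describes.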
Yahoo
20-05-2025
- Business
- Yahoo
AMD to offload ZT Systems server-manufacturing for $3bn
Advanced Micro Devices (AMD) has agreed to divest ZT Systems' data centre infrastructure manufacturing business to Sanmina for up to $3bn. The deal positions Sanmina as a preferred partner for the manufacturing of AMD's cloud rack and cluster-scale AI solutions.

Under the deal terms, Sanmina will take over ZT Systems' operations for $2.25bn in cash plus a $300m premium split between cash and equity. The deal also includes a contingent consideration of $450m based on the financial performance of the business over the next three years. With the acquisition, Sanmina expects to increase its scale and end-market exposure to cloud and AI infrastructure. It also anticipates bolstering its offering of end-to-end component technology, systems integration, and supply chain solutions.

Sanmina chairman and CEO Jure Sola said: "ZT Systems' liquid cooling capabilities, high-quality manufacturing capacity and significant cloud and AI infrastructure experience are the perfect complement to Sanmina's global portfolio, mission-critical technologies and vertical integration capabilities. Together, we will be better able to deliver a competitive advantage to our customers with solutions for the entire product lifecycle. We look forward to our ongoing partnership with AMD as we work together to set the standard for quality and flexibility to benefit the entire AI ecosystem."

Subject to regulatory approval and standard closing conditions, the transaction is scheduled for completion by late 2025. AMD, however, will retain ZT Systems' design and customer support teams.

AMD Data Center Solutions business unit executive vice-president and general manager Forrest Norrod said: "We look forward to working with Sanmina as our preferred NPI manufacturing partner. This agreement will help accelerate the US-based manufacturing of AMD AI end-to-end training and inference solutions, which are optimised for our customers' unique environments, ready to deploy at scale and based on our open approach. Together, we will accelerate time-to-market and set the standard for quality and flexibility to benefit the entire AI ecosystem."

Morgan Stanley is advising AMD financially, while Latham & Watkins is providing legal counsel. The sale follows AMD's $4.9bn deal to acquire ZT Systems, agreed in 2024 with the stated intention of selling the server-manufacturing business once the acquisition completed.

"AMD to offload ZT Systems server-manufacturing for $3bn" was originally created and published by Verdict, a GlobalData owned brand.
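The "up to $3bn" headline figure follows directly from the deal structure described above: only part of the total is guaranteed at close, with the remainder contingent on the three-year earn-out. A quick check of the components:

```python
# The deal components reported above, in millions of US dollars.
cash_at_close = 2250   # upfront cash
premium = 300          # split between cash and equity
contingent = 450       # earn-out over the next three years

guaranteed = cash_at_close + premium
total = guaranteed + contingent
print(f"Guaranteed at close: ${guaranteed:,}M; "
      f"up to ${total:,}M if the full earn-out is achieved")
# -> Guaranteed at close: $2,550M; up to $3,000M if the full earn-out is achieved
```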

Yahoo
20-05-2025
- Business
- Yahoo
AMD strikes a deal to sell ZT Systems' server-manufacturing business for $3B
Semiconductor giant AMD followed through with its plan to spin out ZT Systems' server-manufacturing business. AMD announced on Monday that it was selling ZT Systems' server-manufacturing business to electronic manufacturing services company Sanmina.

The $3 billion deal is a mix of cash and stock: $2.25 billion in cash; a $300 million premium, split 50% cash and 50% equity; and a $450 million contingent payment based on financial performance over the next three years, according to Reuters. The deal is expected to close by the end of 2025, subject to regulatory approval. After this divestiture, AMD will maintain control of ZT Systems' rack-scale AI solutions design business.

This announcement isn't a shock. When AMD announced its intent to acquire ZT Systems, an AI and cloud infrastructure company, for $4.9 billion in August 2024, the company said at the time that it planned to divest that part of ZT Systems' business after the deal formally closed. AMD's acquisition of ZT Systems officially closed in March 2025, according to Reuters.

Alongside this announcement, AMD said that Sanmina will become a "preferred" new product introduction (NPI) manufacturing partner for AMD cloud rack and cluster-scale AI solutions. "By combining the deep experience of our AI systems design team with our new preferred NPI partnership with Sanmina, we expect to strengthen our U.S.-based manufacturing capabilities for rack and cluster-scale AI systems and accelerate quality and time-to-market for our cloud customers," said Forrest Norrod, executive vice president and general manager, data center solutions business unit at AMD, in the company's announcement.

TechCrunch reached out to AMD for comment. This article originally appeared on TechCrunch.