Atlas Cloud Launches High-Efficiency AI Inference Platform, Outperforming DeepSeek
Developed with SGLang, Atlas Inference surpasses leading AI companies in throughput and cost, running DeepSeek V3 & R1 faster than DeepSeek itself.
NEW YORK CITY, NEW YORK / ACCESS Newswire / May 28, 2025 / Atlas Cloud, the all-in-one AI competency center for training and deploying AI models, today announced the launch of Atlas Inference, an AI inference platform that dramatically reduces GPU and server requirements, enabling faster, more cost-effective deployment of large language models (LLMs).
Atlas Inference, co-developed with SGLang, an AI inference engine, maximizes GPU efficiency by processing more tokens per second on less hardware. In a comparison against DeepSeek's published performance results, Atlas Inference's 12-node H100 cluster outperformed DeepSeek's reference implementation of the DeepSeek-V3 model while using only two-thirds as many servers. The platform reduces infrastructure requirements and operational costs by targeting hardware spend, which can represent up to 80% of AI operational expenses.
"We built Atlas Inference to fundamentally break down the economics of AI deployment," said Jerry Tang, Atlas CEO. "Our platform's ability to process 54,500 input tokens and 22,500 output tokens per second per node means businesses can finally make high-volume LLM services profitable instead of merely break-even. I believe this will have a significant ripple effect throughout the industry. Simply put, we're surpassing industry standards set by hyperscalers by delivering superior throughput with fewer resources."
Atlas Inference also outperforms offerings from major players such as Amazon, NVIDIA, and Microsoft, delivering up to 2.1 times greater throughput with 12 nodes than competitors' larger setups. It maintains sub-5-second first-token latency and 100-millisecond inter-token latency with more than 10,000 concurrent sessions, ensuring a consistent, superior experience at scale. The platform's performance is driven by four key innovations:
- Prefill/Decode Disaggregation: separates compute-intensive operations from memory-bound processes to optimize efficiency (illustrated in the sketch below this list)
- DeepExpert (DeepEP) Parallelism with Load Balancers: ensures over 90% GPU utilization
- Two-Batch Overlap Technology: increases throughput by enabling larger batches and overlapping the compute and communication phases
- DisposableTensor Memory Models: prevents crashes during long sequences for reliable operation
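To make the first item concrete, here is a minimal, hypothetical Python sketch of the prefill/decode disaggregation pattern: one worker handles the compute-bound prompt (prefill) phase and hands finished requests to a separate worker for the memory-bound, token-by-token decode phase. The Request class, queues, and placeholder processing are invented for illustration and are not Atlas Inference's or SGLang's actual implementation; in a real system each side would run on its own GPU pool, with load balancing and batch overlap layered on top.

```python
# Illustrative only: a toy pipeline showing the idea of prefill/decode
# disaggregation. All names here are hypothetical, not SGLang/Atlas code.
import queue
import threading
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt: str
    max_new_tokens: int = 4
    kv_cache: list = field(default_factory=list)  # stand-in for a real KV cache
    output: list = field(default_factory=list)

def prefill_worker(prefill_q, decode_q):
    """Compute-intensive phase: process the whole prompt once, build the KV cache."""
    while (req := prefill_q.get()) is not None:
        req.kv_cache = req.prompt.split()          # placeholder for prompt processing
        decode_q.put(req)                          # hand off to the decode side
    decode_q.put(None)                             # propagate shutdown signal

def decode_worker(decode_q, done_q):
    """Memory-bound phase: generate tokens step by step from the cached state."""
    while (req := decode_q.get()) is not None:
        for i in range(req.max_new_tokens):
            req.output.append(f"<tok{i}>")         # placeholder for one decode step
        done_q.put(req)
    done_q.put(None)

if __name__ == "__main__":
    prefill_q, decode_q, done_q = queue.Queue(), queue.Queue(), queue.Queue()
    threading.Thread(target=prefill_worker, args=(prefill_q, decode_q)).start()
    threading.Thread(target=decode_worker, args=(decode_q, done_q)).start()

    for prompt in ["explain KV caches", "summarize this press release"]:
        prefill_q.put(Request(prompt))
    prefill_q.put(None)                            # no more requests

    while (req := done_q.get()) is not None:
        print(req.prompt, "->", " ".join(req.output))
```

Separating the two phases matters because prefill saturates compute while decode is limited by memory bandwidth; running them on dedicated resources keeps both busy instead of letting one stall the other.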
"This platform represents a significant leap forward for AI inference," said Yineng Zhang, Core Developer at SGLang. "What we built here may become the new standard for GPU utilization and latency management. We believe this will unlock capabilities previously out of reach for the majority of the industry regarding throughput and efficiency."
With lower cost per token, linear scaling behavior, and reduced emissions compared to leading vendors, Atlas Inference provides a cost-efficient and scalable foundation for AI deployment.
Atlas Inference works with standard hardware and supports custom models, giving customers complete flexibility. Teams can upload fine-tuned models and keep them isolated on dedicated GPUs, making the platform ideal for organizations requiring brand-specific voice or domain expertise.
The platform is available immediately for enterprise customers and early-stage startups.
About Atlas Cloud
Atlas Cloud is your all-in-one AI competency center, powering leading AI teams with safe, simple, and scalable infrastructure for training and deploying models. Atlas Cloud also offers an on-demand GPU platform that delivers fast, serverless compute. Backed by Dell, HPE, and Supermicro, Atlas delivers near-instant access to up to 5,000 GPUs across a global SuperCloud fabric with 99% uptime and baked-in compliance. Learn more at atlascloud.ai.
SOURCE: Atlas Cloud