
Latest news with #GPT-3

How India Inc. can reduce energy demand with edge computing

Time of India

5 days ago


Recently, AI-generated, Ghibli-style images took the internet by storm. But behind their charm lies an invisible environmental cost: water and energy consumption. Even Sam Altman, CEO of OpenAI, acknowledged the toll, tweeting: 'It's super fun seeing people love images in ChatGPT… But our GPUs are melting.'

As artificial intelligence continues to evolve, the energy demands of its infrastructure are becoming a growing concern. Traditional AI relies heavily on massive, centralized data centers operating round the clock. These facilities, packed with thousands of servers running complex computations, also consume enormous energy for cooling to prevent overheating. Currently, data centers account for roughly 2% of global electricity use, a number poised to rise as AI models become more complex. For perspective, training a single advanced model like GPT-3 can use as much electricity as several hundred homes consume in a year.

So, the million-dollar question is: how can we continue to harness AI's potential while curbing its environmental impact? One of the most promising answers lies in edge computing. Edge computing processes data closer to where it's generated, on devices such as smartphones, IoT sensors, and embedded systems, rather than routing everything through centralized cloud data centers. This shift cuts down on transmission energy and reduces dependence on cloud infrastructure, making AI deployments significantly more energy efficient. This article explores why Indian enterprises must embrace edge AI to curb energy usage, and how advancements in chip design are driving more sustainable, local AI processing.

Centralized Data Centers: A Growing Energy Challenge

By 2026, data center electricity consumption is expected to exceed 1,000 terawatt-hours, roughly equivalent to Japan's entire electricity demand. The explosion of data centers is straining global power grids.
Beyond computation, these facilities require constant cooling, often powered by fossil fuels, contributing to rising carbon emissions and climate risk. Even with increased investment in renewables, the pace may not be enough to keep up with AI's surging energy needs.

How edge computing reduces energy use

Edge computing decentralizes workloads by processing data at the edge of the network or directly on devices. This reduces the burden on cloud infrastructure and lowers overall energy consumption. Instead of continuously streaming data to remote servers, edge devices process data locally and send only essential insights. For example, an edge-enabled surveillance system can analyze footage in real time and transmit only alerts or key clips, saving substantial energy otherwise spent on transmission and storage. Additionally, local processing reduces idle time caused by round trips to the cloud, further boosting energy efficiency.

Energy-efficient chipsets powering the edge

A new wave of energy-efficient AI chipsets and microcontrollers is enabling powerful edge applications, from wearables and autonomous systems to smart homes and industrial automation. These chips are purpose-built for high-efficiency AI processing, integrating features like neural accelerators and micro-NPUs in compact, low-power formats. Optimized for tasks such as vision recognition, audio sensing, and real-time decision-making, these chipsets bring intelligence directly to the device. Techniques like adaptive power scaling, heterogeneous computing, and low-precision AI operations allow them to balance performance and energy efficiency, resulting in faster processing, lower memory usage, and longer battery life. With built-in security features and compatibility across ecosystems, these chipsets are simplifying deployment of scalable, secure AI at the edge.
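The surveillance example above captures the core pattern: run inference on the device and transmit only the insights worth sending. Here is a minimal sketch of that pattern, with made-up frame scores and a hypothetical 0.8 alert threshold standing in for a real on-device model:

```python
# Edge-processing sketch: analyze data locally, transmit only alerts.
# The scores and the 0.8 threshold are illustrative assumptions, not
# values from any real surveillance system.

ALERT_THRESHOLD = 0.8  # hypothetical confidence cutoff for "worth transmitting"

def analyze_frame(score: float) -> bool:
    """Stand-in for an on-device model scoring one frame."""
    return score >= ALERT_THRESHOLD

def process_stream(frame_scores):
    """Run local analysis and return only the (index, score) pairs to upload."""
    return [(i, s) for i, s in enumerate(frame_scores) if analyze_frame(s)]

# 1,000 frames, only two of which cross the threshold: the device uploads
# 2 small alerts instead of streaming 1,000 frames to a data center.
scores = [0.1] * 500 + [0.9] + [0.2] * 498 + [0.85]
alerts = process_stream(scores)
print(f"{len(scores)} frames analyzed locally, {len(alerts)} alerts sent")
```

The energy saving comes from the ratio between those two numbers: transmission and cloud-side storage scale with the alerts, not with the raw stream.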
Techniques for building edge AI models

Edge AI models are designed to work within the limited power and resource constraints of edge devices, enabling real-time and accurate data processing. Key techniques include:

1. Model Compression and Simplification: Using quantization (reducing calculation precision) and pruning (removing unnecessary neural connections), developers can significantly shrink models. These lightweight versions consume less memory and power without sacrificing accuracy.

2. Streamlined Architectures: Models like MobileNet for image recognition and TinyBERT for language tasks are built specifically for constrained devices, balancing low power consumption with performance.

3. Leveraging Pre-Trained Models: Platforms offering pre-trained models that can be fine-tuned for specific use cases enable businesses to integrate AI more efficiently. Embedding these models directly into chipsets allows for faster deployment of AI solutions with lower energy consumption, even without deep AI expertise. This minimizes the need for extensive customization and shortens go-to-market timelines. For silicon vendors, offering chips with an ecosystem of ready-to-deploy models adds significant value. A chip preloaded with AI capabilities lets customers bypass development hurdles and start immediately.

Overcoming challenges in edge AI

Despite its advantages, edge AI must overcome a few key hurdles to scale effectively:

1. Hardware Constraints: Edge devices lack the compute, memory, and storage of cloud servers. Addressing this demands continuous innovation in low-power, high-performance chip design.

2. Managing Complex Edge Ecosystems: The decentralized nature of edge computing means managing a vast network of devices. As IoT adoption grows, robust frameworks and tools are essential for coordination and scalability.

3. Ensuring Security: With sensitive data processed locally, security becomes non-negotiable.
Techniques like secure boot, data encryption, and regular firmware updates are essential to maintaining trust and safeguarding information.

Conclusion

As AI becomes increasingly embedded in everyday life, sustainability must be at the forefront. Edge computing offers a powerful solution by moving intelligence closer to the data source, cutting energy use and easing pressure on central infrastructure. For India Inc., edge AI is more than just a trend: it's a strategic imperative. To align with the Net Zero Scenario, emissions must fall by 50% by 2030. Edge AI is paving the way for smarter, greener, and more responsive solutions, and the time to act is now.
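To make the compression techniques above concrete, here is a minimal sketch of symmetric 8-bit post-training quantization and magnitude pruning. It assumes NumPy is available; the toy weight matrix, the 50% sparsity target, and the function names are invented for illustration, and production toolchains apply these ideas with far more care:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric post-training quantization: float32 weights -> int8 + scale."""
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale  # int8 storage is 4x smaller than float32

def dequantize(q, scale):
    """Recover approximate float weights for inference."""
    return q.astype(np.float32) * scale

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)  # toy "layer"

q, scale = quantize_int8(w)
max_error = float(np.max(np.abs(dequantize(q, scale) - w)))
pruned = magnitude_prune(w, 0.5)  # roughly half the weights become zero

print(f"max quantization error: {max_error:.4f} (scale {scale:.4f})")
print(f"pruned sparsity: {float(np.mean(pruned == 0.0)):.2f}")
```

The quantized layer stores one scale factor plus int8 weights, a 4x memory reduction before any pruning; the zeroed weights can then be skipped or stored sparsely on the device.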

Sam Altman's Lies About ChatGPT Are Growing Bolder

Gizmodo

11-06-2025


The AI brain rot in Silicon Valley manifests in many varieties. For OpenAI's figurehead Sam Altman, this often results in a lot of vague talk about artificial intelligence as the panacea to all of the world's woes. Altman's gaslighting reached new heights this week as he cited wildly deflated numbers for OpenAI's water and electricity usage compared to numerous past studies.

In a Tuesday blog post, Altman cited internal figures for how much energy and water a single ChatGPT query uses. The OpenAI CEO claimed a single prompt requires around 0.34 Wh, equivalent to what 'a high-efficiency lightbulb would use in a couple of minutes.' As for cooling the data centers used to process AI queries, Altman suggested a student asking ChatGPT to do their essay for them requires '0.000085 gallons of water, roughly one-fifteenth of a teaspoon.' Altman did not offer any evidence for these claims and failed to mention where his data comes from. Gizmodo reached out to OpenAI for comment, but we did not hear back.

If we take the AI monger at his word, we need only do some simple math to check how much water that actually is. OpenAI has claimed that, as of December 2024, ChatGPT had 300 million weekly active users generating 1 billion messages per day. Based on the company's and Altman's own metrics, that would mean the chatbot uses 85,000 gallons of water per day, or a little more than 31 million gallons per year. ChatGPT is hosted on Microsoft data centers, which already use quite a lot of water. The tech giant has plans for 'closed-loop' centers that don't use extra water for cooling, but these projects won't be piloted for at least another year.

'Fresh numbers shared by @sama earlier today: 300M weekly active ChatGPT users, 1B user messages sent on ChatGPT every day, 1.3M devs have built on OpenAI in the US.' (OpenAI Newsroom, @OpenAINewsroom, December 4, 2024)

These data centers were already water- and power-hungry before the advent of generative AI.
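Taking the figures above at face value, the arithmetic is simple to reproduce: Altman's claimed 0.000085 gallons per query multiplied by OpenAI's stated 1 billion messages per day.

```python
# Figures cited above: Altman's claimed water use per query and
# OpenAI's stated daily message volume.
GALLONS_PER_QUERY = 0.000085
MESSAGES_PER_DAY = 1_000_000_000

gallons_per_day = GALLONS_PER_QUERY * MESSAGES_PER_DAY
gallons_per_year = gallons_per_day * 365

print(f"{gallons_per_day:,.0f} gallons per day")    # 85,000
print(f"{gallons_per_year:,.0f} gallons per year")  # a little over 31 million
```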
For Microsoft, water use spiked from 2021 to 2022, after the tech giant formulated a deal with OpenAI. A study from University of California researchers published in late 2023 claimed the older GPT-3 version of ChatGPT drank about 0.5 liters for every 10 to 50 queries. If you take that data at its most optimistic, OpenAI's older model would be using 31 million liters of water per day, or 8.18 million gallons. And that's for an older model, not today's current, much more powerful (and far more demanding) GPT-4.1 plus its o3 reasoning model.

The size of the model impacts how much energy it uses. There have been multiple studies about the environmental impact of training these models, and since they continuously have to be retrained as they grow more advanced, the electricity cost will continue to escalate. Altman's figures don't mention which queries are formulated through its multiple different ChatGPT products, including the most advanced $200-a-month subscription that grants access to o1 Pro Mode. They also ignore the fact that AI images require much more energy to process than text queries.

Altman's entire post is full of big tech optimism shrouded in talking points that make little to no sense. He claims that data center production will be 'automated,' so the cost of AI 'should eventually converge to near the cost of electricity.' If we are charitable and assume Altman is suggesting that the expansion of AI will somehow offset the electricity necessary to run it, we're still left holding today's bag and dealing with rising global temperatures. Multiple companies have tried to solve the water and electricity issue with AI, with some landing on plans to throw data centers into the ocean or build nuclear power plants just to supply AI with the necessary electricity. Long before any nuclear plant can be built, these companies will continue to burn fossil fuels. The OpenAI CEO's entire blog is an encapsulation of bullheaded big tech oligarch thinking.
He said that 'entire classes of jobs' will go the way of the dodo, but that it doesn't matter, since 'the world will be getting so much richer so quickly that we'll be able to seriously entertain new policy ideas we never could before.' Altman and other tech oligarchs have suggested we finally embrace universal basic income as a way of offsetting the impact of AI. OpenAI knows it won't work. He has never been serious enough about that idea to stump for it as hard as he has stumped for cozying up to President Donald Trump to ensure there's no future regulation on the AI industry.

'We do need to solve the safety issues,' Altman said. But that doesn't mean we all shouldn't be expanding AI to every aspect of our lives. He suggests we ignore the warming planet because AI will solve that niggling issue in due course. But if temperatures rise, requiring even more water and electricity to cool these data centers, I doubt AI can work fast enough to fix anything before it's too late. But ignore that; just pay attention to that still-unrevealed Jony Ive doohickey that may or may not gaslight you as the world burns.

Why Multilingual AI Isn't The Same As Global-Ready

Forbes

11-06-2025


Alessa Cross is on the founding team at Ventrilo AI.

Despite advances in multilingual modeling, most language AI systems remain anchored in English. Earlier generations, such as GPT-3, drew over 90% of their pretraining data from English sources, and even newer models inherit structural and linguistic biases shaped by English-dominant internet content. The result is a persistent gap between models that can speak many languages and those that can function across linguistic contexts. Yet in enterprise settings, it remains common to treat these systems as "global-ready" out of the box.

When we ask these models to serve users around the world, we're regularly asking them to operate beyond their training context. The results may still be grammatically correct, but they're often structurally or culturally misaligned. If we want AI to be more than an English-speaking assistant with a multilingual dictionary, we must move past the illusion that translation equals localization. That begins with recognizing how deeply English-centric assumptions shape model behavior, and what it takes to build systems that scale across different languages and cultural domains.

Language models perform best in the languages they see most during training. As linguistic distance from English increases, so does performance degradation, especially in languages with different syntactic rules or limited training representation. In lower-resource languages, models often misread tone or intent: a polite inquiry in Hindi or Japanese may be interpreted as vague or indecisive. Meanwhile, many benchmarks, evaluation datasets, annotation protocols and UX assumptions are also designed around American English.

Early voice assistants illustrate this well. Early versions of Siri struggled significantly with Arabic and regional dialects despite otherwise strong performance. The issue was that the model failed to reflect how speakers naturally structure requests.
In many cases, users were forced to adopt English-like phrasing just to be understood, which is a complete reversal of the human-computer relationship. In human-centered design, technology is meant to adapt to the user, not the other way around. Instead of augmenting communication, the system becomes another barrier to it.

Why Localization Is Not Translation

Translating prompts into other languages does not localize a product. Localization is a systems-level redesign that touches the underlying model assumptions, UX patterns and even architecture. Consider, for example:

• Design Accommodations: Sentence length and structure vary significantly by language. German UI text can be up to three times longer than its English equivalent, which often requires layout redesigns to prevent truncation or overlap.

• Tone And Register Calibration: What reads as efficient in American English might come across as curt or even disrespectful in Japanese. Nuanced tone calibration requires cultural familiarity.

• Contextual Expectations: A summarization tool for U.S. hospitals must parse SOAP notes and colloquial shorthand. The same tool in France will need to work with structured, coded documentation and adhere to GDPR constraints.

When these factors are ignored, linguistic gaps can grow into usability failures, and eventually into market rejection.

Why Domain Matters As Much As Language

Language is just one dimension. For enterprise users, especially in regulated sectors, functional alignment often matters more than linguistic fidelity. In the U.S., enterprise AI may be expected to integrate with Salesforce and Slack. In China, workflows depend on WeChat Work or regional CRMs. In India, workflows might span WhatsApp and legacy accounting platforms. When AI ignores these distinctions, it becomes a tool that teams work around, rather than one that augments their productivity. In healthcare or finance, a mismatch can be far more costly.
Even low-level decisions, such as whether your retrieval system applies a single global index or language-specific pipelines, shape usability. How does it prioritize query routing when users include partial translations? These decisions impact latency and quality before any output is generated. Designing for global use means understanding not just language, but the lived context of work.

Accepting The World's Inputs

Most language models are trained on relatively clean, curated data. Real-world inputs are anything but. Especially in multilingual markets, users frequently mix languages and use non-standard grammar, local idioms, creative spelling and hybrid sentence structures. Systems need to account for:

• Robust normalization without flattening distinctions essential to meaning

• Fast, on-device language detection to route inputs to the right models quickly and accurately

• Region-specific preprocessing, because what counts as 'clean' input in one market might erase meaning in another

For many users and global contexts, how well your AI handles edge cases is a prerequisite to usability.

The Feedback Loop As Core Architecture

No global AI product will be perfectly localized on day one. But systems that can learn from localized usage improve quickly if the infrastructure supports it. When translation models are built using user-generated content and a corpus that closely mirrors real-world usage, they consistently outperform models trained on broad, general-domain data. These results show the value of tuning your translations to what your users are actually saying, and suggest that adding direct feedback loops could improve quality further and faster.
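As a toy illustration of the detection-and-routing decision above, the sketch below guesses a query's language from overlapping stopwords and picks a per-language pipeline, falling back to a shared one when nothing matches. The stopword lists, language codes, and pipeline names are all invented for the example; a production system would use a trained language-identification model and real routing targets:

```python
from typing import Optional

# Tiny invented stopword lists standing in for a real language-ID model.
STOPWORDS = {
    "en": {"the", "is", "and", "of", "to"},
    "de": {"der", "die", "und", "ist", "zu"},
    "hi": {"hai", "aur", "ka", "ki", "mein"},
}

# Hypothetical per-language pipelines (e.g. language-specific indexes).
PIPELINES = {"en": "english-index", "de": "german-index", "hi": "hindi-index"}
DEFAULT_PIPELINE = "multilingual-fallback"

def detect_language(text: str) -> Optional[str]:
    """Guess the language by counting stopword overlaps; None if no signal."""
    tokens = set(text.lower().split())
    scores = {lang: len(tokens & words) for lang, words in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def route(text: str) -> str:
    """Send the query to its language pipeline, or to the fallback."""
    return PIPELINES.get(detect_language(text), DEFAULT_PIPELINE)

print(route("der Bericht ist fertig und geprüft"))
print(route("merge the report and send it to finance"))
print(route("bonjour tout le monde"))
```

Even a sketch this small surfaces the article's design questions: mixed-language input can tie between stopword sets, and the fallback path determines latency and quality for everything the detector cannot place.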
Enabling that learning requires infrastructure that can:

• Capture user edits and corrections

• Cluster user abandonments and rephrased queries

• Feed localized usage patterns back into training

• Surface divergence between user expectations and model output

Often, what separates a brittle AI language product from a scalable platform is how effectively it can learn and improve from real usage.

Redefining 'Global-Ready'

Global readiness is a question of whether your AI product feels intelligent and usable to users who don't share your default assumptions. That requires investment across four distinct axes:

• Language: Beyond the translation layer, design systems that feel native in syntax, tone and usage.

• Domain: Align with local workflows, documentation styles, data structures and regulatory standards.

• Input: Engineer systems to understand how users actually speak and write.

• Feedback: Treat user interactions as training data. Build infrastructure that allows products to learn from localized usage, especially when they deviate from expected patterns.

AI products that ignore how global teams actually work tend to stall at the edge of familiar markets. Whether they scale into global workflows or stay locked in English-first regions often comes down to investments made early in design.

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.

The best use cases for each ChatGPT model

Android Authority

29-05-2025


Calvin Wankhede / Android Authority

While ChatGPT has existed in various forms for some time, its true mainstream success began with the release of GPT-3 in 2020. Since then, ChatGPT has evolved significantly, both for better and worse. Although the tool is now more useful than ever before, it's also become somewhat confusing. Depending on your subscription level, you might have up to eight different models to choose from, making it tricky to identify which is best suited for your task.

As someone who has been a ChatGPT Plus user since subscriptions first became available, I rely on ChatGPT frequently. Sometimes it's for brainstorming, proofreading, personal organizing, or other productive activities. Other times, it's purely for entertainment, such as creating alternate timelines or pondering random philosophical ideas. Setting aside the fact that I clearly need more friends, these interactions have given me ample experience with which model works best in various situations. The truth is, there isn't one perfect use case for each ChatGPT model, as many overlap. Still, let's take a closer look at the models currently available, exploring the ideal scenarios for each.

GPT-4o is great for generalist tasks, especially for free users

Best for: General-purpose tasks, including editing, questions, and brainstorming
Availability: Free or higher

ChatGPT defaults to GPT-4o for a good reason: it's a solid generalist. This multimodal model can process and analyze text, images, audio, and even video, making GPT-4o ideal for a wide range of tasks, including:

• Composing emails
• Basic brainstorming and creative content
• Summarizing text
• Basic editing and proofreading
• Simple questions

Those are some of the official use cases, but your imagination is the true limit. Personally, I've used GPT-4o extensively for my creative writing projects. It's also been my go-to for:

• Creating alternate timelines and similar role-playing scenarios
• Fetching general information, such as gardening tips and simple queries
• Performing straightforward edits and summarization

Although I'm not a coder, I've heard many people successfully use GPT-4o for basic coding projects, thanks to its looser usage limits. That said, the newer GPT-4.1 is generally a much better choice for coding tasks, as we'll discuss shortly. Overall, GPT-4o is a reliable tool for just about anything, but it's important to note that, based on my experience, it becomes more prone to hallucinations as queries grow more complex. For straightforward requests with clear outcomes, GPT-4o works very well, but it struggles significantly with genuine reasoning and complex logic, making occasional errors more likely. For example, while working on an alternate timeline about Rome, GPT-4o mistakenly pulled information from a previous, unrelated timeline project I created months earlier involving a divergent North America. Despite obvious differences in divergence points, nations, and events, GPT-4o sometimes couldn't distinguish these separate contexts clearly.

The key takeaway is that you should always verify any ChatGPT response independently, but this is especially important with GPT-4o, at least in my experience. Additionally, free users are limited to 10 messages every three hours, though paid Plus subscribers have an increased limit of 80 messages every three hours.

GPT-4.1: Great for coding and a better generalist for Plus, Pro, and Team members

Best for: Coding and detailed generalist tasks that require greater accuracy
Availability: Plus or higher

While GPT-4o remains the default, those with paid subscriptions might consider the newer GPT-4.1 as their daily driver instead.
Initially accessible only via third-party software or OpenAI's API, GPT-4.1 is now fully integrated into ChatGPT for users with a Plus subscription or higher. The improved intelligence and speed of GPT-4.1 mean it can handle all the scenarios listed previously under GPT-4o, with notable enhancements. Other advantages include:

• It's a great option for coders looking for a balance between speed, accuracy, efficiency, and cost-effectiveness.
• Significantly better performance than GPT-4o for detailed proofreading, editing, and brainstorming on slightly more complex topics.
• Clearer and faster responses, reducing the need for extensive back-and-forth corrections.

The primary downside of GPT-4.1 compared to GPT-4o is its tighter usage restriction, capped at 40 messages every three hours for Plus users. Still, this limit is likely sufficient for most users, aside from particularly extensive projects. In my personal and entertainment projects, I've occasionally reached the cap, but in those cases, I simply switch back to GPT-4o to complete the job.

GPT-4.1 shares the same multimodal capabilities as GPT-4o, but delivers clear improvements across the board. According to OpenAI's official metrics, the new model offers:

• 21.4-point higher coding accuracy: GPT-4.1 scores 54.6% versus GPT-4o's 33.2%.
• 10.5-point improvement in instruction-following accuracy: GPT-4.1 achieves 38.3% compared to GPT-4o's 27.8%.
• 6.7-point better accuracy for long-context tasks: GPT-4.1 scores 72% versus GPT-4o's 65.3%.

As of this writing, GPT-4.1 has only been available to Plus users for about a week, so I haven't fully explored every scenario. However, my initial experiences indicate that GPT-4.1 hallucinates far less often and maintains greater consistency when staying on topic. Unlike GPT-4o, it doesn't randomly blend ideas from previous projects, a frequent issue I encountered with alternate timelines.
Additionally, GPT-4.1 follows instructions more carefully and refrains from improvising unnecessarily, a tendency I've noticed in other models.

o1 Pro Mode: Powerful and precise, but best for specialized business tasks

Best for: Complex business and coding tasks demanding exceptional detail and accuracy
Availability: Pro or higher

As you might guess, OpenAI's o1 Pro Mode requires an expensive Pro membership and therefore targets companies, independent professionals, or freelancers who handle specialized business and enterprise tasks. Although there's no firm cap, sustained, intensive use can temporarily restrict your access. For example, according to user Shingwun on Reddit, sending more than around 200 messages during a workday can quickly trigger temporary restrictions. Potential use cases for o1 Pro Mode include:

• Drafting highly detailed risk-analysis reports or internal memos.
• Creating multi-page research summaries.
• Developing sophisticated algorithms tailored to specific business requirements.
• Building specialized applications or plug-ins.
• Parsing complex STEM topics directly from detailed research papers.

These represent just a few possible applications, but ultimately, this model is designed for extremely complex tasks. For everyday programming assistance or quicker queries, there are honestly faster and more suitable tools. Due to its advanced reasoning capabilities, o1 Pro Mode typically takes more time per response, which can become a significant bottleneck, even though the end results are often worth the wait.

o3 is great for general business productivity and beyond

Best for: Business productivity and Plus-level tasks that need advanced reasoning
Availability: Plus or higher

If you're working on a complex, multi-step project, you'll find that models like GPT-4o are more prone to producing responses riddled with logic errors or outright hallucinations. While such mistakes can occur with any AI, o3 is specifically designed with advanced reasoning in mind, making it typically better suited for tasks such as:

• Risk analysis reports and similarly detailed documents.
• Analyzing existing content more deeply and objectively, compared to the overly positive responses typical of other models.
• Drafting strategic business outlines based on competitor and internal data.
• Providing more thorough explanations for concepts related to math, science, and coding than GPT-4o or GPT-4.1.

Personally, I often use o3 for deeper analysis of both my personal and professional projects. I've found it particularly helpful as a tool for working through my own thoughts and ideas. While I would never fully entrust an AI to serve as a genuine advisor, o3 is valuable when you want to explore or develop an idea with AI assistance. Just be sure to verify any conclusions or ideas you reach with outside sources and additional scrutiny. For example, I've used o3 to help refine my own ethical and philosophical viewpoints, but I always confirm these ideas by consulting both online resources and real people. Remember, AI models are very good at providing logical-sounding answers, but they can also mislead, exaggerate, or even unintentionally gaslight you. Therefore, exercise caution when using o3 in this manner.

It's also important to recognize o3's other limitations.
First, because o3 prioritizes reasoning, responses are typically slower compared to some of the other models. Additionally, Plus, Team, or Enterprise subscribers are limited to just 100 messages per week. Depending on your project's complexity, this could be sufficient, but it also means you'll need to be more selective when choosing to use this model. Pro-level accounts, however, enjoy unlimited access to o3.

Lastly, although OpenAI promotes o3 as ideal for advanced coding tasks, my research across Reddit and other online communities suggests a different perspective. The consensus seems to be that while o3 excels at very specific coding scenarios, it can also be prone to hallucination unless prompts are crafted carefully. Most coders find GPT-4.1 to be a generally better fit for typical coding tasks.

GPT-4o-mini and GPT-4.1-mini: Best for API users or when you hit usage limits

Best for: API users, or anyone needing a backup when other model limits are reached
Availability: Free or higher

I'm grouping these two models together, as they're even more similar to each other than GPT-4o and GPT-4.1. According to OpenAI, GPT-4o-mini is best suited for fast technical tasks, such as:

• Quick STEM-related queries
• Programming
• Visual reasoning

In reality, while it performs well enough for these cases, its limitations can become apparent for anyone doing intensive coding or using the model daily. Even though the 300-message-per-day limit sounds generous, it really depends on your workflow and the size of your projects. Ultimately, GPT-4o-mini works well as a backup if you hit message caps on other models, but I think its best use case is actually outside of ChatGPT, as a cost-effective choice for API users running larger projects.

As for GPT-4.1-mini: this newer model is the default fallback for all ChatGPT users (replacing GPT-4o-mini), though you'll still have access to both on Plus or higher tiers. One big change is that 4.1-mini also supports free accounts, so you're not restricted by payment tier. GPT-4.1-mini works much like GPT-4o-mini but with better coding ability and improved overall performance. It's a useful fallback when you max out your limit on other models, but in my opinion, both mini variants still shine brightest as affordable, lower-power options for API-based projects rather than as your main engine for regular ChatGPT queries. Still, 4.1-mini is gradually rolling out to all free users and will automatically be selected if you hit the GPT-4o cap.

o4-mini-high: Best as a backup for o3 and for faster reasoning

Best for: Faster reasoning than o3, and as a backup
Availability: Plus or higher

o4-mini-high (formerly known as o3-mini-high) used to be a favorite among those looking for less restrictive coding and more flexibility for unique projects. The current version doesn't have quite the same reputation for coding, but it still has a few official OpenAI use cases:

• Solving complex math equations with full step-by-step breakdowns, great for homework and learning
• Drafting SQL queries for data extraction and database work
• Explaining scientific concepts in clear, accessible language

Based on my experience and what I've read in community forums, the best way to use o4-mini-high is as a backup: when you run out of credits or hit your message cap on o3, o4-mini-high offers a similar experience, though it's not quite as robust. This model is limited to 100 messages per day for Plus, Team, and Enterprise users, while Pro users get unlimited access.
GPT-4.5: Powerful generalist, but best for refinement or high-value queries

Best for: Final refinement, editing, or as a premium alternative to GPT-4.1
Availability: Plus or higher

GPT-4.5 is arguably the most powerful generalist model available, offering a noticeable leap over GPT-4.1 and GPT-4o in many scenarios. However, its strict usage limits mean you'll want to be selective. While GPT-4.5 used to allow 50 messages per week, Plus users are now limited to just 20 weekly messages. Pro users also have a cap, but OpenAI hasn't published exact numbers. From what I've seen, most people don't reach the Pro limit easily, but if you're passionate about using GPT-4.5, you'll need to spring for the $200/month Pro tier. For more casual users like me, that's a pretty tough sell.

So, what do I mean by refinement? Essentially, I like to use GPT-4o or GPT-4.1 to rough out a project and get it where I want it, then bring in GPT-4.5 for the final polish. For instance, when working on an alternate history timeline for a fiction series, I used GPT-4.1 for the main draft, then uploaded the result to GPT-4.5 to help refine the language and catch any logic gaps. The finished product was much tighter, and I only had to use a few of my 20 weekly messages. Whether it's for last-step editing, advanced review, or double-checking a critical project, GPT-4.5 excels as a finishing tool. Just keep in mind that it's not practical for multi-step, back-and-forth work unless you're on the Pro plan.

My favorite workflow: Mixing models for the best results

While GPT-4.5 is my go-to for final refinement, I actually hop between models quite a bit depending on the project. The web version of ChatGPT makes it easy to switch models mid-conversation (even if you sometimes need to re-explain the context).
For creative projects, I usually start with GPT-4.1 for drafting, then jump to o3 if I need deeper reasoning or want to double-check my thinking. After narrowing things down further in GPT-4.1, I'll finish the project in GPT-4.5 for a final pass. This model dance helps catch mistakes, uncover new ideas, and produce cleaner, more reliable results.

Ultimately, there's no one 'right' combination for everyone. You'll want to experiment with the models to find a workflow that fits your needs. For example, programmers might use a cheaper model like GPT-4.1 for initial coding, then switch to o1 Pro Mode for an advanced review of their work. Writers and researchers might prefer the blend of o3's reasoning with GPT-4.5's editing finesse. How do you cross-utilize the different models? Maybe you have a hot take to share in the comments that I hadn't considered.
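For API users, the stage-based mixing described above can be sketched as a small routing helper. This is a minimal illustration, not an official pattern: the stage names ("draft", "reason", "polish", "budget") are hypothetical labels of my own, and the model IDs simply mirror the recommendations in this article, so they may not match the exact identifiers exposed by the API you use.

```python
# Minimal sketch of stage-based model mixing, assuming hypothetical
# stage names. Model IDs follow the article's recommendations and may
# differ from the exact identifiers in the actual API.

STAGE_TO_MODEL = {
    "draft": "gpt-4.1",        # rough out the project
    "reason": "o3",            # deeper reasoning, double-checking
    "polish": "gpt-4.5",       # final refinement pass
    "budget": "gpt-4.1-mini",  # cost-effective fallback for large jobs
}

def pick_model(stage: str) -> str:
    """Return the model ID for a workflow stage, falling back to the cheap option."""
    return STAGE_TO_MODEL.get(stage, STAGE_TO_MODEL["budget"])
```

A wrapper like this keeps the expensive models reserved for the steps that actually need them, while any unrecognized stage quietly falls back to the cheapest option.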

Meta's Llama AI team has been bleeding talent. Many top researchers have joined French AI startup Mistral.

Business Insider

26-05-2025


Meta's open-source Llama models helped define the company's AI strategy. Yet the researchers who built the original version have mostly moved on. Of the 14 authors credited on the landmark 2023 paper that introduced Llama to the world, just three still work at Meta: research scientist Hugo Touvron, research engineer Xavier Martinet, and technical program leader Faisal Azhar. The rest have left the company, many of them to join or found its emerging rivals.

Meta's brain drain is most visible at Mistral, the Paris-based startup co-founded by former Meta researchers Guillaume Lample and Timothée Lacroix, two of Llama's key architects. Alongside several fellow Meta alums, they're building powerful open-source models that directly compete with Meta's flagship AI efforts.

The steady exits raise questions about Meta's ability to retain top AI talent just as it faces a new wave of external and internal pressure. The company is delaying its largest-ever AI model, Behemoth, after internal concerns about its performance and leadership, The Wall Street Journal reported. Llama 4, Meta's latest release, received a lukewarm reception from developers, many of whom now look to faster-moving open-source rivals like DeepSeek and Qwen for cutting-edge capabilities.

Inside Meta, the research team has also seen a shake-up. Joelle Pineau, who led the company's Fundamental AI Research group (FAIR) for eight years, announced last month that she would step down. She will be replaced by Robert Fergus, who co-founded FAIR in 2014 and then spent five years at Google's DeepMind before rejoining Meta this month. The leadership reshuffle follows a period of quiet attrition. Many of the researchers behind Llama's initial success have left FAIR since publishing their landmark paper, even as Meta continues to position the model family as central to its AI strategy.
With so many of its original architects gone and rivals moving faster in open-source innovation, Meta now faces the challenge of defending its early lead without the team that built it.

That's particularly significant because the 2023 Llama paper was more than just a technical milestone. It helped legitimize open-weight large language models, whose underlying code and parameters are freely available for others to use, modify, and build on, as viable alternatives to the proprietary systems of the time, such as OpenAI's GPT-3 and Google's PaLM. Meta trained its models using only publicly available data and optimized them for efficiency, enabling researchers and developers to run state-of-the-art systems on a single GPU. For a moment, Meta looked like it could lead the open frontier.

Two years later, that lead has slipped, and Meta no longer sets the pace. Despite investing billions into AI, Meta still doesn't have a dedicated "reasoning" model, one built specifically to handle tasks that require multi-step thinking, problem-solving, or calling external tools to complete complex commands. That gap has grown more noticeable as companies like Google and OpenAI prioritize these features in their latest models.

The average tenure at Meta of the 11 departed authors was over five years, suggesting they weren't short-term hires but researchers deeply embedded in Meta's AI efforts. Some left as early as January 2023; others stayed through the Llama 3 cycle, and a few left as recently as this year. Together, their exits mark the quiet unraveling of the team that helped Meta stake its AI reputation on open models.

A Meta spokesperson pointed to an X post about Llama research paper authors who have left. The list below, based on information from the researchers' LinkedIn profiles, shows where each of them ended up.
Naman Goyal
Left Meta: February 2025
Time at Meta: 6 years, 7 months

Baptiste Rozière
Current role: AI Scientist at Mistral
Left Meta: August 2024
Time at Meta: 5 years, 1 month

Aurélien Rodriguez
Current role: Director, Foundation Model Training at Cohere
Left Meta: July 2024
Time at Meta: 2 years, 7 months

Eric Hambro
Current role: Member of Technical Staff at Anthropic
Left Meta: November 2023
Time at Meta: 3 years, 3 months

Timothée Lacroix
Left Meta: June 2023
Time at Meta: 8 years, 5 months

Marie-Anne Lachaux
Current role: Founding Member and AI Research Engineer at Mistral
Left Meta: June 2023
Time at Meta: 5 years

Thibaut Lavril
Current role: AI Research Engineer at Mistral
Left Meta: June 2023
Time at Meta: 4 years, 5 months

Armand Joulin
Current role: Distinguished Scientist at Google DeepMind
Left Meta: May 2023
Time at Meta: 8 years, 8 months

Gautier Izacard
Current role: Technical Staff at Microsoft AI
Left Meta: March 2023
Time at Meta: 3 years, 2 months

Edouard Grave
Current role: Research Scientist at Kyutai
Left Meta: February 2023
Time at Meta: 7 years, 2 months

Guillaume Lample
Left Meta: Early 2023
Time at Meta: 7 years
