Optimizing AI apps in a million-token world

Fast Company | 04-06-2025

The context size problem in large language models is nearly solved.
In recent months, models like GPT-4.1, LLaMA 4, and DeepSeek V3 have reached context windows ranging from hundreds of thousands to millions of tokens. We're entering a phase where entire documents, threads, and histories can fit into a single prompt. It marks real progress—but it also brings new questions about how we structure, pass, and prioritize information.
WHAT IS CONTEXT SIZE (AND WHY WAS IT A CHALLENGE)?
Context size defines how much text a model can process in one go. It is measured in tokens, small chunks of text such as words or parts of words. For years, that limit shaped the way we worked with LLMs: splitting documents, engineering recursive prompts, summarizing inputs, whatever it took to avoid truncation.
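To make tokens concrete, here is a minimal counting sketch in Python using the tiktoken library; the cl100k_base encoding is an assumption standing in for whatever tokenizer your model actually uses.

```python
# A minimal token-counting sketch using the tiktoken library.
# The cl100k_base encoding is an assumption; match it to your actual model.
import tiktoken

def count_tokens(text: str) -> int:
    enc = tiktoken.get_encoding("cl100k_base")
    return len(enc.encode(text))

# Rough rule of thumb: one token is about four characters of English text,
# so a full-length book runs into the hundreds of thousands of tokens.
print(count_tokens("Context size defines how much text a model can process."))
```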
Now, models like LLaMA 4 Scout can handle up to 10 million tokens, while DeepSeek V3 and GPT-4.1 go beyond 100K and 1M tokens, respectively. With those capabilities, many of the older workarounds can be rethought or even removed.
FROM BOTTLENECK TO CAPABILITY
This progress unlocks new interaction patterns. We're seeing applications that can reason and navigate across entire contracts, full Slack threads, or complex research papers. These use cases were out of reach not long ago. However, just because models can read more does not mean they automatically make better use of that data.
The paper 'Why Does the Effective Context Length of LLMs Fall Short?' examines this gap. It shows that LLMs often attend to only part of the input, especially the more recent or emphasized sections, even when the prompt is long. Another study, 'Explaining Context Length Scaling and Bounds for Language Models,' explores why increasing the window size does not always lead to better reasoning. Both pieces suggest that the problem has shifted from managing how much context a model can take to guiding how it uses that context effectively.
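You can observe this effect directly with a simple 'needle in a haystack' probe: plant a known fact at different depths of a long prompt and check whether the model retrieves it. In the sketch below, ask_model is a hypothetical wrapper for whatever chat API you use.

```python
# A minimal "needle in a haystack" probe: bury a known fact at varying depths
# of a long prompt and check whether the model can still retrieve it.

def ask_model(prompt: str) -> str:
    # Hypothetical stand-in for your chat-completion client of choice.
    raise NotImplementedError

FILLER = "The quick brown fox jumps over the lazy dog. " * 2000
NEEDLE = "The vault access code is 7319."
QUESTION = "What is the vault access code?"

def build_prompt(depth: float) -> str:
    # depth 0.0 places the needle at the start of the context, 1.0 at the end.
    cut = int(len(FILLER) * depth)
    return FILLER[:cut] + NEEDLE + " " + FILLER[cut:] + "\n\n" + QUESTION

for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
    found = "7319" in ask_model(build_prompt(depth))
    print(f"needle at depth {depth:.2f}: {'retrieved' if found else 'missed'}")
```

If retrieval dips at certain depths, the model's effective context is shorter than its advertised window.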
Think of it this way: Just because you can read every book ever written about World War I doesn't mean you truly understand it. You might scan thousands of pages, but still fail to retain the key facts, connect the events, or explain the causes and consequences with clarity.
What we pass to the model, how we organize it, and how we guide its attention are now central to performance. These are the new levers of optimization.
CONTEXT WINDOW ≠ TRAINING TOKENS
A model's ability to accept a large context does not guarantee that it has been trained to handle it well. Some models were exposed only to shorter sequences during training. That means even if they accept 1M tokens, they may not make meaningful use of all that input.
This gap affects reliability. A model might slow down, hallucinate, or misinterpret input when overwhelmed with too much or poorly organized data. Developers need to verify whether a model was fine-tuned for long contexts or merely adapted to accept them.
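A first check is to read the context length a checkpoint actually declares. The sketch below uses Hugging Face's transformers library with an illustrative model id; note that the declared maximum only tells you what the model will accept, so empirical probes like the one above are still necessary.

```python
# A minimal sketch for reading a checkpoint's declared context length via
# Hugging Face transformers. The model id is illustrative; substitute your own.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("meta-llama/Llama-3.1-8B")
# Attribute names vary by architecture; for Llama-style configs this is the
# positional limit the model accepts, not proof of long-context training.
print(getattr(config, "max_position_embeddings", None))
```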
WHAT CHANGES FOR ENGINEERS
With these new capabilities, developers can move past earlier limitations. Manual chunking, token trimming, and aggressive summarization become less critical. But this does not remove the need for data prioritization.
Prompt compression, token pruning, and retrieval pipelines remain relevant. Techniques like prompt caching help reuse portions of prompts to save costs. Mixture-of-experts (MoE) models, like those used in LLaMA 4 and DeepSeek V3, optimize compute by activating only relevant components.
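Prompt caching in particular rewards careful prompt layout: providers generally match on an exact leading prefix, so stable content should come first and per-request content last. Here is a minimal sketch, with a hypothetical chat function standing in for your provider's SDK.

```python
# A minimal sketch of cache-friendly prompt ordering. Provider-side prefix
# caching typically matches on exact leading tokens, so the stable parts go
# first and only the final message varies between requests.

SYSTEM_PROMPT = "You are a contract-analysis assistant."  # stable across requests
REFERENCE_DOCS = "<full contract text goes here>"         # stable across a session

def chat(messages: list[dict]) -> str:
    # Hypothetical stand-in for your provider's chat-completion call.
    raise NotImplementedError

def answer(question: str) -> str:
    return chat([
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": REFERENCE_DOCS},  # identical prefix: cacheable
        {"role": "user", "content": question},        # only this part changes
    ])
```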
Engineers also need to track which parts of a prompt the model actually uses. Output quality alone does not guarantee effective context usage. Monitoring token relevance, attention distribution, and consistency over long prompts is a new challenge that goes beyond latency and throughput.
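One cheap black-box approach is context ablation: drop one section of the prompt at a time and see whether the answer changes. Below is a minimal sketch, again with a hypothetical ask_model wrapper; run it at temperature zero, or average over several trials, to reduce sampling noise.

```python
# A minimal context-ablation sketch: remove one section of the prompt at a
# time; if the answer never changes, the model may be ignoring that section.

def ask_model(prompt: str) -> str:
    # Hypothetical stand-in for your chat-completion client of choice.
    raise NotImplementedError

def ablation_report(sections: list[str], question: str) -> None:
    baseline = ask_model("\n\n".join(sections) + "\n\n" + question)
    for i in range(len(sections)):
        reduced = sections[:i] + sections[i + 1:]
        answer = ask_model("\n\n".join(reduced) + "\n\n" + question)
        status = "used" if answer.strip() != baseline.strip() else "possibly ignored"
        print(f"section {i}: {status}")
```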
IT IS ALSO A PRODUCT AND UX ISSUE
For end users, the shift to larger contexts introduces more freedom—and more ways to misuse the system. Many users drop long threads, reports, or chat logs into a prompt and expect perfect answers. They often do not realize that more data can sometimes cloud the model's reasoning.
Product design must help users focus. Interfaces should clarify what is helpful to include and what is not. This might mean offering previews of token usage, suggestions to refine inputs, or warnings when the prompt is too broad. Prompt design is no longer just a backend task, but rather part of the user journey.
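A token-usage preview can be as simple as counting tokens before submission. Here is a minimal sketch, assuming tiktoken's cl100k_base encoding and an illustrative 128K-token budget.

```python
# A minimal pre-submit token preview. The 128K budget and cl100k_base
# encoding are illustrative assumptions; match them to your deployed model.
import tiktoken

BUDGET = 128_000
enc = tiktoken.get_encoding("cl100k_base")

def usage_preview(user_input: str) -> str:
    used = len(enc.encode(user_input))
    if used > BUDGET:
        return f"Too large: {used:,} tokens against a {BUDGET:,}-token budget. Trim the input."
    return f"{used:,} of {BUDGET:,} tokens ({used / BUDGET:.0%} of budget)."
```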
THE ROAD AHEAD: STRUCTURE OVER SIZE
Larger context windows open important doors. We can now build systems that follow extended narratives, compare multiple documents, or process timelines that were previously out of reach.
But clarity still matters more than capacity. Models need structure to interpret, not just volume to consume. This changes how we design systems, how we shape user input, and how we evaluate performance.
The goal is not to give the model everything. It is to give it the right things, in the right order, with the right signals. That is the foundation of the next phase of progress in AI systems.
