logo
Encountered a problematic response from an AI model? More standards and tests are needed, say researchers

Encountered a problematic response from an AI model? More standards and tests are needed, say researchers

CNBC11 hours ago

As the usage of artificial intelligence — benign and adversarial — increases at breakneck speed, more cases of potentially harmful responses are being uncovered. These include hate speech, copyright infringements or sexual content.
The emergence of these undesirable behaviors is compounded by a lack of regulations and insufficient testing of AI models, researchers told CNBC.
Getting machine learning models to behave the way it was intended to do so is also a tall order, said Javier Rando, a researcher in AI.
"The answer, after almost 15 years of research, is, no, we don't know how to do this, and it doesn't look like we are getting better," Rando, who focuses on adversarial machine learning, told CNBC.
However, there are some ways to evaluate risks in AI, such as red teaming. The practice involves individuals testing and probing artificial intelligence systems to uncover and identify any potential harm — a modus operandi common in cybersecurity circles.
Shayne Longpre, a researcher in AI and policy and lead of the Data Provenance Initiative, noted that there are currently insufficient people working in red teams.
While AI startups are now using first-party evaluators or contracted second parties to test their models, opening the testing to third parties such as normal users, journalists, researchers, and ethical hackers would lead to a more robust evaluation, according to a paper published by Longpre and researchers.
"Some of the flaws in the systems that people were finding required lawyers, medical doctors to actually vet, actual scientists who are specialized subject matter experts to figure out if this was a flaw or not, because the common person probably couldn't or wouldn't have sufficient expertise," Longpre said.
Adopting standardized 'AI flaw' reports, incentives and ways to disseminate information on these 'flaws' in AI systems are some of the recommendations put forth in the paper.
With this practice having been successfully adopted in other sectors such as software security, "we need that in AI now," Longpre added.
Marrying this user-centred practice with governance, policy and other tools would ensure a better understanding of the risks posed by AI tools and users, said Rando.
Project Moonshot is one such approach, combining technical solutions with policy mechanisms. Launched by Singapore's Infocomm Media Development Authority, Project Moonshot is a large language model evaluation toolkit developed with industry players such as IBM and Boston-based DataRobot.
The toolkit integrates benchmarking, red teaming and testing baselines. There is also an evaluation mechanism which allows AI startups to ensure that their models can be trusted and do no harm to users, Anup Kumar, head of client engineering for data and AI at IBM Asia Pacific, told CNBC.
Evaluation is a continuous process that should be done both prior to and following the deployment of models, said Kumar, who noted that the response to the toolkit has been mixed.
"A lot of startups took this as a platform because it was open source, and they started leveraging that. But I think, you know, we can do a lot more."
Moving forward, Project Moonshot aims to include customization for specific industry use cases and enable multilingual and multicultural red teaming.
Pierre Alquier, Professor of Statistics at the ESSEC Business School, Asia-Pacific, said that tech companies are currently rushing to release their latest AI models without proper evaluation.
"When a pharmaceutical company designs a new drug, they need months of tests and very serious proof that it is useful and not harmful before they get approved by the government," he noted, adding that a similar process is in place in the aviation sector.
AI models need to meet a strict set of conditions before they are approved, Alquier added. A shift away from broad AI tools to developing ones that are designed for more specific tasks would make it easier to anticipate and control their misuse, said Alquier.
"LLMs can do too many things, but they are not targeted at tasks that are specific enough," he said. As a result, "the number of possible misuses is too big for the developers to anticipate all of them."
Such broad models make defining what counts as safe and secure difficult, according to a research that Rando was involved in.
Tech companies should therefore avoid overclaiming that "their defenses are better than they are," said Rando.

Orange background

Try Our AI Features

Explore what Daily8 AI can do for you:

Comments

No comments yet...

Related Articles

Last day to save on your TechCrunch All Stage pass — prices go up tonight
Last day to save on your TechCrunch All Stage pass — prices go up tonight

Yahoo

time3 hours ago

  • Yahoo

Last day to save on your TechCrunch All Stage pass — prices go up tonight

It's now or never — this is your final day to lock in savings for TechCrunch All Stage, happening July 15 in Boston. Prices increase tonight, June 22, at 11:59 p.m. PT. If you've been thinking about attending, now's the time to commit. Register here. TC All Stage brings together founders, investors, and operators from across the startup landscape for one high-intensity day of strategy, storytelling, and startup momentum. Whether you're building something new or backing what's next, this is the room to be in. At TC All Stage, sessions are designed for action, not applause. You'll get a firsthand look at what it takes to fundraise, scale, and lead in today's market. Expect candid takes, real feedback, and tactical insight you can put to work immediately. Some of the voices shaping the day include: And beyond the sessions? You'll connect at roundtables, pitch competitions, and Side Events hosted across Boston — the kind of after-hours meetups where ideas and deals take root. This is your last chance to save. Join us at TC All Stage and spend a day surrounded by people pushing tech forward. Secure your pass before midnight and save up to $210. Error in retrieving data Sign in to access your portfolio Error in retrieving data Error in retrieving data Error in retrieving data Error in retrieving data

Nvidia: How the chipmaker evolved from a gaming startup to an AI giant
Nvidia: How the chipmaker evolved from a gaming startup to an AI giant

Yahoo

time3 hours ago

  • Yahoo

Nvidia: How the chipmaker evolved from a gaming startup to an AI giant

Over the past two decades, Nvidia (NVDA) has skyrocketed into global conversation. The semiconductor company is considered an international leader in the design and manufacturing of computer chips and helped revolutionize the rise of artificial intelligence (AI). Beyond its strengths in the gaming, data, and AI fields, Nvidia announced plans this March for a quantum research center in Boston, where CEO Jensen Huang said researchers could tackle problems from drug discovery to materials development. Here's a look at Nvidia's path to where it is today, from creating hardware for the gaming industry to designing the chips that power AI. On April 5, 1993, Jensen Huang, Chris Malachowsky, and Curtis Priem founded Nvidia with an initial focus on designing and producing 3D graphics processors for computing and video games. The company's first product release, the multimedia processor NV1, didn't get the reception the founders were hoping for. What followed was a financial situation so dire that Nvidia laid off half its staff, leading to its unofficial motto: 'Our company is 30 days from going out of business.' In addition to the NV1's unimpressive return, a partnership that Nvidia had forged with Japanese video game company Sega to produce console graphics chips fell through, adding to the pressure. However, even as it pivoted to another company for chips, Sega invested $5 million in Nvidia — funding which allowed Nvidia to survive going out of business. Despite financial challenges and a smaller team, Nvidia released its next chip in 1997. It was a success. RIVA 128 allowed for support of high-resolution 2D and 3D graphics, and over a million units were sold in its first four months of sales. With the foundation of RIVA 128 sales, Nvidia produced RIVA TNT, which further cemented its place in the industry with better image quality and performance. Two years later, on Jan. 22, 1999, Nvidia went public on the New York Stock Exchange (NYSE) at $12 a share, and by May, it shipped out its 10,000,000th graphics processor. Later in 1999, Nvidia released GeForce 256, calling it the world's first 'Graphics Processing Unit.' By marketing the chip directly to customers instead of just including it within a device or console, the company popularized the term 'GPU.' With their ability to break larger tasks into smaller ones that could run at the same time, known as parallel processing, GPUs took on the heavy workload of powering graphics. It allowed devices to work on other processing functions faster, which meant GeForce 256 offered smoother, faster, and more realistic graphics. Finding growing success in supplying GPUs to both customers and consoles like Xbox, Nvidia joined the Nasdaq 100 and the S&P 500 in 2001. In 2006, Nvidia launched CUDA, a platform that allowed users to access their GPUs' parallel processing capabilities to run their own software instead of just graphics. Between 2006 to 2017, Nvidia invested nearly $12 billion in research & development with a large portion of those funds going towards CUDA. CUDA downloads slowed entering the 2010s, and while CUDA provided users with the ability to use chips for purposes other than gaming, it didn't initially seem to pay off for investors. 'Some investors were big Nvidia fans in the late 2000s and gave them the benefit of the doubt for the first five years of the CUDA investment," Acquired podcast co-host Ben Gilbert said in a 2022 episode. "But in the mid-2010s, market demand still wasn't showing up in a big way, and it was becoming a bigger and bigger investment." However, later technological developments would make CUDA crucial to the company. "It made all of our products more expensive since we were selling these gamer cards while putting computing acceleration into them," Bryan Catanzaro, Nvidia's vice president of applied deep learning research, told Yahoo Finance in 2023. "It took a lot of commitment to follow through. … I would say it was about 10 years before Wall Street really started to believe this investment was worth anything." In 2012, students Alex Krizhevsky and Ilya Sutskever used CUDA to train the visual-recognition neural network AlexNet with two Nvidia GPUs. AlexNet's breakthrough performance in identifying images demonstrated that using GPUs to train machine learning models cut training times significantly compared to the CPUs that were previously used. Following this advancement, Nvidia began pivoting its focus to artificial intelligence, supported by its revenue from gaming. By 2016, it announced the DGX-1, a system designed specifically for deep learning and the large language models that were on the rise. That year, Nvidia stock nearly tripled in price. 'It's 'destiny meets serendipity,'' Nvidia CEO Jensen Huang told Yahoo Finance at the time. 'People think it's an overnight success, but like most overnight successes, it took us years.' At the same time, Nvidia took the opportunity to make strategic acquisitions, such as wireless company Icera in 2011 and hardware company The Portland Group in 2013. It tried to acquire the semiconductor and design company Arm (ARM) in 2020, but the deal ultimately fell through after regulatory concerns. In March 2022, Nvidia announced the H100 'Hopper' chip, promising faster training and better performance for artificial intelligence. Controlling a significant majority of the market share with this GPU, major companies, including Alphabet (GOOG), Amazon (AMZN), and Microsoft (MSFT), turned to Nvidia with billions as they began to develop AI and data-driven products. One such company is OpenAI ( whose relationship with Nvidia stretches back to 2016, when Nvidia donated the first DGX-1 supercomputer to the startup. In November 2022, OpenAI launched ChatGPT, a language model built on Nvidia GPUs that quickly reached headlines. In less than two months, ChatGPT set the record for the fastest-growing consumer application in history, according to a UBS study, reaching 100 million monthly active users in January 2023. "A new computing era has begun," Nvidia CEO Jensen Huang said in a 2023 statement. "Companies worldwide are transitioning from general-purpose to accelerated computing and generative AI." With investors increasingly interested in artificial intelligence and as the demand for GPUs to run models continues to grow, Nvidia's revenue for the quarter ending in January 2024 more than doubled its results year over year. Following the quarterly report's release, Nvidia had the largest one-day gain in stock market history, adding $277 billion in value. It then hit a valuation of $2 trillion the following day. The record wouldn't stand for long, however; Nvidia beat it again just two months later. That March, Nvidia announced its next chip: Blackwell. Offering higher performance with reduced cost and energy consumption, the chips were designed to work better than previous versions when linked to work together in large numbers. Soon after, Nvidia announced a 10-for-1 stock split in June 2024. Following the split, it passed Microsoft and Apple (AAPL) to become the world's most valuable company at $3.3 trillion. By November 2024, it was added to the Dow Jones Industrial Average. Despite its successes, Nvidia also encountered challenges throughout its rise. In 2018, it faced a class-action lawsuit alleging it did not properly disclose to investors the impact of the cryptocurrency market on revenue from sales of GPUs. At the time, 'miners' of cryptocurrencies such as bitcoin (BTC-USD) and ethereum (ETH-USD) used the GPUs to complete transactions and secure new crypto tokens. The process requires significant computational power, which made Nvidia GPUs a popular choice. Nvidia paid a $5.5 million settlement in 2022 to the SEC because of the issue, and in December 2024, the Supreme Court dismissed Nvidia's appeal, allowing the 2018 case to proceed. This wasn't Nvidia's first time managing legal issues regarding its chips. In 2016, it settled a case involving the marketed performance and actual capabilities of its GTX 970, with payouts of $30 per purchase. On top of legal issues, there is also the challenge of supply keeping up with demand. A global chip shortage first occurred in early 2020 as a result of the coronavirus pandemic and an increased reliance on technology for remote work. Other factors that lengthened the shortage through 2023 included the initial US-China trade war, severe weather events, and the Russia-Ukraine war. A December 2024 report from the IDC projected global demand for AI and high-performance computing (HPC) to grow by over 15% in 2025. President Trump announced Project Stargate in January 2025, which involves tech companies such as Oracle (ORCL), OpenAI, and SoftBank (SFTBY) investing $500 billion in AI infrastructure in the United States over the next four years. Nvidia, as a technology partner to the project, saw a jump in its stock, and reached a $3.6 trillion market cap. Later in the month, however, the Chinese company DeepSeek released its own AI model, which was reportedly trained at a significantly lower cost than that of competitors. Following the announcement, Nvidia stock dropped $589 billion, almost 17%, marking the largest single-day loss in stock market history. Following the drop, March 2025 brought the debut of Nvidia's Blackwell Ultra, the successor to Blackwell. The new chip was announced to have 1.5 times the performance of the previous chip, which could help AI models answer queries faster. In April 2025, Trump banned the export of the company's H20 chip to China, as chips like Nvidia's are critical in the race to develop AI technologies. In its first quarter report, Nvidia said it expects to miss $8 billion in potential sales because of the ban. Despite the expanded limitations on exports, Nvidia continues to grow and even briefly passed Microsoft again in June as the world's most valuable company. Looking forward, some even expect it could be the first company to hit a $4 trillion market cap. '[Nvidia] really got the AI revolution going,' ARK Invest founder Cathie Wood told Yahoo Finance earlier this year, 'and we think it's still going to play a mighty role." — Nina is a data reporter intern for Yahoo Finance. Sign in to access your portfolio

Nvidia: How the chipmaker evolved from a gaming startup to an AI giant
Nvidia: How the chipmaker evolved from a gaming startup to an AI giant

Yahoo

time3 hours ago

  • Yahoo

Nvidia: How the chipmaker evolved from a gaming startup to an AI giant

Over the past two decades, Nvidia (NVDA) has skyrocketed into global conversation. The semiconductor company is considered an international leader in the design and manufacturing of computer chips and helped revolutionize the rise of artificial intelligence (AI). Beyond its strengths in the gaming, data, and AI fields, Nvidia announced plans this March for a quantum research center in Boston, where CEO Jensen Huang said researchers could tackle problems from drug discovery to materials development. Here's a look at Nvidia's path to where it is today, from creating hardware for the gaming industry to designing the chips that power AI. On April 5, 1993, Jensen Huang, Chris Malachowsky, and Curtis Priem founded Nvidia with an initial focus on designing and producing 3D graphics processors for computing and video games. The company's first product release, the multimedia processor NV1, didn't get the reception the founders were hoping for. What followed was a financial situation so dire that Nvidia laid off half its staff, leading to its unofficial motto: 'Our company is 30 days from going out of business.' In addition to the NV1's unimpressive return, a partnership that Nvidia had forged with Japanese video game company Sega to produce console graphics chips fell through, adding to the pressure. However, even as it pivoted to another company for chips, Sega invested $5 million in Nvidia — funding which allowed Nvidia to survive going out of business. Despite financial challenges and a smaller team, Nvidia released its next chip in 1997. It was a success. RIVA 128 allowed for support of high-resolution 2D and 3D graphics, and over a million units were sold in its first four months of sales. With the foundation of RIVA 128 sales, Nvidia produced RIVA TNT, which further cemented its place in the industry with better image quality and performance. Two years later, on Jan. 22, 1999, Nvidia went public on the New York Stock Exchange (NYSE) at $12 a share, and by May, it shipped out its 10,000,000th graphics processor. Later in 1999, Nvidia released GeForce 256, calling it the world's first 'Graphics Processing Unit.' By marketing the chip directly to customers instead of just including it within a device or console, the company popularized the term 'GPU.' With their ability to break larger tasks into smaller ones that could run at the same time, known as parallel processing, GPUs took on the heavy workload of powering graphics. It allowed devices to work on other processing functions faster, which meant GeForce 256 offered smoother, faster, and more realistic graphics. Finding growing success in supplying GPUs to both customers and consoles like Xbox, Nvidia joined the Nasdaq 100 and the S&P 500 in 2001. In 2006, Nvidia launched CUDA, a platform that allowed users to access their GPUs' parallel processing capabilities to run their own software instead of just graphics. Between 2006 to 2017, Nvidia invested nearly $12 billion in research & development with a large portion of those funds going towards CUDA. CUDA downloads slowed entering the 2010s, and while CUDA provided users with the ability to use chips for purposes other than gaming, it didn't initially seem to pay off for investors. 'Some investors were big Nvidia fans in the late 2000s and gave them the benefit of the doubt for the first five years of the CUDA investment," Acquired podcast co-host Ben Gilbert said in a 2022 episode. "But in the mid-2010s, market demand still wasn't showing up in a big way, and it was becoming a bigger and bigger investment." However, later technological developments would make CUDA crucial to the company. "It made all of our products more expensive since we were selling these gamer cards while putting computing acceleration into them," Bryan Catanzaro, Nvidia's vice president of applied deep learning research, told Yahoo Finance in 2023. "It took a lot of commitment to follow through. … I would say it was about 10 years before Wall Street really started to believe this investment was worth anything." In 2012, students Alex Krizhevsky and Ilya Sutskever used CUDA to train the visual-recognition neural network AlexNet with two Nvidia GPUs. AlexNet's breakthrough performance in identifying images demonstrated that using GPUs to train machine learning models cut training times significantly compared to the CPUs that were previously used. Following this advancement, Nvidia began pivoting its focus to artificial intelligence, supported by its revenue from gaming. By 2016, it announced the DGX-1, a system designed specifically for deep learning and the large language models that were on the rise. That year, Nvidia stock nearly tripled in price. 'It's 'destiny meets serendipity,'' Nvidia CEO Jensen Huang told Yahoo Finance at the time. 'People think it's an overnight success, but like most overnight successes, it took us years.' At the same time, Nvidia took the opportunity to make strategic acquisitions, such as wireless company Icera in 2011 and hardware company The Portland Group in 2013. It tried to acquire the semiconductor and design company Arm (ARM) in 2020, but the deal ultimately fell through after regulatory concerns. In March 2022, Nvidia announced the H100 'Hopper' chip, promising faster training and better performance for artificial intelligence. Controlling a significant majority of the market share with this GPU, major companies, including Alphabet (GOOG), Amazon (AMZN), and Microsoft (MSFT), turned to Nvidia with billions as they began to develop AI and data-driven products. One such company is OpenAI ( whose relationship with Nvidia stretches back to 2016, when Nvidia donated the first DGX-1 supercomputer to the startup. In November 2022, OpenAI launched ChatGPT, a language model built on Nvidia GPUs that quickly reached headlines. In less than two months, ChatGPT set the record for the fastest-growing consumer application in history, according to a UBS study, reaching 100 million monthly active users in January 2023. "A new computing era has begun," Nvidia CEO Jensen Huang said in a 2023 statement. "Companies worldwide are transitioning from general-purpose to accelerated computing and generative AI." With investors increasingly interested in artificial intelligence and as the demand for GPUs to run models continues to grow, Nvidia's revenue for the quarter ending in January 2024 more than doubled its results year over year. Following the quarterly report's release, Nvidia had the largest one-day gain in stock market history, adding $277 billion in value. It then hit a valuation of $2 trillion the following day. The record wouldn't stand for long, however; Nvidia beat it again just two months later. That March, Nvidia announced its next chip: Blackwell. Offering higher performance with reduced cost and energy consumption, the chips were designed to work better than previous versions when linked to work together in large numbers. Soon after, Nvidia announced a 10-for-1 stock split in June 2024. Following the split, it passed Microsoft and Apple (AAPL) to become the world's most valuable company at $3.3 trillion. By November 2024, it was added to the Dow Jones Industrial Average. Despite its successes, Nvidia also encountered challenges throughout its rise. In 2018, it faced a class-action lawsuit alleging it did not properly disclose to investors the impact of the cryptocurrency market on revenue from sales of GPUs. At the time, 'miners' of cryptocurrencies such as bitcoin (BTC-USD) and ethereum (ETH-USD) used the GPUs to complete transactions and secure new crypto tokens. The process requires significant computational power, which made Nvidia GPUs a popular choice. Nvidia paid a $5.5 million settlement in 2022 to the SEC because of the issue, and in December 2024, the Supreme Court dismissed Nvidia's appeal, allowing the 2018 case to proceed. This wasn't Nvidia's first time managing legal issues regarding its chips. In 2016, it settled a case involving the marketed performance and actual capabilities of its GTX 970, with payouts of $30 per purchase. On top of legal issues, there is also the challenge of supply keeping up with demand. A global chip shortage first occurred in early 2020 as a result of the coronavirus pandemic and an increased reliance on technology for remote work. Other factors that lengthened the shortage through 2023 included the initial US-China trade war, severe weather events, and the Russia-Ukraine war. A December 2024 report from the IDC projected global demand for AI and high-performance computing (HPC) to grow by over 15% in 2025. President Trump announced Project Stargate in January 2025, which involves tech companies such as Oracle (ORCL), OpenAI, and SoftBank (SFTBY) investing $500 billion in AI infrastructure in the United States over the next four years. Nvidia, as a technology partner to the project, saw a jump in its stock, and reached a $3.6 trillion market cap. Later in the month, however, the Chinese company DeepSeek released its own AI model, which was reportedly trained at a significantly lower cost than that of competitors. Following the announcement, Nvidia stock dropped $589 billion, almost 17%, marking the largest single-day loss in stock market history. Following the drop, March 2025 brought the debut of Nvidia's Blackwell Ultra, the successor to Blackwell. The new chip was announced to have 1.5 times the performance of the previous chip, which could help AI models answer queries faster. In April 2025, Trump banned the export of the company's H20 chip to China, as chips like Nvidia's are critical in the race to develop AI technologies. In its first quarter report, Nvidia said it expects to miss $8 billion in potential sales because of the ban. Despite the expanded limitations on exports, Nvidia continues to grow and even briefly passed Microsoft again in June as the world's most valuable company. Looking forward, some even expect it could be the first company to hit a $4 trillion market cap. '[Nvidia] really got the AI revolution going,' ARK Invest founder Cathie Wood told Yahoo Finance earlier this year, 'and we think it's still going to play a mighty role." — Nina is a data reporter intern for Yahoo Finance.

DOWNLOAD THE APP

Get Started Now: Download the App

Ready to dive into a world of global content with local flavor? Download Daily8 app today from your preferred app store and start exploring.
app-storeplay-store