Latest news with #Poolside


TechCrunch
06-06-2025
- Business
- TechCrunch
EleutherAI releases massive AI training dataset of licensed and open domain text
EleutherAI, an AI research organization, has released what it claims is one of the largest collections of licensed and open-domain text for training AI models. The dataset, called The Common Pile v0.1, took around two years to complete in collaboration with AI startups Poolside, Hugging Face, and others, along with several academic institutions. Weighing in at 8 terabytes in size, The Common Pile v0.1 was used to train two new AI models from EleutherAI, Comma v0.1-1T and Comma v0.1-2T, that EleutherAI claims perform on par with models developed using unlicensed, copyrighted data. AI companies, including OpenAI, are embroiled in lawsuits over their AI training practices, which rely on scraping the web — including copyrighted material like books and research journals — to build model training datasets. While some AI companies have licensing arrangements in place with certain content providers, most maintain that the U.S. legal doctrine of fair use shields them from liability in cases where they trained on copyrighted work without permission. EleutherAI argues that these lawsuits have 'drastically decreased' transparency from AI companies, which the organization says has harmed the broader AI research field by making it more difficult to understand how models work and what their flaws might be. '[Copyright] lawsuits have not meaningfully changed data sourcing practices in [model] training, but they have drastically decreased the transparency companies engage in,' Stella Biderman, EleutherAI's executive director, wrote in a blog post on Hugging Face early Friday. 'Researchers at some companies we have spoken to have also specifically cited lawsuits as the reason why they've been unable to release the research they're doing in highly data-centric areas.' 
The Common Pile v0.1, which can be downloaded from Hugging Face's AI dev platform and GitHub, was created in consultation with legal experts, and it draws on sources including 300,000 public domain books digitized by the Library of Congress and the Internet Archive. EleutherAI also used Whisper, OpenAI's open-source speech-to-text model, to transcribe audio content. EleutherAI claims Comma v0.1-1T and Comma v0.1-2T are evidence that the Common Pile v0.1 was curated carefully enough to enable developers to build models competitive with proprietary alternatives. According to EleutherAI, the models, both of which are 7 billion parameters in size and were trained on only a fraction of the Common Pile v0.1, rival models like Meta's first Llama AI model on benchmarks for coding, image understanding, and math. Parameters, sometimes referred to as weights, are the internal components of an AI model that guide its behavior and answers. 'In general, we think that the common idea that unlicensed text drives performance is unjustified,' Biderman wrote in her post. 'As the amount of accessible openly licensed and public domain data grows, we can expect the quality of models trained on openly licensed content to improve.' The Common Pile v0.1 appears to be in part an effort to right EleutherAI's historical wrongs. Years ago, the company released The Pile, an open collection of training text that includes copyrighted material.
AI companies have come under fire — and legal pressure — for using The Pile to train models. EleutherAI is committing to releasing open datasets more frequently going forward in collaboration with its research and infrastructure partners.


NDTV
21-05-2025
- Business
- NDTV
Paris Beats London As Europe's Leading Tech Ecosystem
Stockholm: Paris has been named as the new European tech champion, beating London for the first time on some metrics, according to data from Dealroom, which collects information on startups and venture capital firms. Between 2017 and 2024, the combined enterprise value of Paris startups increased 5.3 times, compared with 4.2 times for London, Dealroom said, after assessing dozens of metrics that contribute to a successful tech ecosystem. Although London attracted bigger funding rounds, the actual valuations of the companies have not increased dramatically, while the funding rounds secured by Paris-based companies have had a bigger impact on valuations, it said. French tech companies, including Mistral AI and Poolside, raised $7.8 billion last year, less than London's $11.3 billion. Europe has been falling behind other regions in tech innovations, with only some countries trying to boost tech investments. While the market capitalisation of global tech, media and telecom companies rose from $7 trillion in 2000 to $34 trillion last year, Europe's share dropped from 30 per cent to just 7 per cent, a McKinsey report said on Wednesday. If Europe had maintained its share, it would have generated an additional $8 trillion in market value, it said. Paris is also the only European city on Dealroom's top five global champions list, which is dominated by U.S. cities. It comes a month ahead of Paris hosting one of the largest global tech conferences, VivaTech, featuring top executives from companies such as Nvidia, Alibaba, Meta, OpenAI, Mistral, Anthropic and Cohere. Last year's conference was attended by more than 165,000 people. "It's not just about the competitiveness of Paris on the AI scene today, it's also about what will happen next and how we can keep on attracting the talent, investment, and the tech activities," Francois Bitouzet, managing director of VivaTech, told Reuters. 
Since coming to power in 2017, French President Emmanuel Macron has talked about wanting France to be a world leader in AI and 'deep-tech', inviting several firms to invest in the country and pushing for creation of startup incubator Station F.


Times of India
21-05-2025
- Business
- Times of India
Paris named as Europe's leading tech ecosystem, beating London
By Supantha Mukherjee
STOCKHOLM: Paris has been named as the new European tech champion, beating London for the first time on some metrics, according to data from Dealroom, which collects information on startups and venture capital firms. Between 2017 and 2024, the combined enterprise value of Paris startups increased 5.3 times, compared with 4.2 times for London, Dealroom said, after assessing dozens of metrics that contribute to a successful tech ecosystem. Although London attracted bigger funding rounds, the actual valuations of the companies have not increased dramatically, while the funding rounds secured by Paris-based companies have had a bigger impact on valuations, it said. French tech companies, including Mistral AI and Poolside, raised $7.8 billion last year, less than London's $11.3 billion. Europe has been falling behind other regions in tech innovations, with only some countries trying to boost tech investments. While the market capitalisation of global tech, media and telecom companies rose from $7 trillion in 2000 to $34 trillion last year, Europe's share dropped from 30% to just 7%, a McKinsey report said on Wednesday. If Europe had maintained its share, it would have generated an additional $8 trillion in market value, it said. Paris is also the only European city on Dealroom's top five global champions list, which is dominated by U.S. cities. It comes a month ahead of Paris hosting one of the largest global tech conferences, VivaTech, featuring top executives from companies such as Nvidia, Alibaba, Meta, OpenAI, Mistral, Anthropic and Cohere. Last year's conference was attended by more than 165,000 people. "It's not just about the competitiveness of Paris on the AI scene today, it's also about what will happen next and how we can keep on attracting the talent, investment, and the tech activities," Francois Bitouzet, managing director of VivaTech, told Reuters.
Since coming to power in 2017, French President Emmanuel Macron has talked about wanting France to be a world leader in AI and 'deep-tech', inviting several firms to invest in the country and pushing for creation of startup incubator Station F.


CNA
21-05-2025
- Business
- CNA
Paris named as Europe's leading tech ecosystem, beating London
STOCKHOLM: Paris has been named as the new European tech champion, beating London for the first time on some metrics, according to data from Dealroom, which collects information on startups and venture capital firms. Between 2017 and 2024, the combined enterprise value of Paris startups increased 5.3 times, compared with 4.2 times for London, Dealroom said, after assessing dozens of metrics that contribute to a successful tech ecosystem. Although London attracted bigger funding rounds, the actual valuations of the companies have not increased dramatically, while the funding rounds secured by Paris-based companies have had a bigger impact on valuations, it said. French tech companies, including Mistral AI and Poolside, raised $7.8 billion last year, less than London's $11.3 billion. Europe has been falling behind other regions in tech innovations, with only some countries trying to boost tech investments. While the market capitalisation of global tech, media and telecom companies rose from $7 trillion in 2000 to $34 trillion last year, Europe's share dropped from 30 per cent to just 7 per cent, a McKinsey report said on Wednesday. If Europe had maintained its share, it would have generated an additional $8 trillion in market value, it said. Paris is also the only European city on Dealroom's top five global champions list, which is dominated by U.S. cities. It comes a month ahead of Paris hosting one of the largest global tech conferences, VivaTech, featuring top executives from companies such as Nvidia, Alibaba, Meta, OpenAI, Mistral, Anthropic and Cohere. Last year's conference was attended by more than 165,000 people. "It's not just about the competitiveness of Paris on the AI scene today, it's also about what will happen next and how we can keep on attracting the talent, investment, and the tech activities," Francois Bitouzet, managing director of VivaTech, told Reuters.
Since coming to power in 2017, French President Emmanuel Macron has talked about wanting France to be a world leader in AI and 'deep-tech', inviting several firms to invest in the country and pushing for creation of startup incubator Station F.


Reuters
21-05-2025
- Business
- Reuters
Paris named as Europe's leading tech ecosystem, beating London
STOCKHOLM, May 21 (Reuters) - Paris has been named as the new European tech champion, beating London for the first time on some metrics, according to data from Dealroom, which collects information on startups and venture capital firms. Between 2017 and 2024, the combined enterprise value of Paris startups increased 5.3 times, compared with 4.2 times for London, Dealroom said, after assessing dozens of metrics that contribute to a successful tech ecosystem. Although London attracted bigger funding rounds, the actual valuations of the companies have not increased dramatically, while the funding rounds secured by Paris-based companies have had a bigger impact on valuations, it said. French tech companies, including Mistral AI and Poolside, raised $7.8 billion last year, less than London's $11.3 billion. Europe has been falling behind other regions in tech innovations, with only some countries trying to boost tech investments. While the market capitalisation of global tech, media and telecom companies rose from $7 trillion in 2000 to $34 trillion last year, Europe's share dropped from 30% to just 7%, a McKinsey report said on Wednesday. If Europe had maintained its share, it would have generated an additional $8 trillion in market value, it said. Paris is also the only European city on Dealroom's top five global champions list, which is dominated by U.S. cities. It comes a month ahead of Paris hosting one of the largest global tech conferences, VivaTech, featuring top executives from companies such as Nvidia (NVDA.O), Alibaba, Meta (META.O), OpenAI, Mistral, Anthropic and Cohere. Last year's conference was attended by more than 165,000 people. "It's not just about the competitiveness of Paris on the AI scene today, it's also about what will happen next and how we can keep on attracting the talent, investment, and the tech activities," Francois Bitouzet, managing director of VivaTech, told Reuters.
Since coming to power in 2017, French President Emmanuel Macron has talked about wanting France to be a world leader in AI and 'deep-tech', inviting several firms to invest in the country and pushing for creation of startup incubator Station F.