DataSet/DataTable - Search News

How to Tell a Good Speech Dataset for AI From a Bad One

Speech AI datasets look interchangeable until production exposes gaps in transcripts, speakers, audio conditions, licenses, ...

Tech Times

AI Chart Understanding Breakthrough: MIT-IBM Dataset Lets Small Models Beat GPT-4o

MIT and IBM released ChartNet, a 1.7-million-sample synthetic training dataset that lets compact open-source vision-language ...

6d · Opinion

Why Government-Provided Data May Actually Be Bad For The Economy

Free government data lowers the cost of academic research, but it can also divert talented people from more productive work ...

C&EN

Chemists ran 50,688 reactions to make a huge open dataset

The dataset, which the researchers have made available on the Open Reaction Database, is nearly five times as large as the ...

Frontiers

Showcasing FAIR² Data Articles: Unlocking Trustworthy, AI-Ready Scientific Data for Reuse and Impact in Space Technologies

Scientific knowledge is fundamentally built on data; yet, for too long, research datasets have remained siloed, poorly documented, and inconsistently ...

VentureBeat

Table-augmented generation shows promise for complex dataset querying, outperforms text-to-SQL

AI has transformed the way companies work and interact with data. A few years ago, teams had to write SQL queries and code to extract useful information from large swathes of data. Today, all they ...

Wired

Harvard Is Releasing a Massive Free AI Training Dataset Funded by OpenAI and Microsoft

Harvard University announced Thursday it’s releasing a high-quality dataset of nearly 1 million public-domain books that could be used by anyone to train large language models and other AI tools. The ...
