Listening to All the Music AI Is Trained on Would Take Decades

Four leaked AI music training datasets contain more than 31 million songs, including hits by artists such as Taylor Swift, Bad Bunny, and the Beatles. Listening to just one of the datasets from start to finish would take 91 years. AI-generated music can closely replicate existing tracks, raising copyright concerns, as seen in lawsuits against platforms like Suno, which has produced songs resembling Michael Jackson’s *Thriller* and Ed Sheeran’s *Shape of You*.

Four leaked datasets reveal the scale of the music used to train AI systems: together they total more than 31 million tracks. One dataset alone contains 12 million songs, which would take 91 years of continuous listening to play through. The collections include major hits by artists such as Bad Bunny, Taylor Swift, Billie Eilish, and the Beatles, alongside jazz, classical, and lesser-known works. They span genres and decades, from 1998’s *You Get What You Give* by the New Radicals to modern pop and jazz classics.

The datasets also highlight how AI music generators can replicate existing songs, sometimes closely. Suno, a popular AI music platform, has faced legal action from record labels for producing tracks resembling Michael Jackson’s *Thriller*, Ed Sheeran’s *Shape of You*, and Chuck Berry’s *Johnny B. Goode*. A Suno spokesperson said the company has safeguards against unauthorized distribution but did not address the specific lawsuits or the claims about its training data.

AI companies keep their training data secret, but these datasets confirm the massive volume of music involved. That scale raises questions about copyright, artistic integrity, and the ethical use of existing works in AI training.

Researchers accessed the datasets through academic papers, revealing how AI models learn from vast musical libraries. The findings underscore the industry’s reliance on copyrighted material, with potential legal and creative repercussions. The datasets are publicly searchable, which allows outside scrutiny of what the models were trained on.
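The 91-year figure for the 12-million-song dataset can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the function names are made up for this example, and the 4-minute average track length is an assumption, not a number reported for the datasets.

```python
# Rough check of the "12 million songs = 91 years" claim.
# Uses an average year of 365.25 days.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes


def listening_years(track_count: int, avg_minutes: float) -> float:
    """Years of nonstop playback for track_count tracks of avg_minutes each."""
    return track_count * avg_minutes / MINUTES_PER_YEAR


def implied_avg_minutes(track_count: int, years: float) -> float:
    """Average track length implied by a total playback time in years."""
    return years * MINUTES_PER_YEAR / track_count


if __name__ == "__main__":
    # Average length implied by the article's two figures.
    print(f"{implied_avg_minutes(12_000_000, 91):.2f} minutes per track")
    # Assumed 4-minute average (hypothetical) applied to 12 million tracks.
    print(f"{listening_years(12_000_000, 4.0):.1f} years")
```

Under these assumptions, the two reported figures imply an average track of roughly four minutes, which is a plausible length for a pop song, so the claim is internally consistent.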

This content was automatically generated and/or translated by AI. It may contain inaccuracies. Please refer to the original sources for verification.
