MBW Reacts is a collection of analytical commentaries from Music Business Worldwide written in response to major recent entertainment events or news stories. Only MBW+ subscribers have unlimited access to these articles.
Last week, Nikkei Asia reported that researchers at Sony Group have been working on technology to identify copyrighted music embedded in AI-generated tracks.
The story was widely picked up, with coverage framing the development as a kind of next-generation detection tool that could help songwriters claim compensation from AI developers.
But the broader underlying research by the team at Sony AI appears to go considerably further than that framing suggests.
In a blog post published in December, Sony AI highlighted three papers accepted at major academic conferences for AI and audio research.
The research, according to the blog post, is focused on “musical integrity in the age of machine learning, exploring attribution, recognition, and protection,” and is “part of a growing body of work exploring how AI can unlearn what doesn’t belong to it, how connections between musical segments can be identified, and how effective current audio authentication methods are.”
As we noted last week, this work is part of Sony AI’s broader research, and the company has not announced any commercial rollout.
Sony AI, according to its about page, was established as a division of Japan-headquartered tech and entertainment giant Sony Group in April 2020 to “pursue groundbreaking research in AI and robotics to unleash human imagination and creativity with AI”. Sony AI has offices in North America, Europe, India, and Japan.
MBW has read through the three research papers detailed by Sony AI. Here’s what we found…
1. Attribution: ‘Unlearning’ can trace which songs shaped an AI model’s output, even when nothing sounds alike
Sony AI’s blog post introduces the first challenge as attribution, or “understanding which training data influenced what an AI system creates.”
As the blog puts it, “when an unlicensed generative model composes a new song from a text prompt, it doesn’t include any record of attribution. But Sony AI’s researchers believe it can still be determined.”
The paper, titled Large-Scale Training Data Attribution for Music Generative Models via Unlearning, was accepted at the NeurIPS 2025 Creative AI Track. It proposes a method for identifying which songs in an AI model’s training data most affected a specific generated output. Rather than comparing generated tracks against a catalog of existing music, it works by selectively “forgetting” the generated track from the model, then measuring which training songs are most affected by that removal.
To test the approach, the researchers ran it against alternative methods. The so-called “unlearning” method produced sharper results, with influence concentrated in a small number of training tracks, while similarity-based methods showed broader, less focused patterns. When used to identify a known training track, the system achieved perfect identification while the model’s overall quality remained unchanged.
The authors describe their work as the first to explore attribution on a text-to-music model trained on a large, diverse dataset. They frame it as a practical framework for applying unlearning-based attribution at scale.
Conclusion: By “unlearning” a generated track and observing the ripple effects, this method can pinpoint which training songs influenced an AI’s output, even when the output doesn’t obviously resemble them. As Sony AI’s blog notes, “by showing what happens when models forget, Sony AI’s researchers hope to help recognise the works of the original artists.”
Read the full paper here
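To make the unlearning idea concrete, here is a minimal numpy sketch under heavily simplified, assumed conditions: the “model” is just a parameter vector fitted to toy song embeddings, and “unlearning” is plain gradient ascent on the generated track’s loss. This is an illustration of the general principle, not Sony AI’s actual method, which operates on a full text-to-music model.

```python
import numpy as np

# Five toy "songs" as orthogonal feature vectors (stand-ins for audio embeddings)
songs = {f"song_{i}": np.eye(5)[i] for i in range(5)}

# Toy "model": its parameters are simply the mean of the training embeddings
X = np.stack(list(songs.values()))
w = X.mean(axis=0)

# A "generated output" that is essentially a copy of song_2
generated = songs["song_2"].copy()

def loss(params, x):
    return float(np.sum((params - x) ** 2))

# Step 1: "unlearn" the generated track via gradient ascent on its loss
w_unlearned = w.copy()
for _ in range(10):
    grad = 2.0 * (w_unlearned - generated)  # d(loss)/d(params)
    w_unlearned += 0.05 * grad              # ascent: push the loss on `generated` up

# Step 2: attribution = which training songs' loss rose most after unlearning
influence = {name: loss(w_unlearned, x) - loss(w, x) for name, x in songs.items()}
most_influential = max(influence, key=influence.get)
print(most_influential)  # song_2
```

The key property mirrors the paper’s claim: the training song that shaped the output is the one whose loss degrades most when the output is forgotten, with no catalog comparison involved.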
2. Recognition: Segment-level matching can catch the kind of borrowing AI actually does
Sony AI’s blog frames the second strand as recognition, or mapping “the relationships between works.”
As the blog explains: “Two songs may not be identical, but they may still share a melody, rhythm, or phrasing that links them across eras or items in a given catalogue.”
The paper, accepted at ICML 2025, introduces CLEWS [Supervised Contrastive Learning from Weakly-Labeled Audio Segments for Musical Version Matching]. The system detects when two recordings are different versions of the same piece. The key innovation is that it works with 20-second audio snippets rather than whole tracks. As the authors note, the segments that matter in real-world cases are much shorter than full song length.
On two public benchmarks, CLEWS outperformed all existing methods. While competing systems saw steep accuracy drops with shorter audio clips, CLEWS maintained high accuracy down to just 10 seconds. The paper lists plagiarism and near-duplicate detection among its applications.
Conclusion: CLEWS can identify shared musical material between recordings at the segment level, even in short clips. As Sony AI’s blog puts it, this kind of fine-grained detection “could support copyright protection and content tracking systems, helping identify near-duplicates or unauthorised versions that might slip past traditional matching tools.”
You can read the full paper here
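The segment-level intuition behind CLEWS can be shown in a toy sketch. The real system learns segment embeddings with supervised contrastive training; here, purely for illustration, we compare invented raw feature segments by cosine similarity. Two “tracks” look unrelated as wholes, yet a single shared 20-sample phrase is caught by segment-wise comparison:

```python
import numpy as np

def segment(signal, seg_len):
    """Split a 1-D feature sequence into fixed-length segments."""
    n = len(signal) // seg_len
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n)]

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_segment_match(track_a, track_b, seg_len=20):
    """Highest similarity over all segment pairs: whole tracks may differ
    while one shared phrase still scores high."""
    return max(cosine(sa, sb)
               for sa in segment(track_a, seg_len)
               for sb in segment(track_b, seg_len))

rng = np.random.default_rng(1)
melody = rng.normal(size=20)  # a shared 20-sample "phrase"
track_a = np.concatenate([rng.normal(size=40), melody, rng.normal(size=40)])
track_b = np.concatenate([melody, rng.normal(size=80)])

whole = cosine(track_a, track_b)            # low: full tracks look unrelated
seg = best_segment_match(track_a, track_b)  # ~1.0: the shared phrase is found
```

Whole-track similarity dilutes the borrowed phrase across everything around it; matching at the segment level is what lets short borrowings surface, which is the behaviour the paper targets.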
3. Protection: Can audio watermarking survive AI compression?
Sony AI’s blog frames the third strand, protection, around a blunt question: “Can current watermarking methods withstand real-world transformations?”
As the blog notes: “As audio compression becomes increasingly powered by neural networks… the very signals that watermarking systems rely on to prove authenticity are being erased.”
The paper, accepted at INTERSPEECH 2025, introduces RAW-Bench [Robust Audio Watermarking Benchmark], a framework that tests how well watermarking algorithms hold up against 20 real-world distortions, including compression, background noise, reverb, and time stretching. The researchers evaluated four publicly available algorithms on a dataset spanning music, speech, and environmental sounds.
The key finding concerns neural audio codecs, the AI-powered compression tools used to shrink audio files. Against the Descript Audio Codec, every watermarking algorithm scored zero on full-message accuracy, meaning not a single watermark was fully recovered intact. Even after retraining two algorithms to resist these attacks, both still scored zero on this measure. Some algorithms managed partial bit recovery, but at levels too low to be practically useful.
The explanation is simple: watermarks hide information inside audio, while neural codecs strip out anything inaudible. Since codecs typically come last in the processing chain, they get the last word.
Conclusion: Current audio watermarking can’t survive AI-powered compression. As Sony AI’s blog suggests, “future watermarking systems may need to collaborate with codecs rather than fight against them, embedding identity in ways that persist through transformation rather than being filtered out by it.”
Read the full paper here.
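The codec-versus-watermark conflict is easy to reproduce in miniature. The sketch below uses a crude spectral “codec” that keeps only the single loudest frequency bin, a deliberately exaggerated stand-in for a neural codec discarding inaudible detail; the watermark scheme (one faint carrier tone per payload bit) is likewise invented for illustration and is not any of the benchmarked algorithms:

```python
import numpy as np

t = np.linspace(0, 1, 1024, endpoint=False)
audio = np.sin(2 * np.pi * 220 * t)  # the "audible" content: a 220 Hz tone

# Hypothetical watermark: one faint carrier tone per payload bit
bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0]
carrier_freqs = [300 + 10 * i for i in range(16)]  # one FFT bin per bit
watermark = sum(b * 0.001 * np.sin(2 * np.pi * f * t)
                for b, f in zip(bits, carrier_freqs))
marked = audio + watermark

def decode(signal):
    """Read the payload back from the carrier bins."""
    spec = np.abs(np.fft.rfft(signal))
    return [int(spec[f] > 0.1) for f in carrier_freqs]

def toy_codec(signal, keep=1):
    """Exaggerated stand-in for a neural codec: keep only the strongest
    spectral component (the clearly audible part), drop everything faint."""
    spec = np.fft.rfft(signal)
    spec[np.argsort(np.abs(spec))[:-keep]] = 0
    return np.fft.irfft(spec, n=len(signal))

before = decode(marked)            # payload survives plain playback
after = decode(toy_codec(marked))  # codec strips the faint carriers
print(before == bits, sum(after))  # True 0
```

Because the codec optimises for what is audible, the low-energy carriers are exactly what it throws away, which is the structural problem RAW-Bench documents for real watermarking systems.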
The bigger picture
Together, these three papers describe a layered technical framework: attribution traces influence at the model level, recognition maps relationships at the fragment level, and watermarking benchmarks reveal where current protections fall short.
Sony AI says that its researchers “are helping define how balancing innovation with accountability can work in the future of generative music: with AI that remembers its sources, hears its connections, and safeguards its signal”.
Looking ahead, Sony AI’s research in this area doesn’t appear to be slowing down.
In a separate blog post published in February, Sony’s AI research unit said it will have more than 10 papers accepted at ICLR 2026, spanning “generative modeling, diffusion, multimodal representation learning, and creator-focused AI systems.”
Among the topics listed is “AI-assisted music post-production.”




