Updated: Recording Industry Ass. of America orchestrates war on Udio and Suno| www.theregister.com
High-flying AI scientist claims unfair dismissal following pregnancy leave| www.theregister.com
Claims allegedly pirated content from Books3 dataset trawled by its models| www.theregister.com
Microsoft and OpenAI fail to shake off AI infringement allegations| www.theregister.com
Class action alleges pirated novels were fed into binary brainbox| www.theregister.com
Copilot code-cloning case clarifies claims| www.theregister.com
Judge won't toss out two key charges, software source slurping case still on| www.theregister.com
Large language models (LMs) have been shown to memorize parts of their training data, and when prompted appropriately, they will emit the memorized training data verbatim. This is undesirable because memorization violates privacy (exposing user data), degrades utility (repeated easy-to-memorize text is often low quality), and hurts fairness (some texts are memorized over others). We describe three log-linear relationships that quantify the degree to which LMs emit memorized training data. Mem...| arXiv.org
Publishers want ChatGPT models destroyed after ML tech trained 'unlawfully' on articles| www.theregister.com
(a) False Copyright Management Information.—No person shall knowingly and with the intent to induce, enable, facilitate, or conceal infringement—| LII / Legal Information Institute