Child sex abuse images found in dataset training image generators, report says


https://arstechnica.com/tech-policy/2023/12/child-sex-abuse-images-found-in-dataset-training-image-generators-report-says/

More than 1,000 known child sexual abuse material (CSAM) images were found in a large open dataset, known as LAION-5B, that was used to train popular text-to-image generators such as Stable Diffusion, Stanford Internet Observatory (SIO) researcher David Thiel revealed on Wednesday.

SIO's report appears to confirm rumors that have swirled on the Internet since 2022 that LAION-5B included illegal images, Bloomberg reported. In an email to Ars, Thiel warned that "the inclusion of child abuse material in AI model training data teaches tools to associate children in illicit sexual activity and uses known child abuse images to generate new, potentially realistic child abuse content."