this post was submitted on 25 Oct 2023
72 points (80.0% liked)

[–] Chozo@kbin.social 23 points 1 year ago (2 children)

It knows what naked people look like, and it knows what children look like. It doesn't need naked children to fill in those gaps.

Also, these models are trained on images scraped from the clear net. Somebody would have had to manually add CSAM to the training data, which could easily be traced back to them. The likelihood of actual CSAM being included in any mainstream AI's training material is slim to none.

[–] BetaDoggo_@lemmy.world 5 points 1 year ago

There is likely some CSAM in most of these models, since filtering it out of a set of several billion images is nearly impossible even with automated methods. However, this material likely has little to no effect on outputs, since it is scarce and was probably tagged incorrectly.
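The "automated methods" mentioned above typically mean perceptual-hash blocklists: each image is hashed, and anything close to a known-bad hash is dropped. Real pipelines use robust hashes such as PhotoDNA or PDQ; as a toy illustration only, here is a sketch using a simple "average hash" over 8x8 grayscale grids (all function names and the threshold are hypothetical, not from any actual dataset pipeline):

```python
# Toy sketch of hash-based dataset filtering. Images are represented as
# flat lists of 64 grayscale values (an 8x8 grid, 0-255). This is NOT a
# production perceptual hash; it only illustrates the filtering idea.

def average_hash(pixels):
    """64-bit average hash: bit is 1 where the pixel exceeds the mean."""
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two 64-bit hashes."""
    return bin(a ^ b).count("1")

def filter_dataset(images, blocklist_hashes, max_distance=4):
    """Keep only images whose hash is far from every blocklisted hash."""
    kept = []
    for pixels in images:
        h = average_hash(pixels)
        if all(hamming(h, bad) > max_distance for bad in blocklist_hashes):
            kept.append(pixels)
    return kept
```

At billions of images, even a small miss rate (unhashed material, heavy crops, re-encodes that defeat the hash) leaves some content behind, which is why perfect filtering is considered infeasible.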

The bigger concern is downstream users fine-tuning models on their own datasets containing this material. This has been happening for a while, though I won't point fingers (Japan).

There's not a whole lot that can be done about it, but I also don't think anything needs to be done. It's already illegal, and it's already removed from most platforms semi-automatically. Having more of it won't change that.