Then we should be able to charge AI (or rather, its developers) with the same disgusting crime, and shut AI down.
Camera-makers, too. And people who make pencils. Lock the whole lot up, the sickos.
....no
That'd be like outlawing hammers because someone figured out they make a great murder weapon.
Just because you can use a tool for crime, doesn't mean that tool was designed/intended for crime.
It would be more like outlawing ivory grand pianos because they require dead elephants to make - the AI models in question here were trained on abuse.
Sounds to me like it would be more akin to outlawing grand pianos because of all the dead elephants - while some people claim that it is possible to make a grand piano without killing elephants.
There's CSAM in the training set[1] used for these models so some elephants have been murdered to make this piano.
3,226 suspected images out of 5.8 billion. About 0.00006%. And probably mislabeled to boot, or it would have been caught earlier. I doubt it had any significant impact on the model's capabilities.
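For scale, the fraction quoted above can be checked with a quick back-of-the-envelope calculation (figures taken from that comment):

```python
# Back-of-the-envelope check of the figures quoted above:
# 3,226 suspected images in a training set of 5.8 billion.
suspected = 3_226
total = 5_800_000_000

fraction = suspected / total
print(f"{fraction * 100:.5f}%")  # ~0.00006% of the training set
```

Which matches the "about 0.00006%" claim, though it says nothing either way about whether those images were mislabeled or what effect they had on the model.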
It is amazing how Lemmy can usually be such a well-informed audience, but for some reason when it comes to AI, people simply refuse to acknowledge that it was trained on CSAM: https://cyber.fsi.stanford.edu/news/investigation-finds-ai-image-generation-models-trained-child-abuse
And don't understand how generative AI combines existing concepts to synthesize images - it doesn't have the ability to create novel concepts.
it was trained on CSAM
In that case, why haven't the people who made the AI models been arrested?
This is tough. If it was just a sicko who generated the images for himself locally... that is the definition of a victimless crime, no? And it might actually dissuade him from seeking out real CSAM....
BUT, iirc he was actually distributing the material, and even contacted minors, so... yeah he definitely needed to be arrested.
But, I'm still torn on the first scenario...
What is the AI trained on?
Image-generating AI is capable of generating images that are not like anything that was in its training set.
AI can compose novel-looking things from components it has been trained on - it can't imagine new concepts. If CSAM is being generated, it's because CSAM was included in its training set, which is highly suspected, since we know the common corpus had CSAM in it: https://cyber.fsi.stanford.edu/news/investigation-finds-ai-image-generation-models-trained-child-abuse
If it has images of construction equipment and houses, it can make images of houses that look like construction equipment. Swap out vocabulary as needed.
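A very loose way to picture that "composing concepts" idea is blending points in an embedding space. The vectors below are entirely made up for illustration - real text-to-image models condition on learned text embeddings (e.g. from CLIP) and the generation process is far more complex - but the spirit of interpolating between learned representations is similar:

```python
# Toy illustration of "composing concepts" by blending embedding vectors.
# These 3-d vectors are invented for the example; real models use
# high-dimensional learned embeddings, not hand-built ones.
house = [1.0, 0.0, 0.2]
excavator = [0.0, 1.0, 0.8]

def blend(a, b, w=0.5):
    """Linear interpolation between two concept vectors."""
    return [w * x + (1 - w) * y for x, y in zip(a, b)]

# A crude stand-in for prompting "a house that looks like an excavator".
hybrid = blend(house, excavator)
print(hybrid)  # [0.5, 0.5, 0.5]
```

The point of the sketch is just that a "hybrid" lands between two learned concepts - it doesn't require either the hybrid itself, or anything like it, to have been in the training data.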
Cool, how would it know what a naked young person looks like? Naked adults look significantly different.
It understands young and old.
Is a kid just a 60% reduction by volume of an adult? And these are generative algorithms... nobody really understands how they perceive the world and word relations.
It understands young and old. That means it knows a kid is not just a 60% reduction by volume of an adult.
We know it understands these sorts of things because of the very things this whole kerfuffle is about - it's able to generate images of things that weren't explicitly in its training set.
But it doesn't fully understand "young," and "naked young person" isn't just a scaled-down "naked adult." There are physiological changes that people go through during puberty, which is why "It understands young vs. old" is a vapid and low-effort comment. Yours has more meaning behind it, so I'd clarify that just having a vague understanding of young and old doesn't mean it can generate CSAM.
But it doesn't fully understand young and "naked young person" isn't just a scaled down "naked adult".
Do you actually know that, or are you just assuming it?
Personally, I'm basing my assertions off of experience with related situations, where I've asked image AIs to generate images of things that I'm quite sure weren't in its training set and that require conceptual understanding to create "hybrids." It's done a decent job of those so I'm assuming that it can figure out this specific situation as well, since most of these models have a lot of examples of naked people and young people in their training sets. But I haven't actually asked any AIs to generate images of naked young people to test this one specific case.
My opinion here is that "naked young person" isn't as simple as other compound concepts because there are physiological changes we go through during puberty that an AI can't reverse engineer. Something like "Italian samurai" involves concepts that occur at a surface level that it can easily understand while "naked young person" involves some components that can't be derived simply from applying "young" to "naked person" or "naked" to "young person".
Someone did have a valid counter argument in this subthread though: https://sh.itjust.works/comment/11713795
Well, I haven't gone to any of my image AIs and actually asked them to generate naked pictures of young people. So unless you want to go there this will necessarily involve some degree of theoretical elements.
However, according to the article it's possible to generate this stuff with Stable Diffusion models, and Stable Diffusion models have a negligible amount of CSAM in the training set. So short of actually doing the experiment that would seem to settle it.
I think a lot of people don't appreciate just how surprisingly sophisticated the "world model" these image AIs have learned is. There was a paper a while back where some researchers were trying to analyze how image generators work internally, and they discovered that if you ask one to make a picture of, say, a bicycle, it first comes up with a depth map of the image before it produces any visual output. That shows the AI has figured out the three-dimensional form of a bicycle based entirely on a pile of two-dimensional training images, with no other clues telling it that a third dimension even exists.