Stanford researchers find Mastodon has a massive child abuse material problem : technology

[–] while1malloc0@beehaw.org 133 points 1 year ago* (last edited 1 year ago) (2 children)

While the study itself is a good read and I agree with the conclusions (Mastodon and decentralized social media need better moderation tools), it’s hard not to read the Verge headline as misleading. One of the study authors gives more context here https://hachyderm.io/@det/110769470058276368. Basically most of the hits came from a large Japanese instance that no one federates with; the author even calls out that the blunt instrument most Mastodon admins use is to blanket defederate with instances hosted in Japan due to their more lax (than the US) laws around CSAM. But the headline seems to imply that there’s a giant seedy underbelly to places like mastodon.social[1] that is rife with abuse material. I suppose that’s a marketing problem of federated software in general.

There is a seedy underbelly of mainstream Mastodon instances, but it’s mostly people telling you how you’re supposed to use Mastodon if you previously used Twitter.

[–] glorbo@lemmy.one 29 points 1 year ago (1 children)

In my opinion the biggest issue the author points out is that cached materials are sometimes retained even after moderator action. Which honestly just sounds like a straight up bug more than anything. Though if I were running an instance, the feds showing up at my door with a warrant because I've been accidentally distributing CSAM would be my nightmare scenario. And of course jurisdiction plays a part, too: an American user on a Canadian server might see drawn depictions of sexualized minors, think "weird but not illegal," and now the Canadian admin has content that's illegal in Canada on their Canadian server and has no idea.

IMO the best solution to this is something similar to what Renaud Chaput (Mastodon's resident infra boffin) described in his recent blog post. Effectively, give admins a way to hand this off to pluggable third-party services. Admins that are worried about this sort of thing can then have some degree of safety via e.g. PhotoDNA, whereas others can take on additional risk and preserve additional privacy.

All that said: yeah the headline makes it sound like .social is some 8chan-esque hellhole, whereas in reality my feed is 99% German programmers sharing milquetoast political takes.

[–] jherazob@beehaw.org 22 points 1 year ago* (last edited 1 year ago) (1 children)

The person outright rejects defederation as a solution when it IS the solution, if an instance is in favor of this kind of thing you don't want to federate with them, period.

I also find worrying the amount of calls for a "Fediverse police" in that thread, scanning every image that gets uploaded to your instance with a 3rd party tool is an issue too, on one side you definitely don't want this kinda shit to even touch your servers and on the other you don't want anybody dictating that, say, anti-union or similar memes are marked, denounced and the person who made them marked, targeted and receiving a nice Pinkerton visit.

This is a complicated problem.

Edit: I see somebody suggested checking the observations against the common and well used Mastodon blocklists, to see if the shit is contained on defederated instances, and the author said this was something they wanted to check, so i hope there's a followup

[–] OneRedFox@beehaw.org 56 points 1 year ago (2 children)

Yeah I recall that the Japanese instances have a big problem with that shit. As for the rest of us, Facebook actually open sourced some efficient hashing algorithms for use for dealing with CSAM; Fediverse platforms could implement these, which would just leave the issue of getting an image hash database to check against. All the big platforms could probably chip in to get access to one of those private databases and then release a public service for use with the ecosystem.

[–] zephyrvs@lemmy.ml 11 points 1 year ago (2 children)

That'd be useless though, because first, it'd probably be opt-in via configuration settings and even if it wasn't, people would just fork and modify the code base or simply switch to another ActivityPub implementation.

We're not gonna fix society using tech unless we're all hooked up to some all knowing AI under government control.

[–] OneRedFox@beehaw.org 18 points 1 year ago (3 children)

That’d be useless though, because first, it’d probably be opt-in via configuration settings and even if it wasn’t, people would just fork and modify the code base or simply switch to another ActivityPub implementation.

No it wouldn't, because it'd still be significantly easier for instances to deal with CSAM content with this functionality built into the platforms. And I highly doubt there's going to be a mass migration from any Fediverse platform that implements such a feature (though honestly I'd be down to defederate with any instance that takes serious issue with this).

[–] crystal@feddit.de 11 points 1 year ago (8 children)

That's not the point. Yes, child porn sites can host child porn. Other sites/instances can't stop that. But what other instances can stop, is redistributing said child porn. And for that purpose, such technology would be useful.

[–] pineapplelover@lemm.ee 8 points 1 year ago (2 children)

Facebook actually being good for once? Unheard of

[–] Paradoxvoid@aussie.zone 7 points 1 year ago* (last edited 1 year ago)

As much as we can (and should) lambast Facebook/Meta's C-Suite for terrible decisions, their engineers are generally pretty legit.

[–] pglpm@lemmy.ca 45 points 1 year ago* (last edited 1 year ago) (3 children)

I'm not fully sure about the logic and perhaps hinted conclusions here. The internet itself is a network with major CSAM problems (so maybe we shouldn't use it?).

[–] mudeth@lemmy.ca 27 points 1 year ago* (last edited 1 year ago) (2 children)

It doesn't help to bring whataboutism into this discussion. This is a known problem with the open nature of federation. So is bigotry and hate speech. To address these problems, it's important to first acknowledge that they exist.

Also, since fed is still in the early stages, now is the time to experiment with mechanisms to control them. Saying that the problem is innate to networks is only sweeping it under the rug. At some point there will be a watershed event that'll force these conversations anyway.

The challenge is in moderating such content without being ham-fisted. I must admit I have absolutely no idea how, this is just my read of the situation.

[–] shiri@foggyminds.com 22 points 1 year ago

@mudeth @pglpm you really don't beyond our current tools and reporting to authorities.

This is not a single monolithic platform, it's like attributing the bad behavior of some websites to HTTP.

Our existing moderation tools are already remarkably robust and defederating is absolutely how this is approached. If a server shares content that's illegal in your country (or otherwise just objectionable) and they have no interest in self-moderating, you stop federating with them.

Moderation is not about stamping out the existence of these things, it's about protecting your users from them.

If they're not willing to take action against this material on their servers, then the only thing further that can be done is reporting it to the authorities or the court of public opinion.
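To make the defederation idea concrete, here is a minimal sketch of how a server might drop incoming activities from blocked instances. The names (`BLOCKED_DOMAINS`, `is_defederated`, `accept_activity`) and the example domains are made up for illustration; this is not how Mastodon or Lemmy actually implement it, just the general shape of a domain-level block.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; a real instance would load its own list from config.
BLOCKED_DOMAINS = {"bad.example", "worse.example"}


def domain_of(actor_url: str) -> str:
    """Return the lowercased host part of an actor URL ('' if there is none)."""
    return (urlparse(actor_url).hostname or "").lower()


def is_defederated(actor_url: str, blocked: set[str] = BLOCKED_DOMAINS) -> bool:
    """True if the actor's host is a blocked domain or a subdomain of one."""
    host = domain_of(actor_url)
    return any(host == d or host.endswith("." + d) for d in blocked)


def accept_activity(activity: dict) -> bool:
    """Reject incoming activities whose actor lives on a blocked instance."""
    return not is_defederated(activity.get("actor", ""))
```

Note that this only protects the local users of the instance running it, which matches the point above: it does nothing about the material still existing on the blocked server.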

[–] pglpm@lemmy.ca 14 points 1 year ago* (last edited 1 year ago) (1 children)

Maybe my comment wasn't clear or you misread it. It wasn't meant to be sarcastic. Obviously there's a problem and we want (not just need) to do something about it. But it's also important to be careful about how the problem is presented - and manipulated - and about how fingers are pointed. One can't point a finger at "Mastodon" the same way one could point it at "Twitter". Doing so has some similarities to pointing a finger at the http protocol.

Edit: see for instance the comment by @while1malloc0@beehaw.org to this post.

[–] mudeth@lemmy.ca 7 points 1 year ago (5 children)

Understood, thanks. Yes I did misread it as sarcasm. Thanks for clearing that up :)

However I disagree with @shiri@foggyminds.com in that Lemmy, and the Fediverse, are interfaced with as monolithic entities. Not just by people from the outside, but even by its own users. There are people here saying how they love the community on Lemmy for example. It's just the way people group things, and no amount of technical explanation will prevent this semantic grouping.

For example, the person who was arrested for CSAM recently was running a Tor exit node, but that didn't help his case. As shiri pointed out, defederation works for black-and-white cases. But what about in cases like disagreement, where things are a bit more gray? Like hard political viewpoints? We've already seen the open internet devolve into bubbles with no productive discourse. Federation has a unique opportunity to solve that problem starting from scratch, and learning from previous mistakes. Defed is not the solution, it isn't granular enough for one.

Another problem with defederation is that it is after-the-fact and depends on moderators and admins. There will inevitably be a backlog (pointed out in the article). With enough community reports, could there be a holding-cell style mechanism in federated networks? I think there is space to explore this deeper, and the study does the useful job of pointing out liabilities in the current state-of-the-art.
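One possible reading of the "holding cell" idea, as a rough sketch: once enough distinct users report a post, it is hidden until a moderator reviews it. The threshold value and all names here (`Post`, `report`, `release`, `REPORT_THRESHOLD`) are assumptions for illustration, not an existing Fediverse feature.

```python
from dataclasses import dataclass, field

REPORT_THRESHOLD = 3  # hypothetical value; an instance would tune this


@dataclass
class Post:
    post_id: str
    reporters: set[str] = field(default_factory=set)
    quarantined: bool = False


def report(post: Post, reporter: str, threshold: int = REPORT_THRESHOLD) -> bool:
    """Record a report; quarantine the post once enough distinct users have reported it."""
    post.reporters.add(reporter)  # a set, so repeat reports from one user count once
    if len(post.reporters) >= threshold:
        post.quarantined = True
    return post.quarantined


def release(post: Post) -> None:
    """Moderator decision: clear the reports and make the post visible again."""
    post.reporters.clear()
    post.quarantined = False
```

A scheme like this is open to brigading (a handful of accounts can hide anything), so it would probably need some notion of reporter trust on top.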

[–] Penguinblue@kbin.social 12 points 1 year ago (1 children)

This is exactly what I thought. The story here is that the human race has a massive child abuse material problem.

[–] Mandy@beehaw.org 40 points 1 year ago (1 children)

Pedos who got banned from other platforms turn to a platform that hasn't banned them yet

In other news: the sky is blue

[–] lohrun@fediverse.boo 40 points 1 year ago (1 children)

One of the problems with the fediverse is that each server keeps its own copy of the content. It is definitely a worry that bad actors push content to federated servers to get them taken down due to the content they now are storing.

[–] JCPhoenix@beehaw.org 13 points 1 year ago (4 children)

What's the reason for that? Caching purposes?

[–] deksesuma@beehaw.org 16 points 1 year ago

General idea is that if there is only one copy, taking something down is knocking that server out of service.

[–] TheSaneWriter@lemmy.thesanewriter.com 7 points 1 year ago

I think so. My Lemmy instance for example is currently storing several gigabytes of images in my cloud buckets, but with my 4 users I'm reasonably confident it didn't all come from us.

[–] owls@community.yshi.org 6 points 1 year ago

If I'm running a tiny little single-user instance on a potato and my post goes to the mastodon.social federated feed, it would be impolite for them to direct 20,000 requests at my potato all at once. Instead, their server grabs one copy and serves it to their users. If they're set up for 20k eyeballs online at once, they've got capacity to serve them all the photo.

Mastodon has a configurable clean-up period for cached media so you don't use infinite disk. That gives a bad actor an easy way to robustly host images for a couple days: post it, let it federate out, and then take your server down. Everyone else is now doing crimes for you, and cleaning it up is a reactive process by dozens of server admins.
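The fetch-once-then-expire behaviour described above can be sketched roughly like this. It is a toy in-memory model, not Mastodon's actual code; `RemoteMediaCache`, `purge_expired` and `purge_url` are hypothetical names, and the `fetch` callable stands in for whatever downloads the remote file.

```python
import time
from typing import Callable, Optional


class RemoteMediaCache:
    """Toy cache: fetch each remote file once, drop it after a retention period."""

    def __init__(self, fetch: Callable[[str], bytes], retention_seconds: float):
        self._fetch = fetch
        self._retention = retention_seconds
        self._entries: dict[str, tuple[float, bytes]] = {}  # url -> (cached_at, data)

    def get(self, url: str, now: Optional[float] = None) -> bytes:
        """Serve from the local copy; only hit the origin server on a miss."""
        now = time.time() if now is None else now
        if url not in self._entries:
            self._entries[url] = (now, self._fetch(url))
        return self._entries[url][1]

    def purge_expired(self, now: Optional[float] = None) -> int:
        """Periodic clean-up: remove entries older than the retention period."""
        now = time.time() if now is None else now
        expired = [u for u, (t, _) in self._entries.items() if now - t >= self._retention]
        for u in expired:
            del self._entries[u]
        return len(expired)

    def purge_url(self, url: str) -> bool:
        """Immediate removal, e.g. after a moderator takes action on the post."""
        return self._entries.pop(url, None) is not None
```

The gap between `get` and the next `purge_expired` is the window the bad actor exploits; something like `purge_url` wired into moderation actions is what would close it early.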

[–] jordanlund@lemmy.one 38 points 1 year ago (2 children)

"massive child abuse material problem"

"112 instances of known CSAM across 325,000 posts"

While any instance is unacceptable, does 112/325,000 constitute a "massive problem"?

0.0000034462% of posts are unacceptable! Massive problem!

[–] crystal@feddit.de 25 points 1 year ago

You moved the period in the wrong direction. It's 0.034462%.
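For anyone who wants to check the arithmetic, using only the two figures quoted above (112 known hits, 325,000 posts):

```python
known_csam_hits = 112
posts_analyzed = 325_000

fraction = known_csam_hits / posts_analyzed  # share of posts, as a plain ratio
percent = fraction * 100                     # convert the ratio to a percentage

print(f"{percent:.6f}%")  # prints 0.034462%
```

The earlier figure looks like the ratio divided by 100 instead of multiplied by 100.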

[–] ParsnipWitch@feddit.de 8 points 1 year ago

That's just the material they knew was CSAM from previous investigations.

There were also 713 uses of the top 20 CSAM-related hashtags across the Fediverse on posts that contained media, as well as 1,217 text-only posts that pointed to “off-site CSAM trading or grooming of minors.” The study notes that the open posting of CSAM is “disturbingly prevalent.”

[–] sphere_au@reddthat.com 36 points 1 year ago

So instances that are actually supporting CSAM material can and should be dealt with by law enforcement. That much is simple (and I'm surprised it hasn't been done with certain ... instances, to be honest). But I think the apparently less clearly solved issues have known and working solutions that apply to other parts of the web as well. No content moderation is perfect, but in general, if admins are acting in good faith, I don't think there should be too much of a problem:

For when federation inadvertently spreads some of the material through to other instances' databases: Isn't this the same situation as when ISPs used to cache web traffic to save on bandwidth costs? In that situation, too, browsed web pages would end up in the ISP's cache which could then harbour whatever material the user was looking at. As I recall, the ISP would just ban CSAM and other illegal material in their terms of service, and remove anyone reported as violating the rule, and that sufficed.

As for "bad" instances/users: It's impossible to block all instances and all users that might disseminate this material as you'd have to go to a "block everything, then allow known entities" rule which would break the Fediverse model. Again, users or site admins found to be acting in bad faith should be blocked and reported (either automatically or manually). Some may slip through the net, but as long as admins are seen to be doing the best they can, that should be enough.

There seem to be concerns about "surveillance" of material on Mastodon, which strikes me as a bit odd. Mastodon isn't a private platform. People who want private messaging should use an E2EE messaging app like Signal, not a social networking platform like Mastodon (or Twitter, Threads etc.). Mastodon data is already public and is likely already being surveilled, and will be so regardless of what anyone involved with the network wants, because there's no access control on it anyway. Having Mastodon itself contain code to keep the network clean, even if it only applies to part of the network, just allows those Mastodon admins who are running that part of the code to take some of the responsibility on themselves for doing so, reducing the temptation for third parties to do it for them.

[–] Cylinsier@beehaw.org 33 points 1 year ago (3 children)

The researchers suggest that decentralized networks like Mastodon need to implement more robust moderation tools and reporting mechanisms to address the prevalence of CSAM.

I agree, but who's going to pay for it? Those aren't just freely available additions to any application that you only need to toggle on.

[–] pineapplelover@infosec.pub 12 points 1 year ago (2 children)

One way to do this is to block hashes. This is a slippery slope though because it could be used maliciously. Only way to do this and protect freedom of information is to make this fully open source.

[–] scrubbles@poptalk.scrubbles.tech 8 points 1 year ago (7 children)

Block hash lists then? Something like a community driven hashlist for CSAM would work: if the majority of federated instances report it as that type then it would get added to the list. Instances could then choose what lists they wanted to block.

...instances could also show what lists they subscribe to so their users could see what sort of moderation they choose
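A rough sketch of the majority-report idea, with instances then subscribing to whichever lists they want. Everything here (`build_shared_blocklist`, `is_blocked`, the `quorum` parameter, the example domains and hash strings) is hypothetical; no such shared service exists in the thread's description, and how the hashes are computed is left out entirely.

```python
from collections import defaultdict


def build_shared_blocklist(reports: dict[str, set[str]], quorum: float = 0.5) -> set[str]:
    """Return hashes flagged by more than `quorum` of the reporting instances.

    `reports` maps an instance domain to the set of hashes that instance flagged.
    """
    counts: dict[str, int] = defaultdict(int)
    for hashes in reports.values():
        for h in hashes:
            counts[h] += 1
    needed = len(reports) * quorum
    return {h for h, n in counts.items() if n > needed}


def is_blocked(media_hash: str, subscribed_lists: list[set[str]]) -> bool:
    """An instance blocks a hash if any list it subscribes to contains it."""
    return any(media_hash in lst for lst in subscribed_lists)


# Example: only "h1" is flagged by a majority of the three instances.
reports = {
    "a.example": {"h1", "h2"},
    "b.example": {"h1"},
    "c.example": {"h1", "h3"},
}
shared = build_shared_blocklist(reports)
```

One instance, one vote is easy to game by spinning up throwaway instances, so a real version would need some way of weighting or vetting reporters.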

[–] glorbo@lemmy.one 7 points 1 year ago

So the standard approach to this is so-called "perceptual hashing." Effectively, using cryptographic hashes (sha256, etc.) doesn't really work well in this case. Given a piece of illegal content, that content is likely to still be just as illegal with a single pixel changed -- however, it'll have a completely different cryptographic hash. So instead, you use a hash function that captures how "similar-looking" two images are, ignoring things like dimensions, color palette, JPEG compression artifacts, etc. This is obviously way fuzzier, and is prone to both false positives and negatives.
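To illustrate the difference, here is a toy comparison using a simple "average hash" on a tiny grayscale grid. This is a deliberately simplified stand-in, not PhotoDNA or any production algorithm: real perceptual hashes resize and normalize the image first, and the `Image` type here (a list of pixel rows) is just an assumption to keep the example dependency-free.

```python
import hashlib

# Grayscale pixels 0-255; a hypothetical stand-in for a decoded image.
Image = list[list[int]]


def sha256_hex(img: Image) -> str:
    """Cryptographic hash of the raw pixel bytes."""
    return hashlib.sha256(bytes(p for row in img for p in row)).hexdigest()


def average_hash(img: Image) -> int:
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits; small distance means similar-looking images."""
    return bin(a ^ b).count("1")


original = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]
tweaked = [row[:] for row in original]
tweaked[0][0] = 12  # nudge a single pixel

print(sha256_hex(original) == sha256_hex(tweaked))              # False
print(hamming(average_hash(original), average_hash(tweaked)))   # 0
```

Matching is then done with a distance threshold rather than equality, which is where the false positives and negatives come from.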

Because all this is inherently kinda fuzzy, the exact database of hashes is usually "secret sauce" if you will. If it were public, it would be super easy to circumvent. As an example, given an illegal image:

1. Is the image's hash in the DB?
2. No? All done, you can post it with impunity.
3. Yes? Change one random pixel, GOTO 1.

As a result even "public" databases are distributed with NDAs etc. This obviously does not jibe well with an open source, federated network like Mastodon, and I have my doubts as to how willing the relevant agencies would be to give their databases to every rando with $5 to spin up a Pleroma instance on a VPS. A public DB might help in some cases, but unfortunately more illegal content is produced every day, and so it would be extremely hard to keep up with the bad actors.

[–] abhibeckert@beehaw.org 11 points 1 year ago* (last edited 1 year ago) (1 children)

I agree, but who’s going to pay for it?

How about police/the tax payer?

If university researchers can find the stuff, then police can find it too. There should be an established way to flag the user (or even the entire instance) so that content can be removed from the fediverse while simultaneously asking for all data that is available to try to catch the criminals.

And of course, if regular users come across anything illegal they will report it too, and it should be removed quickly (I'd hope immediately in many cases, especially if the post was by a brand new/untrusted account).

[–] swnt@feddit.de 8 points 1 year ago (4 children)

A decentralised platform like the Fediverse won't easily work with nation states and their taxes. Even Wikipedia today isn't funded directly by any government, but rather by certain universities giving some money to it plus all the private donors.

And even if we get that working, power politics will mess this up like so often when things actually get troublesome.

It might be interesting to explore cryptocurrencies for donations here though. They do have international liquidity and they can't be misused for power politics.

[–] zephyrvs@lemmy.ml 8 points 1 year ago* (last edited 1 year ago)

The researchers can't be taken seriously if they don't acknowledge that you can't force free software to do something you don't want it to.

Even if we started way down at the stack and we added a CSAM hash scanner to the Linux kernel, people would just fork the kernel and use their own build without it.

Same goes for nginx or any other web server or web proxy. Same goes for Tor. Same goes for Mastodon or any other Fedi/ActivityPub implementation.

It. Does. Not*. Work.

* Please, prove me wrong, I'm not all knowing, but short of total surveillance, I see no technical solution to this.

[–] zygo_histo_morpheus@programming.dev 29 points 1 year ago* (last edited 1 year ago) (2 children)

Is there any way mastodon stands out from other self hosted websites? Would the CSAM material be harder to distribute or easier to prosecute if they ran, say, a self-hosted bulletin board for it instead?

[–] peter@feddit.uk 7 points 1 year ago

Probably just the ease with which you can find it: since each instance is linked, it basically becomes a search engine that might not have the same controls/protection as Google etc.

[–] teawrecks@sopuli.xyz 25 points 1 year ago

I for one am all for instances being forcibly taken down by police if they can't moderate CSAM appropriately.

Moderation is a very real challenge. The internet at large aimed to solve it by centralizing everything to a few mega corps with AI moderation. The fediverse aims to solve it by keeping instances small and holding both mods and users accountable.

[–] Bendavisunlv6@lemmynsfw.com 21 points 1 year ago

This is one of the things I don’t like about the whole Twitter format. There’s no moderator layer. Every lemmy community must be created by a moderator and that mod can be held accountable.

There isn’t even a concept of communities on Twitter / Mastodon. Hashtags? Nobody owns monitoring them, and they can be freely improvised at will. It really is just the instance and its zillion users with nothing in between. Imagine a lemmy instance admin being responsible for all the moderation… would never work.

[–] deCorp0@lemmy.dbzer0.com 20 points 1 year ago

Hi, since Mastodon is no longer acceptable due to the 0.04 percent of instances found to have abusive material, would someone please suggest the alternative social network with 0 percent of these incidents? Companies like Facebook and Twitter are driven by shareholders and greed, Mastodon is a community effort and you’ll certainly find bad actors there, but I feel less dirty contributing to a community project, versus helping billionaires like Zuck and Elon line their pockets harvesting my data.

[–] FlashMobOfOne@beehaw.org 18 points 1 year ago* (last edited 1 year ago)

Mastodon.art doesn't.

And the beauty of Mastodon is you can block an entire instance, as can your admin, when something awful is posted. Mastodon even has a hashtag they use as an alert for this kind of thing. (#Fediblock)

[+] zephyrvs@lemmy.ml 14 points 1 year ago* (last edited 1 year ago) (2 children)

[removed by mod]

[–] aes@beehaw.org 41 points 1 year ago (2 children)

This is a whataboutist counterpoint at best. Universities and their researchers are not a monolith.

[–] rikudou@lemmings.world 12 points 1 year ago

What a coincidence, Mastodon is not a monolith either.

[–] Applejuicy@feddit.nl 12 points 1 year ago (1 children)

OP unironically linked an article referring to a 50k donation in 2004 to a physics department to show that the Stanford Cyber Policy center somehow is not out for wellbeing of kids? Imagine being this delusional.

[–] sanzky@beehaw.org 7 points 1 year ago

This is just bad press. The actual study is quite good and offers good recommendations on how to improve moderation on the fediverse

[–] eskimofry@lemmy.one 14 points 1 year ago (2 children)

I don't trust Stanford to not work on behalf of the CIA or other 3 alphabet orgs. They kind of turn a blind eye to CSA in churches but a federated media? This sounds like a smear job.

[–] alyaza@beehaw.org 13 points 1 year ago

not surprised at all. this is a growing pain here too because this was previously a thing handled invisibly by platforms and federation makes it fall to individual sysadmins and whoever they have on staff. the tools for this stuff are, in general, not here yet--and as people have noted there are potential conflicts with some of the principles of federation introduced by those tools that can't be totally handwaved.

[–] IronKrill@lemmy.ca 11 points 1 year ago

I browsed through an anime instance while trying to convince myself to like Mastodon and unfortunately I believe I've found some of this myself. I wasn't going to confirm it was real, I just reported and closed out but considering I've never seen such content on other websites and this instance was rife with it, I don't find this article hard to believe at all.

[–] sub_@beehaw.org 9 points 1 year ago

I think some of the problematic instances have been defederated; IIRC there's a large Japanese instance that was defederated a long time ago due to child abuse content. But still, since I've been seeing increases in hate speech and dog-whistling misogyny and homophobia in some instances, I won't be surprised if CSAM has been traded under our noses.

The main issue is that, with so many users nowadays and small moderation teams, especially in the larger instances, it's hard to moderate and tackle CSAM problems effectively. I really wish larger instances would limit user registrations or start splitting off into smaller manageable ones.

Also, since they are trading using certain hashtags, blocking those hashtags might not be a bad idea.
