this post was submitted on 24 Jun 2023

165 points (95.6% liked)

Proof that bots are manipulating content (lemmyonline.com)

submitted 1 year ago* (last edited 1 year ago) by xtremeownage@lemmyonline.com to c/lemmy@lemmy.ml

See THIS POST

Notice- the 2,000 upvotes?

https://gist.github.com/XtremeOwnageDotCom/19422927a5225228c53517652847a76b

It's mostly bot traffic.

Important Note

The OP of that post did admit to purposely using bots for that demonstration.

I am not making this post specifically about that post. Rather- we need to collectively organize, and find a method to address it.

Defederation is a nuke from orbit approach, which WILL cause more harm than good, over the long run.

Having admins proactively monitor their content and communities helps- as does enabling new user approvals, captchas, email verification, etc. But, this does not solve the problem.

The REAL problem

But, the real problem- The fediverse is so open, there is NOTHING stopping dedicated bot owners and spammers from...

Creating new instances for hosting bots, and then federating with other servers. (Everything can be fully automated to completely spin up a new instance, in UNDER 15 seconds)
Hiring kids in Africa and India to create accounts for 2 cents an hour. NEWS POST 1 POST TWO
Lemmy is EXTREMELY trusting. For example, go look at the stats for my instance online.... (lemmyonline.com) I can assure you, I don't have 30k users and 1.2 million comments.
There are no built-in "real-time" methods for admins to identify suspicious activity from their users via the UI. I am only able to fetch this data directly from the database. I don't think it is even exposed through the REST API.

What can happen if we don't identify a solution.

We know Meta wants to infiltrate the fediverse. We know Reddit wants the fediverse to fail.

If a single user with limited technical resources can manipulate that content, as was proven above-

What is going to happen when big-corpo wants to swing their fist around?

Edits

Removed most of the images containing instances. Some of those issues have already been taken care of. As well, I don't want to distract from the ACTUAL problem.
Cleaned up post.

[–] xtremeownage@lemmyonline.com 34 points 1 year ago* (last edited 1 year ago) (20 children)

@ruud@lemmy.world (lemmy.world)
@nutomic@lemmy.ml (lemmy.ml)
@TheDude@sh.itjust.works (sh.itjust.works)
@db0@lemmy.dbzer0.com (dbzer0)

What corrective courses of action shall we seek?

I sent messages to:

https://startrek.website/u/ValueSubtracted (startrek.website)
https://oceanbreeze.earth/u/windocean (oceanbreeze.earth)
https://normalcity.life/u/EuphoricPenguin22 (normalcity.life)

I blocked / defederated these instances:

https://lemmy.dekay.se/ (appears to just be a spambot server)

[–] AlmightySnoo@lemmy.world 15 points 1 year ago* (last edited 1 year ago) (1 children)

Just wanted to point out that according to your stats, unless I don't understand them well, only 26 bots come from lemmy.world (which has open sign-ups, and uses the "easy to break" (/s) captcha) and 16 from lemmy.ml (which doesn't have open sign-ups and relies on manual approvals).

For some perspective, lemmy.world has almost 48k users right now. Speaking of "corrective action" is a bit of a stretch IMO.

[–] xtremeownage@lemmyonline.com 9 points 1 year ago* (last edited 1 year ago) (1 children)

This post isn't about lemmy.world, nor am I blaming lemmy.world!

I am trying to drag in the admins of the big instances, to come up with a collective plan to address this issue.

There isn't a single instance causing this problem. The bots are distributed amongst normal users, in normal instances.

With the exception of an instance or two with nothing but bot traffic.

[–] AlmightySnoo@lemmy.world 6 points 1 year ago* (last edited 1 year ago) (1 children)

I'm just saying that context and scale matter. If an anti-spam solution is 99% effective, then chances are that on an instance with 100k users you are still going to have around 1k bots that have bypassed it.
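
As a rough illustration of that arithmetic, here is a minimal sketch. It assumes the 99% figure applies to bot sign-up attempts and that 100,000 such attempts are made; both numbers and the function name are made up for the example.

```python
def bots_getting_through(bot_attempts: int, effectiveness: float) -> int:
    """Expected number of bot sign-ups that slip past a filter.

    effectiveness is the fraction of bot attempts the filter blocks (0.0 to 1.0).
    """
    if not 0.0 <= effectiveness <= 1.0:
        raise ValueError("effectiveness must be between 0.0 and 1.0")
    return round(bot_attempts * (1.0 - effectiveness))


# Hypothetical numbers: 100,000 bot sign-up attempts against a 99% effective filter.
print(bots_getting_through(100_000, 0.99))  # prints 1000
```

The count that gets through scales with the number of attempts, so a bigger target ends up with more bots even if the filter itself is no worse.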

[–] xtremeownage@lemmyonline.com 7 points 1 year ago (3 children)

You're right- But, the problem is-

At a fediverse-level, we don't really have ANY spam prevention currently.

Let's assume, at an instance level, all admins do their part, enable applicant approvals, enable captchas, email verification, and EVERY TOOL they have at their disposal.

There is NOTHING stopping these bots from just creating new instances, and using those.

Keep focused on the problem- the problem, is platform-wide lack of the ability to prevent bots.

I don't agree with the beehaw approach of bulk defederation; as such, a better solution is needed.

[–] Kichae@kbin.social 8 points 1 year ago* (last edited 1 year ago) (2 children)

The beehaw approach wasn't "bulk defederation". They blocked two Lemmy instances they were having trouble with. The bulk of their block list are Mastodon and Pleroma instances well known for trolling other sites and stirring up shit.

Edit: Autocomplete refuses to accept that I talk a lot about federation and defederating, and is desperately trying to convince me I'm talking about anything else that starts with "de".

[–] fubo@lemmy.world 6 points 1 year ago

Some older federated services, like IRC, had to drop open federation early in their history to prevent abusive instances from cropping up constantly, and instead became multiple different federations with different policies.

That's one way this service might develop. Not necessarily, but it's gotta be on the table.

[–] o_o@programming.dev 4 points 1 year ago (1 children)

There is NOTHING stopping these bots from just creating new instances, and using those.

I read somewhere that Mastodon prevents this by requiring a real domain to federate with. This would make it costly for bots to spin up their own instances in bulk. This solution could be expanded to require domains of a certain “status” to allow federation. For example, newly created domains might be blacklisted by default.
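
A minimal sketch of the "newly created domains are blocked by default" idea, assuming the domain's registration date has already been looked up somehow (for example via WHOIS; the lookup is not shown). The 30-day threshold and all names here are hypothetical, not something Lemmy or Mastodon is known to implement.

```python
from datetime import date, timedelta

# Hypothetical policy: a domain must be at least this old before we federate with it.
MIN_DOMAIN_AGE = timedelta(days=30)


def may_federate(registered_on: date, today: date,
                 min_age: timedelta = MIN_DOMAIN_AGE) -> bool:
    """Return True if the domain has existed long enough to be trusted by default."""
    return today - registered_on >= min_age


today = date(2023, 6, 24)
print(may_federate(date(2023, 6, 20), today))  # 4 days old, prints False
print(may_federate(date(2019, 1, 1), today))   # years old, prints True
```

A check like this only raises the cost of throwaway instances; a patient spammer can still register domains ahead of time and wait out the threshold.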

[–] TheDude@sh.itjust.works 6 points 1 year ago

Thanks will keep an eye on this thread.

[–] Mutelogic@sh.itjust.works 4 points 1 year ago (2 children)

It looks like the OP is responsible for the upvote bots (inferred from his edit?). Maybe to prove the original point?

[–] xtremeownage@lemmyonline.com 5 points 1 year ago

That is correct- Please see my revised post. I removed lots of the data and parts, to help point out the bigger problem we need to solve.

[–] tugg@lemmyverse.org 30 points 1 year ago* (last edited 1 year ago) (3 children)

I don't have much to add other than I am an experienced admin and was dismayed at how vulnerable Lemmy is. Having an option to have open registrations with no checks is not great. No serious platform would allow that.

I don't know of a bulletproof way to weed out the bad actors, but a voting system that Lemmy can leverage, with a minimum reputation in order to stay federated might work. This would require some changes that I'm not sure the devs can or would make. Without any protection in place, people will get frustrated and abandon Lemmy. I would.
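
One way the "minimum reputation to stay federated" idea could be tallied is sketched below. Everything in it is an assumption made for illustration: the vote format, the thresholds, and the rule that an instance cannot vote on itself. Nothing like this exists in Lemmy as far as this thread shows.

```python
from collections import defaultdict

# Hypothetical policy values.
MIN_REPUTATION = 0.5   # minimum share of approving votes
MIN_VOTES = 3          # don't judge an instance on fewer votes than this


def instances_to_drop(votes, min_reputation=MIN_REPUTATION, min_votes=MIN_VOTES):
    """Return instances whose approval share is below the threshold.

    votes is an iterable of (voter_instance, target_instance, approves) tuples.
    Each voter counts once per target (the last vote wins), and self-votes are ignored.
    """
    latest = {}
    for voter, target, approves in votes:
        if voter != target:
            latest[(voter, target)] = approves

    tallies = defaultdict(lambda: [0, 0])  # target -> [approvals, total votes]
    for (_voter, target), approves in latest.items():
        tallies[target][1] += 1
        if approves:
            tallies[target][0] += 1

    return sorted(
        target
        for target, (approvals, total) in tallies.items()
        if total >= min_votes and approvals / total < min_reputation
    )


sample_votes = [
    ("a.example", "spam.example", False),
    ("b.example", "spam.example", False),
    ("c.example", "spam.example", True),
    ("spam.example", "spam.example", True),   # self-vote, ignored
    ("a.example", "good.example", True),
    ("b.example", "good.example", True),
    ("c.example", "good.example", True),
]
print(instances_to_drop(sample_votes))  # prints ['spam.example']
```

The tally is the easy part. Deciding which instances are allowed to vote is the hard part, since a spammer who can spin up instances can also spin up voters.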

[–] Martineski@lemmy.fmhy.ml 5 points 1 year ago* (last edited 1 year ago) (1 children)

When I made a post saying that 90% (now ~95%) of accounts on lemmy are bots the amount of people saying that there's no proof and/or saying to me that there's a lot of people joining from reddit right now was astonishing.

Edit: one person told me that no one would make 1.6 million bots when there are only 150k-200k users on the platform, like WTF.

[–] flambonkscious@sh.itjust.works 7 points 1 year ago

Another thing is people are likely pre-creating bot accounts and then sitting on them in case additional protections are created...

The problem is, these accounts look to us just like any new user, lurking around getting a feel for the place - there's no way to distinguish them until these bots start acting in some fashion

[–] Rottcodd@kbin.social 28 points 1 year ago (2 children)

The place feels different today than it did just a couple of days ago, and it positively reeks of bots.

I'm seeing far fewer original posts and far more links to karma-farmer quality pabulum, all of which pretty much instantly somehow get hundreds of upvotes.

The bots are here. And they're circlejerking.

[–] xtremeownage@lemmyonline.com 12 points 1 year ago* (last edited 1 year ago) (3 children)

Yup. And, I would bet money, it will get progressively worse, unless steps are taken to prevent it.

[–] towerful@beehaw.org 6 points 1 year ago (1 children)

There's some that aren't just money.
There are bots that mirror content from Reddit, just linking to them.
I've seen posts that are 3 or 4 crossposts (between community/instances) deep.

I want content.
I don't want bot content

[–] xtremeownage@lemmyonline.com 4 points 1 year ago

Give it a week or two, and you will start to see the emergence of tools to assist with combating these issues.

I am working on trying to build a GUI for one project to help combat spam.

There is also lemmy_helper. And it's only a matter of time before we gain access to much more powerful tools to help.

[–] csm10495@sh.itjust.works 3 points 1 year ago

Honestly to me it's the same. If anything it seems like less content but idk.

[–] ikiru@lemmy.ml 28 points 1 year ago

Why can't we just have nice things?

[–] o_o@programming.dev 21 points 1 year ago* (last edited 1 year ago) (1 children)

Honestly, I’m interested to see how the federation handles this problem. Thank you for all the attention you’re bringing to it.

My fear is that we might overcorrect by becoming too defederation-happy, which is a fear it seems that you share. However I disagree with your assertion that the federation model is more risky than conventional Reddit-like models. Instance owners have just as many tools (more, in fact) as Reddit does to combat bots on their instance. Plus we have the nuke-from-orbit defederation option.

Since it seems like most of these bots are coming from established instances (rather than spoofing their own), I agree with you that the right approach seems to be for instance mods to maintain stricter signups (captcha, email verification, application, or other original methods). My hope is that federation will naturally lead to a “survival of the fittest” where more bot-ridden instances will copy the methods of the less bot-ridden instances.

I think an instance should only consider defederation if it’s already being plagued by bot interference from a particular instance. I don’t think defederation should be a pre-emptive action.

[–] RoundSparrow@lemmy.ml 14 points 1 year ago (1 children)

There are no built-in “real-time” methods for admins to identify suspicious activity from their users via the UI. I am only able to fetch this data directly from the database. I don’t think it is even exposed through the REST API.

The people doing the development seem to have zero concern that all the major servers are crashing with nginx 500 errors on their front page under routine moderate loads, nothing close to those of a major website. There is no concern for alerting operators to internal federation failures, etc.

I am only able to fetch this data directly from the database.

I too had to resort to this, and published an open source tool - primitive and non-elegant, to try and get something out there for server operators: !lemmy_helper@lemmy.ml

[–] xtremeownage@lemmyonline.com 3 points 1 year ago (1 children)

Thanks, I'll take a look at that one.

[–] RoundSparrow@lemmy.ml 6 points 1 year ago (1 children)

If you have SQL statements to share, please do. I'll toss them into the app.

[–] xtremeownage@lemmyonline.com 3 points 1 year ago (1 children)

I believe you already saw my post yesterday, for auditing comments, voting history, and post history, right?

[–] RoundSparrow@lemmy.ml 6 points 1 year ago

Yes, thank you. And if you come up with any that cross-reference comments and postings by remote instance server better than the ones in lemmy_helper, please share. I'd really like to see if we can get "most recent hour, most recent day" queries so we can at least see which servers federated data is flowing from.
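
A rough sketch of what such a "most recent hour, most recent day" query would compute, written in Python over rows that have already been fetched, since the actual Lemmy table layout isn't shown in this thread. The row shape (author profile URL plus a publish timestamp) and every name below are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit


def activity_by_instance(rows, now, window):
    """Count items published within `window` before `now`, grouped by the author's instance.

    rows is an iterable of (actor_id, published) pairs, where actor_id is the
    author's profile URL and published is a timezone-aware datetime.
    """
    cutoff = now - window
    counts = Counter()
    for actor_id, published in rows:
        if cutoff <= published <= now:
            counts[urlsplit(actor_id).hostname] += 1
    return counts


# Made-up sample data.
now = datetime(2023, 6, 24, 12, 0, tzinfo=timezone.utc)
sample_rows = [
    ("https://lemmy.world/u/alice", now - timedelta(minutes=10)),
    ("https://lemmy.world/u/bob", now - timedelta(hours=5)),
    ("https://kbin.social/u/carol", now - timedelta(minutes=30)),
    ("https://lemmy.ml/u/dave", now - timedelta(days=2)),
]
last_hour = activity_by_instance(sample_rows, now, timedelta(hours=1))
last_day = activity_by_instance(sample_rows, now, timedelta(days=1))
print(dict(last_hour))
print(dict(last_day))
```

An instance that shows up in the daily count but not the hourly one may simply be quiet, or federation from it may have stalled; the counts alone can't tell those apart.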

[–] db0@lemmy.dbzer0.com 10 points 1 year ago (4 children)

I noticed a lot of instances which were flooded with bots due to the open registration. I have most of them defederated for this reason.

[–] xtremeownage@lemmyonline.com 9 points 1 year ago* (last edited 1 year ago) (18 children)

We need a better solution for this, rather than mass-bulk defederation.

In my opinion- that is going to greatly slow down the spread and influence of this platform. Also IMO- I think these bots are purposely TRYING to get instances to defederate from each other.

Meta is pushing its "fediverse" thing. Reddit is trying to squash the fediverse. Honestly, it makes perfect sense that we have bots trying to upvote the idea of getting instances to defederate from each other.

Once- everything is defederated- lots of communities will start to fall apart.

[–] db0@lemmy.dbzer0.com 8 points 1 year ago (11 children)

I agree. This is why I started the Fediseer which makes it easy for any instance to be marked as safe through human review. If people cooperate on this, we can add all good instances, no matter how small, while spammers won't be able to easily spin up new instances and just spam.

[–] lukas@lemmy.haigner.me 10 points 1 year ago* (last edited 1 year ago) (1 children)

Hiring kids in Africa and India to create accounts for 2 cents an hour.

Heads up that this depends on the operation size. Captchas are a solved problem. Commercial software exists that can solve Captchas automatically. You migrate from pay on demand services to computer vision software when it's financially beneficial.

Computers are cheaper and better at solving Captchas than humans atm, and it doesn't look like that's going to change any time soon. As long as you pay attention to your proxies, it's rare to see solution attempts fail. Some pay on demand services no longer employ people.

[–] can@sh.itjust.works 7 points 1 year ago (1 children)

Computers are cheaper and better at solving Captchas than humans atm

This is hilarious

[–] dedale@kbin.social 7 points 1 year ago (2 children)

Hello. The post you mentioned was made as a warning, to prove a point: that the fediverse is currently extremely vulnerable to bots.

User 'alert' made the post, then upvoted it with his bots, to prove how easy it was to manipulate traffic, even without funding.

see:
https://kbin.social/m/lemmy@lemmy.ml/t/79888/Protect-Moderate-Purge-Your-Sever

It's proof that anyone could easily manipulate content unless instance owners take the bot issue seriously.

[–] xtremeownage@lemmyonline.com 4 points 1 year ago

I did update my post, shortly before you posted this, to include that- as well as- removing a lot of the data for individual instances as it detracts from the point / problem I am trying to identify.

The data, however, is quite valuable in exposing that this WILL be a problem for us, especially if we do not identify a solution for it.

[–] db0@lemmy.dbzer0.com 4 points 1 year ago

Absolutely. Me and a couple of others have been warning against this for a week now.

[–] xtremeownage@lemmyonline.com 5 points 1 year ago

Does appear, a few of these are common hosts though-

lemmy.dekay.se, bbs.darkswitch.net, normalcity.life, etc.

[–] AnarchoGravyBoat@kbin.social 5 points 1 year ago (1 children)

@xtremeownage

I think that one of the most difficult things about dealing with more common bots (spamming, reposting, etc.) is that parsing all the commentary and dealing with it on a service-wide level is really hard to do, in terms of computing power and sheer volume of content. Seems to me that doing this on an instance level with user numbers in the tens of thousands is a heck of a lot more reasonable than doing it on a service with tens of millions of users.

What I'm getting at is that this really seems like something that could (maybe even should) be built into the instance moderation tools, at least some method of marking user activity as suspicious for further investigation by human admins/mods.
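
As a minimal sketch of "marking user activity as suspicious for further investigation", the heuristic below flags new accounts that vote heavily but have never posted or commented. The fields, thresholds, and names are all hypothetical; it is not an existing Lemmy feature, and a flag here is only a prompt for a human admin to look, not proof of a bot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserActivity:
    name: str
    account_age_days: int
    posts: int
    comments: int
    votes: int


def is_suspicious(user, max_age_days=7, min_votes=50):
    """Flag new accounts that vote heavily but have never posted or commented."""
    return (
        user.account_age_days <= max_age_days
        and user.posts + user.comments == 0
        and user.votes >= min_votes
    )


# Made-up sample data.
sample_users = [
    UserActivity("lurker", account_age_days=3, posts=0, comments=0, votes=4),
    UserActivity("regular", account_age_days=200, posts=12, comments=340, votes=900),
    UserActivity("votebot", account_age_days=1, posts=0, comments=0, votes=250),
]
flagged = [u.name for u in sample_users if is_suspicious(u)]
print(flagged)  # prints ['votebot']
```

The vote threshold is what keeps an ordinary new lurker from being flagged, so it would need tuning against real activity on each instance.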

We're really operating on the assumption that people spinning up instances are acting in good faith, until they prove that they aren't. I think the first step is giving good faith actors the tools to moderate effectively, then worrying about bad faith admins.

[–] Ataraxia@lemmy.world 4 points 1 year ago (1 children)

Lol well it was fun while it lasted! Man there are some really greedy assholes out there.

[–] xtremeownage@lemmyonline.com 3 points 1 year ago

Well- I have not seen much evidence that supports this is actively being used... yet.

Just- bringing more attention to how easy it is to do.

[–] Cinner@kbin.social 3 points 1 year ago (1 children)

Reposting this in comment from a reply elsewhere in the thread.

If anything there should be SOME centralization that allows other (known, somehow verified) instances to vote to disallow spammy instances from federating. In some way that couldn't be abused. This may lead to a fork down the road (think BTC vs BCH) due to community disagreements but I don't really see any other way this doesn't become an absolute spamfest. As it stands now one server admin could spamfest their own server with their own spam, and once it starts federating EVERYONE gets flooded. This also easily creates a DoS of the system.

Asking instance admins to require CAPTCHA or whatever to defeat spam doesn't work when the instance admins are the ones creating spam servers to spam the federation.

[–] xtremeownage@lemmyonline.com 9 points 1 year ago* (last edited 1 year ago)

I'm working to build a simple GUI around db0's project.

It's simple enough- it allows instance owners to vet other instances.

Edit-

@nohbdyuno@sh.itjust.works you going to continue just blindly downvoting everything? lol.

[–] Sibbo@sopuli.xyz 3 points 1 year ago (1 children)

I really hope that some researchers will get interested in this and develop some cool solutions to it. Maybe we are lucky and they even implement them into Lemmy.

[–] xtremeownage@lemmyonline.com 3 points 1 year ago

I agree, I think the data is easily there to perform the proper analysis, and there are enough hooks in the platform to apply the results.
