this post was submitted on 05 Jul 2023
64 points (91.0% liked)


I made this tool to help self-hosters, new admins, or smaller instances have more global and updated content on their instances.

This is similar to Lemmy Community Seeder, but it is designed to be run periodically to capture new communities, and it includes EVERYTHING by default.
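As a rough illustration of the "run periodically" idea, here is a minimal Python sketch of the bookkeeping step only: given the communities an instance already knows about and the ones found on the latest pass, work out which are new. The function name and the example community addresses are hypothetical, and this is not lemmony's actual code; fetching the lists and subscribing through the Lemmy API are deliberately left out.

```python
def find_new_communities(known, discovered):
    """Return communities found in this pass that were not known before.

    Both arguments are iterables of community addresses written as
    "name@instance" (a hypothetical format chosen for this example).
    Matching is exact; the result is sorted so repeated runs give
    stable output.
    """
    return sorted(set(discovered) - set(known))


if __name__ == "__main__":
    # Hypothetical data: what the instance had after the previous run...
    known = ["fediverse@lemmy.world", "selfhosted@lemmy.world"]
    # ...and what the current discovery pass turned up.
    discovered = ["fediverse@lemmy.world", "linux@example.org"]
    print(find_new_communities(known, discovered))
```

Only the communities in the returned list would need a follow request on each run, which keeps a periodic job from repeating work it has already done.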

EDIT: As noted in the comments, this is an admin tool. Please do not run it as a user if you don't know what you are doing. If you want a better "All," ask your admin first! That said, lemmony in no way constitutes abuse! You can cause a DoS with curl, but that's not what curl was written for. This tool is meant to legitimately use an API to enhance our experience. Admins who want to accommodate high volume on a public service will not know this tool is running against or on their instances. If it causes performance issues, that is unfortunate. They are free to throttle, ban or block API access to their instance in a multitude of ways.
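For readers wondering what "throttle" can mean in practice, below is a minimal token-bucket sketch in Python. It is a generic rate-limiting technique, not Lemmy's own rate limiter and not something lemmony ships; the class name, the `rate` and `capacity` parameters, and the injectable `clock` are all assumptions made for this example.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum tokens the bucket can hold
        self.clock = clock          # injectable so the logic can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True and consume a token if a request may proceed now."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=1, capacity=2)
    # The first two calls pass as a burst; the third is refused
    # until enough time has passed to refill a token.
    print([bucket.allow() for _ in range(3)])
```

The same class works on either side: a server could keep one bucket per client to refuse excess requests, and a client script could check its own bucket before each call to stay polite.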

EDIT 2: Donate to your instance/admin if you like Lemmy!

[–] coldhotman@nrsk.no 10 points 1 year ago (2 children)

I did something similar. When I federated around 700 active communities the instance used around 3GB per day. That's too much. Include EVERYTHING?

I'll call the coast guard because you all will be drowning lol
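To put a figure like 3GB per day in context, here is a small back-of-the-envelope sketch in Python. The growth rate is the one quoted above; the function names and the 30-day and 90GB inputs are assumptions for illustration, and real growth will vary with what gets federated and how long it is retained.

```python
def projected_growth_gb(growth_gb_per_day, days):
    """Total storage added after `days` at a constant daily growth rate."""
    return growth_gb_per_day * days


def days_until_full(free_gb, growth_gb_per_day):
    """How many days the remaining free space lasts at a constant rate."""
    if growth_gb_per_day <= 0:
        raise ValueError("growth_gb_per_day must be positive")
    return free_gb / growth_gb_per_day


if __name__ == "__main__":
    # 3 GB/day is the rate reported in the comment above.
    print(projected_growth_gb(3, 30))   # storage added in 30 days
    # 90 GB of free disk is a hypothetical figure.
    print(days_until_full(90, 3))       # days before that disk fills
```

At a steady 3GB per day, a month adds 90GB, so the number to watch is how much free disk you have divided by your own measured daily growth.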

[–] iAmNotorious@lemmy.world 5 points 1 year ago (1 children)

Quite a bit of space could be saved with database compression. The database side of things has lower hanging fruit right now though.

[–] coldhotman@nrsk.no 2 points 1 year ago (1 children)

Really? Like a lot? I thought the biggest space hog was images by far.

[–] ipha@lemmy.world 2 points 1 year ago (2 children)

Images are not federated, they only live on the hosting instance.

Thumbnails might be copied though, I'm not sure.

[–] SmashingSquid@notyour.rodeo 2 points 1 year ago (1 children)

Does the image get stored on the poster's instance or the instance hosting the community they're posting to?

[–] ipha@lemmy.world 2 points 1 year ago

The poster's.

[–] coldhotman@nrsk.no 1 points 1 year ago* (last edited 1 year ago) (1 children)

Hmm. Could you help me debug https://nrsk.no/post/160040, since the image from post https://lemmy.dbzer0.com/post/481984 is hosted at https://nrsk.no/pictrs/image/e899afcc-83fa-481e-b9d4-f5e0e693c1e3.jpeg?

I would initially think the image was federated as per the system design but you obviously know more about how my instance works than I do.

Any tips will be greatly appreciated. I also suggest you file a GitHub report, since this is a bug every Lemmy instance seems to struggle with.

[–] hawkwind@lemmy.management 1 points 1 year ago (1 children)

There is some discussion. https://github.com/LemmyNet/lemmy/issues/2947

I am still fairly confident that it shouldn't be storing images, but I'll admit my pict-rs directory is growing quite fast compared to the database. Have to keep a close eye on this.

[–] coldhotman@nrsk.no 0 points 1 year ago (1 children)

I am still fairly confident that it shouldn’t be storing images

If the lead developers discuss this not as a bug but as an inconvenience, then there's nothing I can ever say or show you to convince you otherwise.

Enjoy your non-proxying Lemmy experience - Since your pict-rs directory is growing but not due to your instance caching images it's probably the docker log bug that was discussed over a year ago.

[–] hawkwind@lemmy.management 1 points 1 year ago

I'm not convinced either one of us knows what the software is SUPPOSED to do, and I am pretty sure nobody knows what it's actually doing. Here's another thread: https://github.com/LemmyNet/lemmy/issues/3163

[–] hawkwind@lemmy.management 1 points 1 year ago* (last edited 1 year ago) (2 children)

EVERYTHING by default. Also working on "discover only" for searching without the subscribe-to-everything. That said: It's far less than 3GB per day for EVERYTHING I can see, plus: you don't HAVE to keep it forever. Were you doing something that got other than text?

[–] Alfi@lemmy.alfi.casa 2 points 1 year ago (1 children)

Do you have a link to a documentation concerning retention/cleanup for instances?

[–] hawkwind@lemmy.management 1 points 1 year ago

I don't. I haven't looked yet either because I haven't crossed that bridge. I think there were some admins on Matrix chatting about it though. It will become an issue for large instances in the near term, so I suspect someone will tackle it very soon, if they haven't already.

[–] coldhotman@nrsk.no -1 points 1 year ago (1 children)

Were you doing something that got other than text?

Images federate, friend. There's a lot of cat pics on the lemmyverse.

[–] hawkwind@lemmy.management 3 points 1 year ago

They're not supposed to, and don't call me friend, buddy.

[–] Thief@lemmy.world 6 points 1 year ago* (last edited 1 year ago) (1 children)

I ran this and it causes a lot of load. Only issue is any user can run it, so basically Lemmy servers are pretty much open to being DDoS attacked by a user subscribing to everything, it seems. I have a pretty good server and it consumed 30% of the CPU continuously just adding all of lemmy.world. Unclear how much disk space I just committed myself to also. Small instances would be decimated by this. Wouldn't be hard for someone to load this docker up 10 times and pull from the biggest 10 Lemmy servers all at once to max out a server and cause a lot of other issues.

Other than this, good job. Seems to work well. Maybe too well.

[–] hawkwind@lemmy.management 1 points 1 year ago

Added a lot of features if you want to try again.

[–] ram@lemmy.ramram.ink 3 points 1 year ago (1 children)

I'd recommend anyone using this to really consider how much data this'll use on their system.

[–] usbpc@programming.dev 3 points 1 year ago

I like this idea, I've been thinking about running my own private instance but decided against it as I like the main feed with many different communities that larger instances offer.

[–] VentraSqwal@links.dartboard.social 2 points 1 year ago (2 children)

This is a neat idea. I'm guessing it's something the admin of the instance has to run?

[–] usbpc@programming.dev 5 points 1 year ago

If you are on a smaller instance you should probably ask the admin(s) if they are okay with something like this; it would put a lot of extra strain on the server and might overload small instances.

[–] hawkwind@lemmy.management 2 points 1 year ago

Technically no. Any user could. I wouldn't recommend it though, as it will subscribe you to every community.

[–] wintermute@feddit.de 1 points 1 year ago (2 children)

Hosting an instance myself, I’m not amused, because it forces my instance to literally sync all the content there is on the lemmyverse, drastically increasing traffic, storage use etc.

Please don’t force resource consumption beyond any rational usage!

[–] hawkwind@lemmy.management -1 points 1 year ago

The weird rage people have about this. I'm not sure where it comes from. If there are 100 communities, only the top 1-5 will contribute 90% of the content. If you have even one user subscribed to the top 20 or 50 communities, you are already likely getting 90%+ of this traffic. After subscribing to literally every community in the lemmyverse, I promise your instance will not see any meaningful increase. I'm willing to be proven wrong, but not one of the ragers has offered a credible reason other than fears based on misunderstanding. No offense.
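One way to test a concentration claim like this against your own instance, instead of arguing about it, is to compute what share of posts the busiest communities actually account for. The Python sketch below does only that arithmetic; the function name and the sample counts are hypothetical, and how you obtain per-community post counts (for example from your own database) is left as an assumption.

```python
def top_share(post_counts, k):
    """Fraction of all posts contributed by the k busiest communities.

    `post_counts` is an iterable with one post count per community.
    Returns 0.0 when there are no posts at all.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    counts = list(post_counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    busiest = sorted(counts, reverse=True)[:k]
    return sum(busiest) / total


if __name__ == "__main__":
    # Made-up counts for four communities, purely to show the call.
    sample = [5, 90, 3, 2]
    print(top_share(sample, 1))
```

If the share for the top handful of communities on your instance is close to 1, subscribing to the long tail adds little traffic; if it is much lower, the extra load is real and worth measuring before running a subscribe-to-everything job.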
