this post was submitted on 12 Jun 2023
74 points (100.0% liked)


Some of you may have noticed that federated actions are slow to synchronize between Lemmy instances. This is most likely because of the setting "Federation worker count" under /admin. It determines how many federation activities can be sent out at once. The default value is 64, which is enough for small or medium-sized instances, but large instances need to increase it.

Grep the server logs for "Maximum number of activitypub workers reached" and "Activity queue stats" to confirm that this affects you. For lemmy.ml I just changed the value to 512; you will have to experiment to find what is sufficient for your instance. The new value is only applied after restarting Lemmy. In my case changing the value through the website didn't work (maybe because it's overloaded). Instead I had to update local_site.federation_worker_count directly in the database.
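For anyone who prefers a script to grep, here is a minimal sketch of the log check described above. The two message strings are taken from the post; the function name and the idea of piping logs in as lines are my own assumptions, not anything Lemmy ships.

```python
from collections import Counter
from typing import Iterable

# The two log messages named in the post.
PATTERNS = (
    "Maximum number of activitypub workers reached",
    "Activity queue stats",
)


def count_federation_warnings(lines: Iterable[str]) -> Counter:
    """Count how many log lines contain each of the two messages.

    Hypothetical helper: `lines` can be any iterable of strings,
    for example an open log file or sys.stdin.
    """
    counts = Counter({pattern: 0 for pattern in PATTERNS})
    for line in lines:
        for pattern in PATTERNS:
            if pattern in line:
                counts[pattern] += 1
    return counts
```

Pass it an open file or `sys.stdin` and print the resulting counts. A non-zero count for the "workers reached" message suggests the worker limit is being hit.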

Edit: I had to increase the value to 160k for lemmy.ml. Now the stats aren't getting logged anymore, so I'm not sure if the pending queue is still building up or not.
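The direct database change mentioned in the post could look roughly like the sketch below. The table and column names (local_site, federation_worker_count) come from the post. The helper name is hypothetical, and the statement assumes local_site holds a single row, so check that on your own instance before running it. How you execute the SQL (psql or any other client) is up to you, and Lemmy still needs a restart afterwards.

```python
def build_worker_count_update(worker_count: int) -> str:
    """Build the UPDATE statement for the federation worker count.

    Hypothetical helper. Assumes local_site has exactly one row,
    so no WHERE clause is used.
    """
    # Reject booleans, non-integers and non-positive values so that
    # only a plain positive number ends up in the SQL text.
    if (
        isinstance(worker_count, bool)
        or not isinstance(worker_count, int)
        or worker_count < 1
    ):
        raise ValueError("worker_count must be a positive integer")
    return f"UPDATE local_site SET federation_worker_count = {worker_count};"


print(build_worker_count_update(512))
# prints: UPDATE local_site SET federation_worker_count = 512;
```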

top 13 comments
[–] ruud@lemmy.world 6 points 1 year ago (1 children)

Good tip. I changed it to 512 also, yesterday.

[–] kadu@lemmy.world 3 points 1 year ago

As a user, I just woke up and can say Lemmy.world feels waaay smoother on both desktop and mobile, interacting with other instances too for absolute sure.

[–] Noedel@lemmy.ml 4 points 1 year ago

Good luck today, admins!

[–] poVoq@slrpnk.net 3 points 1 year ago

Ah maybe that should be mentioned somewhere that this specific config requires a restart of the Lemmy backend.

I think settings that do require this should be only in the lemmy.hjson config file.

I increased ours to 128 yesterday, but wanted to keep downtime low so I didn't restart.

[–] ProfessionalHandJob@lemmy.beyondcombustion.net 2 points 1 year ago* (last edited 1 year ago)

my server is just me currently.... but it's got 100GB of RAM and 30CPUs so i kicked my value up to 200k but this shit (the fediverse) is still slow as hell/doesn't sync with most other servers because their specs are so low. People need to stop running them on $4 VPS shit boxes.

[–] bdonvr@lemmy.rogers-net.com 2 points 1 year ago (1 children)

And that only affects outbound federation, basically? So a small instance shouldn't have many issues with this, even if it's subscribed to a lot of very busy communities?

[–] nutomic@lemmy.ml 2 points 1 year ago
[–] RoundSparrow@lemmy.ml 2 points 1 year ago* (last edited 1 year ago)

Going from 512 to 160,000 is a massive parameter change.

Network replication like this presents a ton of issues: servers going up and down, database inserts locking tables, the desire for backfill and integrity checks, etc.

Today things are going poorly, this posting has an example: https://lemmy.ml/post/1239920 -- comments are not showing up on other instances after hours.

From a denial-of-service perspective, intentional or accidental, I think we need to start discussing the protocol for federation: when servers connect to each other, how frequently, and how they behave when getting errors from a remote host.

Is lemmy_server doing all the federation replication in-code? I would propose moving it to an independent service - perhaps a shell application - and start looking at replication (and associated queues) as a core thing to manage, optimize, and operate. It isn't working seamlessly, and having hundreds of servers creates a huge amount of complexity.

[–] simple@lemmy.world 2 points 1 year ago

I did notice that syncing has been slow lately, hopefully this does fix it.

[–] fchaverri@mamut.cr 1 points 1 year ago

@nutomic I wonder if my mastodon instance requires similar tweaking, I haven't seen any new updates from Lemmy.ml today on my feed...

[–] ono@lemmy.ca 0 points 1 year ago* (last edited 1 year ago) (1 children)

I have a bunch of lemmy.ml communities still stuck in "Subscribe Pending" state even after waiting for days. Canceling and re-subscribing does not help.

[–] nutomic@lemmy.ml 0 points 1 year ago (1 children)
[–] snowe@lemmy.ml 1 points 1 year ago

I am also seeing this.
