this post was submitted on 04 Jul 2023

Lemmy Server Performance

lemmy_server uses the Diesel ORM, which automatically generates SQL statements. Serious performance problems in June and July 2023 are preventing Lemmy from scaling. Topics include caching, PostgreSQL extensions for troubleshooting, client/server code, SQL data, server operator apps, the server operator API (performance and storage monitoring), etc.


I spent several hours tracing in production (updating the code a dozen times with extra logging) to identify the actual path the lemmy_server code uses for outbound federation of votes to subscribed servers.

The major popular servers (Beehaw, Lemmy.world, Lemmy.ml) have a large number of instance servers subscribing to their communities to get copies of every post and comment. Comment votes/likes are the most common activity, and the proposal is that during the PERFORMANCE CRISIS these overwhelmed servers turn off outbound vote/like sharing.

Draft pull request:

https://github.com/LemmyNet/lemmy/compare/main...RocketDerp:lemmy_comment_votes_nofed1:no_federation_of_votes_outbound0

EDIT: LEMMY_SKIP_FEDERATE_VOTES environment variable
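
For anyone who wants the idea without reading the diff, here is a rough sketch of gating outbound votes on that environment variable. lemmy_server is written in Rust, so this Python version only illustrates the logic. The set of accepted values, the function names, and the "Like"/"Dislike" activity kinds are assumptions for illustration, not what the draft necessarily does.

```python
import os

# Assumption: any of these values turns the skip on.
# The draft pull request may parse the variable differently.
_TRUTHY = {"1", "true", "yes", "on"}


def skip_federate_votes(environ=None):
    """Return True when outbound vote federation should be skipped."""
    env = os.environ if environ is None else environ
    value = env.get("LEMMY_SKIP_FEDERATE_VOTES", "")
    return value.strip().lower() in _TRUTHY


def activities_to_send(activities, environ=None):
    """Drop vote activities when the flag is set; keep everything else.

    Each activity is a dict with a hypothetical "kind" key.
    """
    if not skip_federate_votes(environ):
        return list(activities)
    return [a for a in activities if a["kind"] not in ("Like", "Dislike")]
```

With the variable unset, every activity is still sent; with it set to `1` or `true`, only the votes are filtered out, and posts and comments keep federating.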

King@vlemmy.net (1 year ago, edited)

Thanks for doing all this.

Do we have any real numbers from a real server? How many votes is a server trying to federate, and to how many servers?

Just ballparking some approximate numbers:

15000 * 4000 * 10 = 600,000,000 federated actions. Spread over a day, that is around 7,000 per second, around the clock, for one community.
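
To make the ballpark easy to re-run with other guesses, the arithmetic can be written out as below. The three factors are taken as-is from the line above, and treating the total as one day's worth of actions is an assumption (it is what makes the per-second figure come out near 7,000).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def federated_actions(a, b, c):
    """Multiply the three ballpark factors into a total action count."""
    return a * b * c


total = federated_actions(15_000, 4_000, 10)
# Assumption: the total is spread evenly over a single day.
per_second = total / SECONDS_PER_DAY

print(f"{total:,} actions, about {per_second:,.0f} per second")
# prints: 600,000,000 actions, about 6,944 per second
```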

IMO, this real-time federation just doesn't scale. We need to start planning the specs for federation batching.
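
As a starting point for that discussion, here is a minimal sketch of what batching could look like: instead of one delivery per activity, queued activities are grouped per destination server and sent in chunks. The function name, the `(server, activity)` pair shape, and the batch size of 100 are all made up for illustration; nothing like this exists in lemmy_server as far as this thread shows.

```python
from collections import defaultdict


def batch_by_server(deliveries, batch_size=100):
    """Group (server, activity) pairs into per-server batches.

    Returns a list of (server, [activities]) tuples, each holding at most
    batch_size activities, in the order the servers were first seen.
    """
    per_server = defaultdict(list)
    for server, activity in deliveries:
        per_server[server].append(activity)

    batches = []
    for server, activities in per_server.items():
        for start in range(0, len(activities), batch_size):
            batches.append((server, activities[start:start + batch_size]))
    return batches
```

With a batch size of 100, a server owed 250 queued votes would get 3 requests instead of 250.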

RoundSparrow@lemmy.ml (1 year ago)

I'm hoping the number of 'subscribed servers' is maybe only 300 or so? But I don't know; in my experience the big sites haven't been sharing information like that. They did say there were "millions" of outbound federation tasks. I expect the number of votes per user is higher than your number. They did put in code changes to detect servers they can't reach and to stop attempting delivery.

"We need to start planning the specs for federation batching."

I think a pull app that goes around to servers with content and uses the front-end API to grab 300 or more comments at a time, etc., is the way to go. The client API is geared toward batch delivery. Since lemmy.ml is so unstable for discussion, I opened a topic on GitHub: https://github.com/RocketDerp/lemmy_helper/discussions/4 - where I proposed a new /api/syncshare to get more raw data out of the PostgreSQL tables.
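
Roughly, the pull loop I have in mind looks like the sketch below. `fetch_page` is a hypothetical stand-in for whatever call ends up fetching one page of comments from a remote server (the existing client API or the proposed /api/syncshare); the `page` and `limit` parameter names are assumptions, and the fake server here exists only so the sketch runs on its own.

```python
def pull_all(fetch_page, limit=300):
    """Yield every item, one page at a time, until a short page comes back."""
    page = 1
    while True:
        items = fetch_page(page=page, limit=limit)
        yield from items
        if len(items) < limit:
            # A page smaller than the limit means there is nothing left.
            break
        page += 1


# Stand-in for a remote server: 650 fake comments served in pages.
FAKE_COMMENTS = [{"id": n} for n in range(650)]


def fake_fetch_page(page, limit):
    start = (page - 1) * limit
    return FAKE_COMMENTS[start:start + limit]


pulled = list(pull_all(fake_fetch_page))
print(len(pulled))  # prints: 650
```

Three requests (300, 300, 50) move 650 comments here, versus one outbound delivery per comment per subscriber with push federation.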