this post was submitted on 23 Jun 2023
3 points (100.0% liked)

Lemmy Server Performance

3 readers
1 users here now

Lemmy Server Performance

lemmy_server uses the Diesel ORM that automatically generates SQL statements. There are serious performance problems in June and July 2023 preventing Lemmy from scaling. Topics include caching, PostgreSQL extensions for troubleshooting, Client/Server Code/SQL Data/server operator apps/sever operator API (performance and storage monitoring), etc.

founded 1 year ago
MODERATORS
 

Heyho,

as a PostgreSQL guy, i'm currently working an tooling environment to simulate load on a lemmy instance and measure the database usage.

The tooling is written in Go (just because it is easy to write parallel load generators with it) and i'm using tools like PoWA to have a deep look at what happens on the database.

Currently, i have some troubles with lemmy itself, that make it hard to realy stress the database. For example the worker pool is far to small to bring the database near any real performance bootlenecks. Also, Ratelimiting per IP is an issue.

I though about ignoring the reverse proxy in front of the lemmy API and just spoof Forwarded-For headers to work around it.

Any ideas are welcome. Anyone willing to help is welcome.

Goals if this should be:

  • Get a good feeling for the current database usage of lemmy
  • Optimize Statements and DB Layout
  • Find possible improvements by caching

As your loved core devs for lemmy have large Issue Tracker backlog, some folks that talk rust fluently would be great, so we can work on this dedicated topic and provide finished PR's

Greatings, Loki (@tbe on github)

top 4 comments
sorted by: hot top controversial new old
[–] phiresky@lemmy.world 0 points 1 year ago (1 children)

Hey! I'm working on a rust tool right now to import a month of reddit dump into a lemmy instance using the federation api (for benchmarking / load testing as well).

[–] vapeloki@lemmy.ml 0 points 1 year ago (1 children)

Nice! This would greatly help to populate the database and get much better results!

[–] phiresky@lemmy.world 1 points 1 year ago

my reddit import code is here for the time being: https://github.com/phiresky/lemmy/tree/reddit-importer

it works but is undocumented

[–] RoundSparrow@lemmy.ml 0 points 1 year ago* (last edited 1 year ago)

One of the big concerns I have is that there seems to be no sense of the problems being faced. The project was built around very little data for years, and growing pains abound.

As of today, lemmy.ml says this is the posting with the most comments (local), 852: https://lemmy.ml/post/1186515 This federated posting from Beehaw has over 1000: https://lemmy.ml/post/1265302

On Reddit, a "large" news event, such as the discovery of the Titanic submarine this week, can have 10,000 comments - https://old.reddit.com/r/news/comments/14g7ipn/debris_field_discovered_within_search_area_near/

And that isn't even a major news breaking event on the order of a terrorist attack, Japan earthquake/nuclear incident, famous person being shot, etc.

load more comments
view more: next ›