this post was submitted on 17 Aug 2023
Lemmy Server Performance
lemmy_server uses the Diesel ORM, which automatically generates SQL statements. There were serious performance problems in June and July 2023 preventing Lemmy from scaling. Topics include caching, PostgreSQL extensions for troubleshooting, client/server code, SQL data, server operator apps, the server operator API (performance and storage monitoring), etc.
OK, experimenting on a massive test data set of over 5 million posts... this PostgreSQL statement works pretty well.
This limits any one community to 1000 posts, picking the most recently created posts. This gives a way to age out older data in very active communities without removing any posts at all for small communities.
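To make the per-community cap concrete, here is a minimal sketch of one way to express it. It is not the statement from the comment above. It assumes a simplified post_aggregates table with only id, community_id and published columns, uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so it runs without a server, and the helper names (recent_post_ids, build_demo_db) are hypothetical. The ROW_NUMBER() window function is written the same way in PostgreSQL.

```python
import sqlite3


def recent_post_ids(conn, per_community_limit):
    """Return ids of the newest `per_community_limit` posts in each community."""
    rows = conn.execute(
        """
        SELECT id FROM (
            SELECT id,
                   ROW_NUMBER() OVER (
                       PARTITION BY community_id
                       ORDER BY published DESC, id DESC
                   ) AS rank_in_community
            FROM post_aggregates
        ) AS ranked
        WHERE rank_in_community <= ?
        ORDER BY id
        """,
        (per_community_limit,),
    ).fetchall()
    return [row[0] for row in rows]


def build_demo_db():
    """Create a tiny in-memory table (assumed, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE post_aggregates ("
        " id INTEGER PRIMARY KEY,"
        " community_id INTEGER NOT NULL,"
        " published INTEGER NOT NULL)"
    )
    # Community 1 is busy (5 posts); community 2 is small (2 posts).
    conn.executemany(
        "INSERT INTO post_aggregates VALUES (?, ?, ?)",
        [(1, 1, 100), (2, 1, 200), (3, 2, 250), (4, 1, 300),
         (5, 1, 400), (6, 2, 450), (7, 1, 500)],
    )
    return conn


if __name__ == "__main__":
    # A cap of 3 stands in for the 1000 used in the real experiment.
    print(recent_post_ids(build_demo_db(), 3))
```

With a cap of 3, the busy community loses its two oldest posts while the small community keeps everything, which is the aging-out behavior described above.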
An even less-intrusive approach is to not add any new field to existing tables. Establish a reference table, say called include_range. There is already an ENUM value for each sort type, so the include_range table would have these columns:
sort_type ENUM, lowest_id BigInt, highest_id BigInt
Run a variation of this to populate that table:
Run it against every sort order, including OLD. Capture only two BigInt results, the MIN(id) and the MAX(id); that gives a range over the whole table. Then every SELECT on the post_aggregates / post tables includes WHERE id >= lowest_id AND id <= highest_id.
That would put in a basic sanity check that ages-out content, and it would be right against the primary key!
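The include_range idea can be sketched end to end as follows. This is an illustration under assumptions, not the code linked later in this comment: sqlite3 again stands in for PostgreSQL, sort_type is plain text instead of an ENUM, only two sort types are modeled, and SORT_ORDERS, rebuild_include_range and list_posts are hypothetical names.

```python
import sqlite3

# Hypothetical sort types mapped to ORDER BY clauses. These are fixed
# strings, never user input, so interpolating them into SQL is safe here.
SORT_ORDERS = {
    "New": "published DESC, id DESC",
    "Old": "published ASC, id ASC",
}


def build_demo_db():
    """Create a tiny in-memory post_aggregates table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE post_aggregates ("
        " id INTEGER PRIMARY KEY,"
        " community_id INTEGER NOT NULL,"
        " published INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO post_aggregates VALUES (?, ?, ?)",
        [(1, 1, 100), (2, 1, 200), (3, 2, 250), (4, 1, 300),
         (5, 1, 400), (6, 2, 450), (7, 1, 500)],
    )
    return conn


def rebuild_include_range(conn, per_community_limit):
    """Store MIN(id) and MAX(id) of the capped post set for each sort type."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS include_range ("
        " sort_type TEXT PRIMARY KEY,"
        " lowest_id INTEGER NOT NULL,"
        " highest_id INTEGER NOT NULL)"
    )
    for sort_type, order_by in SORT_ORDERS.items():
        low, high = conn.execute(
            f"""
            SELECT MIN(id), MAX(id) FROM (
                SELECT id,
                       ROW_NUMBER() OVER (
                           PARTITION BY community_id ORDER BY {order_by}
                       ) AS rank_in_community
                FROM post_aggregates
            ) AS ranked
            WHERE rank_in_community <= ?
            """,
            (per_community_limit,),
        ).fetchone()
        if low is None:
            continue  # empty post table: nothing to record
        # SQLite upsert; PostgreSQL would use INSERT ... ON CONFLICT.
        conn.execute(
            "INSERT OR REPLACE INTO include_range VALUES (?, ?, ?)",
            (sort_type, low, high),
        )
    conn.commit()


def list_posts(conn, sort_type, limit):
    """List post ids for one sort type, filtered by the stored id range."""
    rows = conn.execute(
        f"""
        SELECT p.id
        FROM post_aggregates AS p
        JOIN include_range AS r ON r.sort_type = ?
        WHERE p.id >= r.lowest_id AND p.id <= r.highest_id
        ORDER BY {SORT_ORDERS[sort_type]}
        LIMIT ?
        """,
        (sort_type, limit),
    ).fetchall()
    return [row[0] for row in rows]
```

Because only the two endpoints are stored, the range is a coarse filter: any id between lowest_id and highest_id passes, even if that post was not among the capped rows. In exchange, the filter is a plain range comparison on the primary key, and rebuild_include_range can be rerun on a schedule without touching the code that serves the listing.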
A core design point of either approach is that server operators can change how this data is built without needing to modify or restart the lemmy_server Rust code.
3 hours later... I put it into code and am experimenting with it. Some proof of concept results: https://github.com/LemmyNet/lemmy/files/12373819/auto_explain_list_post_community_0_18_4_dullbananas_with_inclusion_run0a.txt