this post was submitted on 05 Jul 2023
3062 points (99.2% liked)

Lemmy.World Announcements

29084 readers
184 users here now

This Community is intended for posts about the Lemmy.world server by the admins.

Follow us for server news 🐘

Outages 🔥

https://status.lemmy.world/

For support with issues at Lemmy.world, go to the Lemmy.world Support community.

Support e-mail

Support requests are best sent to the info@lemmy.world e-mail address.

Donations 💗

If you would like to make a donation to support the cost of running this platform, please do so at the following donation URLs.

If you can, please use or switch to Ko-Fi; it has the lowest fees for us.

Ko-Fi (Donate)

Bunq (Donate)

Open Collective backers and sponsors

Patreon

Join the team

founded 2 years ago

Another day, another update.

More troubleshooting was done today. What did we do:

  • Yesterday evening @phiresky@lemmy.world did some SQL troubleshooting with some of the lemmy.world admins. Afterwards, phiresky submitted some PRs to GitHub.
  • @cetra3@lemmy.ml created a Docker image containing three PRs: Disable retry queue, Get follower Inbox Fix, Admin Index Fix.
  • We started using this image and saw a big drop in CPU usage and disk load.
  • We saw thousands of errors per minute in the nginx log from old clients trying to access the websockets (which were removed in 0.18), so we added a return 404 in the nginx conf for /api/v3/ws.
  • We updated lemmy-ui from RC7 to RC10, which fixed a lot, among which the issue with replying to DMs.
  • We found that the many 502 errors were caused by an issue in Lemmy/markdown-it.actix or whatever, causing nginx to temporarily mark an upstream as dead. As a workaround we can either 1) use only one container, or 2) set ~~proxy_next_upstream timeout;~~ max_fails=5 in nginx.
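The websocket change above is a one-line nginx block. A minimal sketch (the exact server-block layout on lemmy.world is an assumption):

```nginx
# Old (pre-0.18) clients still hammer the removed websocket endpoint;
# answering directly in nginx keeps that noise away from Lemmy itself.
location /api/v3/ws {
    return 404;
}
```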

Currently we're running with one Lemmy container, so the 502 errors are completely gone so far, and thanks to the fixes in the Lemmy code everything seems to be running smoothly. If needed we could spin up a second Lemmy container using the ~~proxy_next_upstream timeout;~~ max_fails=5 workaround, but for now it seems to hold with one.

Thanks to @phiresky@lemmy.world , @cetra3@lemmy.ml , @stanford@discuss.as200950.com, @db0@lemmy.dbzer0.com , @jelloeater85@lemmy.world , @TragicNotCute@lemmy.world for their help!

And not to forget, thanks to @nutomic@lemmy.ml and @dessalines@lemmy.ml for their continuing hard work on Lemmy!

And thank you all for your patience, we'll keep working on it!

Oh, and as a bonus, an image (thanks phiresky!) of the change in bandwidth after implementing the new Lemmy Docker image with the PRs.

Edit: So as soon as the US folks wake up (hi!) we seem to need the second Lemmy container for performance. That's now started, and I noticed the proxy_next_upstream timeout setting didn't work (or I didn't set it properly), so I used max_fails=5 for each upstream, which does actually work.
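For reference, the max_fails=5 workaround sits on the upstream servers, roughly like this (container names and ports are hypothetical; a sketch, not the actual lemmy.world config):

```nginx
upstream lemmy {
    # Allow 5 failed attempts before nginx marks a backend as
    # unavailable for fail_timeout (10 seconds by default).
    server lemmy-1:8536 max_fails=5;
    server lemmy-2:8536 max_fails=5;
}
```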

50 comments
[–] KSPAtlas@sopuli.xyz 20 points 1 year ago (1 children)

Shouldn't the correct HTTP status code for a removed API be 410? 404 indicates the resource wasn't found or doesn't exist; 410 indicates a resource that has been removed.
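If the admins followed this suggestion, only the status code in the nginx block from the post would change (a sketch, assuming the location block described above):

```nginx
location /api/v3/ws {
    # 410 Gone: the endpoint existed but was intentionally removed,
    # so clients know not to retry.
    return 410;
}
```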

[–] Hupf@feddit.de 11 points 1 year ago (2 children)

Or 418 for the wrong API being used :^)

[–] pathief@lemmy.world 20 points 1 year ago* (last edited 1 year ago) (8 children)

Is it safe to use 2FA yet?

[–] shotgun_crab@lemmy.world 18 points 1 year ago

You guys are absolute legends, thanks for the update!

[–] Marxine@lemmy.world 18 points 1 year ago (1 children)

Lemmy's devs and the .world admins have done in a month what Reddit hasn't done in its whole existence: delivered a smooth and almost bug-free experience.

Jerboa feels so damn FRESH to use now!

[–] wolfpack86@lemmy.world 18 points 1 year ago* (last edited 1 year ago) (1 children)

Not to undervalue the efforts going into this, because I appreciate the new community and the transparency, but I believe we have wildly different definitions of 'almost bug-free'.

Which is also something to consider about user experience consistency. It will be a challenge with growth. Fortunately, plugged-in admins and devs will help.

[–] dyslexicdainbroner@lemmy.world 18 points 1 year ago

How great is it to be a part of history in the making -

This is Web 3 in its fomenting -

Headlines ~5yrs:

The ending of Web 2 was unceremonious and just ugly. u/spez and moron@musk watched as their social media networks signaled the end of Web 2 and slowly dissolved. Blu bird’s value disintegrated and Reddit’s hopes for IPO did likewise. Twitter and Reddit dissolved into odorous flatulence as centralization fell apart to the world’s benefit. Decentralized/federated social media such as Mastodon and Lemmy made their convoluted progress and led Web 3’s development and growth…

This is how history is made, it’s ugly and convoluted but comes out sweeet…

[–] httperror418@lemmy.world 16 points 1 year ago (1 children)

Whilst I'm aware that too many users on one instance can be a bad thing for the wider Fediverse, I think it is a great thing at the moment in terms of how well people are banding together to fix the issues being encountered from such a surge in users.

The issues being found on lemmy.world result in better Lemmy instances for everyone and improve the whole Fediverse of Lemmy instances.

I'm very impressed with how well things are being debugged under pressure, well done to all those involved 👏

[–] Contravariant@lemmy.world 16 points 1 year ago

Hey I can upvote now!

[–] lwuy9v5@lemmy.world 15 points 1 year ago (1 children)

That's so awesome! Look at that GRAPH!

I'd volunteer to be a technical troubleshooter - very familiar with docker/javascript/SQL, not super familiar with rust - but I'm sure yall also have an abundance of nerds to lend a hand.

[–] MetricExpansion@lemmy.world 15 points 1 year ago (1 children)

I'm very curious: does a single Lemmy instance have the ability to horizontally scale across multiple machines? You can only get so big a machine. You did mention a second container, which would suggest that the Lemmy software is able to do so, but I'm curious if I'm reading that right.

[–] DoomBot5@lemmy.world 11 points 1 year ago (5 children)

A single process, no. You run multiple instances on multiple machines, then put a frontend (nginx in this case) in front to distribute the traffic among them.
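In nginx terms, the horizontal scaling described here is just multiple backends in one upstream block (hostnames, addresses, and ports below are hypothetical):

```nginx
upstream lemmy_backend {
    # nginx round-robins requests across these entries by default;
    # each entry can live on a different machine.
    server 10.0.0.11:8536;
    server 10.0.0.12:8536;
}

server {
    listen 443 ssl;
    location / {
        proxy_pass http://lemmy_backend;
    }
}
```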

[–] cani@lemmy.world 15 points 1 year ago

I just love the transparency you guys are coming forward with. It's absolutely awesome! Thank you for that and for all the work you put in. It means a lot to me that you folks are taking the time to keep us updated. Much love!

[–] DelvianSeek@lemmy.world 13 points 1 year ago

You guys are absolutely amazing. So many thanks to you @Ruud and the entire admin/troubleshooting team! Thank you.

[–] Puzzlehead@lemmy.world 13 points 1 year ago

smoooooooooth! Keep up the good work!

[–] CIA_chatbot@lemmy.world 13 points 1 year ago* (last edited 1 year ago) (4 children)

It blows my mind that, with the amount of traffic you guys must be getting, you are only running one container and not a k8s cluster with multiple pods (or a similar container orchestration system).

Edit: misread that a second was coming up, but still crazy that this doesn’t take some multi node cluster with multiple pods. Fucking awesome

[–] nuzzlerat@lemmy.world 12 points 1 year ago (1 children)

Is it weird that I’m always excited to read the update posts?

[–] Datzevo@lemmy.world 11 points 1 year ago

You know, something about dealing with the lagginess of the past few days makes me appreciate how fast and responsive it is after the update. It's nice to see the community grow, and it makes the experience on Lemmy feel authentic.

[–] Zrob@lemmy.world 11 points 1 year ago (5 children)

Awesome work. Any way other devs can contribute?

[–] InfiniteVariables@lemmy.world 11 points 1 year ago

Wow it is smooth as butter now. Great job ruud and team!

[–] Anti_Weeb_Penguin@lemmy.world 10 points 1 year ago (1 children)

Installed Jerboa again and it feels smoother than Reddit itself, great job!

[–] EvilCartyen@lemmy.world 10 points 1 year ago (1 children)

Things have been super smooth lately, thanks for all the work!

[–] MiddleWeigh@lemmy.world 10 points 1 year ago* (last edited 1 year ago)

I took a SM break for a few days, and it's running noticeably better today...I think. (:

Thanks a bunch for floating us degenerates.

[–] WolfhoundRO@lemmy.world 10 points 1 year ago (1 children)

Really great job, guys! I know from my experience in SRE that these kinds of debugging, monitoring, and fixing sessions can be a real pain, so you have all my appreciation. I'm even determined to donate on Patreon if it's available.

[–] slashzero@hakbox.social 9 points 1 year ago* (last edited 1 year ago)

As a Performance Engineer myself, these are the kind of performance improvements I like to see. Those graphs look wonderful. Nice job to all.
