this post was submitted on 19 Jul 2024
1197 points (99.5% liked)

Technology

58108 readers
4981 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related content.
  3. Be excellent to each another!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, to ask if your bot can be added please contact us.
  9. Check for duplicates before posting, duplicates may be removed

Approved Bots


founded 1 year ago
MODERATORS
 

All our servers and company laptops went down at pretty much the same time. Laptops have been bootlooping to blue screen of death. It's all very exciting, personally, as someone not responsible for fixing it.

Apparently caused by a bad CrowdStrike update.

Edit: now being told we (who almost all generally work from home) need to come into the office Monday as they can only apply the fix in-person. We'll see if that changes over the weekend...

you are viewing a single comment's thread
view the rest of the comments
[–] madcaesar@lemmy.world 7 points 2 months ago (1 children)

I think when you are this big you need to roll out any updates slowly. Checking along the way they all is good.

[–] Voroxpete@sh.itjust.works 21 points 2 months ago (2 children)

The failure here is much more fundamental than that. This isn't a "no way we could have found this before we went to prod" issue, this is a "five minutes in the lab would have picked it up" issue. We're not talking about some kind of "Doesn't print on Tuesdays" kind of problem that's hard to reproduce or depends on conditions that are hard to replicate in internal testing, which is normally how this sort of thing escapes containment. In this case the entire repro is "Step 1: Push update to any Windows machine. Step 2: THERE IS NO STEP 2"

There's absolutely no reason this should ever have affected even one single computer outside of Crowdstrike's test environment, with or without a staged rollout.

[–] madcaesar@lemmy.world 8 points 2 months ago (1 children)

God damn this is worse than I thought.. This raises further questions... Was there a NO testing at all??

[–] kayos@lemmy.world 1 points 2 months ago

Tested on Windows 10S

[–] elrik@lemmy.world 6 points 2 months ago (1 children)

My guess is they did testing but the build they tested was not the build released to customers. That could have been because of poor deployment and testing practices, or it could have been malicious.

Such software would be a juicy target for bad actors.

[–] Voroxpete@sh.itjust.works 1 points 2 months ago

Agreed, this is the most likely sequence of events. I doubt it was malicious, but definitely could have occurred by accident if proper procedures weren't being followed.