this post was submitted on 11 Jan 2025
57 points (96.7% liked)

Selfhosted


I have a ZFS RAIDZ2 array made of 6x 2TB disks with power-on hours between 40,000 and 70,000. This is used just for data storage of photos and videos, not OS drives. Part of me is a bit concerned at those hours considering they're a right old mix of desktop drives and old WD Reds. I keep them on 24/7 so they're not too stressed in terms of power cycles, but they have been through a few RAID5 rebuilds in the past.

Considering swapping to 2x 'refurbed' 12TB enterprise drives and running ZFS RAIDZ1. So even though they'd have a decent number of hours on them, they'd be better quality drives, and fewer disks means less chance of any one failing (I have good backups).

The next time one of my current drives dies I don't think sticking with the current setup will be worth it, so I may as well change over now before it happens?

Also the 6x disks I have at the moment are really crammed into my case in a hideous way, so from an aesthetic POV (not that I can actually see inside the solid case in a rack in the garage), it'll be nicer.
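For context on the capacity side of the swap, here's a rough back-of-the-envelope sketch (nominal sizes only, ignoring ZFS overhead and TB/TiB differences; a two-disk RAIDZ1 is effectively a mirror, so usable space is one disk's worth):

```python
# Rough usable-capacity comparison of the two layouts described above.
# Uses nominal drive sizes and ignores ZFS metadata/slop overhead.

def raidz_usable(num_disks: int, parity: int, disk_tb: float) -> float:
    """Approximate usable space of a RAIDZ vdev: data disks x disk size."""
    return (num_disks - parity) * disk_tb

current = raidz_usable(num_disks=6, parity=2, disk_tb=2)     # 6x 2TB RAIDZ2
proposed = raidz_usable(num_disks=2, parity=1, disk_tb=12)   # 2x 12TB RAIDZ1

print(f"Current  6x 2TB RAIDZ2:  ~{current:.0f} TB usable")   # ~8 TB
print(f"Proposed 2x 12TB RAIDZ1: ~{proposed:.0f} TB usable")  # ~12 TB
```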

[–] fmstrat@lemmy.nowsci.com 3 points 1 day ago* (last edited 1 day ago) (1 children)

As someone who runs 3 large arrays with 8TB, 16TB, and 21TB drives respectively, know that:

  • RAIDZ1 will cause tons of fear when a disk fails if you're used to Z2. Don't change.
  • When a disk goes, the larger the disk, the slower the rebuild and the more taxing it is on the other disks. With Z1, if another disk fails during the rebuild, you're SOL.

Fewer disks is simpler, but more disks is safer. 6 disks is the perfect-sized array IMO. If you don't need more space, I'd buy a 2TB hot spare and call it a day. But if space is a concern, go Z2 with 4 disks.

Edit: Those three arrays mirror each other in different locations, and the fear was still there when the Z1 had an issue. Mostly due to the headache, but still.

[–] blackstrat@lemmy.fwgx.uk 3 points 1 day ago (1 children)

The reason I went RAIDZ2 in my current setup was the number of disks increasing the chance of multiple failures. But with fewer disks that chance goes down. I'm not at all worried about data loss; as I said, I have good backups so I can always restore. So if the remaining disk dies during a rebuild, that's unfortunate, but it only affects my uptime, not my data.

[–] fmstrat@lemmy.nowsci.com 0 points 20 hours ago (1 children)

Hate to be that guy, but those maths aren't mathing.

Fewer drives does not equal a lower chance of multiple failures. The statistical failure rate of one drive has no impact on another. In fact, analysis of Backblaze's data showed that larger drives were more prone to failure (platter density vs platter count).

[–] blackstrat@lemmy.fwgx.uk 3 points 15 hours ago (1 children)

Who has more chance of a single disk failing today: me with 6 disks, or Backblaze with their 300,000 drives?

Same thing works with 6 vs 2.
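(Quantifying that intuition with an assumed, purely illustrative 2% independent annual failure rate per disk, the same figure used further down:)

```python
# Probability of seeing at least one disk failure in a year, assuming each
# disk fails independently with probability p = 0.02 (illustrative figure).
p = 0.02

for n in (2, 6, 300_000):
    at_least_one = 1 - (1 - p) ** n
    print(f"{n:>7} disks: P(at least one failure) ~ {at_least_one:.4f}")

# 2 disks:      ~0.0396
# 6 disks:      ~0.1142
# 300000 disks: ~1.0000 (a virtual certainty)
```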

[–] fmstrat@lemmy.nowsci.com 1 points 6 hours ago

Backblaze, of course. But we aren't talking about the probability of seeing a failure somewhere; we're talking about one of your disks failing and, more importantly, data loss. A binomial probability distribution is a simplified way to model the scenario.

Let's pretend all disks have a failure rate of 2% in year one.

If you have 2 disks, the probability of each disk failing is 2%: the first disk in that array is 2%, and the second is 2%. With a 2-disk Z1, if both disks fail you lose data. That isn't a 1% (half of 2%) chance, because the failure rate of one disk doesn't affect the other; the combined risk is actually far less than 2%.

So we use a binomial distribution to be more accurate: p = 0.02 in year one, 2 trials, and 2 failures required, giving a probability of 0.0004 for data loss.

If you have 6 disks, the probability of each disk failing is also 2%: the first disk in that array is 2%, the second is 2%, and so on. With a 6-disk Z2, three disks must fail before you lose data, which reduces the risk further (not to some simple multiple of 2%, but, as the numbers show, well below the Z1 case).

With the binomial distribution again: p = 0.02, 6 trials, and at least 3 failures required, giving a cumulative probability of about 0.00015 for data loss.
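(A minimal sketch of those two estimates, assuming the same flat 2% year-one failure rate per disk and fully independent failures:)

```python
# Binomial estimate of data-loss probability, assuming each disk fails
# independently with p = 0.02 in year one.
from math import comb

def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent disks fail."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p = 0.02
print(f"2-disk Z1, needs 2 failures: {prob_at_least(2, 2, p):.6f}")  # ~0.000400
print(f"6-disk Z2, needs 3 failures: {prob_at_least(3, 6, p):.6f}")  # ~0.000153
```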

That's a significantly smaller risk. The other interesting part is that the difference in the chance of any given disk failing in a 6-disk array versus a 2-disk array isn't 3x; it's essentially no difference at all, because each disk's 2% failure rate is independent of the others. And this doesn't even take into account that larger disks have a greater failure rate to start with.

I'm not saying mirroring two larger disks is a bad idea, just that there are tradeoffs and the risk is much greater.