this post was submitted on 14 Dec 2023
28 points (88.9% liked)

Selfhosted


I recently got a few (5) hard drives to turn my home server into a NAS with TrueNAS SCALE, and my idea is to have 4 usable and 1 for redundancy. My question is: how does RAID work? What are RAID 0, RAID 5, software RAID, etc., and does any of that even matter for my use case?

all 20 comments
[–] tburkhol@lemmy.world 13 points 6 months ago (1 children)

Traditionally, RAID-0 "stripes" data across two or more disks, writing a share of the data to each, to get a multiple of the I/O speed out of disks that are much slower than the data bus. With two disks, that means half the data on each and roughly twice the throughput. The array also looks like one disk with the combined size of the physical disks, but if any disk fails, you lose the whole array. RAID-1 "mirrors" data across multiple identical disks, writing exactly the same data to all of them; reads can again be faster, but you get redundancy instead of extra size. RAID-5 is like an extension of RAID-0: it stripes data across multiple disks and adds one disk's worth of 'parity' for error correction, spread across all the disks. It requires (n) identical-sized disks (at least three) but gives you the storage capacity of (n-1), and allows you to rebuild the array if any one disk fails. Any of these look to the filesystem like a single disk.
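To make the "rebuild the array" part less magical, here is a minimal sketch of XOR parity, which is the basic idea behind RAID-5. It is a toy with made-up block contents and a hypothetical helper name (`xor_blocks`), not how any real controller or ZFS lays out data.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe spread over three data disks (toy contents).
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block the array would store for this stripe.
parity = xor_blocks(data)

# Simulate losing the second disk, then rebuild its block
# from the surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

XOR-ing the surviving blocks with the parity cancels out everything except the missing block, which is why any single disk can be lost, but not two.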

As @ahto@feddit.de says, none of those matter for TrueNAS. Technically, TrueNAS creates "JBOD" (just a bunch of disks) and uses the file system to combine all those separate disks into one logical structure. From the user perspective, these all look exactly the same, but ZFS allows for much more complicated distributions of data and more diverse sizes of physical disks.

[–] taladar@sh.itjust.works 8 points 6 months ago

RAID-6 is basically the same as RAID-5 but with two disks' worth of parity instead of one, allowing any two disks to fail and giving you n-2 capacity.
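The capacity rules mentioned so far (n, 1, n-1, n-2) can be put into a small calculator. This is only a sketch under simplifying assumptions: all disks are the same size, RAID-1 mirrors every disk, and filesystem overhead is ignored. The names `RULES` and `usable_capacity_tb` are made up for illustration.

```python
# (minimum number of disks, usable disks as a function of n) per level
RULES = {
    "raid0": (2, lambda n: n),      # striping, no redundancy
    "raid1": (2, lambda n: 1),      # every disk holds the same data
    "raid5": (3, lambda n: n - 1),  # one disk's worth of parity
    "raid6": (4, lambda n: n - 2),  # two disks' worth of parity
}

def usable_capacity_tb(level, n_disks, disk_tb):
    """Usable space in TB for n_disks equal-sized disks at the given level."""
    minimum, usable = RULES[level]
    if n_disks < minimum:
        raise ValueError(f"{level} needs at least {minimum} disks")
    return usable(n_disks) * disk_tb

for level in RULES:
    print(level, usable_capacity_tb(level, 5, 2.0), "TB")
```

With five 2 TB disks (an example size) this works out to 10, 2, 8 and 6 TB for RAID 0, 1, 5 and 6 respectively.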

[–] lemmyvore@feddit.nl 10 points 6 months ago (1 children)

If you're using TrueNAS it already has some types of RAID it wants to do. Assuming your 5 drives are the same size what you want is called RAIDz1 (1 standing for one drive worth of redundancy).

It is a type of RAID5, which means instead of having 5x usable storage you reserve 1x for redundancy information spread out across the 5, and get only 4x usable space.

Since you're a beginner you get the usual lecture: RAID is not backup. RAID allows a certain number of your drives to fail without losing any data; it spreads the risk of hardware failure.

RAID won't help if you delete a file or accidentally format the wrong drive or even the whole array, and it won't help if the PC is stolen, struck by lightning, or burns in a fire.

The solution used by TrueNAS (ZFS) has something called snapshots that can help with modified or deleted files.

For anything else you have to consider which of your files are "my world has ended"-level important and back them up to an HDD in a drawer, to Blu-ray discs, or online to the cloud.

[–] Hopfgeist@feddit.de 1 points 6 months ago

To add, unlike "traditional" RAID, ZFS is also a volume manager and can have an arbitrary number of dynamic "partitions" sharing the same storage pool (literally called a "pool" in zfs). It also uses checksumming to determine if data has been corrupted. On redundant setups it will then quietly repair the corrupted parts with the redundant information while reading.
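A rough sketch of the checksum-and-repair idea, for illustration only. It keeps two copies of a block in a Python list and uses SHA-256 as a stand-in checksum; the function names are hypothetical and this is not ZFS's actual on-disk logic.

```python
import hashlib

def checksum(block: bytes) -> str:
    """Stand-in checksum for a block of data."""
    return hashlib.sha256(block).hexdigest()

def read_with_repair(copies, expected):
    """Return a copy that matches the expected checksum and
    overwrite any corrupted copies with the good one."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise IOError("all copies are corrupted, nothing to repair from")
    for i, c in enumerate(copies):
        if checksum(c) != expected:
            copies[i] = good  # quiet repair while reading
    return good

# The checksum is recorded when the block is written.
expected = checksum(b"hello")

# Later, one mirrored copy has silently gone bad.
copies = [b"hellp", b"hello"]
print(read_with_repair(copies, expected), copies)
```

The point is that the checksum tells the reader which copy to trust; a plain mirror without checksums only knows that the two copies differ.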

[–] ahto@feddit.de 6 points 6 months ago

You won't be using a traditional RAID with TrueNAS Scale. You have a choice of Stripe, Mirror, RAIDZ1, RAIDZ2, RAIDZ3, dRAID1, dRAID2, dRAID3. The docs are very detailed, so you should read up on RAIDZ and the other types elsewhere, too.

[–] Huschke@lemmy.world 5 points 6 months ago* (last edited 6 months ago) (1 children)

[This is a good video that explains the basics and what raid setup you want for what kind of data.](https://youtube.com/watch?v=5K8szc9gDYw)

[–] PipedLinkBot@feddit.rocks 4 points 6 months ago

Here is an alternative Piped link(s):

good videos that explains the basics and what raid setup you want

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I'm open-source; check me out at GitHub.

[–] Decronym@lemmy.decronym.xyz 5 points 6 months ago* (last edited 6 months ago)

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:

| Fewer Letters | More Letters |
|---|---|
| NAS | Network-Attached Storage |
| PCIe | Peripheral Component Interconnect Express |
| RAID | Redundant Array of Independent Disks for mass storage |
| SSD | Solid State Drive mass storage |
| ZFS | Solaris/Linux filesystem focusing on data integrity |

5 acronyms in this thread; the most compressed thread commented on today has 15 acronyms.

[Thread #352 for this sub, first seen 14th Dec 2023, 12:15] [FAQ] [Full list] [Contact] [Source code]

[–] _danny@lemmy.world 5 points 6 months ago (1 children)

This is a good tool for visualizing your raid needs from your capacity and total number of drives.

https://www.seagate.com/products/nas-drives/raid-calculator/

I'll preface that I'm no raid expert, just a nerd that uses it occasionally.

The main benefit of most raid configurations is the redundancy they provide. If you lose one drive, you do not lose any data. It's kinda obvious how you can have 1:1 redundancy: you just have an exact copy of the drive. But there are ways to split data into three chunks so that you can rebuild the data from any two chunks, and into 5 chunks so that you can lose any two chunks. Truly understanding how raid does this could easily be an entire college course.

Raid 0 is the exception. All it does is "join together" a bunch of drives into one disk. And if you lose an individual disk you likely will lose most of your data.
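To see why losing one disk in raid 0 takes most of the data with it, here is a small sketch of striping. The chunk size and function names are made up for illustration; real arrays use much larger chunks.

```python
def stripe(data: bytes, n_disks: int, chunk: int):
    """Deal fixed-size chunks of data round-robin onto n_disks lists."""
    disks = [[] for _ in range(n_disks)]
    for i in range(0, len(data), chunk):
        disks[(i // chunk) % n_disks].append(data[i:i + chunk])
    return disks

def unstripe(disks):
    """Reassemble the original data by reading one chunk per disk in turn."""
    out = []
    for row in range(max(len(d) for d in disks)):
        for d in disks:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)

disks = stripe(b"abcdefghij", n_disks=2, chunk=2)
print(disks)            # each disk holds every other chunk
print(unstripe(disks))  # reading both disks gives the data back
```

Every file larger than one chunk ends up with pieces on every disk, so with one disk gone, what is left on the others is full of holes.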

Another big difference is read/write speed. From my understanding, raid configurations with parity carry some overhead on writes compared to a single drive, while striped or mirrored reads can be faster. Each raid configuration sits at a different level relative to the "base speed".

I typically use raid 5 or 6, since that gives some redundancy, but I can keep most of my total storage space.

The main thing in all of this is to keep an eye on drive health. If you lose more drives than your array can handle, all of your data is gone. From my understanding, there is no easy way to get the data off a broken raid array.

[–] Presi300@lemmy.world 3 points 6 months ago

I've mentioned it in another reply, but read/write speed isn't terribly important to me, as the whole thing is gonna be bottlenecked by a 1 Gbps connection anyways. From what I read from the other replies and online, RAIDz1 sounds like the thing I'm gonna go with, as it seems robust enough and my NAS is powerful enough for the performance hit to not really matter...
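For a sense of scale on the network bottleneck, a back-of-the-envelope sketch. It assumes an ideal link with no protocol overhead, and the 8 TB figure is just an example amount of data.

```python
def link_mb_per_s(gbps: float) -> float:
    """Ideal throughput of a network link in megabytes per second."""
    return gbps * 1e9 / 8 / 1e6

def transfer_hours(terabytes: float, gbps: float) -> float:
    """Hours needed to move the given amount of data over the link."""
    seconds = terabytes * 1e12 / (link_mb_per_s(gbps) * 1e6)
    return seconds / 3600

print(link_mb_per_s(1.0))        # best case for a 1 Gbps link
print(transfer_hours(8.0, 1.0))  # time to move 8 TB at that rate
```

A 1 Gbps link tops out at 125 MB/s in the best case, so filling 8 TB over the network takes the better part of a day regardless of how the disks are arranged.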

[–] poVoq@slrpnk.net 4 points 6 months ago* (last edited 6 months ago)

That is far too broad a question to be answered here, and it also depends on the file-system TrueNAS uses.

If I remember correctly it uses ZFS by default and you can easily find some articles explaining the different raid levels of OpenZFS online.

Edit: ZFS is not the same as other file-systems so not all of the general RAID info you can find online is 1:1 applicable for it (same with btrfs).

[–] xia@lemmy.sdf.org 2 points 6 months ago

0: "i don't care about my data."

1: "i REALLY care about my data"

5: "i'll trade you one drive now, for my data if one of the drives dies later"

[–] redline23@lemmy.world 1 points 6 months ago (1 children)

Other people gave a good explanation of raid and some alternatives like zfs in truenas.

You want to avoid RAID 5 with drives above 4TB. Every hard drive can hit an unrecoverable read error (URE) during a read. The chance is very low, and manufacturers publish it in the drive's spec sheet. During a RAID 5 rebuild after replacing a drive, the other drives are stressed for a long time because every one of them has to be read in full. With high-capacity drives you have a pretty large chance of encountering a URE and losing the entire array. The high stress on the drives can also cause a drive failure if another drive was already on its way out.

I run TrueNAS Core at home in volumes that look like RAID 10: two mirror volumes striped together for performance.

I never played around with raidz1 (like RAID 5), but you still have the chance of a URE during the resilver. I can't comment on how likely it is or what happens when an error occurs. I did see people recommending raidz2, which tolerates two disk failures, to avoid losing data during a resilver.

[–] Presi300@lemmy.world 1 points 6 months ago (1 children)

All my disks are 2TB so it shouldn't be a massive issue

[–] redline23@lemmy.world 1 points 6 months ago

I personally wouldn't use raidz1 because it seems too risky to me. I'd have higher redundancy.

Some links

https://www.truenas.com/community/threads/raidz1-vs-raid-5-ures.42598/

https://www.truenas.com/community/threads/5x-4tb-raidz1-array-rebuilding-with-nre-ure-issue.13719/

https://magj.github.io/raid-failure/

The last link is about actual RAID and not ZFS. With a URE rate of one error per 10^14 bits read, it gives roughly a 50/50 chance of losing the array. Raidz1 maybe won't have that catastrophic of a failure, but you'd still be rolling the dice on some corruption.
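The arithmetic behind that kind of estimate can be sketched as follows. It assumes the spec-sheet rate of one URE per 10^14 bits applies uniformly and independently to every bit, and that the rebuild reads every surviving disk in full; real drives and ZFS resilvers may behave better than this simple model. The function name is made up.

```python
import math

def p_at_least_one_ure(bytes_read: float, bits_per_ure: float = 1e14) -> float:
    """Probability of hitting at least one URE while reading bytes_read bytes,
    treating each bit as an independent trial with error rate 1/bits_per_ure."""
    bits = bytes_read * 8
    # 1 - (1 - 1/bits_per_ure) ** bits, computed in a numerically stable way
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_ure))

# Rebuilding a 5 x 2 TB array means reading the 4 surviving disks: 8 TB.
print(p_at_least_one_ure(4 * 2e12))
```

For four surviving 2 TB disks this model gives roughly 47%, which is where the "50/50" figure comes from, and it grows with the amount of data that has to be read.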

[–] yournamehere@lemm.ee 0 points 6 months ago (1 children)

this is what chatgpt is made for

[–] Presi300@lemmy.world 1 points 6 months ago

already tried, did not help that much

[–] sj_zero@lotide.fbxl.net -1 points 6 months ago

The level of raid is fundamental to the operation of your raid array.

As I recall, RAID 0 is striping. It will give you faster throughput because your array can pull values out of multiple drives at once. RAID 1 is mirroring: half of the drives are used for data, and the other half are used to back up the first half. RAID 5 is parity, and that's what you're looking for. Essentially, your drives will mostly be used for storing data, with the last one used to track what information is on the other four, so you will have one drive for redundancy and the other four will be storing data.

Hardware raid versus software raid matters to the extent that parity calculations are relatively expensive and so if you're trying to do RAID 5 on software raid, that's going to eat up more of your CPU power and reduce your drive throughput.

I don't recall TrueNAS in particular, and what you're using the NAS for is really what is important, but I do recall that some NAS software doesn't even want you to use hardware RAID because it uses its own software algorithms that are separate from what you would typically consider to be RAID.

[–] tagginator@utter.online -2 points 6 months ago

New Lemmy Post: Can someone explain me how RAID works? (https://lemmy.world/post/9554292)
Tagging: #SelfHosted

(Replying in the OP of this thread (NOT THIS BOT!) will appear as a comment in the lemmy discussion.)

I am a FOSS bot. Check my README: https://github.com/db0/lemmy-tagginator/blob/main/README.md