
Building a home server. Please help, DC!


40hz:
Weirdest thing about RAID-1. When it breaks, it sometimes takes out both drives. -40hz (August 04, 2011, 02:58 PM)
--- End quote ---
Shit happens :) - drives from the same batch can die within a short time of each other (especially if you have very disk-intensive rebuilds... mirroring isn't too bad, raid-5 is BAD). And then there's stuff like power surges, etc. So yeah, stuff dies.

-f0dder (August 04, 2011, 05:07 PM)
--- End quote ---

It wouldn't have been so disturbing if it were just the drives that failed. They're mechanical devices, so you have to expect that. "Vulnerant omnes, ultima necat," as those old sundials used to say.

But it wasn't the drives that caused the problem. In both cases it was a controller issue (one HP and one IBM branded) with drives originally installed by the manufacturer. Since these are big-league server manufacturers, I'm confident they played the necessary mix & match games to minimize the chance of getting two "bad batch" drives in the same machine.

In both instances the controllers unexpectedly started writing total garbage to both drives, rendering them useless. In the case of the IBM card, a firmware update corrected the "engineering issue." With HP, a replacement was necessary because of a "marginal hardware condition" on the card.

Having it happen two different times on servers from two different manufacturers is a little too much bad luck AFAIC.  ;D


Stoic Joker:
Raid-5 (and other "big storage" schemes) would be silly on SSD until their storage capability goes massively up. The added writes of RAID-5 are a real concern... -f0dder (August 04, 2011, 05:07 PM)
--- End quote ---

Okay, this has been bugging me. What added writes?? RAID-5 is striping with parity... so two of the drives split the write and each gets half the file. Parity is written to drive 3, which is (in that config) its sole purpose for existing. What's extra? Traffic on the controller?

I'm not arguing the point, just trying to understand it.

40hz:
^ Um...actually the data chunks and parity info are distributed among all the drives in a RAID-5  array by the controller. There isn't a unique "parity drive" per se AFAIK.  :)
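
To make the distributed layout concrete, here's a minimal Python sketch (the rotation order is illustrative; real controllers differ in exactly where they place parity) showing parity rotating through a 3-drive RAID-5, and showing that the parity chunk is just the bytewise XOR of the data chunks:

def parity_disk(stripe, ndisks=3):
    # Parity rotates through the array: stripe 0 -> disk 2, stripe 1 -> disk 1, ...
    return (ndisks - 1) - (stripe % ndisks)

def xor_parity(chunks):
    # RAID-5 parity is the bytewise XOR of the data chunks in a stripe.
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            parity[i] ^= b
    return bytes(parity)

data_a, data_b = b"\x0f\x0f", b"\xf0\x01"
parity = xor_parity([data_a, data_b])

# Lose any one chunk and the XOR of the survivors rebuilds it:
assert xor_parity([parity, data_b]) == data_a

for stripe in range(4):
    print(f"stripe {stripe}: parity on disk {parity_disk(stripe)}")

So over many stripes every drive carries its share of both data and parity, which is exactly why there's no dedicated "parity drive."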

Stoic Joker:
^ Um...actually the data chunks and parity info are distributed among all the drives in a RAID-5  array by the controller. There isn't a unique "parity drive" per se AFAIK.  :)-40hz (August 05, 2011, 07:30 AM)
--- End quote ---

Okay, I've spent a bit too much time spoon-feeding end users and my brain is turning to mush (brightly colored hand puppet stuff...).

Nevertheless, the parity info is staggered between the drives. So if something is written to the array, it will be striped between two of the drives, and a third drive will catch the parity info for any given write operation. Which goes back to my original quandary... where is the extra per-disk write that could cause a drive to prematurely fail?
A gets half
B gets half
C gets parity

All are on separate physical disks. So other than controller traffic (that's a given), I don't see where anything is really getting doubled up at the per-disk level. The parity section of each disk isn't going to take any more or less of a hit than its corresponding data segments, so it's not like it's being subjected to exhaustive localized rewrites that will "burn a hole" in it.

40hz:
It's not so much the actual read/write as it is the fact that every drive in the array spins up for every read/write - so there's more wear and tear on the drive mechanics rather than on the disk platter's surface.

If you saved a file to a single drive, only that drive would spin up and be written to (along with the housekeeping of finding sufficient free clusters). On a three-element RAID-5, three drives would spin up to accomplish the same thing, plus write additional information (i.e. parity) above and beyond what's contained in the actual file itself. That's three times the disk activity, plus the "parity tax," plus three times the heat generated, compared to a single-drive save operation.
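
For what it's worth, the "parity tax" is easiest to see on a small write: to update a single data chunk, a RAID-5 controller has to read the old data and old parity before it can write the new data and new parity. A toy sketch, with in-memory byte strings standing in for the three drives:

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def small_write(disks, data_disk, parity_disk, new_data):
    old_data = disks[data_disk]      # read #1
    old_parity = disks[parity_disk]  # read #2
    # new parity = old parity XOR old data XOR new data
    disks[parity_disk] = xor(xor(old_parity, old_data), new_data)  # write #1
    disks[data_disk] = new_data                                    # write #2
    # Net cost: 2 reads + 2 writes to change one chunk, versus a
    # single write on a lone drive.

disks = [bytes(4), bytes(4), bytes(4)]
small_write(disks, data_disk=0, parity_disk=2, new_data=b"\x01\x02\x03\x04")
print(disks)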

So when you add in the MTBF for each of the three drives, you have a higher probability of a drive failing, all other factors being equal. And most arrays have more than three drives, since three is the least cost-effective RAID-5 configuration: you always sacrifice one drive's worth of capacity to parity, even though no single drive exclusively holds the parity data.
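
Back-of-the-envelope, assuming independent failures and a made-up 3% annual failure rate per drive, the chance of at least one drive in the array dying in a given year grows quickly with the drive count:

# p is a hypothetical annual failure rate per drive, purely for illustration.
p = 0.03
for n in (1, 3, 5):
    at_least_one = 1 - (1 - p) ** n
    print(f"{n} drive(s): P(at least one failure/year) = {at_least_one:.1%}")

That's just the "more drives, more chances" effect; it says nothing about any single drive wearing out faster.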

Most times, the drives chosen for arrays are built to a higher quality standard than those normally deployed in PCs - so that may even up the failure rate between server and non-server drives despite the higher utilization.

I'll have to see if I can locate any hard stats for drive reliability on a per-disk basis when used in an array. I'm sure studies have been done. It's just a matter of finding them.
 8)
