But it's rather baffling/annoying that a brand new drive would go poof that quickly.
-Stoic Joker
I have a relative who holds a BEE (from MIT, no less!) who told me the problem with electronic devices is that they can suffer from what he called "infant mortality." He said if you can (1) power up an electrical device, (2) run it for 72 hours under full load, (3) shut it down and let it cool off overnight, and (4) bring it back up successfully, it will almost always (99.9999%) run without problems for the next six to eight years. That's because most electrical engineering defects manifest themselves very early on. And about 75% are heat-related, so they won't start to show up until the device has been running several hours.
The basic rule of thumb seems to be: If it's gonna die, it's gonna die shortly after you get it - otherwise it will die no sooner than one month out-of-warranty. So look for the longest warranty you can find. (Not that it will matter. Because hardly anybody ever registers or remembers where they put their receipt.)
According to my genius cousin, it seems that the way they move product these days, there's no longer any such thing as real burn-in testing. Unless it's being sold at MIL-SPEC premium pricing, what little QC there is tests the major components of a device or assembly very briefly. If the little ogre powers up, and a signal is detected on whatever I/O ports it has, it's considered "good." After that, it's off to antistatic bagging for shipping.
The manufacturers do a risk assessment, calculate the projected failure rate from actual returns, adjust the warranty as necessary, and budget for the inevitable replacements. So it's purely a lottery and numbers game. It's sort of like the old rule for buying a truck - it's either: (a) expensive, but top quality and going to last - or (b) inexpensive to buy and cheap and easy to fix.
Most electronics manufacturers opt for 'cheap to replace under warranty' because it's more cost-effective for them to eat the occasional bad egg (and write it off on their taxes) than it is to properly and extensively test each individual product. And in the case of something as complex as a microprocessor (with its meeel-yuns and meeel-yuns of transistors and countless potential electrical states) - it's not even possible to completely test them any more. Or it isn't if you don't have 20 years to wait for the tests to complete.
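To put rough arithmetic behind that trade-off, here is a minimal sketch. Every number in it is made up purely for illustration (the failure rates, the replacement cost, and the burn-in cost are not real manufacturer figures), and `expected_cost_per_unit` is just a hypothetical helper name.

```python
def expected_cost_per_unit(failure_rate, replacement_cost, test_cost=0.0):
    """Average cost per unit shipped: any up-front testing cost plus
    the share of units that come back and have to be replaced."""
    return test_cost + failure_rate * replacement_cost


# Hypothetical figures only, chosen to show the shape of the calculation.
ship_and_replace = expected_cost_per_unit(
    failure_rate=0.02,       # assume 2% of untested units die under warranty
    replacement_cost=60.0,   # assumed cost to swap one unit
)
burn_in_first = expected_cost_per_unit(
    failure_rate=0.002,      # assume burn-in catches most early failures
    replacement_cost=60.0,
    test_cost=5.0,           # assumed cost of a multi-day burn-in per unit
)

print(f"ship and replace: {ship_and_replace:.2f} per unit")
print(f"burn-in first:    {burn_in_first:.2f} per unit")
```

With these invented inputs, eating the returns comes out cheaper per unit than testing everything, which is the logic described above. Different inputs can flip the result, which is presumably why the MIL-SPEC stuff costs what it does.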
Looks like you got one of the bad 'uns.
For customers (friends, family, and me can forget it!), I'll run hard drives for about 72 hours using multipass zero-writes under DBAN. And then run the extended SATA test. Any niggle - no matter how small - and I won't drop it in a RAID. Ditto if the RAID controller throws an exception during setup. I keep a very small supply (about 3) of these 'really good' drives in stock, since the tests take forever even though they're just a click, walk away, and wait task. They're kept on a shelf guarded by three ancient Egyptian curses, a flame-filled moat, and the meanest drop-out junkyard dog I could find. A sign on the shelf (in 16 different languages) says: Don't even think about it.
The real problem with RAID is that whenever I've mixed drives, I've run into similar hassles about one time in six. RAID controllers are fussy. And they should be. Unfortunately, minute r/w errors, spin sync & timing issues, or electrical differences that wouldn't bother anything else give RAID controllers big stones. Which they, in turn, share with us.
My suggestion? (1) Skip RAID. (2) Do a "po' boy's mirror." Sync relevant directories periodically to an internal drive. Good excuse to take regular work breaks - and you should be taking them anyway. (3) Weekdays: sync directories overnight to an external drive. (4) Image the data drive overnight to a second external once a week on Sunday.
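If you'd rather script the sync steps than click through a tool, something along these lines would do for (2) and (3). It's only a minimal sketch, not a replacement for a proper sync utility: `mirror` is a name I made up, it copies one way only, it never deletes anything from the destination, and it decides what to copy purely by modification time. The weekly image in (4) needs real imaging software and isn't covered here.

```python
import shutil
from pathlib import Path


def mirror(src: Path, dst: Path) -> list[Path]:
    """One-way sync: copy each file under src into dst if it is missing
    there or older than the source copy. Returns the relative paths copied."""
    copied = []
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(src)
        target = dst / rel
        if not target.exists() or path.stat().st_mtime > target.stat().st_mtime:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also carries over the timestamp
            copied.append(rel)
    return copied
```

Usage would be something like `mirror(Path("D:/work"), Path("E:/mirror/work"))` (both paths are placeholders), run by hand at break time or from whatever scheduler your OS provides for the overnight job. One caveat: drives formatted with coarse timestamp resolution may make some files look perpetually "newer," so expect a few redundant copies there.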
Luck!
P.S. There's a 9mm Glock Tactical in the top drawer behind the pretzel bag if you need it. Just don't point it at yourself or the dog.