Topic: Drive Extender replacement due out in 2012. It's called Storage Spaces.

superboyac

  • Charter Member
  • Joined in 2005
  • ***
  • Posts: 6,347
Q: Is data protection or media pooling the more important feature for you?
I'd say media pooling, but I'm not really sure what data protection means.  If it's a backup issue, I'm planning on having two sets of backups for everything, and since they're just files and folders, I'll be using SFFS to handle all the backup management, synchronization, etc.  But if data protection means something like protection against bad clusters, then I'm interested in that too.  Looks like I have some more studying to do.

I asked one of the network guys at work, and he said he's also looking for something like what I'm talking about.  From what I understand so far, the "newest" feature in this kind of technology is being able to use a mish-mash of hard drives.  For RAID, you need bunches of similar hard drives.  I don't want that.  Some of the other software I've seen wastes space if the hard drives are not the same size, because the array is limited by the smallest drive in the bunch.  I don't want that either.

I guess I don't need to be adding and removing drives a lot.  But what I don't want is this: let's say a hard drive goes down in a few years.  I don't want to be forced to find a particular type of drive that is compatible with that bunch.  I want to be able to use whatever is available.  Also, I want to be able to add any kind of hard drive easily, without having to tweak too many things on the existing setup.  Yes... I think that is my #1 feature request: that I be able to use any drive.

superboyac

Found another helpful explanation for some of these things:
Now... all that being said. You probably don't want to get involved in RAID 5. You want to look at the new options called "snapshot RAID". The major players are FlexRAID, SnapRAID, DriveBender, Stablebit Drive Pool, and a couple of others.

These products all create a software RAID (Similar to RAID 4, but no striping) that keeps the data intact on the disc (does not stripe across all the discs)... this is a nice feature as you can pull a drive out and put it in another computer and it will act like just another disc. This is impossible to do with real RAID.

Also, since they aren't calculating parity data all the time, read and write speeds are as fast as the individual hard drive will allow (which is usually many times over fast enough to stream multiple files at a time).

The best part about Snapshot RAID is that it provides the same amount of protection as a RAID 5 setup, but you can use any size hard drives that you have. So if you still have a couple of 500GB drives, and say three 1TB drives... you can protect the data on all of those with only one of those 1TB drives used as a parity drive.

In some packages, like FlexRAID, you can pool them all together and combine the 1TB with the two 500GB for a total of 3TB of usable space (one of the 1TB will be used for parity, and it always has to be your largest drive). This gives you JBOD-like ability to make all your drives (minus your one parity drive) appear as one large disk to the OS.
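The capacity arithmetic in that example can be checked with a quick sketch. This is only an illustration of the sums described above, assuming the largest drive is the one set aside for parity; the function name `snapshot_pool_capacity` is made up for this example and is not part of FlexRAID or SnapRAID.

```python
def snapshot_pool_capacity(drive_sizes_gb, parity_drives=1):
    """Return (usable_gb, parity_gb), assuming the largest drives hold parity."""
    if parity_drives >= len(drive_sizes_gb):
        raise ValueError("need at least one data drive")
    ordered = sorted(drive_sizes_gb, reverse=True)
    parity = ordered[:parity_drives]   # largest drive(s) reserved for parity
    data = ordered[parity_drives:]     # everything else is pooled as data space
    return sum(data), sum(parity)


# Two 500GB drives and three 1TB drives, as in the quoted example.
usable, parity = snapshot_pool_capacity([500, 500, 1000, 1000, 1000])
print(usable, parity)  # 3000 1000
```

So the 3TB pool is the two remaining 1TB drives plus the two 500GB drives, with the third 1TB drive holding parity.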

It's also really easy to scrap a snapshot RAID set and start over, or use a different drive for parity, etc... because you are not modifying the data drives at all.

Perhaps this indicates that 40hz's suggestion, SnapRAID, is the choice for me.  SnapRAID has gotten really great comments on the web regarding its reliability, and I like that.  It also has a GUI in the Elucidate software.  The other well-reviewed option is Drive Bender, but it doesn't seem to have SnapRAID's reputation for reliability.

tomos

  • Charter Member
  • Joined in 2006
  • ***
  • Posts: 11,964
^ I'm curious, and I know very little even about RAID, but the maths in your quote there don't make any sense to me.

So if you still have a couple of 500GB drives, and say three 1TB drives... you can protect the data on all of those with only one of those 1TB drives used as a parity drive.
That's 3x1TB + 2x500GB = 4TB total space.
IIUC, a 1TB drive does the backup/snapshot of 3TB.
How does that work?

In some packages, like FlexRAID, you can pool them all together and combine the 1TB with the two 500GB for a total of 3TB of usable space (one of the 1TB will be used for parity, and it always has to be your largest drive).

It's not at all clear (there's possibly a typo in there: "the 1TB"), but I think they are using the same example as above of 3x1TB + 2x500GB. That would again mean that a 1TB drive does the backup/snapshot of 3TB.

Do they both use the empty space on all drives? (Apologies if this was covered already - I just skimmed the thread and the SnapRAID home & comparison pages.)
Tom

superboyac

tomos, yeah, I think something about the parity drive isn't as "linear algebra" as we're thinking.  It seems to be able to do more with less.  SnapRAID also claims to be the only one that can potentially recover data in a pool with two drive failures instead of the normal one.  I have the same question about that: how does the math work out there?

superboyac

OK, so far it looks like the best solution to try is:
1) SnapRAID for dealing with multiple drives
2) Elucidate, a GUI for SnapRAID
3) Liquesce, for drive pooling on Windows (SnapRAID doesn't include this functionality)

This is the same as 4wd's suggestion.  It's pretty good if it works well.  Everything is free.

skwire

  • Global Moderator
  • Joined in 2005
  • *****
  • Posts: 5,287
how does the math work out there?

Here's a simple explanation:

Let's say you have four 2 TB drives.  You would designate one of them as the parity drive, e.g.:

D1 (2 TB)
D2 (2 TB)
D3 (2 TB)
P1 (2 TB)

This would give you 6 TB of data space with 2 TB of parity.  The simplest way to envision this is with math, i.e., 1+2+3=6 (D1 = 1, D2 = 2, D3 = 3, thus, P1 = 6).  If you lose any one of the "D" drives, you can easily calculate what you're missing.  Let's say you lost D2 and have replaced it.  The system starts to rebuild the data based on the P1 parity drive information like this:

1+?+3=6

Obviously, ? = 2.  Yes, this is a 50,000-foot view, but does that help to explain things?  Also, if you were to lose two drives at one time, you wouldn't get all your data back (unless you had two parity drives).  In other words, you can simultaneously lose as many data drives as you have parity drives.  So goes the theory, anyway.  YMMV.   :P
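For anyone who wants to poke at this, here is a rough sketch of the same idea using XOR, which is how single parity is normally computed in practice (the addition above is the friendlier analogy). It is a toy, not how SnapRAID or any other product is actually implemented: the "drives" are short byte strings, the names `xor_parity` and `rebuild` are invented for the example, and shorter drives are assumed to count as zero-padded. That last assumption is also why one parity drive can cover several smaller data drives: it only needs to be as big as the largest of them.

```python
def xor_parity(drives):
    """XOR all drives byte by byte; shorter drives count as zero-padded."""
    size = max(len(d) for d in drives)
    parity = bytearray(size)
    for d in drives:
        for i, b in enumerate(d):
            parity[i] ^= b
    return bytes(parity)


def rebuild(surviving, parity, lost_size):
    """Recover one lost drive from the surviving drives plus the parity."""
    return xor_parity(list(surviving) + [parity])[:lost_size]


# Three toy "drives" of different sizes and one parity "drive".
d1 = b"photos!!"
d2 = b"music"
d3 = b"docs"
p1 = xor_parity([d1, d2, d3])

# Pretend D2 died: rebuild it from D1, D3 and P1.
recovered = rebuild([d1, d3], p1, len(d2))
print(recovered)          # b'music'
print(len(p1), len(d1))   # 8 8 (parity is as large as the largest data drive)
```

With only one parity block per position, losing two data drives leaves two unknowns and one equation, which is why a second parity drive is needed to survive a double failure.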

tomos

This would give you 6 TB of data space with 2 TB of parity.  The simplest way to envision this is with math, i.e., 1+2+3=6 (D1 = 1, D2 = 2, D3 = 3, thus, P1 = 6).  If you lose any one of the "D" drives, you can easily calculate what you're missing.  Let's say you lost D2 and have replaced it.  The system starts to rebuild the data based on the P1 parity drive information like this:

starting to get it - in conjunction with above explanation, this helped:
http://en.wikipedia....iki/RAID#RAID_parity


very impressive functionality - so long as it works :D
Tom

superboyac

Yes, very helpful explanation.  Thanks skwire!

40hz

  • Supporting Member
  • Joined in 2007
  • **
  • Posts: 11,859
The math gets a little less obvious with something like RAID-5 since there isn't a designated parity drive. The parity data gets distributed among all the drives. But for RAID-5 you'll always need one additional drive to do an array.

The minimum number is three drives. Your data space is the capacity of all the drives in the array minus the capacity of one drive in the array. (Note: All the drives in a pure RAID-5 array should have identical capacity; with mixed sizes, each drive is only used up to the size of the smallest one.)

Assuming 1TB drives, a three drive array would have a 2TB data space (i.e. 3TB-1TB). So 33% of the total capacity is 'lost' to parity. But a five drive array would have a 4TB data space (i.e. 5TB-1TB) with only 20% of the total capacity lost to parity. And that percentage decreases with each drive added.

So as you can see, RAID-5 is least economical with only three drives - but it becomes increasingly economical with each drive added after that - up to the capacity of the RAID controller.
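The numbers above are easy to reproduce. A small sketch, assuming identical drives and the usual "total minus one drive" rule; `raid5_capacity` is just a name picked for this example:

```python
def raid5_capacity(num_drives, drive_tb):
    """Return (usable_tb, parity_overhead_percent) for identical drives."""
    if num_drives < 3:
        raise ValueError("RAID-5 needs at least three drives")
    usable = (num_drives - 1) * drive_tb   # one drive's worth goes to parity
    overhead_pct = 100 / num_drives        # share of raw capacity used by parity
    return usable, overhead_pct


for n in (3, 5, 8):
    usable, overhead = raid5_capacity(n, 1)
    print(f"{n} drives: {usable} TB usable, {overhead:.1f}% parity")
# 3 drives: 2 TB usable, 33.3% parity
# 5 drives: 4 TB usable, 20.0% parity
# 8 drives: 7 TB usable, 12.5% parity
```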

The tradeoff is that a standard RAID-5 array can only tolerate one drive failing at a time without experiencing data loss. As skwire pointed out earlier, the general rule is you can have as many drives simultaneously fail as you have parity drives. And in the case of RAID-5, you basically only have one "drive" for parity even though it isn't a specific drive.

In practice, it's a little more complicated than that. But not much. 8)
« Last Edit: July 18, 2012, 10:38 PM by 40hz »