Superboyac's backup strategy revisited (revised for 2011)

superboyac:
I have discussed backup strategies here a lot, and here I go again.  Just to remind everyone, mouser wrote a detailed backup guide several years ago:
https://www.donationcoder.com/Reviews/Archive/BackUpGuide/index.html

and I wrote my own limited guide a little after that:
https://www.donationcoder.com/forum/index.php?topic=7940.0

Now, I am going to write some more.  I'll eventually publish a cleaned up version of everything on my website once I figure everything out.  The stuff below are my ramblings based on several discussions I've had with individuals and even people here at DC.  I still have some questions and issues to sort out, but I think I have an overall handle on things now.
I want to drastically improve my current backup solution.  I have a double-redundancy backup going on right now.  Meaning, I have a hard drive where I keep my personal files (no OS or installed programs; those are on another drive).  I mention that because all the files/folders are standalone and can be moved anywhere without a problem (no drivers or OS required, they're just files).  Anyway, when I built this backup system, I used 1 TB drives.  So I have two 1 TB backup drives backing up that data drive.  3 drives total, which equals double redundancy.

Now the problem is that I'm running out of space.  When that happens, I clean out the less important stuff by burning it onto a DVD.  But this is a clunky way of doing things.  I would rather have everything on hard drives and backed up that way.  That means that I need to purchase additional hard drives.  However, I already have 5 drives on my desktop (internal + external).  Adding more at this point would be a little bit much.

That's why I'm building a server.  This issue along with my desire to put ALL my data on hard drives means I've stepped across the simple desktop boundary and into server territory.  The knee-jerk reaction at this point is to buy a NAS thing and be done with it.  But I like to do things my way, and I like to do things a little on the extreme side.  I realize that there are more simple and affordable solutions to this that are perfectly adequate.  But I'm going to do it the hard way and get a system that I like better.

First question that comes to mind: how many hard drives do I need?  Well, how much data do I have currently?  Right now, I have about 1 TB of data.  But I also have hundreds of burned DVDs/CDs that I eventually want to stick back onto hard drives.  Plus, I want to take my entire movie collection (home movies, DVDs, etc.) and put it on the hard drives.  The movies are huge (when ripped uncompressed with MakeMKV), so that will add significantly to my size requirement.  Altogether, taking future growth into account, I'm going to plan for backing up as much as 4 TB of data.  Yes, I know it's a lot, but it makes sense, especially with all the movies.

The next question is the method of file backup.  The first "track" backup (mouser's term!) will be file synchronization.  I really prefer file syncing because it means I can use the files very easily, and I can just grab a hard drive and plug it into another computer and start using it without any extra steps.  Image backups, on the other hand, are more difficult to use because you have to extract the files, you need additional software, it's not easy to use with other computers.  That's why I love file syncing.
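To make the file-syncing idea concrete, here is a minimal sketch of a one-way mirror in Python.  It is only an illustration, not how any particular sync program works: the function name `mirror` and the "copy if missing, different size, or older" rule are assumptions made for the example, and it deliberately never deletes anything from the backup side.

```python
import shutil
from pathlib import Path

def mirror(src, dst):
    """One-way mirror: copy files from src to dst when they are missing,
    differ in size, or are older on the destination side.
    Files deleted from src are NOT removed from dst (illustrative choice)."""
    src, dst = Path(src), Path(dst)
    copied = []
    for path in src.rglob("*"):
        if not path.is_file():
            continue
        target = dst / path.relative_to(src)
        if (not target.exists()
                or target.stat().st_size != path.stat().st_size
                or target.stat().st_mtime < path.stat().st_mtime):
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also preserves timestamps
            copied.append(target)
    return copied
```

Because the result is a plain folder tree, the backup drive can be plugged into another computer and used directly, which is the whole appeal of this track.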

I have a double-redundancy philosophy with file syncing.  It would take a pretty rare circumstance to simultaneously wipe out data from three different hard drives.  Also, I very narrowly avoided losing my original data AND the backup data a few years ago when I was doing single-redundancy (long story).

At this point, the most pressing issue is that backing up 4TB of data with double-redundancy is a LOT.  Assuming I use 2TB drives for each set, I would need 2x3=6 hard drives (2TB each) to accomplish this.
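The drive count is simple arithmetic, sketched here in Python as a sanity check.  The helper name `drives_needed` is made up for the example, and it assumes each copy of the data lives on its own set of drives (no sharing of drives between copies).

```python
import math

def drives_needed(data_tb, copies, drive_tb):
    """Drives required to hold `copies` full copies of `data_tb` terabytes,
    with each copy on its own set of `drive_tb`-sized drives."""
    drives_per_copy = math.ceil(data_tb / drive_tb)
    return drives_per_copy * copies

# 4 TB of data, 3 copies (original + 2 backups), on 2 TB drives
print(drives_needed(4, 3, 2))  # 6
```

The same function shows how the count changes with bigger drives or fewer copies, e.g. `drives_needed(4, 2, 2)` gives 4.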

The next track for a backup strategy is doing image backups.  While file syncing is great for portability and convenience as far as accessing individual files/folders, it is not that suitable for versioning and OS/programs backup.  I'm not that concerned about backing up the operating system or installed programs because you can always reinstall that stuff.  I'm way more concerned about my personal data.  What I desire from images that I don't get from file syncing is versioning.  Meaning, let's say I deleted or modified a file from last month, and only now am I realizing that I wish I had the original file back.  Versioning keeps track of all these changes and you can recover them from images.

Versioning, from what I've tried, can be done in two ways: using images, or using archived file sets (rar,zip).  I used to do it the archived file way.  I've tried the versioning support in SFFS which does versioning by appending dated suffixes to files.  It's not an elegant way of doing it.  I've also tried programs like Genie.  Programs like Genie and Backup4All can use archive formats to manage their versions.  If you need to recover an old file, it extracts it from a zip/rar file (unless you store them uncompressed).  But doing it this way is more like a hybrid between file syncing and images.  I didn't like it very much, and that's why I'm going back to images.

You might be asking why I'm so hesitant to use images.  As mentioned above, one of the reasons is that I don't really care about backing up my OS or programs, so I don't need something like images that retains all the interrelated files/drivers/OS system files.  If you haven't noticed, I REALLY like having portability with my files.  If it were up to me, everything would be portable: the OS, programs, everything.  So that's why I struggle with this part.

Anyway, so I've decided to use images for versioning because it's the best way right now.  Since versioning adds the variable of time to the pot, I have to be more thoughtful about the backup schedule.  With file-syncing, since I'm mirroring the files/folders, time is not an issue.  They are just copying files, and it doesn't really matter when it happens.  With versioning, I want the images to give me the ability to go back in time.

I've really struggled with the best way to set this schedule up the past year.  Then I saw Apple's Time Machine program.  I really liked it, so I'm going to model my imaging setup to mimic that.  So what does that mean?  Here's the explanation (from Wikipedia):
"Time Machine saves the hourly backups for the past 24 hours, daily backups for the past month, and weekly backups for everything older than a month"
That's what I want to do.  I don't really know how to set up image programs to do this right now, but I'm pretty sure it can be done.  What I need to do is set up a 3-track imaging approach: one will do the hourly backups, the second will do the daily backups, and the third will do the weekly backups.  My question is, do all these tracks need to be tied together somehow, or are they independent of each other?  I don't know right now.  Obviously, Apple's program is seamless.  I'd like to be able to do the same, but whether or not that's possible remains to be seen.
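One way to think about that schedule is as a single pruning rule applied to one pile of snapshots, rather than three separate jobs.  The Python sketch below is only an illustration of the quoted policy, not how Time Machine or any imaging program actually implements it.  The function name, the "30 days = one month" cutoff, and the choice to keep the newest snapshot in each hour/day/week bucket are all assumptions.

```python
from datetime import datetime, timedelta

def snapshots_to_keep(snapshots, now):
    """Return the snapshot timestamps to keep: hourly for the past 24 hours,
    daily for the past 30 days (assumed 'month'), weekly for anything older."""
    newest_in_bucket = {}
    for ts in sorted(snapshots):
        age = now - ts
        if age <= timedelta(hours=24):
            bucket = ("hour", ts.year, ts.month, ts.day, ts.hour)
        elif age <= timedelta(days=30):
            bucket = ("day", ts.year, ts.month, ts.day)
        else:
            iso = ts.isocalendar()  # (ISO year, ISO week, weekday)
            bucket = ("week", iso[0], iso[1])
        # Snapshots are visited oldest first, so the newest one wins per bucket.
        newest_in_bucket[bucket] = ts
    return sorted(newest_in_bucket.values())
```

Anything not in the returned list could be deleted.  Seen this way, the hourly, daily, and weekly "tracks" do not need to be tied together by separate jobs; they fall out of one retention rule run after each backup.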

Once again, how many additional hard drives will I need for this imaging stuff?  Well, I'll be backing up the same 4TB of personal data, but I will also throw in the OS/programs.  I'm still going to consider it as 4TB total.  Once again, I'm going to do this with double-redundancy.  This means I will need an additional 4 drives (2TB each).

If you're keeping count, I now have 10 drives total.

Back to the server.  I need some kind of server rack to hold all these drives.  I'm not familiar with all the lingo yet (rack, enclosures, 2U, 4U, etc.) nor do I care.  I just need a box that houses the drives, the computer, etc.  I'll probably get some kind of mid-height (no more than 4') rack tower for all of this.  I need something like a Dell Poweredge unit to hold the drives.  Allowing for some future growth, I'd like it to hold 15-20 drives.  Then I need some kind of unit for the motherboard and all that.  Then some kind of rack monitor and keyboard, unless I choose to access it remotely.

This is when people normally start talking about RAID setups, and I just don't get it.  I don't think I need RAID.  There are many things I don't like about RAID.  Firstly, I don't need the speed.  Secondly, it would at least double the number of drives I would need.  All those drives, if RAIDed, would need identical duplicates to do all that mirroring and building and such.  It just sounds like a big headache to me.  It's something I'm unfamiliar with, and uncomfortable with.  I've had issues in the past trying to deal with RAID-related issues with motherboards, BIOS, and hard drive configuration.  It's just a headache.  And I don't think it really offers much to my backup strategy.  If it were simple and affordable, I'd consider it.  But it's just a headache in every way.  I hate listening to people talk about all the different RAID flavors, it annoys the shit out of me.  I don't think most of them know what they are talking about.  I'm convinced of one thing: it's not a true backup strategy.  It's only a "kind of, sort of" backup strategy.  So unless someone can convince me otherwise, I'm just going to do this using independent hard drives and software.  No one has convinced me yet, and I've asked a lot of people.  Like I said, I think there are a lot of people out there who think they are RAID experts, but when I talk to them, it's clear that they don't know enough to answer my questions.

40hz:
RAID isn't a backup strategy per se. It's a strategy for data redundancy and recovery in the event of hardware failure. RAID can figure into a backup strategy - but it's not a substitute for backing up.

RAID is also oversold. IMO it's only really useful for use on a server when mirroring (RAID1) the drive holding the operating system. That provides fail-over in the event one drive unit dies.

The only problem with this strategy on a home server is the quality of the RAID controller card or subsystem on the mobo. Quality and reliability of these "consumer" or "semi-pro" controllers varies a great deal from the more expensive and better designed "enterprise" controllers.

If your RAID controller doesn't have its own CPU and RAM, it's "consumer grade" - and you have been warned. I've seen these types of controllers malfunction and corrupt both drives in a mirror far more often than I've seen the drives themselves fail. If you do use inexpensive RAID controllers to mirror your boot drive, do yourself a favor and take periodic recovery snapshot images. Because sooner or later - you're gonna need them.

I'm seriously thinking the next home server I build is going to be configured to use laptop-type 2.5" form factor drives. They're smaller, more energy efficient, and more shock resistant than regular SATA drives. And there are some new home servers already set up to use them which look quite promising. I'll likely keep minimal amounts of user data (MyDocuments, etc.) on it, and use an external array for most of my heavy (i.e. libraries and archive) storage requirements to avoid heat issues in the server case. Something from Synology, Sonnet Technologies or Granite Digital will probably be what I use for that part.

Check out the following sites for reviews and product info:

ServeTheHome  :-*
SmallNetBuilder  :-*
MyHomeServer
WeGotServed

 :Thmbsup:

JavaJones:
Agree with 40hz, RAID isn't backup, and I wouldn't recommend it for the home user (even a power user). You don't need to worry about it.

Now, to address your specific remaining issues/questions:

First, regarding your desire to do versioning with "imaging": this is certainly possible, but if the only reason you want imaging is versioning, then it's probably overkill and not the best approach. If you want good versioning support with backup, try CrashPlan. The software is free if you don't use the Pro or online features, but it's also worth the price if you do need e.g. backup sets (a Pro feature). There are other backup programs that also do versioning; I just find CrashPlan to be a nice, comprehensive option with good versioning support. It works similarly to Time Machine in that it saves more frequent versions for more recent times. The version saving is customizable as well. If you do decide to go with imaging, make sure you select an app that does on-the-fly/"live" imaging (so you don't have to shut the machine down) - most do these days - and, more importantly, that it does in fact support versioning. Not all imaging apps do versioning; it's a more common feature for non-image-based backup tools.

Note also that if you're doing versioning, your basic space calculations may be way off, depending on how many large files are changing.
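As a back-of-the-envelope illustration of why, here is a rough Python sketch.  The function name and every number in the example are made up; it simply assumes one full copy plus the changed data stored for each retained version, which is closer to an upper bound than to what a deduplicating tool would really use.

```python
def versioned_storage_gb(base_gb, changed_gb_per_version, versions_kept):
    """Rough estimate: one full copy plus the changed data
    stored for each retained version (no dedup or compression assumed)."""
    return base_gb + changed_gb_per_version * versions_kept

# Hypothetical: 4000 GB of data, 50 GB of changed files per daily version,
# 30 daily versions retained.
print(versioned_storage_gb(4000, 50, 30))  # 5500
```

If a few multi-gigabyte files (video projects, virtual machine disks) change every day, the per-version figure climbs quickly and the total can end up well past the size of the data itself.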

The next big question is rack mount. If you have space for one and a desire to add more equipment over time, then it *may* make sense for you. Just keep in mind that the rack itself will cost money, and most equipment designed for racks is enterprise-grade, generally more expensive and higher-spec'ed than you probably need (e.g. redundant power supplies). I personally wouldn't bother with a rack. That being said, if you really do need the number of drives you have spec'd, you may need to.

If you do go rack mount, you'll be looking at something like this, some more bare options or these guys. You can see how expensive it can get quite quickly.

Also, if you don't already have one, be sure to get a good UPS and configure it properly on the machine so that it will gracefully auto shut down before the backup power runs out. This will help minimize the chances of any failure due to power outages, surges, etc.

Finally, if you're going to all this length and expense to get a backup system with versioning, you should seriously consider the issue that all your data is still in one physical space. If there is a fire or other natural disaster, or even if some careless person spills water or something on the system, it could at the least mean needing to pursue data recovery, and at worst could mean your careful (and expensive) backup strategy is for naught. Consider options for off-site backup. It may seem prohibitive with that much data, but I have a similar amount and am about to start using CrashPlan's online service. They offer a backup seeding service for a reasonable price to get you started with the majority of your initial data. After that it's just maintenance, so as long as you have a reasonable Internet connection (reasonable outgoing bandwidth, that is), it should be fine.

- Oshyan

Stoic Joker:
Then I saw Apple's Time Machine program.  I really liked it, so I'm going to model my imaging setup to mimic that.  So what does that mean?  Here's the explanation (from Wikipedia):
"Time Machine saves the hourly backups for the past 24 hours, daily backups for the past month, and weekly backups for everything older than a month"
That's what I want to do.-superboyac (May 19, 2011, 12:57 PM)
--- End quote ---

Sounds a lot like the Previous Versions/Shadow Copy Service that's built into Windows. It can be configured to do snapshots daily, weekly, monthly, or every 5 minutes (not recommended), as you like. Default is twice a day.

tomos:
Then I saw Apple's Time Machine program.  I really liked it, so I'm going to model my imaging setup to mimic that.  So what does that mean?  Here's the explanation (from Wikipedia):
"Time Machine saves the hourly backups for the past 24 hours, daily backups for the past month, and weekly backups for everything older than a month"
That's what I want to do.-superboyac (May 19, 2011, 12:57 PM)
--- End quote ---

there are a few Windows backup apps that do that - just don't ask me for names :-[


(has Backup4All introduced something along those lines? I've lost track of it lately)
