I have discussed backup strategies here a lot, and here I go again. Just to remind everyone, mouser wrote a detailed backup guide several years ago:
https://www.donation...ckUpGuide/index.html
and I wrote my own limited guide a little after that:
https://www.donation...dex.php?topic=7940.0
Now, I am going to write some more. I'll eventually publish a cleaned-up version of everything on my website once I figure everything out. What follows are my ramblings based on several discussions I've had with individuals and even people here at DC. I still have some questions and issues to sort out, but I think I have an overall handle on things now.
I want to drastically improve my current backup solution. I have a double-redundancy backup going on right now. That means I have a hard drive where I keep my personal files (no OS or installed programs, that's on another drive). I mention that because all the files/folders are standalone, and can be moved anywhere without a problem (no drivers or OS required, they're just files). Anyway, when I built this backup system, I used 1TB drives. So I have two 1TB backup drives backing up that data drive. That's 3 drives total, which equals double-redundancy.
Now the problem is that I'm running out of space. When that happens, I clean out the less important stuff by burning it onto a DVD. But this is a klunky way of doing things. I would rather have everything on hard drives and backed up that way. That means that I need to purchase additional hard drives. However, I already have 5 drives on my desktop (internal+external). Adding more at this point would be a little bit much.
That's why I'm building a server. This issue, along with my desire to put ALL my data on hard drives, means I've stepped across the simple desktop boundary and into server territory. The knee-jerk reaction at this point is to buy a NAS thing and be done with it. But I like to do things my way, and I like to do things a little on the extreme side. I realize that there are simpler and more affordable solutions to this that are perfectly adequate. But I'm going to do it the hard way and get a system that I like better.
First question that comes to mind: how many hard drives do I need? Well, how much data do I have currently? Right now, I have about 1TB of data. But I also have hundreds of burned DVDs/CDs that I eventually want to stick back onto hard drives. Plus, I want to take my entire movie collection (home movies, DVDs, etc.) and put it on the hard drives. The movies are huge (when ripped uncompressed using MakeMKV), so that will add significantly to my size requirement. Altogether, taking future growth into account, I'm going to plan for backing up as much as 4TB of data. Yes, I know it's a lot, but it makes sense, especially with all the movies.
The next question is the method of file backup. The first "track" backup (mouser's term!) will be file synchronization. I really prefer file syncing because it means I can use the files very easily, and I can just grab a hard drive and plug it into another computer and start using it without any extra steps. Image backups, on the other hand, are more difficult to use because you have to extract the files, you need additional software, and they're not easy to use with other computers. That's why I love file syncing.
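To make "file synchronization" concrete, here is a minimal sketch of a one-way mirror in Python. It is only an illustration of the idea, not how any particular sync program works: the function name `mirror` and the "same size and not older" comparison rule are assumptions made for this example, and it never deletes anything from the destination.

```python
import shutil
from pathlib import Path


def mirror(src: Path, dst: Path) -> list:
    """One-way mirror: copy files that are missing from dst or look changed.

    Assumption for this sketch: a file counts as unchanged when the copy in
    dst has the same size and a modification time that is not older.
    """
    copied = []
    for src_file in src.rglob("*"):
        if not src_file.is_file():
            continue
        dst_file = dst / src_file.relative_to(src)
        if dst_file.exists():
            s, d = src_file.stat(), dst_file.stat()
            if s.st_size == d.st_size and int(s.st_mtime) <= int(d.st_mtime):
                continue  # looks unchanged, skip it
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also preserves timestamps
        copied.append(dst_file)
    return copied
```

Because the result is just ordinary files and folders, the destination drive can be plugged into any other computer and used directly, which is the whole appeal of this track.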
I have a double-redundancy philosophy with file syncing. It would take a pretty rare circumstance to simultaneously wipe out data from three different hard drives. Also, I very narrowly avoided losing my original data AND the backup data a few years ago when I was doing single-redundancy (long story).
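At this point, the most pressing issue is that backing up 4TB of data with double-redundancy is a LOT. Assuming I use 2TB drives, each 4TB copy of the data needs 2 drives, and double-redundancy means 3 copies (the original plus two backups). So I would need 2x3=6 hard drives (2TB each) to accomplish this.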
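The drive count is simple arithmetic, but it is handy to have as a formula when trying other drive sizes. A small sketch in Python, where the function name `drives_needed` is made up for this example and each copy is assumed to live on its own set of whole drives:

```python
import math


def drives_needed(data_tb: float, drive_tb: float, copies: int) -> int:
    """Drives required to hold `copies` full copies of the data.

    Each copy is rounded up to whole drives, since copies don't share drives.
    """
    return copies * math.ceil(data_tb / drive_tb)


# 4TB of data, 2TB drives, 3 copies (original + two backups)
print(drives_needed(4, 2, 3))  # 6
```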
The next track for a backup strategy is doing image backups. While file syncing is great for portability and convenience as far as accessing individual files/folders, it is not that suitable for versioning and OS/programs backup. I'm not that concerned about backing up the operating system or installed programs because you can always reinstall that stuff. I'm way more concerned about my personal data. What I desire from images that I don't get from file syncing is versioning. Meaning, let's say I deleted or modified a file from last month, and only now am I realizing that I wish I had the original file back. Versioning keeps track of all these changes and you can recover them from images.
Versioning, from what I've tried, can be done in two ways: using images, or using archived file sets (rar, zip). I used to do it the archived file way. I've tried the versioning support in SFFS, which does versioning by appending dated suffixes to files. It's not an elegant way of doing it. I've also tried programs like Genie. Programs like Genie and Backup4All can use archive formats to manage their versions. If you need to recover an old file, they extract it from a zip/rar file (unless you store the versions uncompressed). But doing it this way is more like a hybrid between file syncing and images. I didn't like it very much, and that's why I'm going back to images.
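For readers who haven't seen suffix-style versioning, here is a rough Python sketch of the general idea: before a file gets overwritten, the old copy is set aside under a name that includes a date stamp. The function name `archive_version`, the separate versions folder, and the exact stamp format are all assumptions for this example; this is not the naming scheme of SFFS or any other specific program.

```python
import shutil
from datetime import datetime
from pathlib import Path


def archive_version(path: Path, versions_dir: Path, now: datetime) -> Path:
    """Copy `path` into `versions_dir` with a dated suffix in its name.

    Example (hypothetical naming): report.txt -> report.2024-01-02_030405.txt
    """
    stamp = now.strftime("%Y-%m-%d_%H%M%S")
    versions_dir.mkdir(parents=True, exist_ok=True)
    target = versions_dir / f"{path.stem}.{stamp}{path.suffix}"
    shutil.copy2(path, target)  # keep the old version's timestamps
    return target
```

It works, but every saved version is another loose file to keep track of, which is part of why this approach feels inelegant.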
You might be asking why I'm so hesitant to use images. As mentioned above, one of the reasons is that I don't really care about backing up my OS or programs, so I don't need something like images that retains all the interrelated files/drivers/OS system files. If you haven't noticed, I REALLY like having portability with my files. If it were up to me, everything would be portable: the OS, programs, everything. So that's why I struggle with this part.
Anyway, so I've decided to use images for versioning because it's the best way right now. Since versioning adds the variable of time to the pot, I have to be more thoughtful about the backup schedule. With file syncing, since I'm mirroring the files/folders, time is not an issue. It's just copying files, and it doesn't really matter when it happens. With versioning, I want the images to give me the ability to go back in time.
I've really struggled with the best way to set this schedule up the past year. Then I saw Apple's Time Machine program. I really liked it, so I'm going to model my imaging setup to mimic that. So what does that mean? Here's the explanation (from Wikipedia):
"Time Machine saves the hourly backups for the past 24 hours, daily backups for the past month, and weekly backups for everything older than a month"
That's what I want to do. I don't really know how to set up image programs to do this right now, but I'm pretty sure it can be done. What I need to do is set up a 3-track imaging approach: one will do the hourly backups, the second will do the daily backups, and the third will do the weekly backups. My question is, do all these tracks need to be tied together somehow, or are they independent of each other? I don't know right now. Obviously, Apple's program is seamless. I'd like to be able to do the same, but whether or not that's possible remains to be seen.
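One possible way to think about the tracks being "tied together" is as a single stream of snapshots that gets thinned out as it ages. The Python sketch below applies the retention rule from the quoted sentence to a list of snapshot timestamps. It is only an illustration under stated assumptions: it is not how Time Machine or any imaging program is actually implemented, "a month" is treated as 30 days, and the function name `snapshots_to_keep` is made up for this example.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(snapshots, now, month_days=30):
    """Return the set of snapshot timestamps to keep.

    Rule (from the quoted description, with a month assumed to be 30 days):
      - up to 24 hours old: keep every snapshot
      - up to a month old:  keep the newest snapshot of each calendar day
      - older than that:    keep the newest snapshot of each ISO week
    """
    keep = set()
    seen_days = set()
    seen_weeks = set()
    for ts in sorted(snapshots, reverse=True):  # newest first
        age = now - ts
        if age <= timedelta(hours=24):
            keep.add(ts)
        elif age <= timedelta(days=month_days):
            day = ts.date()
            if day not in seen_days:
                seen_days.add(day)
                keep.add(ts)
        else:
            week = tuple(ts.isocalendar())[:2]  # (ISO year, ISO week number)
            if week not in seen_weeks:
                seen_weeks.add(week)
                keep.add(ts)
    return keep
```

Anything not in the returned set would be a candidate for deletion. Whether a given imaging program can be scheduled to behave like this is a separate question that the sketch does not answer.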
Once again, how many additional hard drives will I need for this imaging stuff? Well, I'll be backing up the same 4TB of personal data, but I will also throw in the OS/programs. I'm still going to consider it as 4TB total. Once again, I'm going to do this with double-redundancy, which here means two copies of the images. At 2 drives per 4TB copy, I will need an additional 4 drives (2TB each).
If you're keeping count, I now have 10 drives total.
Back to the server. I need some kind of server rack to hold all these drives. I'm not familiar with all the lingo yet (rack, enclosures, 2U, 4U, etc.), nor do I care. I just need a box that houses the drives, the computer, etc. I'll probably get some kind of mid-height (no more than 4') rack tower for all of this. I need something like a Dell PowerEdge unit to hold the drives. Allowing for some future growth, I'd like it to hold 15-20 drives. Then I need some kind of unit for the motherboard and all that. Then some kind of rack monitor and keyboard, unless I choose to access it remotely.
This is when people normally start talking about RAID setups, and I just don't get it. I don't think I need RAID. There are many things I don't like about RAID. Firstly, I don't need the speed. Secondly, it would at least double the number of drives I would need. All those drives, if RAIDed, would need identical duplicates to do all that mirroring and building and such. It just sounds like a big headache to me. It's something I'm unfamiliar with, and uncomfortable with. I've had problems in the past dealing with RAID-related issues with motherboards, BIOS, and hard drive configuration. It's just a headache. And I don't think it really offers much to my backup strategy. If it were simple and affordable, I'd consider it. But it's just a headache in every way. I hate listening to people talk about all the different RAID flavors, it annoys the shit out of me. I don't think most of them know what they are talking about. I'm convinced of one thing: it's not a true backup strategy. It's only a "kind of, sort of" backup strategy. So unless someone can convince me otherwise, I'm just going to do this using independent hard drives and software. No one has convinced me yet, and I've asked a lot of people. Like I said, I think there are a lot of people out there who think they are RAID experts, but when I talk to them, it's clear that they don't know enough to answer my questions.