Chuqui 3.0: backups and drive failures: how much redundancy is enough?:
Now, having said that -- it never hurts to fire up Disk Utility every so often to check the SMART status on the drives. Since I'm using SoftRaid for the RAID, it tracks and will show you disk errors. Checking that every so often can give you a hint something's up (like a bad block). And usually, barring Coke spills or clumsy fingers, a failing laptop drive will give you a two minute warning that things are starting to go bad.
Okay, so last night and this morning, I finished the round of Leopard upgrades and disk upgrades on all of the systems here, putting a 250G drive in Laurie's laptop and upgrading her to Leopard.
All's pretty much well in the world now.
I did run into one thing that ties back to this whole backup and drive failure and redundancy thing... After I'd upgraded Laurie's mini to Leopard, I realized that the drive I was going to use for backups (temporarily, while I finished everything else up) wasn't big enough. On the other hand, Laurie's secondary FireWire drive on the mini was, and the new drive was big enough to hold her files. So I copied all of her data over to the new drive and went to reformat the 500 gig drive to use as her Time Machine drive.
It failed -- I/O errors while trying to erase and partition. The drive worked fine under normal circumstances, and the data copied off fine, but it failed the reformat. One can only surmise it had bad blocks somewhere bad. It was also a drive that was three weeks out of date on the backups and stored most of Laurie's picture library. The backups were out of date, of course, because I was in the middle of this upgrade cycle and stuff was in a bit of chaos.
Which, of course, brings up the reminder that even a small window of risk in your backup cycle can be too large. In this case, no harm, no foul, but I now realize I had a disk that was starting to fail, and if the timing had been bad, we could have lost some data. Fortunately, no.
But that's led me down the path of wondering whether it makes sense to simply schedule copying all data to a new drive once a year, just to make sure it's on a drive that can be tested and reformatted. Or copying it off, reformatting, and putting it back. Just as one more sanity check, at least for internal drives.
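If you do try the copy-off-and-reformat routine, it's worth confirming the copy is actually good before erasing anything. Here's a rough Python sketch of one way to do that, by hashing every file on both sides and comparing. The function names and the choice of SHA-256 are just illustrative assumptions, not anything built into SuperDuper or the Finder. It also wouldn't have caught the failure described above, since that drive read back fine and only choked on the reformat; all this proves is that the copy matches the original.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(source, copy):
    """List files under `source` that are missing from, or differ in, `copy`.

    Returns (relative_path, problem) tuples; an empty list means every
    file in the source has an identical twin in the copy.
    """
    source, copy = Path(source), Path(copy)
    problems = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src_file.relative_to(source)
        dst_file = copy / rel
        if not dst_file.is_file():
            problems.append((str(rel), "missing"))
        elif file_digest(src_file) != file_digest(dst_file):
            problems.append((str(rel), "differs"))
    return problems
```

You'd call it as compare_trees("/Volumes/Old", "/Volumes/New") (both volume names are made up) and only reformat the old drive if the list comes back empty. Hashing forces a full read of both drives, so a read error on either side should also show up here as an exception rather than passing silently.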
For external drives, I'm seriously starting to wonder whether EVERYTHING ought to go RAID 1, not just the backup drives. Since the drives that plug into the laptop are now all bus-powered, that's a theoretical question at best for those -- but for anything that sits with a power block, maybe the answer is to RAID everything. I haven't decided whether it's worth the money (probably not, off the top of my head), but it's a consideration.
But it's clearly a reminder that we need to be aware of the potential failure of things with moving parts, and these days, disks are about the last thing left in a computer system that has moving parts.
I've also been streaming my picture library up to S3 via transit, and that's going well. Moving 60 gigs of data takes time, but once it's done, it's another safety option. I also plan on throwing Laurie's photos, our iTunes, and our documents up there. 100-125 gigs total, I think. And this is a freaking HOUSE, not a business. Wow.
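To put a rough number on "takes time", here's a back-of-the-envelope Python sketch. The upload speed and the efficiency fudge factor are pure assumptions for illustration; nothing above says what the actual connection is, so plug in your own.

```python
def upload_days(size_gb, upload_mbps, efficiency=0.8):
    """Estimate days needed to upload `size_gb` gigabytes.

    upload_mbps is the link's upload speed in megabits per second
    (an assumed input), and efficiency is a guess at how much of that
    you actually get after protocol overhead and other traffic.
    """
    bits = size_gb * 8 * 1000**3                # decimal gigabytes to bits
    usable_bits_per_second = upload_mbps * 1_000_000 * efficiency
    return bits / usable_bits_per_second / 86_400   # seconds to days


# Assuming a 1 Mbps uplink running flat out around the clock:
print(round(upload_days(60, 1.0), 1))    # the 60 gig picture library
print(round(upload_days(125, 1.0), 1))   # the upper end of the full set
```

Under those assumed numbers, the picture library is roughly a week of continuous uploading and the full 125 gigs is about two weeks, which is why the first sync is the painful part and the incremental updates afterward are not.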
So...
Now both Laurie and I have a laptop (MacBook Pro) with an internal 250G drive; I need to order one more (thanks to the 500G failure), but we'll both have bus-powered 250G portable FireWire drives to use with SuperDuper. We also both have another bus-powered drive (mine is 100G, Laurie's 80G) for carrying around files that don't need to live on the laptop all of the time -- not absolutely needed with the new drive, but knowing both of us, we'll be filling it up rapidly. This gives us ~300G of space before we have to think about it again...
Laurie has her mini (100G) with a FireWire drive (250G) as well. I retired my mini, and so I live completely on the laptop, which is my preference; I prefer not to have to worry which pair of pants a given file is in, so to speak.
Each of us has a dual-drive RAID 1 drive for Time Machine backups. Hers is 500G attached to the mini, and once it's synced up, I'll set up a network backup of the laptop to a DMG on it as well using SuperDuper. Mine is on the laptop, of course...
One of the "bad" things about Time Machine, and I don't know how they'll work it out, but they need to, is that a DMG is seen as a single file, unlike, say, a package. That means every time I fire up Parallels and run XP and do something, the underlying disk image XP lives in is modified, and then you end up backing up the whole thing again (in my case, about 4 gigs). For me, where I use Parallels pretty casually, it's not a huge deal, but for multi-OS warriors, this will really screw over Time Machine's utility. For those figuring all of this out, be warned.
Also, if you do major restructuring of your files, fresh copies are made; Time Machine doesn't reference the existing ones. So if you move, say, 30 gigs of RAW pictures to your secondary disk, that'll eat up your backup space. Time Machine will clean it up eventually -- but be warned. And when you buy backup disks, make them bigger than you think you'll need. 3X your current usage isn't a bad start...
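Here's a rough Python sketch of that arithmetic, using the numbers from this post: a 4 gig disk image that gets re-copied whenever it changes, a one-off 30 gig reshuffle, and the 3X rule of thumb. The function names are made up, and the model is deliberately crude (it assumes the first backup is the size of current usage and ignores everything except the big files), so treat the result as a ballpark and not a prediction of what Time Machine will really do.

```python
def suggested_backup_gb(current_usage_gb, factor=3.0):
    """Apply the '3X your current usage' rule of thumb."""
    return current_usage_gb * factor


def days_until_thinning(backup_gb, first_backup_gb, one_off_gb, daily_churn_gb):
    """Estimate days before the backup disk fills and old backups get thinned.

    first_backup_gb: size of the initial full backup
    one_off_gb:      data re-copied once, e.g. after moving files around
    daily_churn_gb:  data re-copied every day, e.g. a VM disk image
    """
    headroom = backup_gb - first_backup_gb - one_off_gb
    if headroom <= 0:
        return 0.0
    if daily_churn_gb <= 0:
        return float("inf")
    return headroom / daily_churn_gb


# A full 250G laptop drive, backed up to a disk sized by the 3X rule,
# with a 30 gig file move and a 4 gig image touched once a day:
disk = suggested_backup_gb(250)
print(disk, days_until_thinning(disk, 250, 30, 4))
```

With those inputs the 750G disk has about 117 days of headroom from the disk image alone. Touch the image several times a day and that shrinks fast, which is the argument for buying big.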
And, frankly, it probably makes sense at some point to run two sets of Time Machine backups and store one of them offsite, rotating them. From now on, for me, copying my backup to a disk to take offsite isn't going to be my standard model. It's going to be to keep two sets of RAID 1 disks, each configured for Time Machine, with one set active and one set offsite, rotated as often as my paranoia demands. That's a BIT more expensive, but if you think about it, it takes a lot less time and hassle and is less prone to failure in the long run... and in backups, anything that reduces hassle is good.
That's one reason why you haven't heard me talking about using RAID to manage the offsite copies, or creating a RAID of the laptop drive and a FireWire drive. That's nice in theory, but in practice, unplugging part of the RAID (and breaking it), then plugging it in later and having the RAID software sync up and update turns out to be at least as time consuming -- usually, it seems, more so -- than using SuperDuper or the Finder to clone the backup disk. That means having to leave the drive on for a longer period of time, and having to plan better, and leaving a longer window where something bad might happen, and...
I'm guessing/hoping that one of these days, I'll be able to use SuperDuper to back up a disk to S3. Once you get that synced over the network, it's not hard to keep in sync, and that solves the offsite problem. Keep an offsite copy, keep a bus-powered bootable copy, and keep your primary day to day disk, and that seems like a nice way to handle this... A full restore from S3 might be painful, but not as painful as not having the data any more...