A day of random distractions tops itself

I had just posted this note to Twitter: “I wish I knew ahead of time the days I need to just write off”. The world did not let me down after that. This is the story of the next hour.

It started with me thinking: “What, world, do you have in store for my next distraction?”

Wes rolls the dice and crosses his fingers for something easy. Like French Fries.

Wes rolls a 72 and consults the random-event chart: A snake

Oh, wait… That’s not so bad! We get snakes around here all the time (we do live next to a field after all). Wait a minute… Where is the snake?

Wes rolls the dice again and hopes for “still in the field”

Wes rolls a 54: On the roof

On the roof? On the roof? What, am I in a game of Mad Libs or something? Wes goes to grab the ladder and a curved stick to try and pull it out from under the eaves. He returns with the ladder, climbs up to roof level and looks for the snake. No snake. Uh-oh, where has it gone?

Wes rolls the dice, but gives up hoping for something good.

Wes rolls a 28: Under the tiles

Um… Snakey? That doesn’t seem safe. It’s 75 today and the tiles are hot for me to touch, let alone an animal as sensitive to heat as you. What? Oh, you’re still crawling further? Ok… I’ll give up for a bit then. I’ll check back on you in a few to see if you’ve come to a place I can help you. Wes does a bit of random yard work and tries to wait the distraction out.

Wes rolls the dice again. He deliberately avoids making direct eye contact with the dice. He knows it’s going to be snake eyes, after all.

Roll #51: Clamber in the gutter

Oh, finally. At last the snake has dropped out from under the tiles into the gutter. Now I can get it off the roof and down to where it’s safe. Wes climbs the ladder and peers into the gutter.

Just as his eyes reach gutter-level again, the dice down below roll by themselves. Twice.

Roll #93: The snake is indeed in the gutter.
Roll #94: Wrapped tightly around a rat.

A bit of warning would have been nice, snakey. I mean… I nearly fell. You seem to have done a nice job on the rat though. Very impressive how tightly you can squeeze it. Thanks, by the way, in case I forget to mention it later when this adventure is all said and done.

Listen, I still want to get you off the hot roof. Now, I know how good snakes are at eating things bigger than themselves, but you’re not a full grown snake and that is, um, a bit bigger than I think you can handle. So, how about we get you down and you can slither away, mmm-kay? You trust me, right?

In the end, I hooked the snake with my makeshift snake hook and lowered it to the ground. It was still clutching the rat. Once on the ground, the snake proceeded to be mildly irritated with me but significantly more hungry than irritated. Thirty minutes later the snake had indeed succeeded in consuming the whole rodent and now sits, as I type, in the corner of the courtyard. Digesting.


Yes, I took lots of pictures. No, I won’t show you the ones you don’t want to see. But here’s the snake after finishing his meal (note the bulge):

Is it 5:00 yet?
No?
Ah, who cares at this point…

Death of a Server

Murphy’s law is typically (mis)phrased as “if anything can go wrong, it will”. My new extension to this law (Wes’ law?) will now read: “If anything can go wrong, it will, at the most inconvenient time” because he didn’t take into account the 4th dimension: time.

Roughly two weeks ago I was wandering around the streets of Prague, CZ, when I noticed that I could no longer log into my server back in the U.S. After my wife checked everything leading up to the system, she reported: “the power button is still doing nothing and no lights are coming on”. I suspected, at this point, it was the power supply. But unfortunately I still had another week of work travel to complete before I could get back to fix it. (And of course, during part of my trip, I was planning on using it remotely for a work-related demonstration involving DNSSEC.)

Hence my new extension: “… at the most inconvenient time”.

Returning to the U.S.

Upon returning to the physical system I did confirm my guess that it was the power supply that died. (Note: in front of the system are multiple surge protectors and a decent UPS, so it was definitely the supply itself breaking, not a surge coming through the power lines.) I quickly removed the old supply and replaced it with a nice, shiny, dust-free new one. Click the switch, and still no go. Power went to the motherboard but it refused to do anything.

Back to the store for a new motherboard. And a CPU. And memory. My original estimate of a $75 replacement power supply was beginning to look very, very off. After replacing the motherboard, taking out all the original cards and leaving only the original hard drives in place (ok, the physical case was still the same) I tried booting up again. At least the BIOS bootstrapping began, but the system still failed to boot, and the screen showing “no hard drives detected” had to be a bad sign.

That left the 3 hard drives as being still in some state of “bad”. So, booting from a Fedora rescue disk, I attempted to examine how each drive was functioning. One at a time. None of them would even spin up. All 3 exhibited complete failure conditions. Two of the three were identically configured drives (from different manufacturers) in a RAID1 array to ensure that if either drive died, the data would still remain intact. Redundancy is great until everything fails at once. Murphy doesn’t believe redundancy will help. The third drive contained (daily) backups of the system from the other two, but it had catastrophically failed too. That meant that there was no chance of a complete recovery unless I could get at least one of the drives working.

Salvage Operations


That can’t be good

In a last-ditch effort, I ordered brand-new, exact copies of the dead drives (which themselves were only 5 months old so finding duplicates was easy). If I was lucky, only the controllers on the drives would be dead and the physical drives themselves would still function. When the new duplicate drives arrived, I swapped the good controller on a new drive onto a bad drive and hoped. Unfortunately, the first old drive with the new controller still failed (though at least it sounded like it was trying to spin up this time). I crossed my fingers and moved on to the second bad drive. Unfortunately, even that was a no-go. I even tried various other tricks, being at the true “last resort” stage. It’s amazing the things that people suggest that might fix a dead hard drive, from knocking it on a table (I didn’t try that) to pretending to throw it like a frisbee to putting it in the freezer for 30 minutes.

Eventually, I had to admit defeat and start from my oldest, external backups. Sigh. They were from 4 months ago. Double sigh.

It’s better than nothing at least, but… I lost mail. I lost some pictures. And I lost some reputation points from having run a very solid, rarely down, server for various mailing lists and other services for the last 15+ years.

Looking Forward

So what did I learn from this? The first thing: one set of backups is never enough. And most importantly, at least one set should be electrically isolated from the machine. This means that the very common technique of storing backups on an external USB drive probably isn’t wise either since it’s just as likely that the USB system would spike a few volts to the external drive too.

So what are my future plans? I’ve replaced the system, got it back up and running on the old data, and restarted the backup system using the exact same nightly routine. But now I’m going to add an external USB drive to a completely different machine and (r)sync the backups to it on a daily-ish basis. That, combined with a backup MX server that keeps mail copies of critical domains for 30 days and off-site backups of truly critical data, should suffice, right?
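
For the curious, here is a minimal sketch of what that daily-ish sync might look like if wrapped in a small Python script and run from cron. The paths and the host name `backupbox` are made up for illustration, and the function names are hypothetical. It only assumes that `rsync` is installed and that the other machine is reachable over SSH. It deliberately leaves out `--delete` by default, so that a wiped or corrupted source does not wipe the second copy too.

```python
import shlex
import subprocess
from typing import List


def build_rsync_command(source_dir: str, dest: str, delete: bool = False) -> List[str]:
    """Build the rsync argument list for copying source_dir's contents to dest."""
    # A trailing slash tells rsync to copy the directory's contents,
    # not the directory itself.
    src = source_dir.rstrip("/") + "/"
    # -a is rsync's archive mode (recursive, keeps permissions, times, symlinks).
    cmd = ["rsync", "-a"]
    if delete:
        # Mirror deletions too. Off by default so a damaged source
        # cannot empty the second copy.
        cmd.append("--delete")
    cmd += [src, dest]
    return cmd


def run_backup_sync(source_dir: str, dest: str, dry_run: bool = True) -> int:
    """Print the command (dry run) or execute it and return rsync's exit code."""
    cmd = build_rsync_command(source_dir, dest)
    if dry_run:
        print(shlex.join(cmd))
        return 0
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical locations: the nightly backups on the server, and a USB
    # drive mounted on a different machine called "backupbox".
    run_backup_sync("/var/backups/nightly", "backupbox:/mnt/usb/server-backups/")
```

With `dry_run=True` the script only prints the command it would run, which is a cheap way to check the paths before trusting it with real data.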

Shush Murphy. Yes, I can hear you whispering behind me, but I’m not on speaking terms with you right now.
