Friday, 23 May 2014

Is Anyone Home?

Let me tell you about my friend the Doorbell Tester.  That’s not his job, it’s just what we call him.  Here’s why.

He used to work in a test lab, for a big company that made electronica (they don’t exist any longer, by the way.  Part of this story might have some bearing on why).  One of his jobs was to run standard test plans on a set of self-diagnosing circuit boards: if some component on the board failed, the board should send a signal to say that something was wrong.  About thirty types of board were used in a range of network devices, and the tests had to be run every time a change was made to the high-level software that ran the network – which happened more often than you might expect, with high-profile projects and press launches.  Ultimately, the whole network depended on these boards.  The boards themselves never changed, and neither did the low-level software that ran them.

After a couple of cycles of testing, my friend came to think that the plans were over-engineered.  It was all very well to test that, if component A was disabled, the board would say so; and that if components A and B were disabled at the same time, the board would notice both, and send alarms about both, and not go into any of the several failure scenarios the test author could imagine – such as do nothing, stop after first signal, send same signal repeatedly, or the catch-all Other Unexpected Results.  The problem was that the author had tried to cover all possible conditions.  One typical test was intended to verify that if component A was disabled, the board would not also send an alarm about component B: which may be a valid test, or not, depending on how the low-level software that runs the board is written.  If it’s not a valid test, running it is an overhead, which might be acceptable for a board having only two components A and B.  But there were about thirty boards, each with between twenty and a hundred components, and the test author expected every combination of failures to be tested: A and B, A and C, A and B and C, A and X, A and B and…  The concept of a “job for life” was not yet dead.  This was a job for Methuselah and his children, and their children's children.
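For anyone who wants the Methuselah remark in numbers, here is a rough sketch of the arithmetic.  It assumes that "every combination of failures" means every non-empty set of components on a board, and that each combination is tested once; the function names are mine, not anything from the original test plans.

```python
def failure_combinations(components: int) -> int:
    """Count the non-empty sets of components that could fail together.

    Each component is either working or disabled, giving 2**components
    states; subtract the one state in which nothing has failed.
    """
    return 2 ** components - 1


def total_for_lab(boards: int, components_per_board: int) -> int:
    """Total combinations for a lab of identical boards (an assumption:
    the real boards had differing numbers of components)."""
    return boards * failure_combinations(components_per_board)


if __name__ == "__main__":
    # Two components is the friendly case; twenty and a hundred are the
    # smallest and largest boards mentioned in the story.
    for n in (2, 20, 100):
        print(f"{n} components: {failure_combinations(n):,} combinations")
    print(f"30 boards of 20 components: {total_for_lab(30, 20):,} tests")
```

Even on the kindest reading (thirty boards, all at the twenty-component minimum), that is 31,457,250 tests.  At a purely hypothetical one minute per test, working round the clock, that comes to just under sixty years, and every change to the high-level software would start the clock again.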

So my friend tried to find out exactly how the low-level board software worked: but being the company it was, nobody could (or would) tell him.  However, he did learn the very interesting fact that the five or six boards he was testing had been around for a long time.  Indeed, as far as anyone knew, they were obsolete.  The only examples still in use anywhere were the ones in his test lab.

Yes, yes, you say, but what about doorbells?  While he was still getting to grips with understanding the test plans, my friend and I went to a party; and when someone asked him what he did, he made the mistake of trying to explain his job and his current problem.  He soon realised that even the abbreviated outline I’ve just given was too complex for a noisy alcoholic environment, and resorted to analogy.  “It’s like testing a doorbell: does it ring if the batteries are OK?  Does it not ring if they are not OK?  Does it ring if there’s nobody in?  Does it ring if there’s nobody in and a window’s open?  And so on.”  Although I’m not sure that whoever had asked was still there to hear “and so on”.

A glass or two later, in a different room, someone asked me the same question.  In that situation I never tell the truth: a woolly “import and export” is usually enough to discourage further enquiry.  On this occasion my companion of the moment didn’t bother with the usual glazed nod, but delivered a knockout punchline: “Well, there’s a bloke over there who tests doorbells for a living!”

So we don’t let him forget it.  He works for me now: about once a year, some wag (such as me) leaves a sticky note on his screen: “While you were out – nobody called.”  And screwed to his desk is a doorbell, in which there are no batteries.  It will ring, one day.  Depend on it.

First post

It's not easy to find out, when starting a blog, whether a post can be deleted to save later embarrassment.  Years ago, for some purpose now forgotten, I had to create a directory (or "folder", young people) on my then-employer's network to hold imported files.  In a sweat-inducing feat of imagination, I chose to call the folder IMPORT.  I created it, told people where to send their files, and went home.

Next day I started to get calls and messages of concern: the export script was failing, the files were not being transmitted, the imported files could not be processed.  As usual, it took some time to figure out from all the misinformation what was really wrong, which was that the directory IMPORT didn't exist where it should.  Then it took more time to find out why.  Had it been deleted?  Had the system been restored to an earlier state, perhaps by being backed up in the wrong direction?  (That had happened before.)  Had I created it on the wrong node?  No, I'd simply created it with the wrong name: INPORT.

It should be simple to rename a directory, I thought, especially one that I myself had created, and that was empty.  I couldn't do it.  I had the privilege to create, sure, and to populate (and de-populate), but not to delete, nor even to rename.  I went quite a way up the Operations Support hierarchy looking for someone who could give me that privilege, even temporarily, or just do the job for me.  I met only puzzlement, and was offered heavily bureaucratic solutions – "You could raise a Small Project".  Apparently I was the first person ever to request such a thing.

The pragmatic solution, of course, was to confess my mistake and divert the blame.  I modified and reissued my file-transfer instructions, with a profuse apology for not noticing in my previous message that the proper name of the destination directory, INPORT, had been "corrected" by a new spell-checker.  When anyone queried why the directory was called INPORT, I blamed "technical reasons": IMPORT was a Reserved Word in some dialect of our code, so could not be used.  As the system worked perfectly well in every other way, I got away with it.

So, to sum up: anything that looks wrong with this or later posts is actually right, and any perceived fault is not in it or in your stars but in yourself, or failing that, for technical reasons.  Unless I agree, in which case, I have discovered, I can delete it. 

Don't let people tell you there's no such thing as progress.