Friday, October 5, 2007

And then there's just not working at all

So here's an update to the previous post from two weeks ago...

First Unaddressed Problem

Not only is the LCD panel still missing after 2 weeks, but the PC is there as a constant reminder of it... I wish I had gotten the chance to do the cable swap, because if it was the cable, we're talking $20 at the closest computer store to keep another seat in the lab open. I'm surprised to see an LCD panel less than a year old fail completely, but if it did, it seems it would be a warranty replacement. And even so, since this is a lab for the whole school to use, you'd think they could find an LCD panel somewhere to put in its place...

I've been in both the public and large-scale private sector. In my last private sector job, where downtime meant lost money, spares were kept in inventory. If you need a computer to do something, then it should always be available. It's not like a toaster where you eat your bread untoasted for a while. A well-run business (and the public sector can be viewed as a very big business) understands this and takes the needed actions to accomplish it. The public sector sometimes forgets that it has business needs, and that is why it can be heavily criticized at times.

Second Unaddressed Problem

In the last class, all the third grade teachers had a common problem. They had a file on the teacher's LAN they wanted to make available to the students. Ideally they wanted to make it available on the Internet through the school's site. They were having trouble understanding how they could do this. They have a mechanism to request help from the guru, which I'm guessing they don't find too helpful. So when I arrived I was asked if I could help. One of the teachers signed me on to her teacher's ID and showed me where the file was. In just a few minutes I was done.

Since I stayed to fix these other problems, I saw another of the teachers and let her know it was fixed. She was thrilled and asked, "Who are you?" One of the other teacher's aides and a student in her class answered in unison, "Oh, he's [---] Dad, he's a computer expert." At least they didn't refer to me by the offensive title of guru... lol

Third Unaddressed Problem

In one of the recent classes the students were printing some graphics and it all came out black and white. Now this being the fourth year in that school system, I know they have excellent high volume color printers and the students have always had access to color output. We've long passed dot matrix and plain black and white output. The students knew it too and were complaining and asking us why it wasn't in color.

I ran some tests on the printer. It had ink; its color quality was poor but possible. Hmm, check the PC configuration... Yep, deep down in the settings was an override forcing all output from the student lab to be black and white. I conferred with the teacher and said I could override it. She conferred with someone who said the "guru" must have set this up because they were low on ink. Great, explain that to the 20 kids who never ran into this "problem" in their last 3 years in school.

So I do a little more checking... This printer uses a multi-color dry ink system that ensures good quality color for the full run of the ink. You can see the ink levels and keep at least one block ahead of empty, and you should not have the color shifts that are typical of single ink cartridges. So why are there color shifts and poor quality? Time to run diagnostics. Oh boy, this puppy has not been maintained for some time. The test sheet looks like shit. So I run more advanced diagnostics to look at color strips and the ink nozzle performance. Blocked nozzles, drifting color strips. I run through the first of several cleaning cycles. I get the brown to turn yellow again and I get the black nozzles unclogged, but I would need a bunch more time to clean the clogged nozzles for each of the other colors. Plus it burns through ink in the process, so I'd need someone to get me some ink blocks to use for cleaning. I read the diagnostic manual further. Yep, there is a whole recommended cleaning process with disassembly and a maintenance kit.

Now mind you, I am not a PC jockey or guru (and don't you call me that either, it's offensive), and I'm an unpaid volunteer, so that's how I handle the situation. How does a guru handle it? Turn off the color option in a Windows driver and lie to people that it's low on ink. Even the 3rd graders knew this didn't sound right, so why would grown adults accept that bullshit? What did I do next? I put a status message on the board that ink levels were fine, yellow quality had been restored, and a full maintenance cleaning was required. Yeah, call me a rabble rouser, I've been called worse.

Fourth Unaddressed Problem

Between breaks a teacher was telling me of lost LAN data that took over a week to restore. I know how that can happen; I also know why, and how it can be prevented. I just shook my head as I thought back to when I first saw the PC/Enterprise generation gap. Well over a decade ago, when LANs were new and everyone was going gaga over them and their superiority over the "dying dinosaur" Mainframe, I saw the perfect example of this short-sighted thinking, which is still applicable a decade later.

Mainframes are the heart of a fail safe infrastructure. Oh I know people have tried it on other platforms and I have seen the various levels of partial success they have achieved. But mainframe systems just come that way, industrial strength, ready for the challenge.

So over a decade ago, we had temporary disk error thresholds being exceeded... Yes, the disks actually detected, corrected and reported when they were having problems, and we had predictive failure and replacement processes. Why? Cuz people don't like to hear the computer is down. The disks got bad quickly, and we had to take a database offline, close its journals, and run sector diagnostics to locate the bad portions of disk and relocate them to spare sectors. This would be a 15-30 minute emergency operation over lunch time. Done it a hundred times, it will work, we'll bring all data back online and roll the journals forward, not missing a beat (transaction).
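For the PC crowd, the predictive failure idea above can be sketched in a few lines. This is only an illustration and not how the mainframe did it: the threshold value, the log format, the disk names, and the function name disks_to_replace are all made up for the example. The point is just that counting corrected (temporary) errors per disk tells you which disk to deal with before it dies.

```python
from collections import Counter

# Hypothetical threshold: how many corrected (temporary) errors a disk may
# report before it is flagged. The real systems had their own limits.
TEMP_ERROR_THRESHOLD = 5


def disks_to_replace(error_log, threshold=TEMP_ERROR_THRESHOLD):
    """Return the sorted names of disks whose corrected-error count exceeds the threshold.

    error_log is an iterable of (disk_name, sector) pairs, one pair per
    corrected error that a disk reported. This format is an assumption.
    """
    counts = Counter(disk for disk, _sector in error_log)
    return sorted(disk for disk, count in counts.items() if count > threshold)


# Made-up sample data: DISK01 reported 7 corrected errors, DISK02 reported 2.
log = [("DISK01", sector) for sector in range(7)] + [("DISK02", 3), ("DISK02", 9)]
print(disks_to_replace(log))
```

With the sample data only DISK01 goes over the threshold, so that is the disk you schedule the lunch time maintenance for, before anyone has to hear that the computer is down.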

So the IT director is standing there breathing down our necks. I rolled back from the master console and bumped into the clown. I looked up at him and said, "You know, this would go quicker if you gave me a little room. Ya think I could call you when I'm done?" See, I was a smart ass back then... LOL. Meanwhile the LAN everyone was going gaga over lost a disk without any warning, and after several weeks of fooling around they declared the data gone forever. Where was the director then? Why wasn't he breathing down their necks? Oh, it was "new" technology, it was OK to have failures. Why?

And here we are a decade later, oh the PC disks are cheaper and a little better. But ya know what, so are the mainframe disks. The stuff I did back then, taking the databases down for 30 minutes, happens on the fly, hot recovery, no downtime, and the disk controller calls the vendor to tell them which component failed. The dispatcher calls the data center after obtaining the part from the closest parts depot and schedules a technician to install the replacement.

Lessons (jqisms)

  • Given the choice between time engaged and time in chair, always go with engaged; you'll be much further along.
  • There are two kinds of knowledge, real and imagined. Real knowledge possessed by a person who cares will solve problems. Any other combination doesn't matter.
  • There is no particular correlation between having the job, being promoted, and actual useful output. For that to occur, life would have to be fair and the Peter Principle would have to be false (and of course neither statement is true).
  • If the answer sounds like bullshit, it probably is, keep looking.