Monday, December 21, 2009

People think Y2K was a bust, thus proving it wasn't

I read an article recently about a significant virus or some other kind of security problem that people were being warned about. One of the comments on the article said something like "Yeah, well they warned us about Y2K as well, and that was a bust." I have read similar comments before and even heard similar sentiments from people I know. The truth is that Y2K was a real problem that would have caused real chaos if it hadn't been fixed in time. However, it was fixed in time, and the fact that no significant problems occurred on January 1, 2000 is a testament to the amount of planning and work that went into fixing it. The fact that the general public thinks it was a bust proves that the fix was successful.

I know that there were Y2K problems in the database server that I worked on at the time (and continue to work on), and I know that they were fixed beforehand. Our problems were fairly minor, but I know of other problems that were not. Gail worked for a large steel company at the time (still does, kinda), and some time in the late '90s, they did some Y2K testing. They simultaneously reset all the clocks on all the computers in the plant to 11:30pm December 31, 1999 and fired 'em all up again. A few seconds after the clocks hit midnight, everything shut down. The problem was eventually traced to an exhaust fan deep in the bowels of the plant, which decided that it hadn't had any scheduled maintenance in a hundred years, so it shut down. All the systems that depended on that fan running also shut down, and the failure cascaded upwards until nothing was running.
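
The fan's actual control logic isn't known, so what follows is only a minimal Python sketch of the kind of two-digit-year arithmetic that could make a device conclude it was roughly a century overdue for service. Every name here (`overdue_two_digit`, `expand_year`, `overdue_windowed`) and the one-year service interval are hypothetical, made up for illustration. The "windowing" fix shown, which picks a pivot year and treats two-digit years on either side of it as belonging to different centuries, was one common style of Y2K repair, but it is not necessarily what was done to this fan.

```python
# Hypothetical sketch: the real fan controller's logic is not known.
MAX_YEARS_BETWEEN_SERVICE = 1  # assumed service interval, in years


def overdue_two_digit(last_service_yy: int, now_yy: int) -> bool:
    """Buggy check: compares two-digit years directly."""
    # When the clock rolls from 99 to 00, the gap looks like 99 years.
    elapsed = abs(now_yy - last_service_yy)
    return elapsed > MAX_YEARS_BETWEEN_SERVICE


def expand_year(yy: int, pivot: int = 50) -> int:
    """Windowing: treat yy >= pivot as 19yy, otherwise as 20yy."""
    return 1900 + yy if yy >= pivot else 2000 + yy


def overdue_windowed(last_service_yy: int, now_yy: int) -> bool:
    """Same check, but on four-digit years recovered by windowing."""
    elapsed = expand_year(now_yy) - expand_year(last_service_yy)
    return elapsed > MAX_YEARS_BETWEEN_SERVICE


print(overdue_two_digit(99, 99))  # serviced in '99, still '99
print(overdue_two_digit(99, 0))   # serviced in '99, clock rolls to '00
print(overdue_windowed(99, 0))    # same dates, windowed
```

With the buggy check, a fan serviced in 1999 looks fine right up until midnight and then appears 99 years overdue; with windowing, the same dates are one year apart and the fan keeps running.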

If they hadn't done the testing, the plant would have shut down a few seconds after midnight on New Year's Day, and it might have taken them a couple of days to find the problem and a couple more to get a new fan installed. This is assuming that the fan was the only problem. When every hour not producing steel costs your company hundreds of thousands (if not millions) of dollars, a five-day outage would be devastating. Now think: what if that same brand of exhaust fan was used in your local power or water treatment plant? Could half your city live without power or running water for a week in January? What if a similar failure occurred in an air traffic control system? Or some safety-related subsystem in a nuclear power plant? Or the computer controlling the respirators in your local ICU?

The fan was fixed or replaced and the test was repeated. I don't know how many times they ran the test, but when the real December 31, 1999 arrived, the plant kept producing steel like it does through every other midnight. Many hours and dollars were spent in advance to make sure that the problem was solved before it happened. This was done in countless other factories, businesses, hospitals, airlines, and such (not to mention every software development company) so that when January 1, 2000 arrived, all the hardware and software would handle it.

The people who were expecting nationwide blackouts or planes to start dropping out of the sky at midnight were surprised to find that the number of actual problems was very small. Many people assumed that this meant the whole "Y2K problem" was overblown or some kind of industry hype. It wasn't. It was a real problem with an absolute deadline that could not slip. It was solved in time thanks to the combined effort of thousands of software developers (who, admittedly, created the problem in the first place) and IT professionals who put in a lot of effort so that people would never know there was a problem.

This, of course, is part of the thankless world that IT professionals live in – if they do their job properly, you don't notice them. You might even mistakenly think that they do nothing. Every morning, you arrive at work and check your email or internet connection and find that everything is working properly. How many of those mornings have come after nights where the IT staff were up until 4am fixing some network or hardware problem? I'm sure you don't know, but I'll bet that it's more than zero. Tell ya what – next time you see your sysadmin walking through the halls at work, say thanks.
