
Debugging the Y2k Story

(c) 1999 by William A. Plice, III and Stephen M. Schumacher

Being programmers, we're often asked for an opinion on the Y2k bug. In brief, we're more worried about the bugs in the heads of human beings than in computers.

But that's not answering the question. People want to know whether "Y2k is real" -- whether predictions of a breakdown next January will come true and suddenly change their lifestyle. Even people who only casually follow the news realize that this could be serious. Exactly what to do about Y2k in their own lives is a big decision and they want reliable information.

There is no shortage of information, only of reliability. Y2k news stories are everywhere and a whole industry of Y2k experts has appeared, brewing scenarios for 2000 AD. They all start at the same place, with the well-known fact that a two-digit decimal number has trouble getting past 99. The question is, how does this fact play out in the real world?

First, they imagine the breakdown of individual pieces of equipment -- computer systems, automatic controllers and computing elements embedded in and essential to the functioning of all manner of devices. Then they imagine how these breakdowns could propagate and combine to cause much bigger problems. The spectrum of imagined outcomes goes all the way from complete destruction of civilization as we know it, to virtually no disruption at all.

Y2k Stories Are Buggy

We've found that Y2k stories abound in errors, often resulting from poor understanding of the technology underlying the subject. This article attempts to expose those errors by explaining in some detail the technical aspects of the problem which others have overlooked or misrepresented.

If you become sensitive to the bugs in Y2k stories, you can read them with discrimination, discount the chaff, zero in on any hard kernel of useful information that merits concern, and not be a victim of unreasonable fear.

The Compliance-Metric Bug

A common abstraction appearing in many stories is "Y2k compliance", which lumps all possible date-related failure modes and consequences for each system into a simple binary ("true/false") measure. Essentially, this concept says that if a system works the same (or as well) with 21st century dates as with 20th century dates, it's considered "Y2k compliant" -- if not, it isn't. You really can't talk about partial compliance unless you have some way to meaningfully map all of the real, multi-dimensional details into this single-dimensional measure.

The "compliance" abstraction works only in the limit of total compliance: obviously, if everything is Y2k-compliant there is no problem. But, if only 90% of individual systems are compliant and the rest are not, "compliance" tells us nothing about what the result will really be. This is because the results depend on more details than "compliance" measures. An expert can wave his hands and imagine whatever he likes, but the percentage of compliant systems is useless as a measure of the outcome -- unless it is very close to 100%.
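To make this concrete, here is a small sketch in Python. Every system name and number in it is invented for illustration; it is not data from any real installation. It shows only that two shops with the same compliance percentage can face very different outcomes.

```python
# Hypothetical illustration: all names and numbers below are invented.
# Each system is (name, is_compliant, hours_of_disruption_if_it_misbehaves).

def compliance_percent(systems):
    """Percentage of systems that carry the 'compliant' label."""
    compliant = sum(1 for _, ok, _ in systems if ok)
    return 100.0 * compliant / len(systems)

def total_disruption_hours(systems):
    """Sum of disruption hours over the noncompliant systems only."""
    return sum(hours for _, ok, hours in systems if not ok)

# Shop A: the one noncompliant system is a report that sorts in the wrong order.
shop_a = [("billing", True, 0)] * 9 + [("report sorter", False, 0)]
# Shop B: the one noncompliant system is the billing run itself.
shop_b = [("report sorter", True, 0)] * 9 + [("billing", False, 40)]

for name, shop in (("A", shop_a), ("B", shop_b)):
    print(name, compliance_percent(shop), total_disruption_hours(shop))
```

Both shops score 90% on the compliance metric, yet one loses nothing and the other loses a work week. The percentage alone cannot tell them apart.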

Therefore, any story that bases its predictions on projected levels of compliance is flawed. Its predictions could turn out to be correct, of course, but only by accident.

"Compliance" is a poor way to measure even an individual system. When you look more closely at the root of the Y2k problem you find more interesting and complex things going on than what you may have been led to imagine. Explanations of the Y2k bug tend to be so simplified that the real technical issues are missed or misrepresented. This may be necessary in order to "inform" more people, but it leads to a false sense of confidence in the resulting story.

The Two-Digit-Year Bug

Many stories start with the inadequacy of a two-digit year field because it "obviously" has a problem counting centuries. If the explanation goes no further, it sounds like a watertight proof of the existence and prevalence of the Y2k bug.

What isn't so often mentioned is that computer programs can handle this condition in many ways. In fact, the two-digit year is perfectly adequate in almost all applications. It only becomes ambiguous when the century is unknowable by other means. In almost all cases the century is knowable from other data. For example, it should be clear from the context that 00 means 2000, not 1900 or 2100. Of course, whether this is implemented correctly is another question, but there is no fundamental reason why it should not be.
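One common way of making the century knowable is a fixed "window". The Python sketch below is only an illustration: the function name and the pivot value of 50 are our assumptions, and a real application would choose its window from what its own data can contain.

```python
# A minimal sketch of a windowing rule. PIVOT is an assumed value,
# not taken from any particular system.
PIVOT = 50  # two-digit years below 50 are read as 20xx, the rest as 19xx

def expand_year(yy, pivot=PIVOT):
    """Map a two-digit year to a four-digit year using a fixed window."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return 2000 + yy if yy < pivot else 1900 + yy

print(expand_year(0))   # 2000
print(expand_year(99))  # 1999
```

The stored data keeps its two digits; only the interpretation is made explicit.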

It may be true, in rare cases, that programmers used the two-digit year in order to conserve expensive or limited storage. Certainly it's true that eliminating one or two characters per date stored saves storage space. But the savings are minor in most applications. A more important constraint was the limited capacity of the punched cards used for input. But even if there were no such limits the two-digit year would have been used in order to avoid the unnecessary redundancy of requiring that the same first two digits of the year be input over and over.

It was always obvious to programmers that, when the time came that the first two digits would not always be the same, an adjustment would have to be made. In many cases the mechanism to make that adjustment was omitted or left incomplete, but the programmers knew many revisions would have to be made before the year 2000 (if the program survived that long). It didn't make much sense to be overly concerned about it decades in advance.

The point is that two-digit year numbers were neither limiting nor bad design, given the input media. The real problem occurs when a mechanism for making the year number unambiguous is missing.

This perspective is a bit different from the common story that the Y2k bug is prevalent in legacy software because the technology and economics in the early days of computing virtually required it to be that way. It also differs from the common assumption that wherever a two-digit year is used it's impossible to resolve the century. You can see that the way the story is commonly told makes the bug seem necessarily present in all such software. But, really, this is not true.

The Date-Bugs-Everywhere Bug

Another interesting aspect is the great variety of ways that computers can represent time internally. Many internal representations are not subject to any discontinuity on 1/1/2000 at all. In fact, the whole issue is sometimes one of input/output conversion. This is commonly done in one place within a program and so only has to be fixed in that one place (if faulty at all).
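As a rough sketch of such a design, suppose a program keeps each date internally as a count of days since some epoch and converts to text in a single routine. The epoch and the function names below are our assumptions, chosen only for the example.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)  # assumed epoch, chosen only for this sketch

def to_internal(d):
    """Internal form: a plain count of days since EPOCH."""
    return (d - EPOCH).days

def to_display(day_number):
    """The single output-conversion point for the whole program."""
    return (EPOCH + timedelta(days=day_number)).strftime("%m/%d/%Y")

last_day_1999 = to_internal(date(1999, 12, 31))
first_day_2000 = last_day_1999 + 1  # the counter simply goes up by one

print(to_display(first_day_2000))  # 01/01/2000
```

The internal counter has no discontinuity at the rollover. If the displayed year were wrong, to_display would be the only routine needing repair.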

If you choose to believe a Y2k story that doesn't mention this, you're led to imagine bugs everywhere in each piece of software that will take forever to fix.

The Data-Incompatibility Bug

There are some programs that deal with the year in a simple way. For example, a general-purpose sorting procedure might be used to sort records by date. This type of routine doesn't know and doesn't care what the characters represent. If year 2000 is represented as "00", those records will come out at the wrong end of the file.
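A few lines of Python show the effect. The record keys are invented, and we assume a YYMMDD text field for the example.

```python
# Invented record keys in an assumed YYMMDD text format.
records = ["981130", "991231", "000101"]

# A general-purpose sort compares characters only, so "00" sorts before "98".
print(sorted(records))  # ['000101', '981130', '991231']
```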

There are many ways to correct things like this; the best scheme will depend on the details of the system. Updating the record formats to include a century field would be one approach, but certainly not the only one and probably not the easiest or best one. Any story that stresses Y2k obsoleting old data or the absolute necessity of revamping old files is giving you a flawed picture.
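As one hedged example of leaving old files alone, the sort can be handed a key that supplies the century at comparison time. The YYMMDD format, the pivot of 50 and the function name are assumptions made for this sketch; this is one possible scheme, not the recommended one for any given system.

```python
PIVOT = 50  # assumed window: years below 50 belong to the 2000s

def sort_key(yymmdd, pivot=PIVOT):
    """Prefix the inferred century so that text comparison orders correctly."""
    yy = int(yymmdd[:2])
    century = "20" if yy < pivot else "19"
    return century + yymmdd

# Invented record keys; the stored records themselves are not modified.
records = ["981130", "991231", "000101"]
print(sorted(records, key=sort_key))  # ['981130', '991231', '000101']
```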

The Must-Be-Fixed Bug

The sorting example mentioned above introduces another dimension of the Y2k bug. If, for whatever reason, a particular misbehavior doesn't get fixed, what are the consequences? If the records come out sorted in the wrong order, it is, by definition, a noncompliant system. But how much does it matter? Many possibilities exist. A few that come to mind are: 1) it may not matter at all; 2) it may result in a slight loss of efficiency in the next process using the sorted data; 3) it may be acceptable after making an adjustment in a downstream system; 4) it may just take a bit of getting used to on the part of a user looking at the data on a screen or printout; 5) it might be corrected by a new process outside the original system which has been designed just for that purpose; 6) it may be essential to the internal functioning of the system that it be fixed.

This illustrates the inadequacy of the "compliance" metric. The real-world impacts of noncompliant systems are not necessarily all-or-nothing as the label "noncompliant" implies. In fact it may be economically advantageous to leave a noncompliant system alone and work around it, at least for the time being.

The Bug-Free Bug

If all computer systems had to be 100% bug free before they were useful, we would not be using computers. When concentrating on one small aspect of computers such as date processing (and this is truly a small aspect), you can lose your perspective. Sometimes you have to stand back and look at the whole picture. Does it make sense to demand that all computer systems must handle the year in a certain way? What about all the other quirky things that computer programs do?

Don't let a Y2k story lead you into the unreal land of Computer Perfection with implications that we can only use computers when they do things exactly right.

The Out-Of-Control Bug

There is an interesting aspect of computer software that makes it different from most things we encounter: it doesn't wear out or decay. Being pure information (essentially instructions about how to do something) it's not subject to deterioration through time or wear-and-tear or "bit-rot".

This is both good and bad -- the bad part is that there is no imperative for sound structure. In other words, the inside of a software program can be patchwork and still be robust in the sense that it doesn't fall apart. Modern programming tools help tremendously in building sound structures, but with today's fast, capacious and low-price hardware there is little incentive to reduce sprawling duplication and complexity. This is nowhere more evident than in the modern PC operating systems for which most new software must be targeted.

The choice is not always easy between keeping an old program that does the job tolerably well and developing or buying new software that will have greater capability but inevitably be larger and less efficient. In spite of the tendency (there are exceptions) of software to be messy internally, the vast majority work so well that we can't live without them. (When thinking about this we must remember the context. Compared with physical machines, software is really robust; compared with social organizations, software is extremely well-ordered.)

Another thing that complicates matters is the difficulty of simulating a future date for testing purposes. It sounds easy, but full simulations in real-world networked systems can be extremely difficult because of the problems associated with running all data sources forward. If the software has been designed to respect the century, it might be better not to attempt a complete test before 2000.

When you hear stories about companies being unwilling or unable to update their software (implying that things are out of control), you're probably not being told the whole story.

The Self-Sufficient-Computer Bug

Another shortcoming of the "Y2k compliancy" concept is that it ignores everything about the context of the system. It's like summarizing an income-and-expense statement in just one word: "profit" or "loss". It tells you nothing about either the magnitude or quality of incomes and expenses. More importantly, it tells you nothing about the balance sheet and whether enough equity exists to cover a loss. But this is only an analogy.

In direct terms, the context of the information system is the people running it as well as the other systems with which it interfaces. What is their reserve capacity for catching up after down time? What alternate systems exist? How strong is the maintenance staff and how quickly do they typically respond to system glitches? Is there already a plan for upgrading or replacement? These are all questions which have rather definite answers. It's simply bad science to lump all these aspects into the simplest possible metric. It may be the best that the self-appointed experts can do, but that doesn't make it valid.

Certainly we can expect to find some poorly-managed systems that are running on good luck, with little to fall back on in case of either hardware failure or an acute software bug. But this has to be the exception, for the simple reason that good luck doesn't last. In real life, things go wrong, they get fixed, and life goes on.

The Black-Box Bug

Replacing or updating systems is usually disruptive to some extent, but increased productivity eventually makes it worth the trouble. The point is that the kind of activity the Y2k bug necessitates is not necessarily different from ordinary system maintenance. There are people who work on these systems and understand them, but their detailed knowledge is not shared by managers and consultants. The further you stand from such systems, the more you are prone to treating them as black boxes that either work or don't work.

But that's not reality. In reality, systems seldom work perfectly. Every major system has bugs which the operators learn to work around. In this light, the idea that computers will simply stop working on January 1st is obviously oversimplified.

Unfortunately, a lot of people, including some managers and consultants, settle for the black-box notion, because, real or not, it's probably the only notion they have to work with. By now you should see where this leads: "compliance". They have to push for "total Y2k compliance" because that assures them that the black box will work next year. If they could step inside and see how things really work, they would soon realize that they need a better way to measure the health of their systems.

The Money-Being-Spent Bug

Under normal circumstances, when things go wrong or threaten to go wrong with computer systems, they're fixed or prevented in the most expedient way without any other pressure than the need to carry on the work at hand. This goes on every day and is seldom of any interest to anyone other than the programmers and technicians maintaining the systems. The driving force for maintaining a system is close to the technical problems and solutions involved.

But in the case of the Y2k crisis, the driving forces are far removed from the problems. Budgets, schedules and even decisions about how to proceed and measure results are driven by management under pressure from lawyers, vendors, government, shareholders and public opinion. In short, this is a social phenomenon that is not truly based on the reality of the technicalities involved in maintaining the computer systems.

Under such conditions it's certain that a great deal of time and effort is being spent unwisely. It's no wonder that it's taking a lot of money. Therefore, any story that measures the seriousness of the problem by the amount of money being spent is ignoring some important factors.

Nevertheless, simply the amount of money being spent seems to be a major component shaping the belief of many people. But the world is not so simple. There are indirect reasons for the money being spent. Exactly what it is being spent on, and how effectively, are questions that need answers before this evidence can be trusted.

Given that management decides, for whatever reasons, to increase the budget of the data processing department, few people in the department will complain. Whether the money goes exactly or directly to enhancing Y2k compliance is perhaps a matter of interpretation. If the company needs to do it in order to meet legal or external expectations, it must be done -- perhaps even at the risk of causing other bugs.

No doubt, some are profiting from the increased budgets in diverse ways. Only a close look at individual DP departments will tell the whole story of how the money is being spent. But some likely scenarios are not hard to imagine.

Maybe it's a chance to get rid of some old unmaintainable code that should have been rewritten long ago. Fixing the Y2k bug may not be a big deal, but in certain cases the chance to do the rewrites could be too good to pass up. Tax law may require charging maintenance separately from development, but who draws the line between the two?

It's easy to see that regardless of the magnitude and actual cause of computer glitches next year, a legal threat hangs over American companies that can reasonably be countered by throwing money at the Y2k bug. Perhaps what is done with the money is not as important as the fact that it was spent trying to do something. With the federal government siding with the fix-it-now mentality, the legal liability of spending nothing or too little may be just too great. (Management and government officials aren't exempt from irrational fear.) If a whole industry does it, the costs will be passed to the customers as inflation.

But who can admit to such a scenario? Documented truth will have to wait until the dust has settled and we get to read the books telling us what really happened.

The Every-Computer-Has-It Bug

It simply isn't true that every company making extensive use of computers has Y2k bugs that will shut down operation if not fixed. Some have tested their systems and found no significant problems at all. There comes a point, much earlier than you may imagine, when the residual date anomalies are less important than other daily issues that have nothing to do with Y2k.

Just because someone finds a date anomaly in a certain computer model doesn't mean that it affects every system where that model of computer is used. When you see a long list of computers that have Y2k problems, remember that in a great many applications these problems are non-problems.

The Laggers-Are-Losers Bug

Not everyone is checking their systems for the Y2k bug. When people are busy, they don't usually go looking for trouble. Small businesses, those struggling to survive, and companies in countries with recessions aren't as likely to be concerned with the Y2k bug. When you don't have resources to spare you do the things which are immediately essential. Given that the extent of the Y2k bug is controversial, the tendency is naturally to postpone action.

But there is another reason for not rushing in to fix things. It's much easier to assess and locate a bug when it appears during actual operations. Simulations just aren't the same. Once it manifests proof of its existence, you can probably devise a temporary workaround to finish the immediate task. Then, by observing its effects in the output, you can often locate the bug immediately (without the tedious and incredibly time-consuming task of combing through the code). This is the way that bugs are ordinarily handled in mature software.

Besides being far more efficient, it's far safer. If a program has a Y2k problem that is not easily fixed, the program is probably poorly structured. If so, fixes of any kind can have surprising side effects. To make more than one change in such a program without real-life testing between each change is to risk causing other problems.

By searching the code for date calculations you might be able to make the program Y2k compliant, but you will not be sure that you have not caused other serious problems in the process (chances are good that you have). Those who don't have the luxury of looking for Y2k bugs now may actually come out ahead in the end!

We don't have to tell you that this is a drastically different picture from what you get in many Y2k stories that look at only one side.

The Microchip-Shutdown Bug

Some stories emphasize the vulnerability of microchips to the Y2k bug. Since a huge number of these devices are virtually running large portions of the world, some insist that even a relatively small number of failures will be disastrous. This has led people to speculate that all manner of things will stop working on January 1.

Just because a device deals with time doesn't mean it cares about the date. If it does use the actual date, it must have a date-input function for initialization after power has been removed. Think of your VCR or your answering machine after it's unplugged and the backup battery goes dead. If your toaster or your thermostat doesn't have a way to input the year, don't worry: Y2k isn't an issue.

If, like some VCRs and cameras, a device does input the date, then in the unlikely case that it has trouble recognizing the year 2000, it will still work. Just set the year to something the device likes. You can fool it that easily. Maybe it will get the leap year wrong, but we can live with that. Actually, the chances are excellent that anything manufactured within the last 10 years which does have a date function will work exactly right in the next century.

Most microcontrollers are ruled out as potential problems simply because they do not deal with dates or at least not with years. Internally, some may have a clock that reports time as year/month/day/hour/minute/second, and this may, in very rare applications, require that the year number be used in the process of finding the interval between two times.

But here is the thing to remember: unless the device brings out a way for the user to set the year, it does not care what the year value starts out to be. It merely resets it to some arbitrary base value when restarted. In other words, it doesn't matter whether the year number it uses is actually the current year. You can discount the credibility of any story that lists things like microwave ovens that have no year input as being prone to Y2k failure.
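The sketch below illustrates the idea in Python. The DeviceClock class is hypothetical, and it assumes the device does its interval arithmetic on complete timestamps, which Python's datetime handles for us here. Actual firmware will differ.

```python
from datetime import datetime, timedelta

class DeviceClock:
    """Hypothetical device clock: counts seconds from an arbitrary base."""

    def __init__(self, base):
        self.base = base      # whatever value the device resets to at power-up
        self.seconds = 0

    def tick(self, seconds):
        self.seconds += seconds

    def now(self):
        return self.base + timedelta(seconds=self.seconds)

def measure_interval(base):
    """Time a 90-minute cycle on a clock started at the given base."""
    clock = DeviceClock(base)
    start = clock.now()
    clock.tick(90 * 60)
    return (clock.now() - start).total_seconds()

# Same interval whether the base is arbitrary or straddles the rollover.
print(measure_interval(datetime(1980, 1, 1)))
print(measure_interval(datetime(1999, 12, 31, 23, 0)))
```

Both calls report 5400 seconds. The interval does not depend on whether the base year is the real one.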

Devices that do have a date that is actually synchronized with the current date probably use it to stamp output records, and will function just as well whether the date comes out as 00 or 100 or 1900 or 2000. (Perhaps 100 is the most likely since the year number inside is probably a byte having a range of 0 to 255.) Any such device that shuts down because the date output format has rolled over would have been programmed either by someone too stupid to program or by someone intending sabotage.
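To see where those four outputs come from, assume (as the parenthesis above suggests) that the device holds the year in one byte as a count of years since 1900. The function below is a hypothetical sketch of four ways such a value might be formatted. It is not the firmware of any real device.

```python
def year_outputs(years_since_1900):
    """Format a one-byte year counter in four plausible ways."""
    if not 0 <= years_since_1900 <= 255:
        raise ValueError("value does not fit in one byte")
    two_digits = "%02d" % (years_since_1900 % 100)
    return {
        "two_digit": two_digits,               # last two digits only
        "raw": str(years_since_1900),          # the counter printed as-is
        "prefixed": "19" + two_digits,         # hard-wired "19" prefix
        "full": str(1900 + years_since_1900),  # proper arithmetic
    }

for label, text in year_outputs(100).items():
    print(label, text)
```

With the counter at 100 (the year 2000), the four formats give 00, 100, 1900 and 2000. Only the stamp on the output record differs; nothing here gives the device a reason to stop.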

The vast majority of microchips are relatively new and contain cheap memory, which removes the economy argument (if there ever was one) for simplistic processing of a 2-digit year. There is really no excuse for them failing on account of a year rollover.

Some applications may need to apply different parameters in different years, and so would have a function dependent on the current year (an emissions sensor, for example). But it's hard to conceive of one that would not operate reasonably after a certain date if the year fell outside its parameter table. If this ever happened, it would not be the result of a programmer taking a shortcut or trying to save memory -- it would be a unique bug of the type that cannot reasonably be assumed to be prevalent.
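A sketch of what "operating reasonably" could look like: when the year falls outside the table, the lookup falls back to the nearest year it knows about. The table contents and names are invented for illustration, and clamping is only one of several sensible fallback policies.

```python
# Hypothetical per-year limits; the numbers are invented.
LIMITS_BY_YEAR = {1997: 4.0, 1998: 3.5, 1999: 3.0}

def limit_for(year):
    """Return the limit for a year, clamping to the range the table covers."""
    first, last = min(LIMITS_BY_YEAR), max(LIMITS_BY_YEAR)
    clamped = min(max(year, first), last)
    return LIMITS_BY_YEAR[clamped]

print(limit_for(2000))  # 3.0, the latest known limit
print(limit_for(1900))  # 4.0, the earliest known limit
```

Whether the year is read correctly as 2000 or misread as 1900, the device still gets a usable parameter and keeps working.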

When interconnected in larger automatic systems, the possibility exists that an irregular date output could be detected as an error condition and initiate a shutdown procedure. But again, this would be limited to those devices that include a date in their output that is actually used for anything. If the date is required, normal maintenance would include setting the date, which makes these devices conspicuous and hard to overlook in preparation for Y2k.

If these systems have not been double-checked before 2000 arrives, they will certainly be watched and taken off-line if they malfunction. Important automatic systems have manual overrides which are necessary to get through times of malfunction or maintenance.

The Big-Automatic-System Bug

Y2k-disaster scenarios assume that relatively few failures can shut down a large part of the world's economic engine because everything is so interconnected. What is unreal about this story is that most of the interconnections are far from being completely automated.

In cases of important services or safety-related links, people with flexibility and ingenuity stand by or are even directly involved in the interconnections. And these people (assuming they aren't conditioned to panic) are all Y2k compliant. Every important link has bypasses, workarounds and emergency procedures, depending on its importance, for the simple reason that it probably has not always worked perfectly in the past and is not expected to work perfectly in the future.

The Celebrity Bug

Because the Y2k story is frightening and involves everyone, it's the ideal media story. Everyone is interested. You can hardly blame the media for giving people what they want, but it does feed the flames. No Y2k glitch of any consequence has yet occurred, so there is really nothing to report other than speculation. Reason is not part of the media's territory, and facts are hard to come by; so, like many other celebrities, Y2k is mostly known for being famous.

Fame implies importance, but since it is capable of feeding on itself and snowballing to any size, it is not a reliable measure of importance. Judging the truth of a matter by its success in gaining a following is easy and natural, but not wise.

The Fit-To-Print Bug

News stories are notoriously susceptible to bugs. Have you ever seen a newspaper write-up about something with which you were intimately familiar? Was it entirely accurate?

All sorts of errors get into stories because 1) reporters are typically not experts in the subjects they are reporting on; 2) they often report what has been communicated to them by someone else; 3) what they hear may be different from what the witness intended to say; 4) there is a natural tendency to reinterpret information to fit the reporter's preconceptions; and 5) in order to get the reader's attention there is a tendency to sensationalize a story by making subtle omissions or arranging for implications which the reporter knows are not entirely accurate.

In ordinary news reporting, these corrupting tendencies are limited because the accuracy and quality of an individual story is often subject to scrutiny by readers and competing reporters. But the Y2k story is not ordinary. It's a huge conglomeration of opinions, theories and anecdotes which is impressive not for its quality but for its sheer bulk and quantity. It seems that little care is being taken to verify the anecdotes that provide the "proofs". In such a climate, not only errors but outright fabrications are bound to be common. Something that was originally made up as a "what if" or for the purpose of humor or simply as a prank can easily be repeated as fact and enter into the mass of accepted Y2k lore.

If you hear a juicy story about a test of a power or water utility or some other facility that failed due to a Y2k bug, don't be too eager to believe it, especially if it lacks particulars regarding exactly when and where it occurred and the names of people involved.

The Wanting-It-Both-Ways Bug

It's probably true that almost every Y2k expert stresses the seriousness of the problem. But that's virtually the definition of being an expert. Anyone who has promoted himself into such a position is not about to say it isn't a big deal.

On the other hand, it seems that many are trying to cover themselves, saying that they don't really know how big the problem will turn out to be, or even approximately what would happen if nothing were done in preparation. In other words, they have no theories that they are willing to back -- because, they maintain, it's an unprecedented event and no theories apply.

Apparently it escapes most people that these experts stop far short of what might be done toward modeling their subject. Real experts (though they're not usually called that) know enough to work out conclusions with useful accuracy and reliability. Until they reach that point they're merely guessing. The first time a man went to the moon, the engineers weren't guessing whether it would work. They knew a lot about a complicated and unprecedented scenario before it was played out.

Because they have promoted the inherent unreliability of their art, many Y2k experts are in a position where they cannot lose. If the lights stay on New Year's Day, they can claim success for their efforts. If lights go out, they can say, "We told you so."

When a story hedges its bet by saying that the outcome could be anywhere in a wide range, it really has nothing to say.

The Y2k-Monolith Bug

All possible consequences of individual Y2k bugs typically get lumped together as one big problem due to disrupt normal life next year. Every possible date anomaly is taken to be part of the huge problem. Consequently, every discovery during testing of a specific anomalous behavior is counted as proof that trouble is coming.

The fallacy of this is that most date anomalies are nowhere near threatening to the running of the infrastructure of civilization, nor do they add up to anything significant, since each one impacts only a small number of people. Just because a date stamp has no century field or its century is mislabeled doesn't mean that it's going to cause harm. In some cases it may lead to inconvenience, such as having the date-sorted order of a list become discontinuous. But these things belong in a different category -- the Y2k-bug label has been applied too broadly.

The Tsunami Bug

A major underpinning of the stories that predict Y2k Doomsday is the assumption that an overwhelming stress will hit the world's infrastructure at 12:00 AM, January 1, 2000. This is the stuff Hollywood movies are made of. In reality, any effects of the Y2k bug will be spread out over some time, giving us hours, days, weeks and sometimes months to handle them without imposing significant hardship, depending on the nature of the application.

In some programs, date calculations involving year 2000 and beyond have been going on for many years, and any problems in those particular routines have been fixed long ago. During calendar year 1999, accounting programs will be called on to handle fiscal-year 2000 dates, and any problems arising will be fixed this year.

Another essential ingredient in these stories is the idea that if 1000 systems fail at the same instant it's 1000 times worse than each one failing at random as they normally do. A little thought shows this isn't necessarily true. It would be true if all 1000 systems were maintained by the same person who could only be fixing one at a time. Of course, in reality almost every system has its own maintenance facility and it makes little difference whether they all happen to be fixing something at the same time or at random times.
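The arithmetic is simple enough to sketch. The function name and the figure of four hours per repair are invented for illustration, and the model assumes repairs are independent and split evenly among maintainers.

```python
def hours_until_all_fixed(n_systems, hours_per_fix, n_maintainers):
    """Hours until the last repair is done when every system fails at once."""
    # Each maintainer repairs one system at a time; work is split evenly.
    rounds = -(-n_systems // n_maintainers)  # ceiling division
    return rounds * hours_per_fix

# One person responsible for all 1000 systems.
print(hours_until_all_fixed(1000, 4, 1))     # 4000
# Each system has its own maintainer.
print(hours_until_all_fixed(1000, 4, 1000))  # 4
```

Only in the first case is a simultaneous failure 1000 times worse than a single one. In the second, everything is back in the time it takes to fix one system.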

The Biblical-Prophecy Bug

This is not the first time in history that the end of this age has been pinned on a certain date using Biblical authority. Certainly there is nothing in Biblical prophecy specifically about a technological glitch signaling the return of Christ. Even if the Y2k bug were certain to precipitate a collapse of all the world's economic systems it would not be theologically significant of itself because it's the wrong type of thing.

If it's simply the year 2000 that's significant (remember the 21st century of the Christian era doesn't start until 2001), then whether the Y2k bug's ravages turn out to be insignificant or severe will not affect the certainty of the outcome. If everything will be severe, and the Y2k bug is just another thing among many, then its severity is no more a necessity than that of anything else. Therefore, it's a fallacy to maintain that the Y2k bug is a manifestation of Biblical End Times any more than other technology-related ailments.

It looks very much like some believers have been spurred to label Y2k as divine judgement or the beginning of the Tribulation after first coming to believe that the Y2k bug will have cataclysmic consequences on scientific grounds. But in doing so, they're building their house on a pile of sand.

The Bug-Free Story

What's left after debugging the Y2k stories? Maybe not much. If that be the case, it doesn't prove that nothing will happen; it only proves that there have been no convincing arguments for major Y2k consequences beyond inconvenience and overtime work for some people.

Here's a very important point: It's not necessary to prove that an unprecedented future event will not happen. If there is no good reason to believe that it will happen, then, as a rational being, you owe it to yourself to assume that it will not happen.

Our Predictions for January 1, 2000

By now you have guessed it, but for the record, here it is:

1) No significant utility failures (power, phones, internet, water, sewer, etc. will not be shut off).

2) No inadvertent missile launches; no airplane crashes; and only the usual daily attrition of microwave ovens.

3) Your television, VCR and computer will still work.

4) What Y2k-panicked people will do, we haven't the slightest idea.

Our Recommendations for Action in 2000

1) Don't leave your PC running during Y2k rollover (midnight on December 31, 1999). The first time you use your computer after rollover, check to make sure the date is correct. If not, fix it. (You can do this by opening the "Date/Time" Control Panel in Windows or by typing DATE at the DOS prompt.)

2) If your computer applications do anything that depends on dates, back up your data files and take some extra time to make sure things are being done correctly.

3) You can read about other ways to protect your PC here -- but these advanced safeguards won't be worth the effort for most PC owners.

4) Don't forget to write the new year on your checks (00 will do).

