Tuesday, September 15, 2009

Glossary of Mathematical Mistakes

By Paul Cox

This is a list of mathematical mistakes made over and over by advertisers, the media, reporters, politicians, activists, and in general many non-math people. These come from many sources, which will appear in parentheses. I will try to find an actual example of each for learning purposes. Note: In this document, I attack errors made by popular social organizations. I am not attacking their important causes, only their mathematical errors. I try to find errors from all political and social views. Suggestions for better examples and new topics can be e-mailed to me.


Aftermath Counting (A. K. Dewdney) - What the press has a tendency to do after a major disaster: counting the known casualties reported by the police, paramedics, hospitals, and morgues without checking for duplication. Example: After the San Francisco Earthquake in 1989, reported deaths skyrocketed to 255 before finally settling on 64 dead. Interestingly, after the Oklahoma City bombing of 1995 the reported body count was kept to the number of bodies actually recovered, though the press reported as many as 300 missing (the final tally was 168).

"All Disasters Come in Threes" Conjecture - Also called the "All Celebrities Die in Threes" Conjecture. Essentially it is the mistake of grouping what is essentially a random occurrence. See Cancer Cluster Syndrome, "Shooting the Barn" Statistics.

Astrology Amnesia - When your Astrological forecast comes true one day, you forget about the last three weeks when the forecast failed. Similar to Sample Trashing.

Cancer Cluster Syndrome - Making a lot of fuss over an above average number of cancer cases in a confined region. Note that all random functions have a tendency to cluster. For every reported cancer cluster there is also a cancer deficit region that does not get reported at all. This is not to say that all "cancer clusters" are just statistical abnormalities; there may be some toxic pollutant present in the area. But false reporting of cancer clusters is very common. See "Shooting the Barn" Statistics.

Circular Reasoning - see Recursive Arguments

Compound Blindness (Dewdney)- An "impressive" growth rate that does not take into account inflation, population growth, or other forms of natural compound growth. See also Law of Zero Return

Conspiratorial Coincidence (Paulos, 1995) - Given any two events, or any two people, it is highly likely there are strange commonalities. Unfortunately, sometimes these commonalities are put together to suggest or demonstrate a "conspiracy theory". Example: The commonalities between Abraham Lincoln and John F. Kennedy have become rather famous. Lincoln was elected in 1860, Kennedy in 1960. Their last names are each seven letters long. Their assassins, John Wilkes Booth and Lee Harvey Oswald, both advocated unpopular political views, went by three names, and had 15 letters total in their names. Booth shot Lincoln in a theater and fled to a warehouse. Oswald shot Kennedy from a warehouse and fled to a theater. But if we are to make anything from this, we should consider the strange commonalities between two other assassinated presidents: William McKinley and James Garfield. Both were Republicans, born and raised in Ohio. Both have eight letters in their last name. Like Lincoln and Kennedy, they both were elected in years ending with zero (1880 and 1900). Both had Vice Presidents from New York City who wore mustaches. Both were slain during the first September of their respective terms by assassins, Charles Guiteau and Leon Czolgosz, who had foreign sounding names. So where are the conspiracy theories here? The Readers Submitted Examples page has more on this topic. This topic was featured as a Mistake of the Month.

Correlation Cause and Effect Problem - Statistical correlation is a comparison of how often event A happens when accompanied by event B, versus how often event A happens without event B. If the first happens more than the second, then events A and B are said to have positive correlation. For example, it has been shown that students with good self-esteem get better grades in school than students with poor self-esteem. So, self-esteem and grades correlate. The problem here is in deciding cause and effect. Either A helps to cause B, or B helps to cause A, or there is a third factor C that could help to cause both A and B. Unless there is definitive proof to decide, any one of the three can be the truth. Some educators believe that all they have to do is boost self-esteem, and grade improvement would follow. But it is just as likely that self-esteem is a result of good grades, not a cause. Furthermore, there could be a third factor, say a good family environment at home, that could be responsible for both. The Readers Submitted Examples page has more on this topic. This topic was featured as a Mistake of the Month.

Credit Card Games - An offer too good to be true from a credit card or loan company. Let's say that I use my credit card to buy a $2500 computer. But this card charges me 16% interest. Another card charges only 10%, but there is a 3% surcharge for using the card. Which card should I use? Obviously, the second card, because 10% plus 3% is only 13%. But if you go with the obvious choice, you would be wrong! Let's say I decide to pay off the debt in 8 months by paying $300 per month and then clearing whatever balance is left. Under the first card, over that 8-month period, I will pay $2664.42 including interest for that computer. Under the second card, they will charge me 3% up front, so my loan to start is $2575. At 10% interest over those same months, I will pay $2680.68 including interest for that same computer. That is over sixteen dollars more! If you transfer your credit card balance to a lower interest credit card there usually is this kind of surcharge, so run the numbers before you do it. Better yet, do not use a credit card if you can help it. A $2500 computer only costs $2500 if you pay it all up front. Watch out for "0% interest for 6 months" deals, also. Unless you can pay it all off within those 6 months, they will charge you back interest accumulated during those 6 months, usually at a high interest rate. You are better off charging with a bank card that charges interest from the start than going with one of these deals. Compare Ratiocinitis.
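The comparison above can be checked with a short sketch. It assumes interest compounds monthly at one twelfth of the annual rate, that each month's interest is added before the $300 payment is applied, and that the leftover balance is paid in full right after the eighth payment. The function name `total_paid` is made up for this illustration. Under those assumptions the totals land within about a dime of the figures quoted above; the small gap comes down to rounding conventions.

```python
def total_paid(principal, annual_rate, payment, months):
    """Total cash paid when `payment` is made each month for `months` months
    and the remaining balance is cleared right after the last payment."""
    balance = principal
    monthly_rate = annual_rate / 12          # assumption: simple monthly compounding
    for _ in range(months):
        balance += balance * monthly_rate    # interest accrues first (assumption)
        balance -= payment                   # then the monthly payment is applied
    return payment * months + balance        # payments made plus the final payoff


card_a = total_paid(2500.00, 0.16, 300.00, 8)          # 16% card, no surcharge
card_b = total_paid(2500.00 * 1.03, 0.10, 300.00, 8)   # 10% card, 3% surcharge up front

print(f"16% card:               ${card_a:.2f}")
print(f"10% card + 3% surcharge: ${card_b:.2f}")
print(f"difference:             ${card_b - card_a:.2f}")
```

Changing the payment size or the number of months changes which card wins, which is the point of running the numbers before switching.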

Definition Errors- A category of errors in which a mathematical term is misused or confused, possibly to mislead. The three most common definition errors are:

  1. Profit (or earnings) - Could mean either gross profit or net profit. This is how "5 billion dollar" companies still have financial difficulty and have to reorganize, such as a former employer who shall remain nameless.
  2. Average - There are three different kinds of averages. Given a sample set: 2,2,2,3,5,6,8: The mode average (the most common answer) is 2; the median average (the one in the middle) is 3; and the mean average (the sum of all samples divided by the number of samples) is 4. Any one of these can legitimately be called the average.
  3. Odds vs. Probability - Given a list of outcomes (rolling two dice) and a list of good outcomes (rolling a 7), you can calculate the probability (good outcomes / all outcomes) to get 1/6, or you can calculate the odds (bad outcomes / good outcomes) to get 5:1. It is easy to get these two confused.
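As a quick check of items 2 and 3, here is a minimal sketch using only the Python standard library. The sample set and the two-dice example are the ones from the list above; the variable names are just for illustration.

```python
from fractions import Fraction
from itertools import product
from statistics import mean, median, mode

# Item 2: three different "averages" of the same sample set.
sample = [2, 2, 2, 3, 5, 6, 8]
print("mode:", mode(sample), "median:", median(sample), "mean:", mean(sample))

# Item 3: probability vs. odds of rolling a 7 with two dice.
rolls = list(product(range(1, 7), repeat=2))      # all 36 equally likely outcomes
good = [r for r in rolls if sum(r) == 7]          # outcomes that total 7
bad = len(rolls) - len(good)

probability = Fraction(len(good), len(rolls))     # good outcomes / all outcomes
odds_against = Fraction(bad, len(good))           # bad outcomes / good outcomes

print("probability:", probability)
print(f"odds: {odds_against.numerator}:{odds_against.denominator}")
```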

Dimensional Dementia (Dewdney) - 1. The confusion of the significance of dimensions. 2. The attempt to compare two objects of different dimensions. A foot has 12 inches, but a square foot has 144 square inches.

Example: Here is a comparison of the world population in three different ways:

* If every person in the world were lined up end to end, we would stretch four times longer than the orbit of the Moon around the Earth.

* If we all lived in a city with the population density of New York City, that city would cover the state of Texas.

* If every person was given a 20'x20'x20' apartment, the total volume would only fill the Grand Canyon half way.

Note that with each dimension added, the world population seems less significant. (Examples from Paulos) An additional question: Which dimension is the correct one to use? Since our number one need is food, and farm land is measured in acreage, it is safe to assume that the two-dimensional measure is the best one. While we can live in apartment buildings stacked up on top of one another, farm land cannot be stacked. Another example: Back in high school, I was dancing close with a girl when a chaperone came over and said, "You should be six inches apart." I told the chaperone, "We are. There is at least six cubic inches of space between us." OK, so maybe you had to be there to think it was funny.

Dramadigits (Dewdney) - 1. The reporting of a number with more significant digits than what can be reasonably expected. Example: Advertisers love Dramadigits! Ivory soap likes to brag that it is 99.44% pure. It is impossible that each bar is exactly that impure. How do they get away with it? The "99.44% pure" statement is a trademark, not a statement of fact. 2. The reporting of a number with more significant digits than can be accurately calculated. Example: For decades it was thought that the normal body temperature was 98.6°F. This number was calculated from a study in Germany which reported normal at 37°C. What was not known was that this number was an average rounded to the nearest degree. In other words, it was only accurate to two significant digits, not the three we have with 98.6. Scientists today know that normal is actually 98.2 plus or minus 0.6; that is to say, anything in the range of 97.6° to 98.8° should be considered normal. Here is a dramadigit joke: A visitor at a natural history museum asks a guard how old the dinosaur skeleton is. The guard responds that it is 90,000,006 years old. He explains, "They told me it was 90,000,000 years old when I started working here, and that was six years ago." (Paulos, 1995) The Readers Submitted Examples page has more on this topic.
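The body temperature example can be sketched in a few lines. The only assumption is that "37°C rounded to the nearest degree" means the true average could have been anywhere from 36.5°C up to 37.5°C; the helper name `c_to_f` is made up for this illustration.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# A value reported as 37 C after rounding could have been anywhere in [36.5, 37.5).
low_f, high_f = c_to_f(36.5), c_to_f(37.5)

print(f"37 C converted as if exact: {c_to_f(37):.1f} F")
print(f"range actually covered:     {low_f:.1f} F to {high_f:.1f} F")
```

The converted figure carries a decimal place that the original measurement never had, which is the dramadigit.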

Election Paradox - In a very close election involving more than two candidates, it is possible to invent scenarios in which any candidate could have won. The only way to avoid the election paradox is to decide how votes are counted and how the winner is decided before the election is held. The 2000 Presidential Election demonstrated this paradox. It took the Supreme Court to decide the winner of Florida. When the votes were finally all recounted a year later, Bush would have won anyway, but there were scenarios in which Gore would have won Florida and the Presidency. This topic was featured as a Mistake of the Month.

Factorectomy - 1. The failure to take into account factors that vastly affect the outcome. Inflation is a commonly left out factor, especially when comparing financial situations over time. Variety always publishes a list of the "Top Grossing Movies of All-Time", and according to the list Titanic is number one, but when you take into account inflation, Gone With the Wind, made in 1939, has made almost $300 million more than Titanic. In 1939, you could still go to the movies for 10 cents. When #2 Star Wars was released in 1977, it was 3 dollars a ticket. That is still low compared to the $7.50 ticket prices of 1998, when Titanic came out (Titanic falls to number 5 when adjusted for inflation).

2. The use of predictive models that do not fit past data or outcomes due to missing factors. Example: Global Warming calculations have come under serious attack due to their inability to predict past occurrences. It has been determined that these calculations failed to take into account the effect of sulfur particulates and other pollutants on global temperature. While the old factorectomized models predict global warming of up to 8°F by 2040, these new predictive models show the globe heating up no more than 1°F by the year 2040. Turn on your Air Conditioners! (Note: The February 1994 Scientific American has an excellent article by Robert Charlson and Tom Wigley on this topic. Also, some have pointed out that I am oversimplifying the global warming phenomenon. Of course, I am. If you want the complete weather picture, see the May 1997 Scientific American.) Compare Factoritis.

Factoritis - 1. Taking into account factors that are not really relevant in order to inflate the numbers. AIDS activist groups regularly report that there are over a million Americans with HIV. Current statistics show that it is around 740,000, including the approximate number that have never been tested but are positive anyway. The only way to justify over one million people is to include those that died from AIDS since 1980 when it was first discovered. While some may call this valid, it only looks impressive as a Raw Number. To use this number as a comparison figure, you have to include all of the people that have died from other diseases during the same time period. Stated another way: To figure out the percentage of the population with HIV we divide 740,000 (the number of HIV positives still alive) by 260,000,000 (the number of Americans still alive), resulting in 0.28% or about 1 in every 350 people. If we include the 360,000 people who have died of AIDS in the numerator, we have to include the tens of millions of people who have died from all causes during the same time period in the denominator. The resulting percent is actually lower when calculated this way.

2. Taking into account factors that were already calculated earlier. In 1990, the Department of Education reported that school expenditures have more than tripled since 1960. Some education lobbyists produced the figure that education spending has actually gone down on a per pupil basis when you figure in inflation. The problem is, the Department of Education already figured in inflation in their figures (but they left out per pupil figures, see the double exposure graph example below for details). The declining education statistic is the result of inflation being taken into account twice. Compare Factorectomy.

False Positive Conjecture - Tests with very high, but not perfect, accuracy may actually produce more false positives than true positives. Let us suppose that 3% of the population uses illegal drug X. Let us also suppose that a test has been developed that is 95% accurate in determining who has been using drug X. Say 1000 people took this test, 30 of them being users. Since the test is 95% accurate, about 29 of these users get caught (the other one gets a false negative). At the same time, of the 970 remaining, about 48 also test positive, even though they are not users. In other words, 77 people tested positive for use of drug X, but only 29 were true positives. Try this out on AIDS tests that are 99.7% accurate, or for more fun try polygraph (lie detector) tests that are only 80% accurate!
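Here is a minimal sketch of the drug X arithmetic. It assumes that "95% accurate" means the test is right 95% of the time for users and for non-users alike, and it keeps fractional people instead of rounding, so the counts come out as 28.5 and 48.5. The function name `screen` is made up for this illustration.

```python
def screen(population, prevalence, accuracy):
    """Return (true positives, false positives) for a screening test.

    Assumption: the same accuracy applies to users and non-users.
    """
    users = population * prevalence
    non_users = population - users
    true_positives = users * accuracy              # users correctly flagged
    false_positives = non_users * (1 - accuracy)   # non-users wrongly flagged
    return true_positives, false_positives


tp, fp = screen(1000, 0.03, 0.95)
share_real = tp / (tp + fp)    # chance that a positive result is a real user

print(f"true positives:  {tp:.1f}")
print(f"false positives: {fp:.1f}")
print(f"share of positives that are real users: {share_real:.0%}")
```

Swapping in a different prevalence or accuracy (for example, 0.997 or 0.80) shows how quickly the false positives can swamp the true ones when the thing being tested for is rare.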

Filter Counting (Dewdney) - An underestimate of reality caused by the deletion of important data. Similar to Factorectomy. See also Sample Trashing

Gambler's Fallacy - see Law of Averages Thinking.

Gambler's Ruin - If a gambler stays in a casino long enough he will eventually lose all of his money. This is why casinos can pay out millions in winnings and still stay in business. This topic was featured as a Mistake of the Month.
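A rough sketch of why this happens, using the classic gambler's ruin formula for repeated one-unit, even-money bets. The scenario is an assumption for illustration: a gambler starts with 100 units and plays until he either doubles his money or goes broke, once with a fair coin and once with an American roulette bet on red (18 winning pockets out of 38). The function name is hypothetical.

```python
def chance_of_reaching_goal(stake, goal, p_win):
    """Probability of growing `stake` to `goal` before hitting zero,
    betting one unit at even money with win probability `p_win`."""
    if p_win == 0.5:
        return stake / goal                    # fair game: proportional to the stake
    ratio = (1 - p_win) / p_win                # odds against winning a single bet
    return (1 - ratio ** stake) / (1 - ratio ** goal)


fair = chance_of_reaching_goal(100, 200, 0.5)
roulette = chance_of_reaching_goal(100, 200, 18 / 38)

print(f"fair coin:      {fair:.4f}")
print(f"roulette (red): {roulette:.6f}")
```

Even a small house edge on each bet makes doubling up before going broke very unlikely, and raising the goal (staying longer) pushes the chance closer to zero.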

Graph Errors - Statistical graphs can be seen everywhere. They are based on the idea that a picture is worth a thousand words. Unfortunately, pictures can lie as well. The Readers Submitted Examples page has more on this topic. Here are four common types of misleading graphs in the business:

* The Magic Graph - A chart or graph that uses an optical illusion to make the information seem more significant.

Two magic graphs and a zoom graph.

* The Zoom Graph - A graph that does not start at (0,0) thus making small changes seem big. Useful in the study of trends, but often passed off as actual changes. Such graphs are very popular in financial circles. The graph below comes from an advertisement for Creative Labs Graphic Blaster3D.

Graphics Blaster Ad

* The Double Exposure Graph- Two graphs plotted on top of one another in order to make some kind of comparison. Again occasionally useful, but easily misleading. Note below how the combined SAT side is a zoom graph. This graph was developed by the U.S. Department of Education.

SAT vs Expenditures 1

Here is a more accurate portrayal of reality. Note that since SAT scores go from 400 to 1600 points it is best to demonstrate this range. Also Education Spending has been adjusted to a per pupil counting showing spending has doubled instead of tripled. The connection between school spending and falling SAT scores cannot really be made.

SAT vs Expenditures 2

* The Object D'art Graph - A graph that lacks a standardized y-axis in order to hide what the graph is really saying. It can be a pretty graph suitable for framing, but it is in fact absolutely useless. To the right is an Object D'art Graph actually used in a stock report. Note that the 1970 figure is actually a negative number.

Innumeracy (From the similarly titled book by John Allen Paulos)- 1. The inability to deal with simple mathematical concepts. 2. The math version of illiteracy. This topic was featured as a Mistake of the Month.

The "Kevin Bacon" Game (a Hollywood trivia game) - During your life you have probably gotten to know at least a thousand people. Each of those thousand people has gotten to know a thousand people as well. Taking into account overlap, at least 100,000 people know someone who knows you, and possibly 10 million people know someone who knows someone who knows you. By this, it follows that everyone is connected to everyone else by no more than six degrees of separation. A movie with that title has inspired a game called the Kevin Bacon game. Take any famous person, and you should be able to connect them to actor Kevin Bacon in less than six moves. OK, let's try Elvis Presley, who was married to Priscilla Presley, who was in Naked Gun with Leslie Nielsen, who was in Airplane! with Lloyd Bridges, who was in Blown Away with Tommy Lee Jones, who was in JFK with Kevin Bacon. So, the next time the complete stranger sitting next to you turns out to have gone to high school with your brother-in-law's boss, it really is not that big of a coincidence. Compare Post Occurrence Miracle. The Readers Submitted Examples page has more on this topic.

The Law of Averages thinking - A belief by gamblers that the more often you win or lose the more likely your luck will change on the next try. If you flip a coin and it lands heads 10 times in a row, what are the odds that it will land heads on the 11th try? Answer 1:1. What about after 100 times in a row? Again it is 1:1. The odds are the same on each toss! The Readers Submitted Examples page has more on this topic.
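For anyone who doubts it, the coin claim can be checked by brute force. This sketch assumes a fair coin, lists every equally likely sequence of flips, keeps only the ones that open with a run of heads, and counts how many of those show heads on the next flip. The function name is made up for this illustration.

```python
from fractions import Fraction
from itertools import product


def p_heads_after_streak(streak):
    """Exact chance of heads on the flip that follows `streak` heads in a row."""
    flips = product("HT", repeat=streak + 1)    # every equally likely sequence
    after_streak = [f for f in flips if all(c == "H" for c in f[:streak])]
    heads_next = [f for f in after_streak if f[-1] == "H"]
    return Fraction(len(heads_next), len(after_streak))


for streak in (1, 5, 10):
    print(f"after {streak} heads in a row: P(heads next) = {p_heads_after_streak(streak)}")
```

The answer is the same fraction no matter how long the streak is, which is the 1:1 odds described above.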

The Law of Zero Return (Dewdney)- Return on Investment = Inflation + Taxes. This topic was featured as a Mistake of the Month.

Loaded Questionnaire - Asking questions in such a way as to make the respondents feel foolish if they do not answer the questions the way the surveyor wants. Example: In 1995, the Corporation for Public Broadcasting surveyed the viewership of TV. The CPB had a certain interest in getting responses that favored their programming format so as to avoid federal budget cutters. One of many example questions:

"A recent study by a psychology professor at a leading university concluded that the amount of violence children see on television has an effect on their likelihood of being aggressive and committing crimes. From what you have seen or heard about this subject, do you agree strongly to that conclusion, agree somewhat, disagree somewhat, or disagree strongly?"

It is difficult to listen to such a question and not agree strongly. A more balanced question would include a contrary opinion, or better yet offer no opinions at all.

Logical Fallacies - If you want to deceive the majority of the people, use some of these in your arguments. (Note that these are used consistently by politicians, lawyers, and advertisers). The Readers Submitted Examples page has more on this topic.

* Fallacy of Ambiguity - Occurs when a word or phrase is used with one meaning in one premise, and with another meaning in another premise or conclusion. Example: People should do what is right + people have the right to disregard good advice = People should disregard good advice.

* Fallacy of Attacking the Messenger - Attempting to discredit a message by discrediting the messenger. Example: He is innocent, the cop who arrested him is a racist and therefore must have planted that glove.

* Fallacy of Composition - Confuses properties of a whole with properties of the parts. Example: This is a good class = Every student in this class is good.

* Fallacy of Emotion - An appeal to popular passions such as pity, fear, brute force, snob appeal, vanity, or some other emotion. Example: You would look sexy behind the wheel of this new $50,000 sports car.

* Fallacy of Experts - Quoting an expert in one field on a matter of another unrelated field. Example: Meryl Streep testifying before Congress on the homeless because she played one in a movie once. Advertisers with celebrity endorsements sell more than with expert endorsements.

* Fallacy of the Complex Question - A loaded question designed to make the person answering look bad no matter how they answer. Examples: Have you stopped beating your dog? How does mind reading work? Are the people in your town still rude to visitors? Will that be cash or charge?

* Fallacy of Non Sequitur (Literally "it does not follow") - conclusions that do not follow from their premises. Example: John is sick + John needs some rest = I need some rest.

* Fallacy of False Cause - Example: I walked under a ladder + I was hit by a car = Walking under ladders brings bad luck. Most superstitions start out this way.

* Fallacy of Popularity - Example: The majority of Americans believe UFOs are real = Space aliens must visit America. Just because something is popular does not mean it is correct. It is best to follow that old catch phrase, "If everyone jumped off a cliff, would you?"

* Fallacy of Innovation - If something is "new" or "different" then it is often assumed that it is better than the old and ordinary, which is rarely the case. Note that the word "New" is the word most often mentioned in advertisements, but only occasionally is it followed by the words "and Improved".

"Looks Like" Geometry - The tendency to find significance in insignificant geometric patterns. The best example is the "Face on Mars", that has recently been rephotographed. To the right are three images to compare, the first is the original "face", the second is the face as photographed April 5, 1998, the third is a negative of the second to change the angle of the light. Click on the image to see an enlargement. Martin Gardner has written an excellent article on this and other examples. This topic was featured as a Mistake of the Month. Face on mars comparison

Lottery - see Sucker Bet.

Multiple Comparisons Fallacy - (statistical epidemiology) Risk factor studies have a 5% chance of being too high and a 5% chance of being too low. Let's say a pre-election poll of 1,000 people shows candidate Smith with an 8% lead over Jones. If we did instead 20 polls of 50 people each, chances are at least one of those studies would show a slight lead in Jones' favor. In 1992, a Swedish study on the effects of power line radiation showed that children living close to power lines have a nearly fourfold risk of childhood leukemia. But, upon closer examination, the Swedes did nearly 800 different studies in one. Other studies in the same report actually show a decrease of childhood leukemia from power line radiation. Studies with this many comparisons are not good for concrete results; they are best used to point out directions where future research should be done. (Frontline, "Currents of Fear")
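A rough sketch of the polling example, under assumptions that are not in the text above: there are only two candidates, nobody is undecided, and Smith's true support is 54% to Jones's 46% (an 8% lead). It computes the exact binomial chance that a single 50-person poll puts Jones ahead, then the chance that at least one of 20 such polls does. The function names are made up for this illustration.

```python
from math import comb


def p_leader_trails(sample_size, support):
    """Chance the true leader gets fewer than half the votes in one poll."""
    return sum(
        comb(sample_size, k) * support**k * (1 - support) ** (sample_size - k)
        for k in range(sample_size + 1)
        if 2 * k < sample_size                  # leader strictly behind
    )


def p_at_least_one(p_single, trials):
    """Chance that an event with probability `p_single` shows up in any of `trials` tries."""
    return 1 - (1 - p_single) ** trials


p_single = p_leader_trails(50, 0.54)            # assumption: Smith really has 54%
p_any = p_at_least_one(p_single, 20)

print(f"one poll of 50 shows Jones ahead:        {p_single:.2f}")
print(f"at least one of 20 polls shows it:       {p_any:.3f}")
print(f"a 5% fluke somewhere in 20 comparisons:  {p_at_least_one(0.05, 20):.2f}")
```

The same `p_at_least_one` calculation with hundreds of comparisons makes a striking result somewhere in the report close to certain, whether or not anything real is going on.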

Num (Dewdney)- A reported number with too few significant digits to be useful. These usually are round numbers like 1000 or 100,000. The term "six-figured salary" is an example, meaning any number between 100,000 and 999,999. Opposite of Dramadigit.

Number Inflation - 1. A gross overestimation or underestimation of reality. 2. A reporting of a statistic that is just not true. I used to call it "The Law of Five Times Reality" because of the tendency of political advocacy groups to over inflate their numbers by five times what is reality. Take the following examples:

Situation                   Reported                  Actual
AIDS Cases (1990)           1 million (ACT UP!)       200,000 (CDC)
Homeless (1990)             3 million (Mitch Snyder)  600,000 (Urban Institute)
Right to Life March (1989)  100,000 (Right to Life)   20,000 (Federal Park Police)
Spousal Abuse (1994)        6 million cases (NOW)     1.2 million cases (FBI)
Homosexuals (%)             10% (The Advocate)        2% (1993 sex survey)

It does not necessarily have to be five times reality. Besides, reality is defined by the person doing the reporting. Some of these groups try to justify their numbers by taking into account factors that should not be included (see Factoritis), or try to include numbers for which we do not have enough information to make a good estimate (see Statistical Brick Wall). The Readers Submitted Examples page has more on this topic.

Number Numbness (Hofstadter) - The inability to fathom, compare, or appreciate really big numbers or really small numbers, such as the difference between a million, a billion, and a trillion. Politicians seem to make this mistake the most, noting the need to cut $164 million from the National Endowment for the Arts, while insisting that it is important to spend 'a mere $60 billion' on a missile defense system. The Readers Submitted Examples page has more on this topic. This topic was featured as a Mistake of the Month.

Opportunity Cost Error (expression from economics) - The failure to consider the real cost of doing something. Example: Spending a dollar's worth of gas to drive across town to save fifty cents on soap. Another example is education. According to census figures, the average high school graduate earns $7000 more per year than the average high school dropout. Over a working life span of 40 years, that means a high school diploma is worth $280,000. That is the opportunity cost of dropping out of high school. The Readers Submitted Examples page has more on this topic.

Percentage Pumping Formula (Dewdney) - A non-standard percentage formula designed to increase the advertised percentage of discount or improvement. A normal discount formula is SAVINGS/NORMAL COST; a Percentage Pumping formula is SAVINGS/NON-SAVINGS. A 33% normal discount can become a 50% pumped discount. Advertisers use this to create savings as high as 200%, which makes no sense at all. The Readers Submitted Examples page has more on this topic.
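The two formulas side by side, as a minimal sketch. The prices are made up for illustration, and "non-savings" is taken to mean the sale price actually paid.

```python
def normal_discount(normal_cost, sale_price):
    """SAVINGS / NORMAL COST"""
    return (normal_cost - sale_price) / normal_cost


def pumped_discount(normal_cost, sale_price):
    """SAVINGS / NON-SAVINGS, where non-savings is the price actually paid."""
    return (normal_cost - sale_price) / sale_price


# Hypothetical prices: (normal cost, sale price)
for normal_cost, sale_price in [(150, 100), (300, 100)]:
    print(
        f"${normal_cost} marked down to ${sale_price}: "
        f"normal {normal_discount(normal_cost, sale_price):.0%}, "
        f"pumped {pumped_discount(normal_cost, sale_price):.0%}"
    )
```

A normal discount can never reach 100% unless the item is free, while the pumped version has no ceiling at all.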

Post Occurrence Miracle - An unexpected coincidental event realized after the fact. The odds of a particular strange event, say the dream you had last night coming true, may be very small; but the odds of some strange event happening at some time, like a dream coming true on one night during your life, are actually quite good.

Practically Zero Probability - A probability so small that it must be considered zero. Example: The probability of winning Powerball is 0.000000012. A statistician coming across such a number would not hesitate to call it 0 under most circumstances. The Readers Submitted Examples page has more on this topic.

A Rare Scare - A media report of a probable disaster (e.g., death, earthquake, cancer risk from eating apples), where the probability is considerably lower than risks taken every day (e.g., getting in a car wreck on your way to school). The Readers Submitted Examples page has more on this topic. This topic was featured as a Mistake of the Month.

Rare Scaremongering - An attempt to avoid a rare scare that results in either a cost that exceeds the cost if the disaster happened, or increases the risk of a more likely and even worse disaster. Example: In Indonesia, eliminating pesticides with very small cancer risks has increased the population of malaria carrying mosquitoes. The Readers Submitted Examples page has more on this topic.

Ratiocinitis (Dewdney) - A tendency to forget the rules regarding the addition and subtraction of fractions, ratios, and percentages. Example: A store is offering 30% off everything. A fancy dress that normally sells for $100 has a tag saying 20% off. Does this mean the total savings is 50%? No, the actual savings is 44%. 20% off $100 is $80, and 30% off $80 is $56, which is 44% off of the original $100. Question: Will it make a difference if the store takes the 30% off first? Example 2: A man is walking down the street and stumbles on a five dollar bill. Having $10 in his wallet, he says to himself, "Hey, I just increased my money by 50%." Later he discovers that he had a hole in his pocket where his $5 bill was lost, but thinks, "That's OK, first I gained 50% by finding the five dollars, and now I lost 33% by losing the five dollars, so I am still 17% ahead." The Readers Submitted Examples page has more on this topic.
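Both examples can be worked through in a short sketch. The helper name `apply_discounts` is made up for this illustration; the dollar amounts are the ones from the examples above.

```python
def apply_discounts(price, *discounts):
    """Apply successive percentage discounts, each to the already reduced price."""
    for discount in discounts:
        price *= 1 - discount
    return price


# Example 1: stacked discounts multiply, they do not add.
tag_first = apply_discounts(100, 0.20, 0.30)    # 20% tag, then 30% storewide
store_first = apply_discounts(100, 0.30, 0.20)  # same discounts, other order
print(f"20% then 30%: ${tag_first:.2f}  ({1 - tag_first / 100:.0%} off)")
print(f"30% then 20%: ${store_first:.2f}  ({1 - store_first / 100:.0%} off)")

# Example 2: the same $5 is a different percentage of a different base.
start = 10
after_find = start + 5
after_loss = after_find - 5
gain = (after_find - start) / start             # measured against $10
loss = (after_find - after_loss) / after_find   # measured against $15
print(f"gain {gain:.0%}, loss {loss:.0%}, net change ${after_loss - start}")
```

Percentages taken from different bases cannot be added or subtracted directly, which is where both the shopper and the man with the hole in his pocket go wrong.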

Raw Number - The reporting of an 'impressive' number that is meaningless without something to compare it to. Example: Every day, cars in America produce over a billion tons of pollution. Impressive? Yes! But let me state this statistic another way: Every day, cars in America pollute 0.000000001% of our atmosphere. Impressive? Not really. Now, this is a poor way to think about pollution, too. But that is the problem with raw numbers: you have to compare them to something else for them to be meaningful. A better statistic: Cars are responsible for more than half of the carbon monoxide pollution.

Recursive Arguments - see Circular Reasoning

Regression Jinx - You probably have heard of the "sophomore jinx" popular in sports and music: a rookie athlete or musician performs poorly in their second year after an outstanding first year. Believe it or not, there is a mathematical explanation. When a star gains superstardom, it is partly because of talent and partly because of luck, or outside forces the star cannot control. When luck is extremely in the star's favor, it is only a matter of time before that luck changes. Thus the "sophomore jinx" is not a jinx at all, but rather an expected outcome of statistical regression (the tendency of luck to move toward the norm). Alanis Morissette's first CD released in the US sold 15 million copies. Despite critical acclaim, it is highly unlikely her second CD will sell even half that. An article in the March/April 1999 Skeptical Inquirer has more info and examples.
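A small simulation shows the effect. Everything in it is an assumption made for illustration: each performer's yearly result is a fixed "talent" score plus fresh "luck" each year, both drawn from the same bell curve, and the rookie stars are simply the top 100 out of 10,000 in year one. The random seed is fixed so the run is repeatable.

```python
import random

rng = random.Random(0)          # fixed seed so the sketch is repeatable
n = 10_000

# Assumption: performance = talent (fixed) + luck (redrawn every year).
talent = [rng.gauss(0, 1) for _ in range(n)]
year1 = [t + rng.gauss(0, 1) for t in talent]
year2 = [t + rng.gauss(0, 1) for t in talent]

# The "rookie stars": the top 100 performers of year one.
stars = sorted(range(n), key=lambda i: year1[i], reverse=True)[:100]

avg_year1 = sum(year1[i] for i in stars) / len(stars)
avg_year2 = sum(year2[i] for i in stars) / len(stars)

print(f"stars' average in year one: {avg_year1:.2f}")
print(f"same stars in year two:     {avg_year2:.2f}")
```

The stars are still well above the overall average of zero in year two because their talent is real, but they fall back from their first year because the exceptional luck that helped put them on top does not repeat.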

Sample Occulting (Dewdney) - A disregard for an enormous sample, resulting in coincidences seeming "supernatural" and requiring an "occult" explanation when there really isn't a need. Example: An advertisement for a TIME-LIFE book on unexplained phenomena mentions a daughter who touches a hot stove and a mother 3000 miles away feeling pain in her arm at the same instant. Is the mother tele-empathic? Probably not! Consider how many times people touch hot surfaces; consider also how many times older people get mysterious pains. The likelihood of these two things happening at the same time to two people who are related is very good. Now if these events happened consistently to the same two people, and this could be replicated in a scientific experiment, then one might look to the supernatural. See also Post Occurrence Miracle.

Sample Trashing - Throwing out perfectly good data as "unreliable" because it goes against what the statistics are trying to prove. Popular with ESP believers who point to a few studies with positive results, and ignore the majority of the studies with negative results. The Readers Submitted Examples page has more on this topic.

Selective Endpoints - The reporting or graphing of a change in naturally random functions (economic indicators, weather conditions, stock prices, natural disasters, etc.), by comparing an unusual low to an unusual high. Example: During the 1992 presidential race, the Democrats pointed to how the economy was not doing as well as it was five years ago (1987 was an unusually good year economically), while Republicans pointed out that the economy was better than it was 11 years ago (1981 was an unusually bad year economically).

Self Selecting Sample - The assumption that a group willing to take a survey represents a random sample. The Hite Report (1976) on female sexual attitudes was based on surveys of 3019 women. Unfortunately, Shere Hite distributed over 100,000 surveys, so all the report measured was the sexual attitudes of the 3% who were willing to fill out the survey. Another example can be found in 900-number polls on TV shows. These represent only the people who feel strongly enough to pay 75 cents a call, not the real population.

Self-Fulfilling Prophecy - A prediction that comes true because it was made. When an 'expert' predicts that a certain stock will go up in price, the increased demand for that stock from people following the expert's advice pushes the price up. When your astrological forecast says you will have a good (or bad) day, it gives you a positive (or negative) outlook to begin with, and you end up actually having a good (or bad) day. This topic was featured as a Mistake of the Month.

"Shooting the Barn" Statistics - A story is told about a Texas sharpshooter who shot his gun into the side of a barn 30 times, then painted a circle around where most of the bullets landed, calling that his target. Collecting statistical data without first knowing what you are looking for results in bad statistics. This is popular among business managers who want to show their investors how they are doing: they set their productivity goals after already reaching those goals. Another variation is called "Spotlight Gag" Statistics, after a famous gag by Red Skelton in which the spotlight always shines where he was standing a moment before. (A Simpsons episode had Krusty the Clown do the same bit.) One place where I worked set its productivity goals based on the previous month's performance. Each employee was expected to reach these goals for the next month. Natural seasonal fluctuation of demand made the following month's goals either too easy or too difficult to achieve. The problem is that comparing past performance to present performance is impossible without a standard measure. Varying productivity goals so drastically made the percent-of-goal statistics invalid. You cannot measure height with a yardstick made of Silly Putty. See Cancer Cluster Syndrome, Multiple Comparisons Fallacy.

Simpson's Paradox - A condition in statistics in which a "small" group seems to perform better in individual comparisons than a "big" group, but when overall performance is compared, the "big" group is better. For example, ACME Manufacturing is hiring two groups for its new division. Since ACME has had trouble with sex discrimination in the past, it does its bit to avoid it this time. Group 1 has 2 openings; 5 men and 3 women apply, and ACME hires 1 man and 1 woman. This way 33% of the female applicants get hired compared to only 20% of the male applicants. Group 2 has 15 openings; 20 men and 3 women apply, and ACME hires 13 men and 2 women. This way 67% of the female applicants get hired compared to 65% of the male applicants. Who could argue with that? A week later ACME gets hit with a sex discrimination suit by one of the women who did not get hired, because overall 56% of the male applicants (14 of 25) got hired compared to 50% of the female applicants (3 of 6). See Ratiocinitis.
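The ACME figures are easy to check by machine. The short sketch below recomputes the hiring rates per group and then pooled; the numbers come straight from the example, while the data layout and the helper name pooled are just one convenient way to arrange them.

```python
# (hired, applicants) for each sex in each group, from the ACME example.
groups = {
    "group 1": {"men": (1, 5), "women": (1, 3)},
    "group 2": {"men": (13, 20), "women": (2, 3)},
}

def pooled(sex):
    """Total (hired, applicants) for one sex across all groups."""
    hired = sum(g[sex][0] for g in groups.values())
    applicants = sum(g[sex][1] for g in groups.values())
    return hired, applicants

# Within each group, women are hired at the higher rate...
for name, by_sex in groups.items():
    for sex, (hired, applicants) in by_sex.items():
        print(f"{name} {sex:<6} {hired}/{applicants} = {hired / applicants:.0%}")

# ...but pooling the groups reverses the comparison.
for sex in ("men", "women"):
    hired, applicants = pooled(sex)
    print(f"overall {sex:<6} {hired}/{applicants} = {hired / applicants:.0%}")
```

The reversal happens because most of the men applied to group 2, where nearly everyone was hired, while the women were split evenly between the easy group and the hard one.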

Statistical Brick Wall - A number that cannot be verified, or accurately estimated, because the statistical data does not exist. A good example is the statistic on species extinction. Some biologists have estimated that over 10,000 species go extinct every year. Actual verified extinctions are around one species a year, including insects. This highly reported statistic tries to take into account the number of undiscovered species that go extinct, a number that is impossible to calculate. Sure, there may be species that go extinct without anyone noticing, but the statistical data does not exist because it is impossible to obtain. It could be anywhere from two to 10,000. No one can prove otherwise. Other major victims of the Statistical Brick Wall are studies that involve "ruling out" all possible factors. In 1994, a study was done to show how dangerous particulate pollution is. The result was that people who live in cities with high particulate pollution shorten their average life span by about 2 years. The study compared life spans in non-polluted regions (rural small towns) with highly polluted regions (big cities), and then had to "rule out" other factors that might contribute to shorter life spans. The researchers eliminated dozens of factors for which statistics exist (e.g. violent crime), but they were unable to rule out dozens of other factors (e.g. lifestyle differences, eating habits, exercise) because these statistics do not exist. In such cases statisticians either assume all remaining factors add up to zero, or they make an educated guess based on trends. While this does not invalidate the study, it makes such studies less than reliable. The Readers Submitted Examples page has more on this topic. This topic was featured as a Mistake of the Month.

Statistical Rash - A judgment based on statistical data that does not take into account all of the factors that cause the data to result as it did. Example: Here are some actual statistics on accident rates based on the speed at which the cars were driving:

20 mph or less 2.0%
20 to 30 mph 29.7%
30 to 40 mph 30.4%
40 to 50 mph 16.5%
50 to 60 mph 19.2%
over 60 mph 2.2%

It would seem to the casual observer that it is safer to speed than to travel at the speed limit. In fact, the reason that only 2.2% of the accidents happen at over 60 mph is that at any given time only about 2.2% of the cars on the road are traveling at over 60 mph. On their own, these statistics do not say anything about speed and accident risk, only about how fast the average car is traveling. (Example from Marilyn vos Savant) The Readers Submitted Examples page has more on this topic.
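To say anything about risk, each accident share has to be divided by the share of traffic in the same speed band. The sketch below does that for the table above. The accident shares are the ones listed; both traffic mixes are hypothetical and exist only to show how much the conclusion depends on the missing denominator.

```python
# Share of all accidents by speed band, copied from the table (percent).
accident_share = {
    "20 mph or less": 2.0,
    "20 to 30 mph": 29.7,
    "30 to 40 mph": 30.4,
    "40 to 50 mph": 16.5,
    "50 to 60 mph": 19.2,
    "over 60 mph": 2.2,
}

# HYPOTHETICAL traffic mixes (percent of cars on the road in each band).
# Mix A simply mirrors the accident table; mix B is invented for contrast.
# Neither is measured data.
traffic_mix_a = dict(accident_share)
traffic_mix_b = {
    "20 mph or less": 5.0,
    "20 to 30 mph": 30.0,
    "30 to 40 mph": 32.0,
    "40 to 50 mph": 16.0,
    "50 to 60 mph": 16.0,
    "over 60 mph": 1.0,
}

def relative_risk(accidents, traffic):
    """Accident share divided by traffic share; 1.0 means average risk."""
    return {band: accidents[band] / traffic[band] for band in accidents}

for label, mix in (("mix A", traffic_mix_a), ("mix B", traffic_mix_b)):
    print(label)
    for band, risk in relative_risk(accident_share, mix).items():
        print(f"  {band:<15} {risk:.2f}")
```

Under mix A every band comes out at average risk, and under mix B the fastest band comes out at more than twice the average, all from the same accident table.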

Sucker Bet (Dewdney) - A gambling wager in which your expected return is significantly lower than the wager. Let's say that LOTTO (the Arizona Lottery that chooses 6 out of 42 numbers) is paying out $2 million this week and a lottery ticket costs only $1. What is your expected return? To calculate expected return, use the following formula: EXPECTED RETURN = POTENTIAL WINNING * PROBABILITY OF WINNING - POTENTIAL COST * PROBABILITY OF LOSING. Since the probability of winning LOTTO is 0.00000019 (about 1 in 5.2 million), the probability of losing is 1 - 0.00000019, or 0.99999981. Therefore, EXPECTED RETURN = 2,000,000 * 0.00000019 - 1 * 0.99999981 = -0.62. In other words, for every dollar you put into the lottery, you can expect to lose about 62 cents on average. The Readers Submitted Examples page has more on this topic. This topic was featured as a Mistake of the Month.
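The expected-return formula is short enough to check directly. The sketch below plugs in the figures from the LOTTO example; the function names and the break-even helper are additions for illustration, and the calculation ignores taxes and the chance of splitting the jackpot.

```python
def expected_return(prize, p_win, ticket_cost):
    """Expected return per ticket, using the formula quoted above."""
    return prize * p_win - ticket_cost * (1 - p_win)

def break_even_prize(p_win, ticket_cost):
    """Prize at which the expected return is exactly zero (illustrative helper)."""
    return ticket_cost * (1 - p_win) / p_win

p_win = 0.00000019  # probability of winning, as given in the text
value = expected_return(prize=2_000_000, p_win=p_win, ticket_cost=1.00)
print(f"expected return per $1 ticket: {value:.2f}")
print(f"break-even jackpot: ${break_even_prize(p_win, 1.00):,.0f}")
```

By this formula the jackpot would have to exceed roughly $5 million before a ticket stopped being a sucker bet.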

Technical Analysis (a stock market term) - The attempt to look for numerical trends in a random function. The stock market used to be filled with technical analysts deciding what to buy and sell, until it was decided that their success rate is no better than chance. Now technical stock analysis is virtually non-existent. The Readers Submitted Examples page has more on this topic.

Texas Sharpshooter, The tale of the - see "Shooting the Barn" Statistics


Sources and Further Reading:

Dewdney, A. K., 200% of Nothing: An Eye-Opening Tour through the Twists and Turns of Math Abuse and Innumeracy, John Wiley and Sons, New York, 1993.

Paulos, John Allen, Innumeracy: Mathematical Illiteracy and its Consequences, Hill and Wang, New York, 1988.

Paulos, John Allen, A Mathematician Reads the Newspaper, Basic Books, New York, 1995.

Hofstadter, Douglas, Metamagical Themas, Basic Books, New York, 1984.

Capaldi, Nicholas, The Art of Deception: An Introduction to Critical Thinking, Prometheus Books, Buffalo, New York, 1987.

Copyright ©1995-2001 by Paul Cox
