Six Sigma Online

Statistical Correlation Does Not Always Prove Cause

Black Belt or not, common sense is quite rare

After my last column citing some really bizarre flaws in how our brains perceive reality, I thought I might cover some flaws in logic that are applicable in the world of quality. So, basically, even if our brains are working correctly, we can still send our Black Belts off on false trails trying to solve problems, thus offering more proof (as if we need it) for Voltaire’s observation that “common sense is quite rare.”

I ran across this very cool site organizing logical fallacies into a taxonomy. (OK, so the Internet empowers my nerdosity….) Now as you know, our work in quality is not pure logic (i.e., what is consistent) but science (i.e., what works). Ptolemy was logical when he said the Earth does not move because if it did:

“… all those things that were not at rest on the Earth would seem to have a movement contrary to it, and never would a cloud be seen to move toward the east nor anything else that flew or was thrown into the air. For the Earth would always outstrip them in its eastward motion, so that all other bodies would seem to be left behind and to move toward the west.” (Ptolemy in Almagest)

A statement of perfect logical consistency, but it is incorrect due to a number of mistaken hidden premises. However, it took about 1,600 years before somebody actually tested that statement.

If you think about it, a lot of what we do is try to create a model of a production process that is “close enough” to reality to be useful, which is all that applied science tries to do.

However, logic is a necessary (if not sufficient) requirement of science, so it is worthwhile to take a look at some common, and tricky, logical fallacies to avoid wasting time and money. Following is an insidious favorite.

Cum hoc, ergo propter hoc—or correlation is not causation

The Latin translates to “with this, therefore because of this.” You probably heard this one in your first statistics class as “correlation is not causation,” and it is treated humorously in this comic strip in figure 1:

Figure 1: Randall Munroe’s webcomic at http://xkcd.com

Just because deaths due to drowning and ice-cream sales are strongly correlated does not mean that banning ice cream at the beach will prevent drowning. So even when we find a significant correlation or association in a statistical analysis, we can’t assume a causal link.

When we see a significant correlation, this could be due to:

  • A true causal relationship between the two variables, in the direction we think. For example, ice-cream sales do cause swimming deaths.
  • A true causal relationship exists, but in the opposite direction. For example, swimming deaths cause ice-cream sales. (This might be true in some horrifically voyeuristic society that buys snacks to better enjoy the spectacle of a drowning. Come to think of it, with the popularity of reality shows, maybe this isn’t as unlikely as I thought….)
  • Another factor that the two hold in common is the real causal factor. When it is warm, more people go out swimming, and thus a higher number drown. When it is warm, more ice cream is sold. Thus, when it is warm, both drownings and ice cream sales go up, and when it is cold they both go down.
  • Alpha error. The statistical test concluded a relationship exists where none really does, due to chance and chance alone or, in the absence of a statistical process, just plain coincidence.

But even though we all know this, it is still a seductive error to make.
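
The third possibility in the list above, a shared factor driving both variables, can be sketched with a small simulation. This is only an illustration under made-up assumptions: the temperature range, the coefficients, and the noise levels are all hypothetical, and the helper names `pearson`, `residuals`, and `simulate` are invented for this sketch. Ice-cream sales and drownings each depend on temperature and on nothing else, yet they come out strongly correlated; once the temperature trend is removed from both, the correlation should largely disappear.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate(days=2000, seed=1):
    rng = random.Random(seed)
    # Hypothetical daily temperatures in degrees Celsius.
    temp = [rng.uniform(0, 35) for _ in range(days)]
    # Both outcomes depend on temperature only; neither depends on the other.
    ice_cream = [20 + 5 * t + rng.gauss(0, 15) for t in temp]
    drownings = [1 + 0.2 * t + rng.gauss(0, 1) for t in temp]
    raw = pearson(ice_cream, drownings)
    # Correlate what remains once the temperature trend is taken out of each.
    adjusted = pearson(residuals(ice_cream, temp), residuals(drownings, temp))
    return raw, adjusted


raw_r, adjusted_r = simulate()
print(f"raw correlation: {raw_r:.2f}")
print(f"correlation after removing temperature: {adjusted_r:.2f}")
```

Note that the adjustment only works here because the simulation knows temperature is the common factor. With real data you rarely know which hidden factor to adjust for, which is exactly why the correlation alone cannot settle the question.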

While helping a client, we were looking at the recent historical production rate of a steel mill, stratified by operator, and saw something like this in figure 2:

Figure 2: Mill speed by operator

Properly running the analysis of variance (ANOVA), we found that these operators are, in fact, significantly different from each other in the speed at which they run the mill. The supervisor I was working with was all ready to post these results on the daily management board, with the stated intent to “create a little competition,” which would presumably ratchet up the mill production. He was also preparing to kick Operator 3’s booty, saying that he ought to know better, since he had been around for a while. In fact, the supervisor was kind of surprised because Operator 3 was “the one everyone went to with questions on the mill. He probably is getting lazy and just needs a wake-up call.”

Although that hypothesis is consistent with the data, it is not the only hypothesis that is (keeping in mind that correlation is not causation), so I suggested that we go and have a nonconfrontational talk with the operator.

What we found was that when the other operators had a lot of material to run that was difficult, they set it aside for Operator 3 when he came in. Of all the operators, he could run the most difficult stuff the fastest. In this case, there was a true relationship, but it ran in the opposite direction: It was not that the mill was slower because of the operator, but that the operator was slower due to what he was given to run on the mill.

Protecting against cum hoc, ergo propter hoc

You can avoid making this error. Never assume any type of relationship is causal until you have real proof that it is. The only way to do that is by designing a true experiment. Correlations of existing data during the early stages of problem solving provide one valuable input into our experimental factor-selection process, but are not sufficient alone to determine causality.

In our mill example above, we were just looking at existing data to get a picture of the process to identify factors for an experiment to increase production. If we had really intended to determine causality, we might have assigned the different operators similar things to run at various times and compared those speeds.
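
The contrast between the existing data and such an experiment can be sketched in a few lines. Everything in this sketch is hypothetical: the skill levels, the speed penalty for difficult material, the noise, and the proportions of difficult material are invented for illustration and are not the client's data. The function names `run_speed` and `mean_speeds` are likewise made up. The only thing carried over from the story is the structure: Operator 3 is the fastest on any given material, but in day-to-day production he is handed most of the difficult material.

```python
import random
from statistics import mean

# Hypothetical numbers, chosen only for illustration.
SKILL = {1: 100.0, 2: 100.0, 3: 110.0, 4: 100.0}  # speed on easy material
DIFFICULT_PENALTY = 30.0                          # speed lost on difficult material


def run_speed(rng, operator, difficult):
    """Speed of one run: skill, minus a penalty if difficult, plus noise."""
    penalty = DIFFICULT_PENALTY if difficult else 0.0
    return SKILL[operator] - penalty + rng.gauss(0, 3)


def mean_speeds(p_difficult, runs=500, seed=7):
    """Mean speed per operator.

    p_difficult[op] is the chance that any given run by op is on
    difficult material.
    """
    rng = random.Random(seed)
    return {
        op: mean(run_speed(rng, op, rng.random() < p_difficult[op])
                 for _ in range(runs))
        for op in SKILL
    }


# Existing data: difficult material is set aside for Operator 3.
observed = mean_speeds({1: 0.1, 2: 0.1, 3: 0.8, 4: 0.1})
# Designed experiment: every operator gets the same mix of material.
designed = mean_speeds({1: 0.5, 2: 0.5, 3: 0.5, 4: 0.5})

for label, result in (("observed", observed), ("designed", designed)):
    print(label, {op: round(speed, 1) for op, speed in result.items()})
```

In the simulated existing data Operator 3 looks like the slowest operator, while under the balanced assignment he comes out the fastest. The data did not change its mind about him; the assignment of material did.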

I also want you to notice that the choice of statistical technique employed (e.g., measures of correlation, measures of association, linear or nonlinear regression, or ANOVA) does not protect you from this error. Only you as the human brain involved in the process can do so.

Hopefully, you now feel better about the role of the human brain in the science of quality than you did after my last column.

 

This article was originally published in Quality Digest.
