Letting You In on a Little Secret (MSA)
 Created: 30 November 2009
 Written by Steven Ouellette
You know how sometimes you think everyone knows a secret that they haven’t let you in on? Well, I had the opposite happen to me the other day. I assumed everyone knew the purpose of measurement system analysis (MSA), a.k.a. gauge repeatability and reproducibility studies, but I found out that a number of people have a completely mistaken impression of what these studies are for, much less how to do them correctly. So I thought I would give away (free of charge) articles that explain the basics of MSA, as well as a cool MSA spreadsheet to help you learn how to do the studies, just because that’s the kind of guy I am. Selfless. And humble. Yep.
OK, so it turns out that there is a lot going on in an MSA. I'm only going to cover the basics, but if you are willing to learn repeated measures analysis of variance (ANOVA), there is a ton more you can do with data such as these. Still, there is a lot you can tell even with just statistical process control and some basic stats built into a spreadsheet.
So what is measurement? I know it sounds weird, but think about it for a moment. Measurement is the process that turns the event in which we are interested into numbers we can analyze, known as data. Because we use data to make business decisions, we had better be able to trust our measurement process. It would be a sad day indeed to work on solving a problem with our product only to find out, six months later, that the problem was the measuring device going out of whack, especially if you scrapped or, worse, passed a product based on the bogus readings. So the end result of measurement system analysis is quantifying your ability to make the right decision about product conformance to specification. This is an important point—I’ll come back to this in the next article.
The measurement process can be affected by a number of variables: the procedures used, the standard we measure against, the equipment employed, and in many cases, the operator taking the measurement; and all of that is affected by ambient environmental characteristics. The output is a measurement that is, hopefully, somewhat related back to what we wanted to measure in the first place.
Actually, with all those potential sources of variability, I am always amazed that we can measure anything with any degree of precision.
There are three phases, or studies, each of which establishes a different aspect of whether our measurement device can be trusted to make conformance-to-spec decisions.
Phase I—potential study
 Does the measurement device even stand a chance of being useful in the following terms?
 Repeatability. How much variability is there in the same person measuring the same thing?
 Reproducibility. How much variability is there across operators measuring the same thing?
 Your ability to make good decisions with it. Can we rely on the gauge to help us make a determination of our product’s conformance to specification?
I tell my students that the results of the potential study determine whether you call the gauge salesman back. The potential study is a very quick study, so if we can’t prove the gauge’s worth there, it is only going to get worse with time, so drop it now. If we do pass, it is only showing the potential of the gauge, not its ability in the process and certainly not its stability through time. To get a better feel for that, we perform a:
Phase II—short-term study
 Can we demonstrate the usefulness of the gauge over more parts and a bit longer time? With this, we can assess a bit more thoroughly the:
 Repeatability
 Reproducibility
 Ability to make good decisions
The short-term study gives us a better idea of how the measurement system might perform in the process. We get a hint of stability through time (nothing rigorous, though) and more of an idea of how part variability might come into play. The results of this study might determine if you are going to consider buying the gauge. But the short-term study does not run long enough to assess stability through time. Just because the gauge passes a short-term study today does not mean that it is going to continue to do so. It would be a shame to pass or fail your product based on a gauge that passed a short-term study, but then drifted around randomly during the course of a week or so. (Been there, done that, got the T-shirt, never wanna do that again.) To protect ourselves, we need to continually monitor the gauge performance over time. To do that, we perform a:
Phase III—long-term study
 Does the measurement system show stability through time in the following?
 Repeatability
 Reproducibility
 Your ability to make good decisions
How long do you do a long-term study? Only as long as you use the gauge to make important decisions. Even a great measurement system can go cuckoo, so you balance the cost of performing the measurements for the ongoing study against the losses you might incur if the measurements are “totally whack” (to use the vernacular) and you don’t know about it.
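The article does not say which tool to use for this ongoing monitoring, so the following is only a sketch of one common choice: an individuals control chart built from repeated readings of a single reference sample that is kept aside. The readings, the function names, and the idea of a dedicated reference sample are all assumptions made for illustration. The chart tracks drift in the average reading only, so it is not a full long-term study of repeatability and reproducibility.

```python
from statistics import mean


def individuals_chart_limits(baseline):
    """Return (lower, center, upper) limits for an individuals (X) chart."""
    center = mean(baseline)
    # Moving range: absolute difference between consecutive readings.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 is the usual individuals-chart constant (3 / 1.128).
    half_width = 2.66 * mr_bar
    return center - half_width, center, center + half_width


def out_of_limits(readings, limits):
    """Return the readings that fall outside the control limits."""
    lower, _, upper = limits
    return [x for x in readings if x < lower or x > upper]


# Hypothetical baseline: one reference sample measured once per shift.
baseline = [50.1, 49.9, 50.0, 50.2, 49.8, 50.1, 50.0, 49.9, 50.1, 50.0]
limits = individuals_chart_limits(baseline)

# Hypothetical later readings of the same reference sample.
new_readings = [50.0, 50.9]
flagged = out_of_limits(new_readings, limits)
print("limits:", limits)
print("readings to investigate:", flagged)
```

A reading outside the limits is a signal to investigate the gauge before trusting further product measurements. How often to take the reference reading is the cost-versus-loss trade-off described above.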
Let’s talk about measurement error. This is the error that you would get if you measured the same exact thing over and over, maybe even across multiple operators. Here is a picture of what you might see if you did such a crazy thing:
LSL is the lower spec limit, and USL the upper spec limit (these will become important later). The blue curve would be the frequency of different readings that you would get on your measurement system. It turns out that measurement system error really does tend to be normal fairly frequently, at least within one system; readings pooled across multiple systems can easily be multimodal. The standard deviation of the measurement error (σe) would be approximated by the square root of the sum of the variances due to repeatability and reproducibility. This standard deviation relates to the precision: how much spread around the average your gauge produces. If you happen to know the true value of the thing you keep measuring, and the average of the readings is different from that value, the gauge is biased. The smaller the bias, the higher the accuracy.
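To make the two ideas concrete, here is a minimal sketch of combining the repeatability and reproducibility variances into σe, and of estimating bias against a known true value. The function names and all the numbers are hypothetical; they are not taken from the case study.

```python
import math
from statistics import mean


def measurement_error_sd(sd_repeatability, sd_reproducibility):
    """Square root of the sum of the two variances (precision)."""
    return math.sqrt(sd_repeatability ** 2 + sd_reproducibility ** 2)


def estimate_bias(readings, true_value):
    """Average reading minus the known true value (accuracy)."""
    return mean(readings) - true_value


def correct_for_bias(reading, bias):
    """Remove a known bias from a single reading."""
    return reading - bias


# Hypothetical standard deviations, in the gauge's measurement units.
sigma_e = measurement_error_sd(0.6, 0.8)

# Hypothetical repeated readings of a standard whose true value is 70.0.
readings_of_standard = [70.4, 70.6, 70.5, 70.3, 70.7]
bias = estimate_bias(readings_of_standard, 70.0)

print("sigma_e:", sigma_e)
print("bias:", bias)
print("corrected reading:", correct_for_bias(71.2, bias))
```

Note that the variances add, not the standard deviations, so the larger of the two sources dominates σe.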
Now remember, this is not the distribution of your product or process measure, this is the distribution you get with your measurement system when you are measuring exactly the same thing over and over. I admit that would be a boring job.
So let’s see it in action. You are considering purchasing a new hardness measuring device and your vendor has set it up for your potential study, telling you this gauge will do everything except your dishes. You are only going to use this device in one process measuring one nominal hardness, so you randomly select 10 samples from the process. (Note: We want to make sure that we exercise the measurement device at all levels where we are going to use it, so if we had more than one nominal hardness, we would make sure to get samples from each; but here we only have one.) We give the samples an identification mark that we keep hidden from the operators (so as to avoid even subconsciously influencing the readings). Operator No. 1, Jack, measures the 10 samples in random order, then measures them again in a different random order. Operator No. 2, Jill, then measures the 10 samples in a new random order, and then again in a different random order. The specification to which we want to compare these during production is ±4.5 units, or nine units wide total. The results are shown in the table below.
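One way to generate the measurement orders for a study like this is sketched below. The sample labels, the function name, and the seed are hypothetical; the point is only that each operator gets a fresh shuffle of the same hidden IDs for every trial.

```python
import random


def build_run_orders(part_ids, operators, trials, seed=None):
    """Return a shuffled measurement order for each (operator, trial) pair."""
    rng = random.Random(seed)  # a seed makes the plan reproducible
    orders = {}
    for operator in operators:
        for trial in range(1, trials + 1):
            # sample() with k equal to the list length returns a new
            # shuffled copy and leaves part_ids untouched.
            orders[(operator, trial)] = rng.sample(part_ids, k=len(part_ids))
    return orders


# Hypothetical hidden identification marks for the 10 samples.
part_ids = [f"S{n:02d}" for n in range(1, 11)]
orders = build_run_orders(part_ids, ["Jack", "Jill"], trials=2, seed=2009)

for (operator, trial), order in orders.items():
    print(operator, "trial", trial, order)
```

The sketch does not check that the four orders differ from one another, but with 10 samples an exact repeat is extremely unlikely. Whoever runs the study keeps the mapping from hidden ID to part and records each reading against it.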
We see that each person has a range across each part (clearly related to the repeatability within operator) and see that there is a difference in the average across the parts for each person (clearly related to the reproducibility across operator).
The question you need to answer is, at least for this quick-and-dirty test, will this measurement device give us even a hope of making good decisions about our product’s conformance to the specification? (You have already noticed that we can’t assess bias with these data, so just determine if the measurement error allows or precludes us from using the gauge for our process. If it proves to have a bias, it is easy to correct for it by adding or subtracting the amount of bias from the reading.) What statistics would you use to determine repeatability and reproducibility? How would you determine if the gauge was capable of making a good decision on whether your product was in or out of specification?
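Form header (left blank): Measurement System, Date, Part Name, Part Number, Characteristic, Tolerance

| Part # | Appraiser 1 (Jack), First | Appraiser 1 (Jack), Second | Appraiser 2 (Jill), First | Appraiser 2 (Jill), Second |
|---|---|---|---|---|
| 1 | 70 | 71 | 75.1 | 77.4 |
| 2 | 69.5 | 70.5 | 71.1 | 71.8 |
| 3 | 70 | 72 | 75.7 | 78.8 |
| 4 | 72 | 72.5 | 79.7 | 81.2 |
| 5 | 66 | 67 | 65.7 | 65.8 |
| 6 | 66.5 | 67 | 68.7 | 69 |
| 7 | 67.5 | 69.5 | 72 | 72.7 |
| 8 | 69.5 | 74 | 70.1 | 70.3 |
| 9 | 67 | 70 | 79.5 | 80.2 |
| 10 | 67 | 68 | 69.2 | 69 |
| Average | 68.5 | 70.15 | 72.68 | 73.62 |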
You may have done this before and have software or a spreadsheet set up to do this. You can also download my MSA spreadsheet to do the calculations if you like. Looking at how the spreadsheet is set up, you can answer these questions for yourself.
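If you would rather start from a few lines of code, the sketch below computes only the two raw quantities already pointed out in the data: each operator's range on each part, and each operator's overall average. The data are copied from the table above; the function and variable names are hypothetical, and this is not the spreadsheet's method or a full repeatability and reproducibility calculation.

```python
from statistics import mean

# (first, second) readings for parts 1 to 10, copied from the table above.
jack = [(70, 71), (69.5, 70.5), (70, 72), (72, 72.5), (66, 67),
        (66.5, 67), (67.5, 69.5), (69.5, 74), (67, 70), (67, 68)]
jill = [(75.1, 77.4), (71.1, 71.8), (75.7, 78.8), (79.7, 81.2), (65.7, 65.8),
        (68.7, 69), (72, 72.7), (70.1, 70.3), (79.5, 80.2), (69.2, 69)]


def part_ranges(readings):
    """Range between the two trials on each part (within-operator spread)."""
    return [abs(second - first) for first, second in readings]


def operator_average(readings):
    """Average of all of one operator's readings."""
    return mean(value for pair in readings for value in pair)


for name, readings in (("Jack", jack), ("Jill", jill)):
    ranges = part_ranges(readings)
    print(name, "ranges by part:", [round(r, 2) for r in ranges])
    print(name, "average range:", round(mean(ranges), 3))
    print(name, "overall average:", round(operator_average(readings), 3))

difference = operator_average(jill) - operator_average(jack)
print("difference between operator averages:", round(difference, 3))
```

The ranges speak to repeatability and the difference between the operator averages speaks to reproducibility. Turning them into standard deviations and comparing the result with the nine-unit specification width is the part left for you to work out.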
Next month, I’ll show and discuss the results. I’ll warn you, though… this case study is trickier than you might think.