Measurements, Part 3
I spent my June and July articles beating up on the “sigma” index commonly associated with Six Sigma. Let me sum up by saying that, while not wrong, “sigma” isn’t an efficient metric for accomplishing what we are trying to do in today’s business environment. Is there a better way to do that? I think that there is, and in this, my lucky 14th Heretic article, it is fitting that we continue to explore ways to measure the effect of Six Sigma.
Measuring quality
First we have to define quality. As I mentioned in my last article, we used to define “high quality” as conformance to specification. It was “low quality” if it was out of specification. This understanding of quality began to change as we learned that there were losses due to variation, even within the specification.
Joseph Moses Juran and W. Edwards Deming taught us that a process loses money due to variation. Gen’ichi Taguchi created a graphic to illustrate this idea, following this line of reasoning. There’s a known loss if a part is just barely out of specification. These losses might be due to sorting costs, rework costs, and losses to the end-users. If your customer processes your output further, there are also the costs you incur in calling customers to ask them to accept nonconforming product, the costs your customers incur by having to run their process with out-of-specification material, and so on. The further a product is from the specification limit, the higher these costs go, and they go sky-high pretty quickly, so the costs probably increase at an increasing rate.
Now consider output that is just barely in specification. Does it perform much differently than that output that is barely out of specification? Probably not, so there are some losses at your customer still, albeit somewhat less. In addition, there are losses due to some of the costs of poor quality such as the cost to have an inspection and sorting infrastructure (which for sure you will need if you make any parts that close to specification), costs of waste in your own process, and so on.
Finally, whatever losses you incur due to variation, they’ll be at a minimum when your output is at the real target, and probably not much higher if you’re fairly close to target.
Taguchi reasoned that to estimate these losses, a good shape that met these criteria was a parabola centered on the target, as in Figure 1.
Figure 1: The Taguchi Loss Function
Loss = 1 + (x - Target)^{2}
You can see it has all those characteristics: not a huge change in loss from just barely in specification to just barely out of specification, rapidly increasing loss as you go further out of specification, and a minimum loss around the target. This is the Taguchi Loss Function. You take the process output and measure the critical characteristic, and the loss function tells you how much you have lost for that unit or event. Sum all the losses up for all the parts that you make, and you have all the lost money you could have saved if you had only gotten everything on target. Thinking in terms of the loss function drives us to continually improve our conformance to target, as opposed to conformance to specification, since we lose less money if we do.
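To make the "sum all the losses up" step concrete, here is a minimal Python sketch. The measurements, the target, and the scaling constant k are hypothetical values chosen only for illustration, and the sketch leaves out the constant offset shown in Figure 1, so the loss is zero exactly on target.

```python
def taguchi_loss(x, target, k=1.0):
    """Quadratic loss for one unit: k * (x - target)^2.

    k is a hypothetical scaling constant (loss per squared unit of deviation).
    """
    return k * (x - target) ** 2


# Hypothetical measurements of one critical characteristic
TARGET = 10.0
measurements = [9.8, 10.1, 10.0, 10.4, 9.7]

# Total loss is the sum of the per-unit losses
total_loss = sum(taguchi_loss(x, TARGET) for x in measurements)
print(f"Total loss (in units of k): {total_loss:.2f}")
```

The unit at 10.4 contributes more than half of the total here, which is the loss function's way of saying that the parts far from target are the expensive ones.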
So, why talk about this in an article about the measure phase?
Well, although there are ways to calculate an approximate value for the Taguchi Loss Function to estimate actual dollar amounts, what’s really cool is that you don’t need to know that to use the concept to measure and rate your process with something that is proportional to the loss function, regardless of the actual dollar losses of the process.
You probably recall the capability indices C_{p} and C_{pk}. C_{p} measures the ratio of the width of your specification to the width of your process.
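In symbols, the usual textbook form is shown below. I am assuming the conventional definition, in which the width of the process (its natural tolerance) is six standard deviations; USL and LSL are the upper and lower specification limits and σ is the process standard deviation.

```latex
C_{p} = \frac{USL - LSL}{6\sigma}
```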
This is pretty much useful only in telling you your process’s ability to meet specification if you get the process output to average right in the center of the specification. If your natural tolerance is exactly equal to your specification width, C_{p} = 1, and the process is said to be “potentially minimally capable” of meeting specification. “Potentially” because it might be terribly off target and making 100-percent scrap but with variation equal to the specification width. More usually, companies have a goal of getting all their capability indexes equal to 1.33.
Much more useful is C_{pk}, which measures your ability to conform to specification. If the process happens to be one that follows the normal distribution, then:
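The standard textbook definition, which I am assuming is the one intended here, uses the process mean μ and standard deviation σ and takes whichever specification limit is closer to the mean:

```latex
C_{pk} = \min\left( \frac{USL - \mu}{3\sigma},\ \frac{\mu - LSL}{3\sigma} \right)
```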
If it doesn’t follow the normal distribution, you need to fit a model to the distribution and estimate the largest “tail” proportion that goes out of specification. Convert this proportion into a z-score by looking it up in a normal distribution table, or you can use this in Excel:
=NORMINV(1-proportion,0,1)
where proportion is your estimate of the largest proportion outside of specification. Divide this z-score by 3 and you have the equivalent C_{pk}. If C_{pk} = 1, then you’re estimating that the biggest proportion out is no more than 1,350 parts per million (ppm). (Note that the total amount outside of specification could be anything from 1,350 ppm to 2,700 ppm depending on if the other tail goes out by an equal amount.) A process with a C_{pk} = 1 is said to be “minimally capable” of meeting specification.
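The same lookup can be sketched outside a spreadsheet. This version uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the function name `equivalent_cpk` is my own label, not a standard one.

```python
from statistics import NormalDist


def equivalent_cpk(worst_tail_proportion):
    """Convert the largest out-of-specification tail proportion to an equivalent Cpk.

    The proportion must be strictly between 0 and 1.
    """
    # z-score that leaves `worst_tail_proportion` in the upper tail
    # of a standard normal distribution
    z = NormalDist().inv_cdf(1.0 - worst_tail_proportion)
    return z / 3.0


# 1,350 ppm in the worst tail corresponds to an equivalent Cpk of about 1
print(round(equivalent_cpk(0.00135), 3))
```

A worst tail larger than 50 percent gives a negative result, which matches the usual reading of a negative C_{pk}: the process average itself is outside the specification.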
This is all well and good, but C_{pk} is still old-school and stuck in the “Quality as conformance to specification” paradigm. Still, we had better have a way of measuring that, because we’ll go out of business pretty quickly if we don’t at least conform to specification most of the time.
Here’s the cool part. A simple modification of the C_{p} formula allows you to penalize that index for being off target. And the penalty is…wait for it…As the squared deviation from target. Yes, just like the Taguchi Loss Function.
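The form usually given in textbooks, which I am assuming is the modification meant here, adds the squared distance between the process mean μ and the target T under the square root:

```latex
C_{pm} = \frac{USL - LSL}{6\sqrt{\sigma^{2} + (\mu - T)^{2}}}
```

When μ = T, the extra term vanishes and the expression reduces to C_{p}.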
So C_{pm} is an index that actually measures a process’s ability to conform to target per the loss function. If the process is on target, and if the target is in the middle of the specification limits, C_{p} = C_{pk} = C_{pm}. But if this is not the case, C_{p} ≥ C_{pk} and C_{p} ≥ C_{pm}. A process with a C_{pm} = 1 is said to be “minimally capable” of meeting target. Take a look at how this plays out.
Following is a six “sigma” process that happens to be normally distributed and in control.
Figure 2: Centered Process, Centered Target
Let’s say the process is off target. What effect will that have on “sigma” and C_{pm}?
Figure 3: Shifted Process, Centered Target
As we would expect, sigma decreases and so does C_{pm}. Now take a closer look. The “sigma” index is based on the conformance to specification, but the C_{pm} is based on the conformance to target, and has dropped below “minimally capable” of conforming to target. This totally makes sense. That target is customer-defined—it’s what they want for the product to meet their needs if they’re the end-user, or what they need to make their process work well and result in high-quality output for their customers if they aren’t. The chance of them getting something near target isn’t all that good here. They will tell you, “You know, I don’t know what it is about your product, but it just doesn’t feel/work as good as your competitor’s.”
In real life, however, frequently the ideal target isn’t at the center of the specification. In fact, if the target of a specification is at the center, it’s likely there due to lazy design engineering, because it’s almost never the case that that is the ideal. What will happen to “sigma” and C_{pm} in this case?
Figure 4: Process On-Target, Target Off-Center
This is intriguing: “sigma,” which is indexed off of the conformance rate, is unchanged, but C_{pm} is back over minimal capability. Again this makes sense, as the probability of your customers getting something near what they want (the target) is pretty high. (For the sake of simplicity, assume that the loss function remains symmetrical about the target. It probably wouldn’t really do that.)
All right, how about another one? What would happen if the process was shifted way down but the target was left toward the upper side?
Figure 5: Target High, Process Shifted Low
Again, “sigma” is the same, while C_{pm} is lower than ever. The probability that your customer will get what they need is almost zero here, and so your C_{pm} is terrible. But there is “sigma” blithely telling you that things are OK, if not quite “six sigma.” I’m here to tell you that your customers will experience a very different level of quality for this output as compared to the last output—although both have the same “sigma.”
Last one. Let’s bring the process back to center again but leave the target off-center.
Figure 6: Process Centered, Target Off-Center
Now our “sigma” index is telling us that we have achieved our goal of becoming a “Six Sigma” company. The problem is, the chance of our customers getting something near target isn’t very good. Good thing we have that C_{pm} to tell us that all isn’t as rosy as “sigma” indicates. C_{pm} shows that we aren’t capable of conforming to target since it’s less than 1.
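The five scenarios can be reproduced in a few lines of Python. The specification limits, standard deviation, means, and targets below are hypothetical numbers of my own choosing, not the values behind the figures, and the process is assumed to be normal. The sketch reports the out-of-specification rate, which is what a conformance-based index responds to, alongside C_{pm} in its usual textbook form.

```python
from math import sqrt
from statistics import NormalDist


def cpm(lsl, usl, target, mean, sd):
    """Cpm: spec width over six times the root-mean-square deviation from target."""
    return (usl - lsl) / (6 * sqrt(sd ** 2 + (mean - target) ** 2))


def ppm_out_of_spec(lsl, usl, mean, sd):
    """Parts per million outside specification for a normal process."""
    dist = NormalDist(mean, sd)
    return 1e6 * (dist.cdf(lsl) + (1 - dist.cdf(usl)))


# Hypothetical specification and process spread
LSL, USL, SD = 94.0, 106.0, 1.0

# name -> (process mean, target), all hypothetical
scenarios = {
    "centered process, centered target": (100.0, 100.0),
    "shifted process, centered target": (102.5, 100.0),
    "process on target, target off-center": (102.5, 102.5),
    "target high, process shifted low": (97.5, 102.5),
    "process centered, target off-center": (100.0, 102.5),
}

results = {
    name: (ppm_out_of_spec(LSL, USL, mean, SD), cpm(LSL, USL, target, mean, SD))
    for name, (mean, target) in scenarios.items()
}

for name, (ppm, index) in results.items():
    print(f"{name:40s} out of spec: {ppm:10.3f} ppm   Cpm: {index:.2f}")
```

With these numbers, the third and fourth scenarios have the same out-of-specification rate but very different C_{pm} values, and the last scenario has almost nothing out of specification while C_{pm} sits below 1.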
Process measurement
As it turns out, “sigma” (and to be fair, C_{p} and C_{pk}) has no necessary relationship to how your customers experience quality. The same “sigma” number can mean happy customers or irate customers. As a general rule, irate customers don’t like to give us money. Not when they have a choice, anyway. So of what utility is “sigma?” Beats me.
C_{pm} is the best measurement of how your customers experience the quality of your process, because it measures how well you perform the job they have hired you to do: Conform to their target.
Some of you sticklers out there may have noted that C_{pm} as I have given the formula is only good for processes that are in control with known means and standard deviations. So, how do you handle processes in which these aren’t known? I’m glad you asked. Another cool thing about the loss function is that it’s proportional to the squared deviations regardless of distribution shape. If you calculate the process performance metric P_{pm} this way:
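A common textbook form, which I am assuming is the one intended here, replaces the process parameters with the sample average x̄ and the sample standard deviation s (some references use a slightly different divisor when pooling the squared deviations from the target T):

```latex
P_{pm} = \frac{USL - LSL}{6\sqrt{s^{2} + (\bar{x} - T)^{2}}}
```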
Then you can use the average and standard deviation of a sample from any distribution, in control or not, and get a measurement of how your customers have experienced your quality. But you have given something up, so there’s a price to pay.
The first caveat is that this describes only the past, and makes no prediction about future customer experiences. You still need to achieve statistical control to be able to predict those.
The second caveat is that the end-user probably buys or experiences only one or a few units of the final product at a time. Regardless of your C_{pm} or P_{pm}, if what they get is far off-target, that is their whole experience with you, which is why we need to constantly reduce variation around that target.
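With those caveats in mind, here is a minimal sketch of the P_{pm} calculation from raw sample data. It assumes the common textbook form, built from the sample average and sample standard deviation; the function name `p_pm` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def p_pm(sample, lsl, usl, target):
    """Ppm from raw data: spec width over six times the RMS deviation from target.

    Uses the sample average and sample (n - 1) standard deviation, so it
    describes only what was actually produced, in control or not.
    """
    xbar = mean(sample)
    s = stdev(sample)
    return (usl - lsl) / (6 * sqrt(s ** 2 + (xbar - target) ** 2))


# Hypothetical data: average 10, sample standard deviation 1
sample = [9.0, 10.0, 11.0]

print(round(p_pm(sample, lsl=7.0, usl=13.0, target=10.0), 3))  # on target
print(round(p_pm(sample, lsl=7.0, usl=13.0, target=11.0), 3))  # one unit off target
```

Moving the target one standard deviation away from the sample average drops the index from 1 to about 0.71, even though nothing about the spread or the specification changed.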
C_{pm} and P_{pm}, while proportional to the losses, aren’t equal to them. But it’s easy to use them to prioritize across process outputs if you can assign a cost to the process. If you can estimate the cost of mitigation at the specification limit, you can generate actual losses for the process, and use these losses to prioritize your work across very different processes. Mitigation costs are all the losses incurred if a part is right on the specification and include the end-user or customer losses, and the losses experienced by your company, including the costs of poor quality.
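One way to turn a mitigation cost into dollar losses is sketched below. It assumes a symmetrical quadratic loss that equals the mitigation cost exactly at the specification limit, so the loss coefficient is k = cost / (limit − target)², and the average loss per unit is k times the variance plus the squared offset from target. All figures are hypothetical.

```python
def loss_coefficient(cost_at_limit, target, spec_limit):
    """k such that the quadratic loss equals cost_at_limit at the spec limit."""
    return cost_at_limit / (spec_limit - target) ** 2


def average_loss_per_unit(k, mean, sd, target):
    """Average of k * (x - target)^2 over the process: k * (sd^2 + (mean - target)^2)."""
    return k * (sd ** 2 + (mean - target) ** 2)


# Hypothetical: $18 total mitigation cost for a part right at the upper limit
k = loss_coefficient(cost_at_limit=18.0, target=100.0, spec_limit=106.0)

on_target = average_loss_per_unit(k, mean=100.0, sd=1.0, target=100.0)
shifted = average_loss_per_unit(k, mean=102.5, sd=1.0, target=100.0)

print(f"k = {k:.2f} dollars per squared unit")
print(f"Average loss per unit, on target: ${on_target:.3f}")
print(f"Average loss per unit, shifted:   ${shifted:.3f}")
```

Multiplying the per-unit figure by production volume gives a dollar amount that can be compared across very different processes.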
Using these measures instead of “sigma” is an efficient way to measure the actual internal or external customer experience with your process output. Even in the absence of true cost data, C_{pm} or P_{pm} will give you a measurement that relates back directly to how your customers experience your quality.
So, having so cogently explained the benefits of C_{pm} over “sigma,” I expect we will be hearing no more from that inefficient index.
But I could be wrong. OK, about that last bit, I probably am wrong.