Key Words: Cpk, Ppk, differences, statistics
Purpose: This document describes the differences between Cpk and Ppk values.
Cpk values are known as “process capability” measures. They use the variation within subgroups to estimate the population variation. The basis of the Cpk calculation is sigma hat, the estimated standard deviation, which is calculated by taking the average of the subgroup ranges (Rbar) and dividing it by a factor from a standard table known as a d2 table. The table is organized by subgroup size, and you select the d2 factor that matches the subgroup size of the control chart being used. The complete calculation can be found below. The Cpk statistic was developed shortly before personal computers came into wide use for statistical calculations in quality. As a result, it relies on estimates and assumptions in place of the longer, more laborious calculations that would have had to be done by hand at that time.
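As a rough illustration of the sigma hat and Cpk calculation described above, the following Python sketch may help. The function names (sigma_hat, cpk), the specification limit parameters (lsl, usl), and the sample data are assumptions made for this example and are not taken from DataLyzer Spectrum. The d2 values are the commonly published constants for subgroup sizes 2 through 10, and the Cpk formula used is the usual textbook form: the distance from the process mean to the nearer specification limit, divided by three sigma hat.

```python
from statistics import mean

# Commonly published d2 constants, keyed by subgroup size.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534,
      7: 2.704, 8: 2.847, 9: 2.970, 10: 3.078}


def sigma_hat(subgroups):
    """Estimate the standard deviation as Rbar / d2 (hypothetical helper)."""
    n = len(subgroups[0])
    if any(len(s) != n for s in subgroups):
        raise ValueError("all subgroups must have the same size")
    if n not in D2:
        raise ValueError("no d2 factor listed for subgroup size %d" % n)
    # Rbar is the average of the subgroup ranges (max minus min).
    rbar = mean(max(s) - min(s) for s in subgroups)
    return rbar / D2[n]


def cpk(subgroups, lsl, usl):
    """Cpk using sigma hat; lsl and usl are the specification limits."""
    grand_mean = mean(x for s in subgroups for x in s)
    return min(usl - grand_mean, grand_mean - lsl) / (3 * sigma_hat(subgroups))


# Illustrative data only: two subgroups of five readings each.
example = [[10, 12, 11, 13, 9], [11, 10, 12, 9, 13]]
print(round(sigma_hat(example), 4))
print(round(cpk(example, lsl=5, usl=17), 4))
```

Both subgroups in the example have a range of 4, so Rbar is 4 and sigma hat is 4 divided by the d2 factor for a subgroup size of 5.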
When calculating Cpk, some assumptions about the process must be made. The two fundamental assumptions are that the process is normally distributed and that it is stable and in statistical control. If either condition is not met, the Cpk measures will not properly describe the “true” variation in the process. You can check for in-control status by visually inspecting the control chart. Furthermore, for a normal and stable process, approximately 68%* of the points will fall in the middle third of the region between the control limits. Violations of these assumptions tend to generate Cpk values that are better (higher) than values based on the “true” process variation, that is, variation computed from the individual readings rather than from subgroup estimates. Whenever there is a significant difference between Cpk and Ppk for the same data, check the control chart for out-of-control conditions and consider the Ppk value to be more accurate.
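The middle-third check mentioned above could be sketched as follows. This is only an illustration: the function name middle_third_fraction, its parameters, and the sample points are assumptions for this example, and it assumes the control limits are symmetric about the center line so that the middle third corresponds to roughly plus or minus one standard deviation of the plotted points.

```python
def middle_third_fraction(points, lcl, ucl):
    """Fraction of plotted points inside the middle third of the control limits.

    Hypothetical helper: points are the values plotted on the chart
    (for example subgroup averages), lcl and ucl are the control limits.
    """
    if ucl <= lcl:
        raise ValueError("ucl must be greater than lcl")
    third = (ucl - lcl) / 3
    low, high = lcl + third, ucl - third
    inside = sum(1 for p in points if low <= p <= high)
    return inside / len(points)


# Illustrative data only.
plotted = [10, 11, 9, 10.5, 13, 7, 10, 9.5, 10.2, 12.5]
fraction = middle_third_fraction(plotted, lcl=7, ucl=13)
print(fraction)  # compare against the expected value of roughly 0.68
```

A fraction far from 0.68 does not by itself prove a problem, but it is a reason to look more closely at the chart for non-normality or instability.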
Ppk values are known as “process performance” measures. They use the standard deviation of the individual readings to estimate the population variation (sigma or σ). This may seem like a more reasonable set of measures, because it gives a clearer definition of the distribution. However, unlike Cpk, which takes time into account (because of its control chart roots), Ppk ignores time. If the process is shifting over time, Ppk will show a larger (worse) process variation than is actually present in a smaller subset of data from a specific time.
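For comparison with Cpk, a minimal sketch of a Ppk calculation is shown here. The function name ppk, the specification limit parameters (lsl, usl), and the sample readings are assumptions made for this example. It uses the sample standard deviation (n - 1 divisor) of all individual readings, pooled without regard to subgroup or time order, and the usual textbook form of the index.

```python
from statistics import mean, stdev


def ppk(readings, lsl, usl):
    """Ppk from the standard deviation of the individual readings.

    Hypothetical helper: readings is a flat list of all individual
    measurements; lsl and usl are the specification limits.
    """
    m = mean(readings)
    s = stdev(readings)  # sample standard deviation, n - 1 divisor
    return min(usl - m, m - lsl) / (3 * s)


# Illustrative data only.
readings = [9, 10, 11, 12, 13]
print(round(ppk(readings, lsl=5, usl=17), 4))
```

Because every reading contributes to one overall standard deviation, any drift in the process mean over time inflates that value and lowers Ppk, which is the behavior described above.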
If the Ppk and Cpk values agree within a few percent, you most likely have a stable process and can feel comfortable with their accuracy in describing it. If they disagree, a closer inspection of the cause of the discrepancy should give greater insight into the nature of your process.
* Be aware that a non-normal distribution can still produce an approximately normal distribution of subgroup averages. The central limit theorem states that the larger the subgroup size, the more closely the distribution of averages approximates a normal curve.
DataLyzer® Spectrum Application Note
Courtesy of DataLyzer International, Inc.
Control number: 132