In my experience, the real shortcoming of a Pareto chart used in isolation from other analysis is the time frame of the data collection. The time period must be long enough to be relevant, and this can really only be verified by looking at the time series trend. What the Pareto graphic hides is the time series nature of the data: is the category increasing, decreasing, stable, or just an isolated spike? When selecting which categories to improve first, this is more critical than the statistical significance of the count differences.
A problem that is increasing rapidly is far different from a problem that has recently decreased substantially, or one that was a one-time spike and was quickly corrected. For that reason I use Trend – Pareto – Trend. The high-level trend (cost, yields, service events, QC failures, etc.) shows me the relative progress, or lack thereof, toward the goal. Then I build a Pareto chart from a meaningful time period: 3-6 months, sometimes a year; logic and knowledge of the system tell me what time period is meaningful. Then I trend each Pareto item. This adds to my knowledge of the categories, and I use it to determine which Pareto items are truly large enough to prioritize for improvement first.
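As a rough illustration of the Trend – Pareto – Trend idea, here is a minimal Python sketch. The category names, months, and counts are made up for the example, and the helper names (`pareto_counts`, `monthly_series`, `slope`) are my own, not from any library. It builds three categories that would show identical bars on a Pareto chart but have very different histories.

```python
from collections import Counter, defaultdict

def pareto_counts(events):
    """Return (category, count) pairs sorted from largest to smallest count."""
    return Counter(cat for _, cat in events).most_common()

def monthly_series(events, months):
    """Return {category: [count per month, in the order given by `months`]}."""
    index = {m: i for i, m in enumerate(months)}
    series = defaultdict(lambda: [0] * len(months))
    for month, cat in events:
        series[cat][index[month]] += 1
    return dict(series)

def slope(counts):
    """Least-squares slope of counts against period index (events per period)."""
    n = len(counts)
    x_mean = (n - 1) / 2
    y_mean = sum(counts) / n
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(counts))
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    return sxy / sxx

# Hypothetical event log: (month, category) pairs, invented for illustration.
months = ["2024-01", "2024-02", "2024-03", "2024-04"]
events = (
    [("2024-01", "scratch")] * 2 + [("2024-02", "scratch")] * 4
    + [("2024-03", "scratch")] * 6 + [("2024-04", "scratch")] * 8   # rising
    + [("2024-02", "misalign")] * 20                                # one-time spike
    + [(m, "leak") for m in months for _ in range(5)]               # stable
)

series = monthly_series(events, months)
for cat, total in pareto_counts(events):
    # Same Pareto bar height for all three, very different monthly series.
    print(cat, total, series[cat], round(slope(series[cat]), 2))
```

All three categories total 20, so the Pareto alone cannot separate them. Note that a fitted slope is only a crude summary: the one-time spike comes out with a negative slope here, which is why I would still plot each series and look at it rather than rely on a single number.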
Another point about using the time series data rather than a p value: the p value will be misleading if the data are not homogeneous. Any time series that trends up, trends down, or spikes is not homogeneous. We must remember that when comparing two things, a low p value means that one or more of the following ‘assumptions’ is incorrect:
- No real difference exists
- The data are homogeneous
- The selected distribution is correct for the data
- The test statistic is correct for the data
- The data are random; the trials were not confounded or biased
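To make the homogeneity point concrete, here is a small Python sketch. The test used (an exact binomial comparison of two category totals) is my own choice for illustration, not something from the reference, and the counts are invented. A test like this sees only the totals, so it returns the same p value whether a category is rising or falling.

```python
from math import comb

def two_count_p_value(a, b):
    """Exact two-sided binomial p value for H0: both categories occur at the same rate.

    Under H0, given a + b events in total, a ~ Binomial(a + b, 0.5).
    """
    n = a + b
    pmf = [comb(n, k) / 2 ** n for k in range(n + 1)]
    # Sum the probability of every outcome at least as unlikely as the observed one.
    return sum(p for p in pmf if p <= pmf[a] * (1 + 1e-9))

# Hypothetical monthly counts for illustration.
rising = [1, 3, 6, 14]
falling = list(reversed(rising))
stable = [4, 4, 4, 4]

p_rising = two_count_p_value(sum(rising), sum(stable))
p_falling = two_count_p_value(sum(falling), sum(stable))
print(p_rising, p_falling)  # identical: the test never sees the time order
```

The rising category and the falling category call for opposite decisions, yet the comparison against the stable category gives the same answer for both. Only the trend chart tells them apart.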
I would also add that, in my experience, simply saying that we will go after the ‘vital few’ projects is not a very effective approach. First, we must charter active projects based on the available resources, or provide the resources needed to go after all of the projects required to meet our goal. Typically, successful organizations go after the top projects that can be resourced and continually move down the Pareto until they have made enough improvement to meet the goal. The 80:20 rule is a descriptor; it is not a precise law. In that respect it is a lot like the Normal distribution: a mental model that has some usefulness in how we prioritize and sequence our work, not a law of physics. So when two categories are ‘close’ in count and both are stable in their occurrence, it really doesn’t matter which one we go after first. We may meet our goal by improving only one, or we may need to improve both. The statistical significance of the count difference is irrelevant in that case; it is the accomplishment of the goal that is important. If one is hard and time consuming and the other is fairly easy and quick, the easy one should go first (assuming the effect on the customer and the cost to us are equivalent).
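The sequencing logic above can be sketched in a few lines of Python. Everything here is hypothetical: the category names, counts, effort estimates, expected reductions, the 10% threshold for what counts as ‘close’, and the goal are all invented for the example, and the single-pass neighbor swap is just one simple heuristic.

```python
def order_projects(categories, close_fraction=0.10):
    """Sort by count (largest first); when neighbours are 'close', put the easier one first."""
    ranked = sorted(categories, key=lambda c: c["count"], reverse=True)
    for i in range(len(ranked) - 1):
        a, b = ranked[i], ranked[i + 1]
        close = a["count"] - b["count"] <= close_fraction * a["count"]
        if close and b["effort_weeks"] < a["effort_weeks"]:
            ranked[i], ranked[i + 1] = b, a
    return ranked

def projects_to_meet_goal(ranked, goal_reduction):
    """Walk down the ordered list, adding projects until the expected reduction meets the goal."""
    chosen, achieved = [], 0
    for c in ranked:
        if achieved >= goal_reduction:
            break
        chosen.append(c["name"])
        achieved += c["expected_reduction"]
    return chosen, achieved

# Hypothetical, stable categories (assumed to have passed the trend check already).
categories = [
    {"name": "A", "count": 50, "effort_weeks": 12, "expected_reduction": 40},
    {"name": "B", "count": 48, "effort_weeks": 3, "expected_reduction": 40},
    {"name": "C", "count": 20, "effort_weeks": 4, "expected_reduction": 15},
    {"name": "D", "count": 10, "effort_weeks": 2, "expected_reduction": 8},
]

ranked = order_projects(categories)
chosen, achieved = projects_to_meet_goal(ranked, goal_reduction=90)
print([c["name"] for c in ranked], chosen, achieved)
```

In this made-up case, A and B are close in count, so the quicker project B goes first, and the walk down the list stops at C because the goal is met there. The stopping rule is the goal, not an 80% cutoff.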
Reference: “Pareto Analysis and Trend Charts – A Powerful Duo”, One Good Idea Column, Roger Duffy, Quality Progress, November 1995, p. 152