Root Cause Analysis
Ben Osburn
11 Posts
Hey guys,

I'm in a bit of a disagreement when it comes to RCAs. I think that every time a return happens on our product, an RCA needs to be done. Others say there's only a need to conduct one if the process is a problem on a continuous basis. How often should RCAs be done?
15 Replies
Duke Okes
125 Posts
They should be done when the risk of not doing one is greater than the cost (time, resources) of doing it.
It will vary from industry to industry.  If it's a medical device that failed, then an RCA/CAPA is probably justified.  If the wrong color frisbee was shipped, I doubt it makes sense.
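One rough way to picture this comparison is an expected-cost check. The sketch below is only an illustration: the function name `rca_is_justified`, the recurrence probability, and the dollar figures are all made-up assumptions, not industry values.

```python
def rca_is_justified(p_recurrence: float, cost_if_recurs: float, rca_cost: float) -> bool:
    """Return True when the expected loss from skipping the RCA exceeds its cost."""
    expected_loss = p_recurrence * cost_if_recurs
    return expected_loss > rca_cost


# Hypothetical numbers for illustration only.
device_failure = rca_is_justified(p_recurrence=0.05, cost_if_recurs=2_000_000, rca_cost=15_000)
wrong_frisbee = rca_is_justified(p_recurrence=0.05, cost_if_recurs=20, rca_cost=1_500)
print(device_failure, wrong_frisbee)  # True False
```

In practice the "risk" side also includes things that are hard to price, such as regulatory exposure or injury, so a check like this is a starting point for the conversation rather than a decision rule.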
Trish Borzon
610 Posts
Hi Ben - I would agree with Duke Okes.  RCA can be time-consuming.  If you're seeing the same thing happen again and again or if the impact is significant, then yes.

Events and causal factor analysis: Widely used for major, single-event problems, such as a refinery explosion, this process uses evidence gathered quickly and methodically to establish a timeline for the activities leading up to the accident. Once the timeline has been established, the causal and contributing factors can be identified.

For more on RCA, visit
Whatever Duke Okes says about RCA, I'd trust--he has written books about it!

I would like to add two comments to Duke's.  If you shipped the wrong color frisbee, it might be appropriate to just "fix it and go" by replacing it with the right one.  Also, sometimes you know what happened and it's obvious, and/or the corrective action is simple.  In that case, a full RCA isn't necessary.  In the case of product returns, it could be that several customers are returning your product for the same reason.  If you are already investigating the problem with an ongoing RCA, there is no need to open a separate RCA for each occurrence.  Use each return as an opportunity to learn more about the nature of the problem, and work on a single investigation.
Some thoughts:

Cause investigation and RCA are different.  Even when an RCA is not performed, a cause investigation may still be worth considering.

You could track the reason (thus the 'cause') for each return (e.g. ordered wrong item, suspected shipping damage, nonfunctional) and trend it.

If the product is returned under warranty, confirming the issue should probably be considered.

Per FDA guidance, for returned medical devices, the 'cause' should be identified.  If severity is high enough, then an immediate RCA may be needed (even for just one instance).  Although a single instance of an issue may not trigger an RCA, trending may indicate a systemic issue, thus requiring an RCA.  Other triggers may exist (e.g. severity x occurrence rate).

If you gave more details of the situation, maybe I could direct you to more appropriate industry guidance/practice/expectation.
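The tracking-and-trigger idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the return log, the 1-5 severity scale, and both thresholds (`SEVERITY_TRIGGER`, `RISK_TRIGGER`) are hypothetical and would need to come from your own risk procedure.

```python
from collections import Counter

# Hypothetical return log: (reason, severity on an assumed 1-5 scale).
returns = [
    ("shipping damage", 1),
    ("nonfunctional", 3),
    ("ordered wrong item", 1),
    ("nonfunctional", 3),
    ("nonfunctional", 3),
    ("overheating", 5),
]

SEVERITY_TRIGGER = 5  # assumed: one event this severe triggers an RCA on its own
RISK_TRIGGER = 8      # assumed: threshold on (worst severity x occurrence count)


def reasons_needing_rca(log):
    """Return the return reasons that cross either trigger, sorted by name."""
    counts = Counter(reason for reason, _ in log)
    worst = {}
    for reason, severity in log:
        worst[reason] = max(severity, worst.get(reason, 0))
    flagged = [
        reason
        for reason, count in counts.items()
        if worst[reason] >= SEVERITY_TRIGGER or worst[reason] * count >= RISK_TRIGGER
    ]
    return sorted(flagged)


print(reasons_needing_rca(returns))  # ['nonfunctional', 'overheating']
```

Here "overheating" is flagged after a single return because of its severity, while "nonfunctional" is flagged only once the repeated returns push its score over the trending threshold.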
Ben Osburn
11 Posts
Thank you all! I don't know where I would be without ASQ. 
My experience is if the product ID can be traced back to a production line and manufacturing date, and if the process was behaving out-of-control, then an RCA should be initiated. Or, if the product defect could potentially result in an injury or health issue, an RCA should be initiated. Don't open a formal RCA if the process was in-control, or if the defect level/complaint code was within normal (i.e. "common cause") levels unless other "administrative" concerns are at risk.
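As a rough illustration of the in-control versus out-of-control test, here is a sketch of standard 3-sigma p-chart limits applied to the fraction of defective units per lot. The lot sizes and defect counts are invented for the example, and the function names are hypothetical. In practice the limits are usually set from a stable baseline period rather than from the same data being judged.

```python
import math


def p_chart_limits(defectives, sample_sizes):
    """Return (p_bar, [(lcl, ucl), ...]) using 3-sigma p-chart limits per sample."""
    p_bar = sum(defectives) / sum(sample_sizes)
    limits = []
    for n in sample_sizes:
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)
        limits.append((max(0.0, p_bar - 3 * sigma), min(1.0, p_bar + 3 * sigma)))
    return p_bar, limits


def out_of_control_points(defectives, sample_sizes):
    """Return the indices of samples whose defect fraction falls outside the limits."""
    _, limits = p_chart_limits(defectives, sample_sizes)
    return [
        i
        for i, (d, n, (lcl, ucl)) in enumerate(zip(defectives, sample_sizes, limits))
        if not lcl <= d / n <= ucl
    ]


# Hypothetical data: defective units found in eight lots of 500 units each.
defectives = [5, 6, 4, 7, 5, 20, 6, 5]
sample_sizes = [500] * 8
print(out_of_control_points(defectives, sample_sizes))  # [5]
```

A return traced to lot index 5 would point to a special cause worth a formal RCA, while a return traced to any of the other lots falls within common-cause variation.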
1 Posts
Hello All,
It depends on the robustness of the quality system in the particular organization.
One way of looking at it is that too many RCAs means too many issues, which means our FMEAs are not robust enough.
But in any case, it's important to look into RCA to see if there are new causes or failure modes that we need to address.
Duke's answer is the best answer.  Another possibility: how much can you learn from the RCA versus how much effort is needed to perform the RCA.  

Another question to ask: "Do I need to do a complete RCA?"  Would determining the causal factors (i.e. human performance gaps and equipment performance gaps) be enough for an event with very low consequences/effects?

Should this issue be handled by another existing program?  Should a trip, slip, and fall be handled by industrial safety?

If you investigate every issue, you spend way too much effort on investigations, recommendations, and tracking of issues whose resolution does not improve the safety, quality, etc of the organization. 

An often overlooked activity in an RCA is a review of the predictive analysis process and outcomes.  Why didn't our FMEA or HAZOP or fault tree identify this issue?  Poor FMEA process?  Poor team selection?  Poor execution of a good process?
Hi, James. 
For my understanding, how do you differentiate RCA and cause investigation?
You got a few good answers that should help.  I'd love to know what difference they made.  It's also great to hear stories about the impact of the suggestions.  Would you report back when you have a chance?

Claudio Genesini:
Hi, James. 
For my understanding, how do you differentiate RCA and cause investigation?

I'd like to use Claudio's question as the starting point for something I have observed frequently: deciding how much effort to spend on each problem.  Most organizations have more issues than their resources can cover.  This thread reflects the same thing: no one can do an in-depth RCA for every problem they have.

Using risk impact is an easy way to start the prioritization.  Risk analysis is important, but it is difficult to convince non-believers to do it.  It is great if your organization has a culture in which parts, assemblies, and products have already been analyzed and their risk (quality) impact defined.  It is much easier to determine whether an RCA is required when the organization already shares a common understanding of problem categories based on quality impact.  Also note that the 80-20 rule works well here: your analysis only needs to be good enough to handle 80% of the problems.  The remaining 20% are the ones you spend time investigating or send your best engineers to tackle.

Every problem can look like the biggest problem to an investigator without a global view.  The best approach is a preset risk definition or process, set as early as the design phase.  And, yes, FMEA is a great way to do it.  However, in a smaller organization, it can be done with a set of questions that determine the risk level of each part based on its impact on function and safety.

I see many responses, but I would still like to put this another way.  First, any nonconformance should be evaluated for its impact on cost, time, and safety.  If there is a significant impact, then a structured RCA (using various tools, software, etc.) would be more appropriate.
If the impact is low, a simple Five Whys analysis can be done on the spot to identify the probable root cause.
How impact is quantified varies by industry and by level of risk aversion.
RCA focuses on root causes (management system gaps).  Cause investigation focuses only on equipment failures and human errors.