In this case we will be looking at a sporadic failure in the electrical field, although it appears to have happened more than once.
Should be interesting...
Last time I wrote about ReflexRCA, we went into some detail about failure identification: in particular, chronic and sporadic failure types, and the use of common measures and technologies to determine which chronic failures deserve time and energy.
In this post I want to start working through some of the issues related to causal analysis.
RCA is about root causes, right?
The term "root cause analysis" is somewhat misleading. It assumes a) that there is always a "root" cause, and b) that we are even interested in finding out what it is.
Take an example from our RCM 101 training, where we look at driving down to the right level of causality in an analysis of bearing failures on a motor.
We know that bearings can fail due to a lack of grease, but how could a bearing end up short of grease?
If we assume that greasing was not automatic but done manually on grease rounds, then it could be because the round wasn't done, or because the round didn't exist in the first place.
Again, assuming that the round wasn't done, it could be because the lubrication technician didn't have time, didn't have the resources, or omitted it for some other reason.
If it was omitted, then let's say that it was because he was tired. And why was he tired? Because he was up at night watching the grand final.
So the elimination strategy is obvious: nobody is allowed to watch the grand final at all. Right?
Of course not, and there is a range of reasons why. First and foremost, it is none of our business so long as people present fit for work; secondly, they would probably mutiny.
So our goal is not to determine the one true root cause. Our goal is to determine which of the dominoes we can prevent, mitigate or eliminate to stop the event from occurring.
The process to discover this is relatively simple, but extremely powerful.
- Start with a definition of what went wrong, and of what occurred in the first place.
- Then, instead of asking "why", ask "how could". "How could" is a broader question that requires a broader view, not just a narrow view of why this particular event occurred.
- Make sure each step is a small step in logic, and covers all of the reasonably likely ways that an event could have been caused.
- With each level, prove (via evidence) whether each cause is reasonably likely in this operating context.
- Drive the causal analysis down from the most frequent causes to the point where you can do something proactive about it.
- Make sure to list any human factors that contributed to the event, as well as the person who caused it. That is often a bit uncomfortable, right?
And lastly, determine the solutions.
Solutions within ReflexRCA may be a maintenance activity or a redesign, meaning a change to either people, processes or plant.
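For readers who like to see a process written down, here is a minimal sketch of the "how could" drill-down using the bearing example. It is not part of ReflexRCA itself: the `Cause` class, the `intervention_points` function, the evidence flags and the choice of which domino is actionable are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cause:
    """One answer to a "how could ...?" question (hypothetical structure)."""
    description: str
    supported_by_evidence: bool = True    # did the evidence check hold up?
    actionable: bool = False              # can we prevent, mitigate or eliminate it?
    solution_type: Optional[str] = None   # "people", "processes" or "plant"
    how_could: List["Cause"] = field(default_factory=list)


def intervention_points(cause, path=()):
    """Yield (chain, cause) for the first actionable cause on each branch."""
    if not cause.supported_by_evidence:
        return  # ruled out by evidence, so stop driving down this branch
    chain = path + (cause.description,)
    if cause.actionable:
        yield chain, cause
        return  # we can act here, so there is no need to dig deeper
    for child in cause.how_could:
        yield from intervention_points(child, chain)


# Illustrative data only: the flags below are assumed, not findings.
event = Cause("Motor bearing failed", how_could=[
    Cause("Lack of grease", how_could=[
        Cause("Grease round was not done", actionable=True,
              solution_type="processes", how_could=[
            Cause("Technician omitted the task", how_could=[
                Cause("Technician was tired", how_could=[
                    Cause("Up late watching the grand final"),
                ]),
            ]),
        ]),
        Cause("Grease round did not exist", supported_by_evidence=False),
    ]),
])

for chain, cause in intervention_points(event):
    print(" -> ".join(chain), f"[{cause.solution_type}]")
```

The walk stops at the first domino we can do something about, so it never reaches the grand final. Branches that the evidence does not support are dropped without being explored further.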
Each of these steps is complex and involves a lot of detail, the use of additional methods and techniques where possible, and a comprehensive approach to developing and implementing solutions.
But hopefully this is a good starting point for many of you.
Next time we write on ReflexRCA we will cover some of the research and areas relating to human error in maintenance and operations, as well as what to do about it.