And as this trend continues, it is also inevitable that we will be hearing more about so-called Maintenance Induced Failure. (Machines require maintenance; more machines means more maintenance, ergo more likelihood of maintenance error.)
I wanted to spend some time running through some of the possible causes of maintenance error, and hopefully this will help you both to recognize them and to act on them.
Maintenance Induced Error is a pretty broad and all-encompassing term. When the media use it, or blame maintenance as the cause, they are generally not too sure what they are talking about.
We speak a lot about this in our Maintenance Productivity training course, and I wanted to run through a few of the causes that I regularly see around the place.
(I am working on a series of short reports related to this, in particular on the Buncefield explosion and the Waterfall rail incident.)
The first is obvious: rework caused by human error. Incompetence is often blamed for poor maintenance, and at times this is definitely the case, but a little RCA-style digging sometimes uncovers even deeper issues.
For example, human error in maintenance execution can be caused by:
- A lack of knowledge about the asset and how it is maintained.
- Poor work instructions that are too long, too wordy, unclear or even wrong.
- Memory lapses (forgetting to do something that is usually done).
- Attention errors (missing a step because something else is demanding your attention).
Elements of this can be seen very clearly in the final reports of incidents such as the Buncefield explosion, the Waterfall rail incident, the Hatfield and Potters Bar incidents in the UK, as well as the BP refinery explosion in Texas City, near Houston.
In every case there is evidence of one or more human errors contributing to the final result.
The second is a little more hidden, and often requires a little further investigation, but it is far more serious. That is, of course, the consequences of applying the wrong type of maintenance.
Buncefield seems, at first glance, to have at its heart a poorly created maintenance regime. Regardless of the design characteristics of the assets in play, the right maintenance simply was not prescribed.
At every site where we perform Reliability-centered Maintenance, the maintenance regime changes as a result of the analysis. This is because the initial maintenance was simply inadequate: tasks missing, tasks that exist but are not necessary, and tasks that are physically impossible to carry out, all leading to poor performance.
An example here is a range of conveyor maintenance studies we worked through recently. Every time, we found that the maintenance in place was absolutely not suited to the conveyor and its operating environment. In some cases the changes were worth several million dollars of reclaimed production time.
The bottom line here is doing the right work and doing it in the right manner. Harder than it sounds, and becoming more important every day.