Saturday, 14 November 2009

At the heart of reliability

"Reliability engineering is concerned with forecasting and preventing failures..."
I just read this in an article by a respected author and practitioner in reliability engineering. The sad fact is that he is wrong, and this line of thinking has been wrong for about half a century now... old habits die hard, I guess.

By "forecasting" this author means probabilistic modelling. While I agree with the use of probabilistic tools where they are warranted, blanket statements like this are what lead people into dramatically over-analyzing and over-maintaining their plant items.

Let's take the case of a bearing failure. Due to long-term minor overloading, cracks have developed within the inner race, breaking the surface and rapidly contributing to the deterioration, and ultimately the failure, of the bearing.

The consequences of this failure are severe, so severe that a condition monitoring regime has been put in place, with checks carried out at intervals of 33% of the P-F interval (to make sure that the onset of failure is detected).
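To put rough numbers on that, here is a minimal sketch in Python. The 90-day P-F interval is a made-up figure for illustration, and the function name `monitoring_plan` is hypothetical. The only thing taken from the example is that the task is done at one third of the P-F interval.

```python
def monitoring_plan(pf_interval_days: float, checks_per_pf: int = 3) -> dict:
    """Derive a condition-monitoring task interval from a P-F interval.

    pf_interval_days: assumed time from detectable potential failure (P)
                      to functional failure (F). Hypothetical input.
    checks_per_pf:    how many task intervals fit in one P-F interval
                      (3 corresponds to checking at one third of P-F).
    """
    if pf_interval_days <= 0 or checks_per_pf < 1:
        raise ValueError("P-F interval must be positive and checks_per_pf >= 1")

    task_interval = pf_interval_days / checks_per_pf
    # Worst case: the potential failure appears just after a check, so it is
    # only found one full task interval later.
    worst_case_warning = pf_interval_days - task_interval
    return {
        "task_interval_days": task_interval,
        "worst_case_warning_days": worst_case_warning,
    }


# Hypothetical 90-day P-F interval, checked at one third of that interval.
plan = monitoring_plan(90.0)
print(plan)
```

With checks at a third of the P-F interval, even the worst case leaves two thirds of the interval in which to plan a response. Nothing in the calculation stops the bearing from failing; it only tells us how much warning we can count on.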

How then are you going to prevent the failure in this case? There's no way known to man. The bearing is going to fail, just as the sun will rise again tomorrow. Nothing in this world will stop it. What we can do, however, is preempt it: through early intervention, changes to the production run cycle, or whatever other options you may have.

Are we about predicting failure here? Yes! Not forecasting but predicting as part of our failure management strategy.

And why have we bothered? Because the consequences are severe.

Let's take another example: an over speed switch in a turbine. A plant has 6 turbines. After careful consideration of the demand rates, failure rates and acceptable / tolerable levels of risk, we calculate that these need to be checked every 18 months (say).
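For readers who have not seen how such an interval might be arrived at, here is a rough sketch. It uses one commonly quoted linear approximation for a single protective device: failure-finding interval = 2 x (MTBF of the protective device) x (mean time between demands) / (tolerable mean time between multiple failures). That formula is my assumption for illustration, not necessarily how the 18 months in this example was derived, and it only holds while the interval is small compared with the device MTBF. All three input figures are hypothetical and were picked simply because they land on 18 months.

```python
def failure_finding_interval_years(
    device_mtbf_years: float,
    mean_time_between_demands_years: float,
    tolerable_multiple_failure_mtbf_years: float,
) -> float:
    """Approximate failure-finding interval for one protective device.

    Linear approximation (an assumption for this sketch):
        FFI = 2 * MTBF_device * M_demand / M_multiple
    """
    inputs = (
        device_mtbf_years,
        mean_time_between_demands_years,
        tolerable_multiple_failure_mtbf_years,
    )
    if any(value <= 0 for value in inputs):
        raise ValueError("all inputs must be positive")

    return (
        2.0
        * device_mtbf_years
        * mean_time_between_demands_years
        / tolerable_multiple_failure_mtbf_years
    )


# Hypothetical figures, chosen only so the result matches the example.
ffi_years = failure_finding_interval_years(
    device_mtbf_years=75.0,                         # switch fails hidden once in 75 years
    mean_time_between_demands_years=10.0,           # an over speed event once in 10 years
    tolerable_multiple_failure_mtbf_years=1000.0,   # tolerated: unprotected over speed once in 1000 years
)
print(f"Check every {ffi_years * 12:.0f} months")
```

Notice what drives the answer: the tolerable level of risk. The interval is set by consequences, not by any forecast of when a particular switch will fail.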

We perform our baseline checks and find everything to be okay, and it is not until we check for the third time that we actually find that one of the over speed switches is now in a failed state (meaning it will not work to protect the machine if it is needed).

Preventing, avoiding and even (in this case) predicting the failure is out of the question: it has already failed.

Again we see that the reason we are doing this at all is not to predict or avoid failure per se; it is to manage the consequences down to a tolerable / acceptable level.

So where is all this going...

Even those who are very deeply embedded in probabilistic analysis realize that the likelihood of accurately forecasting failure is very remote, because the data is never available. In fact, in my own experience with probabilistic analyses, I have found that most turn into projects to try to find relevant data to use in the model.

The famous statement on the use of Weibull is that you only need 3 failure points. Fair enough... but getting even those three is often exceedingly difficult.

They need to be of the same failure mode, and if they are serious enough to warrant investigation then they carry significant safety / economic consequences. So waiting for them to happen and analyzing them after the fact is almost in the realm of negligence, isn't it?

The whole point of modern asset management is not to predict / forecast dates of failure - it is to manage the failure process where the consequences warrant it!

The Predictive and Detective maintenance examples above are pretty clear on this. And then there are run-to-failure cases, where we have determined that the most effective means of managing the asset is to actually let it fall over.

Do probabilistic methods have their place? Of course! I'm a big fan of most of them and I use them regularly within my team and our business - but only where they are the best option. (You know the old story: when you have a hammer, everything looks like a nail.)

The real danger is thinking that it is all about preventing or avoiding failure. It is not, and thinking it is will lead only to frustration, over-maintenance, and misapplied maintenance strategies.
