• Novice
  • Aware
  • Competent

System reliability

It is important to understand the difference between part failure records and the probability of system or facility failures:

  • MTBF of the maintenance managed items (MMIs) or the "parts" (as derived from the CMMS).
  • Likely probability of failure of assets or parts that have no failure record (by experienced assessment and calculation).

These probabilities are then combined in "series" or "parallel" to give a "system probability of failure".
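As an illustration only, the series and parallel aggregation can be sketched as follows. The sketch assumes that parts fail independently and that an MTBF can be converted to a probability of failure with a constant failure rate (exponential) model; neither assumption is stated in the text above. The function names, the two pumps, the valve and all figures are hypothetical.

```python
import math

def failure_probability(mtbf_years, period_years=1.0):
    """Probability of at least one failure in the period.

    Assumes a constant failure rate (exponential model), which is a
    simplifying assumption and may not suit wear-out failures.
    """
    return 1.0 - math.exp(-period_years / mtbf_years)

def series_failure(probabilities):
    """Parts in series: the system fails if ANY part fails.

    Assumes the parts fail independently of each other.
    """
    survive = 1.0
    for p in probabilities:
        survive *= (1.0 - p)
    return 1.0 - survive

def parallel_failure(probabilities):
    """Parts in parallel (redundant): the system fails only if ALL parts fail.

    Assumes the parts fail independently of each other.
    """
    fail = 1.0
    for p in probabilities:
        fail *= p
    return fail

# Hypothetical example: two redundant pumps feeding a single valve.
pump_a = failure_probability(mtbf_years=10.0)  # MTBF taken from CMMS records
pump_b = failure_probability(mtbf_years=10.0)
valve = 0.02  # assessed probability for a part with no failure record

pump_pair = parallel_failure([pump_a, pump_b])
system = series_failure([pump_pair, valve])
print(f"pump pair: {pump_pair:.4f}, system: {system:.4f}")
```

In this sketch the redundant pair contributes far less to the system probability than the single valve, which is the usual effect of parallel redundancy.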

It is important to note that these probabilities often do not match the actual "system failure" records. This is because of three factors:

  • The distribution of failure between identical parts (e.g. the second and third MTBF may not match the first)
  • The significantly wider distribution that results from aggregating failures at system level (e.g. combining multiple probability distributions)
  • In many infrastructure systems, the number of part failures is small, so an accurate MTBF is often difficult to derive.
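The third factor can be illustrated with a small simulation. This is a sketch under stated assumptions, not a method taken from the text: it assumes times between failures follow an exponential distribution with a known "true" MTBF, and the function name and figures are hypothetical. It estimates the MTBF repeatedly from only a few recorded failures and reports how widely the estimates scatter.

```python
import random

def estimated_mtbf_spread(true_mtbf, n_failures, trials=2000, seed=1):
    """Return the 5th and 95th percentile of MTBF estimates.

    Each trial draws n_failures intervals between failures from an
    exponential distribution (an assumption) and estimates the MTBF
    as the mean of those intervals.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = []
    for _ in range(trials):
        intervals = [rng.expovariate(1.0 / true_mtbf) for _ in range(n_failures)]
        estimates.append(sum(intervals) / n_failures)
    estimates.sort()
    low = estimates[int(0.05 * trials)]
    high = estimates[int(0.95 * trials) - 1]
    return low, high

# Hypothetical part with a true MTBF of 10 years.
few = estimated_mtbf_spread(true_mtbf=10.0, n_failures=3)
many = estimated_mtbf_spread(true_mtbf=10.0, n_failures=50)
print(f"3 failures:  {few[0]:.1f} to {few[1]:.1f} years")
print(f"50 failures: {many[0]:.1f} to {many[1]:.1f} years")
```

With only a handful of failures on record, the estimated MTBF can sit well above or below the true value, and this uncertainty carries through to any system probability built from it.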

Conclusion

Once we thoroughly understand the distribution of failure for individual parts, we can apply the system modeling techniques as discussed above.

Until we have this detailed data, we cannot aggregate the current part failure histories or probabilities to derive an accurate system probability of failure.

However, the aggregated parts probabilities do give us the best way to understand the current risk cost exposures of our system assets or facilities and this will enhance our management of the overall system risk.

That is, the actual (sub-system) risk cost exposure may be less than calculated, but the relativity between sub-systems or facilities will be the same, so decisions can be made on the data produced with a high level of confidence.

Where the costs are high (e.g. over $500 000) and the benefit cost is close to the organization's policy limit, we may need to assess more closely the sensitivity of the probabilities derived, especially where the consequence cost of a total system failure has been used in the risk cost calculations.
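A simple sensitivity check of this kind might look like the sketch below. It assumes that risk cost is the probability of failure multiplied by the consequence cost, and that the benefit cost ratio is the avoided annual risk cost divided by an annualized project cost; both definitions, the function names and all figures are assumptions made for illustration, not the organization's actual policy calculation.

```python
def risk_cost(probability, consequence_cost):
    """Annual risk cost exposure: probability of failure x consequence cost."""
    return probability * consequence_cost

def benefit_cost_ratio(p_before, p_after, consequence_cost, annualized_project_cost):
    """Avoided annual risk cost divided by annualized project cost (assumed definition)."""
    avoided = risk_cost(p_before, consequence_cost) - risk_cost(p_after, consequence_cost)
    return avoided / annualized_project_cost

def sensitivity(p_before, p_after, consequence_cost, annualized_project_cost,
                policy_limit, factors=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Scale the derived probability and test whether the decision changes."""
    rows = []
    for factor in factors:
        p = min(1.0, p_before * factor)
        # The renewed asset cannot be less reliable than the existing one.
        ratio = benefit_cost_ratio(p, min(p, p_after), consequence_cost,
                                   annualized_project_cost)
        rows.append((factor, ratio, ratio >= policy_limit))
    return rows

# Hypothetical renewal: total system failure would cost $2 000 000.
rows = sensitivity(p_before=0.05, p_after=0.01, consequence_cost=2_000_000,
                   annualized_project_cost=60_000, policy_limit=1.2)
for factor, ratio, passes in rows:
    print(f"probability x {factor:4.2f}: ratio {ratio:4.2f}, meets policy: {passes}")
```

If the decision flips within a plausible range of the derived probability, as it does here when the probability is scaled down by a quarter, the probability estimate deserves closer review before the funds are committed.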

