Dr Ntokozo Mthembu, Advisor to the ODI Board, writes:
“During the past few months, we have published a couple of series on engineering maintenance from Prof. Uday Kumar’s lectures. The series was interrupted by a discussion on the dangers of AI that had been raised in the social and scientific community of reasonable men. Naturally, we were drawn to this discussion since AI forms the fundamental building block of 4IR technologies, towards which our world is drawn by a tsunami-type force. I believe that some in the ODI-ers community are also grappling, in some form or another, with conceptualising and implementing 4IR technologies in their enterprises. Consequently, the digression from the Kumar maintenance series, albeit disturbing the flow, was necessary and warranted.
The concluding part of the Kumar series is the validation of the ideas discussed previously, with examples and calculations. However, this part of the exercise is difficult to present in a blog because it consists of dry numbers that may have little relevance to most of what we do as captains of our factories, unless we are focused on the design of maintenance systems. It pains me to skip it, but doing so also allows me some space for creativity by discussing just some of the key principles involved in modelling systems reliability.
Models for systems reliability
According to Schenkelberg, from the simplest to the most complex system, building and using a reliability model permits the entire team to make better decisions. Understanding and monitoring system reliability involves knowing both:
- the reliability of elements within the system,
- as well as how the elements relate to each other reliability-wise.
System reliability models are used to identify weak links, and focus resources, to meet our desired reliability objectives. Being able to build the right model to meet your team’s needs best is one of your roles as a reliability professional, continues Schenkelberg.
Reliability Prediction Models
There are a number of models used to predict the reliability of the systems. These include the Exponential model, Weibull model, Reliability block diagrams, and Fault tree analysis, to name a few. What are these models?
Models are mathematical tools that help you estimate
- the probability of failure,
- the mean time to failure, and
- the availability of your system.
They can help you optimize your maintenance strategies, plan your resources, and reduce your costs and risks. But how do you choose the best reliability model for your system? The sections below cover some common types of reliability models, their advantages and disadvantages, and how to apply them to your system.
- Exponential distribution function
The exponential distribution is the most widely used distribution in reliability and risk assessment. It is the only continuous distribution with a constant hazard rate and is used to model the “useful life” of many engineering systems.
The exponential distribution finds its prime application in calculating the reliability of electronic devices such as laptops, batteries, processors, and mobile phones. It helps engineers and manufacturers to estimate the approximate time after which the product will fail.
The exponential distribution is closely related to the Poisson distribution, which is discrete. If the number of failures per unit of time follows a Poisson distribution, then the time between failures follows an exponential distribution.
For example, the Poisson distribution is used to predict the failures of components in power systems, for purposes such as scheduling preventive maintenance and improving power system reliability.
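The link between the two distributions can be sketched in a few lines of Python. This is only a minimal illustration that assumes a constant failure rate; the function names and the rate of 0.001 failures per hour are hypothetical values chosen for the example, not figures from the Kumar course.

```python
import math


def exponential_reliability(failure_rate: float, t: float) -> float:
    """Probability of surviving to time t under a constant failure rate."""
    return math.exp(-failure_rate * t)


def mean_time_to_failure(failure_rate: float) -> float:
    """For the exponential model, MTTF is the reciprocal of the failure rate."""
    return 1.0 / failure_rate


def poisson_probability(failure_rate: float, t: float, k: int) -> float:
    """Probability of exactly k failures in an interval of length t."""
    expected_failures = failure_rate * t
    return expected_failures ** k * math.exp(-expected_failures) / math.factorial(k)


# Hypothetical component: 0.001 failures per hour (assumed value).
rate = 0.001
hours = 500.0

print("R(500 h)      :", exponential_reliability(rate, hours))
print("MTTF (h)      :", mean_time_to_failure(rate))
print("P(0 failures) :", poisson_probability(rate, hours, 0))
print("P(1 failure)  :", poisson_probability(rate, hours, 1))
```

The probability of zero failures in the Poisson model equals the exponential reliability over the same interval, which is the relationship described above.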
- Reliability block diagrams
Reliability block diagrams (RBDs) are graphical representations of your system’s components and their logical connections. They show how the failure of one or more components affects the performance of the whole system. You can use RBDs to calculate the system reliability, availability, and failure rate based on the individual component reliabilities and failure rates. RBDs are easy to construct and understand, but they may not capture the complex interactions and dependencies among components, such as common cause failures, repair priorities, and redundancy.
In the above diagram, a portion of the blocks, say two of the three blocks that make up subsystem B, must succeed for the system to succeed. As a corollary, a failure anywhere along a series path causes the entire series path to fail.
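A "two out of three" block of this kind can be evaluated with the binomial formula, assuming the three components are identical and fail independently. The sketch below is illustrative only: the layout (single blocks A and C in series with a 2-out-of-3 subsystem B) and all reliability values are assumptions, not taken from the original diagram.

```python
import math


def k_out_of_n_reliability(k: int, n: int, r: float) -> float:
    """Reliability of a block that needs at least k of n identical,
    independent components (each with reliability r) to work."""
    return sum(
        math.comb(n, i) * r ** i * (1.0 - r) ** (n - i)
        for i in range(k, n + 1)
    )


def series_reliability(reliabilities: list[float]) -> float:
    """A series path works only if every block on it works."""
    return math.prod(reliabilities)


# Hypothetical values: A and C are single blocks, B is a 2-out-of-3 subsystem.
r_a = 0.95
r_b = k_out_of_n_reliability(k=2, n=3, r=0.90)
r_c = 0.98

print("Subsystem B (2-out-of-3):", r_b)
print("System A -> B -> C      :", series_reliability([r_a, r_b, r_c]))
```

Because A, B, and C sit on one series path, the system figure is lower than the reliability of any single block on that path.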
- Weibull analysis
Weibull analysis is a statistical method that fits a Weibull distribution to your system’s failure data. The Weibull distribution is a flexible model that can capture different failure patterns, such as constant, increasing, or decreasing failure rates.
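The Weibull distribution is flexible enough to model the key stages of the typical bathtub-shaped hazard function: a decreasing failure rate in early life, a roughly constant rate during useful life, and an increasing rate during wear-out.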
You can use Weibull analysis to estimate the reliability and failure rate of your system at any given time, compare the performance of different systems or components, and identify the dominant failure modes and causes. Weibull analysis is a powerful and versatile tool, but it requires sufficient and reliable data and may not account for external factors, such as environmental or operational conditions.
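The way a single parameter changes the failure pattern can be illustrated with the two-parameter Weibull model, where R(t) = exp(-(t/scale)^shape). The sketch below only evaluates the model for assumed parameters; it does not fit a distribution to failure data, and the shape and scale values are hypothetical.

```python
import math


def weibull_reliability(t: float, shape: float, scale: float) -> float:
    """Probability of surviving to time t (t > 0)."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Instantaneous failure rate at time t (t > 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


# Hypothetical scale (characteristic life) of 1000 hours.
scale = 1000.0
cases = [("early life", 0.5), ("useful life", 1.0), ("wear-out", 3.0)]

for label, shape in cases:
    h_early = weibull_hazard(100.0, shape, scale)
    h_late = weibull_hazard(900.0, shape, scale)
    r_mid = weibull_reliability(500.0, shape, scale)
    print(f"{label:12s} shape={shape}: h(100)={h_early:.6f} "
          f"h(900)={h_late:.6f} R(500)={r_mid:.4f}")
```

A shape below 1 gives a falling hazard, a shape of exactly 1 reduces to the exponential model with a constant hazard, and a shape above 1 gives a rising hazard.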
- Fault tree analysis
Fault tree analysis (FTA) is a deductive method that starts with a predefined system failure event and identifies all the possible causes that can lead to it. It uses logic symbols, such as AND, OR, and NOT, to construct a hierarchical tree of failure scenarios. You can use FTA to quantify the probability of the system failure event, identify the critical components and failure modes, and evaluate the effectiveness of preventive and corrective actions. FTA is useful for analysing complex and safety-critical systems, but it can be time-consuming and data-intensive.
An example of the use of FTA is shown below. Pump/motor assembly – No Flow.
In this fault tree diagram example, we see an illustration of a pump or motor assembly not having any flow. This event is the main failure, and below it, we can see the initiating events: Mechanical failure and electrical failure. To the right of the fault tree diagram, below electrical failure, we can see further events and failures, one being a motor failure and the other being a fuse failure. Below the fuse failure, we see that a circuit overload event is occurring and, below it, two different basic events: Wire shorted and/or a power surge.
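The pump example can be quantified with a small sketch, assuming the basic events are independent and that every gate in the tree is an OR gate (the description above does not state the gate types, so this is an assumption). All probabilities are hypothetical, and the AND gate is included only to show the other common gate type.

```python
import math


def or_gate(probabilities: list[float]) -> float:
    """Output occurs if at least one independent input event occurs."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


def and_gate(probabilities: list[float]) -> float:
    """Output occurs only if all independent input events occur."""
    return math.prod(probabilities)


# Hypothetical basic-event probabilities (assumed values).
p_mechanical_failure = 0.010
p_motor_failure = 0.020
p_wire_shorted = 0.001
p_power_surge = 0.005

# Work upwards from the basic events to the top event.
p_circuit_overload = or_gate([p_wire_shorted, p_power_surge])
# Assumption: a circuit overload is the only cause of fuse failure.
p_fuse_failure = p_circuit_overload
p_electrical_failure = or_gate([p_motor_failure, p_fuse_failure])
p_no_flow = or_gate([p_mechanical_failure, p_electrical_failure])

print("P(circuit overload)  :", p_circuit_overload)
print("P(electrical failure):", p_electrical_failure)
print("P(no flow)           :", p_no_flow)
```

With OR gates throughout, any single basic event is enough to cause the top event, so the largest basic-event probability dominates the result.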
Reliability Calculations / Formulae (Graphics)
In addition to the reliability prediction models, we take a brief look at reliability calculations, using graphics as illustrative models to conceptualize the calculation process.
We use one example of four components in series and another of three components in parallel. The final case is a combined series-parallel configuration, which is used to manage the system’s redundancy.
- Series configuration
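The basic calculation for series system reliability, where $R_i$ is the reliability of component $i$ and $n$ is the number of components, is given by:
$$R_{\text{series}} = R_1 \times R_2 \times \cdots \times R_n = \prod_{i=1}^{n} R_i$$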
In a series system, all components or subsystems must function for the system to function. The system reliability cannot be greater than the smallest component reliability. It is important for all components to have high reliability, especially when the system has many components.
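The series rule can be checked numerically with a short sketch. It assumes independent component failures; the four reliability values are hypothetical and stand in for the four-component example.

```python
import math


def series_reliability(component_reliabilities: list[float]) -> float:
    """All components must work, so the reliabilities multiply."""
    return math.prod(component_reliabilities)


# Hypothetical four-component series system (assumed values).
four_components = [0.99, 0.98, 0.97, 0.96]
system = series_reliability(four_components)

print("System reliability:", system)
print("Weakest component :", min(four_components))

# The more components in series, the lower the system reliability,
# even when each component is individually very reliable.
for n in (1, 4, 10, 50):
    print(n, "components at 0.99 each ->", series_reliability([0.99] * n))
```

The system value always falls below the weakest component, and the loop shows how quickly reliability erodes as the component count grows.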
- Parallel Configuration
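The basic calculation for parallel system reliability, where $R_i$ is the reliability of component $i$ and $n$ is the number of components, is given by:
$$R_{\text{parallel}} = 1 - \prod_{i=1}^{n} (1 - R_i)$$
In a parallel system, all components must fail for the system to fail; in other words, a parallel system works as long as any one of the components works. The advantages of connecting equipment in parallel are: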
- To improve the reliability of the system by making some of the equipment redundant to others.
- Extensive preventive maintenance can be pursued with no loss in plant availability since the separate parallel units can be isolated.
- In the event of failure, corrective maintenance can be arranged under less pressure from production or from competing maintenance tasks.
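The reliability gain from redundancy can be illustrated with a short sketch. It assumes the parallel units fail independently and that any single unit can carry the full load; the three reliability values are hypothetical.

```python
import math


def parallel_reliability(component_reliabilities: list[float]) -> float:
    """The system fails only if every component fails."""
    return 1.0 - math.prod(1.0 - r for r in component_reliabilities)


# Hypothetical three-component parallel system (assumed values).
three_components = [0.90, 0.90, 0.90]
system = parallel_reliability(three_components)

print("Single unit       :", three_components[0])
print("Three in parallel :", system)
```

Here the system is more reliable than its best component, the opposite of the series case.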
- Combined Series-Parallel Systems
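The basic calculation for combined system reliability (for high level), where $m$ identical copies of a series system of $n$ components are connected in parallel and $R_i$ is the reliability of component $i$, is given by:
$$R_{\text{high}} = 1 - \left(1 - \prod_{i=1}^{n} R_i\right)^{m}$$
Redundancy in systems can be accomplished at the system level (high level), at the component level (low level), or through some combination of the two.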
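The difference between the two redundancy levels can be compared with a short sketch. It assumes independent failures and identical duplicates: high-level redundancy duplicates the whole series chain, while low-level redundancy duplicates each component and then connects the duplicated stages in series. The function names and reliability values are hypothetical.

```python
import math


def series_reliability(reliabilities: list[float]) -> float:
    """All blocks must work."""
    return math.prod(reliabilities)


def parallel_reliability(reliabilities: list[float]) -> float:
    """At least one block must work."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


def high_level_redundancy(component_reliabilities: list[float], copies: int) -> float:
    """Duplicate the whole series chain and put the copies in parallel."""
    chain = series_reliability(component_reliabilities)
    return parallel_reliability([chain] * copies)


def low_level_redundancy(component_reliabilities: list[float], copies: int) -> float:
    """Duplicate each component, then put the redundant stages in series."""
    stages = [parallel_reliability([r] * copies) for r in component_reliabilities]
    return series_reliability(stages)


# Hypothetical two-component chain with two copies of everything.
components = [0.90, 0.90]
high = high_level_redundancy(components, copies=2)
low = low_level_redundancy(components, copies=2)

print("No redundancy        :", series_reliability(components))
print("High-level redundancy:", high)
print("Low-level redundancy :", low)
```

Under these assumptions, low-level redundancy gives the higher reliability for the same number of components, because one failed duplicate no longer disables an entire chain.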
Failure mode and effect analysis
Failure Mode and Effect Analysis (FMEA) is one of the methods used to analyse reliability, but its complex structure prohibits a full rendition in this blog. FMEA is a reliability procedure that documents all possible failures in a system design within specified ground rules. It determines, by failure mode analysis, the effect of each failure on system operation and identifies single failure points that are critical to mission success or crew safety. It may also rank each failure according to the criticality category of its effect and its probability of occurrence. This procedure is the result of two steps: the Failure Mode and Effect Analysis (FMEA) and the Criticality Analysis (CA).
In a nutshell, the Kumar course on Maintenance Engineering can be summarised as follows, in terms of what we have learned or what the course attempted to teach us:
- Understand and explain failures and failure mechanisms, and the basics of maintenance engineering and technology.
- Identify the factors influencing maintenance costs and recognize the consequences of poor maintenance.
- Explain the trade-offs between preventive maintenance efforts and maintenance costs.
- Understand the relationship between reliability and maintenance.
The web of our life is of a mingled yarn, good and ill together. All’s well that ends well, still, the fine’s the crown; Whate’er the course, the end is the renown. Steals ere we can affect them
William Shakespeare, 1623.
- Firstly, I thank Infinite Intelligence who connected all the dots from Sweden, Randburg, and Centurion to manifest the learning series to the community of ODI practitioners of Key 9.
- Secondly, I acknowledge Professor Uday Kumar of the Lulea University of Technology, Sweden, who presented this Online course on Maintenance Engineering through the FutureLearn Online Learning platform.
- Thirdly, I am grateful to ODI’s triumvirate: Huibie Jones, Roland Rohrs, and Johan Benadie who have meticulously edited all of these series, as well as other blogs that have been published before.
- Lastly, thank you Erica Hesse for throwing the gauntlet of writing these blogs, hoping you will accept my proposed retirement from writing them to AD2058!”
Fred Schenkelberg. “Building and Using a System Reliability Model”. https://accendoreliability.com/overview-system-reliability-modeling/
Abdurrahman ÜNSAL, Bekir MUMYAKMAZ, N. Serdar TUNABOYLU. “PREDICTING THE FAILURES OF TRANSFORMERS IN A POWER SYSTEM USING THE POISSON DISTRIBUTION: A CASE STUDY”. https://www.emo.org.tr/ekler/c22590152f4f53f_ek.pdf
Anshul Vaidya. https://www.benchmarksixsigma.com/forum/topic/39327-reliability-block-diagram/ (Posted May 3, 2022)
Rockwell Automation. https://fiixsoftware.com/glossary/fault-tree-analysis/