Building an M&R Program for Asset Reliability

John Sewell, IDCON INC

There are numerous factors that influence asset reliability. Since time and money are limited resources, we must determine what facet of overall reliability to prioritize. Sadly, it’s not uncommon to see resources focused on the wrong area.

Well-intentioned projects and initiatives are undertaken that don’t provide the expected benefits. An underlying contributor to this phenomenon is a lack of understanding about the reality of our maintenance and reliability system. Overall reliability is determined by the one limiting factor, not the best aspects of the program. 


Consider a common scenario: A new maintenance manager is hired who has a proven record of success in their last role. The new manager announces an improvement project similar to what worked for them previously. Priorities are adjusted, resources are allocated, and work begins. Several months later the overall reliability has not increased. The performance has been raised in one area but there hasn’t been a positive impact on overall reliability. Everyone is exhausted and frustrated by the lack of progress, and the new manager’s reputation takes a hit. The manager is either reluctant to engage in future change or is being reined in by leadership’s lack of confidence. The root cause of the failure was a lack of appreciation for the unique situation at the facility. What works in one plant, mill, or mine may not work in another. Before starting a project to improve asset reliability, take time to uncover the gaps in the unique situation.


A Model to Assess M&R Programs

Liebig’s law of the minimum gives us a model to visualize our maintenance and reliability program.

The law is often pictured as a barrel with staves of unequal height: the barrel can hold water only up to the top of its shortest stave. The highest overall reliability we can achieve corresponds to the maximum height of water in the barrel. Each stave represents a particular facet of our reliability program. Raising an already tall stave won’t increase overall reliability. We must identify and improve our program’s limiting factor; only then will we see an increase in overall equipment reliability.
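The barrel model can be sketched in a few lines of Python. This is only an illustration: the stave names follow the program facets this article describes, while the 0 to 100 scores, the function names, and the scoring scale itself are hypothetical assumptions, not assessment data from any facility.

```python
# Hypothetical assessment scores on a 0-100 scale (illustrative numbers only).
staves = {
    "work_management": 40,
    "preventive_maintenance": 65,
    "shutdown_management": 70,
    "materials_management": 55,
    "root_cause_problem_elimination": 60,
    "engineering_interface_and_technical_database": 75,
    "skills_for_maintenance": 80,
    "tools_and_workshops": 85,
}


def water_level(scores):
    """Overall reliability is capped by the shortest stave."""
    return min(scores.values())


def limiting_factor(scores):
    """Name of the shortest stave, i.e. the facet to improve first."""
    return min(scores, key=scores.get)


baseline = water_level(staves)

# Raising an already tall stave leaves the water level unchanged.
taller_tall_stave = dict(staves, tools_and_workshops=95)

# Raising the shortest stave lifts the water level.
taller_short_stave = dict(staves, work_management=50)

print(limiting_factor(staves), baseline)
print(water_level(taller_tall_stave), water_level(taller_short_stave))
```

With these made-up scores, investing in tools and workshops changes nothing, while the same effort applied to work management raises the overall level until the next-shortest stave becomes the limit.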

As maintenance and reliability professionals, we’re often tasked with increasing overall reliability, increasing throughput, or lowering costs. Management wants to raise the level of the water in the barrel. The first reaction may be to increase the rate at which we add water.

We pour in resources by asking people to work harder or longer hours. We tell craftspeople to be safe while not changing any working conditions or equipment designs. We require reliability engineers to create lengthy failure analysis reports and don’t invest in closing gaps in our fundamental maintenance disciplines. We require maintenance managers to approve every purchase order and storeroom managers to reduce inventory without verifying bills of material or improving the work management process.

These efforts seem to be effective in the short term. Inevitably, though, the wave of effort will overtop the lowest stave in our maintenance and reliability barrel and come spilling out. We’ll see a rash of injuries, downtime will increase, and costs will begin to climb. The bottom line is that people can only be as effective as the system allows.

A bad system will beat a good person every time.

- Dr. W. Edwards Deming

Before launching another initiative that has a poorly defined connection to concrete deliverables, conduct an analysis of the current situation. Assess the different aspects of your system and look for gaps. Only once the current situation is clearly understood can you be confident that the improvement work will generate the desired outcomes.

The Widest Staves

Two staves are wider than the others and play a vital role in overall reliability. If either of these staves is the shortest in the barrel, the water will quickly leak out and a high level of reliability will never be achievable.

Work Management

The first wide stave is work management. Being able to plan and schedule maintenance work is a prerequisite for reliability. For planning, we determine the what, how, and how long of a maintenance activity, while for scheduling, the who and when are determined. Symptoms of a short work management stave include crews planning their own work on the fly, no weekly schedules being posted, a frustratingly growing backlog, and crews not offering feedback on completed work. To lengthen the work management stave, look to improve the operations and maintenance partnership, develop a backlog management process, properly plan and schedule work, and follow through with good work execution and data collection in the field.

Preventive Maintenance

The second wide stave is equally important. To achieve a high level of overall reliability, there must be a focus on preventive maintenance (PM). Symptoms of a short PM stave include overlapping or missed inspection points, frequencies based on emotional priorities, dirty equipment that masks failure symptoms, and frequent unexpected breakdowns. To improve PM performance, begin with a clear and documented strategy. Conduct maintenance prevention activities such as proper cleaning and lubrication, and ensure the equipment is set up to allow for precision alignment. Use the right condition monitoring tools and conduct on-the-run inspections whenever possible.

The Narrower Staves

There are six other staves that support the level of overall reliability. These staves are narrower, and symptoms of a short stave might not be as obvious. However, the level of overall reliability will be negatively impacted.

Shutdown Management

The first narrow stave is shutdown management. The cost of shutdowns varies by industry but will impact the overall reliability of almost every facility. Outages need to be carefully managed to ensure quality work is being performed and to limit cost overruns. To lengthen the shutdown management stave, document a management process. Follow a predetermined schedule and hold post-outage critique meetings.

Materials Management

Materials management supports the work management process and is necessary to achieve a high level of reliability. Parts should be stored correctly and there must be a high level of inventory accuracy. To increase overall reliability, reserve and kit parts as an element of the planning process.

Root Cause Problem Elimination

Root Cause Problem Elimination (RCPE) is the third supporting stave. Look for signs that problems are recurring to determine if this is reducing the overall reliability. If issues are found, develop an overall process for problem elimination and train team members. Set triggers for when problems will be investigated and collect evidence. Determine root cause with a logical thinking process and follow up to ensure the corrective actions were successful.

Engineering Interface and Technical Database

To hold a high level of reliability, engineering must include operations and maintenance early in projects. Equipment needs to be evaluated on the overall lifecycle cost. The technical database supports work management and must be carefully managed to keep the information up to date. Changes need to be recorded and all related documents updated. Identify an owner for each component of the technical database and train planners and crews on how to use the information to improve work execution in the field.

Skills for Maintenance

A lack of skills in maintenance will limit overall reliability. With fewer skills, more detailed planning must be done, and more supervisors are required. Develop a skills management plan and ensure individual training plans are completed.

Tools and Workshops

The right tools need to be available to perform high quality work. Specialty tools must be provided when needed. Workshops should be kept clean and well lit.

The Hoop Holding it all Together – Leadership and Organization

An important, but often overlooked, part of the barrel is the hoop holding all the staves together. The height and width of the staves are irrelevant if there’s not a strong band holding them in place. Having strong leadership and being organized for success underlies every aspect of maintenance and reliability improvements. A clear company strategy that recognizes the important part world-class maintenance and reliability plays is a first step. Structure the organization to reduce silos and improve the partnership between operations and maintenance. Use key performance indicators wisely to drive the desired behaviors. Develop SMART goals and provide frequent feedback to the team.

A Call to Prioritize Based on Deeper Understanding

To make lasting improvements, the system itself must be improved. Evaluate the leadership and organization and begin by making changes that will support future work. Next, target the lowest staves. The widest staves of work management and preventive maintenance will require the most effort to shore up. Raising the level of these two staves is vital if substantive changes are to occur. Narrower staves require fewer resources but will take focused effort to fundamentally improve.
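The ordering described above (leadership and organization first, then the staves from lowest to highest) can be expressed as a small Python sketch. The function name, the stave scores, and the 0 to 100 scale are hypothetical, chosen only to illustrate the sequence; a real assessment would supply its own findings.

```python
def improvement_order(stave_scores):
    """Return a work sequence: the hoop first, then staves from lowest to highest.

    stave_scores maps a stave name to a hypothetical 0-100 assessment score.
    """
    # Leadership and organization is addressed first because it supports
    # every later change.
    order = ["leadership_and_organization"]
    # The lowest-scoring staves are the limiting factors, so they come next.
    order.extend(sorted(stave_scores, key=stave_scores.get))
    return order


# Illustrative scores only, not data from a real plant.
scores = {
    "preventive_maintenance": 45,
    "work_management": 40,
    "materials_management": 55,
    "skills_for_maintenance": 80,
}

for step, name in enumerate(improvement_order(scores), start=1):
    print(step, name)
```

The sketch ranks by score alone. In practice the wide staves, work management and preventive maintenance, take more effort to raise, so the list is a starting point for discussion and not a schedule.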


Consider the hypothetical situation again with a different outcome. Before a maintenance manager is hired, the corporate and plant leadership openly and honestly evaluate their maintenance and reliability program. They find the limiting performance factor to be work management processes. This information is openly shared across the plant and the search for a qualified candidate begins. The plant develops a sense of urgency about the work management systems and a new maintenance manager is hired based on their proven record of success in this particular field. Targeted project goals are developed with the new manager able to add immediate value. The project is resourced and supported by corporate and plant leadership. Quick wins occur, the coalition of support grows, and after a few months no one can imagine work management the old way. The new maintenance manager is buoyed, and the company can celebrate success and be energized to begin the analysis again, eager to take on a new challenge.


Not everything can be done at once. Design your program objectives around levers that will raise the lowest staves. Focus on improving the system behind the frontline workers. After the background system is improved, disseminate the changes to the crews, planners, and supervisors. Use leadership and organization to drive the change and make it last. Improving overall reliability is achievable by understanding the current situation and focusing on bringing the limiting factors up to a higher level of performance.

