Single Point Lesson: Equipment Criticality Analysis

Excerpts from Rules of Thumb for Maintenance and Reliability Engineers

Ricky Smith, World Class Maintenance; R. Keith Mobley, Life Cycle Engineering

This article has been prepared to aid in the application of an Equipment Criticality Analysis. Included in this article are instructions for applying the Equipment Criticality Analysis methodology to determine which equipment has the greatest potential impact on achieving business goals.

Introduction

The Proactive Asset Reliability Process is shown in Figure 1. It is an integral part of a larger manufacturing business process. The Proactive Asset Reliability Process focuses the maintenance of physical asset reliability on the business goals of the company.

The potential contribution of the equipment asset base to these goals is recognized. The largest contributors are recognized as critical assets and specific performance targets are identified.

Figure 1

The role of the maintenance function, accomplished through the six (6) elements of the maintenance process, is to maintain the capability of critical equipment to meet its intended function at targeted performance levels.

The Equipment Criticality Analysis is used to identify:

Which equipment, if it fails, has the most serious potential consequences for business performance. The resulting Equipment Criticality Number is used to prioritize resources performing maintenance work.
Which equipment is most likely to negatively impact business performance, because it both matters when it fails and fails too often. The resulting Relative Risk Number is used to identify candidate assets for reliability improvement.

A consistent definition for equipment criticality needs to be adopted for any equipment analysis. The definition used in the context of this document is: critical equipment is equipment whose failure has the highest potential impact on the business goals of the company.

The relationship between equipment failure and business performance is an important factor in deciding where and when resources should be applied to maintain or improve equipment reliability.

Maintaining reliable equipment performance requires the timely execution of maintenance work to proactively address causes of equipment failure. Large organizations normally manage a backlog of maintenance work. This maintenance work is made up of individual tasks that must be carried out over limited time periods, using limited resources.

Equipment reliability improvement also requires the application of either human or financial resources. The business case for improvement justifies why the limited resources of the company should be applied to a project rather than to the many alternatives that usually compete for the same resources. When justifying an improvement project, it is not sufficient to demonstrate benefit. It is necessary to demonstrate that the relative benefits of a project exceed the potential benefits of other projects.

Equipment reliability improvement projects benefit the organization by reducing the consequences of failure and/or reducing the probability that the failure will occur. Equipment reliability improvement projects must focus on equipment that both matters a lot when it fails and is failing a lot.

The discipline of risk management recognizes that failures with high consequence normally occur infrequently, while failures with low consequence occur more frequently. This is represented graphically in the Risk Spectrum. The consequence of a failure is plotted against the probability of the failure event.

Probability is a measure of the number of events/unit time. The probability of an event like the nuclear accident at Chernobyl is very low, but the consequence is very high. Consequently, we don’t see a high frequency of accidents with this severity.

In contrast, many industrial organizations routinely experience failures within their plants. These failures impact business performance, but their consequences would be considered orders of magnitude less than those of a Chernobyl-like incident. Most plant failures would fall to the right side of the Risk Spectrum (Figure 2).

Figure 2

The prerequisite for Pareto Analysis is to have failure data to analyze. This means that these failures must have occurred to be recognized. However, potential failures with very serious consequences will not even be considered because there is no failure data to associate with them. Therefore, it is necessary to manage events across the ‘Risk Spectrum’.

Benefits of the Equipment Criticality Review Process

This process takes an integrated approach to setting project priorities. Potential impact of equipment failure is assessed in each of the following categories: safety, environmental integrity, quality, throughput, customer service, and operating costs. The scales in each assessment category ensure that equipment prone to failure resulting in safety and environmental consequences is emphasized.

It also ensures that equipment impacting on the operational objectives of the organization, when failure occurs, is addressed. Resources are continually being challenged by project assignments from different sectors of the organization with no unifying evaluation process to decide which should take priority. In the case of maintenance program development, it is not possible to develop a separate maintenance strategy for each business driver.

What is required is a comprehensive program that responds to the total needs of the organization. The equipment criticality analysis provides a prioritized view of composite needs, which then becomes the focus of a suitable Equipment Reliability Improvement Strategy. Projects with the potential to deliver the maximum benefit to the company by mitigating risk are identified to be the subject of Equipment Reliability Improvement Strategies.

Preparing for Equipment Criticality Analysis

Equipment Hierarchy Review

Prior to performing an Equipment Criticality Assessment, an "Equipment Hierarchy" must be produced. The Equipment Hierarchy needs to account for all equipment within the assessment area boundaries. This means that all maintainable components can be mapped and identified to an equipment or sub-equipment level.
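One way to picture such a hierarchy is as a simple tree in which every maintainable component maps to exactly one parent. The Python sketch below is illustrative only: the class name, its fields, and the example equipment names are assumptions made for this example and are not part of the published method.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)
class EquipmentNode:
    """One level in a hypothetical equipment hierarchy."""
    name: str
    children: List["EquipmentNode"] = field(default_factory=list)
    parent: Optional["EquipmentNode"] = field(default=None, repr=False)

    def add_child(self, child: "EquipmentNode") -> "EquipmentNode":
        """Attach a child node and return it so calls can be chained."""
        child.parent = self
        self.children.append(child)
        return child

    def path(self) -> str:
        """Full path from the top of the hierarchy down to this node."""
        parts = []
        node: Optional["EquipmentNode"] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " / ".join(reversed(parts))

    def walk(self) -> Iterator["EquipmentNode"]:
        """Yield this node and every node below it, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


# Hypothetical entries, used only to show the structure
line = EquipmentNode("Packaging Line")
conveyor = line.add_child(EquipmentNode("Conveyor"))
motor = conveyor.add_child(EquipmentNode("Drive Motor"))
line.add_child(EquipmentNode("Wrapper"))
```

Walking the tree from the top node gives a quick check that every item inside the assessment boundary appears exactly once.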

Registering the Equipment Criticality Analysis

All completed Equipment Criticality Analyses should be consistently documented and recorded in an appropriate database. An analysis title should reflect the highest level in the Equipment Hierarchy that the analysis will apply to. For example: XYZ Corporation, Port Operations, Sorting Plant, and Packaging Line Equipment Criticality Analysis. The date when the analysis is conducted should be recorded, along with the identification of the review team members and a description of their titles/positions.

The Equipment Criticality Analysis should be reviewed and revised on an annual basis to reflect changes in business conditions, improvements in reliability and to identify new priorities for reliability improvement. Different review team members may be involved in the analysis review. The original team should be documented as well as the team members for the last revision.

Document a list of equipment to be assessed at the appropriate analysis level.

The level at which the assessment is performed is important. It is undesirable to evaluate the criticality of individual components, and it would be equally inappropriate to evaluate criticality at the process or facility level. The results at the chosen level apply to all sub-level equipment not separately identified for analysis. Although somewhat imprecise, this provides a good definition for the first pass. During the evaluation, it quickly becomes apparent whether the equipment should be further divided into sub-levels.

The two factors used in the risk assessment are the potential consequence of the failure when it occurs and the probability that the failure will occur. If the analysis is conducted at too high a level, the resulting estimate of risk associated with the equipment may be misleading. This is illustrated in the accompanying Airplane example (Figure 3). If a risk assessment is done at the airplane level, the result will be that flying in airplanes is high risk.

What is the correct level in the equipment hierarchy to perform criticality analysis?

By simply moving the analysis down to a system level, a much different perspective is achieved. The Structural Systems of the airplane have a high consequence if they fail but are extremely reliable, having a low failure rate. The Airplane Propulsion Systems have perhaps a medium consequence when they fail because of built-in redundancy. Their failure rate is likely higher than the failure rate for structural systems. The relative risk is therefore greater. The comfort systems of the aircraft (such as seats, lights, entertainment plugs) have much lower failure consequences and likely much higher failure rates. Again, the overall risk is low. This result seems more reasonable.

Figure 4

The list of equipment is documented at the desired level in the hierarchy, complete with a definition of the parent relationship and the children included in each analysis line item. The facilitator prepares this list in advance of the analysis review meetings. It can also be revised during the review meetings. During the analysis, items and levels of detail omitted in the hierarchy are sometimes identified.

Define the Equipment Criticality Assessment Criteria

The Equipment Criticality Assessment is conducted by evaluating the potential impact of equipment failure on key business objectives. We will provide ‘default’ assessment criteria, but we also recognize that the client may wish to modify them or define a different set of criteria for their organization.

In the ‘default’ criteria, company goals are categorized under the themes of safety, environmental integrity, product quality, throughput, customer service, and total cost. An evaluation scale for consequence of failure potential is defined for each theme. If an equipment failure has no impact on a goal area, a score of zero (0) is assigned. If an equipment failure has an impact on a goal area, the rating that most closely fits the consequence description is assigned.

Safety and Environmental issues have a maximum scale of forty (40). Operational Consequences independently score a maximum value of ten (10).

Most equipment failures impact operations in several different ways and in the extreme case a total operating consequence of forty (40) could be achieved.
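The scoring limits described above can be captured in a small validation routine. The sketch below is a minimal illustration, not the authors' spreadsheet: the category keys and function names are assumptions, and the grouping of quality, throughput, customer service, and operating cost as the four operational categories is inferred from the stated maximum of forty (40).

```python
# Maximum score per consequence category, taken from the default scales
# described in the text: 40 for safety and environmental, 10 for each
# operational category. The key names themselves are assumptions.
MAX_SCORES = {
    "safety": 40,
    "environmental": 40,
    "quality": 10,
    "throughput": 10,
    "customer_service": 10,
    "operating_cost": 10,
}

OPERATIONAL = ("quality", "throughput", "customer_service", "operating_cost")


def validate_scores(scores):
    """Return a complete score dict, defaulting missing categories to 0.

    Raises ValueError for unknown categories or out-of-range scores.
    """
    unknown = set(scores) - set(MAX_SCORES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    checked = {}
    for category, maximum in MAX_SCORES.items():
        value = scores.get(category, 0)  # no impact -> score of zero
        if not 0 <= value <= maximum:
            raise ValueError(f"{category} score {value} outside 0..{maximum}")
        checked[category] = value
    return checked


def operating_consequence(scores):
    """Sum of the four operational categories (at most 40)."""
    checked = validate_scores(scores)
    return sum(checked[c] for c in OPERATIONAL)
```

If an organization redefines the criteria, only the `MAX_SCORES` table would need to change in a sketch like this.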

The ‘default’ criteria are provided in the following table (Table 3).

Table 3

Against each of the criteria it is possible to have an explanation. A set of qualitative descriptions is provided for the Environmental rankings in the next table (Table 4). Similar explanations could be provided for each of the assessment review areas.

Table 4

Similar assessment criteria are provided for reviewing how likely a failure is to occur on the selected equipment. This assessment is made along with the consequence evaluation for the asset being reviewed.

One way of interpreting how often the equipment fails is to assess how often any form of corrective maintenance is performed on the equipment. Differentiate corrective maintenance from preventive maintenance. The frequency or probability of failure number will be used in the calculation of relative risk to determine how likely the failure of the assessed equipment will impact the business. If an effective PM program controls failures, the equipment is unlikely to negatively impact business performance.

The Probability/Frequency of failure is evaluated on a scale ranging from 1 to 10, with 10 representing the highest failure rate. A description of the default criteria is provided below (Table 5). It is possible for an intermediate value to be selected, e.g., 8.5, signifying that failures are felt to occur between weekly and monthly.

Conducting the Review

Facilitating the Review Meetings

The Equipment Criticality Assessment is designed to achieve consensus among key decision-makers in an organization. Review team members are selected based on their ability to assess the consequence of equipment failure on the business, the frequency of individual equipment failure, and their responsibility for nominating or sponsoring Equipment Reliability Improvement Projects. The assessment process is designed to minimize the time that the review team must dedicate to attending the assessment review meetings.

The analysis is conducted by answering a series of structured questions about each equipment line item. These questions assess both the consequence of equipment failure and the frequency/probability of failure against the pre-defined assessment criteria. The Total Consequence evaluation is compiled from the group’s responses to the following questions using the assessment criteria for severity determination:

If the identified equipment fails, could it result in a Safety Consequence? If yes, how serious would you rate the “potential” consequence?
If the identified equipment fails, could it result in an Environmental Consequence? If yes, how serious would you rate the “potential” consequence?
If the identified equipment fails, could it result in a consequence affecting the quality of our product? If yes, how serious would you rate the “potential” consequence?
If the identified equipment fails, could it result in a consequence affecting the throughput capability of the plant? If yes, how serious would you rate the “potential” consequence?
If the identified equipment fails, could it result in a consequence affecting the service provided to the customer? If yes, how serious would you rate the “potential” consequence?

Answers to all these questions should be recorded in a spreadsheet during the review-team meetings.

Analyzing the Assessment Results

Calculate Equipment Criticality Number

The criticality of equipment is a function of its impact on the business when it fails, regardless of how often it fails. Not all failures matter equally. The Equipment Criticality Number assigned to an equipment level in the hierarchy is influenced by the severity of impact of failure and the consequence category. Equipment Criticality Numbers are assigned between 1 and 9. An Equipment Criticality of 9 is the highest and 1 is the lowest criticality.

During the review, the consequence of equipment failure is assessed against key company goal areas. The default criteria include the potential impact of failure on the safety and environmental integrity performance of the enterprise, considered fundamental to the continued operation of the business. Other key business goal areas such as product quality, throughput, customer service, and operating costs are assessed. The user may have redefined the assessment criteria as previously discussed.

The spreadsheet (Figure 5.3.8) can calculate and assign the Equipment Criticality Number using the following default logic. (This logic may need to be redefined by the organization if the consequence evaluation criteria are modified.)

Cascade Equipment Criticality Number To Applicable Levels in the Hierarchy

The Equipment Criticality Analysis is usually performed at an intermediate level in the hierarchy as described in section 2 of this document. The Equipment Criticality Number will apply to all children of the analysis level, except those children identified for analysis also. Any parent level not analyzed will adopt the Equipment Criticality value of the highest child. This is illustrated in Figure 5.3.9.

Figure 5.3.9 - Equipment Criticality Value
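The cascade rule can be sketched as two passes over the hierarchy: one that pushes each analyzed number down to unanalyzed children, and one that lets an unanalyzed parent adopt the highest number among its children. The function below is one possible reading of the rule, not the authors' spreadsheet logic; the function name, the dictionary-based inputs, and the example equipment names in the usage note are all assumptions.

```python
def cascade_criticality(children, analyzed):
    """Assign an Equipment Criticality Number to every node.

    children: dict mapping each parent name to a list of child names
    analyzed: dict mapping analyzed node names to their number (1 to 9)
    Returns a dict mapping every node name to its resolved number, or
    None where no number could be derived.
    """
    resolved = {}

    def push_down(node, inherited):
        # A node analyzed in its own right keeps its own number;
        # otherwise it takes the number of its nearest analyzed ancestor.
        value = analyzed.get(node, inherited)
        resolved[node] = value
        for child in children.get(node, []):
            push_down(child, value)

    def pull_up(node):
        # An unanalyzed parent with no inherited value adopts the
        # highest number found among its direct children.
        child_values = [pull_up(c) for c in children.get(node, [])]
        if resolved[node] is None:
            known = [v for v in child_values if v is not None]
            resolved[node] = max(known) if known else None
        return resolved[node]

    all_children = {c for kids in children.values() for c in kids}
    roots = [n for n in children if n not in all_children]
    for root in roots:
        push_down(root, None)
        pull_up(root)
    return resolved
```

For a hypothetical plant in which "Line A" is analyzed at 7, "Line B" at 4, and a "Seal" under an unanalyzed "Pump" on Line A is analyzed separately at 9, the pump inherits 7 from Line A, the seal keeps its own 9, and the unanalyzed plant level adopts 7, the highest value among its direct children.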

Determine which equipment has the greatest potential impact on business goals by calculating Relative Risk.

The Equipment Criticality Assessment uses the concept of Risk to identify which equipment has the greatest potential impact on the business goals of the enterprise. This, in turn, is the equipment most likely to fail and have significant impact when the failure occurs.

The “Relative Risk (RR)” number for the equipment is evaluated by calculating the product of the “Total Consequence Number” and the “Frequency/Probability (F/P) Number”. It is called “relative risk” because it only has meaning relative to the other equipment evaluated by the same method.

Total Consequence (TC) is the summation of the values assigned to each of the individual areas of consequence evaluation, e.g. Safety (S), Environmental (E), Quality (Q), Throughput (T), Customer Service (CS) and Operating Cost (OC).

TC = S + E + Q + T + CS + OC

RR = TC × F/P

If the user defines different criteria then it follows that the Total Consequence would be the summation of scores applied in each area of consequence evaluation defined by the user.
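As a worked illustration of the two formulas, the short Python sketch below computes Total Consequence and Relative Risk for a single equipment item. The function names and the example scores are hypothetical and chosen only to show the arithmetic.

```python
def total_consequence(s=0, e=0, q=0, t=0, cs=0, oc=0):
    """TC = S + E + Q + T + CS + OC."""
    return s + e + q + t + cs + oc


def relative_risk(tc, fp):
    """RR = TC x F/P, with F/P on the 1-to-10 scale."""
    if not 1 <= fp <= 10:
        raise ValueError("F/P must be between 1 and 10")
    return tc * fp


# Hypothetical scores for a single equipment item
tc = total_consequence(s=20, e=0, q=5, t=10, cs=5, oc=5)
rr = relative_risk(tc, fp=8.5)
print(tc, rr)  # prints: 45 382.5
```

The resulting number has no meaning on its own. It becomes useful only when compared with the Relative Risk of other equipment scored against the same criteria.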

Communicate the criticality assessment recommendations to all stakeholders.

The results of the Equipment Criticality Assessment should be communicated and understood by everyone affected by the nominated Equipment Reliability Improvement Projects. This includes:

Senior and intermediate managers who sponsor or expect results from the project.
Coaches and team leaders responsible for the assets that the project addresses.
People assigned to the assets that the project addresses.
Individuals who must commit time to the project or are directly affected by its outcome.
People who are not immediately affected.

Often the last group demonstrates the greatest opposition because they believe that the selected projects are "hogging" the financial and human resources needed to address their priorities.

The goal of this communication is to develop stakeholder understanding of why each Equipment Reliability Project is selected and of its potential impact on business performance, and to define the resources expected to deliver it.

Using the Output of the Equipment Criticality Assessment

Prioritizing Equipment for Reliability Improvement

The relative risk ranking provides a means of identifying which equipment poses the highest potential impact on the organization. The equipment with the highest ‘Relative Risk’ ratings should be initially targeted for the application of some reliability improvement strategy.

The most basic means of prioritizing assets for reliability improvement is to perform a sort of the assessed equipment by ‘Relative Risk’. In many applications, this method of establishing priority is sufficient for project nomination. The top ten equipment items evaluated using the ‘Relative Risk’ criteria would then be subject to a project selection validation.
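The basic sort described above takes only a few lines. This is a minimal sketch; the function name and the equipment names and values in the example are hypothetical.

```python
def rank_by_relative_risk(items, top_n=10):
    """Return the top_n items sorted by descending relative risk.

    items: iterable of (name, relative_risk) pairs
    """
    ranked = sorted(items, key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Hypothetical assessment results
assessed = [
    ("Conveyor", 120.0),
    ("Boiler feed pump", 382.5),
    ("Wrapper", 45.0),
    ("Air compressor", 210.0),
]
shortlist = rank_by_relative_risk(assessed, top_n=3)
```

Here `shortlist` holds the three highest-risk items in order, which would then go forward to project selection validation.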

However, the priority ranking developed using ‘Relative Risk’ alone does not consider how difficult it will be to improve the reliability of the critical equipment. Suppose this could only be achieved with a large commitment of human resources, over an extended time and at high cost. In assessing the business case for proceeding with the reliability improvement project, each of these factors plays a role.

An alternate prioritization method assesses the human resource effort for an equipment reliability intervention. Alternatively, the cost of the intervention, of the resulting redesign or equipment replacement can be evaluated. The following sections describe the process used to evaluate priority considering effort/cost.

The human resource effort required to proceed with the proposed equipment reliability improvement strategy is assessed. For example, the number of meetings to complete a Reliability Centered Maintenance Analysis is estimated. This effort provides an indication of the degree of difficulty that is required to overcome the performance gap.

Plotting Relative Risk/Effort Graph

The ‘Relative Risk’ value is plotted on the vertical axis of a graph and the effort on the horizontal axis. Intuitively, we want to work first on projects with high potential impact that can be done quickly, and last on projects with low impact that require large effort.

In order to prioritize the proposed interventions, a diagonal line is drawn from the upper left corner of the risk/effort graph to the lower right. The slope of this line is calculated by dividing the ‘Total Relative Risk’ (the sum of the ‘Relative Risk’ values for all equipment items evaluated) by the ‘Total Effort’ (the sum of the ‘Effort’ values estimated for all equipment items).

The downward slope of this line from the upper left to the lower right represents a reduction in risk per unit effort. Consider a series of lines, drawn perpendicular to this diagonal, completely covering the graph. Adjacent lines represent bands of relative priority.

A number one priority is assigned to the reliability intervention with the highest relative risk intercept. Lower priority is assigned to reliability interventions with successively lower relative risk intercepts.
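The graphical procedure can also be approximated numerically. The sketch below is one possible interpretation, not the authors' stated algorithm. It assumes the axes are scaled so that the diagonal and the priority lines appear at right angles on the plotted graph, which means each priority line rises at Total Relative Risk divided by Total Effort in data units. Under that assumption, the relative risk intercept of the line through an item is its Relative Risk minus the slope times its effort. The function name and the example values are hypothetical.

```python
def prioritize_by_risk_and_effort(items):
    """Rank interventions by relative-risk intercept, highest first.

    items: list of (name, relative_risk, effort) tuples, with effort in
    any consistent unit (for example, number of meetings).
    """
    total_risk = sum(rr for _, rr, _ in items)
    total_effort = sum(effort for _, _, effort in items)
    if total_effort <= 0:
        raise ValueError("total effort must be positive")
    slope = total_risk / total_effort  # risk per unit effort

    # Intercept of the priority line through each point with the
    # vertical (relative risk) axis, under the scaling assumption above.
    scored = [(name, rr - slope * effort) for name, rr, effort in items]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


# Hypothetical interventions: (name, relative risk, effort in meetings)
candidates = [("A", 300, 10), ("B", 300, 2), ("C", 100, 8)]
ranking = prioritize_by_risk_and_effort(candidates)
```

In this example, items A and B carry the same Relative Risk, but B ranks first because it needs far less effort, and C ranks last as a low-risk, high-effort intervention.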

An alternative to estimating human resource effort is to estimate the cost to proceed with the chosen equipment-reliability improvement strategy or equipment modification/replacement. This is an estimate of the cost required to overcome the performance gap.

Note: The use of this graph is a focusing tool only. The exact value and position on the graph is an indication of relative priority. Individual circumstances could require specific projects to proceed irrespective of their position on the graph.

For example, consider a piece of equipment whose failure has serious safety implications and whose high frequency/probability of failure results in a high relative risk number, but which requires a large expenditure of human and/or capital resources to improve its overall reliability. Legislation or a safety ruling may dictate that this project takes precedence over another asset scoring equivalent relative risk and requiring much less effort or cost. Nonetheless, the concept can be used successfully in most situations to develop a defensible position for assigning resources to address equipment reliability issues.

The application of the criticality assessment provides a means of identifying the equipment whose improved reliability is most likely to benefit business performance. Once potential Equipment Reliability Improvement Projects are nominated, they should be validated by developing a business case to proceed. The Criticality Assessment provides an indication of which areas of performance are likely to be impacted. In each category affected, which may include any or all of safety, environmental integrity, quality, throughput, customer service and operating cost, the current performance should be established and a performance target set that is considered achievable as an outcome of the improvement.

The difference between current performance and the desired end state should be quantified either in terms of costs for operational improvements or in terms of reduced incidents or level of risk for safety and/or environmental issues. This gap is important in creating the required tension for change to maintain management commitment throughout the project. Estimate the costs of the Reliability Improvement Intervention and summarize the cost benefit.

Identify what performance measures must be tracked to monitor the impact of the Equipment Reliability Improvements. As soon as capital or human resources are deployed, expectations are created to produce tangible benefits. The development of the business case solidifies what results can be expected from the Equipment Reliability Improvement Project.

However, it is still necessary to demonstrate the improvement. This is effectively done through the use of performance measurement. It is crucial that each of the stated performance benefits be monitored on a routine basis to validate improvement. If the required measurements are not currently collected, the project scope should formalize their creation. This permits the quantification of improvement benefits, sustaining project commitment and the management of long-term change.

Conclusions

The Equipment Criticality Evaluation Tool provides a systematic, consistent approach to assessing equipment criticality and nominating equipment reliability improvements. Rankings are arrived at by a consensus of the decision-makers responsible for project nomination. By design, the process can be completed in a short period of time.

The focus is on business results which managers are already accountable for achieving. They are committed to projects which align with these objectives and are perceived as having the highest probability for success.

Finally, the application of systematic processes for focusing resource deployment supports a “due diligence” approach to physical asset management. Projects having the largest potential impact on the corporation weighted towards safety and environmental integrity become the most critical. Projects with the potential to deliver the maximum benefit to the company by mitigating risk are identified to be the subject of Equipment Reliability Improvement Strategies.

About the Author

Ricky Smith

Ricky has over 30 years’ experience working as a Maintenance and Reliability Professional for companies such as Exxon Company USA, Alumax Mt Holly, Kendall Company and the US Army.

About the Author

R. Keith Mobley

R. Keith Mobley has earned an international reputation as a leader in corporat...
