Anytime you see the phrase "mean time to," you're looking at the average time between two events. Mean time to repair (MTTR) is a metric used by maintenance departments to measure the average time needed to determine the cause of and fix failed equipment. It gives a snapshot of how quickly the maintenance team can respond to and repair unplanned breakdowns. It's important to remember that the MTTR calculation covers the period from the beginning of the incident to the time the equipment or system returns to production. This includes the time spent diagnosing the failure, making the repair, testing it and returning the asset to normal operation.
The MTTR formula does not take into account lead time for spare parts and is not meant to be used for planned maintenance tasks or shutdowns.
MTTR, as it pertains to maintenance, is a good baseline for figuring out how to increase efficiency and limit unplanned downtime, which saves money on the bottom line. It also highlights where repairs are taking longer than normal; addressing those cases gets critical equipment back up and running faster, minimizing missed orders and improving customer service. In the interest of efficiency, MTTR analysis also provides insight into how your team purchases equipment, schedules maintenance and handles maintenance tasks.
Even though MTTR is considered a reactive maintenance metric, tracking it gives you a look into how effective and efficient your preventive maintenance program and tasks are. Equipment with a lengthy repair time might have underlying root causes that contribute to the failure, and MTTR can help you start investigating those root causes and get you on your way to a solution. For example, if you notice MTTR increasing for a particular asset, it may be because preventive maintenance tasks aren't standardized. A technician might get a work order saying to lubricate a certain part, but the work order may not lay out which lubricant to use or how much, leading to further equipment failures.
MTTR analysis is also helpful when it comes to making decisions on whether to repair or replace an asset. If a piece of equipment takes longer to repair as it gets older, it might be more economical to replace it. MTTR history can also be used to help predict lifecycle costs of new equipment or systems.
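The repair-or-replace reasoning above can be sketched as a simple trend check. This is a minimal illustration, not a prescribed method: the repair log, the yearly grouping and the function names `mttr_by_period` and `is_rising` are all hypothetical.

```python
from collections import defaultdict


def mttr_by_period(repairs):
    """Group (period, repair_hours) records and return {period: MTTR in hours}."""
    totals = defaultdict(lambda: [0.0, 0])  # period -> [total hours, repair count]
    for period, hours in repairs:
        totals[period][0] += hours
        totals[period][1] += 1
    return {p: total / count for p, (total, count) in sorted(totals.items())}


def is_rising(values):
    """True when every value is larger than the one before it."""
    return len(values) > 1 and all(b > a for a, b in zip(values, values[1:]))


# Hypothetical repair log for one asset: (year, hours spent on the repair).
repairs = [
    (2021, 2.0), (2021, 3.0),
    (2022, 3.5), (2022, 4.5),
    (2023, 6.0), (2023, 5.0),
]

trend = mttr_by_period(repairs)
print(trend)                            # {2021: 2.5, 2022: 4.0, 2023: 5.5}
print(is_rising(list(trend.values())))  # True
```

A steadily rising MTTR is only a prompt to investigate; the replacement decision itself still depends on costs that this sketch does not model.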
You'll often hear the "R" in MTTR expanded as either "repair" or "recovery." The difference between the two terms is that mean time to recovery includes not only the repair time but also the testing period and the time it takes to return to normal operation. Many people define MTTR by lumping the two together, as we did above. The only time you'll need to distinguish between the two is in the context of maintenance contracts or service level agreements (SLAs), so that everyone knows exactly what they need to be measuring.
As we touched on earlier, the MTTR formula is the total unplanned maintenance time divided by the total number of repairs (failures). MTTR is most commonly represented in hours. Keep in mind, MTTR assumes tasks are performed sequentially and by trained maintenance personnel.
A simple example of MTTR might look like this: if you have a pump that fails four times in one workday and you spend a total of one hour repairing those four failures, your MTTR would be 15 minutes (60 minutes / 4 = 15 minutes).
Another example could involve an asset that experiences five outages in a 90-day period. The outage times (time of detection to time the asset is back in production) are 24, 51, 79, 56 and 12 minutes. The MTTR for this 90-day period is about 44 minutes (222 minutes / 5 outages = 44.4 minutes). That is the average time from the detection of the issue to the recovery of the asset.
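As a minimal sketch of the formula, the calculation for the five outage durations listed above can be written as follows. The function name `mean_time_to_repair` is illustrative, not taken from any particular tool.

```python
def mean_time_to_repair(repair_minutes):
    """Total unplanned repair time divided by the number of repairs."""
    if not repair_minutes:
        raise ValueError("at least one repair is required")
    return sum(repair_minutes) / len(repair_minutes)


# Outage durations in minutes, from detection to return to production.
outages = [24, 51, 79, 56, 12]

print(f"MTTR: {mean_time_to_repair(outages):.1f} minutes")  # MTTR: 44.4 minutes
```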
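There are two assumptions to keep in mind when calculating MTTR: maintenance tasks are performed sequentially, and they are performed by appropriately trained maintenance personnel.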
It's been said that some of the best maintenance teams in the world have an MTTR of less than five hours, but it's almost impossible to benchmark your facility's MTTR against another's because of the number of variables involved. MTTR depends on factors such as the type of asset you're analyzing, its age and criticality, and how well the maintenance team is trained.
When dealing with systems or equipment that can be repaired, MTTR and MTBF are two metrics often analyzed and compared when looking into failures that can result in costly downtime. So, what's the difference between the two? Mean time between failures (MTBF) is a prediction of the time between the inherent failures of a piece of machinery during normal operating hours; in other words, how long a piece of equipment operates without interruption. It's calculated by taking the total time an asset is running (uptime) and dividing it by the number of breakdowns that happened over that same period.
MTBF analysis helps maintenance departments strategize on how to increase the time between failures. Together, MTBF and MTTR determine uptime. To calculate a system's uptime with these two metrics, use the following formula:

Uptime = MTBF / (MTBF + MTTR)
Consider the following scenario: Your system is supposed to be up and running for 40 hours, but it wasn't working for 28 of those hours. It's only been available for 12 hours, and a total of five failures occurred. To find uptime, we'll first calculate MTBF by taking (40 - 28) / 5 = 2.4 hours. Next, we'll calculate MTTR by taking 28 / 5 = 5.6 hours. Uptime is then MTBF / (MTBF + MTTR) = 2.4 / (2.4 + 5.6) = 0.30, or 30 percent, which matches the 12 available hours out of 40.
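The same scenario can be checked with a short sketch. It assumes the figures given in the text (40 scheduled hours, 28 hours of downtime, five failures) and computes MTBF as operating time divided by failures, that is (40 - 28) / 5. The function names are illustrative.

```python
def mtbf(scheduled_hours, downtime_hours, failures):
    """Operating time (scheduled minus downtime) divided by the number of failures."""
    return (scheduled_hours - downtime_hours) / failures


def mttr(downtime_hours, failures):
    """Total unplanned downtime divided by the number of failures."""
    return downtime_hours / failures


def uptime_fraction(mtbf_hours, mttr_hours):
    """Uptime = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


scheduled, downtime, failures = 40, 28, 5
between = mtbf(scheduled, downtime, failures)  # 2.4 hours
repair = mttr(downtime, failures)              # 5.6 hours

print(f"Uptime: {uptime_fraction(between, repair):.0%}")  # Uptime: 30%
```

The result agrees with the direct ratio of available to scheduled hours (12 / 40), which is a useful sanity check on the two intermediate values.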
MTTR is seen as a key performance indicator (KPI). Therefore, maintenance teams should always strive to improve it. The benefits of reducing MTTR are fairly obvious – less downtime means stable production, happy customers and reduced maintenance costs. So, what are some steps you can take to help improve your organization's MTTR? The best place to start is understanding the four stages of MTTR and taking steps to reduce each of them.
Diagnosing the cause of the failure is usually the most time-consuming part of MTTR; by some estimates, as much as 80 percent of MTTR is spent figuring out what caused the asset or system to fail. Keeping a well-managed machine ledger on hand, with things like maintenance schedules, repaired or replaced components and a history from equipment monitoring systems, is vital to quickly narrowing down possible causes of failure. In a failure scenario, critical time is lost to phone calls, meetings and incorrect diagnoses that lead to fixes that fail.
In the same failure scenario, having proper documentation and an asset history lets you quickly examine all causal factors that may have contributed to the failure. Management can look at the maintenance calendar to see if the machine has been consistently maintained, see when the machine last had a component repaired or replaced, and check to see where that particular machine has had problems in the past.
Detailed written procedures should be made available to all maintenance personnel and followed precisely to mitigate the risk of trial and error when it comes to making repairs. Procedures provide technicians with a structured sequence of actions that help minimize the time it takes to fix an issue.
All the documenting and preplanning in the world will not help reduce your MTTR if your technicians aren't properly trained with the right skill set needed to repair your equipment. Implementing continuous training exercises and sharing them with the team is vital. Discussing recurrence matrices and introducing one-point lessons are great ways to do this.
Even though the MTTR formula generally doesn't consider lead time for spare parts, it's important to acknowledge how the availability of spare parts affects MTTR. In his dissertation, "A structured approach for the reduction of mean time to repair of blast furnace D, ArcelorMittal, South Africa, Vanderbijlpark," Alex Thulani Madonsela discusses human factors contributing to MTTR, one of which is the availability of spare parts. "Timely availability of spare parts affects the duration of maintenance tasks," he explains. "Without proper support of equipment when required, executing maintenance becomes difficult for maintenance personnel. The lack of spare parts and knowledge of where to find them negatively affects the MTTR when maintenance has to be performed." Madonsela goes on to detail an approach to help minimize MTTR by having an organized inventory of spare parts.
Perhaps the best chance for an organization to reduce its MTTR is by implementing modern monitoring technologies. Onsite or remote monitoring done via a smartphone or tablet gives you a 24/7 look into how your system is performing. This real-time data can be used to track metrics like MTTR and let plant engineers design preventive maintenance plans and plan for failures in advance.
Modern computerized maintenance management systems (CMMS) help you easily track data like labor hours spent on maintenance, number of breakdowns and operational time, which are used to monitor high-level failure statistics. A CMMS can even calculate MTTR and MTBF automatically for you.

You may have heard of the internet of things (IoT), the interconnection of everyday devices to the internet. It has already taken hold in the consumer world in the form of smart homes, where you can control your heating and air conditioning units, lights and locks from your smartphone. It is also making its way into the industrial world.
The industrial internet of things (IIoT) introduces automation, real-time data analysis and smart decision-making into the world of manufacturing by combining machine-to-machine technology with connected sensors. This allows for things like tracking failure data in real time when equipment breaks down and automatically gathering, aggregating and analyzing data before sending a recommended action to technicians. Failure data, like the asset's operating condition before the failure occurred, and historical repair data from your CMMS can be used to direct repairs. In other words, the IIoT can greatly shorten the diagnosis phase discussed earlier, the part of MTTR that takes the longest.
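To make the automated tracking mentioned above more concrete, here is a rough sketch of how MTTR and MTBF could be derived from timestamped failure records. It does not reflect any specific CMMS product: the record layout (a detection time and a restoration time per failure), the observation window and the function name `summarize` are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def summarize(work_orders, window_start, window_end):
    """Return MTTR and MTBF in hours for (detected, restored) timestamp pairs."""
    if not work_orders:
        raise ValueError("at least one failure record is required")
    downtime = sum((restored - detected for detected, restored in work_orders),
                   timedelta())
    failures = len(work_orders)
    uptime = (window_end - window_start) - downtime
    hour = timedelta(hours=1)
    return {
        "mttr_hours": downtime / failures / hour,
        "mtbf_hours": uptime / failures / hour,
    }


# Hypothetical records: two failures inside a 48-hour observation window.
orders = [
    (datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 10, 0)),   # 2 hours down
    (datetime(2024, 1, 2, 14, 0), datetime(2024, 1, 2, 18, 0)),  # 4 hours down
]
stats = summarize(orders, datetime(2024, 1, 1), datetime(2024, 1, 3))

print(stats)  # {'mttr_hours': 3.0, 'mtbf_hours': 21.0}
```

This sketch treats the whole window as scheduled operating time; a real system would also need to exclude planned maintenance and shutdowns, which MTTR is not meant to cover.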