As with most things in this world, long-term investment and implementation usually yield the best results and quality. Maintenance is no exception. While the construction of the Panama Canal was nothing short of a disaster in terms of how long it took and how much it ended up costing, there is something to be said for its longevity. There’s a reason it is one of the world’s most important freight routes and is included in the “Seven Wonders of the Industrial World.”
Many things are built and maintained well and there are few quick fixes that work well in the long run. The Panamanians understood this and reached out for help in advance when it came to taking over the canal and its maintenance.
It always surprises me how much focus is placed on the price of equipment rather than its quality or cost of ownership, when most of us know that few good things come cheap and that we seldom pay little and get a lot.
People call me all the time wanting to know how to cut costs. My answer is always the same: “Cutting maintenance costs seldom improves reliability.”
“Listen, I agree with you, reliability is very important and the key to competitiveness and survival, but first we have to cut the cost,” is a common answer.
It bears repeating what I see as one of the biggest threats to successful reliability and maintenance globally as we head into the 2020s: efforts to cut maintenance costs. Executives often push for it, and my answer is always the same: There are only three ways to cut maintenance costs, and only two of them are right.
One, just cutting costs, which yields short-term savings and long-term loss. Valid maintenance jobs can never be avoided, only deferred, and as with most things, the price tag only goes up the longer we wait.
Two, reducing the need for maintenance by buying the right equipment, preventing maintenance, making precision repairs, lubricating properly, operating equipment correctly and many other things. Done correctly, this will both increase reliability and lower costs.
Three, executing the remaining maintenance more efficiently. This is mostly about work management: inspecting to find failures early, planning and scheduling work, and executing it. In a healthy reliability culture, the focus must clearly be on the latter two.
We see this in nearly all organizations IDCON works with. My son Tor, who has run the daily operations of IDCON since 2009, sees it more than I do now that I am semi-retired, but we discuss it frequently. Sustainable improvements in the reliability and maintenance (RM) area are impossible within a short time frame. Short-term perspectives generate short-term gains and long-term losses. Managing maintenance with a primary focus on costs can only be done by validating work and executing it efficiently, not by cutting or deferring it.
Valid maintenance work cannot be eliminated. It can only be postponed, and many times this leads to higher costs, and even disasters, later. There are many examples of this in the infrastructure of our society, including disasters claiming lives, like collapsing bridges. Tragically, deferred maintenance has been reported as the cause of many of these disasters. The 1994 sinking of M/S Estonia in the Baltic Sea, one of the worst maritime disasters of the twentieth century, with 852 lives lost, was likely due, at least in part, to deferred maintenance. There are many theories, including some conspiracy theories that she may have been sabotaged, but the official report was very critical of the crew’s shortcomings. My guess is that it was most likely a combination of things: poor design together with absent or poor inspections of the components that failed.
Since the late 1960s I have used Life Cycle Cost (LCC) to explain what terotechnology and physical asset management are. I see LCC as a measurement of how well terotechnology and physical asset management have been implemented. Sometimes people mix these terms up a bit, but they all technically mean the same. So let’s have a look.
In the late 1960s, Dennis Parks, then manager at Eutectic Castolin, introduced the term “terotechnology” and the term was later added to the British Encyclopedia.
Tero is ancient Greek, meaning “to care for.”
A common definition of terotechnology is the maintenance of assets in an optimal manner. It is the combination of management, financial, engineering, and other practices, applied to physical assets such as plant, machinery, equipment, buildings and structures in pursuit of economic life cycle costs. Today we often call it physical asset maintenance management, and it also takes into account the processes of installation, commissioning, operation, maintenance, modification and replacement. Decisions are influenced by feedback on design, performance and cost information throughout the lifecycle of a project.
Asset management is a systematic process of deploying, operating, maintaining, upgrading, and disposing of assets cost-effectively. It is broadly defined and refers to any system that monitors and maintains things of value to an entity or group. It may apply both to tangible assets such as buildings and to intangible concepts such as intellectual property and goodwill. The term is commonly used in the financial world to describe people and companies that manage investments on behalf of others, for example investment managers that manage the assets of a pension fund.
There are some alternative views of asset management in engineering. Among engineers, it is the practice of managing physical assets to achieve the greatest return (particularly useful for productive assets such as plant and equipment), as well as the process of monitoring and maintaining facilities systems, with the objective of providing the best possible service to users (public infrastructure assets).
As the term relates to maintenance, I think it is important to clarify and call it Physical Asset Management. Today we have international standards for Physical Asset Management. Hopefully that will clear up some of the confusion.
And regardless of what you call it, just cutting maintenance costs will only yield less productivity, which in turn will produce less profit.
The importance of reliability and maintainability design
Not every project allows us to work with reliability and maintainability design, but with the Panama Canal, we did that as part of the training. We advised them to procure equipment based on the lowest cost to operate and maintain over, say, 20 years, rather than the lowest cost to buy. This is what we call Life Cycle Cost (LCC), which also includes the cost of unreliability. This was of course also tied in with what documentation is needed: a Bill of Materials (BOM), repair instructions, preventive maintenance, troubleshooting charts and other maintenance-related documentation. If reliability and maintainability design is built into the early stages of a project, reliability and maintenance costs can be much improved.
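The difference between buying on purchase price and buying on LCC can be sketched with a few lines of Python. This is only an illustration: the two bids, all dollar figures and the function name are hypothetical, and the sum is undiscounted (a real evaluation would normally discount future costs).

```python
def life_cycle_cost(purchase, annual_operating, annual_maintenance,
                    annual_unreliability, years=20):
    """Undiscounted life cycle cost: purchase price plus yearly costs over the period."""
    yearly = annual_operating + annual_maintenance + annual_unreliability
    return purchase + years * yearly

# Hypothetical bids in dollars; none of these figures come from the Panama Canal project.
bid_cheap = life_cycle_cost(purchase=1_000_000, annual_operating=120_000,
                            annual_maintenance=90_000, annual_unreliability=60_000)
bid_robust = life_cycle_cost(purchase=1_400_000, annual_operating=100_000,
                             annual_maintenance=50_000, annual_unreliability=20_000)

print(f"Lowest purchase price: ${bid_cheap:,} over 20 years")
print(f"Lowest cost to own:    ${bid_robust:,} over 20 years")
```

With these assumed numbers, the bid that costs more to buy is the cheaper one to own over 20 years.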
Reliability design is done to assure that a production system and its equipment will function reliably when needed. It can include adding redundant functions, using the right material and choosing components with a low failure rate.
Designing for maintainability is done to make it safer, faster and easier to replace parts and perform repairs. It can include easy access to equipment, easy access to lifting devices, and design enabling visual inspections of components.
For the armed forces, the army, navy and air force, and things like railways and mass communication, reliability and maintainability design is an established and very familiar concept. This is of course because procurement of new equipment is massive and its technical life must be long.
A great example of reliability and maintainability design is the jeep built during World War II, which could be assembled and disassembled very quickly. Another is the design of a tank where the whole engine and drive train can be changed in 30 minutes, assuming all parts are on hand.
At the opposite end of the spectrum we have the Renault models of the 1970s, where a light bulb for the headlights could not be changed without removing the whole headlight assembly. This hurt the French car manufacturer, especially since with other brands it was just a simple snap-out and snap-in job.
The ideas presented in our training with the people on the Panama project were included in the specifications when they placed the order for the mules. One key in our training is to not specify how to design, but rather to specify the design in terms of “performance.” Some basic examples of performance-based design are: “Gears for wire tension drum shall be easy to inspect,” and: “Drive motor shall be possible to replace on-site in less than 30 minutes.”
Simplify, Simplify, Simplify.
The fewer components a system has in its design, the more reliable it will be. One of the early steps in reliability design is to document and analyze the production system or equipment draft design, using reliability block diagrams, and then analyze reliability and maintainability of equipment and its components.
If the reliability of each component in a series of three components is assumed to be 90%, the system reliability is 0.9 x 0.9 x 0.9 = 72.9%. In a system where the B function is designed with redundancy, total reliability will not improve if both B functions must work; with four components in series it drops to 0.9 x 0.9 x 0.9 x 0.9 = 65.6%. If only one of the B functions must work while the other is on standby in case of breakdown, the total system reliability improves to 80.2%. Assuming the reliability of each component is 90%, the formula for this system is 0.9 x [1 - (1 - 0.9) x (1 - 0.9)] x 0.9 = 80.2%.
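The series and parallel rules used in this example can be sketched in Python as follows. The function names are my own, and the sketch assumes that components fail independently of each other and that the standby unit always takes over when needed.

```python
from math import prod

def series(*reliabilities):
    """All blocks must work: multiply the reliabilities."""
    return prod(reliabilities)

def parallel(*reliabilities):
    """At least one block must work: one minus the product of the failure probabilities."""
    return 1 - prod(1 - r for r in reliabilities)

r = 0.9  # assumed reliability of every component

three_in_series = series(r, r, r)               # three components in series
both_b_required = series(r, r, r, r)            # two B functions, both must work
standby_b = series(r, parallel(r, r), r)        # two B functions, one is enough

print(f"{three_in_series:.1%}  {both_b_required:.1%}  {standby_b:.1%}")
# prints: 72.9%  65.6%  80.2%
```

Nesting `parallel` inside `series` mirrors how a reliability block diagram is drawn: redundant blocks are collapsed into one equivalent block, which is then multiplied with the rest of the chain.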
Understanding reliability block diagrams (RBDs) and gathering experience from many types of industries can help you estimate the reliability of an existing production line.
I worked with an iron ore mining company where they used ball mills to crush the ore into smaller particles for continued processing in a pellets plant. They reported a reliability of 78 percent based on 100 percent throughput capacity, producing an average of 780 tons out of a possible 1,000 tons. I suggested they could reach over 90 percent with better maintenance. They argued that it was impossible. “How do you know we can do that? What experience do you have with these ball mills? You never worked here.” I often hear similar arguments. In this case, I could claim I had worked in this exact type of industry in several countries, and the best of them produced over 90 percent of their production lines’ capacity. They still appeared suspicious and continued supplying arguments for how they were different and why they would not reach above 90 percent.
In situations like that it’s helpful to explain the basics of reliability block diagrams and then boil that down to some practical examples. So I did.
A tissue and paper towel machine can reach a production reliability of 96 percent. This is calculated as quality (Q), the percentage of quality product out of total product made, multiplied by time efficiency (T), the time spent making product as a percentage of the 8,760 hours of production time available per year (Q% x T%).
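As a rough sketch, the Q% x T% calculation can be written like this in Python. The function name and the tonnage and hour figures are hypothetical, chosen only so that the result lands near the 96 percent mentioned above.

```python
HOURS_PER_YEAR = 8_760  # available production time per year

def production_reliability(good_tons, total_tons, production_hours):
    """Quality share (Q) multiplied by time efficiency (T)."""
    quality = good_tons / total_tons
    time_efficiency = production_hours / HOURS_PER_YEAR
    return quality * time_efficiency

# Hypothetical year: 98% of product is good, and the machine makes product 8,585 hours.
example = production_reliability(good_tons=98_000, total_tons=100_000,
                                 production_hours=8_585)
print(f"Production reliability: {example:.1%}")
```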
A tissue and towel machine can be complex. Many are about 165 to 246 feet long, but compared with a paper machine making paper for liquid packaging, which can be up to 1,200 feet long, it is less complicated. Still, a tissue and towel machine might have 50 times more components in series than a ball mill used in the mining industry to crush ore. A paper machine making paper for liquid packaging has perhaps several hundred more components in series than the ball mill. These machines can reach a production reliability of 84 percent on the same basis as for the tissue and towel machine described above.
[Figure: tissue machine, liquid packaging paper machine and ball mill]
In contrast to the paper machines, a ball mill, to simplify, only has an inlet for the ore to be crushed, the rotating mill, one electric motor, a gear, two couplings and a pinion gear driving the rotation. After seeing this example, people in this mining operation realized they were not as good as they thought they were and that they had great potential for improvement.
The more components a system has in a series design, the less reliable the system is. It is therefore important to strive for simplicity and as few components as possible in a series.
Component reliability diagram: At an average reliability of 99% for each component, a system with 10 components in series will have a system reliability of 90.4%. With an average reliability of 95% for each component, the system reliability falls to 59.9%.
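To see how quickly reliability erodes as components are added in series, the calculation can be tabulated with a short Python sketch. It assumes every component has the same reliability and fails independently; the function name and the component counts other than 10 are my own choices for illustration.

```python
def series_reliability(component_reliability, n_components):
    """Reliability of n identical, independent components in series."""
    return component_reliability ** n_components

for r in (0.99, 0.95):
    for n in (1, 5, 10, 50):
        system = series_reliability(r, n)
        print(f"component {r:.0%}, {n:>2} in series -> system {system:.1%}")
```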
One way to improve a system’s reliability is to add redundant equipment. Having redundant, or backup, equipment is common if its cost is less than the consequences of a system breakdown. Here’s a scenario:
Reliability block diagram: The reliability of three pumps with one on standby can be calculated by redrawing the first RBD illustration. The system works if any of the three possible pump combinations works: C1 and C2, or C1 and C3, or C2 and C3. With each pump at 80% reliability, the plain parallel formula, 1 - (1 - 0.8) x (1 - 0.8) x (1 - 0.8) = 99.2%, gives the probability that at least one of the three pumps works. When two pumps must run with one on standby, the system works if all three pumps work or exactly two of them work: 0.8 x 0.8 x 0.8 + 3 x (0.8 x 0.8 x 0.2) = 89.6%. This is a simplified example, as you should also include the reliability of an automated switchover function.
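A pump arrangement like this is what reliability engineers usually call a k-out-of-n system. The sketch below is one way to compute it in Python, assuming identical pumps that fail independently and ignoring the switchover function; the function name is hypothetical.

```python
from math import comb

def k_out_of_n(k, n, r):
    """Probability that at least k of n independent, identical components work."""
    return sum(comb(n, i) * r**i * (1 - r)**(n - i) for i in range(k, n + 1))

pump_reliability = 0.8

at_least_one = k_out_of_n(1, 3, pump_reliability)   # same as 1 - (1 - 0.8)^3
two_of_three = k_out_of_n(2, 3, pump_reliability)   # two pumps must run

print(f"at least one of three: {at_least_one:.1%}")  # 99.2%
print(f"at least two of three: {two_of_three:.1%}")  # 89.6%
```

With k = 1 the function reduces to the ordinary parallel formula, which makes it easy to see how much reliability is given up when two of the three pumps are required.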
Operational as well as reliability and maintainability designs must be included very early in the LCC phases. In my opinion, this concept can only be justified for long-term investments where we can expect a long technical life for the equipment.
One problem is how we reward project managers who have “on time and under budget” as a goal. Any problems resulting from this reward system will show up much later in the manufacturing and maintenance budgets.
Including maintainability and reliability designs early on in a project is usually a very good investment. The challenge is to bridge the gap between Capital Expenditure (CAPEX) and Operations/Maintenance budgets. It is also important that feedback from operations and maintenance experience be incorporated into engineering standards on an ongoing basis.
Most industries we work with do not do this very well. We recently did a startup review for a client about to launch a new $770-million plant and found that some major things were missing in their project contract. Guards for rotating equipment, belt drives and several other components weren’t designed to easily allow visual inspections. Sensors that would allow safe access to components for condition monitoring and measuring were not installed. They hadn’t set up a preventive maintenance system and didn’t even have a bill of materials included in the contract with the manufacturer. All of this should have been part of the initial project contract. Instead, this client faces a much higher cost, since they will have to redesign guards, set up PM, document a BOM, and complete several other steps they had bypassed.
As a rule of thumb: when you have spent about 50 percent of the time in the design specification phase of a project, you have locked in about 80 percent of the future Life Cycle Cost (LCC) for a piece of equipment. LCC includes operating and maintaining as well as the cost of energy, keeping or not keeping spare parts, and disposing of equipment.
Making a design change late in the design/specification phase can get costly, sometimes 10 times or more what it would have cost if you had thought of the change from the start. If you have already signed a contract to proceed, it can cost about 100 times or more to make a change. And after the equipment has been operating for a number of years, the cost to modify it can be 1,000 times higher.
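These rules of thumb are easy to put into numbers. The Python sketch below uses the rough multipliers from the text; the phase labels, the function name and the $500 base cost are hypothetical and only illustrate the order of magnitude.

```python
# Rule-of-thumb multipliers for the cost of a design change, by project phase.
CHANGE_COST_MULTIPLIER = {
    "start of design": 1,
    "late design/specification": 10,
    "after contract signed": 100,
    "after years of operation": 1_000,
}

def change_cost(base_cost, phase):
    """Approximate cost of a change that would cost base_cost at the start of design."""
    return base_cost * CHANGE_COST_MULTIPLIER[phase]

base = 500  # hypothetical cost in dollars if the change is specified from the start
for phase in CHANGE_COST_MULTIPLIER:
    print(f"{phase:<28} ${change_cost(base, phase):>9,}")
```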
If plants could take a more holistic approach to the management of physical assets and bridge the gap between the project phase and the operating and maintaining phase of equipment life, much cost, lost production and many safety incidents would be avoided.
This applies not only to technical changes. Other examples are documentation, a detailed Bill of Materials and the interface with stores.
Christer Idhammar is the founder of IDCON, Inc., a management consulting firm (idcon.com). This article was excerpted from a recent book authored by Mr. Idhammar entitled Knocking Bolts. More information can be found on this book at https://www.idcon.com/reliability-and-maintenance-books/