Bearing analysis: Troubleshoot the problem, not the failure

Stuart Courtney, SKF
Tags: bearings, maintenance and reliability, condition monitoring

The objective of this article is to develop the mind-set of detecting and fixing problems, not just detecting failures. We often see examples of totally wrecked bearings and, alongside them, the spectral and vibration data that detected the failure. Moving beyond that requires a multi-stage approach: the vibration monitoring program must be used to detect the problem at the earliest opportunity, and the maintenance department must act on that finding (which may not mean changing the bearing; it may just be a lubrication problem). If the bearing is changed, it is essential that it is changed at the right time. That is the key. If it is changed too early, people say the system is flawed. If it is changed too late, other components may be damaged, and the evidence that could tell us what the problem was may be destroyed. The aim is to be proactive and not reactive.

The decision-support system SKF Bearing Inspector is aimed at offering increased speed, consistency and higher quality in the bearing decision-making process. It should help to prevent bearing damage or failure from recurring. As with any knowledge-based computer system, SKF Bearing Inspector gathers the relevant information and experience available about rolling bearing damage, from basic principles to practical engineering results. Reasoning from symptoms back to possible causes does not reflect how damage arises in reality and can easily lead to wrong conclusions. This is simply because the causes (e.g., wrong bearing mounting) result in the damage symptoms (e.g., signs of fretting) and not the other way around. Modeling the relationship from causes to symptoms, with uncertainty attached to “possible failure states”, fits much better with the physical phenomena that occur during bearing service life. With the aid of state-of-the-art computational intelligence techniques, this approach has been followed for the development of the program.

This article will follow the ISO 15243:2004 standard as a reference.

The problem
Condition monitoring tools are often used to detect defects or failure patterns in rotating machinery. Too often, we use them to be predictive in our maintenance planning, only to be reactive in what we actually do. Before we can study how to use the tools to prevent failures, we need to understand some of these buzzwords and look at what is needed to make use of the collected data. There must also be a strategy for determining what to collect and how to turn the data into effective information. Take the case of a bearing (figures below): Did we do a good job in detecting the problem, or did we just detect failure? You could say we prevented a catastrophic failure of the machine, but what was the cause, and can we prevent it from happening again?

Figure 1. Enveloped Spectrum of the Bearing

Figure 2. Waveform of the Bearing

Figure 3. Cyclical Time Analysis of the Bearing

This bearing had failed a number of times, but each time all that was done was to change the bearing, which is a very expensive and time-consuming job. By taking a time block of data that represents one revolution of the bearing, it is possible to join the ends and show the data as a profile plot. The data is then time-synchronous averaged using a virtual trigger set by the duration of one revolution. The averaged data clearly shows that there are two load zones in this bearing, which will eventually lead to stress in the inner race and cage, and then to failure. The journal was checked and found to be oval; it was then machined and the bearing correctly fitted. The bearing has been in service since and shows no sign of a problem.
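The averaging step described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce Figure 3: the function names, the 360-point angular grid and the synthetic test record are all assumptions, and the sketch presumes the shaft speed is known and steady so that a virtual trigger can be placed once per revolution.

```python
import math
import random


def time_synchronous_average(samples, sample_rate_hz, shaft_speed_hz, points_per_rev=360):
    """Average a vibration record over successive shaft revolutions.

    A virtual trigger is placed every 1 / shaft_speed_hz seconds. Each
    revolution is resampled onto the same angular grid by linear
    interpolation, so a non-integer number of samples per revolution is fine.
    """
    samples_per_rev = sample_rate_hz / shaft_speed_hz
    n_revs = int((len(samples) - 1) // samples_per_rev)
    if n_revs < 1:
        raise ValueError("record is shorter than one revolution")
    profile = [0.0] * points_per_rev
    for rev in range(n_revs):
        start = rev * samples_per_rev
        for k in range(points_per_rev):
            pos = start + k * samples_per_rev / points_per_rev
            i = int(pos)
            frac = pos - i
            profile[k] += samples[i] * (1.0 - frac) + samples[i + 1] * frac
    return [value / n_revs for value in profile]


def dominant_order(profile, max_order=8):
    """Return (order, amplitude) of the strongest per-revolution component."""
    n = len(profile)
    mean = sum(profile) / n
    best_order, best_amp = 0, 0.0
    for order in range(1, max_order + 1):
        re = sum((v - mean) * math.cos(2 * math.pi * order * k / n) for k, v in enumerate(profile))
        im = sum((v - mean) * math.sin(2 * math.pi * order * k / n) for k, v in enumerate(profile))
        amp = 2.0 * math.hypot(re, im) / n
        if amp > best_amp:
            best_order, best_amp = order, amp
    return best_order, best_amp


def synthetic_record(sample_rate_hz=10240.0, shaft_speed_hz=24.7, seconds=4.0, seed=0):
    """Hypothetical data: a twice-per-revolution load variation buried in noise."""
    rng = random.Random(seed)
    n = int(sample_rate_hz * seconds)
    return [
        1.0
        + 0.5 * math.cos(2 * math.pi * 2 * shaft_speed_hz * i / sample_rate_hz)
        + rng.gauss(0.0, 0.5)
        for i in range(n)
    ]


if __name__ == "__main__":
    record = synthetic_record()
    profile = time_synchronous_average(record, 10240.0, 24.7)
    order, amplitude = dominant_order(profile)
    print(f"strongest component: {order} per revolution, amplitude {amplitude:.2f}")
```

Averaging suppresses whatever is not locked to shaft rotation, so a pattern that repeats every revolution stands out. For a bearing with two load zones, as in the oval journal case, a twice-per-revolution component would be expected to dominate the averaged profile.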

Root cause failure analysis and proactive maintenance worked. It is important to use these techniques before the functional failure occurs. The key is to troubleshoot the problem, not the failure.

Lubrication
When the lubrication of a bearing starts to fail, it generally causes an increase in vibration, noise or acoustic emission. A lubrication management regime is often based on listening to the bearing. This can work but, by far, the best way is to trend the data against engineering units. The following trend shows what happened to a bearing when it was lubricated.

Figure 4.

It can be seen that lubrication apparently solved the problem, but the vibration level never returned to the level seen before the problem. The elevated level after lubrication was due to small particles of debris still in the grease. The time waveform data was taken while the bearing was being greased. It can clearly be seen that the problem has been hidden by greasing.

Figure 5.
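The behavior described above, an apparent fix that never returns to the original level, can also be checked numerically on trended data. The sketch below is only an illustration: the readings, the index positions and the 10 percent tolerance are hypothetical and are not taken from the data behind Figure 4.

```python
from statistics import mean


def residual_after_greasing(readings, problem_start, grease_index, tolerance=0.10):
    """Compare the post-greasing level with the pre-problem baseline.

    readings      -- overall vibration levels in one engineering unit, oldest first
    problem_start -- index of the first reading taken after the level began to rise
    grease_index  -- index of the last reading taken before greasing
    Returns (baseline, level_after, still_elevated).
    """
    baseline = mean(readings[:problem_start])
    level_after = mean(readings[grease_index + 1:])
    excess = (level_after - baseline) / baseline
    return baseline, level_after, excess > tolerance


# Hypothetical trend: stable, rising, then greased after the seventh reading.
trend = [0.50, 0.50, 0.52, 0.48, 1.20, 1.80, 2.40, 0.80, 0.82, 0.78]
baseline, level_after, still_elevated = residual_after_greasing(trend, 4, 6)
print(baseline, level_after, still_elevated)
```

A result that is still flagged as elevated suggests that greasing masked the problem rather than removing it, which is a reason to keep investigating instead of closing the job.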

Decision-support system for bearing failure mode analysis
Gaining insight and information from rolling bearing damage and failures is of strategic importance for SKF and its customers. The knowledge collected on bearing damage is accessible for SKF engineers as a Web-enabled decision-support system called SKF Bearing Inspector. Allied with the knowledge of how bearing defect patterns appear in condition monitoring systems, root cause failure analysis can be greatly enhanced.

Current knowledge-based systems have benefited from the experience of expert systems developed in the 1980s, although those systems suffered from major limitations in reasoning capacity and computer power. They were often structured as decision trees that led from symptoms to possible causes. Reasoning in that direction does not reflect how damage arises in reality and can easily lead to wrong conclusions. This is simply because the causes (e.g., wrong bearing mounting) result in the damage symptoms (e.g., fretting signs), and not the other way around. Modeling the relationship from causes to symptoms, with uncertainty attached to “possible failure states”, fits much better with the physical phenomena that occur during bearing service life. With the aid of state-of-the-art computational intelligence techniques, this approach has been followed for the development of the program.

Knowledge system
Within a knowledge system, one generally distinguishes between modeling the knowledge with a certain knowledge representation and the reasoning principle, in order to derive problem-solving capacity. Regarding knowledge representation, several forms exist, such as:

Cases: Many bearing failure experiences can be found in case examples. Unfortunately, many practical cases are not well documented, and no uniformity regarding the documented parameters or failure mode conclusions exists. Example cases can, however, be used to model or verify other knowledge representations.

Rules: It is possible to generalize if-then rules between observed symptoms and possible causes. However, this is not appropriate because different causes can have similar effects that appear as similar symptoms.

Artificial neural networks: Mathematical relationships between symptoms and causes can be derived by using example failure cases. However, there are not sufficient numbers of discriminating cases to do this. Furthermore, system users require additional explanations rather than “black box” artificial neural network relationships that do not carry such explanations.

Probabilistic networks: It is possible to derive visual networks, in which nodes are connected by causal relationships, based on bearing failure theory and experience. Furthermore, probabilities are assigned indicating the weakness or strength of those relationships. By introducing correct causality from conditions to observations, this knowledge representation best fits the bearing failure diagnosis problem.

Analysis of bearing damage and failure is principally a diagnostic task. Imagine a patient visiting his doctor with a specific complaint. The doctor first questions the patient about specific body and lifestyle parameters such as weight, smoking, etc. (conditions). Based on that information, the doctor makes hypotheses about likely diseases (failure modes). The doctor verifies or rejects these hypotheses through further questioning and inspection of the patient (symptoms). The process of a damage or failure analysis is similar to the doctor’s approach. In a correct diagnosis, there are two reasoning steps:

Hypothesis generation: Possible failure hypotheses are generated based on data. For example, the doctor starts asking questions to get an idea (hypothesis) of what could be wrong.
Hypothesis verification or rejection: One by one, the generated hypotheses are investigated and verified or rejected. For example, the doctor starts investigating the most probable diseases by conducting specific medical tests (blood pressure, heart rate, etc.).

With a probabilistic network, the two-step reasoning is implemented by forward and backward probability calculations.
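The two calculations can be illustrated with a toy network. The sketch below is an assumption-laden simplification, not the Bearing Inspector model: every probability is invented, the failure modes are treated as mutually exclusive, and symptoms are assumed independent once the failure mode is known. The forward step sums over internal mechanisms to generate hypotheses; the backward step applies Bayes' rule to each inspection result.

```python
# All numbers below are hypothetical and chosen only for illustration.

# P(internal mechanism | application conditions), for one fixed set of conditions.
MECHANISM_GIVEN_CONDITIONS = {
    "micro-movement at standstill": 0.35,
    "lubrication film disruption": 0.35,
    "current passage": 0.30,
}

# P(failure mode | internal mechanism).
MODE_GIVEN_MECHANISM = {
    "micro-movement at standstill": {"false brinelling": 0.80, "adhesive wear": 0.10, "current leakage": 0.10},
    "lubrication film disruption": {"false brinelling": 0.10, "adhesive wear": 0.80, "current leakage": 0.10},
    "current passage": {"false brinelling": 0.05, "adhesive wear": 0.05, "current leakage": 0.90},
}

# P(symptom is found | failure mode).
SYMPTOM_GIVEN_MODE = {
    "shallow depressions": {"false brinelling": 0.90, "adhesive wear": 0.10, "current leakage": 0.10},
    "smearing": {"false brinelling": 0.10, "adhesive wear": 0.85, "current leakage": 0.10},
    "small pitting": {"false brinelling": 0.05, "adhesive wear": 0.10, "current leakage": 0.90},
}


def generate_hypotheses():
    """Forward calculation: P(mode) = sum over mechanisms of P(mode | mech) * P(mech)."""
    belief = {}
    for mechanism, p_mech in MECHANISM_GIVEN_CONDITIONS.items():
        for mode, p_mode in MODE_GIVEN_MECHANISM[mechanism].items():
            belief[mode] = belief.get(mode, 0.0) + p_mode * p_mech
    return belief


def update(belief, symptom, found):
    """Backward calculation: Bayes' rule for one inspected symptom."""
    posterior = {}
    for mode, p in belief.items():
        likelihood = SYMPTOM_GIVEN_MODE[symptom][mode]
        posterior[mode] = p * (likelihood if found else 1.0 - likelihood)
    total = sum(posterior.values())
    return {mode: value / total for mode, value in posterior.items()}


belief = generate_hypotheses()
inspections = [("shallow depressions", False), ("smearing", False), ("small pitting", True)]
for symptom, found in inspections:
    belief = update(belief, symptom, found)

for mode, p in sorted(belief.items(), key=lambda item: -item[1]):
    print(f"{mode}: {p:.3f}")
```

With these invented numbers the three hypotheses start out almost equally likely, and the inspection results move nearly all of the probability onto a single failure mode. A real network would have many more nodes and would not rely on the independence assumption made here.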

More about probabilistic networks
The probabilistic network is a visual network in which nodes are connected by causal relationships, and probability calculations are applied. The network for bearing failure analysis has four node categories: conditions, internal mechanisms, failure modes and observed symptoms. Conditions represent the circumstances under which the bearing operates. Examples are speeds, bearing type, load, temperature, installation details, environmental factors, etc. Internal mechanisms represent the physical phenomena that happen during operation, such as lubrication film disruption, sliding contact, etc. Failure modes represent the types of failure, such as subsurface initiated fatigue and fretting corrosion.

In Table 1, the various failure modes are listed. Observed symptoms represent the observable phenomena inside and outside the bearing, including discoloration, spalling, rust, etc. Approximately 150 nodes are connected by causal relations between conditions of the bearing application, hidden mechanisms, physical failure modes and observed symptoms. Various sources of information were used in modeling the network. Apart from defining the nodes, the causal relations and the probabilities, explanation texts for each node, including examples and pictures, were developed. In total, approximately 250 pictures have been included in the system.

Figure 6. ISO 15243:2004

Figure 7.

Case study from Bearing Inspector
The Bearing Inspector contains several common bearing damage cases located under “Typical Cases”. These can be used as training material to demonstrate how the Bearing Inspector supports the analysis of a bearing damage investigation. One example is of an electric motor in a paper mill. In this case, an electrically insulated cylindrical roller bearing NU 322 ECM/C3VL024 is used in an electric motor of a paper winder in the reel section of a tissue paper machine. The electric motor speed is variable (400 VAC with frequency converter) and running between 1,000 and 1,500 min⁻¹. After only a month of operation, however, heavy wear was observed on the inner and outer rings. Loading the example case in SKF Bearing Inspector sets all known application conditions (Step 1).

The first hypothesis of possible failure modes is calculated based on these application conditions. At this point in the analysis, Bearing Inspector gives a high likelihood of false brinelling, adhesive wear and current leakage. At first sight, current leakage and false brinelling seem unlikely because the machine uses insulated bearings and all machines are properly supported with rubber pads. The user then performs the second step of the analysis by inspecting the bearing for failure symptoms. Clicking “inspect” produces a list of the damage symptoms most relevant to the selected failure mode.

The bearing is first inspected for false brinelling. Because no shallow depressions are found that can verify false brinelling, this failure mode is rejected. The analysis is continued with inspection of symptoms of adhesive wear. None of the symptoms related to adhesive wear are found, either. Finally, by inspecting electrical current leakage symptoms, the presence of small pitting is found after magnification of the raceway surface. This verified the current leakage failure mode. Subsequently, the customer indeed discovered an earthing problem in the winder construction causing the electrical current leakage.

Figure 8. Example Step 1: Application conditions are filled in by loading the electric motor winder data (among others, bearing type, coating and speeds). Detailed information and examples are provided under the information button.

Figure 9. Example Step 2: Bearing Inspector gives its initial diagnosis based on the information so far; the confidence factors are included.

Figure 10. Example Step 3: Inspection of symptoms for the current leakage failure mode. After inspection and enlargement of the raceway surface, small pitting is confirmed. Several examples are provided under the information button.

Figure 11. Example final diagnosis: Results are based on the provided application conditions (Step 1) and bearing system inspections (Step 2). Both the probabilities of the most relevant failure modes and related internal mechanisms are listed. The results can be printed out as a Microsoft Word document or an HTML report.

Instead of investigating all possible observations and non-filled-in conditions, the most relevant ones are suggested, dependent upon the failure hypothesis (or internal mechanisms) that need to be investigated. In other words, these are the application conditions or observations that have the most discriminating effect on the failure hypothesis. The discriminating effect is determined by a mathematical measure.

For all possible not-filled-in conditions or observations, this measure is scaled between 0 and 100. An example is given in the illustrations. Eventually, by investigating the application conditions and observations, the likelihood of the failure hypotheses and internal mechanisms is determined and ranked. These then form the conclusions of the bearing damage analysis. The system is further extended with various functions that can help the user. A simple file with user instructions is provided for getting started. Session data control is available for session data storage and retrieval. Also, in a file marked “Typical Examples”, users can be guided through the application of the program. For convenience, an extensive report can be generated in Microsoft Word or HTML format, including the relevant conditions, observations and failure mode probabilities.
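The article does not say which mathematical measure is used to rank the conditions or observations still to be checked. One plausible candidate, shown here purely as an assumption, is the expected reduction in uncertainty (entropy) about the failure mode, expressed as a percentage of the current uncertainty so that it falls between 0 and 100. The function names and probabilities are hypothetical.

```python
import math


def entropy(distribution):
    """Shannon entropy in bits of a {failure mode: probability} mapping."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0.0)


def discrimination_score(belief, p_found_given_mode):
    """Expected entropy reduction from inspecting one symptom, scaled to 0-100.

    belief             -- current {failure mode: probability}
    p_found_given_mode -- {failure mode: P(symptom is found | failure mode)}
    """
    current = entropy(belief)
    if current == 0.0:
        return 0.0  # nothing left to discriminate
    p_found = sum(belief[mode] * p_found_given_mode[mode] for mode in belief)
    expected = 0.0
    for found, p_outcome in ((True, p_found), (False, 1.0 - p_found)):
        if p_outcome <= 0.0:
            continue
        posterior = {
            mode: belief[mode]
            * (p_found_given_mode[mode] if found else 1.0 - p_found_given_mode[mode])
            / p_outcome
            for mode in belief
        }
        expected += p_outcome * entropy(posterior)
    return 100.0 * (current - expected) / current


# Hypothetical belief and symptom likelihoods.
belief = {"adhesive wear": 0.5, "current leakage": 0.5}
print(discrimination_score(belief, {"adhesive wear": 0.10, "current leakage": 0.90}))
print(discrimination_score(belief, {"adhesive wear": 0.30, "current leakage": 0.30}))
```

A symptom that is equally likely under every hypothesis scores 0 because inspecting it cannot change the ranking, while one that separates the hypotheses completely scores 100. Ranking the open questions by such a score would put the most discriminating inspection first.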

Conclusions
Bearing Inspector meets the need for a faster, more consistent, higher-quality decision-making process for bearing damage and failure investigations. This Web-enabled system is available for SKF engineers to support customers in bearing damage and failure investigations. It can help to determine how a bearing failed and, therefore, how to prevent the same failure from happening again. These failure patterns should then be used to determine how to configure a vibration-based condition monitoring program.