Features Extraction for Classification in Switching Devices using Fiber Bragg Grating

Abstract: Vibration analysis systems are used to assess the operational condition of machines and electromechanical components in various applications. This work presents a measurement and feature extraction system that analyzes dynamic strain patterns in signals measured by a fiber Bragg grating (FBG). The features were used to identify different simulated operational conditions in an electromechanical relay. In the first approach, the best feature space was selected by statistical criteria that determine the threshold values and frequency bands used to calculate each signal's switching time and power spectral density (PSD). These parameters are used in the support vector machine (SVM) algorithm, which achieves 98 % accuracy in distinguishing four distinct conditions. A second feature extraction methodology, the wavelet scattering transform (WST), was used to demonstrate that even better performance can be achieved. The results allow the methodology to be extended to more complex systems.


I. INTRODUCTION
Switching devices are widely employed to protect, control, and command equipment in the electrical system. Some are particularly important, such as on-load tap changers (OLTC) and circuit breakers, which have several critical electromechanical components responsible for many failure modes [1]. Specific measurement systems have been developed to investigate patterns of vibration signals during switching. Most commercial instrumentation uses electrical accelerometers installed in the equipment tank [2] - [3]. The signals obtained by these sensors have a low signal-to-noise ratio in high-voltage environments due to cross-sensitivity with other electromagnetic sources inherent to this type of installation [4].
Furthermore, signal attenuation in the propagation medium worsens as the distance between the internal mechanical components and the sensors increases. Among the existing feature extraction methodologies, the analysis of typical signal signatures in the time domain stands out [2]. On the other hand, several methods use features from the frequency spectrum to identify operational deviations [3] - [5]. The fiber Bragg grating (FBG) has some advantages over conventional sensors, such as being chemically inert and immune to electromagnetic interference, which favor its use in environments close to the internal components of the equipment.
In general, vibration analysis techniques are based on measurements performed by electrical or acoustic sensors, and optical sensors are rarely considered for mechanical diagnosis in switching devices. Several methods can be applied to process only segments of the vibration signal in order to evaluate short-duration events related to the dynamic behavior present in the switching. The signal detection method proposed in this work is based on the double threshold method (DTM) [7], and statistical analysis is used to select the best threshold values. At the same time, some pattern recognition methods define several parameters to describe the signal. After obtaining all the features, it is necessary to reduce the dimensionality of the problem to decrease the processing time. Methods such as principal component analysis (PCA) select the most informative features [8]. In this case, the computational effort to obtain a set of representative features makes it difficult to apply in real-time monitoring systems. This work proposes a new concept for analyzing dynamic signals to diagnose switching equipment. Traditional methods use sensing that is susceptible to external interference and therefore require complex algorithms to synchronize the signal and separate its informative part from other sources [9]. In general, it is impossible to cover all possibilities, and some works show that signal processing strategies for eliminating external noise are unfeasible in practice [10].
There is a growing demand for new methods that are immune to cross-interference in environments with noise sources that are often unpredictable and difficult to eliminate.
In [11], the main advantages of using the FBG to measure temperature and strain in power system equipment are mentioned. Some related works propose FBG measurement applications for dynamic systems with a high acquisition rate. For example, measurements of high-frequency transient phenomena can be used to assess the mechanical integrity of power transformer windings during energization and to measure acoustic signals generated by partial discharges [12] - [13]. Another application in switching devices uses the FBG's temperature response to aid in the design of the arc-extinguishing chamber of low-voltage circuit breakers [14].
In the literature, FBG has a wide application in temperature measurements in electrical machines.
Recent works address vibration measurements in motors [15] and transformers [16], where the failure behavior and its correlation with stationary frequencies are well known. This work aims to evaluate the promising use of FBG to measure dynamic strain signal patterns and identify possible operational deviations in an electromechanical relay.
In this work, the first feature extraction approach uses prior knowledge of the physics of the system. Thus, this methodology defines the best thresholds to calculate the parameters for classification. The second approach uses a technique that, in short, generates a high-dimensional feature matrix for each signal without requiring knowledge of what happens in each recognition algorithm. The measured performance of the proposed classification methods supports extending this application to more expensive and complex systems in future studies.

A. Experimental Setup
The measurements were carried out with a low-voltage relay, model MLP-A-112, manufactured by MEISHUO. The operating principle is based on the electromechanical force generated when the actuation coil is connected to a 12 Vdc source. The movement of the armature closes the contact. After the electrical current is removed, the restoring force of the spring returns the contact to the open condition [17]. Fig. 1 describes the setup used for the tests.
The optical interrogator I-MON 256 USB OEM, manufactured by Ibsen Photonics, was used for data acquisition with a sample rate of 6 kHz. The FBG sensor was glued to the relay's outer casing with a thin layer of cyanoacrylate-based adhesive. The periodic variation of the refractive index in the FBG's core works as a stop-band filter that reflects part of the incident light centered at the central Bragg wavelength (λB), which depends on the period (Λ) and the effective refractive index (ηeff) of the FBG. These parameters are affected by temperature and mechanical strain [18]. The fundamental relationship between the FBG physical parameters and the central Bragg wavelength is presented in (1). The displacement of the central Bragg wavelength (ΔλB) can also be expressed using the effective strain-optic constant (pe), which is calculated from the effective refractive index and the mechanical constants of the material employed in the FBG's construction. The general expression presented in (2) is obtained from (1), with its terms calculated by taking the derivative with respect to the deformation (Δl) imposed on the FBG.
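The typeset expressions are not reproduced in this text. As a hedged reminder, the standard textbook forms of the Bragg condition and of its strain-induced shift at constant temperature are sketched below; they are assumed, not confirmed, to match the expressions that the text refers to as (1) and (2), and the symbol ε for relative strain is introduced here for illustration.

```latex
\begin{aligned}
\lambda_B &= 2\,\eta_{eff}\,\Lambda, \\
\Delta\lambda_B &= 2\left(\Lambda\,\frac{\partial \eta_{eff}}{\partial l}
                 + \eta_{eff}\,\frac{\partial \Lambda}{\partial l}\right)\Delta l, \\
\frac{\Delta\lambda_B}{\lambda_B} &= (1 - p_e)\,\varepsilon .
\end{aligned}
```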
A standard value for the fiber used in this work is pe = 0.2126. For a Bragg wavelength between 1530 and 1560 nm, the conversion of the wavelength shift in picometers (pm) to relative strain in microstrain (µε) is presented in (3) [17].
In this work, the dynamic strain generated while switching a low-voltage electromechanical relay was measured by one FBG sensor. The optical interrogator recorded each measurement in a time window of 5 seconds. The relay was closed and manually opened during this interval. The FBG used in the tests has a reflectivity near 90 %, a central wavelength of 1532 nm, and a full width at half maximum of 0.42 nm. The signal segment required to estimate the dynamic behavior of the switching event occurs on a millisecond scale, so it was not necessary to implement temperature compensation methods. Furthermore, the influence of the electrical arc was neglected because the sensor was positioned outside the active part. However, it is important to emphasize that such behavior should be evaluated in future studies with the sensor applied internally, close to the contacts.
The relay used in this work is approximately two centimeters long and one centimeter in height and width, which makes an invasive approach for inserting controlled defects in the internal components difficult. However, (4) shows that the resultant force (FRES) responsible for the impact intensity during contact movement depends on the electric current in the actuation coil (I), which produces the magnetic attraction field. The second parameter is the spring's elastic constant (k), which sets the restoring force that returns the system to its original equilibrium position [20].
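As a worked example of this conversion, the sketch below assumes the standard constant-temperature strain relation ΔλB = λB (1 − pe) ε; whether this is exactly the form of (3) is an assumption. Only pe = 0.2126 and the 1532 nm central wavelength come from the text, and the function names are illustrative.

```python
# Sketch: convert an FBG wavelength shift (pm) into relative strain (microstrain).
# Assumption: constant temperature, so delta_lambda = lambda_B * (1 - p_e) * strain.

P_E = 0.2126          # effective strain-optic constant quoted in the text
LAMBDA_B_NM = 1532.0  # central Bragg wavelength of the FBG used in the tests


def strain_sensitivity_pm_per_ue(lambda_b_nm, p_e=P_E):
    """Wavelength shift in pm produced by one microstrain."""
    # lambda_b_nm * 1e3 converts nm to pm; 1e-6 is one microstrain.
    return lambda_b_nm * 1e3 * (1.0 - p_e) * 1e-6


def shift_pm_to_microstrain(shift_pm, lambda_b_nm=LAMBDA_B_NM):
    """Relative strain in microstrain for a given wavelength shift in pm."""
    return shift_pm / strain_sensitivity_pm_per_ue(lambda_b_nm)


if __name__ == "__main__":
    print(strain_sensitivity_pm_per_ue(LAMBDA_B_NM))  # pm per microstrain
    print(shift_pm_to_microstrain(10.0))              # microstrain for a 10 pm shift
```

Under these assumptions, the sensitivity stays close to 1.2 pm per microstrain over the 1530 to 1560 nm range.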
where Κ1 is a constant that depends on the coil's inductance and magnetic configuration on the relay's core, and xspring is fixed as the distance between the movable part and the static part of the core.
Equation (4) indicates that reducing the actuation current simulates an effect analogous to the stiffening of the spring. Therefore, four distinct resistance values were connected between the actuation coil and the voltage source. For evaluation, twenty-five measurements were performed for each condition simulated by the four resistance values: 0 Ohms, 47 Ohms, 110 Ohms, and 137 Ohms, identified as operational conditions 1 to 4, respectively. The first condition, 0 Ohms, was selected to provide the maximum actuation current. On the other hand, the value of 137 Ohms was chosen because it yields a current near the minimum actuation current. The values of 47 Ohms and 110 Ohms were selected as two arbitrary intermediate states for evaluation.
The signal processing and machine learning algorithms were implemented in Matlab, version R2020b. Fig. 2 shows a complete measurement window with 30,000 points. It is essential to emphasize that only the relay's closing signal was analyzed. Thus, this region was cropped to a time window of 1,024 points. It is also important to emphasize that, in the reduced window, the first negative peak is not centered: it was positioned 700 samples to the right of the origin of the time axis to permit the switching time calculation for all operational conditions.
It is possible to observe that the signal stabilizes at another level after the impact. This behavior can be explained by the accommodation of the FBG sensor on the relay surface.
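The cropping step can be sketched as follows. The text does not say how the first negative peak was located, so this example assumes it is the global minimum of the record; the function name and the synthetic record are hypothetical.

```python
# Sketch: crop a 1,024-sample window so that the negative peak sits at sample 700.
# Assumption: the negative peak is taken as the global minimum of the record.

WINDOW_LENGTH = 1024  # samples kept for analysis
PEAK_OFFSET = 700     # position of the negative peak inside the window


def crop_closing_window(record, window_length=WINDOW_LENGTH, peak_offset=PEAK_OFFSET):
    """Return the slice of `record` with its minimum placed at `peak_offset`."""
    peak_index = min(range(len(record)), key=record.__getitem__)
    start = peak_index - peak_offset
    stop = start + window_length
    if start < 0 or stop > len(record):
        raise ValueError("record too short to place the peak at the requested offset")
    return record[start:stop]


if __name__ == "__main__":
    # Synthetic 30,000-point record (5 s at 6 kHz) with a single negative spike.
    record = [0.0] * 30_000
    record[12_345] = -1.0
    window = crop_closing_window(record)
    print(len(window), window.index(min(window)))
```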

B. Feature Extraction: Time and Frequency Analysis
The first methodology evaluates two features. The switching time is used as the principal feature. It is obtained by smoothing the measured signal from Fig. 2(b) with a moving average filter with a sliding window of 30 samples. The derivative was calculated with the finite difference method, and threshold values were defined to establish the start and end of the transient signal.
The derivative is expressed in nanometers per second and better identifies the growth and stabilization of the measurement.
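A minimal sketch of this procedure is given below, written in Python for illustration rather than in the Matlab environment used in the work. The default thresholds (0.035 and 0.03 nm/s) are the values reported in the results section. The unpadded moving average, the forward difference, and the rule used to locate the end point (the derivative must change sign, rise above the end threshold, and then decay below it) are assumptions about details the text does not specify.

```python
# Sketch: switching time from a smoothed signal and its finite-difference derivative.
# Assumptions: unpadded moving average, forward difference, and an end point found
# after the first sign change of the derivative.

FS_HZ = 6000.0  # interrogator sample rate quoted in the text


def moving_average(signal, window=30):
    """Moving average over `window` samples (no padding)."""
    if len(signal) < window:
        raise ValueError("signal shorter than the averaging window")
    total = sum(signal[:window])
    smoothed = [total / window]
    for i in range(window, len(signal)):
        total += signal[i] - signal[i - window]
        smoothed.append(total / window)
    return smoothed


def finite_difference(signal, fs_hz=FS_HZ):
    """Forward difference scaled to units per second."""
    return [(b - a) * fs_hz for a, b in zip(signal, signal[1:])]


def switching_time(signal_nm, start_thr=0.035, end_thr=0.03, fs_hz=FS_HZ, window=30):
    """Return the switching time in seconds, or None if no event is found."""
    deriv = finite_difference(moving_average(signal_nm, window), fs_hz)
    n = len(deriv)
    # Start: first sample whose derivative exceeds the start threshold in modulus.
    start = next((i for i in range(n) if abs(deriv[i]) > start_thr), None)
    if start is None:
        return None
    rising = deriv[start] > 0
    # First sign change of the derivative relative to the onset.
    flip = next((i for i in range(start + 1, n)
                 if deriv[i] != 0 and (deriv[i] > 0) != rising), None)
    if flip is None:
        return None
    # The derivative must rebound above the end threshold before it can decay.
    rebound = next((i for i in range(flip, n) if abs(deriv[i]) >= end_thr), None)
    if rebound is None:
        return None
    end = next((i for i in range(rebound, n) if abs(deriv[i]) < end_thr), None)
    if end is None:
        return None
    return (end - start) / fs_hz
```

The function returns `None` when no sample crosses the start threshold, which keeps records containing only noise from producing a spurious switching time.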
The threshold values were appraised using fundamental criteria of decision theory [21]. The mean and standard deviation of the switching time were calculated for each subset of measurements. The best separation was defined by comparing the distances between the means with the standard deviation values, as can be directly verified by plotting the probability density functions (PDF).
Similarly, the power spectral density (PSD) was used to determine a second parameter that describes the signal energy content in a frequency band. The average PSD was obtained by integrating the PSD over selected frequency bands and dividing by the bandwidth [22]. Each measurement was represented as a vector with two features: the first component is the switching time, and the second is the average PSD.
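One way to turn this comparison into a number is sketched below. The score |μa − μb| / (σa + σb) and the function names are illustrative assumptions; the text only states that the distances between means and the standard deviations were compared.

```python
# Sketch: score how well the switching times of different classes separate.
# Assumption: the score |mu_a - mu_b| / (sigma_a + sigma_b) is an illustrative
# choice, not necessarily the criterion applied in the work.
from statistics import mean, stdev


def separation_score(times_a, times_b):
    """Distance between class means in units of the summed standard deviations."""
    spread = stdev(times_a) + stdev(times_b)
    if spread == 0:
        return float("inf")
    return abs(mean(times_a) - mean(times_b)) / spread


def worst_pair_score(times_by_class):
    """Smallest separation score over all pairs of classes."""
    labels = sorted(times_by_class)
    return min(
        separation_score(times_by_class[a], times_by_class[b])
        for i, a in enumerate(labels)
        for b in labels[i + 1:]
    )
```

A candidate pair of thresholds could then be ranked by the worst-pair score of the switching times it produces, keeping the pair with the largest value.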
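The band-averaged PSD can be sketched as follows. The estimator used in the work is not stated, so this example assumes a plain one-sided periodogram with a rectangular window and removal of the mean; the direct DFT is chosen only to keep the example dependency-free, and the function names are illustrative.

```python
# Sketch: average PSD of a signal inside a frequency band (periodogram estimate).
# Assumptions: rectangular window, mean removal, one-sided periodogram, O(N^2) DFT.
import cmath
import math


def periodogram(signal, fs_hz):
    """Return (frequencies in Hz, one-sided PSD in signal units squared per Hz)."""
    n = len(signal)
    avg = sum(signal) / n
    centered = [s - avg for s in signal]  # remove the DC offset
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2 / (fs_hz * n)
        if 0 < k < n / 2:  # fold negative frequencies into the one-sided estimate
            power *= 2.0
        freqs.append(k * fs_hz / n)
        psd.append(power)
    return freqs, psd


def band_average_psd(signal, fs_hz, f_low, f_high):
    """Integrate the PSD over [f_low, f_high] and divide by the bandwidth."""
    freqs, psd = periodogram(signal, fs_hz)
    df = fs_hz / len(signal)  # frequency resolution of the estimate
    band_power = sum(p * df for f, p in zip(freqs, psd) if f_low <= f <= f_high)
    return band_power / (f_high - f_low)
```

For a signal expressed in microstrain, the returned value is in (µε)²/Hz. For long records, an FFT-based estimator would be the practical choice.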

C. Feature Extraction: Wavelet Scattering Transform
Compared to the Fourier transform, the continuous wavelet transform (CWT) represents nonstationary signals in the time and frequency domains more adequately, since time-dispersed signals are well represented by low-frequency components and fast transients correlate better with high frequencies. Wavelet analysis modifies the scale factor, which permits compressing or dilating the wavelet in the time domain [23]. The wavelet scattering transform (WST) extracts features through the cascaded convolution of wavelets with the signal, as shown in Fig. 3. These wavelets can be represented as filter banks.
The WST is equivalent to a convolutional neural network (CNN).However, instead of training the filter banks to learn the features, this method directly calculates the value of the signal convolution with a set of basic orthogonal functions called mother wavelet (ψλi), where i is the order of the filter.
Fig. 3 shows that the nonlinear operation of the equivalent network is obtained by the modulus of the convolution, which yields a real-valued result and computes the signal's envelope. The last step requires averaging the result to achieve invariance to time-shifting and warping, which is done by convolving with a low-pass filter. The low-pass filter is called the father wavelet (ϕ) [24].
Fig. 3 shows a three-level network, with each black node representing a scattering path. It is crucial to emphasize that the quality factor, which is the number of wavelets per octave defined for each filter bank, and the invariance scale affect the number of scattering paths. The final operation in a path produces a scattering sequence vector, so the WST computes a feature matrix with rows representing the paths and columns representing the scattering sequences. In this work, an invariance scale of 0.03 seconds and quality factors of eight and one for the first and second filter banks, respectively, were defined, yielding 75 scattering paths and 32 scattering sequences for each signal.
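The cascade of convolution, modulus, and low-pass averaging is commonly written as below. This is a sketch in the standard scattering-transform notation, with x the signal and * denoting convolution; the symbols S0, S1, and S2 for the coefficients of each level are introduced here for illustration and do not appear in the text.

```latex
\begin{aligned}
S_0 x(t) &= (x * \phi)(t), \\
S_1 x(t, \lambda_1) &= \big( |x * \psi_{\lambda_1}| * \phi \big)(t), \\
S_2 x(t, \lambda_1, \lambda_2) &= \big( \big| |x * \psi_{\lambda_1}| * \psi_{\lambda_2} \big| * \phi \big)(t).
\end{aligned}
```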

D. Machine Learning Algorithm: Support Vector Machine
The features extracted by both approaches were used in the support vector machine (SVM) algorithm. The first method generates a two-dimensional feature space, which is represented by a matrix with 100 rows (the total number of measurements) and three columns. The third column is the label, which is related to the resultant force on the spring as modified by the electrical current [19]. Therefore, this representation with few features should perform well if suitable thresholds can be determined.
On the other hand, in the second method, the number of examples and labels was multiplied by thirty-two, which corresponds to the number of scattering sequences, each containing seventy-five features. Thus, the feature matrix has 3,200 rows and 76 columns, the last of which holds the predefined label. The measurements in this work were not perfectly synchronized in time, and there are regions where the signal contains only noise. In this situation, classification errors may occur often.
The methodology presented in [25] was used to reduce this effect: the predicted labels were counted and ranked over the scattering sequences belonging to the same signal. The final class was the predicted class with the highest incidence.
The SVM aims to find the best hyperplane that separates the classes. In most cases, it is unfeasible to perform classification tasks without making mistakes. Consequently, penalty parameters are defined that trade off the margin between the support vectors and the separating hyperplane against the number of admissible errors [5]. In some cases, nonlinear kernel functions are required, so the feature space is projected to a higher dimension, which in the algorithm implementation is equivalently given by the inner product of two vectors [22]. The kernel functions used in this work were linear, third-order polynomial, and Gaussian. Finally, the dataset was randomly partitioned to validate the SVM models using cross-validation [26]. Thus, 80 % of the data was used for training and selecting the best SVM parameters, and 20 % for measuring the final performance with new data.
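The voting step can be sketched as follows. The function name and the way ties are broken are assumptions; the text only states that the class with the highest incidence is chosen.

```python
# Sketch: assign one class per measurement by majority vote over the labels
# predicted for its scattering sequences.
from collections import Counter, defaultdict


def majority_vote(signal_ids, predicted_labels):
    """Map each signal id to the most frequent label among its sequences."""
    votes = defaultdict(Counter)
    for signal_id, label in zip(signal_ids, predicted_labels):
        votes[signal_id][label] += 1
    # Ties are resolved in favor of the label encountered first.
    return {sid: counts.most_common(1)[0][0] for sid, counts in votes.items()}


if __name__ == "__main__":
    ids = [0, 0, 0, 1, 1, 1]
    labels = [3, 3, 2, 4, 1, 4]
    print(majority_vote(ids, labels))
```

With thirty-two sequences per signal, a measurement is classified correctly as long as the correct class receives more votes than any other, even if many individual sequences are misclassified.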
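For reference, the three kernel families and the random 80/20 partition can be sketched as below. The hyperparameters `gamma` and `coef0`, the seed, and the function names are illustrative assumptions; the text does not report the values used, and the SVM training itself is left to a library.

```python
# Sketch: the three kernel families named in the text, plus a random 80/20 split.
# Assumptions: `gamma`, `coef0`, and `seed` defaults are illustrative only.
import math
import random


def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """Third-order polynomial kernel by default."""
    return (linear_kernel(x, y) + coef0) ** degree


def gaussian_kernel(x, y, gamma=1.0):
    """Gaussian (radial basis function) kernel."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of `rows` and split it into training and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

Each kernel returns the inner product of the two vectors in an implicit, possibly higher-dimensional feature space, which is why the projection itself never has to be computed.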

III. RESULTS
Fig. 4(a) shows the probability density function of the calculated switching time for all classes of measurements. The distributions of the subsets were modeled as Gaussian functions. Many thresholds were appraised, and the best separation was observed when the start of the switching event was defined as the point where the finite difference of the smoothed signal exceeds 0.035 nm/s in modulus. The end of the switching event is the point where the finite difference decays to values lower than 0.03 nm/s after the first inversion of the signal. In Fig. 4(a), it is possible to observe indecision regions where distinct distributions overlap. In these regions, the classifier commits more errors. Evaluating the best frequency range in the PSD analysis was not as effective as the previous feature.
An arbitrary band from 0 to 1500 Hz was selected, but it did not contribute significantly to the classification problem. Fig. 5 compares the PSD curves of measurements eight and twenty of the third operational condition. The variability in the frequency spectrum and in the area under the curves can be explained by the fact that the signals are not synchronized in time.
Therefore, any shifting or slight deformations cause a significant change in the frequency domain.
Even with only one informative feature, the three classifiers trained with SVM models present high performance in this case. Table I shows the mean accuracy obtained in the first part of the cross-validation algorithm with the training data. The linear model was selected as the best-performing model to classify operational conditions 1 and 2.
In addition, the linear model was selected to identify conditions 2 and 3 because it is the simplest model. On the other hand, the Gaussian model presented the best performance as a classifier to distinguish between classes 3 and 4. The second row of Table I shows the overall performance of the models with the best parameters selected in the cross-validation. The overall performance was 98 %, with one misclassification in the first classifier and another in the third.
Determining threshold values with a statistical approach requires processing time. In the tested operational conditions, the feature related to the switching time alone was enough to obtain an acceptable result. Nevertheless, many other failure modes may occur, and some do not affect the switching time.

IV. CONCLUSION
The FBG sensing method and the feature extraction algorithms presented in this work are promising for application in more complex switching systems. The FBG multiplexing property in a single transmission guide can be used in future works to increase the amount of information for the classification models. The proposed system can be used to increase the ability to detect and locate internal faults in electromechanical switching devices.
In the first method, it was possible to obtain high accuracy by extracting only a few features. However, determining thresholds with a statistical approach is recommended when it is possible to establish physical relationships between measurements and typical operating conditions. The second method is more robust and is recommended for unsynchronized signals. The WST algorithm automatically extracts coefficients that can feed different machine learning methods and provides invariance to small changes in the data set while delivering high performance.
Both approaches demonstrate that the sensitivity of the FBG sensor and the sampling rate used in the measurements with the optical interrogator were adequate to capture the essential characteristics needed to recognize different operating conditions. For future studies, the influence of different loads, the temperature rise caused by electrical arcs, and the performance gain obtained by increasing the number of optical sensors close to the internal critical points should be evaluated. In addition, composite materials should be studied to give the sensors more durability without reacting with the dielectric medium in different types of industrial applications and high-voltage installations.

Fig. 1. Experimental setup for measurements. (Source: [19])

However, this phenomenon has a low relevance in this work because the feature extraction algorithm highlights only the transient part of the signal. The measured signal was converted into relative strain in microstrain (µε). This conversion is necessary to center the measurements at the origin of the ordinate axis and to calculate the PSD in (µε)²/Hz. The considered sensitivity of the wavelength shift to the FBG deformation is presented in (3).

Fig. 4(b) shows the derivative of the smoothed measured signal and the approximate switching time window defined by the mentioned threshold values. The eighth measurement of operating condition three was used as an example.

Fig. 6(a) shows the CWT of one signal on a logarithmic frequency scale, and Fig. 6(b) presents the WST. Both scalograms are taken from measurement eight in the third operational condition. It is observed that the WST spreads the energy content of the transient region in time, increasing the invariance of the signal representation. Therefore, the coefficients used to describe the signals are less affected by slight modifications in the dataset. This is a powerful property that may reduce misclassifications due to the non-synchronization of the measurements.

Fig. 7(a) shows the confusion matrix for all scattering sequences. In contrast, Fig. 7(b) shows the result after ranking and grouping the corresponding sequences, using a third-order polynomial SVM model for classification. In the first case, the total accuracy is 84.34 %, while the second case reaches 100 % accuracy. This result is achieved by counting the classifications corresponding to the same measurement and choosing the class with the highest incidence.