## Journal of the Brazilian Society of Mechanical Sciences and Engineering

*Print version* ISSN 1678-5878; *On-line version* ISSN 1806-3691

### J. Braz. Soc. Mech. Sci. & Eng. vol.29 no.2 Rio de Janeiro Apr./June 2007

#### https://doi.org/10.1590/S1678-58782007000200007

**TECHNICAL PAPERS**

**Damage detection in a benchmark structure using AR-ARX models and statistical pattern recognition**

**Samuel da Silva ^{I}; Milton Dias Júnior^{II}; Vicente Lopes Junior^{III}**

^{I}Member, ABCM samsilva@fem.unicamp.br

^{II}milton@fem.unicamp.br Department of Mechanical Design Faculty of Mechanical Engineering State University of Campinas – UNICAMP 13083-970 Campinas, SP. Brazil

^{III}vicente@dem.feis.unesp.br Department of Mechanical Engineering Universidade Estadual Paulista – UNESP 15385-000 Ilha Solteira, SP. Brazil

**ABSTRACT**

Structural health monitoring (SHM) concerns the ability to monitor the state of aerospace, civil and mechanical systems and to decide the level of damage or deterioration within them. This paper deals with the application of a two-step auto-regressive and auto-regressive with exogenous inputs (AR-ARX) model for linear prediction in the damage diagnosis of structural systems. The damage detection algorithm monitors the residual error, obtained from vibration response measurements, as a damage-sensitive index. In complex structures there are many positions under observation and a large amount of data to be handled, which makes the signals difficult to visualize. This paper therefore also investigates data compression by principal component analysis. In order to establish a threshold value, fuzzy c-means clustering is used to quantify the damage-sensitive index in an unsupervised learning mode. Tests are made on a benchmark problem proposed by IASC-ASCE with different damage patterns. The diagnosis obtained showed high correlation with the actual integrity state of the structure.

**Keywords**: structural health monitoring, damage detection, principal component analysis, time series, fuzzy c-means clustering

**Introduction**

Accidents in structural systems caused by various sources of damage, such as extreme events (e.g. earthquakes), gradual wear (e.g. fatigue cracking, delamination in composite structures and corrosion) and predictable discrete events (e.g. aircraft takeoffs and landings), have drawn the attention of engineers and researchers to the need for structural health monitoring (SHM) strategies. The interest in SHM is motivated by its potential economic and life-safety benefits. For instance, Farrar et al. (2005), based on the work of Coburn and Spence (2002), stated that the annual costs associated with mechanical failure and earthquake damage are about $60 billion. Additionally, in-service failure corresponds to 20-40% of all losses in the engineering sector, mainly in the petrochemical industry.

Worden and Dulieu-Barton (2004) comment that there is increasing market pressure, for economic (cheaper constructions, lower fuel consumption) and performance (higher transportation speed) reasons, to introduce new lightweight structures. As a consequence, these structures are becoming inherently weaker; their resonances are moving down into the frequency range of the excitation forces, which in turn can cause failure of the system due to dynamic loads. In order to guarantee adequate performance throughout the life of these products, the *in situ* monitoring of engineering structures through periodic dynamic measurements has increased in recent years. SHM strategies use these data to estimate the current state of the system (in general, by using statistical modeling) based on the extraction of a damage-sensitive feature.

Doebling et al. (1998) separated formally the SHM process into four sublevels based on vibration measurements and the natural hierarchical structure: Level 1 – Detect damage; Level 2 – Detect and locate damage; Level 3 – Detect, locate and quantify; Level 4 – Detect, locate, quantify damage and obtain the remaining service life. Inman (2001) proposes that, when dealing with smart materials, 3 other sublevels should be added to the previous ones.

Worden et al. (2000) mention that detecting whether damage is present or not is the most fundamental issue. Unfortunately, Level 1 is still a daunting problem for practical applications, mainly in complex systems, due to the significant uncertainties caused by modeling errors, unknown load data, etc. (Chang, 2000). The challenge grows when it is not possible to excite the structure with active sources, because of weight or power constraints, and when the operational condition is not known. In these cases, SHM must be carried out using only the vibration responses. On the other hand, when some knowledge about the physics of the system is available, its behavior can be simulated theoretically or numerically. However, physics-based assessment approaches are usually computationally intensive. In order to overcome this difficulty, data-based techniques can be used. They rely only on previous measurements performed on the healthy system and should be able to indicate changes in the material and/or geometric properties, boundary conditions, and system connectivity. Some methodologies combine these two approaches (physics-based and data-based) to reach a better confidence level in SHM processes. Because of the drawbacks of the physics-based techniques, the present work deals with a data-based assessment procedure, since it can provide a potentially effective alternative for rapid monitoring. However, in order to reach the upper levels of the SHM process (quantifying the damage and obtaining the remaining service life), it is probably more effective to use physics-based assessment approaches.

Several data-based techniques have been investigated recently. Carden and Fanning (2004) describe different common methodologies for SHM. Among those, the authors suggest that one of the most promising is the construction of a model based on the time-series signature, called a "black box" model. In this approach, a large prediction error compared with the actual measurement will occur if the system presents accumulated damage.

Sohn and Farrar (2001) pose the SHM problem in a statistical pattern recognition and time series analysis paradigm. Thus, neither a sophisticated finite element model nor a modal analysis was needed to reach the first two levels of the SHM process. Another positive aspect of the proposal of Sohn and Farrar (2001) was that only signals from the healthy system were analyzed. The SHM was therefore conducted in an unsupervised learning mode, which is a very important feature since data from the damaged structure are usually not available for most real-world engineering systems (Fugate et al., 2000). For very expensive structures such as aircraft, for instance, obtaining such data is simply not possible. The use of a neural network, perceptron or any other supervised learning process is difficult in practical applications because damage pattern data are needed for training. In this case, training data could be obtained from accurate models or from previous knowledge of the history of damage signals.

The approach proposed by Sohn and Farrar (2001) is composed of a two-stage prediction model, combining an auto-regressive (AR) and an auto-regressive with exogenous inputs (ARX) model. The model was constructed with selected and normalized acceleration signals obtained from the undamaged structure. The proposed approach was applied to an eight degree-of-freedom (DOF) mass-spring system. The authors consider measurements from different environmental conditions taken from the structure in the undamaged state. The one-step-ahead error prediction was defined as a damage-sensitive index. If there was any damage in the structure, the previously obtained model using the reference signals would not be able to reproduce the new time series measured from the damaged structure. A Gaussian statistical analysis based on the standard deviation ratio was used to detect damage.

Lei et al. (2003) modified this approach by considering the influence of the excitation variability and of the order of the ARX prediction model on the damage-sensitive index. The results were investigated on the same benchmark structure used in the present paper. A drawback of this approach seems to be the visualization of the signal processing, due to the high number of measurement points.

It is also possible to use a similar frequency-domain ARX model, which was originally developed by Adams and Allemang (2000) for non-linear system identification. Park et al. (2005) combined this modified model with smart materials bonded to the structure, in order to quantify the difference between the electrical impedance measurement and the output of the frequency-domain ARX model. The authors obtained a robust active damage indicator, due to the use of smart materials. Furthermore, because of the highly non-Gaussian nature of the tails of the error distribution, extreme value statistics (EVS) was employed.

Lu and Gao (2005) proposed a different linear model written in ARX form without the excitation term. In this case, the acceleration response signal was used as "input" to the ARX model. This procedure differs from the one described by Sohn and Farrar (2001), which uses the AR error as "input" of the ARX model, because it allows the AR modeling step to be skipped. The paper presents a comparison between the performance of the modified ARX model and the AR-ARX model proposed by Sohn and Farrar (2001). The results for an eight DOF mass-spring system demonstrated that the model proposed by Lu and Gao (2005) performed better in the case of simultaneous degradation at different places. Lu and Gao concluded that this approach improved the sensitivity to structural stiffness changes when compared with the previous AR-ARX model. However, no simulations were performed with noisy measurements, so further research is required to extend the applicability of this model to practical use.

A different technique proposed by Bodeux and Golinval (2001) uses an autoregressive moving average vector (ARMAV) model. The difference between this approach and the ones previously described is that the former uses the natural frequencies as the damage-sensitive feature, whereas the latter use the prediction error, because modal parameters are extracted from the ARMAV model with uncertainties. The parameter estimation is a function of the filter order and involves a non-linear optimization procedure. Besides, the regression term includes the residual error, and it is very difficult to obtain unbiased estimators. Another practical disadvantage of a damage feature based on the natural frequencies is its low sensitivity to some parameter variations. In these cases the index is masked by the unavoidable experimental errors. In general, methods based on statistical pattern recognition seem to be suitable when a clear physical basis is not available.

The main goal of this paper is to present an SHM methodology that reaches the 1^{st} level described above (detect damage) based on the AR-ARX model described by Sohn and Farrar (2001). The primary focus of the work is the application of a fuzzy classifier. The paper is organized as follows. Initially, the basic procedure for damage detection is presented, considering data compression by principal component analysis (PCA) before the damage feature extraction. The damage-sensitive feature is partitioned into three clusters (healthy state, damage and severe damage) by using the fuzzy c-means algorithm (Bezdek and Pal, 1992). This approach is based on an iterative algorithm that minimizes the sum of point-to-centroid distances, summed over all clusters. The paper concludes with numerical tests on a benchmark structure proposed by the ASCE Task Group on Health Monitoring (Johnson et al., 2000). The results obtained are discussed and further directions are suggested.

**Nomenclature**

**A**_{xi}(q) = i^{th} Polynomial relative to output (roots are poles) in known structural condition.

**A**_{xR}(q) = Polynomial relative to output (roots are poles) in reference signal.

**A**_{y}(q) = Polynomial relative to output (roots are poles) in unknown structural condition.

a_{xil} = l^{th} coefficients of the i^{th} **A**_{xi}(q).

a_{yl} = l^{th} coefficients of the **A**_{y}(q).

**B**_{xR}(q) = Polynomial relative to input (roots are zeros) in the reference signal.

C_{i} = Centroid of the i^{th} cluster.

c = Number of clusters.

**e**_{xi}[k] = i^{th} residual error between the measured vibration and the prediction model output in known condition.

**e**_{y}[k] = Residual error between the measured vibration and the prediction model output in unknown condition.

f_{ij} = Membership function associated with the j^{th} object of the i^{th} cluster.

m = Number of measurement locations.

N = Number of environmental/operational conditions to be observed.

n = Number of discrete-time points.

na = Order of polynomial **A**(q).

nb = Order of polynomial **B**(q).

p = Order of AR model.

r = Lag at which the correlation function is evaluated.

R_{ixx}(r) = Correlation function of **x**_{i}[k] at the r^{th} lag.

**u**_{j}[k] = Acceleration signal at j^{th} location and k^{th} instant.

**z**_{j}[k] = Standardized acceleration signal at j^{th} location and k^{th} instant.

**z**[k] = Vector of the response component.

**x**_{i}[k] = Vectors of data in known structural condition (healthy) projected onto the 1^{st} principal component, i = 1, 2, …, N.

**x**_{R}[k] = Reference signal.

**y**[k] = Vector of data in unknown structural condition (undamaged or damaged) projected onto the 1^{st} principal component.

q^{-1} = Time-delay operator.

**v** = Eigenvector of the covariance matrix.

**w**_{i} = Natural excitation (one per floor).

**Greek Symbols**

**Ψ** = Covariance matrix, m × m.

λ = Eigenvalues of the covariance matrix.

**ε**_{xR}[k] = Residual error of the ARX(na, nb) model fitted to **x**_{R}[k].

**ε**_{y}[k] = Residual error of the ARX(na, nb) model applied to **y**[k].

γ = Standard deviation ratio (damage index).

σ^{2} = Model error power.

**Mathematical Operators**

**μ**( ) = Mean.

**σ**( ) = Standard deviation.

**Standardization Procedure**

An ensemble of acceleration responses **u**_{j}[k] (j = 1, 2, …, m and k = 1, 2, …, n) denotes the response time series corresponding to m measurement locations and n discrete-time instants. In the first stage, each time series **u**_{j}[k] is standardized in order to remove trends, as follows (Wirsching et al., 1995):
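Consistent with the definitions of the mean and standard deviation used in this section, the standardization of Eq. (1) can be written as follows (a reconstruction from the surrounding text, with μ and σ denoting the mean and standard deviation operators):

```latex
\[
  z_j[k] = \frac{u_j[k] - \mu(u_j)}{\sigma(u_j)},
  \qquad j = 1, 2, \dots, m, \quad k = 1, 2, \dots, n
\]
```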

where **z**_{j}[k] is the standardized signal at the k^{th} instant, and **μ**(**u**_{j}) and **σ**(**u**_{j}) are the mean and standard deviation of the sequence **u**_{j}[k], respectively.

**Data Compression using Principal Component Analysis**

Principal component analysis (PCA) is a technique devoted to the extraction of compact information from a matrix by investigating its dimensionality, which was introduced in multivariate statistics for data reduction (Cho and Kim, 2002).

In this paper, PCA is used to perform data compression when information from multiple measurement points is available. This process changes the normalized acceleration time series from multiple points into a single time series, maintaining the main information in the reduced data.

Initially, a vector **z**[k] of the response components corresponding to the m measurement locations is formed by using Eq. (1):

Then, the m × m covariance matrix **Ψ** among spatial measurement locations, summed over all discrete-time samples, is obtained by:

The eigenvalue problem of the covariance matrix satisfies

where λ_{i} and **v**_{i} are the eigenvalues and eigenvectors, respectively. The eigenvector **v**_{i} is called a principal component.

The goal is to reduce the m-dimensional vector **z**[k] into a d-dimensional vector **x**[k], where d<<m. Finally, **z**[k] is projected onto the eigenvectors corresponding to the first d largest eigenvalues:
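Collecting the steps of this section in one place, Eqs. (2) to (5) can be written as below. This is a reconstruction from the verbal definitions, assuming the covariance is the plain (unscaled) sum over the time samples, as the text states:

```latex
\[
  \mathbf{z}[k] = \begin{bmatrix} z_1[k] & z_2[k] & \cdots & z_m[k] \end{bmatrix}^{T}
\]
\[
  \boldsymbol{\Psi} = \sum_{k=1}^{n} \mathbf{z}[k]\, \mathbf{z}[k]^{T}
\]
\[
  \boldsymbol{\Psi}\, \mathbf{v}_i = \lambda_i\, \mathbf{v}_i
\]
\[
  \mathbf{x}[k] = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_d \end{bmatrix}^{T} \mathbf{z}[k]
\]
```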

In the present work, all time series are projected onto the 1^{st} principal component because, in the studied case, the contribution of this component is dominant compared with the others. Thus, **x**[k] is a single time series, called the *pattern vector*.
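The standardization and compression steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `compress_to_pattern_vector` and the synthetic four-channel signal are hypothetical, and the sign of the returned pattern vector is arbitrary because eigenvectors are defined only up to sign.

```python
import numpy as np


def compress_to_pattern_vector(u):
    """Standardize each channel and project onto the first principal component.

    u: array of shape (m, n), one row per measurement location.
    Returns the pattern vector x (length n) and the fraction of the total
    variance carried by the first principal component.
    """
    # Standardization: remove the mean and scale by the standard deviation.
    z = (u - u.mean(axis=1, keepdims=True)) / u.std(axis=1, keepdims=True)
    # m x m covariance matrix summed over all time samples.
    psi = z @ z.T
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(psi)
    v1 = eigvecs[:, -1]  # eigenvector of the largest eigenvalue
    x = v1 @ z           # projection of z[k] onto v1 for every k
    return x, eigvals[-1] / eigvals.sum()


# Hypothetical data: four channels sharing one common component plus noise.
rng = np.random.default_rng(0)
common = rng.standard_normal(1024)
u = np.vstack([a * common + 0.1 * rng.standard_normal(1024)
               for a in (1.0, 2.0, -1.5, 0.5)])
x, ratio = compress_to_pattern_vector(u)
```

The returned `ratio` plays the role of the share of information attributed to the first component; when it is low, a single pattern vector discards a large part of the measured response.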

**Damage-Sensitive Index Extraction**

It is very important to distinguish between undamaged and damaged condition from the measured vibration signals. The process of identifying damage-sensitive properties from data is a crucial point. In this paper, the residual error between an AR-ARX linear prediction discrete-time model and measured time series is used as damage index.

The first phase of the technique considers signals from the undamaged structure (healthy state) in N environmental/operational conditions. Each ensemble of data is standardized by Eq. (1) and compressed using Eq. (5). The final signal is the pattern vector **x**_{i}[k], where i = 1, 2, …, N.

The next phase is devoted to the construction of an AR model, with order p, for each **x**_{i}[k]. The AR(p) model is written as:

where **e**_{xi}[k] is the i^{th} error between the measured signal and the output from the prediction model. **A**_{xi}(q) is the i^{th }polynomial in the delay operator q^{-1} , written as:
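Eqs. (6) and (7) can be written as below, assuming the common convention in which the polynomial has a leading 1 and plus signs on the delayed terms. The sign convention is an assumption here; the opposite convention only flips the signs of the coefficients.

```latex
\[
  A_{x_i}(q)\, x_i[k] = e_{x_i}[k]
\]
\[
  A_{x_i}(q) = 1 + a_{x_i 1}\, q^{-1} + a_{x_i 2}\, q^{-2} + \cdots + a_{x_i p}\, q^{-p}
\]
```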

where a_{xi1}, a_{xi2}, …, a_{xip} are the coefficients of the i^{th} polynomial **A**_{xi}(q) (i = 1, 2, …, N). For example, 2q^{-3}**x**[k] means 2**x**[k-3].

The coefficients of the AR model can be found by several methods, such as the Burg algorithm, the least mean squares approach, etc. In this work, the set of coefficients in Eq. (7) is estimated by minimizing the power of each prediction error, |**e**_{xi}[k]|^{2}. This procedure leads to the Yule-Walker equations given by Wang (2003):

where R_{ixx}(r) is the correlation function of **x**_{i}[k] at the r^{th} lag and σ^{2} is the model error power estimated for the undamaged structure. The Levinson-Durbin recursion is used to solve these equations from the estimated autocorrelation function. Equations (8) can be expressed in matrix form as:
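Under the assumed convention A(q) = 1 + a_1 q^{-1} + … + a_p q^{-p}, and using the symmetry R(-l) = R(l), the Yule-Walker equations and their matrix form take the standard shape below. This is a textbook form offered as a reconstruction, not a copy of the original Eqs. (8) and (9).

```latex
\[
  R_{ixx}(r) + \sum_{l=1}^{p} a_{x_i l}\, R_{ixx}(r - l) =
  \begin{cases}
    \sigma^{2}, & r = 0 \\
    0, & r = 1, 2, \dots, p
  \end{cases}
\]
\[
  \begin{bmatrix}
    R_{ixx}(0)   & R_{ixx}(1)   & \cdots & R_{ixx}(p-1) \\
    R_{ixx}(1)   & R_{ixx}(0)   & \cdots & R_{ixx}(p-2) \\
    \vdots       & \vdots       & \ddots & \vdots       \\
    R_{ixx}(p-1) & R_{ixx}(p-2) & \cdots & R_{ixx}(0)
  \end{bmatrix}
  \begin{bmatrix} a_{x_i 1} \\ a_{x_i 2} \\ \vdots \\ a_{x_i p} \end{bmatrix}
  = -
  \begin{bmatrix} R_{ixx}(1) \\ R_{ixx}(2) \\ \vdots \\ R_{ixx}(p) \end{bmatrix}
\]
```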

It has been shown by Wang (2003) that the solution of Yule-Walker equations yields the optimal AR model for linear prediction.

The order *p* of the model is, in general, not known *a priori*, and there are several criteria to determine it. For example, Shin et al. (2003) proposed a criterion based on the singular-value decomposition. The two most widely used methods are Akaike's information theoretic criterion (AIC) and Akaike's final prediction error (FPE) (Aguirre, 2004). In this paper the first one (AIC) is used.
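The AR estimation and order selection described above can be sketched in plain Python. The sketch assumes the convention A(q) = 1 + a_1 q^{-1} + … + a_p q^{-p}, a zero-mean signal, the biased autocorrelation estimate, and the common AIC form n·ln(σ²) + 2p; the exact AIC variant used in the paper is not stated, so that form is an assumption. The simulated AR(2) signal and all function names are hypothetical.

```python
import math
import random


def autocorrelation(x, max_lag):
    """Biased autocorrelation estimate R(0..max_lag) of a zero-mean sequence."""
    n = len(x)
    return [sum(x[k] * x[k - r] for k in range(r, n)) / n
            for r in range(max_lag + 1)]


def levinson_durbin(R, p):
    """Solve the Yule-Walker equations up to order p.

    Returns the coefficients a_1..a_p of A(q) = 1 + a_1 q^-1 + ... and the
    list of model error powers for orders 0..p.
    """
    a = [1.0]
    err = R[0]
    errors = [err]
    for i in range(1, p + 1):
        acc = R[i] + sum(a[j] * R[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient of order i
        a = [1.0] + [a[j] + k * a[i - j] for j in range(1, i)] + [k]
        err *= 1.0 - k * k
        errors.append(err)
    return a[1:], errors


def aic(n, sigma2, p):
    """Assumed AIC form: n * ln(model error power) + 2 * order."""
    return n * math.log(sigma2) + 2 * p


# Hypothetical test signal: x[k] = 1.5 x[k-1] - 0.7 x[k-2] + e[k],
# i.e. a_1 = -1.5 and a_2 = 0.7 in the convention above.
rng = random.Random(0)
x = [0.0, 0.0]
for _ in range(5000):
    x.append(1.5 * x[-1] - 0.7 * x[-2] + rng.gauss(0.0, 1.0))
x = x[2:]

R = autocorrelation(x, 8)
coeffs, _ = levinson_durbin(R, 2)
_, errors = levinson_durbin(R, 8)
best_p = min(range(1, 9), key=lambda p: aic(len(x), errors[p], p))
```

Because the recursion yields the error power of every intermediate order, a single pass up to the largest candidate order is enough to evaluate the criterion for all smaller orders.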

The structure is monitored by obtaining new vectors of data for the unknown condition (undamaged or damaged) using the PCA technique. This new sequence **y**[k] has the same length as the signal **x**_{i}[k]; in other words, n discrete-time points are considered. In this phase, the previous step given by Eq. (6) is repeated with the same order p:

where **A**_{y}(q) is the polynomial

and a_{y1}, a_{y2}, …, a_{yp} are the coefficients of the polynomial **A**_{y}(q), obtained by using the Yule-Walker method previously described.

The new AR model is compared with each model of the signal **x**_{i}[k] in the reference database to select a signal **x**_{R}[k] "closest" to the unknown condition block **y**[k]. This is obtained through the minimization of the following Euclidean norm:
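For reference, the AR model of the unknown signal and the distance used for the selection can be written as below. This is a reconstruction of Eqs. (10) to (12) from the nomenclature, with the same assumed sign convention as for the healthy-state AR models; the index i* is notation introduced here for illustration.

```latex
\[
  A_y(q)\, y[k] = e_y[k], \qquad
  A_y(q) = 1 + a_{y1}\, q^{-1} + a_{y2}\, q^{-2} + \cdots + a_{yp}\, q^{-p}
\]
\[
  x_R[k] = x_{i^{*}}[k], \qquad
  i^{*} = \arg\min_{i \in \{1, \dots, N\}}
  \sqrt{\sum_{l=1}^{p} \left( a_{x_i l} - a_{y l} \right)^{2}}
\]
```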

The signal **x**_{R}[k] whose coefficients give the minimum distance is called the reference signal. This procedure, which selects a pattern vector from the reference database, is referred to as data normalization. If the **y**[k] vector is obtained from the same operational condition and there has been no structural change in the system, the dynamic behavior of the system, which is described by the AR coefficients, should be similar or close to that of the reference signal (Sohn and Farrar, 2001).

The following stage is to obtain an ARX model from the reference signal **x**_{R}[k] as:

where **ε**_{xR}[k] is the residual error of the ARX(na, nb) model, **e**_{xR}[k] is the residual error of the AR(p) model given by Eq. (6), and:

where na and nb are the orders of the polynomials **A**_{xR}(q) and **B**_{xR}(q) of the reference signal, respectively.
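Written out, Eqs. (13) to (15) plausibly take the form below. The coefficient symbols α and β are introduced here only for illustration, and starting **B**_{xR}(q) at a one-sample delay is an assumption based on the time delay of 1 reported in the Results section.

```latex
\[
  A_{xR}(q)\, x_R[k] = B_{xR}(q)\, e_{xR}[k] + \varepsilon_{xR}[k]
\]
\[
  A_{xR}(q) = 1 + \alpha_1 q^{-1} + \alpha_2 q^{-2} + \cdots + \alpha_{na}\, q^{-na}
\]
\[
  B_{xR}(q) = \beta_1 q^{-1} + \beta_2 q^{-2} + \cdots + \beta_{nb}\, q^{-nb}
\]
```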

Equation (13) represents the relationship between the output **x**_{R}[k] and the "input" **e**_{xR}[k]. The AR residual error, **e**_{xR}[k], is a function of all unknown external inputs and is considered to be an approximation of the estimated system input. The order na and nb of the polynomials given by Eqs. (14) and (15) can be set arbitrarily.

The model associated with Eq. (13) is now applied to investigate the vector of data for the unknown conditions:
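Keeping the polynomials estimated for the reference signal and replacing the signals by those of the unknown condition, Eq. (16) can be reconstructed as:

```latex
\[
  A_{xR}(q)\, y[k] = B_{xR}(q)\, e_y[k] + \varepsilon_y[k]
\]
```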

If the ARX model obtained from Eq. (13) is not a good predictor for the new signals **y**[k] and **e**_{y}[k], then the residual error **ε**_{y}[k] from Eq. (16) and its probability distribution will change. A procedure to cluster the residual error and to help decide whether its variation corresponds to damage is developed in the following section.

**Unsupervised Statistical Pattern Recognition**

The process of choosing a threshold value to identify whether the structure has moved from the undamaged to a damaged state is known as *statistical modeling*. In the previous sections an index was obtained for which the effects of disturbances, such as inputs and environmental variation, were normalized. This data normalization procedure is necessary when measurements of the different environmental and operational conditions are not available to construct the undamaged AR model.

It is assumed here that **ε**_{xR}[k] and **ε**_{y}[k] are asymptotically normally distributed. A common approach is to monitor the standard deviation of **ε**_{y}[k] and compare its value with the standard deviation of the healthy state, **ε**_{xR}[k]. Lu and Gao (2005) investigated this feature for diagnosis and the results showed that it is a suitable index. They employed the following ratio of the standard deviations of the residual errors from **y**[k] and **x**_{R}[k]:
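From the nomenclature (γ as the standard deviation ratio and σ( ) as the standard deviation operator), Eq. (17) can be reconstructed as:

```latex
\[
  \gamma = \frac{\sigma\!\left(\varepsilon_y\right)}{\sigma\!\left(\varepsilon_{xR}\right)}
\]
```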

If the index presents a non-Gaussian distribution, this approach must be modified, which can be done by using extreme value statistics (EVS). EVS fits only the tails of the data distribution. However, if it is reasonable to assume that the data fall close to the normal distribution curve, this procedure is not needed.

An increase in this index value would indicate that the measurement location is close to the damage. However, in order to obtain a rapid SHM process, the present paper focuses only on the diagnosis based on compressed measurements. It is not concerned with the location of the damage (2^{nd} level).
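The two-stage AR-ARX residual and the standard deviation ratio can be sketched as follows. Several simplifications are assumptions of this sketch rather than the paper's procedure: both models are fitted by ordinary least squares instead of the Yule-Walker equations, the same samples are used for the AR and ARX stages, the reference signal is given rather than selected from a database, and the signals are simulated AR(2) processes with hypothetical coefficients. The regression is written as x[k] = Σ θ_i x[k-i] + Σ θ_j e[k-j] + ε[k], so the first `na` entries of `theta` correspond to the negated coefficients of the polynomial form.

```python
import numpy as np


def simulate_ar2(phi1, phi2, n, rng, burn_in=200):
    """Hypothetical test signal: x[k] = phi1 x[k-1] + phi2 x[k-2] + w[k]."""
    w = rng.standard_normal(n + burn_in)
    x = np.zeros(n + burn_in)
    for k in range(2, n + burn_in):
        x[k] = phi1 * x[k - 1] + phi2 * x[k - 2] + w[k]
    return x[burn_in:]


def lagged(s, k, order):
    """Return [s[k-1], s[k-2], ..., s[k-order]]."""
    return s[k - order:k][::-1]


def ar_residual(x, p):
    """Fit AR(p) by least squares; return the aligned signal and its residual."""
    Phi = np.array([lagged(x, k, p) for k in range(p, len(x))])
    coeffs, *_ = np.linalg.lstsq(Phi, x[p:], rcond=None)
    return x[p:], x[p:] - Phi @ coeffs


def arx_design(x, e, na, nb):
    """Regressors: na past outputs and nb past AR residuals (one-sample delay)."""
    start = max(na, nb)
    Phi = np.array([np.concatenate((lagged(x, k, na), lagged(e, k, nb)))
                    for k in range(start, len(x))])
    return Phi, x[start:]


def damage_index(x_ref, y, p=6, na=2, nb=2):
    """Ratio of ARX residual standard deviations, unknown over reference."""
    x_ref, e_ref = ar_residual(x_ref, p)
    y, e_y = ar_residual(y, p)
    Phi_ref, target_ref = arx_design(x_ref, e_ref, na, nb)
    theta, *_ = np.linalg.lstsq(Phi_ref, target_ref, rcond=None)
    eps_ref = target_ref - Phi_ref @ theta
    # The ARX model of the reference is applied unchanged to the new signal.
    Phi_y, target_y = arx_design(y, e_y, na, nb)
    eps_y = target_y - Phi_y @ theta
    return eps_y.std() / eps_ref.std()


rng = np.random.default_rng(1)
x_ref = simulate_ar2(1.5, -0.7, 4000, rng)
y_healthy = simulate_ar2(1.5, -0.7, 4000, rng)   # same dynamics, new noise
y_damaged = simulate_ar2(0.6, -0.4, 4000, rng)   # changed dynamics
gamma_healthy = damage_index(x_ref, y_healthy)
gamma_damaged = damage_index(x_ref, y_damaged)
```

In this toy setting the ratio stays near one when the dynamics are unchanged and rises clearly above one when they change, which is the behavior the index relies on.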

Another approach used to detect damage is Statistical Process Control (SPC), which is based on a control chart for automated continuous monitoring. This technique has been applied to structural monitoring using different damage-sensitive features, as can be seen in Fugate et al. (2000), Sohn et al. (2000) and Silva et al. (2005).

The present work proposes an approach that is not frequently applied for SHM purposes, namely the fuzzy c-means (FCM) algorithm, first presented by Bezdek (1981). The goal is to identify a finite number of clusters that describe a data set. In fuzzy clustering, the membership of a data point in a cluster is a fuzzy decision: a data point is considered to be a member of every cluster with a membership value that ranges from 0 to 1. The objective function of the fuzzy c-means algorithm is based on selecting representative objects from the data set in such a way that the total fuzzy dissimilarity within each cluster is minimized in an unsupervised manner.

The basic fuzzy c-means clustering algorithm is given by:

where f_{ij} is the membership function associated with the j^{th} object of the i^{th} cluster, C_{i} is the centroid of the i^{th} cluster, c is the number of clusters and m > 1 is a weighting exponent that is, in general, unknown; m = 2 is usually used. x_{j} is the representative feature; in this paper, it is based on the standard deviation of the ARX model residual errors **ε**_{xR}[k] and **ε**_{y}[k].

The optimum cost function, J, can be obtained by following the steps:

1^{st} – Choose the initial centers C_{1}, C_{2}, …, C_{c}.

2^{nd} – Compute f_{ij} for each j ∈ {1, 2, …, L}, where L is the number of features of each cluster (in the present paper three features were used). If || x_{j} - C_{i} ||^{2} > 0 for i = 1, …, c, then:

If || x_{j} - C_{i} ||^{2} = 0 for i ∈ I ⊆ {1, 2, …, c}, then define f_{ij}, i ∈ I, as any nonnegative real numbers that satisfy Σ_{i ∈ I} f_{ij} = 1, and define f_{ij} = 0 for i ∈ {1, 2, …, c} − I.

3^{rd} – Update the centers:

4^{th} – If convergence is achieved, stop the process; otherwise, return to the second step.
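For reference, the cost function and the two update rules used in the steps above have the following standard fuzzy c-means form (see Bezdek and Pal, 1992). They are given here as a reconstruction of Eqs. (18) to (20) in the notation of this section, not as a copy of the original equations.

```latex
\[
  J = \sum_{i=1}^{c} \sum_{j=1}^{L} f_{ij}^{\,m}\, \lVert x_j - C_i \rVert^{2}
\]
\[
  f_{ij} = \left[ \sum_{k=1}^{c}
    \left( \frac{\lVert x_j - C_i \rVert^{2}}{\lVert x_j - C_k \rVert^{2}} \right)^{\frac{1}{m-1}}
  \right]^{-1}
\]
\[
  C_i = \frac{\sum_{j=1}^{L} f_{ij}^{\,m}\, x_j}{\sum_{j=1}^{L} f_{ij}^{\,m}}
\]
```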

The solution comes from the optimality equations via Lagrange multipliers. Details about this procedure can be found in Bezdek and Pal (1992).
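The four steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: the initial centers are supplied by the caller, the zero-distance case is handled by a small numerical floor rather than by the special rule of the second step, and the feature values are made-up numbers, not results from the paper.

```python
import numpy as np


def fuzzy_c_means(features, init_centers, m=2.0, tol=1e-8, max_iter=200):
    """Basic fuzzy c-means; returns centers (c, d), memberships (c, L), cost."""
    X = np.asarray(features, dtype=float).reshape(len(features), -1)
    centers = np.asarray(init_centers, dtype=float).reshape(len(init_centers), -1)
    previous_cost = np.inf
    for _ in range(max_iter):
        # Squared distance of every object j to every center i, shape (c, L).
        d2 = ((X[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # floor instead of the zero-distance rule
        # Membership update: columns sum to one over the clusters.
        inv = d2 ** (-1.0 / (m - 1.0))
        f = inv / inv.sum(axis=0, keepdims=True)
        cost = float(((f ** m) * d2).sum())
        if abs(previous_cost - cost) < tol:
            break
        previous_cost = cost
        # Center update: membership-weighted mean of the objects.
        weights = f ** m
        centers = (weights @ X) / weights.sum(axis=1, keepdims=True)
    return centers, f, cost


# Hypothetical one-dimensional features (e.g. standard deviation ratios).
features = [1.00, 1.03, 0.97, 1.05, 1.48, 1.55, 1.52, 2.45, 2.60, 2.50]
centers, f, cost = fuzzy_c_means(features, init_centers=[1.0, 1.6, 2.4])
labels = f.argmax(axis=0)  # cluster with the highest membership per object
```

The membership matrix `f` is what allows a confidence to be attached to each decision: an object whose largest membership is close to those of the other clusters sits near the boundary between them.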

**Benchmark Test Structure**

Dyke et al. (2001) comment that an important part of the current effort to advance SHM technology is the development of well-defined benchmark structures that allow performance comparison among various approaches under realistic conditions. This effort led to a benchmark structure proposed by the ASCE Task Group on Health Monitoring. A schematic drawing of the structure is shown in Fig. 1.

This frame is a 2 × 2 bay, four-story rectangular steel structure built at approximately one-third scale. The model is 3.6 m tall and 2.5 m wide (Johnson et al., 2000). The geometrical and physical properties are shown in Table 1.

Two different finite element models were developed in order to generate data for simulation purposes. The first one has 12 DOF – two horizontal translations and one rotation around the vertical axis per floor, except the ground level, which is completely constrained – and the second is a 120 DOF model that requires only the floor nodes to have the same horizontal translation and in-plane rotation. The columns and floor beams are modeled as Euler-Bernoulli beams in both finite element models. The braces are bars with no bending stiffness. A data generation program (free code) allows the user to consider any case of damage pattern for testing purposes. In the present paper, the 120 DOF model was used. Five damage patterns are defined for this structure. These patterns are given in Table 2.

The complete program and more details about the benchmark structure are available at http://wusceel.cive.wustl.edu/asce.shm/default.htm.

**Results**

To illustrate the SHM process, tests on the benchmark structure (120 DOF model) were performed. Five different scenarios for the undamaged state were considered, varying the operational condition (percentage of noise added in the input), as shown in Table 3. Table 4 describes the 15 cases considered as unknown conditions (undamaged or damaged situations). It is worth noting that data from cases 6 to 8 (see Table 4), although the structure presents no damage, were not used to construct the AR-ARX model. They were treated as unknown conditions and used to test for false positives.

In each scenario sixteen acceleration "measurement" directions were considered: two in the x-direction and two in the y-direction per floor (in the middle of Fig. 1, the accelerations in the x-direction are omitted for the sake of clarity). Gaussian pulse processes, with various RMS percentages, are added to simulate the sensor noise vector.

One excitation per floor was applied. The excitations were modeled as a filtered Gaussian white noise – white noise stochastic processes with Gaussian distribution, filtered with a 6^{th} order low-pass Butterworth filter and 100 Hz cutoff frequency. In Fig. 1 the excitations are designated by the letter **w**.

The data were generated with a sampling rate of 512 Hz over a time period of 2 s, resulting in 1024 data points.

A typical acceleration signal obtained from case 1 is shown in Fig. 2 (one of the 16 measurement points). A signal from case 12 (damaged) is shown in Fig. 3. All signals were standardized by using Eq. (1).

The set of all data was compressed by using PCA. The PCA of the covariance matrix of the 16 measurement points for case 1 is shown in Fig. 4. This figure shows that the first principal component alone holds about 29% of the total information. Thus, the raw time series from all points are first projected onto the 1^{st} principal component. The other cases are very similar. For illustration, Fig. 5 and Fig. 6 show the compressed signals for cases 1 and 12, respectively.

The next phase comprises the extraction of the feature from the data for classification purposes. The order selection procedure based on the AIC indicates 13 as a candidate for the order of the AR model (the order of the polynomial described by Eq. (6)). Figure 7 shows the criterion plot. Only the healthy data (cases 1 to 5) were used to estimate the order, and only the first half of the data was used for this purpose.

The coefficients of the 13^{th}-order polynomial **A**_{xi}(q) were obtained for the cases of Table 3 (healthy state) by solving the Yule-Walker equations. Only the first half of the data (512 points) was used to obtain the AR(13) model; the second half was used to validate it. For illustration purposes, the polynomial obtained for case 5 is presented below:

For each model, the one-step-ahead predicted output was compared with the measured output. The results for case 5 are presented in Fig. 8a, while Fig. 8b shows a zoomed detail. Fig. 9 presents the tests of the residuals associated with this model. The residual analysis shows that the correlation between the **x**_{5}[k] model output and the residual error **e**_{x5}[k] remains within the confidence interval (99%), except at zero lag. Therefore, the prediction error is close to a white noise process. The other cases are very similar, with a fit of about 75%. Thus, this set of models under healthy condition may be considered validated.

The residual error **e**_{xi}[k], i = 1, 2, …, 5, was obtained from Eq. (6) and is shown in Fig. 10. These signals are used as "input" of an ARX model whose orders were set arbitrarily to na = 5 and nb = 5, with a time delay of 1. The length of the residual errors of the AR models corresponds to 512 points.

Each signal in Table 4 (unknown condition) was fitted to an AR model of order 13 and with the same number of points. The reference signal was obtained by using Eq. (12). The ARX model for the reference signal was constructed by using the second half of the data, because the "input" (residual error) was also obtained by using the set of points from 513 to 1024 (512 samples).

After constructing the ARX model for each reference signal for the 15 unknown cases studied (Table 4), the respective model was used to predict these signals. If there was damage in the structure, the ARX model previously obtained using the reference signal would not be able to reproduce the new time series measured from the damage condition. A typical response can be seen in Fig. 11 for case 12 (damage pattern 2).

As discussed before, if the data points fall near the line in the normal probability plot, it is reasonable to assume that the ARX residual error is asymptotically normally distributed in the healthy state. Besides, if the kurtosis value is close to 3.0 and the skewness is near zero, the distribution is close to Gaussian. Figure 12 illustrates this.

The first four statistical moments of the raw time series **e**_{y}[k] are summarized in Table 5.
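The moment-based normality check can be illustrated with a short sketch. It assumes population (biased) estimators and the non-excess definition of kurtosis, for which a Gaussian signal gives 3.0; the function name and the synthetic residual are assumptions, not the values of Table 5.

```python
import math
import random


def first_four_moments(x):
    """Mean, standard deviation, skewness and (non-excess) kurtosis of x."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    m2 = sum(v ** 2 for v in centered) / n
    m3 = sum(v ** 3 for v in centered) / n
    m4 = sum(v ** 4 for v in centered) / n
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # equals 3.0 for a Gaussian distribution
    }


# Synthetic Gaussian residual standing in for e_y[k] (not paper data).
rng = random.Random(1)
gaussian_residual = [rng.gauss(0.0, 1.0) for _ in range(5000)]
moments = first_four_moments(gaussian_residual)
print(moments)
```

How close to 0 and 3.0 the skewness and kurtosis must be before the residual is accepted as Gaussian is a judgment call that depends on the record length.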

If there is damage in the structure, the probability distribution should change. Figure 13 illustrates this by comparing the probability density function of the residual error from damaged case 12 with the reference signal.

Table 6 presents the standard deviation of residual errors for various damage sources and the ratio given by Eq. (17).
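As a rough illustration of this damage-sensitive index, the sketch below assumes that the ratio of Eq. (17) is the standard deviation of the residual error of the unknown condition divided by that of the reference signal, in the spirit of Sohn and Farrar (2001). The function name and the numerical values are hypothetical and are not taken from Table 6.

```python
import statistics


def std_ratio(residual_unknown, residual_reference):
    """Ratio of residual standard deviations (assumed form of the index)."""
    return (statistics.pstdev(residual_unknown)
            / statistics.pstdev(residual_reference))


# Hypothetical residuals: the "unknown" one has twice the spread of the reference.
reference_residual = [0.0, 1.0, 0.0, 1.0]
unknown_residual = [0.0, 2.0, 0.0, 2.0]
ratio = std_ratio(unknown_residual, reference_residual)
```

Under this assumption, a ratio near one indicates that the reference ARX model still predicts the new signal well, while larger values point to a change in the structure or in the operational conditions.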

The increase in the bold values was probably caused by damage or operational variability. It is very difficult to classify these data into healthy or damaged states only by observing the ratio values, as was done in Sohn and Farrar (2001). To classify them according to a more rigorous statistical criterion, this paper proposes the use of the classical fuzzy c-means algorithm. The basic clustering procedure was described earlier in the present work.

Figure 14 shows the data set used to classify the residual error, with m = 2 and c = 3 in Eq. (18). Only the data in the healthy state are known. The number of clusters is c = 3, associated with the undamaged, damaged and severely damaged conditions. The result of the fuzzy clustering is presented in Fig. 15, and the evolution of the cost function is shown in Fig. 16. The cost function reached its minimum value after 6 iterations. Figure 17 shows the percentage of membership of each data point in each cluster.
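The clustering step can be sketched with a textbook fuzzy c-means iteration for scalar features, using m = 2 and c = 3 as in the text. The deterministic initialization (centers spread evenly over the data range), the stopping tolerance and the feature values below are assumptions made for illustration only; they are not the data of Fig. 14.

```python
def update_memberships(data, centers, m):
    """Membership of every point in every cluster; each row sums to one."""
    exponent = -1.0 / (m - 1.0)
    memberships = []
    for x in data:
        # Squared distances are floored so a point lying on a center is safe.
        weights = [max((x - v) ** 2, 1e-12) ** exponent for v in centers]
        total = sum(weights)
        memberships.append([w / total for w in weights])
    return memberships


def update_centers(data, memberships, c, m):
    """Cluster centers as membership-weighted means of the data."""
    centers = []
    for j in range(c):
        weights = [row[j] ** m for row in memberships]
        centers.append(sum(w * x for w, x in zip(weights, data)) / sum(weights))
    return centers


def cost_function(data, centers, memberships, m):
    """Fuzzy c-means objective: sum of u^m times squared distance."""
    return sum(u ** m * (x - v) ** 2
               for x, row in zip(data, memberships)
               for u, v in zip(row, centers))


def fuzzy_c_means(data, c=3, m=2.0, tol=1e-8, max_iter=100):
    """Alternate center and membership updates until the cost stops changing."""
    lo, hi = min(data), max(data)
    centers = [lo + (hi - lo) * j / (c - 1) for j in range(c)]  # assumed init
    memberships = update_memberships(data, centers, m)
    history = [cost_function(data, centers, memberships, m)]
    for _ in range(max_iter):
        centers = update_centers(data, memberships, c, m)
        memberships = update_memberships(data, centers, m)
        history.append(cost_function(data, centers, memberships, m))
        if abs(history[-2] - history[-1]) < tol:
            break
    return centers, memberships, history


# Hypothetical damage indexes: three groups of increasing severity.
indexes = [1.0, 1.05, 0.95, 1.5, 1.55, 1.45, 3.0, 3.1, 2.9]
centers, memberships, history = fuzzy_c_means(indexes)
labels = [row.index(max(row)) for row in memberships]
```

The largest membership of each row can be read as the confidence of the decision, and `history` corresponds to the evolution of the cost function over the iterations.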

By analyzing Fig. 15 and Fig. 17, one can observe that cases 12, 13 and 14 are classified in the severe damage cluster. This is correct, because these data correspond to damage pattern 2, in which all braces in the first and third floors are removed (see Table 2). Figure 17 shows that, in these cases, the confidence is approximately 80% for cases 12 and 14 and almost 60% for case 13.

All cases in healthy condition were correctly recognized, and no false positive (false alarm) was observed. Cases 9, 10 and 11, which correspond to damage pattern 1, were not well classified. The monitoring system assigns case 9 to cluster 2 (damage), which is correct, but the confidence of this decision is only about 40% (Fig. 17). The analysis of case 11 is similar. The biggest problem occurs with case 10, which the fuzzy clustering classifies as healthy: clearly a false negative. To explain this incorrect decision, recall that data compression was performed using 16 measurement points. Damage pattern 1, associated with case 10, corresponds to the removal of all braces from the first floor. This change mainly affects the four measurements taken on the 1^{st} floor. Because of the way the PCA was conducted, this information is hidden (or diluted) in the pattern vector, which leads to the misclassification of case 10 and to the low confidence observed for cases 9 and 11. PCA must therefore be used carefully when the membership of a case lies close to the threshold. In such cases, it is recommended to examine some individual measurements before applying PCA to confirm the classification, or else to perform the procedure using the complete data. Alternatively, the second principal component could also be considered, which would make it possible to extract the correct classification. However, the goal of the present paper is to detect damage using few features.
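The compression of several measurement channels into one or two principal components can be sketched as follows. The sketch assumes a covariance-based PCA computed by power iteration with deflation; this numerical route, the function names and the two-channel toy data are assumptions, and the PCA formulation actually used in the paper may differ (the benchmark uses 16 channels).

```python
import math


def covariance_matrix(samples):
    """Covariance matrix and mean-centered copy of samples (rows = time steps)."""
    n = len(samples)
    p = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    return cov, centered


def leading_eigenpair(matrix, iterations=500):
    """Dominant eigenvalue and unit eigenvector by power iteration."""
    p = len(matrix)
    v = [1.0 + 0.1 * j for j in range(p)]  # arbitrary non-symmetric start
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    norm_v = math.sqrt(sum(x * x for x in v))
    v = [x / norm_v for x in v]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(a * b for a, b in zip(v, mv)), v


def principal_components(samples, count=2):
    """List of (variance, direction, scores) for the first `count` components."""
    cov, centered = covariance_matrix(samples)
    p = len(cov)
    components = []
    for _ in range(count):
        lam, v = leading_eigenpair(cov)
        scores = [sum(a * b for a, b in zip(row, v)) for row in centered]
        components.append((lam, v, scores))
        # Deflation: remove the component just found before the next search.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(p)]
               for i in range(p)]
    return components


# Two-channel toy data: strong common motion plus a weak differential part.
common = [-2.0, -1.0, 0.0, 1.0, 2.0]
differential = [0.1, -0.1, 0.0, -0.1, 0.1]
toy_samples = [[a + b, a - b] for a, b in zip(common, differential)]
(lam1, dir1, scores1), (lam2, dir2, scores2) = principal_components(toy_samples)
```

In this toy case the first component captures the motion shared by both channels, while the weak differential part appears only in the second component. This mirrors the remark above that localized information can be diluted when only the first component is retained.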

All other cases are well classified. Cases 15 to 20 were correctly classified, and they lie closer to the undamaged cluster than to the severe damage cluster.

**Final Remarks**

The structural health monitoring method proposed and exemplified in this paper proved able to determine the damage state in an unsupervised learning mode. The method has many desirable features for use in real-world structures: it is based only on output measurements; it uses a single pattern vector obtained by applying PCA; its procedure is conducted in an unsupervised learning mode; and the level of confidence associated with the threshold value is based on a fuzzy decision. Hence, the method is very attractive for implementing a real monitoring system, mainly in large and complex structures where knowledge of the physics is limited. Combining this approach with a wireless sensing system is also very attractive, because it allows automatic monitoring without human supervision by using digital filters implemented on a DSP board. However, to improve the prediction step, it is important to extend the procedure to handle non-stationary signals and non-linear systems. It is also important to investigate what would happen if the environmental excitation were colored noise. Further research is being conducted on these issues, including the study of non-linear systems and the use of excitation signals other than white noise. The goal is to extend the capabilities and to evaluate the effectiveness of the present approach when dealing with experimental data from a real structure.

**Acknowledgements**

The first author is thankful to UNICAMP for his scholarship from the BIG program. The authors would also like to thank the Associate Editor and reviewers for their valuable comments.

**References**

Adams, D. E. and Allemang, R. J., 2000, Discrete frequency models: a new approach to temporal analysis. *Journal of Vibration and Acoustics*, v. 123, p. 98-103.

Aguirre, L. A., 2004, *Introdução à identificação de sistemas – Técnicas lineares e não-lineares aplicadas a sistemas reais*. 2^{nd} edition, Editora UFMG.

Bezdek, J. C., 1981, *Pattern recognition with fuzzy objective function algorithms*. Plenum Press.

Bezdek, J. and Pal, S., 1992, *Fuzzy models for pattern recognition*, IEEE Press.

Bodeux, J. B. and Golinval, J. C., 2001, Application of ARMAV models to the identification and damage detection of mechanical and civil engineering structures. *Smart Materials and Structures*, v. 10, n. 3, p. 479-489.

Carden, P. E. and Fanning, P., 2004, A vibration based condition monitoring: a review. *Structural Health Monitoring – SHM.* v. 3, n. 4, p. 355-377.

Chang, F. K., 2000, Structural health monitoring. *Proceedings of the 2^{nd} International Workshop on Structural Health Monitoring*, Stanford, CA, USA.

Cho, M. S. and Kim, K. J., 2002, Indirect input identification in multi-source environments by principal component analysis. *Mechanical Systems and Signal Processing*, v. 16, n. 5, p. 873-883.

Coburn, A. and Spence, R., 2002, *Earthquake Protection*, John Wiley and Sons, Inc., New York.

Dyke, S. J.; Bernal, D.; Beck, J. L. and Ventura, C., 2001, An experimental benchmark problem in structural health monitoring. *Proceedings of the 3^{rd} International Workshop on Structural Health Monitoring*, Stanford, CA, USA.

Doebling, S. W.; Farrar, C. R. and Prime, M. B., 1998, A summary review of vibration-based damage identification methods. *The Shock and Vibration Digest*, v. 30, n. 2, p. 91-105.

Farrar, C. R.; Lieven, N. A. J. and Bement, M. T., 2005, An introduction to damage prognosis, Chapter 1 of Inman, D. J.; Farrar, C. R.; Lopes Jr, V.; Steffen Jr., V. (Editors). *Damage Prognosis: For Aerospace, Civil and Mechanical Systems.* 1^{st} Edition, John Wiley & Sons.

Fugate, M. L.; Sohn, H. and Farrar, C. R., 2000, Unsupervised learning methods for vibration-based damage detection. *Proceedings of the 18^{th} International Modal Analysis Conference – IMAC*, San Antonio, Texas, USA.

Inman, D. J., 2001, Smart structures: examples and new problems. *Proceedings of the 16^{th} Brazilian Congress of Mechanical Engineering – COBEM 2001*, Uberlândia, MG.

Johnson, E. A.; Lam, H. F.; Katafygiotis, L. S. and Beck, J. L., 2000, A benchmark problem for structural health monitoring and damage detection. *Proceedings of the 14^{th} Engineering Mechanics Conference*, Austin, Texas, USA.

Lei, Y.; Kiremidjian, A. S.; Nair, K. K.; Lynch, J. P.; Law, K. H.; Kenny, T. W.; Carryer, E. and Kottapalli, A., 2003, Statistical damage detection using time series analysis on a structural health monitoring benchmark problem. *Proceedings of the 9^{th} International Congress on Applications of Statistical and Probability in Civil Engineering*, San Francisco, CA, USA.

Lu, Y. and Gao, F., 2005, A novel time-domain auto-regressive model for structural damage diagnosis. *Journal of Sound and Vibration*, v. 283, n. 3-5, p. 1031-1049.

Park, G.; Rutherford, A. C.; Sohn, H. and Farrar, C. R., 2005, An outlier analysis framework for impedance-based structural health monitoring. *Journal of Sound and Vibration*, v. 286, n. 1-2, p. 229-250.

Shin, K.; Feraday, S. A.; Harris, C. J.; Brennan, M. J. and Oh, J. E., 2003, Optimal autoregressive modelling of a measured noisy deterministic signal using singular-value decomposition. *Mechanical Systems and Signal Processing*, v. 17, n. 2, p. 423-432.

Silva, S.; Lopes Jr., V. and Dias Jr., M., 2005, Detecção de falhas estruturais utilizando controle estatístico de processos. *Proceedings of the 4º Congresso Temático de Dinâmica, Controle e Aplicações*, Bauru, SP. 4.º DINCON, p. 47-54.

Sohn, H.; Czarnecki, J. J.; Farrar, C. R. and Fellow, P. E., 2000, Structural health monitoring using statistical process controls. *Journal of Structural Engineering*, ASCE, v. 126, n. 11, p. 1356-1363.

Sohn, H. and Farrar, C. R., 2001, Damage diagnosis using time series analysis of vibration signals. *Smart Materials and Structures*, v. 10, n. 3, p. 446-451.

Wang, W., 2003, An evaluation of some emerging techniques for gear fault detection. *Structural Health Monitoring – SHM.* v. 2, n. 3, p. 225-242.

Wirsching, P. H.; Paez, T. L. and Ortiz, K., 1995, *Random Vibrations: Theory and Practice*. John Wiley & Sons, Inc.

Worden, K.; Manson, G. and Fieller, N. R. J., 2000, Damage detection using outlier analysis. *Journal of Sound and Vibration*, v. 229, n. 3, p. 647-667.

Worden, K. and Dulieu-Barton, J. M., 2004, An overview of intelligent fault detection in system and structures. *Structural Health Monitoring – SHM.* v. 3, n. 1, p. 85-98.

Paper accepted April, 2006.

Technical Editor: Nestor A. Zouain Pereira.