Method for automatic detection of wheezing in lung sounds

Riella, R.J.; Nohama, P.; Maia, J.M.

doi:10.1590/S0100-879X2009000700013

Abstract

The present report describes the development of a technique for automatic wheezing recognition in digitally recorded lung sounds. This method is based on the extraction and processing of spectral information from the respiratory cycle and the use of these data for user feedback and automatic recognition. The respiratory cycle is first pre-processed, in order to normalize its spectral information, and its spectrogram is then computed. After this procedure, the spectrogram image is processed by a two-dimensional convolution filter and a half-threshold in order to increase the contrast and isolate its highest amplitude components, respectively. Then, in order to generate more compact data for automatic recognition, the spectral projection of the processed spectrogram is computed and stored as an array. The highest magnitude values of the array and their respective spectral values are then located and used as inputs to a multi-layer perceptron artificial neural network, which outputs an automatic indication of the presence of wheezes. For validation of the methodology, lung sounds recorded from three different repositories were used. The results show that the proposed technique achieves 84.82% accuracy in the detection of wheezing for an isolated respiratory cycle and 92.86% accuracy for the detection of wheezes when detection is carried out using groups of respiratory cycles obtained from the same person. Also, the system presents the original recorded sound and the post-processed spectrogram image for the user to draw his own conclusions from the data.

Key words: Wheezes; Lung sounds; Spectrogram; Digital image processing; Artificial neural networks

Braz J Med Biol Res, July 2009, Volume 42(7) 674-684

R.J. Riella¹,², P. Nohama¹ and J.M. Maia¹

¹Departamento de Eletrônica e Centro de Pós-Graduação em Engenharia Elétrica e Informática Industrial, Universidade Tecnológica Federal do Paraná, Curitiba, PR, Brasil

²Instituto de Tecnologia para o Desenvolvimento, Curitiba, PR, Brasil

Introduction

According to data from the Brazilian Health Department, deaths caused by respiratory diseases increased by 3% from 1980 to 1999, making these diseases the fifth leading cause of death in Brazil in 1999 (1). In many countries nearly 5% of the population suffers from asthma and other related chest disorders (2).

Respiratory sounds constitute a relevant source of information for the investigation of the state of the lungs and of the other organs that compose the respiratory system. Since respiratory sounds can be acquired by auscultation, an easy and non-invasive procedure, extracting more relevant information from them may reduce the time needed for diagnosis and consequently increase treatment efficiency. Thus, an automated algorithm developed to recognize abnormalities in lung sounds may be of great relevance to clinical diagnosis.

Abnormal lung sounds may be classified according to two main categories: crackles and wheezes (2). Wheezes are musical adventitious lung sounds, also called continuous (3). These adventitious sounds are characterized by a dominant frequency, usually over 100 Hz (3), and a duration of more than 100 ms (4). Their presence is related to partial airway obstruction (3,5). Also, wheezing with unforced breathing is correlated with the severity of airway obstruction. Therefore, its auscultation has been used for the detection and evaluation of diseases such as children's nocturnal asthma (6) and for the evaluation of bronchoconstriction in asthma (5). The presence of wheezing in infants has also been used as a parameter to evaluate the predisposition to asthma (7). Other investigations have used wheezing, among other symptoms, to evaluate the physician's and patient's perception of acute asthma exacerbation, compared to objective measurements such as forced expiratory volume (8).

In contrast, crackles are short, explosive and discontinuous sounds, shorter than 100 ms, usually occurring during inspiration (4). They are characterized by a rapid initial pressure deflection followed by a short oscillation (9). These adventitious sounds are classified as fine crackles and coarse crackles based on their duration. Thus, fine crackles are defined as those lasting less than 10 ms and coarse crackles are defined as those lasting more than 10 ms (4).

Crackles are a qualitative diagnostic tool (3). They can be produced either by the explosive opening of regions of the lungs deflated to residual volume (10), due to sudden equalization of gas pressure during inspiration, or by a change in elastic stress resulting from the sudden opening of closed airways (11).

However, the instrument used for auscultation, the stethoscope, sometimes does not present an efficient response in the acquisition of lung sounds. Basically, this instrument is simply a sound conduit between the body surface and the ears (3). Its frequency response is rarely tested, rated or compared, and instruments are usually chosen for their appearance, reputation and inadequately supported claims of performance rather than for their technical characteristics (3). Usually, the frequency response of the stethoscope favors the lower frequencies, amplifying those lower than 112 Hz and attenuating higher frequencies (12). Therefore, the response of the stethoscope is insufficient when auscultating pulmonary sounds, which may have frequency components far above 112 Hz. Since heart sounds are composed mainly of lower frequencies, they interfere strongly when the stethoscope is used to auscultate lung sounds, which may lead to misinterpretation of respiratory sounds.

In clinical practice, some difficulties occur during the diagnostic process, such as a difference in sensitivity between the ears, physicians' practice in the task of recognizing lung sounds, and presence of external and internal noises, which may cause errors in the identification of the sound as pathological or normal, impairing the precision of the diagnosis.

In addition, lung sounds are naturally non-stationary signals. This property can be observed in both healthy and abnormal subjects, but the non-stationarity is more severe in abnormal lung sounds. Therefore, significant diagnostic information can be obtained from the frequency distribution of lung sounds, and the choice of the signal processing technique used to extract this information is very important for maximizing the efficiency of extraction (13). This task has motivated many studies on the classification of lung sounds using frequency analysis (13-17).

Spectrograms, which are also called sonograms (4) or, when applied to respiratory sounds, respirosonograms (3), have been widely used for auscultation teaching, lung sound research and the evaluation of techniques for the processing of respiratory sounds (3,14-19). However, their applications are usually restricted to the visualization of the spectral information of lung sounds. Since wheezes are musical or continuous abnormalities (3), their presence produces a typical picture in the spectrogram. In this image, a wheeze appears as a continuous horizontal line marking the time interval over which the main frequency is present, together with other horizontal lines representing the remaining frequency components of the wheeze, which are usually harmonics of the main frequency.

The aim of the present study was to develop a method for recognizing wheezes in lung sounds by applying image processing techniques to a normalized spectrogram obtained from these sounds. The proposed method enhances the visibility of the characteristic wheezing picture and uses its features as the input to an artificial neural network-based pattern recognition system, which classifies sounds as being with or without wheezes.

Material and Methods

The proposed technique consists of a dynamic structure developed to extract the parameters that define the wheezing characteristics, eliminate redundant or inexpressive data, and apply the pattern recognition algorithm itself. In order to implement and test the proposed methodology, computer software was developed using the C++ language. It accepts only lung sounds recorded in wave files and generates spectrograms in bitmap format. The flowchart of the software is shown in Figure 1. It may be divided into four blocks: 1) pre-processing of the digitally recorded lung sounds, 2) generation of a spectrogram, 3) digital image processing of the generated spectrogram, and 4) pattern recognition. All of these blocks are detailed below.

Pre-processing

Initially, the audio file containing the recording of a specific lung sound is opened and read. Since the software was developed with the purpose of testing the described technique, the respiratory sounds were processed off-line after being recorded. After the file reading procedure, the samples of the digital recording from the respiratory sound signal are stored in an array that will serve as data source for analysis.

The main procedure for information extraction from digitally recorded lung sounds in this study is spectrogram generation and processing. The spectrogram is the matrix obtained by applying a short-time Fourier transform (STFT) (16,20) to a one-dimensional signal. This processing technique generates images from sounds, expressing graphically the frequency components of the sound and the times at which these components occur. However, to ensure that the same spectrogram characteristics will be obtained from different types of recordings, a normalization process is required.

Digitally recorded lung sounds are discrete signals processed in a discrete system. Therefore, some properties of these signals must be normalized in order to ensure that the results will not change with signals recorded in different settings. These properties are directly related to the signal's spectral information. As described in the Nyquist theorem (21), the spectral information contained in a sampled signal is limited to the frequency band from 0 to half the sampling frequency. Thus, the fast Fourier transform (FFT) (20), which is the algorithm used in the STFT spectral computation, will reflect this frequency band. Its spectral resolution is given by the sampling frequency divided by the number of FFT points, and its maximum frequency point equals the maximum frequency contained in this band. Therefore, digital recordings with different sample rates generate FFTs with different spectral resolutions and, consequently, different spectrograms. Thus, to ensure that the proposed algorithm generates equal results for signals recorded with different sample rates, it is necessary to implement a sample rate normalization routine, which may be considered to be the main algorithm of the pre-processing procedure.
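As a worked illustration of this dependence on the sample rate, the spacing between adjacent bins of an N-point FFT is the sampling frequency divided by N. The sketch below assumes the 1024-point FFT used later in this paper; the function name is hypothetical and the 44.1 kHz rate is only an example input.

```python
def fft_bin_spacing_hz(sample_rate_hz, fft_points=1024):
    """Spacing between adjacent FFT bins, in Hz."""
    return sample_rate_hz / fft_points


# Example rates only: the same FFT size yields different spectral resolutions.
for rate_hz in (44100, 10000, 9000):
    nyquist_hz = rate_hz / 2
    spacing_hz = fft_bin_spacing_hz(rate_hz)
    print(f"{rate_hz} Hz: Nyquist {nyquist_hz:.0f} Hz, bin spacing {spacing_hz:.2f} Hz")
```

The same spectrogram row therefore corresponds to a different frequency for each sample rate unless the rate is normalized first.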

To implement the sample rate normalization, the original sample rate is first detected. If this sample rate is higher than 9 kHz, a down-sampling algorithm is applied to bring the signal to this normalized rate. The algorithm first computes a low-pass filter with a cut-off frequency of 4 kHz, designed for the original sampling frequency, and applies it to the signal. This filter works as a second anti-aliasing filter, limiting the bandwidth of the original signal so that the sampling frequency can be reduced. After the filtering process, the signal is down-sampled to 9 kHz. As all recordings used to validate this algorithm had a sampling frequency higher than 9 kHz, only a down-sampling algorithm was implemented.
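A minimal sketch of such a down-sampling step is given below. The paper does not state how the low-pass filter was designed or how non-integer rate ratios were handled, so the Hamming-windowed sinc design, the linear interpolation and all function names are assumptions made for illustration only.

```python
import math


def lowpass_fir(num_taps, cutoff_hz, sample_rate_hz):
    """Hamming-windowed sinc low-pass FIR coefficients with unity DC gain."""
    fc = cutoff_hz / sample_rate_hz              # cut-off in cycles per sample
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def fir_filter(signal, taps):
    """Direct-form FIR filtering; the output has the same length as the input."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out


def downsample(signal, original_rate_hz, target_rate_hz=9000,
               num_taps=100, cutoff_hz=4000):
    """Low-pass filter, then resample by linear interpolation (assumed method)."""
    if original_rate_hz <= target_rate_hz:
        return list(signal)                      # nothing to do in this sketch
    taps = lowpass_fir(num_taps, cutoff_hz, original_rate_hz)
    filtered = fir_filter(signal, taps)
    step = original_rate_hz / target_rate_hz
    out = []
    for i in range(int((len(filtered) - 1) / step) + 1):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(filtered) - 1)
        frac = pos - lo
        out.append((1 - frac) * filtered[lo] + frac * filtered[hi])
    return out
```

For example, `downsample(samples, 18000)` halves the number of samples; a production implementation would more likely use a polyphase resampler.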

Since the purpose of the proposed method is the recognition of wheezing in lung sounds extracted from any source, another careful procedure must be followed to eliminate other types of contaminating signals, which may be seen in the spectrogram and may lead to misinterpretations. These signals, containing low-frequency components, are located spectrally in a lower region than the wheeze spectra, and may be generated by heart sounds and other types of continuous adventitious sounds such as rhonchus. In order to eliminate these signals, a high-pass filter was built. For this specific application, a Butterworth filter was designed. The digital filters used in the pre-processing procedure were a 100-tap finite impulse response low-pass filter with a cut-off frequency of 4 kHz and an infinite impulse response third-order Butterworth high-pass filter with a cut-off frequency of 20 Hz. Since at this point the sample frequency is already normalized, the high-pass filter coefficients are computed for the sample frequency of 9 kHz.
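The high-pass stage can be sketched as follows. This is one possible realization of a third-order Butterworth high-pass filter with a 20 Hz cut-off at 9 kHz, obtained with the bilinear transform and split into a first-order and a second-order section; the design route, the cascade structure and the function names are assumptions, since the paper does not give the coefficients.

```python
import cmath
import math


def butterworth_highpass_3rd(cutoff_hz=20.0, sample_rate_hz=9000.0):
    """Return two (b, a) sections: first order, then second order (Q = 1)."""
    k = math.tan(math.pi * cutoff_hz / sample_rate_hz)    # pre-warped cut-off
    # First-order prototype H(s) = s / (s + 1)
    first = ([1 / (1 + k), -1 / (1 + k), 0.0],
             [1.0, -(1 - k) / (1 + k), 0.0])
    # Second-order prototype H(s) = s^2 / (s^2 + s + 1)
    norm = 1 / (1 + k + k * k)
    second = ([norm, -2 * norm, norm],
              [1.0, 2 * (k * k - 1) * norm, (1 - k + k * k) * norm])
    return [first, second]


def apply_sections(signal, sections):
    """Filter the signal through each section in cascade (direct form I)."""
    data = list(signal)
    for b, a in sections:
        x1 = x2 = y1 = y2 = 0.0
        out = []
        for x0 in data:
            y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
            out.append(y0)
            x2, x1, y2, y1 = x1, x0, y1, y0
        data = out
    return data


def gain_at(freq_hz, sections, sample_rate_hz=9000.0):
    """Magnitude of the cascade's frequency response at freq_hz."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate_hz)
    gain = 1.0
    for b, a in sections:
        num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
        den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
        gain *= abs(num / den)
    return gain
```

Evaluating `gain_at` at a few frequencies is a quick way to confirm that components far below the wheeze band are attenuated while the band above 100 Hz is left essentially unchanged.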

Thus, after running the pre-processing procedure, the spectrogram may be generated, and it is certain that any type of lung sounds recorded with sample frequencies equal to or higher than 9 kHz will generate the same frequency range in the spectrogram.

Spectrogram generation

Spectrograms are the figures generated by the application of the STFT theorem (20). The theorem states that the spectrum variation of a non-stationary signal, which may not be seen by a single Fourier analysis, may be generated by segmenting this signal into slices, which are considered stationary, and by computing the Fourier transformation of these slices. In this way, the STFT may be defined by Equation 1, where x(t) is the signal in the time domain, w(t) defines the window function, and τ is the time localization of the STFT.

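$$\mathrm{STFT}(\tau, \omega) = \int_{-\infty}^{\infty} x(t)\, w(t - \tau)\, e^{-j\omega t}\, dt$$ (Equation 1)

Here ω denotes the angular frequency.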

In a discrete system, the STFT may be implemented by segmenting the signal samples into groups and computing the FFT of each group. In the spectrogram computation, we used 1024-point FFTs with a Hamming window and 60% overlap. The matrix resulting from the computation describes three dimensions of the analyzed signal: time, frequency and magnitude. It is usually displayed as an image in which time is represented on the X-axis, frequency on the Y-axis, and magnitude by a color or grey scale. The resulting spectrogram image presents different characteristics according to the kind of adventitious sound contained in the respiratory cycle.
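The discrete computation described above can be sketched as follows, using the stated 1024-point FFT, Hamming window and 60% overlap. The radix-2 FFT routine, the frame layout (one list of magnitudes per time frame) and the test tone are assumptions chosen to keep the example self-contained; they are not taken from the original software.

```python
import cmath
import math


def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def spectrogram(signal, fft_points=1024, overlap=0.60):
    """Magnitude spectrogram: one list of bins 0..N/2 per time frame."""
    hop = fft_points - int(fft_points * overlap)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (fft_points - 1))
              for n in range(fft_points)]
    frames = []
    for start in range(0, len(signal) - fft_points + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(fft_points)]
        spectrum = fft(segment)
        frames.append([abs(c) for c in spectrum[:fft_points // 2 + 1]])
    return frames


# Example input: a 500 Hz tone sampled at the normalized rate of 9 kHz.
sample_rate_hz = 9000
tone = [math.sin(2 * math.pi * 500 * n / sample_rate_hz) for n in range(2048)]
frames = spectrogram(tone)
peak_bin = max(range(len(frames[0])), key=lambda k: frames[0][k])
print(len(frames), "frames; strongest bin near",
      round(peak_bin * sample_rate_hz / 1024, 1), "Hz")
```

A steady tone like this one occupies the same bin in every frame, which is the horizontal-line signature that the following sections exploit.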

Normal lung sounds are basically formed by the sound of the airflow in the airways, usually contaminated by heart sounds. Thus, spectrograms of these sounds usually present a continuous magnitude decay from lower to higher frequencies and higher spectral components in the region lower than 100 Hz, with these components being related to the heart sounds. These characteristics are shown in Figure 2, which is a spectrogram of a normal vesicular sound.

The musical wheezing characteristics are determined by the fundamental frequency and its harmonics, over 100 Hz, and with a duration longer than 100 ms as described by Leher (18). Since these characteristics are continuous, the resulting spectrogram presents horizontal lines that define the strong presence of the wheeze main frequency and its harmonics during a certain period of time. This property may be seen in Figure 3 for a bronchial respiratory cycle with wheezing.

In contrast, the discontinuous abnormalities, or crackles, are characterized by short explosive sounds. Since these sounds are of relatively high amplitude, condensed in a short-time interval, they usually generate vertical lines in the spectrogram figure referring to the spectrum contained in this type of signal. This characteristic is shown in Figure 4, which represents the spectrogram of a respiratory cycle containing fine crackles during the inspiration phase.

Image treatment of the spectrogram

As may be seen in Figure 3, wheezing present in lung sounds shows a characteristic profile in the spectrogram, demonstrating great differences when compared to the pattern of discontinuous abnormalities. Thus, to permit automatic wheezing recognition, the proposed technique applies some digital image processing techniques to isolate the wheezing characteristic figure in the spectrogram. In order to isolate the main line, which defines the wheezing fundamental frequency in the spectrogram, two-dimensional convolution masks were initially applied. These masks were implemented in order to increase the spectrogram's contrast and enhance its edges. Therefore, the application of this processing technique enhances the presence of isolated lines in the graphic image, increasing the visibility of the characteristic images of wheezes and keeping their magnitude intact. Among all masks tried, the one yielding the best result was the Laplacian 9 x 9 mask (22). After the application of this two-dimensional filter, the edges were enhanced and the noise was reduced. The resulting spectrogram may be seen in Figure 5. Since the characteristic images of the wheezes were kept intact after the application of the convolution mask, it was possible to eliminate the spectrogram's low-magnitude components in order to isolate only the high-amplitude information. To accomplish this task, a half-threshold algorithm was developed. This procedure is called half-threshold because only the values under a threshold are set to zero, with the values over this threshold being kept intact.
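The effect of a Laplacian-type convolution mask on a horizontal line can be illustrated with a small example. The coefficients of the 9 x 9 mask are not listed in this paper, so the sketch below substitutes a common 3 x 3 Laplacian mask as a stand-in; the mask, the zero padding at the borders and the toy image are all assumptions.

```python
def convolve2d(image, kernel):
    """Same-size 2-D convolution with zero padding; image is a list of rows."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r_off, c_off = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    rr, cc = r + r_off - i, c + c_off - j
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += kernel[i][j] * image[rr][cc]
            out[r][c] = acc
    return out


# Stand-in 3 x 3 Laplacian mask (NOT the 9 x 9 mask used in the paper).
LAPLACIAN_3X3 = [[-1, -1, -1],
                 [-1,  8, -1],
                 [-1, -1, -1]]

# Toy spectrogram: flat background of 10 with one horizontal line of 50.
image = [[10] * 6 for _ in range(5)]
image[2] = [50] * 6
enhanced = convolve2d(image, LAPLACIAN_3X3)
for row in enhanced:
    print(row)
```

In the output, the row holding the line is strongly amplified relative to its neighbours, which is the contrast increase that the subsequent threshold relies on.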

Different sounds may be recorded with different patterns. Therefore, the resulting signal may present a variable recording level. In order to obtain the same limitation for any recording level, the limiting algorithm sweeps along the spectrogram obtained after the application of the Laplacian mask and locates its highest value. It then assigns a zero magnitude to all points with values lower than the threshold, which is computed as a percentage of this highest value.

The threshold was determined empirically and the best result occurred at 80% of the peak value. Thus, every point with magnitude lower than 80% of the highest value of the spectrogram is considered to be zero when the threshold is applied. An example of the effects of the threshold on the spectrogram may be seen in Figures 6 and 7. Figure 6 presents the processed spectrogram of a normal vesicular sound, whose non-processed spectrogram is shown in Figure 2.
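The half-threshold described above amounts to a few lines of code. The sketch below assumes the spectrogram is held as a list of rows of magnitudes; the function name and the sample values are hypothetical.

```python
def half_threshold(image, fraction=0.80):
    """Zero every value below fraction * peak; keep the others unchanged."""
    peak = max(max(row) for row in image)
    limit = fraction * peak
    return [[value if value >= limit else 0 for value in row] for row in image]


# Toy example: the peak is 100, so values below 80 are removed.
print(half_threshold([[10, 100, 79],
                      [81, 90, 50]]))
```

Because the limit is relative to the peak of each spectrogram, the same fraction can be applied to recordings made at different levels.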

The processed spectrogram of normal breaths usually shows only alternating pulses in the very low-frequency range and sometimes some high-amplitude components in a higher spectral region, usually under 200 Hz. These characteristics may be easily seen in Figure 6. In contrast, the spectrogram of a respiratory cycle containing wheezes, presented in Figure 7, shows only the solid horizontal line, above 200 Hz, and some sparse low-frequency components. This solid horizontal line is the characteristic figure of wheezes, which was always prominent in the processed spectrograms obtained during the tests.

Because of these characteristics, the processed spectrograms are presented to the user as feedback, in addition to the automatic wheeze classification. Thus, the user can make his own evaluation and draw his own conclusion regarding the processed respiratory cycle, confirming or rejecting the automatic classification.

At this point, the presence of wheezing in an analyzed respiratory cycle may be easily seen, although a reduction of the amount of data is required in order to apply the automatic pattern recognition module. Therefore, to perform this data reduction, the mean spectral projection is computed from the processed spectrogram, and its result is stored in an array. The graph obtained from this projection is illustrated in Figure 8, which corresponds to the processed spectrogram in Figure 7. When this array presents high and isolated magnitudes, characterized as edges over 100 Hz (3), it may be assumed that there is a high isolated frequency component and that the signal presents a high probability of the presence of wheezes. Thus, the ten largest edges of the array are located and their frequency and amplitude values are stored in order to be used as data source for an artificial neural network-based pattern recognition module.
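The data reduction step can be sketched as below. The paper does not define precisely how an "edge" of the projection array is detected, so the sketch interprets edges as local maxima; that interpretation, the row-per-frequency layout and the function names are assumptions.

```python
def spectral_projection(image):
    """Mean magnitude of each frequency row across all time columns."""
    return [sum(row) / len(row) for row in image]


def largest_peaks(projection, bin_spacing_hz, count=10):
    """Return up to `count` (frequency_hz, amplitude) pairs, strongest first."""
    peaks = []
    for k in range(1, len(projection) - 1):
        # Assumed definition of an edge: a local maximum of the projection.
        if projection[k] > projection[k - 1] and projection[k] >= projection[k + 1]:
            peaks.append((k * bin_spacing_hz, projection[k]))
    peaks.sort(key=lambda peak: peak[1], reverse=True)
    return peaks[:count]


# Toy processed spectrogram: rows are frequency bins, columns are time frames.
image = [[0, 0, 0],
         [0, 0, 0],
         [5, 7, 6],
         [0, 0, 0],
         [1, 1, 1],
         [0, 0, 0]]
print(largest_peaks(spectral_projection(image), bin_spacing_hz=100.0))
```

A thresholded spectrogram of several hundred rows and columns is thus reduced to at most ten frequency and amplitude pairs.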

Pattern recognition

Since the lung sound patterns vary widely among different recording techniques and subjects, a pattern recognition module based on an artificial neural network was created to classify the analyzed input data as containing wheezing or not.

The multilayer perceptron neural network contains 20 inputs, 41 neurons in the hidden layer and 2 neurons in the output layer. The inputs correspond to the frequency and amplitude of the 10 largest edges of the mean spectral projection of the processed spectrogram. The number of neurons in the hidden layer was defined as 2n + 1, where n represents the number of inputs (23), and the number of neurons in the output layer corresponds to the number of classification patterns.
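The network dimensions can be made concrete with a forward pass through a 20-41-2 perceptron using hyperbolic tangent activations. The weights below are random placeholders rather than trained values, and the helper names are hypothetical, so the scores carry no diagnostic meaning.

```python
import math
import random


def make_layer(n_inputs, n_neurons, rng):
    """Small random weights plus one trailing bias per neuron (untrained)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs + 1)]
            for _ in range(n_neurons)]


def forward(layer, inputs):
    """tanh of each neuron's weighted sum; the last weight is the bias."""
    return [math.tanh(sum(w * x for w, x in zip(neuron, inputs)) + neuron[-1])
            for neuron in layer]


rng = random.Random(0)
n_inputs = 20
hidden = make_layer(n_inputs, 2 * n_inputs + 1, rng)    # 2n + 1 = 41 neurons
output = make_layer(len(hidden), 2, rng)                # one neuron per class

features = [0.0] * n_inputs                             # placeholder inputs
scores = forward(output, forward(hidden, features))
print(len(hidden), "hidden neurons; output scores:", scores)
```

Training with back-propagation would adjust the weights in `hidden` and `output`; only the layer sizes and the activation function are taken from the text.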

The activation function selected for all neurons in the artificial neural network was the hyperbolic tangent. This function was chosen because of its sigmoid shape, with a magnitude variation between -1 and 1. Another advantage of this activation function is that it is differentiable, which is a requirement when using the back-propagation algorithm. However, the use of this activation function requires the observation of some properties. As the useful region of this sigmoid is restricted to approximately -7 to 7, it is necessary to keep the magnitudes of the weights and biases low enough that the induced local field does not fall outside this interval. To avoid saturation of the activation function at either extreme, a data normalization technique was applied to the frequency and magnitude values used as data source for the artificial neural network.

Since the frequency values may vary from 0 to 4000 Hz, limited by the new anti-aliasing filter in the pre-processing phase, all the frequency points were divided by 1000, reducing the interval to 0 to 4. The average pixel amplitude may range from 0 to 255. Thus, during the normalization process, the magnitude of these points was divided by 100, reducing this interval to 0 to 2.55.
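The normalization can be sketched as a function that turns the located edges into the 20 network inputs. The ordering of the inputs (frequency and amplitude interleaved) and the zero padding used when fewer than ten edges are found are assumptions, as neither is specified in the text.

```python
def build_input_vector(peaks, count=10):
    """Flatten (frequency_hz, pixel_amplitude) pairs into 2 * count inputs."""
    kept = list(peaks[:count])
    padded = kept + [(0.0, 0.0)] * (count - len(kept))   # assumed zero padding
    vector = []
    for frequency_hz, amplitude in padded:
        vector.append(frequency_hz / 1000.0)   # 0 to 4000 Hz becomes 0 to 4
        vector.append(amplitude / 100.0)       # 0 to 255 becomes 0 to 2.55
    return vector


# Two hypothetical edges; the remaining eight slots are padded with zeros.
print(build_input_vector([(400.0, 255.0), (2000.0, 100.0)]))
```

Keeping both quantities within a few units prevents the first-layer sums from driving the hyperbolic tangent into saturation.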

Forty recorded respiratory cycles were used for the training procedure, with 20 respiratory cycles containing wheezes and 20 respiratory cycles containing normal lung sounds and respiratory cycles with other types of continuous and discontinuous anomalies.

The proposed method was validated using 28 different recordings from different individuals ranging from newborn babies to 76-year-old subjects. The recordings were obtained from internet repositories (19,24,25). Generally, the repositories do not indicate the recording standards. Of the available resources, only PixSoft (19) presents this information. Its recordings (19) were made with contact accelerometers at a sample frequency of 10 kHz.

Since the number of respiratory cycles in each recording may vary from one to eleven, the total respiratory cycles analyzed were 112, 40 of them with and 72 without wheezes.

Results

When all respiratory sounds were evaluated separately, without establishing a relation between the lung sound and the volunteer, the algorithm presented the results shown in Table 1. Positive values were computed when the technique resulted in a positive value for respiratory cycles containing wheezes and negative values were computed when the technique returned a negative value for respiratory cycles without wheezes. In contrast, false-positive values were computed when a positive value occurred for respiratory cycles without wheezes and false-negative values were computed when a negative value occurred for respiratory cycles containing wheezes. The total accuracy was computed by adding the positive and negative values.

The performance of the algorithm developed for the analysis of isolated respiratory cycles was assessed in terms of sensitivity (se), specificity (sp) and performance (per), as defined by Equations 2 to 4. These analyses resulted in a sensitivity of 0.861, a specificity of 0.825 and a total performance of 84.82%.

(Equation 2)

(Equation 3)

(Equation 4)
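For readers without access to the equation images, the sketch below computes the three measures using their conventional definitions, which may differ in detail from Equations 2 to 4. The counts are hypothetical values chosen only to exercise the functions; they are not the values of Table 1.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of cycles with wheezes that were classified as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of cycles without wheezes that were classified as negative."""
    return true_neg / (true_neg + false_pos)


def performance(true_pos, true_neg, false_pos, false_neg):
    """Fraction of all cycles that were classified correctly."""
    total = true_pos + true_neg + false_pos + false_neg
    return (true_pos + true_neg) / total


# Hypothetical counts (NOT taken from the paper).
tp, fn, tn, fp = 8, 2, 9, 1
print(sensitivity(tp, fn), specificity(tn, fp), performance(tp, tn, fp, fn))
```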

For each volunteer, the mean matching index for all respiratory cycles was computed. When the number of accurate determinations and errors was equal, the result was computed as undefined. The resulting performance of the algorithm in this analysis was 26 accurate results (92.86%), 1 error (3.57%) and 1 result considered undefined (3.57%).
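The per-volunteer decision can be sketched as a comparison between the number of correctly and incorrectly classified cycles. Reading the mean matching index as a simple majority over the cycles is an interpretation, and the function name and labels are hypothetical.

```python
def subject_result(cycle_is_correct):
    """Summarize one volunteer from per-cycle correctness flags."""
    hits = sum(1 for correct in cycle_is_correct if correct)
    misses = len(cycle_is_correct) - hits
    if hits > misses:
        return "accurate"
    if hits < misses:
        return "error"
    return "undefined"          # equal numbers of hits and misses


print(subject_result([True, True, False]))   # majority of cycles correct
print(subject_result([True, False]))         # tie between hits and misses
```

Grouping cycles in this way lets a few misclassified cycles be outvoted by the remaining cycles from the same recording.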

Figure 1.
Flowchart of the software described here to implement the proposed methodology for automatic wheezing recognition. FIR = finite impulse response digital filter; IIR = infinite impulse response digital filter.

Figure 2.
Spectrogram obtained from a normal vesicular respiratory cycle. The almost continuous magnitude decay with increasing frequency can be seen as continuous whitening in middle and higher frequencies, shown by the vertical arrowheads. The spectral region less than 100 Hz contains the majority of the frequency components (horizontal arrowhead).

Figure 3.
Spectrogram resulting from a bronchial respiratory cycle with wheezing. The horizontal lines in the middle-end (arrowheads) show the appearance of strong frequency components characterizing the presence of wheezes.

Figure 4.
Spectrogram of a respiratory cycle containing fine crackles. The vertical lines at the top of the figure indicate the presence of many impulsive signals, in the region between the first two arrowheads. Subsequent arrowheads point to isolated crackles.

Figure 5.
Spectrogram presented in Figure 3 after applying the Laplacian 9 x 9 mask. The application of this mask increases the contrast of horizontal lines, which are characteristic of the presence of wheezes. The arrowheads indicate the wheezing signals, which have to be maintained intact.

Figure 6.
Spectrogram of a normal vesicular sound, presented in Figure 2, after applying the Laplacian 9 x 9 mask and threshold. After this processing, only a few spectral components remain in the figure, all of them located in the low frequency region (arrowheads).

Figure 7.
Spectrogram presented in Figure 3 after applying the Laplacian 9 x 9 mask and threshold. The arrowhead points to the horizontal line, which represents the wheeze's main frequency that was maintained intact after processing.

Figure 8.
Spectral projection of the processed spectrogram. The arrowhead points to the isolated high amplitude value resulting from the presence of an isolated frequency, which is the wheeze's main frequency.

Discussion

The main principle, which motivated the development of the proposed methodology, is the characteristic figure generated in the spectrogram from the respiratory cycles containing wheezes. Since this type of adventitious sound is composed basically of a fundamental frequency and its harmonics, these frequencies appear as a horizontal line in the spectrogram. Based on this characteristic, the proposed methodology enhances the wheeze in a visual way, applying image processing techniques to the spectrogram and using them as a data source for an automatic pattern recognition system.

In a qualitative analysis, all processed spectrograms of respiratory cycles containing wheezes showed a characteristic figure formed by a horizontal line, which may easily determine the presence of wheezes in a visual analysis.

Concerning the automatic recognition module, the best artificial neural network proved to be very robust, with its errors dispersed across the analyzed set. This indicates that, within the tested sound domain, it is not possible to identify a specific type of sound that generates a higher number of recognition errors. This is supported by the fact that only one example, a normal bronchial sound, produced an error in the general diagnosis.

The computation time needed for the generation and processing of the spectrogram is an important feature of the viability of the proposed technique. In the proposed application, the spectrogram generation, filtering and limiting may require more than 5 s for a 2-s respiratory cycle. The computing time is acceptable for off-line analysis, but the algorithms must be optimized to allow real-time recognition. However, as the purpose of the software was only to test the effectiveness of the proposed methodology, procedures for time optimization were not implemented. Therefore, it would be possible to reduce this processing time by refining the computational procedures of the proposed algorithms, which might allow real time analysis. Those procedures are out of the scope of the present study, and will be implemented later.

The developed algorithm was conceived to return not only an automatic diagnosis, but also the processed spectrogram containing the wheezing figure, in order to allow the user to draw his own conclusions about the results obtained. According to the figures and the values obtained for the automatic recognition system, the analysis allows the conclusion that the proposed technique is robust and trustworthy for use as support for the detection of wheezing in lung sounds, mainly when the analysis is performed through several respiratory cycles recorded from the same patient.

The results presented here were the best score for 10 neural networks that had been trained. The distinction among these neural networks was only the training group. To achieve these results, the "without wheezing" group had to present several elements from different normal and abnormal lung sounds.

The values obtained from the artificial neural network could not be compared to those presented by some investigators who had proposed to develop systems for lung sound recognition due to different points of view adopted in the evaluation of the results obtained. Oud and Doijes (17), based on the analysis of respiratory sounds, classified their patients as healthy or asthmatic. The results reported by Kandaswamy et al. (13) were closer to those reported here. In their investigation, the respiratory cycles were divided into six groups: normal, wheeze, crackle, squawk, stridor, and rhonchus. The results presented in their paper, obtained from an artificial neural network chosen from a set of six artificial neural networks trained, were 94.02% accuracy for group validation and 91.67% mean efficiency for recognition. Therefore, despite the fact that the present study resulted in a high recognition index, the results presented in Ref. 13 have a larger number of classification categories.

The results computed here did not show differences regarding patient age, transducer position on the body or recording method. However, errors may occur when the recorded sounds contain noise at specific frequencies. Errors may also occur when the signal is filtered before recording and the filtering process favors frequencies higher than 200 Hz.

The proposed technique was developed with the purpose of creating not only a recognition system, but also an effective algorithm that could support physicians in the diagnosis of lung diseases. The algorithm returns not only an indication of the diagnosis but also the processed data. Thus, users may reach their own conclusions by analyzing these data. For this application, the treated spectrogram is displayed on the computer screen before the automatic recognition.

The results obtained during the tests indicate that this technique may be useful in clinical diagnosis, mainly when the analysis can be performed continually using many respiratory cycles from the same patient. However, the algorithm still needs to automatically detect the respiratory cycle limits, finding its beginning and end.

Finally, the novel technique presented here is a first step in the creation of an automatic lung sound analyzer that may be quite useful to increase the accuracy and speed of clinical diagnosis.

Acknowledgments

We are indebted to Professor Álvaro Luiz Stelle (in memoriam) for the knowledge he shared with us and for his support in the development of this research program. We thank the RALE repository for allowing us to use the recordings contained in its software.

Address for correspondence: R.J. Riella, DPEE/DVEL, LACTEC, Centro Politécnico, UFPR, BR-116, km 98, s/n, 81531-980 Curitiba, PR, Brasil. E-mail: riella@lactec.org.br, percy@utfpr.edu.br and joaquim@utfpr.edu.br

Research supported by CAPES. Received April 27, 2008. Accepted May 4, 2009.

1. Anuário Estatístico da Saúde no Brasil. Ministério da Saúde http://portal.saude.gov.br/saude/aplicacoes/anuario2001/index.cfm; 2001.
2. Murphy RLH. A simplified introduction to lung sounds Wellesley Hills: Stethophonics; 1977.
3. Pasterkamp H, Kraman SS, Wodicka GR. Respiratory sounds. Advances beyond the stethoscope. Am J Respir Crit Care Med 1997; 156: 974-987.
4. Sovijärvi ARA, Dalmasso F, Vanderschoot J, Malmberg LP, Righini G, Stoneman SAT. Definition of terms for application of respiratory sounds. Eur Respir Rev 2000; 10: 597-610.
5. Kiyokawa H, Yonemaru M, Horie S, Kasuga I, Ichinose Y, Toyama K. Detection of nocturnal wheezing in bronchial asthma using intermittent sleep tracheal sounds recording. Respirology 1999; 4: 37-45.
6. Bentur L, Beck R, Shinawi M, Naveh T, Gavriely N. Wheeze monitoring in children for assessment of nocturnal asthma and response to therapy. Eur Respir J 2003; 21: 621-626.
7. Martinez FD, Wright AL, Taussig LM, Holberg CJ, Halonen M, Morgan WJ. Asthma and wheezing in the first six years of life. The Group Health Medical Associates. N Engl J Med 1995; 332: 133-138.
8. Atta JA, Nunes MP, Fonseca-Guedes CH, Avena LA, Borgiani MT, Fiorenza RF, et al. Patient and physician evaluation of the severity of acute asthma exacerbations. Braz J Med Biol Res 2004; 37: 1321-1330.
9. Alencar AM, Buldyrev SV, Majumdar A, Stanley HE, Suki B. Avalanche dynamics of crackle sound in the lung. Phys Rev Lett 2001; 87: 088101.
10. Forgacs P. Crackles and wheezes. Lancet 1967; 2: 203-205.
11. Sovijärvi ARA, Malmberg LP, Charbonneau G, Vanderschoot J, Dalmasso F, Sacco C, et al. Characteristics of breath sounds and adventitious respiratory sounds. Eur Respir Rev 2000; 10: 591-596.
12. Abella M, Formolo J, Penney DG. Comparison of the acoustic properties of six popular stethoscopes. J Acoust Soc Am 1992; 91: 2224-2228.
13. Kandaswamy A, Kumar CS, Ramanathan RP, Jayaraman S, Malmurugan N. Neural classification of lung sounds using wavelet coefficients. Comput Biol Med 2004; 34: 523-537.
14. Taplidou SA, Hadjileontiadis LJ. Wheeze detection based on time-frequency analysis of breath sounds. Comput Biol Med 2007; 37: 1073-1083.
15. Taplidou SA, Hadjileontiadis LJ. Nonlinear analysis of wheezes using wavelet bicoherence. Comput Biol Med 2007; 37: 563-570.
16. Charbonneau G, Ademovic E, Cheetham BMG, Malmberg LP, Vanderschoot J, Sovijärvi ARA. Basic techniques for respiratory sound analysis. Eur Respir Rev 2000; 10: 625-635.
17. Oud M, Doijes EH. Automated breath sound analysis. Proceedings of 18th Annual International Conference of IEEE Engineering in Medicine and Biology Society Amsterdam: 1996. p 990-992.
18. Leher S. Understanding lung sounds 3rd edn. New York: Saunders; 2002.
19. PixSoft. The RALE repository. http://www.rale.ca
20. Cohen L. Time-frequency analysis 1st edn. Englewood Cliffs: Prentice-Hall; 1995.
21. Haykin S, Van Veen B. Signals and systems 1st edn. New York: John Wiley & Sons; 1998.
22. Myler HR, Weeks AR. Computer Imaging recipes in C Upper Saddle River: Prentice-Hall; 1993.
23. Haykin S. Neural networks: a comprehensive foundation 2nd edn. Englewood Cliffs: Prentice-Hall; 1999.
24. Ausculta pulmonar. http://orbita.starmedia.com/medbahia/pulmonar.htm
25. Louie S. IMD 420-C review of lung sounds. http://medocs.ucdavis.edu/IMD/420C/sounds/lngsound.htm

Table 1. Results of isolated respiratory cycle analysis.


Publication Dates

Publication in this collection: 26 June 2009
Date of issue: July 2009

History

Received: 27 Apr 2008
Accepted: 04 May 2009

This work is licensed under a Creative Commons Attribution 4.0 International License.
