A comparative analysis of preprocessing techniques of cardiac event series for the study of heart rhythm variability using simulated signals

In the present study, using noise-free simulated signals, we performed a comparative examination of several preprocessing techniques that are used to transform the cardiac event series in a regularly sampled time series, appropriate for spectral analysis of heart rhythm variability (HRV). First, a group of noise-free simulated point event series, which represents a time series of heartbeats, was generated by an integral pulse frequency modulation model. In order to evaluate the performance of the preprocessing methods, the differences between the spectra of the preprocessed simulated signals and the true spectrum (spectrum of the model input modulating signals) were surveyed by visual analysis and by contrasting merit indices. It is desired that estimated spectra match the true spectrum as close as possible, showing a minimum of harmonic components and other artifacts. The merit indices proposed to quantify these mismatches were the leakage rate, defined as a measure of leakage components (located outside some narrow windows centered at frequencies of model input modulating signals) with respect to the whole spectral components, and the numbers of leakage components with amplitudes greater than 1%, 5% and 10% of the total spectral components. Our data, obtained from a noise-free simulation, indicate that the utilization of heart rate values instead of heart period values in the derivation of signals representative of heart rhythm results in more accurate spectra. Furthermore, our data support the efficiency of the widely used preprocessing technique based on the convolution of inverse interval function values with a rectangular window, and suggest the preprocessing technique based on a cubic polynomial interpolation of inverse interval function values and succeeding spectral analysis as another efficient and fast method for the analysis of HRV signals. Correspondence


Introduction
Since the initial experiments of Akselrod et al. (1,2), Pomeranz et al. (3) and Pagani et al. (4), who suggested that the spectral analysis of heart rhythm variability (HRV) could be used as a novel and powerful noninvasive tool for indirect assessment of the sympathovagal modulating activity of cardiovascular function, the technique has been extensively used, not only in basic research, but also in clinical medicine (5)(6)(7)(8).In the frequency domain, three underlying frequency bands have been defined through the HRV spectrum observation (1-10) and named: a) "high-frequency band" or "respiratory sinus arrhythmia related band", which occurs at about the mean breathing frequency of the patient or experimental animal; b) "low-frequency band" or "Mayer wave-like sinus arrhythmia related band", which occurs at about 0.1 Hz spontaneous vasomotor activity, and c) "verylow-frequency band", which is the band of less than 0.05-Hz frequencies.Power alterations in the high-frequency band have been attributed to changes in cardiac parasympathetic activity, whereas in the low-frequency band they have been related to changes in both sympathetic and parasympathetic activities (1)(2)(3)(4)(5)(6)(7)(8).In contrast, the physiological elucidation of the very-low-frequency band power is much less established (8), albeit some researchers have related it to thermoregulatory activity, to the renin-angiotensin system and to other systems that involve humoral factors (1,2,9,10).
The analysis of signals in the frequency domain can be performed either by classical techniques, such as Blackman-Tukey's correlogram and Welch's periodogram (11) -both based on the fast Fourier transform (FFT) algorithm (12) -, or by autoregressive modern techniques (11).However, the processing of a series of non-equidistant point events, constituted by successive heartbeats (cardiac event series), and their analysis in the frequency domain are not straightfor-ward.Spectra can be estimated only from regularly sampled signals (11), although some techniques propose the direct estimation of the spectra of unevenly sampled data (13,14), but these algorithms are computationally cumbersome.Therefore, the cardiac point processes must be first submitted to mathematical approaches, denoted in this report as preprocessing procedures, to produce a series of equidistantly sampled data suitable for spectral analysis.This implies that different spectra can be defined by different forms of preprocessing.Moreover, the signal of HRV can be either heart period or heart rate, defined as the reciprocal values of each other.These signals have the same informative content, but the results obtained from each have shown considerable discrepancies, inasmuch as the relationship between them is non-linear (15).
As pointed out above, the numerous methods of preprocessing the HRV signal and the reciprocal mode to define them will result in dissimilar spectra.It is essential that the estimated spectra exhibit a minimum of harmonic components and other artifacts, therefore being as accurate as possible.In previous studies, De-Boer et al. (16,17) and Berger et al. (18) presented an evaluation of the performance of four common spectral techniques for analysis of HRV.Berger et al. (18) recommended their algorithm, based on the convolution of inverse interval function values with a rectangular window, as the most efficient for the derivation of heart rate signal from the electrocardiogram.These studies employed a group of noise-free simulated cardiac event series generated by an integral pulse frequency modulation (IPFM) model, and the performance of the preprocessing methods was assessed by means of a visual contrast between the estimated spectra (spectra of the preprocessed simulated HRV signals) and the true spectrum (spectrum of the model input modulating signal).In the present report we show a more comprehensive survey (assessing 15 types of spectra) and a well-defined comparison (using merit indices to quantify the mismatches between the estimated spectra and the true spectrum) of the several methods that can be employed in the signal preprocessing for the analysis of HRV.

Signal simulation
For the evaluation of the performance of the several methods that can be employed in the signal preprocessing for the analysis of HRV, we used as input signals for the preprocessing algorithms three noise-free computer-simulated temporal series of heartbeat intervals generated by an IPFM model.
Information embedded in a series of point events will be described by an IPFM where a[l] and ƒ[l] are, respectively, the amplitude and the frequency of the l-sinusoid (17,18).In the current study, we simulated three time series of 512 data (henceforth referred to as time series #1, #2 and #3), for which the model's threshold value T was 1.05 s, m 0 = 1, and the other parameters were established in the following fashion: a) m 1 (t) = 0.3sin(2π0.16t)for time series #1; b) m 1 (t) = 0.3sin(2π0.12t)+ 0.3sin(2π0.16t)for time series #2, and c) m 1 (t) = 0.3sin(2π0.07t)+ 0.3sin(2π0.16t)+ 0.3sin(2π0.28t)for time series #3.The parameters used in the formation of the time series #1 and #2 were chosen to carry out the same simulations previously reported by De-Boer et al. (16,17) and Berger et al. (18).In addition, we simulated a third time series, in which the modulating signal was comprised of three oscillatory components with frequencies equal to 0.07, 0.16 and 0.28 Hz.The values of T = 1.05 s and m 0 = 1 imply, by definition, an intrinsic frequency of 57 beats per minute.The frequencies chosen for the oscillatory components correspond approximately to the frequencies usually found in the HRV spectra of humans.De-Boer et al. (17) showed analytically that a modulation depth of a = 0.3 results in heartbeat intervals whose length is up to approximately 30% different from the mean value.

Signal preprocessing
The term preprocessing is employed in the current report to designate the mathematical approaches used in the derivation of an equidistantly sampled time series from irregularly sampled cardiac event series.The several spectra that were studied in this report (listed in Table 1) are defined below.
The spectrum of heart periods (or spectrum of intervals) is the spectrum of the interval tachogram, which is defined as the series of heartbeat intervals as a function of interval number (9,(16)(17)(18)22,23).In this manner, the intervals can be visualized as regularly spaced, but, as a consequence, the units of the frequency axis of these spectra are "cycles per beats" rather than "Hertz".In a different way, if the intervals should be considered evenly spaced with distances equal to the average interval length, the frequency unit will be "cycles per second" or "Hertz equivalent".This method has been extensively used (4-8) since it was first suggested by Sayers (9).
Additionally, a large class of spectra arises from a spectral estimation of the series obtained by a regular sampling of sectionally continuous functions derived from the cardiac event series.We labeled this class as spectrum of sampled sectionally continuous interval function.For the determination of these functions, numerous possibilities exist, and in the current study we used the following ones: a) The function values are maintained constant from the time of occurrence of the nth cardiac event, t[n], to the time of occurrence of the succeeding one, t[n + 1], and this constant value can be the instantaneous heart period (ƒ(t ).These sectionally continuous interval functions are known as instantaneous heart period signal and delayed heart period signal, respectively (21).b) The function values are determined through the convolution of the instantaneous heart period signal with a rectangular window of width equal to twice the sampling period of the posterior sampling procedure, divided by the window width.It is important to note that the convolution of a signal with a rectangular window in the time domain corresponds, in the frequency domain, to the multiplication of the Fourier transform of this signal with a lowpass filter with a transfer function equal to where T w is the window width (12).Consequently, when this type of preprocessing technique is used, the power spectrum estimate can be improved if multiplied by the correction factor [T w /H(ƒ)] 2 .Furthermore, the spectral estimate can be considered accurate only for 0<ƒ<ƒ s /4, where ƒ s is the sampling frequency.This method was suggested by Berger et al. (18), who used the instantaneous heart rate signal instead of the instantaneous heart period signal as described here.c) The function values are determined by means of some kind of interpolation of the interval function values.The interval function is defined as the series of heartbeat intervals as a function of time, in which impulses are formed only at the event occurrence times with an amplitude equal to the length of the preceding interval (23,24).In the current study, we employed the efficient Neville's algorithm for evaluation of polynomial interpolations (13).
Finally, another spectrum can be assessed in a straightforward manner from the nonequidistant cardiac event series in which the series of cardiac events was replaced with a train of impulse functions (Dirac's delta functions), where N is the number of cardiac events and t[n] (n ∈ N; 0 ≤n≤N -1) is the occurrence time of the nth event.This power spectrum, known as spectrum of counts (14,(16)(17)(18)25), is defined as the low-frequency part of the squared Fourier transform of the sequence of impulse functions p(t), and can be analytically calculated as As pointed out by Berger et al. (18), equation 5 embodies terms that compensate for the truncation effects that are brought about by the computation of the Fourier transform of a finite set of impulse functions.
With the exception of the spectrum of counts, a further set of spectra can be obtained in the same manner as those defined above if reciprocals of intervals (heart rates) are considered instead of intervals (heart periods).

Signal processing
In the processing of equidistantly sampled time series resulting from the preprocessing procedures, the spectra were computed by means of a windowed FFT algorithm (11,12,26), except for the spectrum of counts which was calculated using equation 5.For the sake of comparison with the previous works of De-Boer et al. (16,17) and Berger et al. (18), the amplitude spectrum (the square root of the power spectrum) was adopted.This method has been extensively used to accentuate the presence of harmonics and other artifacts, which is helpful in the evaluation of the performance of the spectra (16)(17)(18).

Performance evaluation
In an attempt to evaluate the various estimated spectrum performances mentioned in Table 1, their shapes were compared to that of the spectrum of the IPFM model modulating signal (true spectrum), responsible for the generation of the simulated heartbeat interval series.These comparisons can be made both by visual analysis (16)(17)(18), in which we look for the presence of harmonic peaks and other artifacts in the estimated spectra, or by contrasting merit indices, designed for evaluating the performance.As previously noted, it is desired that estimated spectra match the true spectrum as closely as possible and therefore the merit indices used in this study were empirically designed in order to quantify these mismatches.In this study, we proposed the following indices: a) the foremost, named leakage rate (S L ), is defined as a measure (in percentage) of total leakage spectral components with respect to the whole components.The leakage compo-maining spectrum.b) The numbers of leakage spectral components with amplitudes greater than 1% (n 1% ), 5% (n 5% ) and 10% (n 10% ) of the total spectral components.

Results
Figure 1A-C shows the interval tachograms of the first 250 (of 512) simulated values of time series #1, #2 and #3, respectively.These graphs illustrate the increased irregularity (complexity) in the output signal of an IPFM model when more than one modulatory element are lumped together to compound the input model signal.It is a highly laborious task to analyze the nature of the modulating signals only by simple observation of Figure 1C.
For the spectral analysis of the preprocessed signals, some commonly used time windows (Bartlett, Hanning, Hamming and Blackman) (26) were first tested and we utilized the Blackman window since this invariably produced the least leakage rate in amplitude spectra.
For the present analysis, the 15 types of amplitude spectra grouped in Table 1 were divided into three clusters, as follow: a) cluster 1 aggregated the set of spectra derived from the intervals (heart periods), which are spectra #1 to #7, b) cluster 2 aggregated the set of spectra derived from the reciprocals of intervals (heart rates), which are spectra #8 to #14, and c) cluster 3 consisted of the spectrum of counts (spectrum #15).
Figure 2A-C presents the computed leakage rate of the spectra estimated from the time series #1, #2 and #3, respectively, while Figure 2D-F exhibits the numbers of leakage spectral components (n 1% , n 5% and n 10% ).
As shown in Figure 2B, for the time series #2 and cluster 1, spectra #5 and #4 presented the lowest leakage rates (36.98% and 37.28%, respectively), whereas spectra #2, #3 and #1 presented the highest ones (56.13%, 49.09% and 48.83%, respectively).For cluster 2, spectra #12 and #13 presented the lowest leakage rates (18.00% and 18.64%,Both the analysis of the numbers of leakage spectral components n 1% , n 5% and n 10% (see Figure 2D-F) and the visual examination of the spectrum shapes corroborated the results described above.As evidenced by Figure 2A-F and the analysis reported above, the spectrum of sampled function of cubicinterpolated heart rate values (spectrum #12), the spectrum of sampled function of 5-degree-interpolated heart rate values (spectrum #13) and the spectrum of sampled instantaneous heart rate signals convoluted with a rectangular window (spectrum #14) appear to be the most appropriate for analysis of the HRV signal in the frequency domain.

Discussion
In the present study we compared several methods for the preprocessing of cardiac event series for the analysis of HRV.Our major objective was to extend the studies conducted by De-Boer et al. (16,17) and Berger et al. (18) in two aspects: a) these previous studies only investigated the spectrum of intervals (16)(17)(18), the spectrum of inverse intervals (17,18), the spectrum of counts (16)(17)(18) and the spectrum of sampled instantaneous heart rate signal convoluted with a rectangular window (18).In the current study we analyzed a more comprehensive group consisting of 15 types of spectra, as described in Table 1.b) In the above mentioned studies the spectra were compared only by means of visual contrasting, while in the current study, besides visual analysis, we performed a well-defined comparison based on merit indices, designed to quantify the mismatches between the estimated spectra and the true spectrum.
The spectrum of intervals -owing to its simplicity -and the spectrum of sampled instantaneous heart rate signal convoluted with a rectangular window -owing to its good performance reported in the study of Berger et al. (18) -are still the most frequently employed techniques for the study of HRV signals (1)(2)(3)(4)(5)(6)(7)(8)(9).The results obtained in the present study contraindicate the spectrum of intervals, reinforce the efficiency of the algorithm suggested by Berger et al. (18) and indicate another simple approach which is also efficient for the study of HRV, as discussed below.
We noticed that if the classical methods of spectral estimation are chosen for the analysis of HRV, then the multiplication of the time series with a Blackman window before the signal processing procedures will result in a spectrum with the least leakage rate.In addition, the matched comparison between the spectra of clusters 1 and 2 and their own merit indices indicate a marked decrease in the leakage rate when the heart rates were used instead of the heart period values in the derivation of signals representative of heart rhythm.This result agrees with data reported by Mohn (22), which suggested that the spectrum of inverse intervals is superior to the spectrum of intervals.However, the reasons for these discrepancies are not clear.The influence of the choice of heart rate or heart period values in the study of HRV has not been sufficiently considered.Only a few recent reports (15,27,28) have speculated this questionable point.
We observed that all spectra present artifactual peaks at "sum" and "difference" frequencies of the IPFM model's modulating frequencies, sometimes misleading the identification of the modulatory peaks.It was observed that the spectrum of sampled function of cubic-interpolated heart rate values is the most suitable for the study of HRV signals, inasmuch as it more closely matches the true spectrum at the input of the IPFM model when the model's modulating signal is composed of one or two sinusoidal noisefree signals.Consequently, we conclude that the preprocessing technique based on a cubic polynomial interpolation of inverse interval function values is an adequate approach for the derivation of an equidistantly sampled time series, representative of the heart rhythm, from the irregular cardiac event series.Furthermore, we observed that the spectrum of sampled function of 5-degreeinterpolated heart rate values and the spectrum of sampled instantaneous heart rate signal convoluted with a rectangular window also closely match the true spectrum, particularly for the case of three sinusoidal noise-free modulating signals.However, from a computational point of view, the assessment of regularly sampled data by means of cubic polynomial interpolation using Neville's algorithm is the most efficient and effortless procedure.Conversely, the spectrum of intervals, which is among the techniques most frequently used, is the least effective estimate in the sense that it diverges considerably from the true spectrum.
The results reported here indicate that an accurate and successful spectral estimation is a sine qua non of a carefully selected preprocessing technique.The use of heart rate values instead of the heart period values in the derivation of signals representative of heart rhythm results in more correct spectra.Furthermore, our data derived from mathematical modeling and noise-free simulations reinforce the efficiency of the widespread adopted preprocessing technique based on the convolution of inverse interval function values with a rectangular window, and recommend the technique based on a cubic polynomial interpolation of inverse interval function values and succeeding spectral analysis as another efficient, fast and more effortless computer implementation set of procedures for the analysis of HRV signals.

Figure 1 -
Figure 1 -A, B, C, Interval tachograms of the first 250 (of 512) values of three noise-free simulated time series produced by means of an IPFM model in which the threshold value was 1.05 s and the input signals consisted of a summation of modulating sinusoids with frequencies equal to 0.16 Hz (A), 0.12 and 0.16 Hz (B), and 0.07, 0.16 and 0.28 Hz (C).D, E, F, Amplitude spectra estimated from the three simulated time series preprocessed by means of a polynomial cubic interpolation (spectrum of sampled function of cubic-interpolated heart period values).

Figure 2 -
Figure 2 -Values of merit indices calculated to assess the performance of the spectra estimated from three noise-free simulated time series.These series were produced by means of an IPFM model in which the threshold value was 1.05 s and the input signals consisted of a summation of modulating sinusoids with frequencies equal to 0.16 Hz (A, D), 0.12 and 0.16 Hz (B, E), and 0.07, 0.16 and 0.28 Hz (C, F).The 15 types of spectra shown here are discriminated in Table1.A, B, C, Leakage rate.D, E, F, Numbers of leakage spectral components with amplitudes greater than 1% (n 1% ), 5% (n 5% ) and 10% (n 10% ) of the total spectral components.

Table 1 -
Spectrum types of heart rhythm variability derived from the various preprocessing techniques used to transform the cardiac event series into a regularly sampled time series.