Automatic pain quantification using autonomic parameters

Walter, Steffen; Gruss, Sascha; Limbrecht-Ecklundt, Kerstin; Traue, Harald C.; Werner, Philipp; Al-Hamadi, Ayoub; Diniz, Nicolai; Silva, Gustavo Moreira da; Andrade, Adriano O.

doi:10.3922/j.psns.2014.041

Abstract

The objective measurement of subjective, multi-dimensionally experienced pain is a problem for which there has not been an adequate solution. Although verbal methods (e.g., pain scales and questionnaires) are commonly used to measure clinical pain, they tend to lack objectivity, reliability, or validity when applied to mentally impaired individuals. Biopotential and behavioral parameters may represent a solution. Such coding systems already exist, but they are either very costly or time-consuming or have not been sufficiently evaluated. In this context, we collected a database of biopotentials to advance an automated pain recognition system, determine its theoretical testing quality, and optimize its performance. For this purpose, participants were subjected to painful heat stimuli under controlled conditions. One hundred thirty-five features were extracted from the mathematical groupings of amplitude, frequency, stationarity, entropy, linearity, and variability. The following features were chosen as the most selective: (1) electromyography corrugator peak to peak, (2) corrugator shannon entropy, and (3) heart rate variability slope RR. Individual-specific calibration allows the adjustment of feature patterns, resulting in significantly more accurate pain detection rates. The objective measurement of pain in patients will provide valuable information for the clinical team, which may aid the objective assessment of treatment (e.g., effectiveness of drugs for pain reduction, information on surgical indication, and quality of care provided to patients).

pain quantification heat; biopotentials; feature extraction and selection; calibration; support vector machines

PERCEPTION-ACTION

Steffen Walter^I; Sascha Gruss^I; Kerstin Limbrecht-Ecklundt^I; Harald C. Traue^I; Philipp Werner^II; Ayoub Al-Hamadi^II; Nicolai Diniz^III; Gustavo Moreira da Silva^III; Adriano O. Andrade^III

^IUniversity of Ulm, Ulm, Germany

^IIOtto-von-Guericke-University Magdeburg, Magdeburg, Germany

^IIIUniversidade Federal de Uberlândia, Uberlândia, MG, Brazil

Correspondence regarding this article should be directed to: Steffen Walter. Email: steffen.walter@uni-ulm.de

Introduction

Pain is a very personal sensation that is difficult to interpret without any communication from the patient. Consequently, a method for the objective measurement of pain would be beneficial particularly in cases in which the patient is unable to describe the pain that he or she is experiencing, such as in neonates (Brahnam, Chuang, Shih, & Slack, 2006), somnolent patients, and patients who suffer from dementia (Basler et al., 2001; Zwakhalen, Hamers, Abu-Saad, & Berger, 2006; Herr, Bjoro & Decker, 2006). Under certain circumstances, little correlation exists between subjectively experienced pain and tissue lesions or other pathological changes. The pain may even be completely unrelated. Therefore, somatic pathology does not allow any conclusions to be drawn about subjectively experienced pain (Turk & Okifuji, 1999; Nilges & Traue, 2007). Children, older individuals, and patients who suffer from dementia have different pain thresholds and varying tolerance to pain relative to healthy adults (Lautenbacher, 2004; Soetanto, Chung, & Wong, 2004).

One central problem is the fact that no simple method can be used to measure pain directly. The examining physician must rely on the patient's qualitative description of the intensity, location, and nature of the pain. Quantifying pain is possible with the aid of the Visual Analog Scale or Numeric Rating Scale. However, these methods only work when the patient is sufficiently alert and cooperative, which is not always the case in the medical field (e.g., post-surgery phases). Overall, these methods are either considered inadequate or still in development (Lautenbacher, 2004). If conditions do not allow for a sufficiently valid measurement of the pain, the possible consequences include under-perfusion of the operating field or the development of chronic pain. For example, 30-70% of patients report moderate to severe pain after surgery (Wiebalck, Vandermeulen, Aken, & Vandermeersch, 1995). The measurement of biopotentials via the autonomic nervous system may be a solution that would permit an objective, reliable, and valid diagnosis of pain.

In the area of pure research, many studies have been performed to determine correlations between the autonomic nervous system (primarily electrocardiogram and galvanic skin conductance) and pain stimulation (Colloca, Benedetti, & Pollo, 2006; Cortelli & Pierangeli, 2003; Jeanne, Logier, De Jonckheere, & Tavernier, 2009; Korhonen & Yli-Hankala, 2009; Ledowski, Ang, Schmarbeck, & Rhodes, 2009; Loggia, Juneau, & Bushnell, 2011; Schlereth & Birklein, 2008). However, these studies examined the correlation of pain with only a single biopotential parameter (Treister, Kliger, Zuckerman, Goor Aryeh, & Eisenberg, 2012) and were not oriented toward applied research. To obtain an objective, reliable, and valid diagnosis of pain, we need a combination of multi-parameter features. To the best of our knowledge, the study by Treister et al. (2012) was the first that took a multi-parameter biopotential approach. Tonic heat was applied to elicit pain for a duration of 1 min, with intensities of no pain, low pain, medium pain, and high pain. The pain intensities were calibrated individually. The following biopotential measurements were used: heart rate, heart rate variability-high frequency, skin conductance, number of skin conduction fluctuations, photoplethysmography, and a linear combination parameter. All of the features differed significantly between "no pain" and the other categories, but only the linear combination of features significantly differentiated between all pain categories (p < .001 to .02). Additionally, a clinical study by the same research group (Ben-Israel, Kliger, Zuckerman, Katz, & Edry, 2013) reported similar results to those obtained with a linear regression and non-linear Random Forest regression based on the same six features used by Treister et al. (2012). In contrast to Treister et al. (2012), the authors of the present study represent the scientific viewpoint that extracting only six features is insufficient for objective, reliable, and valid pain recognition. A clear statement about which features are the most innovative can only be made based on the simultaneous testing of a large collection of features. Furthermore, innovative applied pain recognition requires the use of modern machine learning classification methods (e.g., Neural Networks and Support Vector Machines [SVMs]).

Hence, the goal of the present study was to develop an extensive multimodal dataset (i.e., The BioVid Heat Pain Database; Walter et al., 2013a) in which varying levels of pain would be induced. This paper focuses only on the biopotentials of the multimodal dataset and not on the behavioral data. We plan to release the database for research purposes. We also used the pain heating model (e.g., Treister et al., 2012) because this model is computer-based and the best controlled pain model that can be found in the existing literature (Lautenbacher, 2004). The aim of the present study (see Figure 1A) was to select the feature patterns that contribute to the highest recognition rate for pain quantification.

The paper features several unique attributes: (1) highly controlled pain stimulation, (2) multimodal detection (i.e., simultaneous data collection on electromyogram [EMG] including zygomaticus, corrugator, and trapezius, skin conductance level [SCL], and electrocardiogram [ECG]), (3) extraction of features from the statistical groups of amplitude, frequency, stationarity, entropy, linearity, and variability (in this regard, a maximum number [Σ = 135] of features should be extracted), and (4) the selection of general statistical and individual automatic feature patterns that contribute to the highest recognition rate for pain quantification.

The overall hypothesis of the present study was that the distinction between pain quantification (baseline [B] vs. pain thresholds T₁ vs. T₂ vs. T₃ vs. T₄) would be significant with regard to signal features (i.e., amplitude, frequency, stationarity, entropy, linearity, and variability) of the EMG (zygomaticus, corrugator, and trapezius), ECG, and SCL. An explorative hypothesis was that there are reliable (> 80%) individual automatic (SVM) pain quantification rates (baseline [B] vs. pain thresholds T₁ vs. T₂ vs. T₃ vs. T₄) with regard to signal features (i.e., amplitude, frequency, stationarity, entropy, linearity, and variability) of the EMG (zygomaticus, corrugator, and trapezius), ECG, and SCL.

Methodology

Participants

A total of 90 subjects participated in the experiment, recruited from the following age groups: (1) 18-35 years (n = 30; 15 men, 15 women), (2) 36-50 years (n = 30; 15 men, 15 women), and (3) 51-65 years (n = 30; 15 men, 15 women). A total of 86 subjects were included in the final analysis; four subjects were excluded because of limited data quality with regard to the EMG. Recruitment was performed through notices posted at the university for the 18- to 35-year-old age group and through the press for the 36- to 65-year-old age group. Only healthy subjects were recruited. Pre-existing neurological conditions, chronic pain, cardiovascular disease, regular use of pain medication, and use of pain medication immediately before the experiment were applied as exclusion criteria. The subjects received an expense allowance. The study was conducted in accordance with the ethical guidelines set out in the WMA Declaration of Helsinki (ethical committee approval was granted: 196/10-UBB/bal).

Measured parameters

Biopotentials: A Nexus-32 amplifier (http://www.mindmedia.nl; accessed May 23, 2014) was used to record biopotential data (see Figure 1C) during the experiment. Biopotential and event data were recorded using Biotrace software. The following parameters were included in the classification (Walter et al., 2013a):

EMG: Electrical muscle activity is also an indicator of general psychophysiological stimulation in which increased muscle tone is associated with increasing activity of the sympathetic nervous system. A decrease in somatomotor activity reflects predominantly parasympathetic stimulation. We used two-channel EMGs for the zygomaticus, corrugator, and trapezius muscles. Facial muscle regions such as the corrugator, which draws the eyebrow downward and medialward to form a frown, and the zygomaticus major, which elevates the corners of the mouth superiorly and posteriorly, are expected to be active during pain stimulation. The activity of the trapezius is an indication of a high stress level, which is also to be expected when pain is being experienced.

SCL: To measure the skin conductance level, two electrodes connected to the sensor were positioned on the index and ring fingers. Because the sweat glands are innervated exclusively sympathetically (i.e., without the influence of the parasympathetic nervous system), electrodermal activity is considered a good indicator of the "inner" tension of a person. This phenomenon can be reproduced particularly impressively by the observation of a rapid increase in skin conductance within 1-3 s due to a simple stress stimulus (e.g., deep breathing, emotional excitement, or mental activity).

ECG: We measured the average action potential of the heart on the skin using two electrodes, one on the upper right and one on the lower left of the body. Common features of the ECG signal are heart rate, interbeat interval, and heart rate variability (HRV). Heart rate variability refers to the oscillation of the interval between consecutive heartbeats. It has been used as an indication of mental effort and stress in adults (Kim & Andre, 2008).

EEG: We measured 21 EEG channels including two EOG (horizontal, vertical) channels. The EEG analysis is not presented in this paper.

Pain stimulation method

For pain elicitation we used a thermode (PATHWAY, http://www.medoc-web.com; accessed April 23, 2014) applied to the right arm (see Figure 1D). Throughout the entire experiment, the participants sat in a chair with their arms resting on the desk in front of them. With this kind of technology, eliciting quantified pain under highly controlled conditions is possible, without causing skin burns (Lautenbacher, 2004). A temperature of 50.5°C must not be exceeded.

Calibration of thresholds: At the beginning of the experiment, we tested pain (T₁) and tolerance thresholds (T₄) for every participant. During this process, the subjects sat in a chair, and a thermode was attached to their right forearm (see Figure 1D). In the left hand, they held a computer mouse. To measure T₁ and T₄, a temperature rise (10°C/s) was implemented, starting at a value of 32°C. When the threshold of T₁ or T₄ was reached, the subject clicked the right mouse button. Four measurements each for T₁ and T₄ were made for each subject. From these values, a specific average was calculated for T₁ and T₄ for each individual. Two other intermediate individual pain thresholds (T₂ and T₃) were determined mathematically:
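
The group means reported in the Results (46.29, 47.44, 48.59, and 49.74°C) are equally spaced, which is consistent with T₂ and T₃ dividing the range between T₁ and T₄ into three equal steps. The following Python sketch assumes exactly that linear spacing; the function name `intermediate_thresholds` is illustrative and is not taken from the study software.

```python
def intermediate_thresholds(t1, t4):
    """Return (T2, T3), assuming equal spacing between T1 and T4.

    Assumption: the intermediate thresholds split the range between the
    pain threshold (T1) and the tolerance threshold (T4) into three
    equal steps.
    """
    step = (t4 - t1) / 3.0
    return t1 + step, t1 + 2.0 * step


# Example using the group means for T1 and T4 reported in the Results.
t2, t3 = intermediate_thresholds(46.29, 49.74)
print(round(t2, 2), round(t3, 2))
```

Under this assumption, the example reproduces the reported group means for T₂ and T₃ up to rounding.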

Instruction for pain threshold: "Please press the stop button immediately when you experience a burning, stinging, piercing, or pulling sensation in addition to feeling heat."

Instruction for tolerance threshold: "Please press the stop button immediately when you can no longer tolerate the heat, taking into account the burning, stinging, piercing, or pulling sensation."

Pain stimulation: After the calibration procedure, we programmed the thermode software with T₁, T₂, T₃, and T₄ separately for each individual for the stimulation experiment. For approximately 25 min, we randomly stimulated the participants with the four individual-specific thresholds of pain. The baseline (no pain) (B) was 32°C. Every pain level (T₁ vs. T₂ vs. T₃ vs. T₄) was applied 20 times, resulting in a total of 80 stimulations. Figure 1B illustrates a temperature plot of each stimulus and the subsequent pause. The maximum temperature for each pain threshold was maintained for 4 s. The pauses between stimuli were randomized between 8 and 12 s, and the serial heat stimulation was also randomized. The time until the thresholds (T₁ vs. T₂ vs. T₃ vs. T₄) were attained was proportionally equal. The subjects had the option to terminate the experiment immediately using an emergency stop button. After the experiment, the subject was asked to apply a cold pack to the site of the heat stimulation for at least 5 min.
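
A randomized stimulation sequence of this kind could be generated as in the following Python sketch. It only mirrors the figures given above (four levels, 20 repetitions each, pauses between 8 and 12 s); the function name, the uniform distribution of the pauses, and the seed handling are assumptions and do not describe the thermode software.

```python
import random


def build_schedule(levels=("T1", "T2", "T3", "T4"), repeats=20,
                   pause_range=(8.0, 12.0), seed=None):
    """Return a list of (level, pause_seconds) tuples in random order.

    Assumption: pauses are drawn uniformly from pause_range; the text
    only states that they were randomized between 8 and 12 s.
    """
    rng = random.Random(seed)
    order = [level for level in levels for _ in range(repeats)]
    rng.shuffle(order)  # randomize the serial order of the stimuli
    return [(level, rng.uniform(*pause_range)) for level in order]


schedule = build_schedule(seed=0)
print(len(schedule))  # 4 levels x 20 repetitions
```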

Preprocessing

We performed the following biopotential preprocessing:

(1) We visualized the biopotentials to check the intensity of the noise and activity with regard to pain stimulation.

(2) We applied a Butterworth filter to the EMG (20-250 Hz) and ECG (0.1-250 Hz) signals.

(3) We also applied an additional filter using the Empirical Mode Decomposition technique developed by Andrade, Kyberd, & Nasuto (2008).

(4) We quantified the pain level caused by the heat applied using four pain thresholds during the "pain window" (5.5 s) and with regard to the baseline during the "non-pain window" (see Figure 2).

(5) We detected bursts of activity via the EMG using the Hilbert Spectrum (Andrade, Nasuto, & Kyberd, 2007).

Feature extraction

We extracted features from the mathematical groups of (1) amplitude, (2) frequency, (3) stationarity, (4) entropy, (5) linearity, and (6) variability. To this end, the maximum information (Σ = 135) of the features was extracted systematically (Nakano, Ota, Ukai, Nakamura, & Fujita, 2002; Andrade, 2005; Cao & Slobounov, 2011; Chen, Zhuang, & Wang, 2009; Hua-Mei, Varshney, & Arora, 2003). Table 2 (see Appendix) provides a detailed overview of all of the feature information. Figure 3 contains a graph as an example for each feature group, one extreme example each for high and low manifestations.
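
To make the feature groups more concrete, the following Python sketch computes one illustrative feature from each of three groups: peak-to-peak amplitude, a histogram-based Shannon entropy, and the number of zero crossings (a simple frequency feature). These are common textbook definitions and are assumptions here; the exact definitions used in the study are those listed in Table 2, and the bin count of 16 is arbitrary.

```python
import math


def peak_to_peak(signal):
    """Amplitude group: difference between the largest and smallest sample."""
    return max(signal) - min(signal)


def shannon_entropy(signal, bins=16):
    """Entropy group: Shannon entropy (in bits) of the amplitude histogram."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return 0.0  # a constant signal carries no amplitude information
    width = (hi - lo) / bins
    counts = [0] * bins
    for value in signal:
        index = min(int((value - lo) / width), bins - 1)
        counts[index] += 1
    n = len(signal)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def zero_crossings(signal):
    """Frequency group: number of sign changes between adjacent samples."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0)


# Synthetic example window (not study data): a 5-cycle sine wave.
window = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
features = {
    "p2p": peak_to_peak(window),
    "shannon": shannon_entropy(window),
    "zc": zero_crossings(window),
}
print(features)
```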

Analysis

Pain stimulation thresholds

We used the Mann-Whitney U test to compare genders (female vs. male) and age groups (18-35 years vs. 36-50 years vs. 51-65 years) with regard to pain and tolerance thresholds. These analyses were performed to examine the extent to which the gender and age group comparisons were consistent with the literature (Lautenbacher, 2004; Zimmer, 2004; Basler, 2004).

Biopotential response via statistical results

All biopotentials were normalized separately for each individual signal feature. Generalized Linear Models (GLMs) were used to test the quantitative pain intensity with respect to all of the features (Nelder & Wedderburn, 1972). This model is based on the Wald χ² test (Wald, 1943) and the related post hoc test. For this purpose, a Wald χ² test for B, T₁, T₂, T₃, and T₄ (see Table 1) and five subsequent post hoc tests for B vs. T₁, T₁ vs. T₂, T₂ vs. T₃, T₃ vs. T₄, and B vs. T₄ (for details, see Table 3 in the Appendix) were performed across all thresholds, including the baseline. The maximum significant separation was determined for each feature (as in Treister et al., 2012). Because a total of 135 features were tested, Bonferroni correction was necessary so that only values of p < .0001 were considered significant.
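
As a worked check of the correction, the short Python sketch below divides a family-wise α by the number of tested features. The family-wise level of .05 is an assumption (it is not stated in the text); under that assumption the per-test threshold is roughly .00037, so the criterion of p < .0001 used here is stricter than a plain Bonferroni correction over 135 features would require.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Assumption: family-wise alpha of .05; 135 features as stated in the text.
threshold = bonferroni_threshold(0.05, 135)
print(f"{threshold:.5f}")  # 0.00037
print(0.0001 < threshold)  # True: the criterion used is more conservative
```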

Machine learning and classification

Machine learning systems are systems that learn from known data and try to recognize characteristic patterns in such data. After a learning phase, they return a model that can be used to map (i.e., classify) unknown input data into a category (Mitchell, 1997). For these classification tasks, there are several machine learners (classifiers), all of which work using different decision algorithms such as Neural Networks, Decision Trees, K-Nearest Neighbor, and SVMs.

For the classification of different pain thresholds, we chose SVMs because these have proven to be highly effective in other studies (Kapoor, Burleson, & Picard, 2007) and are capable of maintaining sufficient flexibility with regard to their main parameter optimization (Hsu, Chang, & Lin, 2003).

The goal of an SVM is to develop a predictive model from given training sets x_i and their associated class labels y_i that can subsequently be applied to an unlabeled dataset to assign this set to a particular class. Thus, the SVM (Boser, Guyon, & Vapnik, 1992) searches for an optimal solution to the following problem:

Minimize with respect to w, b, and ξ:

$$\frac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i$$

so that the following constraint applies:

$$y_i \left( w^T \phi(x_i) + b \right) \geq 1 - \xi_i, \qquad \xi_i \geq 0$$

Here, l is the number of training sets, ξ_i are slack variables, and C > 0 is the penalty parameter of the error term. By means of a kernel function, K(x_i, x_j) = ϕ(x_i)^Tϕ(x_j) (in our case the radial basis function [RBF] kernel), the training sets are transformed into a higher dimensional space in which the SVM finds an optimal hyperplane with a maximum margin that separates the classes. The hyperplane then serves as a decision function for unlabeled data with unknown class allocations. For further details on the SVM, the reader may refer to Schoelkopf, Smola, Williamson, and Bartlett (2000).
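
For illustration, the RBF kernel is commonly written as K(x_i, x_j) = exp(−γ‖x_i − x_j‖²) with γ > 0. The Python sketch below evaluates this expression for two feature vectors; the value of γ is a free hyperparameter that is not reported in this section, so the value used in the example is purely hypothetical.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


# Hypothetical three-feature vectors and a hypothetical gamma of 0.5.
same = rbf_kernel([0.2, 1.0, -0.3], [0.2, 1.0, -0.3], gamma=0.5)
different = rbf_kernel([0.0, 0.0, 0.0], [1.0, 1.0, 0.0], gamma=0.5)
print(same, different)
```

Identical vectors yield a kernel value of 1, and the value decays toward 0 as the vectors move apart, which is why γ controls how local the resulting decision function is.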

Feature selection

Automatic pattern selection methods are used to further improve recognition rates. Feature selection is a "method for selecting a subset of features providing optimal classification accuracy of the classification model" (Kolodyazhniy, Kreibig, Gross, Roth, & Wilhelm, 2011, p. 909). This is accomplished by means of a variety of feature selection (pattern configuration) methods including sequential backward search, sequential forward search, sequential floating forward search, Fisher projection, and hybrid methods, in combination with the classification (e.g., SVMs, Neural Networks, and K-Nearest Neighbor). The present article was limited to forward selection and backward selection.

Forward selection

"The forward selection algorithm starts with an empty set of features and adds in each round each unused feature of the given feature pool. For each added feature, the classification accuracy is estimated via cross-validation. Only the feature giving the highest increase of accuracy is added to the set. Then a new round is started with the modified set. The algorithm stops as soon as there is no increase anymore" (Akthar & Hahne, 2012).

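The forward selection procedure quoted above can be sketched in Python as follows. The `score` callable stands in for the cross-validated classification accuracy of a feature subset; the toy scoring function and the feature names "a", "b", and "c" are hypothetical and serve only to make the example runnable.

```python
def forward_selection(features, score):
    """Greedy forward selection.

    Starts from an empty set, adds the single feature that raises the
    score most in each round, and stops when no addition increases it.
    """
    selected = []
    remaining = list(features)
    best = float("-inf")
    while remaining:
        trials = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials, key=lambda t: t[0])
        if top_score <= best:
            break  # no further increase in estimated accuracy
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Toy stand-in for cross-validated accuracy (hypothetical numbers):
# "a" and "b" are informative, every extra feature costs a small penalty.
gain = {"a": 0.3, "b": 0.2}


def toy_score(subset):
    return 0.5 + sum(gain.get(f, 0.0) for f in subset) - 0.01 * len(subset)


selected, best = forward_selection(["a", "b", "c"], toy_score)
print(selected)  # ['a', 'b']
```
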
Backward selection

"The backward selection algorithm starts with the full set of features and removes in each round each remaining feature of the given feature pool. For each removed feature, the classification accuracy is estimated via cross-validation. Only the feature giving the least decrease of accuracy is finally removed from the set. Then a new round is started with the modified set. The algorithm stops as soon as there is no increase anymore" (Akthar & Hahne, 2012).

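A corresponding Python sketch of backward selection is given below. The quoted stopping rule can be read in more than one way; this sketch assumes that a feature is removed only while doing so increases the estimated accuracy. As before, `score` stands in for cross-validated accuracy, and the toy scoring function and feature names are hypothetical.

```python
def backward_selection(features, score):
    """Greedy backward selection.

    Starts from the full set and, in each round, removes the feature
    whose removal gives the best score. Assumption: the search stops
    as soon as no removal increases the score.
    """
    selected = list(features)
    best = score(selected)
    while len(selected) > 1:
        trials = [(score([f for f in selected if f != r]), r)
                  for r in selected]
        top_score, to_remove = max(trials, key=lambda t: t[0])
        if top_score <= best:
            break  # removing any feature no longer helps
        selected.remove(to_remove)
        best = top_score
    return selected, best


# Toy stand-in for cross-validated accuracy (hypothetical numbers).
gain = {"a": 0.3, "b": 0.2}


def toy_score(subset):
    return 0.5 + sum(gain.get(f, 0.0) for f in subset) - 0.01 * len(subset)


kept, kept_score = backward_selection(["a", "b", "c"], toy_score)
print(kept)  # ['a', 'b']
```
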
Cross-validation

For every classification, cross-validation is necessary. Cross-validation is a "common approach for estimating the classification accuracy with unknown data. In this approach, the entire dataset is divided into N nonoverlapping parts. Training and validation are performed repeatedly N times. At iteration k of the cross-validation, all parts of the data, with the exception of the kth part, are used for training, and the kth part of the data is used for validation" (Kolodyazhniy et al., 2011, p. 909). The N results from the N iterations are finally averaged to produce a single estimation. In the present paper, the performance of the individual classification procedure (Figure 4) was measured with 10-fold cross-validation, meaning that the data were partitioned into 10 parts of equal size. The performance of the general classification procedure (Figure 5) was measured with the "leave-one-subject-out" method (i.e., "in each iteration, all measurements corresponding to a particular participant are removed from the training set and used for validation"; Kolodyazhniy et al., 2011, p. 911).
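
The two partitioning schemes can be sketched in Python as index generators. The interleaved fold assignment without shuffling, the sample count of 40, and the subject labels are assumptions chosen for the example; they are not details of the study pipeline.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, validation) index lists for k non-overlapping folds."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for validation in folds:
        held_out = set(validation)
        train = [i for i in range(n_samples) if i not in held_out]
        yield train, validation


def leave_one_subject_out(subject_ids):
    """Yield (subject, train, validation), holding out one subject at a time."""
    for subject in sorted(set(subject_ids)):
        validation = [i for i, s in enumerate(subject_ids) if s == subject]
        train = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train, validation


# Hypothetical example: 40 samples from one person, 10-fold split.
folds = list(k_fold_indices(40, k=10))
print(len(folds), len(folds[0][1]))  # 10 folds with 4 validation samples each

# Hypothetical example: five samples belonging to three subjects.
splits = list(leave_one_subject_out(["s1", "s1", "s2", "s2", "s3"]))
print([subject for subject, _, _ in splits])  # ['s1', 's2', 's3']
```

In the leave-one-subject-out scheme, no sample of the validated person ever appears in the training set, which is why it estimates performance on unseen individuals rather than on unseen samples of known individuals.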

Results

The following section contains the results of the (1) temperatures of the thresholds, including the relationships with gender and age, (2) biopotential response via the statistical results, and (3) biopotential response via the machine learning results.

Pain stimulation thresholds

The average temperatures for the four thresholds were T₁ (M = 46.29°C, SD = 2.57°C), T₂ (M = 47.44°C, SD = 2.14°C), T₃ (M = 48.59°C, SD = 1.82°C), and T₄ (M = 49.74°C, SD = 1.73°C).

Group comparison for gender: No significant difference was found between female (M = 45.91°C, SD = 2.59°C) and male (M = 46.70°C, SD = 2.51°C) subjects for T₁, but a significant difference was observed between female (M = 49.28°C, SD = 2.59°C) and male (M = 50.22°C, SD = .63°C) subjects for T₄ (p < .001).

Group comparison for age: A significant difference was found between age groups (18-35 years: M = 45.73°C, SD = 2.08°C; 36-50 years: M = 46.26°C, SD = 2.79°C; 51-65 years: M = 46.88°C, SD = 2.68°C) for T₁ (p < .05), but the post hoc tests indicated a significant difference only between the 18-35 and 51-65 groups (p < .05). No significant difference was found between age groups (18-35 years: M = 49.68°C, SD = 1.14°C; 36-50 years: M = 49.81°C, SD = 1.94°C; 51-65 years: M = 49.69°C, SD = 1.99°C) for T₄.

Biopotential response via statistical results

Table 1 summarizes the significant results (p < .0001, with Bonferroni correction) of the Wald χ² test for each of the 135 features. One hundred five features could be considered significant with regard to pain differentiation. Some features separated in ascending order (B corresponds to the minimum, T₄ to the maximum), whereas others separated in descending order (B corresponds to the maximum, T₄ to the minimum). With regard to the frequency of significant separation (see Table 3 in the Appendix, last column [total]), the features of (1) Corrugator_Amplitude_p2p, (2) Corrugator_Entropy_shannon, and (3) HRV_slopeRR were the most selective (five significant pain differentiations) in distinguishing between pain thresholds.

With regard to four significant pain differentiations, all 10 amplitude features of the zygomaticus (Zygomaticus_Variance: var, std, range, intrage), and all 10 amplitude features of the corrugator except for p2p (Corrugator_Variance: var, std, range, intrage; Corrugator_Frequency_zc; Corrugator_Stationarity_sd; SCL_Amplitude: mavfdn, mavsdn; SCL_Stationarity: me, sd; SCL_Entropy: aprox, fuzzy, sample; SCL_Variability_range) were selected. In total, we found 41 top features.

No calculations were possible with regard to the features of SCL_Frequency (zc, fmode, bw, cf).

Biopotential response via machine learning results

Figure 4 presents the mean of all individual (10-fold cross-validation) automatic classification results via automatic feature selection for B vs. T₁, T₁ vs. T₂, T₂ vs. T₃, T₃ vs. T₄, and B vs. T₄. The detection rates were 88.79-94.73% for forward selection and 59.44-81.75% for backward selection.

The difference between the forward selection and backward selection recognition rates was highly significant (Wilcoxon signed-rank test, p < .0001) for every threshold contrast: B vs. T₁, T₁ vs. T₂, T₂ vs. T₃, T₃ vs. T₄, and B vs. T₄.

Because of the extremely high level of individual specificity of the patterns, a frequency diagram of the forward and backward selection is not provided, as it would clearly exceed the scope of this article.

Figure 5 presents the general automatic classification (leave-one-subject-out method) results of 52.41-74.59% for the general statistical features (1) Corrugator_Amplitude_p2p, (2) Corrugator_Entropy_shannon, and (3) HRV_slopeRR. For the top 41 features, we found classification results of 53.49-77.05%. A statistical comparison between the top three vs. top 41 analyses was not possible because there were no individual means.

Discussion and conclusion

We have presented a newly collected multimodal dataset (BioVid Heat Pain Database; Walter et al., 2013a) to facilitate advances in the reliable recognition of pain intensity. The higher-level pragmatic orientation of this research ultimately allows the objective, reliable, and valid recognition of pain in infants, people who suffer from dementia, and people with limited verbal communication skills. The authors of the present article consider the approach of Treister et al. (2012) and Ben-Israel et al. (2013) as straightforward but insufficient in terms of objective, reliable, and valid clinical pain recognition. Therefore, we extracted features of the highly complex statistical groups amplitude, frequency, stationarity, entropy, linearity, and variability and selected the feature patterns (general statistical and individual automatic) that contributed to the highest recognition rate for pain quantification.

Discussion of results

Pain stimulation thresholds: The thresholds T₁, T₂, T₃, and T₄, including the effects of age and gender, are consistent with the results reported in the literature (Lautenbacher, 2004; Zimmer, 2004; Basler, 2004).

Biopotential response via statistical results: A very low p level (p < .0001) was used for the Bonferroni correction and as the selection criterion for the most selective features. With regard to our statistical procedure, the majority of features are generally suitable for ensuring the quantification of pain. Similar to Treister et al. (2012), we used the frequency of significant separation between pain thresholds as a selection criterion. The features Corrugator_Amplitude_p2p, Corrugator_Entropy_shannon, and HRV_slopeRR can be regarded as the most selective, in that they significantly distinguished among five pain differentiations. Furthermore, the features that resulted from the mathematical groupings of amplitude and variability in conjunction with zygomaticus and corrugator tended to be suitable. In the SCL, selectivity with respect to pain quantification and mathematical grouping was more complex and less clear. The features in the areas of linearity, stationarity, variability, and frequency can only be regarded as satisfactory. These features can be assumed to have greater significance with regard to the duration or nature of pain (e.g., stabbing, pulling, throbbing, sharp, tearing, etc.).

Biopotential response via machine learning results: Using automatic feature selection (forward selection and backward selection) with SVMs, we tested the extent to which individual-specific automatic feature selection is beneficial. The pattern configurations are evidently extremely individual-specific, accompanied by very high recognition rates, especially in forward selection (> 88%). Precisely distinguishing between all four thresholds is possible. The highly individual-specific pattern configuration is consistent with intense individual-specific stress regulation according to fundamental research (e.g., Stemmler & Wacker, 2010).

Although we are aware of the advantages and disadvantages of our chosen feature selection algorithms, we presently have no adequate explanation why forward selection outperforms backward selection. A frequency diagram of typical individual-specific patterns was not productive because the patterns significantly differed from each other.

The recognition rates of the general classification with only three features (Corrugator_Amplitude_p2p, Corrugator_Entropy_shannon, and HRV_slopeRR) were clearly lower than the individual rates. Pain tolerance and baseline could be distinguished with rates of 74.59% (top 3) and 77.05% (top 41), meaning that we are well above chance level, which is always about 50% in a two-class problem such as the one used here. With regard to our calculated automatic classification rates, there are currently no comparable studies in the area of automatic pain recognition. However, our results are in line with comparisons of high vs. low arousal in the Affective Computing research area (Kim & Andre, 2008; Walter et al., 2011; Walter, Kim, Hrabal, Crawcour, Kessler, & Traue, 2013b). The comparison of the top 3 vs. top 41 features shows that using more than the top three features does not improve the results to a relevant degree.

Comparison: statistics and machine analysis summary

We are unaware of any studies in which conservative statistical methods were compared with modern automatic classification algorithms with regard to empirical pain induction, indicating a lack of "state-of-the-art" methods. We sought to make this comparison and will pursue it further. The EMG features appeared to make a significant contribution to the quantification of pain.

In terms of the statistical results, a general feature pattern was detected, but the individual-specific classification rates showed that detection rates can be significantly improved through individual-specific calibration.

For pain recognition in clinical practice, future pain recognition algorithms may have initial default features, such as (1) Corrugator_Amplitude_p2p, (2) Corrugator_Entropy_shannon, and (3) HRV_slopeRR. Individual-specific calibration allows for an adjustment of feature patterns, resulting in significantly more accurate pain detection rates.
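To make the three default features more concrete, the following sketch shows one plausible way of computing them. The exact definitions used in the study are not given in this section, so the choices here are assumptions: peak-to-peak amplitude as maximum minus minimum, Shannon entropy over an equal-width amplitude histogram, and the RR slope as the least-squares slope of successive RR intervals over the beat index. All function names are hypothetical.

```python
import math

def peak_to_peak(signal):
    """Amplitude range of a signal window (maximum minus minimum)."""
    return max(signal) - min(signal)

def shannon_entropy(signal, bins=8):
    """Shannon entropy (in bits) of the amplitude histogram of a signal window."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return 0.0  # a constant signal carries no amplitude uncertainty
    counts = [0] * bins
    for x in signal:
        # Map the sample to a bin; the maximum value falls into the last bin.
        idx = min(int((x - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(signal)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def slope_rr(rr_intervals):
    """Least-squares slope of RR intervals against beat index (needs at least two beats)."""
    n = len(rr_intervals)
    x_mean = (n - 1) / 2
    y_mean = sum(rr_intervals) / n
    num = sum((i - x_mean) * (rr - y_mean) for i, rr in enumerate(rr_intervals))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

# Hypothetical example windows (values are illustrative only).
emg_window = [0.0, 1.0, 0.0, 1.0]
rr_window = [800, 810, 820]  # RR intervals in milliseconds
features = [peak_to_peak(emg_window), shannon_entropy(emg_window, bins=2), slope_rr(rr_window)]
```

In a calibrated system, such a default feature vector would serve as the starting point, and the per-person feature selection would then add or replace features.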

Outlook

Numerous additional analyses will be performed using the described dataset (Walter et al., 2013a). Specifically, a data fusion of biopotentials with video signals (i.e., facial expressions and gestures) that have been recorded three-dimensionally (Walter et al., 2013a) is planned. Early, intermediary, and late fusions are being tested for the data fusion (Schwenker, Dietrich, Thiel, & Palm, 2006; Schwenker, Dietrich, Kestler, Riede, & Palm, 2003). The features presented in the present article must be investigated using other models in terms of the duration and type of pain induction.
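The difference between early and late fusion mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions, not the planned implementation: early fusion is shown as plain concatenation of per-modality feature vectors before a single classifier, and late fusion as a weighted average of per-modality pain scores in the range 0 to 1. The function names and example values are hypothetical.

```python
def early_fusion(*modalities):
    """Concatenate per-modality feature vectors into one vector for a single classifier."""
    fused = []
    for features in modalities:
        fused.extend(features)
    return fused

def late_fusion(scores, weights=None):
    """Combine per-modality classifier scores by a weighted average."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# Hypothetical inputs: two biopotential features, one video feature.
bio_features = [0.7, 1.3]
video_features = [0.2]
fused_vector = early_fusion(bio_features, video_features)

# Hypothetical pain scores from two separately trained classifiers.
fused_score = late_fusion([1.0, 0.0], weights=[3, 1])
```

Intermediary fusion sits between the two, combining modality-specific representations after some processing but before the final decision.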

There are plans for a clinical project in which detection will occur postoperatively in humans. Multimodal signals with biomedical, visual, and paralinguistic (e.g., sighing) parameters will be measured. Highly complex pain logs will be created to allow for pain quantification. Pain detection will be further clarified by means of data fusion.

Generally, we would like to point out that the development of technology for the detection of pain always requires a multimodal approach with a maximum dimensionality of features. Within this context, it is crucial that the extracted feature configurations are logically comprehensible and clearly structured. Methodological benchmarks are urgently needed.

Acknowledgements

This research was part of the DFG/TR233/12 "Advancement and Systematic Validation of an Automated Pain Recognition System on the Basis of Facial Expression and Psychobiological Parameters" project, funded by the German Research Foundation, FAPEMIG, CNPq, CAPES, and the Brazilian government.

Received 11 August 2013;

Received in revised form 06 May 2014;

Accepted 10 July 2014.

Available online 25 November 2014.

Steffen Walter, Sascha Gruss, Kerstin Limbrecht-Ecklundt, and Harald C. Traue, Department of Psychosomatic Medicine and Psychotherapy, University of Ulm, Ulm, Germany. Philipp Werner and Ayoub Al-Hamadi, Institute for Information Technology and Communications, Otto-von-Guericke-University Magdeburg, Magdeburg, Germany. Nicolai Diniz, Gustavo Moreira da Silva, and Adriano O. Andrade, Biomedical Engineering Laboratory (BioLab), Universidade Federal de Uberlândia, Uberlândia, Brazil.

Appendix

Table 1

Table 3

References

Andrade, A. O., Kyberd, P., & Nasuto, S. J. (2008). The application of the Hilbert spectrum to the analysis of electromyographic signals. Information Sciences, 2176-2193.
Andrade, A. O., Nasuto, S. J., & Kyberd, P. (2007). Extraction of motor unit action potentials from electromyographic signals through generative topographic mapping. Journal of the Franklin Institute, 344(3-4), 154-179.
Andrade, A. O. (2005). Decomposition and analysis of electromyographic signals. Doctoral thesis, University of Reading, England.
Akthar, F., & Hahne, C. (2012). RapidMiner 5 Operator Reference Retrieved from http://rapidminer.com/documentation/; accessed April 23, 2014.
Basler, H. D., Bloem, R., Casser, H. R., Gerbershagen, H. U., Grießinger, N., Hankemeier, U., Weiß, L. (2001). Ein strukturiertes Schmerzinterview für geriatrische Patienten. Der Schmerz, 15(3), 164-171.
Ben-Israel, N., Kliger, M., Zuckerman, G., Katz, Y., & Edry, R. (2013). Monitoring the nociception level: a multi-parameter approach. Journal of Clinical Monitoring and Computing, 27(6), 659-668.
Boser, B. E., Guyon, I. M., & Vapnik, V. N. (1992). A training algorithm for optimal margin classifiers. In: D. Haussler (Ed.), 5th Annual ACM Workshop on COLT (144-152). Pittsburgh, PA: ACM Press.
Brahnam, S., Chuang, C. F., Shih, F., & Slack, M. (2006). SVM classification of neonatal facial images of pain. In: I. Bloch, A. Petrosino, & A. B. Tettamanzi (Eds.), Fuzzy Logic and Applications: 6th International Workshop, WILF 2005, Crema, Italy, September 15-17, 2005: revised selected papers (series title: Lecture notes in computer science, vol. 3849; 121-128). Berlin: Springer.
Cao, C., & Slobounov, S. (2011). Application of a novel measure of EEG non-stationarity as 'Shannon entropy of the peak frequency shifting' for detecting residual abnormalities in concussed individuals. Clinical Neurophysiology, 122, 1314-1321.
Chen, W. T., Zhuang, J., Yu, W. X., & Wang, Z. Z. (2009). Measuring complexity using FuzzyEn, ApEn, and SampEn. Medical Engineering and Physics, 31(1), 61-68.
Colloca, L., Benedetti, F., & Pollo, A. (2006). Repeatability of autonomic responses to pain anticipation and pain stimulation. European Journal of Pain, 10(7), 659-665.
Cortelli, P., & Pierangeli, G. (2003). Chronic pain-autonomic interactions. Neurological Sciences, 24(Suppl. 2), S68-S70.
Herr, K., Bjoro, K., & Decker, S. (2006). Tools for assessment of pain in nonverbal older adults with dementia: a state-of-the-science review. Journal of Pain and Symptom Management, 31(2), 170-192.
Hofmann, T., Schölkopf, B., & Smola, A. J. (2008). Kernel methods in machine learning. Annals of Statistics, 36(3), 1171-1220.
Hsu, C. W., Chang, C. C., & Lin, C. J. (2003). A practical guide to support vector classification. Retrieved from http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
Hua-Mei, C., Varshney, P. K., & Arora, M. K. (2003). Performance of mutual information similarity measure for registration of multitemporal remote sensing images. IEEE Transactions on Geoscience and Remote Sensing, 41(11), 2445-2454.
Jeanne, M., Logier, R., De Jonckheere, J., & Tavernier, B. (2009). Validation of a graphic measurement of heart rate variability to assess analgesia/nociception balance during general anesthesia. Conference Proceedings - IEEE Engineering in Medicine and Biology Society, 2009, 1840-1843.
Kapoor, A., Burleson, W., & Picard, R. W. (2007). Automatic prediction of frustration. International Journal of Human-Computer Studies, 65(8), 724-736.
Kim, J., & Andre, E. (2008). Emotion recognition based on physiological changes in music listening. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(12), 2067-2083.
Kolodyazhniy, V., Kreibig, S. D., Gross, J. J., Roth, W. T., & Wilhelm, F. H. (2011). An affective computing approach to physiological emotion specificity: toward subject-independent and stimulus-independent classification of film-induced emotions. Psychophysiology, 48(7), 908-922.
Korhonen, I., & Yli-Hankala, A. (2009). Photoplethysmography and nociception. Acta Anaesthesiologica Scandinavica, 53(8), 975-985.
Ledowski, T., Ang, B., Schmarbeck, T., & Rhodes, J. (2009). Monitoring of sympathetic tone to assess postoperative pain: skin conductance vs surgical stress index. Anaesthesia, 64(7), 727-731.
Loggia, M. L., Juneau, M., & Bushnell, M. C. (2011). Autonomic responses to heat pain: Heart rate, skin conductance, and their relation to verbal ratings and stimulus intensity. Pain, 152(3), 592-598.
Lautenbacher, S. (2004). Schmerzmessung. In: H. D. Basler, C. Franz, B. Kröner-Herwig, & H. P. Rehfisch (Eds.), Psychologische Schmerztherapie (271-288). Berlin: Springer.
Mitchell, T. (1997). Machine learning. London: McGraw-Hill.
Nakano, K., Ota, Y., Ukai, H., Nakamura, K., & Fujita, H. (2002, 24-28 June). Frequency Detection Method Based on Recursive DFT Algorithm. Paper presented at the 14th PSCC, Sevilla.
Nelder, J. A., & Wedderburn, R. W. M. (1972). Generalized Linear Models. Journal of the Royal Statistical Society Series A (General), 135(3), 370-384.
Nilges, P., & Traue, H. C. (2007). Psychologische Aspekte des Schmerzes. Verhaltenstherapie & Verhaltensmedizin, 28(3), 302-322.
Schlereth, T., & Birklein, F. (2008). The sympathetic nervous system and pain. Neuromolecular Medicine, 10(3), 141-147.
Schölkopf, B., Smola, A. J., Williamson, R. C., & Bartlett, P. L. (2000). New Support Vector Algorithms. Neural Computation, 12(5), 1207-1245.
Schwenker, F., Dietrich, C., Thiel, C., & Palm, G. (2006). Learning of decision fusion mappings for pattern recognition. Journal on Artificial Intelligence and Machine Learning (AIML), 6, 17-21.
Schwenker, F., Dietrich, C., Kestler, H. A., Riede, K., & Palm, G. (2003). Radial basis function neural networks and temporal fusion for the classification of bioacoustic time series. Neurocomputing, 51(0), 265-275.
Soetanto, A. L., Chung, J. W., & Wong, T. K. (2004). Gender differences in pain perception: a signal detection theory approach. Acta Anaesthesiologica Taiwanica, 42(1), 15-22.
Stemmler, G., & Wacker, J. (2010). Personality, emotion, and individual differences in physiological responses. Biological Psychology, 84(3), 541-551.
Treister, R., Kliger, M., Zuckerman, G., Goor Aryeh, I., & Eisenberg, E. (2012). Differentiating between heat pain intensities: the combined effect of multiple autonomic parameters. Pain, 153(9), 1807-1814.
Turk, D. C., & Okifuji, A. (1999). Assessment of patients' reporting of pain: an integrated perspective. Lancet, 353(9166), 1784-1788.
Wald, A. (1943). On a statistical generalization of metric spaces. Proceedings of the National Academy of Sciences of the United States of America, 29(6), 196-197.
Walter, S., Gruss, S., Ehleiter, H., Tan, J., Traue, H. C., Werner, P., Al-Hamadi, A., Crawcour, S., Andrade, A. O., & Moreira da Silva, G. (2013a). The BioVid Heat Pain Database: data for the advancement and systematic validation of an automated pain recognition system. Paper presented at the IEEE International Conference on Cybernetics, Lausanne, Switzerland, June 13-15, 2013.
Walter, S., Kim, J., Hrabal, D., Crawcour, S. C., Kessler, H., & Traue, H. C. (2013b). Transsituational individual-specific biopsychological classification of emotions. IEEE Transactions on Systems Man Cybernetics-Systems, 43(4), 988-995.
Walter, S., Scherer, S., Schels, M., Glodek, M., Hrabal, D., Schmidt, M., Böck, R., Limbrecht, K., Traue, H. & Schwenker, F. (2011). Multimodal emotion classification in naturalistic user behavior. Human-Computer Interaction: Towards Mobile and Intelligent Interaction Environments, Part III, 6763, 603-611.
Wiebalck, A., Vandermeulen, E., Aken, H. V., & Vandermeersch, E. (1995). Ein Konzept zur Verbesserung der postoperativen Schmerzbehandlung. Der Anaesthesist, 44(12), 831-842.
Zimmer, C. (2004). Schmerz und Geschlecht. In: H. D. Basler, C. Franz, B. Kröner-Herwig, & H. P. Rehfisch (Eds.), Psychologische Schmerztherapie (203-215). Berlin: Springer.
Zwakhalen, S. M., Hamers, J. P., Abu-Saad, H. H., & Berger, M. P. (2006). Pain in elderly people with severe dementia: a systematic review of behavioural pain assessment tools. BMC Geriatrics, 6, 3.

Correspondence regarding this article should be directed to: Steffen Walter. Email: steffen.walter@uni-ulm.de

Publication Dates

Publication in this collection
24 Feb 2015
Date of issue
Dec 2014

This work is licensed under a Creative Commons Attribution 4.0 International License.

