Acta Cirurgica Brasileira

On-line version ISSN 1678-2674

Acta Cir. Bras. vol.25 no.2 São Paulo Mar./Apr. 2010 



Accuracy of six minute walk test, stair test and spirometry using maximal oxygen uptake as gold standard1




Daniele Cristina Cataneo (I); Shoiti Kobayasi (II); Lídia Raquel de Carvalho (III); Rafael Camargo Paccanaro (IV); Antonio José Maria Cataneo (V)

(I) PhD, Assistant Professor, Thoracic Surgery Division, Department of Surgery, Botucatú School of Medicine, UNESP, São Paulo, Brazil
(II) Full Professor, Gastroenterologic Surgery Division, Department of Surgery, Botucatú School of Medicine, UNESP, São Paulo, Brazil
(III) PhD, Assistant Professor, Botucatú Biosciences Institute, UNESP, São Paulo, Brazil
(IV) MD, Resident, Department of Surgery, Botucatú School of Medicine, UNESP, São Paulo, Brazil
(V) Full Professor, Thoracic Surgery Division, Department of Surgery, Botucatú School of Medicine, UNESP, São Paulo, Brazil





PURPOSE: To assess the accuracy of the variables stair climbing time (SCt), stair climbing power (SCP), six-minute walk test distance (6MWT), and forced expiratory volume in 1 second (FEV1) using maximal oxygen uptake on exercise (VO2max) as the gold standard.
METHODS: Tests were performed in 51 patients. FEV1 was measured by spirometry and 6MWT was performed in a flat 120-m corridor. Stair climbing test was performed on a 6-flight stairway to obtain SCt and SCP. VO2max was measured by ergospirometry, using the Balke protocol. Pearson's linear correlation and p values were calculated between VO2max and the other variables tested. For accuracy calculations, variable cutoff points were obtained through receiver operating characteristic (ROC) curves, dividing individuals into normal or unhealthy. Kappa statistic was used to calculate concordance.
RESULTS: Accuracy was: SCt – 86%, 6MWT – 80%, SCP – 71%, FEV1(L) – 67%, FEV1(%) – 63%. SCt and 6MWT showed 93.5% sensitivity when combined in parallel, and 96.4% specificity in series.
CONCLUSION: SCt presented the best accuracy. SCt and 6MWT combined showed nearly 100% sensitivity or specificity. Thus, these simple exercise tests should be more routinely used, especially when an ergospirometer is not available to measure VO2max.

Key words: Prognosis. Oxygen Consumption. Spirometry.





Major surgery under general anesthesia places considerable stress on the cardiopulmonary system, increasing morbidity and mortality in individuals with low cardiopulmonary reserve. Patients with lung disease have a higher incidence of postoperative complications, and the frequency of complications increases in proportion to the severity of lung impairment. Similarly, the presence of several cardiac risk factors increases the risk of postoperative complications in patients with cardiac disease. The incidence of postoperative cardiopulmonary complications is highest in patients undergoing upper abdominal and thoracic surgery, leading to longer hospital stays and higher costs. Risk stratification may help to select patients for appropriate counseling and prophylactic treatment [1,2].

The ideal test to predict the risk of postoperative complications should determine the aerobic capacity and functional reserve that would enable the patient to cope with the physical demands of surgery.

Cardiopulmonary exercise testing (CPET) has a wide spectrum of clinical applications. It can be used to evaluate a patient's physical fitness and has been considered the gold standard for predicting surgical risk. In this regard, the best measure is maximal oxygen uptake during exercise (VO2max) determined by ergospirometry, which reflects the maximum volume of oxygen consumed during exercise and is, therefore, considered a maximal test. VO2max seems to be the best indicator of exercise capacity [3], but despite its usefulness, ergospirometry is not available in most hospitals. Nonetheless, while cardiac performance and respiratory function can each be evaluated individually, cardiopulmonary exercise testing allows both systems to be examined in a single study. During exercise, oxygen consumption, carbon dioxide production, and cardiac output increase, and the work level reached reflects how well the heart, lungs, and circulatory system interact to transport oxygen to the tissues.

CPET is the closest to ideal of all surgical risk prediction tools. It allows improved assessment of postoperative risk and has been increasingly used by surgeons for preoperative evaluation [4]. At first, its efficacy was not clear, but more recent studies have demonstrated that CPET, including VO2max, is the most important step in the physiologic assessment of a thoracotomy candidate [5]. Thus, CPET can be considered the gold standard test for surgical risk evaluation.

A systematic review of the literature [6] showed that exercise capacity expressed as VO2max is lower in patients who develop clinically relevant complications after lung resection. Other forms of exercise testing, which do not take the patient to maximum fatigue, are considered submaximal. These tests, namely the twelve-minute walk test [7], the six-minute walk test (6MWT) [8], and the stair climbing test (SCT) [9], can be used to confirm whether an individual is physically fit for a specific surgical procedure. They are simple, low-cost tests that do not require specialized equipment.

VO2max and 6MWT distance have proven to predict prognosis better than resting lung and/or cardiac function. Exercise testing is recommended for the preoperative evaluation of patients undergoing major surgery [10].

Given that the current economic environment calls for cheaper and more accessible forms of testing, and that ergospirometers are not available at all hospitals, the purpose of this study was to standardize SCT in our service [11] and to assess the accuracy of stair climbing time (SCt), stair climbing power (SCP), 6MWT distance, and forced expiratory volume in one second (FEV1) measured by spirometry, in order to determine which best predicts surgical risk using VO2max as the gold standard.



After approval by the Research Ethics Committee of São Paulo State University, this study was initiated by contacting patients over 18 years of age who had been referred for spirometry and who agreed to participate and signed the informed consent form. Eligible patients were those referred for spirometry for any clinical or preoperative reason. Exclusion criteria were the same as for ergospirometry: any acute condition, systolic arterial pressure > 200mmHg and diastolic arterial pressure > 110mmHg, decompensated heart failure, infarction within the past 40 days, decompensated COPD, electrocardiogram showing complete left bundle branch block, walking difficulty (orthopedic, neurological, or vascular changes), and inability to ascend the complete staircase or to perform ergospirometry. All patients enrolled underwent history taking, physical examination, and electrocardiography at rest before exercise testing. Spirometry, followed by 6MWT and SCT, was performed on the same day. Minimum recovery time between tests was 30 minutes. Ergospirometry was scheduled for a later date according to laboratory availability.

Spirometry was performed using a Med-Graphics Pulmonary Function System 1070, according to the American Thoracic Society guidelines [12]. Forced vital capacity was measured at least three times, and the curve with the highest FEV1 was chosen. Readings were expressed in liters and as percent predicted.

The 6MWT measured the distance covered by the patient in six minutes of walking, according to the guidelines of the American Thoracic Society [13]. It was performed in the shade, at a fast pace with encouragement from the examiner, along a flat 120-m corridor marked every 0.75m.

SCT was performed in the shade, on a staircase with a 30º incline consisting of six flights of twelve steps each (72 steps in total), each step measuring 16.9cm, for a total ascent height of 12.16m [11]. Patients were asked to climb all the steps in the shortest possible time, with verbal encouragement from the same examiner. Between flights, patients had to take two or three paces on a flat surface, trying to maintain the same speed while the examiner asked whether everything was fine. Testing was stopped only for fatigue, limiting dyspnea, thoracic pain, or exhaustion. The time taken to climb the stairs (SCt) was expressed in seconds. The amount of work (W) done to climb the stairs was calculated in joules using the formula W = m × g × h, where m is patient mass in kilograms, g is gravitational acceleration (9.8m/s²), and h is the height of the staircase in meters (12.16m). Stair-climbing power (SCP) was calculated in watts as SCP = W / SCt.
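The work and power calculation described above can be sketched as follows. This is a minimal illustration, not the authors' software; the function names and the 70kg, 40s patient are hypothetical, while g = 9.8m/s² and h = 12.16m come from the text.

```python
# Minimal sketch of the stair-climbing work and power formulas.
# Function names and the example patient are hypothetical.

G = 9.8                 # gravitational acceleration in m/s^2, as stated in the text
STAIR_HEIGHT_M = 12.16  # total ascent height in meters, as stated in the text


def stair_work_joules(mass_kg: float, height_m: float = STAIR_HEIGHT_M) -> float:
    """Work done against gravity: W = m * g * h (joules)."""
    return mass_kg * G * height_m


def stair_power_watts(mass_kg: float, sct_seconds: float,
                      height_m: float = STAIR_HEIGHT_M) -> float:
    """Stair-climbing power: SCP = W / SCt (watts)."""
    if sct_seconds <= 0:
        raise ValueError("SCt must be a positive number of seconds")
    return stair_work_joules(mass_kg, height_m) / sct_seconds


if __name__ == "__main__":
    # Hypothetical patient: 70 kg, completes the staircase in 40 s.
    work = stair_work_joules(70.0)
    power = stair_power_watts(70.0, 40.0)
    print(f"W = {work:.1f} J, SCP = {power:.1f} W")
```

Note that W depends only on body mass once the height is fixed, whereas SCP also reflects how fast the patient climbed.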

VO2max was measured using a Quinton ergospirometer (Q4500, Quinton Instruments, Seattle, WA, USA) coupled to a treadmill in a standard climate-controlled environment. Heart and respiratory rates, arterial blood pressure, oxygen saturation, and 12-lead electrocardiograms were monitored throughout the test. All ergospirometry variables were measured, but VO2max, expressed in ml/kg/min, was the only one used. Testing was performed using the Balke protocol, which is an incremental protocol indicated for individuals with comorbidities. The examination was interrupted in the event of a systolic arterial pressure drop > 10mmHg as compared to rest, angina, symptoms related to the central nervous system (ataxia, dizziness, lightheadedness), signs of low perfusion (cyanosis, pallor), technical difficulties in monitoring electrocardiograms or arterial blood pressure, sustained ventricular tachycardia, ST-segment elevation >2mm or depression >3mm, patient request, fatigue, dyspnea, wheezing, cramps, claudication, left bundle branch block or conduction delay, increasing chest pain, or hypertensive response. When testing was interrupted for any of the above reasons, the patient was excluded from the study.

Data were analyzed using Pearson's coefficient to estimate the correlation of VO2max with each of the other variables, along with p-values. Sensitivity, specificity and accuracy were determined for the variables that correlated significantly with VO2max. The cut-off points used to distinguish normal from abnormal test results were calculated using receiver operating characteristic (ROC) curves [14] and rounded to whole numbers. The Kappa (k) statistic was used to assess concordance. The two most accurate tests were then combined in series to increase specificity and in parallel to increase sensitivity. The software used was SAS 9.1.
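As an illustration of the accuracy measures named above, the sketch below computes sensitivity, specificity, accuracy and Cohen's kappa from a 2×2 table of test result against gold standard. It uses the standard textbook definitions and is not the SAS procedure used in the study; the class name, function names and the counts in the example are hypothetical.

```python
# Minimal sketch: diagnostic accuracy measures from a 2x2 table.
# Standard definitions; names and example counts are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: int  # test abnormal, gold standard abnormal
    fp: int  # test abnormal, gold standard normal
    fn: int  # test normal, gold standard abnormal
    tn: int  # test normal, gold standard normal

    @property
    def n(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def sensitivity(t: TwoByTwo) -> float:
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    return t.tn / (t.tn + t.fp)


def accuracy(t: TwoByTwo) -> float:
    return (t.tp + t.tn) / t.n


def cohen_kappa(t: TwoByTwo) -> float:
    """Agreement beyond chance between the test and the gold standard."""
    observed = (t.tp + t.tn) / t.n
    expected = ((t.tp + t.fp) * (t.tp + t.fn)
                + (t.fn + t.tn) * (t.fp + t.tn)) / t.n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    table = TwoByTwo(tp=20, fp=5, fn=5, tn=20)  # hypothetical counts
    print(f"sensitivity={sensitivity(table):.2f} "
          f"specificity={specificity(table):.2f} "
          f"accuracy={accuracy(table):.2f} kappa={cohen_kappa(table):.2f}")
```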



Tests were performed in 51 patients (30 males and 21 females) aged 18-77 years (mean ± SD = 52±16). No test had to be interrupted. Table 1 shows the mean and standard deviation (SD), the maximum and minimum values, and the cut-off points obtained in all tests performed. Linear correlations between VO2max and the test variables are presented in Table 2. Since no significant correlation was observed between VO2max and W, data on this variable were discarded and its accuracy was not determined. Table 3 shows the sensitivity, specificity, accuracy and Kappa concordance of the tests using VO2max as the gold standard.







As 6MWT distance and stair-climbing time had the best accuracy and concordance, they were combined in parallel and in series. Parallel combination yielded 93.5% sensitivity, 59.6% specificity, and 84% accuracy (Table 4) whereas series combination showed 50.7% sensitivity, 96.4% specificity, and 82% accuracy (Table 5).






Preoperative CPET can detect changes in oxygen transport that would not be discovered unless metabolic demand increased during or after surgery. Ergospirometry, which can be used for this purpose, is not available in most services and requires costly equipment. Therefore, despite being highly efficient and considered the gold standard for surgical risk prediction by most authors [4], CPET is still far from being feasible, especially in poor countries.

The ideal test for preoperative investigation must be simple, cheap, and widely available. SCT has these characteristics and has been used in developed countries to evaluate cardiopulmonary fitness [15-17]. Considering that exercise capacity is limited by cardiac or pulmonary disease, it is not surprising that patients with cardiopulmonary disorders have difficulty climbing stairs, with the degree of limitation being proportional to the severity of cardiopulmonary impairment, while patients who rapidly climb multiple flights without symptoms have considerable cardiopulmonary reserve. However, SCT standardization requires the use of an accurate, universal and adequately determined variable, as previously done for 6MWT. Attempts to stratify postoperative complications just by the number of floors or steps completed have sometimes been frustrating [18]. The step is not a universal unit of measurement, so the ideal would be to measure the height reached in meters rather than in flights or floors. It would also be difficult to apply SCT if height were the variable: very high stairways, which are not available in all services including ours, would be needed. Nevertheless, taking a minimum height of 12m as a constant, it is possible to use SCt as the variable. Lower heights may not be as useful. According to some studies [15], 50% of the patients unable to ascend 12m are likely to have complications.

Along the same lines, a previous study [19] has shown a significant correlation between stair climbing speed and VO2max measured by cycle ergometry, a correlation much greater than that of VO2max with the height climbed. In that study, patients were asked to climb as high and as fast as they could, to a maximum elevation of 20m. The height reached and the average speed of ascent were compared to VO2max. In patients with a speed > 15m/min (80s over 20m) VO2max was > 20ml/kg/min, and in patients with a speed > 12m/min (100s over 20m) VO2max was > 15ml/kg/min. In our study, patients with a speed > 18m/min (40s over 12m) had VO2max > 25ml/kg/min. When height is constant, time is the only variable and speed calculation is not necessary. Nevertheless, if comparing different stair heights is desired, the ideal is to use average speed. Our results are similar to those reported by these authors [19] (Figure 1).
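The speed figures quoted above follow from dividing the height climbed by the time taken and converting seconds to minutes. A short sketch of that arithmetic, with hypothetical function names, is:

```python
# Minimal sketch: converting stair height and climbing time to average
# ascent speed in m/min, and back. Function names are hypothetical.


def climb_speed_m_per_min(height_m: float, time_s: float) -> float:
    """Average vertical speed in meters per minute."""
    if time_s <= 0:
        raise ValueError("time must be positive")
    return height_m / time_s * 60.0


def time_for_speed_s(height_m: float, speed_m_per_min: float) -> float:
    """Seconds needed to climb height_m at the given average speed."""
    if speed_m_per_min <= 0:
        raise ValueError("speed must be positive")
    return height_m / speed_m_per_min * 60.0


if __name__ == "__main__":
    # Height/time pairs quoted in the text.
    for height_m, time_s in [(20.0, 80.0), (20.0, 100.0), (12.0, 40.0)]:
        speed = climb_speed_m_per_min(height_m, time_s)
        print(f"{height_m:.0f} m in {time_s:.0f} s -> {speed:.1f} m/min")
```

Expressing results as speed is what allows staircases of different heights to be compared, as the paragraph above suggests.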



SCt must be adequately determined, and there must be encouragement during stair climbing to prevent patients from walking at their own pace. Time measured without encouragement, besides not reflecting the actual physical capacity of an individual, also affects other SCT parameters already used by other authors, such as power and estimated VO2 [15,18]. Ultimately, time is needed to calculate these variables.

Regarding SCP, other authors [18] have estimated the amount of work done to ascend a stairway (W) by the formula "W = step height (m) × number of steps / t (min) × mass (kg) × 0.1635". This formula does not actually calculate work but power expressed in watts. It is, indeed, a more complicated form of the classical formula for power (P = m × g × h / t) [11], and is still used by some authors [15] who refer to this variable as "work". Work calculations do not require information on SCt. As a matter of fact, in our study VO2max was found not to correlate with work but with power, which does depend on time. In order to estimate VO2 during stair climbing, the above-mentioned authors [15,18] used power under the name of "work": VO2 (ml/min) = 5.8 × m (kg) + 151 + (10.1 × "W"). Thus, time remains an important variable in estimating VO2. It must be measured as strictly as weight and height, and will never be adequate without encouragement. However, the literature shows that encouragement and adequate time measurement have not been a matter of concern.
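The equivalence between the two formulas can be checked numerically: the constant 0.1635 equals 9.81 / 60, that is, gravitational acceleration divided by the number of seconds in a minute. The sketch below compares the published "work" formula with the classical power formula and then applies the VO2 estimate quoted above. The function names and the example patient are hypothetical, and g = 9.81m/s² is used here (instead of the 9.8m/s² used in this study) only so that the two formulas match exactly.

```python
# Minimal sketch: the published "work" formula is power in watts.
# Function names and the example patient are hypothetical.

G_FOR_COMPARISON = 9.81  # m/s^2; 9.81 / 60 = 0.1635, the published constant


def published_work(step_height_m: float, n_steps: int,
                   time_min: float, mass_kg: float) -> float:
    """'W' as quoted in the text: step height x steps / t(min) x mass x 0.1635."""
    return step_height_m * n_steps / time_min * mass_kg * 0.1635


def classical_power_watts(mass_kg: float, height_m: float, time_s: float,
                          g: float = G_FOR_COMPARISON) -> float:
    """Classical power: P = m * g * h / t, with t in seconds."""
    return mass_kg * g * height_m / time_s


def estimated_vo2_ml_min(mass_kg: float, w: float) -> float:
    """VO2 (ml/min) = 5.8 x mass + 151 + 10.1 x 'W', as quoted in the text."""
    return 5.8 * mass_kg + 151.0 + 10.1 * w


if __name__ == "__main__":
    # Hypothetical patient: 70 kg, 72 steps of 0.169 m, climbed in 40 s.
    mass, step, steps, t_s = 70.0, 0.169, 72, 40.0
    w = published_work(step, steps, t_s / 60.0, mass)
    p = classical_power_watts(mass, step * steps, t_s)
    print(f"published 'W' = {w:.2f}, classical power = {p:.2f} W")
    print(f"estimated VO2 = {estimated_vo2_ml_min(mass, w):.0f} ml/min")
```

Because time appears in the denominator of both expressions, any error in SCt propagates directly into "W" and into the VO2 estimate.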

Under our experimental conditions, SCt showed the best linear correlation with VO2max and the highest accuracy, indicating that patients who take < 40s to climb 12m have a high probability of having VO2max > 25ml/kg/min, and those who take > 40s have a high probability of having VO2max < 25ml/kg/min. Likewise, an SCP above or below 200W indicated a high probability of VO2max being above or below 25ml/kg/min, respectively. However, using SCP rather than SCt can lead to more errors. Patients who do not manage to complete the 12m, for whom SCt is considered infinite, should be carefully evaluated in order to detect and attempt to correct any changes in the oxygen transport system, as previously demonstrated [15]. Brunelli et al. [20] demonstrated that cardiopulmonary complications were 2-fold higher and mortality was 13-fold higher in patients climbing less than 12m than in patients who climbed more than 22m. Our findings differ from those reported by Brunelli et al. [15,20] because they used height as the variable, whereas stair climbing speed was used here. Nonetheless, patients should not be allowed to climb the stairs at their own pace. Instead, they should be encouraged to ascend as fast as possible, as suggested by Koegelenberg et al. [19].

In this study, 6MWT showed the third best linear correlation with VO2max, the second best accuracy, and the highest specificity, with a strong ability to detect fit individuals. Thus, it may be safely said that individuals who walk more than 500m in 6 minutes have a high probability of having VO2max > 25ml/kg/min, but not that individuals who walk less than 500m have VO2max < 25ml/kg/min, as test sensitivity was not so high.

FEV1 results in liters were better than those expressed in percent predicted. VO2max linear correlation was good with FEV1 (L), but worse with FEV1%. Both parameters were less accurate than 6MWT and SCT variables. FEV1 (L) was better than FEV1% in detecting individuals without complications (VO2max > 25ml/kg/min) due to its high specificity, but it poorly detected individuals with complications because of its low sensitivity.

Given that SCt and 6MWT were considered the best tests, and there was an 80% concordance between them, their sensitivity could be improved by a parallel combination, which increased sensitivity to 93.5%. In this case, one positive test was enough to identify an individual with complications. This can be very useful when deciding on a lesser resection such as a segmentectomy rather than a lobectomy, when insisting on better preoperative preparation for elective surgery, or even when predicting the need for greater support during postoperative intensive care.

When it is necessary to increase specificity, 6MWT and SCt should be done in series, which provides a 96.4% specificity, as both tests must be positive for an individual to be considered unhealthy. Thus, the ability to identify individuals without complications is enhanced, facilitating decision making in cases of patients with few or no comorbidities, young and clinically healthy patients, and pulmonary resection when postoperative predicted values are below the acceptable level. This way, surgery would only be contraindicated if results were abnormal in both tests.
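The two combination rules can be expressed as simple logical operations on each patient's pair of results: in parallel, either abnormal test makes the combined result positive (logical OR); in series, both must be abnormal (logical AND). The sketch below uses the 40s and 500m values mentioned in the discussion as cut-off points. The function names are hypothetical, and treating a result exactly equal to a cut-off as normal is an assumption, since the text does not say how ties were handled.

```python
# Minimal sketch: parallel (OR) and series (AND) combination of SCt and 6MWT.
# Function names are hypothetical; tie handling at the cut-offs is an assumption.

SCT_CUTOFF_S = 40.0    # SCt above 40 s is treated as abnormal
WALK_CUTOFF_M = 500.0  # 6MWT distance below 500 m is treated as abnormal


def sct_abnormal(sct_s: float) -> bool:
    return sct_s > SCT_CUTOFF_S


def walk_abnormal(distance_m: float) -> bool:
    return distance_m < WALK_CUTOFF_M


def parallel_positive(sct_s: float, distance_m: float) -> bool:
    """Positive if either test is abnormal: favors sensitivity."""
    return sct_abnormal(sct_s) or walk_abnormal(distance_m)


def series_positive(sct_s: float, distance_m: float) -> bool:
    """Positive only if both tests are abnormal: favors specificity."""
    return sct_abnormal(sct_s) and walk_abnormal(distance_m)


if __name__ == "__main__":
    # Hypothetical patients: (SCt in seconds, 6MWT distance in meters).
    for sct_s, dist_m in [(35.0, 550.0), (35.0, 450.0), (45.0, 450.0)]:
        print(f"SCt={sct_s:.0f}s 6MWT={dist_m:.0f}m "
              f"parallel={parallel_positive(sct_s, dist_m)} "
              f"series={series_positive(sct_s, dist_m)}")
```

A patient with discordant results, such as the second one in the example, is flagged by the parallel rule but not by the series rule, which is why the former raises sensitivity and the latter raises specificity.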

This study aimed at assessing the accuracy of the parameters obtained during stair climbing with encouragement, 6MWT and FEV1; not at finding the cutoff points for the identification of patients at high or low surgical risk. The next step, which is in progress, is to ascertain the cut-off points for patients at high, average, and low surgical risk by evaluating the correlations of postoperative complications with the parameters provided by these tests.



By comparing the surgical risk prediction tests used in this group of patients, it may be concluded that the cheapest tests, namely the stair climbing test and the six-minute walk test, showed better accuracy than spirometry, and should, therefore, be more frequently used in preoperative assessments.



1. Galvan CCR, Cataneo AJM. Effect of respiratory muscle training on pulmonary function in preoperative preparation of tobacco smokers. Acta Cir Bras. 2007;22(2):98-104.

2. Du Moulin M, Taube K, Wegscherder K, Behrike M, Van den Bussche H. Home-based exercise training as maintenance after outpatient pulmonary rehabilitation. Respiration. 2009;77:139-45.

3. Weisman IM, Zeballos RJ. Clinical exercise testing. Clin Chest Med. 2001;22(4):679-701.

4. Schuurmans MM, Diacon AH, Bolliger CT. Functional evaluation before lung resection. Clin Chest Med. 2002;23(1):159-72.

5. Win T, Jackson A, Sharples L, Groves AM, Wells FC, Ritchie AJ, Laroche CM. Cardiopulmonary exercise tests and lung cancer surgical outcome. Chest. 2005;127(4):1159-65.

6. Benzo R, Kelley GA, Recchi L, Hofman A, Sciurba F. Complications of lung resection and exercise capacity: a meta-analysis. Respir Med. 2007;101(8):1790-7.

7. Bagg LR. The 12-min walking distance: its use in the pre-operative assessment of patients with bronchial carcinoma before lung resection. Respiration. 1984;46:342-5.

8. Butland RJA, Pang J, Gross ER, Woodcock AA, Geddes DM. Two, six and 12-minute walk tests in respiratory disease. BMJ. 1982;284:1607-8.

9. Souders CR. Clinical evaluation of the patient for thoracic surgery. Surg Clin North Am. 1961;41:545-56.

10. Ferraza AM, Martolini D, Valli G, Palange P. Cardiopulmonary exercise testing in the functional and prognostic evaluation of patients with pulmonary diseases. Respiration. 2009;77:3-17.

11. Cataneo DC, Cataneo AJM. Accuracy of stair climbing test using maximal oxygen uptake as the gold standard. J Bras Pneumol. 2007;33(2):128-33.

12. Crapo RO, Hankinson JL, Irvin C, MacIntyre NR, Voter KZ, Wise RA, Graham B, O'Donnell C, Paoletti P, Roca J, Veigi G. Standardization of spirometry: 1994 Update. Am J Respir Crit Care Med. 1995;152:1107-36.

13. American Thoracic Society. Guidelines for the six minute walk test. Am J Respir Crit Care Med. 2002;166:111-7.

14. Martinez EZ, Louzada-Neto F, Pereira BB. A curva ROC para testes diagnósticos. Cad Saúde Coletiva. 2003;11(1):7-31.

15. Brunelli A, Al Refai M, Monteverde M, Borri A, Salati M, Fianchini A. Stair climbing test predicts cardiopulmonary complications after lung resection. Chest. 2002;121(4):1106-10.

16. Girish M, Trayner E Jr, Dammann O, Pinto-Plata V, Celli B. Symptom-limited stair climbing as a predictor of postoperative cardiopulmonary complications after high-risk surgery. Chest. 2001;120(4):1147-51.

17. Kinasewitz GT, Welch MH. A simple method to assess postoperative risk. Chest. 2001;120(4):1057-8.

18. Olsen GN, Bolton JW, Weiman DS, Hornung CA. Stair climbing as an exercise test to predict the postoperative complications of lung resection: two years experience. Chest. 1991;99(3):587-90.

19. Koegelenberg CFN, Diacon AH, Irani S, Bolliger CT. Stair climbing in the functional assessment of lung resection candidates. Respiration. 2008;75:374-9.

20. Brunelli A, Refai M, Xiumé F, Salati M, Sciarra V, Socci L, Sabbatini A. Performance at symptom-limited stair-climbing test is associated with increased cardiopulmonary complications, mortality, and costs after major lung resection. Ann Thorac Surg. 2008;86:240-8.



Daniele Cristina Cataneo
Thoracic Surgery Discipline of the Surgery and Orthopedics Department
Botucatú School of Medicine - UNESP
18618-970 Botucatú - SP Brazil
Phone: (55 14)3815-6230
Fax: (55 14)3815-7615

Received: September 08, 2009
Review: November 09, 2009
Accepted: December 07, 2009
Conflict of interest: none
Financial source: none



1 Research performed at Postgraduate Program on General Basis of Surgery, São Paulo State University (UNESP), Botucatú School of Medicine, São Paulo, Brazil.

All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.