Kh. Labs et al., Reliability of treadmill testing in peripheral arterial disease: a comparison of a constant load with a graded load treadmill protocol, VASC MED, 4(4), 1999, pp. 239-246
This study aims to evaluate the reliability of repeated graded workload
treadmill testing (G-test; 2 mph; 0% grade, increasing 2% every 2 min) and
to compare the reliability of a constant workload treadmill protocol
(C-test; 2 mph; 12% grade) versus the graded workload treadmill protocol in
patients with intermittent claudication, studied longitudinally.
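The two protocols described above can be sketched as simple functions of elapsed time. This is only an illustration: the function names are hypothetical, and it assumes that the G-test grade steps up at the end of each completed 2-minute stage with no upper limit, which the abstract does not state.

```python
import math

SPEED_MPH = 2.0        # both protocols run at 2 mph
FEET_PER_MILE = 5280


def g_test_grade(minutes):
    """Grade (%) of the graded protocol: 0% at start, +2% every 2 min.

    Assumption: the step occurs once each 2-minute stage is complete.
    """
    return 2 * math.floor(minutes / 2)


def c_test_grade(minutes):
    """Grade (%) of the constant protocol: fixed at 12%."""
    return 12


def distance_feet(minutes):
    """Distance walked at the fixed belt speed after `minutes`."""
    return SPEED_MPH * FEET_PER_MILE * minutes / 60


for t in (0, 1, 2, 5):
    print(t, g_test_grade(t), c_test_grade(t), distance_feet(t))
```

At 2 mph the belt covers 176 feet per minute, so walking distance maps directly onto walking time in both protocols.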
A clinical trial investigating an orally stable prostacyclin derivative was
performed in 330 patients with intermittent claudication. The
trial employed three active treatment groups and one placebo group. Because
there were no significant inter-group differences at baseline or after
treatment, data from all groups were pooled for the evaluation of treadmill
test reliability. Treadmill data were obtained from a 2-week run-in phase
where three G-tests were performed, as well as from the beginning and the end
of a 3-month double-blind phase where a G-test and a C-test were performed
in random order. Treadmill test reliability was described through test
process-related and between-subject variances and also using
variance-derived parameters such as the reliability coefficient (RC) and
the relative precision (RP). A higher value for the RC and a lower value
for the RP indicate that the test variability is predominantly due to
between-subject variance and not to test process-related variance.
Estimates of variance were described for both the maximal or absolute
claudication distance (ACD) and the initial claudication distance (ICD)
with each treadmill test. Reliability estimates are reported for the total
study sample and for patients with baseline
claudication distances ≤300 feet and >300 feet (approximately ≤100 m and
>100 m), as measured with the C-test.
The cut-off value was empirically chosen to separate severely diseased from
mildly to moderately diseased claudicants. Theoretical considerations
suggest that reliability measures may differ in these subgroups.
With repeated testing during the run-in phase for the measure of ACD, the
G-test had an RC
of 0.952 and an RP of 21.9%. When both test protocols were compared in
the entire study population for the measurement of ACD, the G-test had an
RC of 0.902 and an RP of 31.3%, while the C-test had an RC of 0.876 and an
RP of 35.2%. For ICD, the G-test had an RC of 0.809 and an RP of 43.7%,
while the C-test had an RC of 0.737 and an RP of 51.3%. For both
protocols, the reliability of the ACD measurement, in terms of RC and RP,
was numerically superior to that of the ICD measurement. In patients with a
baseline ACD ≤300 feet, the RC for ACD on the G-test was 0.827 and the RP
was 41.4%. In contrast, on the C-test the RC decreased to 0.250 and the RP
increased to 86.6%. These changes in RC and RP were due to a marked
decrease in the between-subject variance, demonstrating the inability of
the C-test to separate appropriately the different claudication distances
in populations with highly limited baseline claudication distances.
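The abstract does not give formulas for the RC or the RP. The sketch below assumes a common variance-components form: RC as the share of total variance due to between-subject variance, and RP as the square root of the remaining share, expressed as a percentage. The RP formula is an assumption, chosen because it reproduces every reported RC/RP pair to within rounding. The function names are hypothetical.

```python
import math


def reliability_coefficient(between_subject_var, process_var):
    """Assumed RC: between-subject variance as a fraction of total variance."""
    total = between_subject_var + process_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_subject_var / total


def rp_from_rc(rc):
    """Assumed RP (%): sqrt of the process-related share, sqrt(1 - RC)."""
    return 100 * math.sqrt(1 - rc)


def relative_precision(between_subject_var, process_var):
    """Assumed RP (%) computed directly from the two variance components."""
    return rp_from_rc(reliability_coefficient(between_subject_var, process_var))


# (RC, RP in %) pairs exactly as reported in the abstract.
REPORTED = [
    (0.952, 21.9),  # G-test, ACD, run-in phase
    (0.902, 31.3),  # G-test, ACD, entire population
    (0.876, 35.2),  # C-test, ACD, entire population
    (0.809, 43.7),  # G-test, ICD
    (0.737, 51.3),  # C-test, ICD
    (0.827, 41.4),  # G-test, ACD, baseline ACD of 300 feet or less
    (0.250, 86.6),  # C-test, ACD, baseline ACD of 300 feet or less
]

for rc, rp in REPORTED:
    print(f"RC={rc:.3f}  reported RP={rp:.1f}%  sqrt(1-RC)={rp_from_rc(rc):.1f}%")
```

Under these assumed definitions, holding the process-related variance fixed while the between-subject variance shrinks lowers the RC and raises the RP, which matches the pattern the abstract describes for the C-test in the subgroup with short baseline distances.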
During a run-in phase, the G-test has excellent test characteristics.
During the longitudinal phase of a trial, the reliability of G-tests and
C-tests is comparable in the entire study population. However, in patients
with low claudication distances, the G-test should be given preference over
the C-test.