OBJECTIVE: To develop an approach to the prediction of survival in patients with colorectal cancer using nearest neighbor analysis and case-based reasoning.
STUDY DESIGN: A total of 216 patients with full clinicopathologic records and five-year follow-up were the subjects of this study. They were divided into a database of 162 cases and a test group of 54 cases, with follow-up on all patients. When the patient was still alive at the end of the follow-up period, censored survival time was used. For each of the test cases, the four closest neighbors from the database were retrieved and their median survival time recorded and used as the predicted estimate of survival. Case matching was based on a Euclidean multivariate distance measure for the three best predictor variables: patient age, Dukes stage and tubule configuration. Cases with the smallest distance from the test case were considered to be the most similar. The predicted survival times for the test cases were compared with the actual, observed survival in the test cases to determine the success of this approach.
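The retrieval step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the three predictors (age, Dukes stage, tubule configuration) have already been encoded as numbers and scaled to comparable ranges, which the abstract does not specify. The function name, the dictionary layout and the toy cases are all hypothetical.

```python
import math
from statistics import median


def nearest_neighbor_survival(test_features, database, k=4):
    """Predict survival as the median survival time of the k closest cases.

    Assumption: every feature tuple holds (age, Dukes stage, tubule
    configuration), already numerically encoded and scaled to 0..1.
    """
    def euclidean(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Rank database cases by distance to the test case, smallest first.
    ranked = sorted(database, key=lambda c: euclidean(test_features, c["features"]))
    neighbors = ranked[:k]
    return median(c["survival_months"] for c in neighbors)


# Hypothetical toy database; values are invented for illustration only.
database = [
    {"features": (0.50, 0.25, 0.0), "survival_months": 70},
    {"features": (0.55, 0.25, 0.5), "survival_months": 64},
    {"features": (0.60, 0.50, 0.5), "survival_months": 40},
    {"features": (0.45, 0.25, 0.0), "survival_months": 82},
    {"features": (0.90, 1.00, 1.0), "survival_months": 8},
    {"features": (0.85, 0.75, 1.0), "survival_months": 15},
]

predicted = nearest_neighbor_survival((0.52, 0.25, 0.0), database)
print(predicted)  # median of the four nearest survival times: 67.0
```

Because the distance is unweighted, the choice of feature scaling decides how much each predictor contributes to the match. A different encoding would retrieve different neighbors.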
RESULTS: The results showed reasonable concordance between observed and predicted survival figures, although there was a large degree of spread. Classification of cases into less than or equal to 60 and > 60 months' survival showed a correct classification rate of 63%. For the prediction of survival time, the distribution of differences between observed and predicted survival times for the uncensored test cases had a median value of -5 months but also showed a wide dispersion of values. Correlation of observed and predicted survival times, while not reaching statistical significance at P < .05, did show a strong positive association.
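The two summary measures reported above can be computed as in the following sketch. It rests on assumptions the abstract does not settle: differences are taken as observed minus predicted, and censored cases enter the 60-month classification with their censored time. The function name and the toy values are hypothetical and do not reproduce the study's data.

```python
from statistics import median


def evaluate_predictions(observed, predicted, censored, cutoff=60):
    """Return (correct classification rate, median difference in months).

    Assumptions: a case is 'short survival' when its time is <= cutoff;
    differences are observed minus predicted, over uncensored cases only.
    """
    correct = sum(
        (obs <= cutoff) == (pred <= cutoff)
        for obs, pred in zip(observed, predicted)
    )
    rate = correct / len(observed)
    differences = [
        obs - pred
        for obs, pred, is_censored in zip(observed, predicted, censored)
        if not is_censored
    ]
    return rate, median(differences)


# Hypothetical toy values, in months.
observed = [70, 30, 62, 12, 60]
predicted_times = [67, 45, 50, 20, 61]
censored = [True, False, False, False, True]

rate, median_difference = evaluate_predictions(observed, predicted_times, censored)
print(rate, median_difference)  # 0.6 -8
```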
CONCLUSION: Case-based approaches to the prediction of survival times in cancer patients are important. The results of the current study illustrate the difficulties in applying this approach to survival data and highlight the complexity of patient information and the inability to accurately predict patient outcome on a small subset of clinicopathologic features. While extensive work needs to be carried out to improve prediction power, this study illustrates the potential for case-based analyses. The ability to retrieve feature-matched cases from hospital patient databases has clear, independent advantages in patient management, but the ability to provide reliable, targeted prognostic estimates on individual cases should be a common goal in medical research.