This investigation evaluated potential revisions to the Armed Services
Vocational Aptitude Battery (ASVAB). The data analyzed were collected
from trainees in 17 U.S. Air Force, Army, and Navy jobs as part of the
Joint-Services Enhanced Computer-Administered Test (ECAT) battery
validation study. Predictors included the trainees' preenlistment scores
for the 10 tests in the current ASVAB, plus the 9 experimental ECAT
battery tests. The criteria were measures of training performance. All
possible combinations of tests that (a) included the Word Knowledge and
Arithmetic Reasoning tests of the ASVAB and (b) could be administered
in a 134- to 164-min interval were evaluated with respect to 5 indexes
of test battery performance: criterion-related validity, classification
efficiency, and 3 types of subgroup differences (White vs. Black,
White vs. Hispanic, and male vs. female). The 5 indexes were calculated
for each of the 16,437 possible combinations of tests. The standard
deviations of the indexes across the combinations of tests showed that
(a) values on the validity index varied little, (b) values on the
classification efficiency index and on the White versus Black and White
versus Hispanic subgroup difference indexes varied moderately, and
(c) values on the male versus female difference index varied
substantially. The validity index of the combinations showed a moderate
correlation with the classification efficiency index and a nearly zero
correlation with the subgroup difference indexes. However, the
classification efficiency index showed
a small-to-moderate positive correlation with the subgroup difference
indexes. The subgroup difference indexes showed moderate-to-high
positive correlations with one another. Examinations of the top 20
combinations of tests identified by each index demonstrated that tests
that optimize one type of index usually do not optimize each of the
other indexes. In particular, trade-offs were observed between (a) the
maximization of validity (and classification efficiency) versus the
minimization of all 3 types of subgroup differences and (b) the
minimization of differences between Whites and Blacks (or between
Whites and Hispanics) versus the minimization of differences between
men and women. These
results suggest that no combination of the tests considered in this
investigation simultaneously optimizes all 5 test battery performance
indexes.
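
The enumeration step described above (every combination of tests that contains the two required tests and fits the administration-time window) can be illustrated with a minimal sketch. This is not the authors' procedure or code. The test names other than WK and AR, all administration times, and the function name `feasible_batteries` are hypothetical placeholders chosen only so the example runs; only the two required tests and the 134- to 164-min window come from the text.

```python
from itertools import combinations

# Hypothetical administration times in minutes. These are NOT the actual
# ASVAB or ECAT test times; they are placeholders for illustration.
TEST_MINUTES = {
    "WK": 11, "AR": 36,          # Word Knowledge, Arithmetic Reasoning
    "T3": 20, "T4": 25, "T5": 30, "T6": 15, "T7": 40,  # made-up tests
}
REQUIRED = ("WK", "AR")


def feasible_batteries(test_minutes, required, min_total, max_total):
    """Return (tests, total_minutes) for every subset of tests that
    contains all required tests and whose total time is in the window."""
    optional = sorted(t for t in test_minutes if t not in required)
    base = sum(test_minutes[t] for t in required)
    batteries = []
    # Try every subset size of the optional tests, from none to all.
    for k in range(len(optional) + 1):
        for extra in combinations(optional, k):
            total = base + sum(test_minutes[t] for t in extra)
            if min_total <= total <= max_total:
                batteries.append((tuple(required) + extra, total))
    return batteries


if __name__ == "__main__":
    for tests, total in feasible_batteries(TEST_MINUTES, REQUIRED, 134, 164):
        print(total, tests)
```

Each battery returned this way would then be scored on the 5 indexes; that scoring requires the study's data and is not sketched here.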