M. Forina et al., FEATURE-SELECTION AND VALIDATION OF SIMCA MODELS - A CASE-STUDY WITH A TYPICAL ITALIAN CHEESE (CORRECTED VERSION OF LA014), Analusis, 21(3), 1993, pp. 133-147
A strategy for building class models by means of SIMCA (soft independent modelling of class analogy) is suggested, to be applied in the case of a small number of objects and a large number of variables. The strategy uses both the customary procedure, based on the selection of variables with both high modelling and discrimination powers, and a novel procedure, in which Monte Carlo simulations are used to obtain the significance level of the experimental Fisher weights, so that only the relevant variables are selected, avoiding the use of noisy information. The validation of SIMCA models is performed by means of a leave-one-out procedure: many validation parameters are suggested to evaluate the accuracy of the models obtained. Data on a typical Italian cheese have been used to show the feature selection and the validation procedures. The significance of the validation parameters has been tested by comparing the results of the 'cheese categories' with those obtained from artificial and real data sets (the variety Versicolor of the iris flower, and categories of typical wines and olive oils). The models computed for the typical cheese are shown to be reliable.
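
The Monte Carlo step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of the simulation, so everything below is an assumption made for illustration: the two-class Fisher weight is taken as the squared difference of class means divided by the sum of class variances, the null distribution is simulated from standard normal data with the same class sizes, and the function names (`fisher_weight`, `null_fisher_weights`, `significance_level`, `select_variables`) and the 0.05 threshold are hypothetical, not taken from the paper.

```python
import random
from statistics import mean, variance


def fisher_weight(a, b):
    """Assumed definition: squared mean difference over summed class variances."""
    return (mean(a) - mean(b)) ** 2 / (variance(a) + variance(b))


def null_fisher_weights(n_a, n_b, n_sim=2000, seed=0):
    """Simulate Fisher weights for two classes drawn from the same distribution.

    Assumption: standard normal noise. The weight is unchanged by shifting or
    rescaling a variable, so the location and scale of the null data do not matter.
    """
    rng = random.Random(seed)
    weights = []
    for _ in range(n_sim):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_a)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_b)]
        weights.append(fisher_weight(a, b))
    return weights


def significance_level(observed, null_weights):
    """Fraction of simulated weights at least as large as the observed one.

    The +1 terms keep the estimate away from exactly zero.
    """
    exceed = sum(1 for w in null_weights if w >= observed)
    return (exceed + 1) / (len(null_weights) + 1)


def select_variables(class_a, class_b, alpha=0.05, n_sim=2000, seed=0):
    """Return indices of variables whose Fisher weight is significant at alpha.

    class_a and class_b are lists of objects, each object a list of values.
    """
    null = null_fisher_weights(len(class_a), len(class_b), n_sim, seed)
    selected = []
    for j in range(len(class_a[0])):
        fw = fisher_weight([row[j] for row in class_a],
                           [row[j] for row in class_b])
        if significance_level(fw, null) <= alpha:
            selected.append(j)
    return selected
```

With few objects per class, large Fisher weights can arise by chance, which is why the simulated null distribution uses the same class sizes as the experimental data. When many variables are screened, some will pass a fixed threshold by chance alone; how the paper handles this is not stated in the abstract.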