Classifying genetic resources by categorical and continuous variables

Citation
J. Franco et al., Classifying genetic resources by categorical and continuous variables, CROP SCI, 38(6), 1998, pp. 1688-1696
Number of citations
25
Subject categories
Agriculture/Agronomy
Journal title
CROP SCIENCE
ISSN journal
0011-183X
Volume
38
Issue
6
Year of publication
1998
Pages
1688 - 1696
Database
ISI
SICI code
0011-183X(199811/12)38:6<1688:CGRBCA>2.0.ZU;2-M
Abstract
Hierarchical and nonhierarchical clustering methods are used for classifying genetic resources. In hierarchical clustering methods, all variables (categorical and continuous) can be used to form the subpopulations (groups or clusters), but in standard nonhierarchical methods only the continuous variables are incorporated in the analysis. The Location model (LM) allows classifying individuals into homogeneous subpopulations by continuous and categorical variables. In practice, the multinomial variable of the LM that arises from the combination of all the categorical variables usually shows empty cells in some subpopulations with the consequence of not allowing estimation of cell means and within-cell variances and covariances. The main objectives of this study were (i) to develop the Modified Location model (MLM) that allows empty cells in some subpopulations under the assumption that the means and the variance-covariance matrices depend on a given subpopulation instead of on a specific cell, (ii) to show how to use the MLM in the context of two-stage clustering in which the Ward method is used to form the initial groups and the MLM is applied to those groups (Ward-MLM), and (iii) to show how to apply the Ward-MLM to three different data sets to study some of its features and to compare results with other methods. The two-stage clustering strategy of finding initial groups by the Ward method and then improving the composition of the groups by the MLM produces compact and well-separated groups with respect to all the variables (categorical and continuous) compared with classifications obtained with only categorical variables, with only continuous variables, and with the standard Location model.
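The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the function name `ward_mlm` and its parameters are hypothetical, the Ward step is run on the continuous variables alone with Euclidean distance, the refinement uses hard reassignment of each individual to its most probable group rather than a full likelihood-based fit, and a small ridge term is added to keep covariance matrices invertible. What the sketch does keep from the abstract is the MLM assumption that the mean vector and variance-covariance matrix belong to a group rather than to a cell, so a cell that is empty in some group only contributes a zero cell probability there.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.stats import multivariate_normal


def ward_mlm(y, cells, n_groups, n_iter=50, ridge=1e-6):
    """Illustrative two-stage clustering: Ward start, MLM-style refinement.

    y        : (n, p) array of continuous variables
    cells    : (n,) integer array; index of the multinomial cell obtained
               by combining all categorical variables
    n_groups : number of subpopulations requested from the Ward step
    ridge    : assumption, small diagonal term to stabilise covariances
    """
    n, p = y.shape
    n_cells = int(cells.max()) + 1

    # Stage 1: Ward clustering gives the initial groups (labels 0..k-1).
    tree = linkage(y, method="ward")
    labels = fcluster(tree, t=n_groups, criterion="maxclust") - 1

    # Stage 2: reassign individuals using group-level Gaussian parameters
    # and group-specific cell probabilities, until labels stop changing.
    for _ in range(n_iter):
        log_post = np.full((n, n_groups), -np.inf)
        for g in range(n_groups):
            members = labels == g
            if members.sum() <= p:
                continue  # too few individuals to estimate a covariance
            mu = y[members].mean(axis=0)
            sigma = np.atleast_2d(np.cov(y[members], rowvar=False))
            sigma = sigma + ridge * np.eye(p)
            # Cell probabilities within group g; empty cells get 0.
            counts = np.bincount(cells[members], minlength=n_cells)
            cell_prob = counts / members.sum()
            with np.errstate(divide="ignore"):
                log_cell = np.log(cell_prob[cells])
            log_post[:, g] = (
                np.log(members.mean())
                + log_cell
                + multivariate_normal.logpdf(y, mean=mu, cov=sigma)
            )
        new_labels = log_post.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

Calling `ward_mlm(y, cells, n_groups=2)` returns one integer group label per individual. Because the reassignment is hard, an individual can only move to a group that already contains its cell; a soft, probability-weighted update would relax this and may be closer to what the paper does.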