Bd. Harch et al., THE ANALYSIS OF LARGE-SCALE DATA TAKEN FROM THE WORLD GROUNDNUT (ARACHIS-HYPOGAEA L) GERMPLASM COLLECTION .1. 2-WAY QUANTITATIVE DATA, Euphytica, 95(1), 1997, pp. 27-38
Data associated with germplasm collections are typically large and
multivariate with a considerable number of descriptors measured on each
of many accessions. Pattern analysis methods of clustering and
ordination have been identified as techniques for statistically
evaluating the available diversity in germplasm data. While used in
many studies, the approaches have not dealt explicitly with the
computational consequences of large data sets (i.e. greater than 5000
accessions). To consider the application of these techniques to
germplasm evaluation data, 11328 accessions of groundnut (Arachis
hypogaea L) from the International Research Institute for the Semi-Arid
Tropics, Andhra Pradesh, India were examined. Data for nine
quantitative descriptors measured in the rainy and post-rainy growing
seasons were used. The ordination technique of principal component
analysis was used to reduce the dimensionality of the germplasm data.
The identification of phenotypically similar groups of accessions
within large scale data via the computationally intensive hierarchical
clustering techniques was not feasible and non-hierarchical techniques
had to be used. Finite mixture models that maximise the likelihood of
an accession belonging to a cluster were used to cluster the accessions
in this collection. The patterns of response for the different growing
seasons were found to be highly correlated. However, in relating the
results to passport and other characterisation and evaluation
descriptors, the observed patterns did not appear to be related to
taxonomy or any other well known characteristics of groundnut.
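
The mixture-model clustering described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes a multivariate analysis of nine descriptors, whereas this toy example fits a two-component Gaussian mixture to a single synthetic descriptor by expectation-maximisation (EM). The function names (normal_pdf, fit_mixture), the synthetic values, and the choice of two components are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_mixture(values, k=2, iterations=100):
    """Fit a k-component 1-D Gaussian mixture by EM (hypothetical helper)."""
    n = len(values)
    ordered = sorted(values)
    # Start the component means at evenly spaced quantiles of the data.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability that each value belongs to each component.
        resp = []
        for x in values:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate mixing weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var_j, 1e-6)  # keep variances strictly positive
    return weights, means, variances, resp


# Synthetic data only: two made-up groups of 200 "accessions" each,
# measured on one hypothetical quantitative descriptor.
random.seed(0)
data = [random.gauss(30.0, 3.0) for _ in range(200)]
data += [random.gauss(60.0, 4.0) for _ in range(200)]

weights, means, variances, resp = fit_mixture(data, k=2)

# Assign each accession to the component with the highest posterior probability.
labels = [max(range(2), key=lambda j: r[j]) for r in resp]
```

Each accession receives a posterior probability of membership in every component, and the hard cluster label is simply the most probable one. A sketch of this kind needs only one pass over the data per iteration and never builds the pairwise distance matrix that hierarchical clustering requires, which is consistent with the computational concern raised in the abstract.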