The self-organizing map (SOM) can classify documents by learning their interrelationships from its input data. The SOM input space derived from a document collection is generally high-dimensional. Because the computational complexity of the SOM increases in proportion to the dimension of its input space, high dimensionality lowers the efficiency not only of the initial learning process but also of subsequent retrieval and of relearning whenever the input data are updated.
A new method called the feature competitive algorithm (FCA) is proposed to overcome this problem. The FCA captures the most significant features that characterize the underlying interrelationships of the entities in the input space, forming a dimensionally reduced input space without excessive loss of essential information about those interrelationships. The proposed method was applied to a document collection consisting of 97 UNIX command manual pages to test its feasibility and effectiveness. The test results are encouraging. Several crucial issues concerning the FCA are also discussed.
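
To make the complexity claim concrete, the following is a minimal sketch of one SOM training step in plain Python. It is an illustration only and not the FCA, whose details are not given here. The function names (`best_matching_unit`, `train_step`, `distance_terms_per_step`) and the learning rate are assumptions made for this example, and the neighbourhood update of a full SOM is omitted for brevity. The point it shows is that the winner search touches every component of every unit's weight vector, so the work per input grows with the input dimension.

```python
def best_matching_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    best_index, best_dist = 0, float("inf")
    for i, w in enumerate(weights):
        # Squared Euclidean distance: one pass over all components of x.
        dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def train_step(weights, x, learning_rate=0.5):
    """Move the winning unit toward x and return its index.

    Assumption: only the winner is updated; a full SOM would also
    update the winner's neighbours on the map grid.
    """
    bmu = best_matching_unit(weights, x)
    weights[bmu] = [wi + learning_rate * (xi - wi)
                    for wi, xi in zip(weights[bmu], x)]
    return bmu


def distance_terms_per_step(n_units, dim):
    """Number of squared-difference terms evaluated in one winner search."""
    return n_units * dim
```

Under these assumptions, a map of 100 units over a 5000-term vocabulary evaluates 500000 difference terms per input document, while the same map over a 50-dimensional reduced space evaluates 5000, which is the kind of saving a reduced input space is meant to provide.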