An important issue in knowledge discovery in databases (KDD) is to
allow the discovered knowledge to be as close as possible to natural
language, both to satisfy user needs with tractability and to give KDD
systems robustness. To this end, this paper describes a new concept of
linguistic atoms with three digital characteristics: expected value Ex,
entropy En, and deviation D. The mathematical description effectively
integrates the fuzziness and randomness
of linguistic terms in a unified way. Based on this model, a method of
knowledge representation in KDD is developed which bridges the gap
between quantitative and qualitative knowledge. Mapping between
quantities and qualities becomes much easier and interchangeable. In order to
discover generalized knowledge from a database, we may use virtual
linguistic terms and cloud transforms for the auto-generation of concept
hierarchies for attributes. Predictive data mining with the cloud model
is given for implementation. This further illustrates the advantages
of this linguistic model in KDD. © 1998 Elsevier Science B.V. All
rights reserved.
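
To make the three digital characteristics concrete, the following is a minimal sketch of how a linguistic atom described by Ex, En, and D might be turned into quantitative samples. The abstract does not spell out the procedure, so the two-stage Gaussian sampling and the Gaussian membership formula used here are assumptions, and the function name `normal_cloud_drops` and its parameters are hypothetical.

```python
import math
import random


def normal_cloud_drops(ex, en, d, n, seed=None):
    """Generate n (value, membership) pairs for one linguistic atom.

    ex: expected value Ex, en: entropy En, d: deviation D.
    Assumed procedure (not taken from the abstract): perturb the entropy
    with a Gaussian of spread d, then draw the value from a Gaussian
    centred on ex with that perturbed spread.
    """
    rng = random.Random(seed)
    drops = []
    for _ in range(n):
        # Randomness of the term's spread: En' ~ N(En, D^2).
        en_prime = rng.gauss(en, d)
        if en_prime == 0.0:
            # Degenerate spread: the drop sits exactly on Ex.
            x, mu = ex, 1.0
        else:
            # Quantitative value: x ~ N(Ex, En'^2).
            x = rng.gauss(ex, abs(en_prime))
            # Fuzziness: degree to which x belongs to the term, in [0, 1].
            mu = math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))
        drops.append((x, mu))
    return drops
```

For example, a hypothetical term such as "about 25" could be sampled with `normal_cloud_drops(ex=25.0, en=3.0, d=0.3, n=1000, seed=0)`. Under these assumptions, a larger D makes the membership degrees scatter more widely around the Gaussian curve, while D = 0 reduces the drops to an ordinary Gaussian membership function.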