Base information content in organic formulas

Citation
D. J. Graham and D. V. Schacht, Base information content in organic formulas, J CHEM INF, 40(4), 2000, pp. 942-946
Number of citations
25
Subject categories
Chemistry
Journal title
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES
Journal ISSN
0095-2338
Volume
40
Issue
4
Year of publication
2000
Pages
942 - 946
Database
ISI
SICI code
0095-2338(200007/08)40:4<942:BICIOF>2.0.ZU;2-I
Abstract
Three questions are addressed concerning organic formulas at their most primitive level: (1) What is the information per atomic symbol? (2) What is the level of system redundancy? (3) How are high-information formulas distinguished from low-information ones? The results are simple yet interesting. Carbon chemistry embodies a code which is low in base information and high in redundancy, irrespective of database size. Moreover, code units associated with halocarbons, proteins, and polynucleotides are especially high in information. Low-information units are more often associated with simple alkanes, aromatics, and common functional groups. Overall, the work for this paper quantifies the base information content in organic formulas; this contributes to research on symbolic language, chemical information, and molecular diversity.
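Illustrative note
The abstract does not state how information and redundancy were computed. As a rough sketch only, and not necessarily the authors' method, the two quantities can be illustrated with the standard Shannon definitions: information per symbol H = -sum(p_i * log2(p_i)) and redundancy R = 1 - H / log2(N). The helper names (atom_counts, bits_per_symbol, redundancy), the choice to use atom frequencies within a single molecular formula, and the alphabet size N are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def atom_counts(formula):
    """Count atomic symbols in a simple molecular formula such as 'C2H6O'.

    Assumption: no parentheses, charges, or isotopes; only element
    symbols followed by optional integer counts.
    """
    counts = Counter()
    for symbol, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(n) if n else 1
    return counts


def bits_per_symbol(counts):
    """Shannon entropy (bits per atomic symbol) of the symbol frequencies."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(counts, alphabet_size):
    """Redundancy R = 1 - H / log2(N) for an assumed alphabet of N symbols.

    alphabet_size must be at least 2, otherwise log2(N) is zero.
    """
    h_max = math.log2(alphabet_size)
    return 1.0 - bits_per_symbol(counts) / h_max


if __name__ == "__main__":
    # Hypothetical inputs chosen for illustration, not data from the paper.
    for formula in ("CH4", "C2H6O", "CHClBrF"):
        counts = atom_counts(formula)
        print(formula, round(bits_per_symbol(counts), 3),
              round(redundancy(counts, alphabet_size=10), 3))
```

Under these assumptions, a formula dominated by one symbol (for example the hydrogens in an alkane) scores low in bits per symbol and high in redundancy, while a formula that spreads its atoms over several elements (for example a mixed halocarbon) scores higher in bits per symbol. This is consistent in direction with the abstract's summary, but it is only a toy calculation.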