One of the distinguishing criteria of the SWISS-PROT protein sequence data
bank is minimal redundancy. The introduction of TrEMBL as a supplementary
database ensured the comprehensiveness of SWISS-PROT and TrEMBL but
introduced some degree of redundancy. We developed a strategy to identify
and subsequently remove the redundancy present within and between
SWISS-PROT and TrEMBL.