A Unified Approach to Authorship Attribution and Verification

Citation
Xavier Puig, et al., A Unified Approach to Authorship Attribution and Verification, American Statistician, 70(3), 2016, pp. 232-242
Journal title
American Statistician
ISSN journal
0003-1305
Volume
70
Issue
3
Year of publication
2016
Pages
232-242
Database
ACNP
Abstract
In authorship attribution, one assigns texts from an unknown author to one of two or more candidate authors by comparing the disputed texts with texts known to have been written by the candidate authors. In authorship verification, one decides whether a text or a set of texts could have been written by a given author. These two problems are usually treated separately. By assuming an open-set classification framework for the attribution problem, which allows for the possibility that none of the candidate authors is the unknown author, the verification problem becomes a special case of the attribution problem. Here, both problems are posed as a formal Bayesian multinomial model selection problem and are given a closed-form solution that is tailored for categorical data, naturally incorporates text length and dependence in the analysis, and copes well with settings with a small number of training texts. The approach to authorship verification is illustrated by exploring whether a court ruling sentence could have been written by the judge who signs it, and the approach to authorship attribution is illustrated by revisiting the authorship attribution of the Federalist papers and through a small simulation study.
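
To make the idea of open-set Bayesian multinomial model selection concrete, the following is a minimal sketch and not the authors' actual model. It assumes a symmetric Dirichlet prior on category probabilities (for example, counts of function words), equal prior probabilities for all models, and independent word counts, so it ignores the dependence the paper accounts for. The function names, the `prior` parameter, and the "none of the candidates" label are hypothetical, introduced only for illustration.

```python
from math import lgamma, exp


def log_beta(alpha):
    """Log of the multivariate beta function B(alpha)."""
    return sum(lgamma(a) for a in alpha) - lgamma(sum(alpha))


def log_predictive(disputed, training, prior=1.0):
    """Log Dirichlet-multinomial predictive of the disputed counts.

    The multinomial coefficient is omitted because it is the same
    for every model and cancels when models are compared.
    """
    posterior = [prior + t for t in training]
    updated = [p + d for p, d in zip(posterior, disputed)]
    return log_beta(updated) - log_beta(posterior)


def posterior_model_probs(disputed, candidates, prior=1.0):
    """Posterior probability of each candidate author and of the
    open-set alternative, assuming equal prior model probabilities."""
    labels = list(candidates) + ["none of the candidates"]
    logs = [log_predictive(disputed, candidates[name], prior)
            for name in candidates]
    # Open-set model: an unseen author with no training texts,
    # so only the prior informs the category probabilities.
    logs.append(log_predictive(disputed, [0] * len(disputed), prior))
    # Normalise in a numerically stable way.
    top = max(logs)
    weights = [exp(v - top) for v in logs]
    total = sum(weights)
    return {label: w / total for label, w in zip(labels, weights)}


if __name__ == "__main__":
    # Made-up category counts, purely illustrative.
    known = {"A": [50, 10, 5], "B": [5, 10, 50]}
    print(posterior_model_probs([10, 2, 1], known))
```

With a single candidate author, the same function reduces to a verification check: the candidate is compared only against the "none of the candidates" alternative, which mirrors how verification arises as a special case of open-set attribution.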