The goal of this project is to develop a secure system for generating document signatures that support authorship identification and document provenance. A document signature will consist of two main components: a document profile and an author profile. The document profile will uniquely identify the document and support provenance determination, including information for finding documents that are closely related to the document at hand. The author profile will capture the characteristics of the document that make it possible to determine its authorship and/or to generate a descriptive profile of the author.

Document signatures can also be used as a biometric authentication tool. In addition to identifying the author of a document, document signatures can be designed to provide a profile of the author and to determine discriminative characteristics such as age, gender, native language, and the like. The use of document signatures as biometrics has numerous potential applications, including helping to build a prosecution case against an online abuser, settling disputes over the original creator of a document, identifying violators of online community standards (such as sock puppets in Wikipedia), and identifying the author of a written threat. Other applications are historical, such as disentangling the contributions of the different authors of a literary work.

