We will build an edition that permits us to view the text in two states:

  1. A diplomatic transcription. This is a direct transcription that records the document as is. To the best of our ability, we will preserve spelling, punctuation, word spacing, capitalization, abbreviations, line breaks, and any other formal aspect that we can represent in XML.
  2. A regularized reading text. This version regularizes the text so that it is easily readable for a twenty-first-century audience. Regularizations include the correction of misspellings, the standardization of word spacing, the normalization of inconsistent capitalization, and the expansion of abbreviations. We do not alter grammar or syntax unless the original seriously impedes our understanding of the text. When an editor does regularize punctuation or grammatical features, they must explain the reasons for this decision in the corresponding critical essay.

For each document in the collection, these two versions will be encoded in a single <text> element, using the <choice> element to layer the diplomatic transcription and the regularized reading within the same markup. For each text, we will also encode semantic features such as dates, people and groups of people, places, and titles of books, so that we can later experiment with options for analysis and display.
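
As an illustration of how the two states might be layered, the fragment below is a minimal sketch using standard TEI P5 elements. The sentence, the names, and the date are invented for the example and are not drawn from the collection; the specific element choices (for instance, <sic>/<corr> for misspellings and <orig>/<reg> for capitalization) are assumptions about how the policy above could be implemented, not a record of the project's final schema. The <teiHeader> is omitted for brevity.

```xml
<text xmlns="http://www.tei-c.org/ns/1.0">
  <body>
    <p>
      <!-- Invented sample sentence, not taken from the collection. -->
      <!-- <lb/> marks each line break of the source document. -->
      <lb/>On <date when="1862-03-04">4 <choice><abbr>Mar.</abbr><expan>March</expan></choice> 1862</date>
      I <choice><sic>recieved</sic><corr>received</corr></choice>
      <!-- Semantic features: a book title, a person, and a place. -->
      <lb/>a copy of <title>Walden</title> from <persName>Mary Smith</persName>
      in <placeName><choice><orig>boston</orig><reg>Boston</reg></choice></placeName>.
    </p>
  </body>
</text>
```

In a sketch like this, a diplomatic view would display the first child of each <choice> (<abbr>, <sic>, <orig>) and honor the <lb/> line breaks, while a reading view would display the second child (<expan>, <corr>, <reg>). Groups of people could plausibly be tagged with <orgName> or a typed <name>, though that choice is left open here.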