[Note: if you cut and paste elements from this page, make sure the quotation marks around attribute values are straight rather than curly before using them in oXygen.]

Distribution of pages:

Ali 1-49

TBA 50-99

John (completed) 100-149

TBA 150-200


General approach: 

We transcribe spelling, word separation, and punctuation as-is.

We transcribe capital letters according to standard usage: only at the beginning of a sentence, in proper names, and in words that are always capitalized.

Common elements for transcription:

<pb n="22"/> Page break. The value of the n attribute should be the page number, as indicated on the manuscript page.

<lb/> Line break.

<lb break="no"/> Line break where a word is broken across the end of the line.
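
As a rough illustration of how these three elements might combine, here is a short fragment. The sample text is invented for this example and is not taken from the manuscript. It assumes each <lb/> is placed at the start of the line it opens.

```xml
<!-- Hypothetical sample text, for illustration only -->
<pb n="22"/>
<lb/>we weighed anchor at day break and stood to the south
<lb break="no"/>ward with a fresh gale
```

Here "southward" is split across two manuscript lines, so the second line opens with <lb break="no"/> rather than a plain <lb/>.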

Anything superscript:

<hi rend="superscript">s</hi>

Some special characters have to be transcribed in XML with character references. These include:

° (the degree sign, common in nautical references). Must be represented as follows: &#176;

& (the ampersand, representing “and”). Must be represented as follows: &amp;
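
A line using both references might look like the following sketch. The wording is invented for illustration.

```xml
<!-- Hypothetical sample text, for illustration only -->
<lb/>in the latitude of 12&#176; North, with wind &amp; rain all night
```

When displayed, the two references appear as the degree sign and the ampersand.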

Author’s marginal notes:

<note type="authorial" place="marginLeft"></note>

<note type="authorial" place="marginRight"></note>

<note type="authorial" place="marginTop"></note>

<note type="authorial" place="marginBottom"></note>
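
For instance, a note in the left margin could be encoded along these lines. The text is invented, and placing the note directly after the line it sits beside is an assumption rather than a settled project rule.

```xml
<!-- Hypothetical sample text, for illustration only -->
<lb/>This day we saw land bearing west
<note type="authorial" place="marginLeft">Land seen</note>
```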

Any material crossed out:

<del type="strikeout"></del>

Any material that’s been written over:

<del type="overwritten"></del>

Any material that’s been added, with a caret indicating the point of insertion:

<add type="caret"></add>

Any material that’s been added, with the point of insertion not indicated (you’ll have to make an educated guess about where it goes):

<add type="no_caret"></add>
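
A minimal sketch of a deletion and the two kinds of addition in running text follows. The sample text is invented, and where the second addition goes is the sort of educated guess described above.

```xml
<!-- Hypothetical sample text, for illustration only -->
<lb/>we saw <del type="strikeout">three</del> <add type="caret">four</add> sail
<lb/>standing to the westward <add type="no_caret">of the island</add>
```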


Dates:

<date when="YYYY-MM-DD"></date>

as in

<date when="1670-06-04">June 4, 1670</date>

If you only have part of the date, such as the year, this is the format:

<date when="1670">1670</date>

And sometimes references to dates aren’t explicit, but we can still tag them, as in:

earlier <date when="1670">that same year</date>, we came upon…

Headings at top of pages. The pages typically have a page # and a date written across the top. Let’s encode this as follows:

<note type="authorial" place="marginTop"></note>
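
One possible way to fill this in is sketched below, using a made-up page number and date. Tagging the date with <date> inside the note is an assumption based on the date guidance on this page, not something the heading rule itself spells out.

```xml
<!-- Hypothetical page number and date, for illustration only -->
<pb n="22"/>
<note type="authorial" place="marginTop">22 <date when="1670-06-04">June 4, 1670</date></note>
```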




Modernizing spelling:


Resolving abbreviations without superscript letters:


Resolving abbreviations with superscript letters:

<choice><abbr><hi rend="superscript"></hi></abbr><expan></expan></choice>
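
As an example, the common early modern abbreviation "wch" (with "ch" raised) for "which" might be encoded as follows. Keeping the base letter inside <abbr> but outside <hi> is an assumption about how the pattern above is meant to be filled in.

```xml
<!-- Illustrative example; confirm against the manuscript before reusing -->
<choice><abbr>w<hi rend="superscript">ch</hi></abbr><expan>which</expan></choice>
```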