Structural elements

<pb n="1" facs="../images/ew_a1_342_001.jpg"/>

Page break. The value of the attribute n should be the page number. If there is no explicit pagination in the document, you can impose it here, numbering sequentially from 1. The value of the attribute facs should be the filename of the corresponding image. (See Naming Files.)

If you’re working with a document that has page numbers, please record them in your transcription as follows, but number your page breaks and images beginning with 1 (or 001), regardless of those numbers. In the following example, we see what this would look like in a document that has two pages labeled “53” in the original:

 <pb n="53" facs="../images/ew_d5_1687_053.jpg"/>
 <pb n="54" facs="../images/ew_a1_342_054.jpg"/>
 <pb n="55" facs="../images/ew_a1_342_055.jpg"/>

Note that we use <choice> to remove the original page numbers from the edition view, and we continue with the sequential numbering of the n attribute and the file names, ignoring the error in pagination in the original document.


Paragraph. <p></p> encloses a paragraph. I strongly recommend putting these elements on their own lines, as follows:

<p>
   This is the text of a paragraph.
</p>

If a paragraph is indented in the original, you can indicate that as follows:

<p rend="indent">

(The rend attribute is used here and elsewhere to describe how things appear in the original, not how they should be rendered in any eventual output of the XML document.)


Line break. <lb/> marks the place where a line break occurs in the transcription. When a word is divided across the line, we use the break attribute to indicate this:

<lb break="no"/>. 

For instance:

   This document has several<lb/>
   lines of text. Most lines end<lb/>
   neatly at the end of a word,<lb/>
   but this line quite stubborn<lb break="no"/>
   ly does not.<lb/>

If stubbornly were hyphenated in the original, you would not need to transcribe the hyphen here.


Section or division. <div></div> encloses a section or division of some sort within a document. You likely will only need this if your document contains internal headings of some sort (see next element below). A <div> may be nested inside another <div> when appropriate. For instance, this might be useful in a document containing sections (with headings) and below those, subsections (with subheadings).


Heading. <head></head> encloses a heading. Every heading must go inside its own <div>, except perhaps if you have only one heading at the top of a document (<body> seems to function like a <div> in this case).
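As a rough sketch of how nested sections and headings fit together, assuming a document with one section and one subsection (the heading and paragraph text are placeholders, not taken from any project document):

```xml
<div>
   <head>Section Heading</head>
   <p>
      Text of the section.
   </p>
   <div>
      <head>Subsection Heading</head>
      <p>
         Text of the subsection.
      </p>
   </div>
</div>
```

Each <head> sits directly inside the <div> it labels, and the subsection's <div> is nested inside the section's <div>.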


Column break. <cb/> marks the beginning of a column on a page that has more than one column. See the example on this page:

<figure><desc>Engraving on p. 1</desc><graphic url="pilloniere_001_001.jpg"/></figure>

Figures. <figure> can be used to mark the place where a graphic of some sort (image, drawing, etc.) appears on a page. The url attribute of <graphic> points to the image itself (you’ll need to crop your selection out of the larger archival scan). For instructions on how to name such a file, see Naming Files. You can use <desc> to provide a text description.



Tables. <table> can be used to represent information that is presented in tabular format. This is the basic structure:
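The structure can be sketched as follows, assuming the standard TEI <row> and <cell> elements; the cell contents are placeholders:

```xml
<table>
   <row>
      <cell>Row 1, column 1</cell>
      <cell>Row 1, column 2</cell>
   </row>
   <row>
      <cell>Row 2, column 1</cell>
      <cell>Row 2, column 2</cell>
   </row>
</table>
```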


(This would be for a table with two rows and two columns.)

<list rend="simple">

Lists. <list rend="simple">, combined with <head> and <item>, can be used to display a list. rend="simple" should (theoretically) cause the list to be displayed without any numbers or bullets. Here is an example:

<list rend="simple">
 <head rend="center">Officers</head>
 <item><name type="person"> Mrs. J.W. Ward </name>,<name role="administrator"> President </name></item>
 <item><name type="person"> Mrs. Phillis Witsell </name>,<name role="administrator"> First Vice President </name></item>
 <item><name type="person"> Mrs. A.J. Williams </name>,<name role="administrator"> Second Vice President </name></item>
 <item><name type="person"> Mrs. T.G. Freeland </name>,<name role="administrator"> Third Vice President </name></item>
 <item><name type="person"> Mrs. E.L. James </name>,<name role="administrator"> Financial Secretary </name></item>
 <item><name type="person"> Mrs. G.N. Griffin </name>,<name role="administrator"> Corresponding Secretary </name></item>
 <item><name type="person"> Miss Eartha M.M. White </name>,<name role="administrator"> City Organizer </name></item>
</list>


Representing formatting of original document

When we encode documents with TEI-XML, we are concerned more with content than appearance. Indeed, one of the benefits of using XML is that it separates content from how that content will ultimately be presented. That styling is generally done with XSLT, the stylesheet/transformation language for XML (like CSS is the stylesheet language for HTML). However, we are indeed interested in recording the appearance, or formatting, of the original document itself. This is why, after all, we are using <pb/>, <lb/> and other such structural elements. Here are a few others that may be useful:

<hi rend="center"></hi>

Center alignment. <hi rend="center"></hi> encloses centered text.

<hi rend="superscript"></hi> 

Superscript. <hi rend="superscript"></hi> encloses raised text.

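<hi rend="italic"></hi>

Italics. <hi rend="italic"></hi> encloses italicized text.

<hi rend="underline"></hi>

Underline. <hi rend="underline"></hi> encloses underlined text.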



 <head rend="case(allcaps)">Indigent Hospital Patients</head>

Capitalizations. If headings or labels of any sort appear in the document in all caps, please transcribe them using title case, and use the rend attribute to indicate the all-caps formatting in the original. (This text appears in the original as “INDIGENT HOSPITAL PATIENTS.”)


 <label rend="case(allcaps)">Laws and Rules</label>

(This appears in the original as “LAWS AND RULES”.)

If other material appears in all caps in the original, please use the following:

 <emph rend="case(allcaps)">


Elements for regularization


Abbreviations. <choice><abbr></abbr><expan></expan></choice> can be used to simultaneously record an abbreviation and provide its resolution, as in:
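A minimal sketch, assuming the abbreviation in question is “1st” (inferred from the superscript variant discussed next):

```xml
<choice><abbr>1st</abbr><expan>first</expan></choice>
```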


If the letters st in this example actually appeared in raised script in the original, we would document that as follows:

<choice><abbr>1<hi rend="superscript">st</hi></abbr><expan>first</expan></choice>

As a general rule, we will resolve all abbreviations.

Punctuation. We can also use a similar <choice> sequence to regularize punctuation, as in the following cases:

removing comma:
removing period:
adding comma:
changing comma to period:
adding period:
replacing semicolon with comma:
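The markup for these cases is not shown above. As a hedged sketch, assuming the project pairs TEI's <orig> (original reading) with <reg> (regularized reading) inside <choice>, two of the cases might look like this. The surrounding words are placeholders, and the project may prefer <sic> and <corr> instead:

```xml
<!-- changing comma to period -->
the end of the sentence<choice><orig>,</orig><reg>.</reg></choice>

<!-- adding comma (nothing in the original, so <orig> is empty) -->
apples<choice><orig/><reg>,</reg></choice> oranges and pears
```

The remaining cases would follow the same pattern, with the original mark in <orig> and the regularized mark (or nothing) in <reg>.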

Comments & Editorial Annotations

<!-- text here -->

Unseen comments. <!-- text here --> can be used to insert comments into your XML file. This is metadata that will not display in the output of your file. You can use comments of this sort to record doubts or questions you might want to follow up on later.

<note type="editorial"></note>

Editorial (seen) comments. <note type="editorial"></note> can be used to add a note of your own. This would go right after the word or phrase in question. Editorial comments include information about a person or book, an explanation of a place, or clarification of an idea.


<mentioned></mentioned> is used in your note to encode the word to which you are referring (the interface will render this in italics).

Here is an example:

<note type="editorial"><mentioned>Moosa</mentioned> was previously an alternative form in English of the Spanish <mentioned>Mosé</mentioned>... [source].</note>

Cite an academically rigorous source for your information, using Chicago Notes-Bibliography format. If you want to mark something for annotation but need to do research before writing the note, you can add a placeholder note of this sort:

<note type="editorial"><mentioned>Moosa</mentioned></note>

Special characters

Some characters have to be transcribed in XML with character references. These include:

° (the degree sign, as in 98° Fahrenheit) must be represented in your text as follows:


& (the ampersand, meaning “and”) must be represented as follows:
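The references themselves are not shown above. In standard XML, the degree sign can be written as the numeric character reference &#176; (or &#xB0; in hexadecimal), and the ampersand as the predefined entity &amp;. A brief sketch with placeholder text:

```xml
The temperature reached 98&#176; Fahrenheit.
bread &amp; butter
```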



Semantic elements

<date when="YYYY-MM-DD"></date> 

Dates. <date when="YYYY-MM-DD"></date> encloses a date, however it is articulated. You might see a standard situation like the following:

She was born on <date when="1970-04-01">April 1, 1970</date>.

If you have only the year, this would be the format:

<date when="1970">1970</date>

Only month and year would be:

<date when="1980-02">February 1980</date>

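A value with only month and day, such as when="04-01", won’t validate. Standard TEI does accept the W3C month-day form with two leading hyphens (when="--04-01"), so try that if all you have is April 1 and you don’t know the year. If the project schema rejects it, leave the date untagged.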

Sometimes references to dates aren’t explicit, but we can still tag them, as in:

Earlier <date when="1970">that same year</date>, her parents had moved to Florida.

<name type="person"></name>

Person. <name type="person"></name> encloses the proper name of a person, as in:

<name type="person">Nikolai Vitti</name>

This can also be used to mark common nouns or phrases that refer to a specific, identifiable person, as in:

The <name type="person">current superintendent</name> of the local public school system...

where current superintendent refers, for instance, to Nikolai Vitti.

<name type="person_group"></name>

People group. <name type="person_group"></name> encloses the name of a group of people, such as those of a given nationality or some other category. We typically, but not always, write this type of word with an initial capital. Here are some examples:

the <name type="person_group">Seminole</name> and <name type="person_group">Creek</name>


the <name type="person_group">British</name> and <name type="person_group">French</name>

<name type="place"></name>

Place. <name type="place"></name> encloses the proper names of places of any type. This includes buildings, streets, cities, states (and other political divisions), as well as geographical features like rivers, lakes, etc. For example:

<name type="place">Jacksonville</name>

This can also be used to mark common nouns or phrases that refer to specific, identifiable places, as in:

the level of toxicity in the <name type="place">river</name> has increased...

where river refers, for instance, to the St. John’s.

The subtype attribute can be used to provide a more specific category for a place, as in:

<name type="place" subtype="river">St. John's River</name>


<name type="place" subtype="city">Jacksonville</name>

Let’s handle specific street addresses as follows:

<name type="place" subtype="address">123 Main Street</name> 

For places that can be located on a map, you can include latitude and longitude as follows:

<name type="place">The Clara White Mission<location><geo>30.332632 -81.664020</geo></location></name>

To get this information, search for the place in Google Maps, right click on the location on the map and select “What’s here?”. A box will pop up showing lat. and long. If you aren’t able to copy the numbers from there, click on them, and they will appear in the search box at the left, from where you will be able to copy them. Tagging a specific geographical location is prudent when the text refers to a specific place that can be identified exactly (for example: someone’s home, Stonehenge, or the Cathedral of Notre Dame in Paris).

<title level="m"></title> 

Book title. <title level="m"></title> encloses the title of a monographic (“m”) work (a book, primarily).

<title level="a"></title> 

Article/Essay title. <title level="a"></title> encloses the title of an “analytic” (“a”) work (a chapter, a journal article, etc.).


Openings and Closings of Letters

These elements are both structural and semantic, so I’m putting them in their own section.

Here is an example of how to mark up the opening and closing of a letter:

 Dear <name type="person">Miss White</name> :<lb/>
 <hi rend="center">
 <choice><sic>your</sic><corr>Your</corr></choice> sister in Christ.<lb/>
 <name type="person">Sarah Best</name>.<lb/>

These elements do not go inside <p> or <head> elements.
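The wrapping elements are not visible in the example above. As a hedged sketch, assuming the standard TEI <opener>, <closer>, <salute>, and <signed> elements (whether the project uses exactly this nesting is an assumption), the same letter might be marked up like this:

```xml
<opener>
   <salute>Dear <name type="person">Miss White</name> :</salute><lb/>
</opener>
<!-- ... body of the letter, in <p> elements ... -->
<closer>
   <salute><hi rend="center"><choice><sic>your</sic><corr>Your</corr></choice> sister in Christ.</hi></salute><lb/>
   <signed><name type="person">Sarah Best</name>.</signed><lb/>
</closer>
```

Here <opener> and <closer> sit alongside the letter's paragraphs rather than inside them.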