Multi-language texts

2025-02-18

Use this markup where isolated foreign words appear in a text otherwise written in a single language, or where a text has significant parts in two or more languages (bilinguals, glosses, etc.).

TEI definition: ab ; EpiDoc-specific customization: ab
TEI definition: div ; EpiDoc-specific customization: div
TEI definition: foreign ; EpiDoc-specific customization: foreign
TEI definition: gloss ; EpiDoc-specific customization: gloss
TEI definition: seg ; EpiDoc-specific customization: seg
TEI definition: term ; EpiDoc-specific customization: term

1. Foreign words or phrases

The foreign element identifies a word or phrase as belonging to a language other than that of the surrounding text, either in the inscription text or in other parts of the edition. It is usually useful to apply the attribute xml:lang with a language code (see Languages and Scripts) to identify the language of the embedded word.

For the lost line 6 Aurigemma, loc. cit., suggests <foreign xml:lang="la">matris castrorum</foreign>

Transformation using the EpiDoc Reference stylesheets:

Default (Panciera) style: For the lost line 6 Aurigemma, loc. cit., suggests matris castrorum
London style: For the lost line 6 Aurigemma, loc. cit., suggests matris castrorum

(IRT: 21)
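
The example above comes from the commentary of an edition. The same element can be used inside the inscription text itself. A minimal sketch, with placeholder content standing in for a real inscription (the words and the choice of languages are assumptions made for illustration only):

```xml
<div type="edition" xml:lang="la">
<ab>
  <!-- placeholder content, not a real inscription -->
  <lb n="1"/>Latin prose <foreign xml:lang="grc">Greek word</foreign> Latin prose
</ab>
</div>
```

Here the default language is declared once on the edition division, and only the embedded word carries its own xml:lang.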

2. Multilingual text sections

Where sizeable passages of the inscription text are in different languages, it is advisable to choose one language as the default and to declare it with an xml:lang attribute on the <div type="edition">. Any section in a different language is then declared by adding an xml:lang attribute to the element containing that text (e.g. ab, lg, seg, or <div type="textpart"> when the language shift coincides with a semantic and physical break).

<div type="edition" xml:space="preserve" xml:lang="la">
<ab>
  <lb/><gap reason="lost"/>dium murum supra<gap reason="lost"/>
  <lb/><gap reason="lost"/>tribunicia potestate<gap reason="lost"/>
</ab>
<ab xml:lang="grc">
  <lb/><gap reason="lost"/>υς μόνος τὸν ναὸν <gap reason="lost"/>
  <lb/><gap reason="lost"/>δεων αὐτοκράτορ<gap reason="lost"/>
</ab>
</div>

(IRT: 481)

<div type="edition" xml:space="preserve" xml:lang="la">
<div type="textpart" n="a">
<ab>
  <lb/>Caecilius Diodorus <gap reason="lost"/>
  <lb/>Caesaris delubrum a<gap reason="lost"/>
</ab>
</div>
<div type="textpart" n="b" xml:lang="grc">
<ab>
  <lb/>Καικίλιος Διόδωρος ἅμα <gap reason="lost"/>
  <lb/>ἐκ τῶν ἰδίων εὔξατο θ<gap reason="lost"/>
</ab>
</div>
</div>

(IRT: 481)

<div type="edition" xml:space="preserve" xml:lang="grc">
<div type="textpart" n="a">
<ab>
  <milestone unit="face" n="a"/><lb n="1"/>Greek verse
  <lb n="2"/>Greek verse
  <lb n="3"/>Greek verse
  <lb n="4"/>Greek verse
  <milestone unit="face" n="b"/><lb n="5"/>Greek verse
  <lb n="6"/>Greek verse
</ab>
</div>
<div type="textpart" n="b" xml:lang="la">
<ab>
  <lb n="1"/>Latin prose
  <lb n="2"/>Latin prose
</ab>
</div>
</div>

In the following example from DHARMA, the inscription text is in Sanskrit, with one passage in Telugu:

<div type="edition" xml:lang="san-Latn">
<ab>
  <lb n="1"/>svasti śrīmatāṁ sakala-bhuvana-saṁstūyamāna-mānavya-sagotrānāṁ
<lb n="2"/>...
</ab>

<ab xml:lang="tel">
  <lb n="29"/>puṭṭi-nirugu saveraṁ Iruvadinālgu vuṭla-ni<lb n="30" break="no"/>ṇḍṟāyam padu-gaṇḍu padeḻ dumu tamulaṁmula-tūmeṇḍu
</ab>

<ab>
  <lb n="30"/>Asyopari na
<lb n="31"/>kenacid bā<space type="binding-hole"/>dhā karttavyā yaḥ karoti sa paṁca-mahā-pātaka-saṁyu<lb n="32" break="no"/>kto bhavati…
</ab>

</div>

(Source)
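
In the example above the Telugu passage appears to end partway through line 30, so the language shift does not line up with a line or block boundary. Where splitting the ab is not wanted, seg offers an alternative. A minimal sketch reusing the same text; this restructuring is an illustration only and is not the DHARMA project's own encoding:

```xml
<div type="edition" xml:lang="san-Latn">
<ab>
  <lb n="1"/>svasti śrīmatāṁ sakala-bhuvana-saṁstūyamāna-mānavya-sagotrānāṁ
  <lb n="2"/>...
  <!-- the Telugu passage is wrapped in seg instead of a separate ab -->
  <seg xml:lang="tel"><lb n="29"/>puṭṭi-nirugu saveraṁ Iruvadinālgu vuṭla-ni<lb n="30" break="no"/>ṇḍṟāyam padu-gaṇḍu padeḻ dumu tamulaṁmula-tūmeṇḍu</seg>
  Asyopari na
  <lb n="31"/>...
</ab>
</div>
```

Text outside the seg inherits the default language declared on the edition division, so only the embedded passage needs its own xml:lang.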

In the following example from DHARMA, the text is inscribed on two separate doorjambs, continuing from the bottom of one to the top of the other. The inscription begins with a series of verses in Sanskrit, followed by Khmer prose (starting at some point on the second doorjamb):

<div type="edition"
xml:lang="x-oldkhmer-Latn">
<lg n="1" met="vasantatilakā"
  xml:lang="san-Latn">
  <l n="a">
   <milestone unit="item" n="S"/>
   <label xml:lang="en">Southern Doorjamb</label>
   <lb n="S1"/>jejīyatāṁ vraja <seg met="-+--+-+=">
    <gap reason="lost" quantity="8"
     unit="character"/>
   </seg>
  </l>
  <l n="b">...</l>
</lg>

<lg n="17" met="vasantatilakā"
  xml:lang="san-Latn">
  <l n="a">
   <milestone unit="item" n="N"/>
   <label xml:lang="en">Northern Doorjamb</label>
   <lb n="N1"/>Agre-saraḥ prathita-puṇyavatāṁ sva-puṇyaiś</l>
  <l n="b">...</l>
</lg>

<ab>
  <lb n="N17"/>ta duk· śloka neḥ mratāñ· śrī Indrapaṇ<orig>d</orig>ita</ab>

</div>

(Source)

3. Glosses and dictionaries

A text made up of words or phrases in one language, with glosses or translations in one or more others, may be tagged inline with a series of term and gloss elements, each bearing an xml:lang attribute that specifies the language of the short phrase. Alternatively, if the specific and rich semantics of term and gloss are to be avoided, seg may be used to mark arbitrary spans of text as being in one language or another.

<ab>
<term xml:id="seq1" xml:lang="san-Brah">mahībhujām·</term>
<gloss target="#seq1" xml:lang="pyx">tg'am·ḥ d'iṁ tiṁ pmir·ḥ CV naḥ</gloss>
<term xml:id="seq2" xml:lang="san-Brah">°unnata</term>
<gloss target="#seq2" xml:lang="pyx">kd'ir·ṁ tra v'a kv'iṁ</gloss>
<term xml:id="seq3" xml:lang="san-Brah">porusa</term>
<gloss target="#seq3" xml:lang="pyx">°o saṁḥ pir·ṁ tg'a</gloss>
</ab>

In this example, from the Corpus of Pyu Inscriptions project, xml:id attributes on the Sanskrit terms and target attributes on the Pyu glosses are used to link each gloss with the term of which it is a translation or equivalent.
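
Where the semantics of term and gloss are not wanted, the same spans could be marked with seg instead. A minimal sketch reusing the first two pairs from the example above; dropping the links between the spans is an assumption about what such a project might prefer, not a recommendation:

```xml
<ab>
<!-- each seg records only the language of its span -->
<seg xml:lang="san-Brah">mahībhujām·</seg>
<seg xml:lang="pyx">tg'am·ḥ d'iṁ tiṁ pmir·ḥ CV naḥ</seg>
<seg xml:lang="san-Brah">°unnata</seg>
<seg xml:lang="pyx">kd'ir·ṁ tra v'a kv'iṁ</seg>
</ab>
```

This encoding identifies the language of every span but no longer states which span glosses which.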

Responsibility for this section

Charlotte Tupman, author
Gabriel Bodard, author
Arlo Griffiths, contributor
Marc Miyaki, contributor
Irene Vagionakis, contributor

EpiDoc version: 9.7

Date: 2025-02-18