Roman Phonetic Alphabet for English

 

 

Lyubomir Ivanov (Sofia), Valerie Yule (Mount Waverley, Australia)

 

 

(Paper published in Contrastive Linguistics, XXXII, 2007, 2, pp. 50-64)

 

 

 

Abstract

 

The present work describes the underlying principles of the 2002 Basic Roman Spelling of English [4] aimed at providing an alternative English orthography for international usage.  This has been developed by Lyubomir Ivanov, who introduces here a new construction for that system, and proposes a closely related Roman phonetic alphabet to be used for the pronunciation respelling of English without special characters or diacritics.  A comparison is made with Interspel [11], a system developed by Valerie Yule that attempts to maximize the advantages and remove the disadvantages of traditional spelling of English, to benefit learners, users and international communication.

 

 

1.     Re-Romanization

 

Re-Romanization is the replacement of an orthographic system that uses the Roman (Latin) alphabet for writing the words of a certain language by another writing system that is different yet based on the same alphabet.  Traditional orthographies are often modified by reform proposals when the spelling of a language is perceived as reflecting past stages of development rather than the present-day spoken language.  Traditional English orthography (Traditional Spelling of English, TS) is an obvious case of such outdated spelling, which according to some research impedes the acquisition of literacy and efficient reading and spelling [10].

 

Systems like the 2002 Basic Roman Spelling of English (BR) [4] seek to re-Romanize the spelling for the purposes of academic research and education in general, both for native and non-native English speakers.  In one such application, BR is adapted in the present work to serve as a phonetic alphabet for the pronunciation respelling of English.  This new phonetic alphabet has no diacritics, and unlike the International Phonetic Alphabet [2] has no additional special characters.

 

The BR orthographic system could be arrived at from various starting points.  One of these is the original construction [4], which makes use of an intermediary Cyrillic phonetic transcription of English words (possibly leaving the misleading impression that BR was somehow associated with or influenced by Bulgarian phonetics or orthography).  Here we shall present another construction of BR starting from scratch, which provides a better introduction and explanation of the system.

 

 

2.    Spelling Principles

 

Our first step is to formulate and substantiate the small number of basic principles that are inherent to the BR system.

 

2.1.  Strict Romanization Principle: Use the basic Latin alphabet, with no additional characters or diacritics.

 

Traditional English orthography essentially adheres to this principle too, with minor exceptions involving loanwords from other languages.  In the case of languages with Latin-based orthography employing diacritics, the mass practice of modern electronic communication (e-mail, instant messaging, short message service etc.) overwhelmingly uses not the true alphabets but corrupted versions stripped of all diacritical marks and special characters.  Therefore, it is only natural to keep this felicitous advantage of English spelling, namely that it is e-communication friendly.

 

2.2.  Consistency Principle: Use single-valued spelling, with no phoneme rendered by two or more graphemes.

 

In the orthodox English orthography one and the same phoneme is often spelled differently in different words.  For instance, the consistency principle is extensively violated by the multiple-valued rendering of the English phonemes

 

/æ/, /e/, /eɪ/, /ɪ/, /i:/, /ɒ/, /aɪ/, /aʊ/, /oʊ/, /ə/, /u:/, /ɜ:/

 

respectively in:

 

have, salmon;

red, jeopardy, says, guess;

paper, rain, way, eight, break;

big, damage, pretty, women, busy, myth, build, marriage;

feel, beach, shield, perceive, key, people;

tall, walk;

nine, try, high, tie, height, buy, bye, eye, aisle, sign;

out, now;

no, know, boat, soul, toe;

ago, anthem, awesome, iridium, mountain;

mood, soup, jewel, true, lose, fruit, through;

firm, fern, turn, worst, earth, err.

 

This list can easily be extended, as vowels, diphthongs, and consonants have multiple-valued grapheme presentation in numerous other instances.

 

In particular, the above examples demonstrate that the short vowels /ɪ/ and /ɒ/ are each presented in TS by more than one single letter, with as many as six single letters representing the former, as in ‘big’, ‘damage’, ‘pretty’, ‘women’, ‘busy’, and ‘myth’.
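
The consistency principle amounts to requiring that the relation from phonemes to graphemes be single-valued.  A minimal Python sketch of such a check follows.  It assumes phonemes are keyed by their IPA symbols and includes only the /ɪ/ data quoted above; the names TS_SPELLINGS, BR_SPELLINGS and violations are illustrative and do not come from the original paper.

```python
# Graphemes observed for a phoneme. The TS entry lists the six single letters
# cited in the text for /ɪ/ ('big', 'damage', 'pretty', 'women', 'busy', 'myth').
TS_SPELLINGS = {"ɪ": {"i", "a", "e", "o", "u", "y"}}

# Under a single-valued spelling such as BR, /ɪ/ has exactly one grapheme.
BR_SPELLINGS = {"ɪ": {"i"}}


def violations(spellings):
    """Return the phonemes rendered by two or more graphemes."""
    return sorted(p for p, graphemes in spellings.items() if len(graphemes) > 1)


print(violations(TS_SPELLINGS))  # /ɪ/ is reported
print(violations(BR_SPELLINGS))  # nothing is reported
```

Note that the check runs in one direction only: the principle forbids one phoneme having several graphemes, not one grapheme serving several phonemes.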

 

2.3.  Proportionality Principle: Spell short vowels by single letters; spell long vowels and diphthongs by digraphs.

 

Orthodox English orthography defies the proportionality principle e.g. by using

 

‘a’, ‘eigh’ and ‘aigh’ for /eɪ/ as in ‘paper’, ‘eight’ and ‘straight’;

‘e’ for /i:/ as in ‘delete’;

‘i’ for /aɪ/ as in ‘fine’;

‘o’ for /oʊ/ as in ‘no’;

‘u’ for /jʊ/ or /ju:/ as in ‘duty’ and ‘tune’;

‘y’ for /aɪ/ as in ‘by’;

‘ea’, ‘ai’, ‘ie’, ‘eo’, ‘ay’, ‘ue’ for /e/ as in ‘leather’, ‘said’, ‘friend’,

‘jeopardy’, ‘says’, ‘guess’;

‘oo’, ‘ou’ for /ʊ/ as in ‘book’, ‘should’;

‘ou’ and ‘oe’ for /ʌ/ as in ‘touch’, ‘does’;

‘ai’ and ‘ou’ for /ə/ as in ‘mountain’, ‘famous’; etc.

 

It is worth mentioning the uniform presentation by the same single letters of pairs of ‘short/long values’ of vowels, such as the pairs /æ/-/eɪ/; /e/-/i:/; /ɪ/-/aɪ/; /ɒ/-/oʊ/; and /ʌ/-/jʊ/ (or -/ju:/) in ‘national/nation’, ‘serenity/serene’, ‘finish/final’, ‘posture/pose’, and ‘study/student’.  These vowel alternations in stressed syllables have their origins in the Great Vowel Shift of Early Modern English, “the pivotal process of Modern English phonology” according to Chomsky and Halle [5].

 

Obviously, there are three possible ways of dealing with the Great Vowel Shift at orthography level.  The first one would be to disregard it and keep the pre-Shift (now dual) usage of the letters ‘a’, ‘e’, ‘i’, ‘o’, and ‘u’, thereby preserving the spelling uniformity in relevant word families (such as the word pairs above) at the expense of phonemicity.  That is precisely what the TS does.  A technical advantage of this approach is that the presentation of ‘long value’ vowels by single letters contributes to the brevity of TS, i.e. the lower grapheme to phoneme ratio in TS texts.

 

A second tactic – employed by Interspel [11] – would be to maintain the dual usage for the five single vowel letters, but indicate long vowels when necessary, especially for learners, with a diacritic, as in Interspel ‘national/nàtion’, ‘repetition/repèt’, ‘finish/fìnal’, ‘impotent/pòtent’, ‘study/stùdent’.  This approach, proposed for investigation by Valerie Yule, is discussed in greater detail below.

 

The third approach, prescribed by the proportionality principle and adopted by BR, would be to explicate the vowel shift by a corresponding spelling shift, to the effect of preserving phonemicity at the expense of the abovementioned uniformity of word families’ spelling.
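
The proportionality principle of 2.3 can be read as a constraint on grapheme length.  The Python sketch below checks a small table against it.  The table holds a few BR assignments given in Section 3 of the paper, while the names SHORT, LONG_OR_DIPHTHONG and is_proportional are assumptions made for illustration.

```python
# A few BR assignments taken from the paper (Section 3); not the full system.
SHORT = {"æ": "a", "e": "e", "ɪ": "i", "ɒ": "o", "ʊ": "u"}
LONG_OR_DIPHTHONG = {"i:": "iy", "u:": "uu", "eɪ": "ey", "aɪ": "ay", "oʊ": "ou"}


def is_proportional(short, long_or_diphthong):
    """Short vowels take one letter; long vowels and diphthongs take two."""
    return (all(len(g) == 1 for g in short.values())
            and all(len(g) == 2 for g in long_or_diphthong.values()))


# TS counter-examples from the text: 'ea' for /e/ ('leather'), 'a' for /eɪ/ ('paper').
ts_short = {"e": "ea"}
ts_long = {"eɪ": "a"}

print(is_proportional(SHORT, LONG_OR_DIPHTHONG))  # the BR sample passes
print(is_proportional(ts_short, ts_long))         # the TS sample fails
```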

 

2.4.  Context Freeness Principle: Spell diphthongs in accordance with the spelling of their components; spell long vowels either as diphthongs or by doubling the letters spelling the respective short vowels.

 

In other words, ‘xy’ renders /αβ/ if and only if ‘x’ renders /α/, and ‘y’ renders /β/.  That is, whenever some letters ‘x’, ‘y’ represent respectively the short vowels /α/ and /β/, then the digraph ‘xy’ represents the diphthong /αβ/; also, ‘xx’ represents the long vowel /α:/.  And conversely, whenever the digraph ‘xy’ represents the diphthong /αβ/, then ‘x’ and ‘y’ should represent the short vowels /α/ and /β/ respectively; and whenever ‘xx’ represents the long vowel /α:/, then ‘x’ should represent the short vowel /α/.  (We have used here ‘α’, ‘β’ to denote both short vowels and related diphthong components.  Sometimes the IPA notation uses slightly different shapes for that purpose, e.g. /ɪ/, /ə/, /ɒ/ but /i:/, /ɜ:/, /oʊ/, not /ɪ:/, /ə:/, /ɒʊ/, indicating that the respective pairs differ not only in length but also in quality.)  The principle stipulates that if we represent say /ɒ/ by ‘o’ (as in ‘not’) and /ʊ/ by ‘u’ (as in ‘put’), then we always do that including in the diphthong /oʊ/ which is always represented by ‘ou’ as in BR ‘nou’, ‘bout’, ‘soul’, ‘tou’ (TS ‘no’, ‘boat’, ‘soul’, ‘toe’).

 

Unlike certain other languages or dialects such as Estonian, Finnish, Dutch, German, Frisian or Lombard, traditional English orthography does not normally use double letters for the long vowels.  The digraphs ‘ee’ and ‘oo’ are exceptions, and they represent /i:/, /u:/ and /ʊ/ (as in ‘feel’, ‘mood’ and ‘book’), not /e:/ and /ɔ:/ as would have been the case if the context freeness principle explained above were applied.

 

In the case of digraphs used for diphthongs, TS violates the context freeness principle e.g. by the use of ‘ai’, ‘ea’ for /eɪ/ in ‘main’, ‘break’.  Indeed, under the context freeness principle, ‘ai’ representing /eɪ/ would imply that ‘a’ represents /e/, which is not the case.  Likewise, ‘ea’ representing /eɪ/ would imply that ‘a’ represents /ɪ/, which is not the case either.  Similarly, TS violates that principle by the use of ‘ie’, ‘ei’ for /aɪ/ in ‘tie’, ‘either’, because ‘i’ and ‘e’ do not represent /æ/ or /ʌ/; or by the use of ‘oa’, ‘oe’ for /oʊ/ in ‘boat’, ‘toe’, since ‘a’ and ‘e’ do not represent /ʊ/.
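
The ‘if and only if’ condition of 2.4 can be tested mechanically.  The following Python sketch does so for the /oʊ/ example and for two doubled long vowels.  It assumes that each long vowel or diphthong is paired with its short-vowel components while ignoring the differences in quality noted above; the COMPONENTS table and the function name is_context_free are illustrative only and are not part of the original paper.

```python
# Short-vowel letters as assigned in BR: 'o' for /ɒ/ ('not'), 'u' for /ʊ/ ('put').
SHORT_VOWEL_LETTER = {"æ": "a", "e": "e", "ɪ": "i", "ɒ": "o", "ʊ": "u"}

# Components of a long vowel or diphthong, written with short-vowel symbols.
# Illustrative subset; quality differences between the pairs are ignored.
COMPONENTS = {
    "oʊ": ("ɒ", "ʊ"),  # diphthong
    "ɔ:": ("ɒ", "ɒ"),  # long vowel as a doubled short vowel
    "u:": ("ʊ", "ʊ"),
}


def is_context_free(digraph, phoneme):
    """True if the digraph is exactly the spelling of the phoneme's components."""
    first, second = COMPONENTS[phoneme]
    return digraph == SHORT_VOWEL_LETTER[first] + SHORT_VOWEL_LETTER[second]


print(is_context_free("ou", "oʊ"))  # BR 'nou', 'bout'
print(is_context_free("oa", "oʊ"))  # TS 'boat'
print(is_context_free("oo", "u:"))  # TS 'mood'
print(is_context_free("oo", "ɔ:"))  # BR 'soo'
```

Under these assumptions the two BR spellings pass and the two TS spellings fail, in line with the argument above.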

 

2.5.  Universality Principle: Spell short vowels and consonants in a way that is common for the traditional orthography of most Romanized languages including English.

 

English spells consonants generally in a way common for the traditional orthography of most Romanized languages, with few exceptions like /ʃ/, /tʃ/ and /dʒ/.  Whereas most Romanized languages are likely to pronounce ‘a’, ‘e’, ‘i’, ‘o’, and ‘u’ as in ‘pasta’, ‘ballet’, ‘police’, ‘depot’ and ‘tabu’, the corresponding English short vowels are as in ‘cat’, ‘pet’, ‘big’, ‘fog’ and ‘put’.  The universality principle, however, refers to spelling, not pronunciation.  We shall discuss the spelling of English vowels in greater detail below, drawing a comparison between the orthographic systems of BR [4] and Interspel [11].

 

We apply the above five principles to build from scratch the re-Romanization system of Basic Roman Spelling of English (BR) [4].

 

 

3.    Building BR Orthography from Scratch

 

The Basic Roman Spelling aims at a reasonably precise approximation of Spoken English, for which purpose we use 48 phonemes comprising the set of 45 English phonemes from [2], plus two rhotic variant phonemes, plus the non-English consonant /ts/.  This system is not meant to serve some particular standard of English pronunciation, but rather to provide the means that could be used for the spelling of different varieties of English.

 

3.1.  Short vowels

 

In accordance with the strict Romanization, consistency, proportionality and universality principles, we represent the short vowels

 

/æ/, /e/, /ɪ/, /ɒ/, /ʊ/

by

a, e, i, o, u

 

as in BR ‘dam’, ‘net’, ‘big’, ‘hot’, ‘put’ (TS ‘dam’, ‘net’, ‘big’, ‘hot’, ‘put’) respectively.

 

The proportionality principle dictates that this same set of letters be used for the representation of the short vowels /ə/ and /ʌ/ as well.  In order to facilitate disambiguation we choose ‘a’ for /ə/, since /æ/ is more consistently reduced to /ə/ in unstressed syllables.  The vowels /e/, /ɪ/ and /ɒ/ are not reduced to /ə/ e.g. in /en’dʒɔɪ/, /’ɑ:tɪst/, and /ɒ’tɒrɪtɪ/, hence the representation of /ə/ by ‘e’, ‘i’ or ‘o’ would have increased ambiguity.  As for the possible representation of /ə/ by ‘u’, in view of the context freeness principle that choice would have created the ambiguity of both the diphthong /ʊə/ and the long vowel /u:/ being represented by ‘uu’, as in ‘tuu’ representing both /tʊə/ and /tu:/ (TS ‘tour’ and ‘too’).  No such ambiguity arises with the chosen representation of /ə/ by ‘a’ as there is no /aə/ diphthong in English.

 

Given the intermediate position of /ʌ/ between /æ/ and /ə/, and the already fixed representation of both /æ/ and /ə/ by ‘a’, this dictates that /ʌ/ be represented by ‘a’ as well.  Therefore, we represent the three short vowels

 

/æ/, /ʌ/ and /ə/

by

a

 

as in BR ‘hav’, ‘dast’, ‘ahed’ (TS ‘have’, ‘dust’, ‘ahead’).  Let us stress that while the BR spelling does not distinguish between these three short vowels, ‘a’ is still pronounced differently in these three words; namely, BR ‘hav’, ‘dast’, ‘ahed’ are pronounced /hæv/, /dʌst/ and /əhed/ respectively.  (The more precise RPA alphabet in Section 6 below has different presentations for /æ/, /ʌ/ and /ə/.)  BR spelling reflects the spoken language as the latter is, without hinting at, suggesting or advocating any pronunciation distortions whatsoever.  In the case of homographs such as the BR spelling ‘hat’ for both /hæt/ and /hʌt/ (TS ‘hat’ and ‘hut’), the relevant word and pronunciation would be differentiated from the context of the sentence.

 

3.2.  Consonants

 

In accordance with the strict Romanization, consistency and universality principles we represent the consonant sounds

 

/b/, /tʃ/, /d/, /ð/, /dʒ/, /f/, /g/, /h/, /x/, /k/, /l/, /m/, /n/,

/ŋ/, /p/, /r/, /s/, /ʃ/, /t/, /θ/, /ts/, /v/, /w/, /j/, /z/, /ʒ/

by

b, ch, d, d, dzh, f, g, h, h, k, l, m, n,

ng, p, r, s, sh, t, t, ts, v, u, y, z, zh

 

respectively.

 

The English dental fricative consonants /θ/ and /ð/ (as in ‘think’ and ‘this’) are somewhat problematic for many non-native English speakers who tend to pronounce /θ/ as /t/ or /s/, and /ð/ as /d/ or /z/.  (A similar merger of /θ/ with /f/, and /ð/ with /v/ occurs in native English varieties such as Cockney, Newfoundland English, African American English, and Liberian English, making inroads into Estuary English too.)  At the same time, while native English speakers tend to distinguish /θ/ and /ð/ in their spoken language, they write both with the same digraph, ‘th’, and increasingly do not care about the spoken distinction.  While we have opted to represent them in BR by ‘t’ and ‘d’ respectively, in Section 4 below the system is extended to differentiate between these consonants.

 

3.3.  Long vowels and diphthongs

 

The representation of long vowels and diphthongs is obtained by a straightforward application of the proportionality and context freeness principles.  In the case of long /ɪ/ we take ‘y’ instead of ‘i’ as a second letter, following the pattern of the diphthongs /aɪ/, /eɪ/ and /ɔɪ/.  Namely, we represent

 

/ɑ:/, /i:/, /ɔ:/, /u:/, /ɜ:/

by

aa, iy, oo, uu, aa

 

as in BR ‘faam’, ‘fiyl’, ‘soo’, ‘muud’, ‘baaning’ (TS ‘farm’, ‘feel’, ‘saw’, ‘mood’, ‘burning’), and

 

/aʊ/, /aɪ/, /eə/, /eɪ/, /ɪə/, /oʊ/, /ɔɪ/, /ʊə/

by

au, ay, ea, ey, ia, ou, oy, ua

 

as in BR ‘nau’, ‘tray’, ‘hea’, ‘wey’, ‘dia’, ‘lou’, ‘voys’, ‘pua’ (TS ‘now’, ‘try’, ‘hair’, ‘way’, ‘dear’, ‘low’, ‘voice’, ‘poor’) respectively.
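
The assignments of 3.1-3.3 can be collected into a single lookup table.  The Python sketch below does that and respells a word given as a sequence of phonemes.  It assumes the input is already segmented into IPA phoneme strings (no tokenizer is attempted), and the names BR and to_br are illustrative rather than taken from the original construction [4].

```python
# BR grapheme table as listed in Sections 3.1-3.3 (non-rhotic, 46 phonemes).
BR = {
    # short vowels
    "æ": "a", "ʌ": "a", "ə": "a", "e": "e", "ɪ": "i", "ɒ": "o", "ʊ": "u",
    # long vowels
    "ɑ:": "aa", "i:": "iy", "ɔ:": "oo", "u:": "uu", "ɜ:": "aa",
    # diphthongs
    "aʊ": "au", "aɪ": "ay", "eə": "ea", "eɪ": "ey",
    "ɪə": "ia", "oʊ": "ou", "ɔɪ": "oy", "ʊə": "ua",
    # consonants
    "b": "b", "tʃ": "ch", "d": "d", "ð": "d", "dʒ": "dzh", "f": "f",
    "g": "g", "h": "h", "x": "h", "k": "k", "l": "l", "m": "m", "n": "n",
    "ŋ": "ng", "p": "p", "r": "r", "s": "s", "ʃ": "sh", "t": "t", "θ": "t",
    "ts": "ts", "v": "v", "w": "u", "j": "y", "z": "z", "ʒ": "zh",
}


def to_br(phonemes):
    """Respell a pre-segmented sequence of IPA phonemes in BR."""
    return "".join(BR[p] for p in phonemes)


print(to_br(["h", "æ", "v"]))              # hav
print(to_br(["f", "i:", "l"]))             # fiyl
print(to_br(["b", "ɜ:", "n", "ɪ", "ŋ"]))   # baaning
print(to_br(["v", "ɔɪ", "s"]))             # voys
```

Because writing is a plain table lookup per phoneme, no neighbouring sound ever influences the choice of grapheme.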

 

The initial impression of BR spelling may well be one of ‘inner city talk’, unusual, and in any case un-TS, which of course it is.  BR spellings like ‘fiyl’, ‘soo’, ‘baaning’, ‘tray’, ‘hea’, ‘wey’, ‘dia’, and ‘lou’ (TS ‘feel’, ‘saw’, ‘burning’, ‘try’, ‘hair’, ‘way’, ‘dear’, and ‘low’) may well appear either too dissimilar to present TS forms or confusing in their similarity to contradictory TS conventions.  It should be stressed here that BR derives from Spoken English, and from traditional spelling patterns in the wider family of Romanized languages.  Furthermore, BR is self-contained; it neither derives from TS, nor is it designed with a view to a step-by-step transition from TS to some reformed English spelling.  Similarity to TS is sought at the basic level only, when choosing the representation of consonants and vowels as in the case of ‘sh’, ‘ch’, ‘y’ (and ‘j’, ‘w’ in Section 4 below).  Once that representation is fixed, then because of the inconsistent nature of TS the two orthographies could be expected to be confusingly contradictory in many cases.  We are not concerned about that, for BR is intended for independent usage rather than in combination with TS; texts in BR are certainly not supposed to be read as if written in TS.  For instance, once we fix ‘o’ for /ɒ/ as in BR ‘boks’ (TS ‘box’), then we use ‘oo’ for /ɔ:/ as in BR ‘soo’ (TS ‘saw’), not bothering that TS uses ‘oo’ for /ʊ/ or /u:/ instead, as in ‘book’ and ‘mood’.

 

The obtained BR system uses 22 Roman letters (the letters ‘j’, ‘q’, ‘w’ and ‘x’ are not used), with no special characters or diacritical marks.  The chosen representation of short vowels, together with the derivative representation of long vowels and diphthongs it entails, contributes most to shaping the characteristic features that distinguish BR from TS, Interspel, and other orthographic systems such as those discussed in Section 7 below.
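
The letter count can be verified directly from the graphemes listed in Section 3.  A short Python check follows; the variable names are illustrative.

```python
import string

# All BR graphemes from Sections 3.1-3.3.
BR_GRAPHEMES = (
    "a e i o u "                 # short vowels
    "aa iy oo uu "               # long vowels
    "au ay ea ey ia ou oy ua "   # diphthongs
    "b ch d dzh f g h k l m n ng p r s sh t ts v u y z zh"  # consonants
).split()

used = set("".join(BR_GRAPHEMES))
unused = sorted(set(string.ascii_lowercase) - used)

print(len(used), unused)  # 22 ['j', 'q', 'w', 'x']
```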

 

 

4.    Extensions and variants

 

4.1.  Rhotic variety

 

Rhotic dialects are accommodated by appending ‘r’ to the relevant non-rhotic graphemes, so that /ɚ/, /ɝ/ are rendered by ‘ar’ and ‘aar’ respectively [4], as in BR ‘tiycha’, ‘paasiyv’ becoming ‘tiychar’, ‘paarsiyv’ (TS ‘teacher’, ‘perceive’).

 

4.2.  Dental fricatives

 

Traditional English spelling represents both dental fricative consonants /θ/ and /ð/ by ‘th’ as in ‘think’ and ‘this’, and dialects may vary in which words are pronounced with which.  BR differentiates between the two, but spells /θ/ the same as /t/, and /ð/ the same as /d/.  Seeking to bring BR as near to one-to-one phoneme-grapheme correspondence as possible, one may consider a version of the system using ‘th’ for /θ/, and ‘dh’ for /ð/ [4], as in ‘think’, ‘dhis’ instead of BR ‘tink’, ‘dis’ (TS ‘think’, ‘this’).

 

4.3.  ‘w’ vs. ‘u’

 

The BR system could also be extended by spelling the consonant /w/ (as in ‘we’, ‘queen’) by ‘w’ instead of ‘u’ [4].  This extended system provides for a possible variant using ‘uw’ instead of ‘uu’ for /u:/ as in ‘muwd’ instead of BR ‘muud’ (TS ‘mood’), which however we do not take as standard.

 

4.4.  ‘j’ vs. ‘dzh’

 

The BR orthographic system could be further extended by using the grapheme ‘j’ instead of ‘dzh’ for /dʒ/, as in ‘joy’ instead of BR ‘dzhoy’ (TS ‘joy’) – a compact and traditional English spelling pattern [4].

 

We call Extended Basic Roman Spelling of English (EBR) the 24-letter system (the letters ‘q’ and ‘x’ are not used) obtained from BR by incorporating all the extensions given in 4.1-4.4.
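
The passage from BR to EBR can be pictured as a set of overrides laid on top of the BR table.  The Python sketch below shows this for the extensions 4.1-4.4.  Only the BR entries needed for the examples are included, the input is assumed to be pre-segmented IPA, and the names BR, EBR_OVERRIDES, EBR and spell are illustrative.

```python
# Subset of the BR table (Section 3) needed for the examples below.
BR = {
    "ð": "d", "θ": "t", "w": "u", "dʒ": "dzh",
    "ɪ": "i", "ə": "a", "ɜ:": "aa", "i:": "iy", "ɔɪ": "oy",
    "t": "t", "tʃ": "ch", "p": "p", "s": "s", "v": "v",
}

# Extensions 4.1-4.4: rhotic vowels, dental fricatives, 'w', and 'j'.
EBR_OVERRIDES = {
    "ɚ": "ar", "ɝ": "aar",   # 4.1
    "θ": "th", "ð": "dh",    # 4.2
    "w": "w",                # 4.3 (the 'uw' variant for /u:/ is not standard)
    "dʒ": "j",               # 4.4
}
EBR = {**BR, **EBR_OVERRIDES}


def spell(phonemes, table):
    """Respell a pre-segmented sequence of IPA phonemes with the given table."""
    return "".join(table[p] for p in phonemes)


print(spell(["ð", "ɪ", "s"], BR), spell(["ð", "ɪ", "s"], EBR))   # dis dhis
print(spell(["dʒ", "ɔɪ"], BR), spell(["dʒ", "ɔɪ"], EBR))         # dzhoy joy
print(spell(["t", "i:", "tʃ", "ə"], BR),
      spell(["t", "i:", "tʃ", "ɚ"], EBR))                        # tiycha tiychar
```

The rhotic case differs from the others in that the input phoneme itself changes (/ə/ to /ɚ/), not just its spelling.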

 

Therefore, we consider two systems here: the simpler BR, and the more elaborate EBR.  The former is the system originally introduced in [4], while the options for an extended system were discussed in that work too.  The better choice between these two systems would derive from one’s preference for the precision of the system, or for its simplicity in using the available Roman letters and letter combinations instead.

 

 

5.    Phonemicity

 

While the present approach is essentially phonemic, the introduced orthography falls short of establishing a one-to-one correspondence between phonemes and graphemes; hence it could be described as semi-phonemic at word level.  It indicates the approximate rather than precise pronunciation of individual words.  In addition to the homophones now receiving identical spelling, more homographs are created by the presentation of /æ/, /ʌ/ and /ə/ by one and the same letter.

 

At textual level however, readers could retrieve the relevant word from among several homographs by taking into account the context of the sentence.  Therefore, writing in BR is context-free, while reading is context-dependent.  This property may possibly allow for the automated conversion of texts from BR into traditional spelling.
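
The asymmetry between writing and reading can be illustrated by grouping a small lexicon by its BR spelling.  The Python sketch below is a toy example: the three-word lexicon, its phoneme segmentation, and the names to_br and homographs are assumptions made for illustration, and no attempt is made at the contextual disambiguation itself.

```python
from collections import defaultdict

# Subset of the BR table: /æ/, /ʌ/ and /ə/ all share the letter 'a'.
BR = {"h": "h", "t": "t", "d": "d", "s": "s",
      "æ": "a", "ʌ": "a", "ə": "a", "e": "e"}


def to_br(phonemes):
    """Writing is context-free: one table lookup per phoneme."""
    return "".join(BR[p] for p in phonemes)


# Toy lexicon: TS word -> pre-segmented pronunciation.
LEXICON = {
    "hat": ("h", "æ", "t"),
    "hut": ("h", "ʌ", "t"),
    "dust": ("d", "ʌ", "s", "t"),
}


def homographs(lexicon):
    """Return the BR spellings shared by two or more words of the lexicon."""
    by_spelling = defaultdict(list)
    for word, phonemes in lexicon.items():
        by_spelling[to_br(phonemes)].append(word)
    return {s: words for s, words in by_spelling.items() if len(words) > 1}


print(homographs(LEXICON))  # {'hat': ['hat', 'hut']}
```

A converter from BR back to TS would need such a candidate list for each spelling, plus some means of choosing among the candidates from the sentence context.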

 

 

6.    Roman Phonetic Alphabet for English

 

The BR orthography is close enough to one-to-one phoneme-grapheme correspondence, which makes it possible to engender one by means of a minor adaptation.  We start from the full extended system EBR, then add stress marks as appropriate for a transcription system, and use them both to indicate stress and to disambiguate homographs as follows.

 

As shown in the table below, we take the unstressed ‘a’ to represent /ə/, and use two primary stress marks ‘’’ and ‘”’, and two secondary stress marks ‘,’ and ‘,,’, with ‘a’ in syllables stressed by ‘’’ or ‘,’ representing /æ/, and ‘a’ in syllables stressed by ‘”’ or ‘,,’ representing /ʌ/.  This convention is extended to the two relevant long vowels /ɑ:/ and /ɜ:/ too, taking into account that the latter may occur in unstressed as well as stressed position.  Namely, we take ‘aa’ in syllables stressed by ‘’’ or ‘,’ to represent /ɑ:/, and ‘aa’ in syllables that are either unstressed or stressed by ‘”’ or ‘,,’ to represent /ɜ:/.  In short, the set of stress marks ‘’’, ‘,’ is used in the case of /æ/ and /ɑ:/, while the stress marks ‘”’, ‘,,’ are used in the case of /ʌ/ and /ɜ:/.  Stress marks are placed before the syllable concerned (i.e. not necessarily next to the relevant vowel as in the table below).
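
The stress-mark convention just described can be written as a small decision function.  The Python sketch below returns the IPA value of RPA ‘a’ or ‘aa’ for a given stress mark.  It assumes the marks are passed as the strings ’ and ” (primary) and , and ,, (secondary), with None standing for an unstressed syllable; the function name decode_a and the two set names are illustrative.

```python
from typing import Optional

FIRST_SET = {"’", ","}     # marks used for /æ/ and /ɑ:/
SECOND_SET = {"”", ",,"}   # marks used for /ʌ/ and /ɜ:/


def decode_a(grapheme: str, stress: Optional[str]) -> str:
    """Return the IPA value of RPA 'a' or 'aa' given the syllable's stress mark.

    `stress` is None for an unstressed syllable.
    """
    if stress is not None and stress not in FIRST_SET | SECOND_SET:
        raise ValueError(f"unknown stress mark: {stress!r}")
    if grapheme == "a":
        if stress is None:
            return "ə"
        return "æ" if stress in FIRST_SET else "ʌ"
    if grapheme == "aa":
        # Unstressed 'aa' falls together with the second set of marks.
        return "ɑ:" if stress in FIRST_SET else "ɜ:"
    raise ValueError(f"not an 'a' grapheme: {grapheme!r}")


print(decode_a("a", None), decode_a("a", "’"), decode_a("a", "”"))
print(decode_a("aa", ","), decode_a("aa", None), decode_a("aa", ",,"))
```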

 

For instance, the BR homographs ‘hat’ (IPA /hæt/, TS ‘hat’) and ‘hat’ (IPA /hʌt/, TS ‘hut’) are differentiated now to become RPA /