|Sinhala script (Sinhalese)|
Siṃhala Akṣara Mālāva
|Languages||Sinhala, Pali, Sanskrit|
Sinhala script (Sinhala: සිංහල අක්ෂර මාලාව; Siṃhala Akṣara Mālāva), also known as Sinhalese script, is a writing system used by the Sinhalese people and most Sri Lankans, in Sri Lanka and elsewhere, to write the Sinhala language as well as the liturgical languages Pali and Sanskrit. The Sinhalese Akṣara Mālāva, one of the Brahmic scripts, is a descendant of the ancient Indian Brahmi script and is closely related to the South Indian Kadamba alphabet.
Sinhala script is an abugida written from left to right. Sinhala letters are ordered into two sets. The core set of letters forms the śuddha siṃhala alphabet (Pure Sinhala), which is a subset of the miśra siṃhala alphabet (Mixed Sinhala).
Sinhala script is a Brahmi derivative. It was imported from Northern India around the 3rd century BCE, but was influenced at various stages by South Indian scripts, most evidently by the early Grantha script.
By the 9th century CE, literature written in Sinhala script had emerged and the script began to be used in other contexts. For instance, the Buddhist literature of the Theravada-Buddhists of Sri Lanka, written in Pali, used Sinhala script.
In 1736 the Dutch became the first to print with Sinhala type on the island. The resulting type followed the features of the native Sinhala script as practiced on palm leaves. The Dutch-created type was monolinear and geometric in fashion, with no separation between words in early documents. During the second half of the 19th century, in the colonial period, a new style of Sinhala letterforms emerged in opposition to the monolinear, geometric form: it was high-contrast in appearance, with strokes of varied thickness. This high-contrast type gradually replaced the monolinear type as the preferred style, and it is still preferred today for text typesetting in printed newspapers, books and magazines in Sri Lanka.
Sinhala script is an abugida written from left to right. The consonant is the basic unit of word construction: each consonant carries an inherent vowel (/a/), which can be changed with a vowel stroke. To represent other sounds, vowel strokes (diacritics called pili) are added before, after, above or below the base consonant. Most Sinhala letters are curlicues; straight lines are almost completely absent from the alphabet, and it does not have joining characters. This is because Sinhala used to be written on dried palm leaves, which would split along the veins when straight lines were written on them, so round shapes were preferred. Sinhala has no upper and lower cases.
Sinhala letters are ordered into two sets. The core set of letters forms the śuddha siṃhala alphabet (Pure Sinhala), which is a subset of the miśra siṃhala alphabet (Mixed Sinhala). This "pure" alphabet contains all the graphemes necessary to write Eḷu (classical Sinhala) as described in the classical grammar Sidatsan̆garā (1300 AD). This is why the set is also called Eḷu hōdiya ("Eḷu alphabet"). The definition of the two sets is thus a historical one. By pure coincidence, the phoneme inventory of present-day colloquial Sinhala is such that the śuddha alphabet once again suffices as a good representation of the sounds. All native phonemes of the Sinhala spoken today can be represented in śuddha, while miśra siṃhala is needed to render special Sanskrit and Pali sounds. This is most notably necessary for the graphemes for the Middle Indic phonemes that the Sinhala language lost during its history, such as aspirates.
Most phonemes of Sinhala can be represented by either a śuddha letter or a miśra letter, but normally only one of them is considered correct. This one-to-many mapping of phonemes onto graphemes is a frequent source of misspellings.
While a phoneme can be represented by more than one grapheme, each grapheme can be pronounced in only one way, with the exceptions of the inherent vowel, which can be either [a] (stressed) or [ə] (unstressed), and ව, where the consonant is either [v] or [w] depending on the word. This means that the actual pronunciation of a word is almost always clear from its orthographic form. Stress is almost always predictable; only words with [v] or [w] (which are both allophones of ව) and a very few other words need to be learnt individually.
In Sinhala the diacritics are called pili (vowel strokes). In their names, diga means "long", because the vowel is sounded for longer, and deka means "two", because the stroke is doubled when written.
|Using the consonant 'k' + 'vowel' as an example:|
|pilla||Name||Transliteration||Formation||Compound form||ISO 15919||IPA|
|?||Inherent /a/ (without any pili)||+ ?||?||ka||[k?]|
|?||Diga ædaya||+ ?||kǣ||[kæ:]|
|?||Diga ispilla||+ ?||kī||[ki:]|
|?||?||Pāpilla||+ ?||ku||[ku], [k?]|
|?||?||Diga pāpilla||+ ?||kū||[ku:]|
|?||? ? ?||Gæṭa sahita ælapilla||+ + ?||kru||[kru]|
|?||? ?||Gæṭa sahita ælapili deka||+ + ?||krū||[kru:]|
|?||Gayanukitta||Used in conjunction with kombuva for consonants.|
|?||Diga gayanukitta||Not in contemporary use|
|?||?||Kombuva saha halkirīma||+ ?||kē||[ke:]|
|?||Kombu deka||+ ?||kai||[k?j]|
|?||? ?||Kombuva saha ælapilla||+ ?||ko||[ko]|
|?||? ?||Kombuva saha halælapilla||+ ?||kō||[ko:]|
|?||?||Kombuva saha gayanukitta||+ ?||kau||[k]|
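The table rows above can be sketched in code as a consonant followed by one vowel sign. This is only an illustration: the code points for the vowel signs and for the consonant k are assumptions taken from the Unicode Sinhala chart rather than from this table, and `VOWEL_SIGNS` and `syllable` are hypothetical names.

```python
import unicodedata

KA = "\u0D9A"  # consonant k carrying the inherent /a/ (assumed code point)

# Vowel signs keyed by the transliteration of the syllable they form with k.
# The code points are assumptions from the Unicode Sinhala chart.
VOWEL_SIGNS = {
    "kǣ": "\u0DD1",   # diga ædaya
    "kī": "\u0DD3",   # diga ispilla
    "ku": "\u0DD4",   # pāpilla
    "kū": "\u0DD6",   # diga pāpilla
    "kē": "\u0DDA",   # kombuva saha halkirīma
    "kai": "\u0DDB",  # kombu deka
    "ko": "\u0DDC",   # kombuva saha ælapilla
    "kō": "\u0DDD",   # kombuva saha halælapilla
    "kau": "\u0DDE",  # kombuva saha gayanukitta
}


def syllable(translit: str) -> str:
    """Return k alone for 'ka', otherwise k followed by the matching vowel sign."""
    if translit == "ka":
        return KA
    return KA + VOWEL_SIGNS[translit]


# Print each syllable with its code points and the general category of each
# character (vowel signs are combining marks, category Mn or Mc).
for name in ["ka", *VOWEL_SIGNS]:
    text = syllable(name)
    points = " ".join(f"U+{ord(c):04X}" for c in text)
    categories = " ".join(unicodedata.category(c) for c in text)
    print(name, text, points, categories)
```

Even signs that are drawn on both sides of the consonant, such as the one for ko, are stored as a single code point after the consonant in this sketch.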
The anusvara (often called binduva, 'zero') is represented by one small circle, ං (U+0D82), and the visarga (technically part of the miśra alphabet) by two, ඃ (U+0D83). The inherent vowel can be removed by a special virama diacritic, the hal kirīma, which has two shapes depending on the consonant it attaches to. One shape is used for most letters, while the other is used for letters ending at the top left corner.
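A minimal sketch of how these marks combine with a base letter in encoded text. The anusvara and visarga code points are the ones given above; the hal kirīma is assumed to be the virama encoded at U+0DCA, and the helper names are hypothetical.

```python
ANUSVARA = "\u0D82"    # binduva: one small circle
VISARGA = "\u0D83"     # two small circles
HAL_KIRIMA = "\u0DCA"  # virama (assumed code point, encoded as al-lakuna)

KA = "\u0D9A"  # consonant k with inherent /a/ (assumed code point)


def suppress_vowel(consonant: str) -> str:
    """Remove the inherent vowel by appending the hal kirīma."""
    return consonant + HAL_KIRIMA


def with_anusvara(syllable: str) -> str:
    """Append the anusvara to a syllable."""
    return syllable + ANUSVARA


print(suppress_vowel(KA))  # bare k
print(with_anusvara(KA))   # k + inherent vowel + anusvara
```

The two shapes of the hal kirīma are a matter of font rendering; both are assumed here to share the same code point.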
The śuddha graphemes are the mainstay of Sinhala script and are used on an everyday basis. Every sequence of sounds in present-day Sinhala can be represented by these graphemes. Additionally, the śuddha set comprises graphemes for retroflex ⟨ḷ⟩ and ⟨ṇ⟩, which are no longer phonemic in modern Sinhala. These two letters were needed for the representation of Eḷu, but are now obsolete from a purely phonemic view. However, words which historically contain these two phonemes are still often written with the graphemes representing the retroflex sounds.
Vowels come in two shapes: independent and diacritic. The independent shape is used when a vowel does not follow a consonant, e.g. at the beginning of a word. The diacritic shape is used when a vowel follows a consonant. Depending on the vowel, the diacritic can attach at several places. The diacritic for ⟨i⟩ attaches above the consonant, the diacritic for ⟨u⟩ attaches below, the diacritic for ⟨ā⟩ follows it, and the diacritic for ⟨e⟩ precedes it. Finally, ⟨o⟩ is marked by the combination of a preceding ⟨e⟩ and a following ⟨ā⟩.
While ⟨a, e, i, o⟩ are regular, the diacritic for ⟨u⟩ takes a different shape according to the consonant it attaches to. The most common shape is the one used with the consonant ප (p). The k-shape is used for some consonants ending at the lower right corner: ක (k), ග (g) and ත (t), but not න (n) or හ (h). Combinations of ර (r) or ළ (ḷ) with ⟨u⟩ have idiosyncratic shapes.
The śuddha alphabet comprises 8 plosives, 2 fricatives, 2 affricates, 2 nasals, 2 liquids and 2 glides. Additionally, there are the two graphemes for the retroflex sounds /ɭ/ and /ɳ/, which are not phonemic in modern Sinhala, but which still form part of the set. These are shaded in the table.
The voiceless affricate (ච [tʃa]) is not included in the śuddha set by purists, since it does not occur in the main text of the Sidatsan̆garā. The Sidatsan̆garā does use it in examples, though, so this sound did exist in Eḷu. In any case, it is needed for the representation of modern Sinhala.
The basic shapes of these consonants carry an inherent /a/ unless this is replaced by another vowel or removed by the hal kirīma.
The prenasalized consonants resemble their plain counterparts. ⟨m̆b⟩ is made up of the left half of ⟨m⟩ and the right half of ⟨b⟩, while the other three are just like the grapheme for the plosive with a little stroke attached to their left. Vowel diacritics attach in the same way as they would to the corresponding plain plosive.
The miśra alphabet is a superset of śuddha. It adds letters for aspirates, retroflexes and sibilants, which are not phonemic in today's Sinhala but which are needed to represent non-native words, such as loanwords from Sanskrit, Pali or English. The use of the extra letters is mainly a question of prestige. From a purely phonemic point of view there is no benefit in using them, and they can be replaced by a śuddha letter (or a sequence of them) as follows: the miśra aspirates by their plain śuddha counterparts, the miśra retroflex liquids by the corresponding śuddha coronal liquids, and the sibilants by ⟨s⟩. ඤ (ñ) and ඥ (gn) cannot be represented by śuddha graphemes, but each is found in fewer than 10 words. ෆ fa can be represented by ප pa with a Latin ⟨f⟩ inscribed in the cup.
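The replacement rule for aspirates and sibilants can be sketched as a simple character mapping. This is a phonemic approximation only, not a spelling rule: it covers aspirates and sibilants and leaves every other letter untouched. The code points are assumptions taken from the Unicode Sinhala chart, and `MISRA_TO_SUDDHA` and `fold_to_suddha` are hypothetical names.

```python
# Aspirated and sibilant letters mapped to their plain counterparts.
# Code points are assumptions from the Unicode Sinhala chart.
MISRA_TO_SUDDHA = {
    "\u0D9B": "\u0D9A",  # kha -> ka
    "\u0D9D": "\u0D9C",  # gha -> ga
    "\u0DA1": "\u0DA0",  # cha -> ca
    "\u0DA3": "\u0DA2",  # jha -> ja
    "\u0DA8": "\u0DA7",  # retroflex ṭha -> ṭa
    "\u0DAA": "\u0DA9",  # retroflex ḍha -> ḍa
    "\u0DAE": "\u0DAD",  # tha -> ta
    "\u0DB0": "\u0DAF",  # dha -> da
    "\u0DB5": "\u0DB4",  # pha -> pa
    "\u0DB7": "\u0DB6",  # bha -> ba
    "\u0DC1": "\u0DC3",  # śa -> sa
    "\u0DC2": "\u0DC3",  # ṣa -> sa
}

_TABLE = str.maketrans(MISRA_TO_SUDDHA)


def fold_to_suddha(text: str) -> str:
    """Replace aspirates and sibilants with plain letters; leave the rest as is."""
    return text.translate(_TABLE)


print(fold_to_suddha("\u0D9B\u0DCF"))  # khā is folded to kā
```

Because only one spelling is normally considered correct, a folding like this is better suited to tasks such as fuzzy search than to rewriting text.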
|syllabic r||ඍ||0D8D||ṛ||[ur]||ෘ||0DD8||ṛ||[ru, ur]||ඎ||0D8E||ṝ||[ru:]||ෲ||0DF2||ṝ||[ru:, u:r]||syllabic r|
|syllabic l||ඏ||0D8F||ḷ||[li]||ෟ||0DDF||ḷ||[li]||ඐ||0D90||ḹ||[li:]||ෳ||0DF3||ḹ||[li:]||syllabic l|
There are six additional vocalic diacritics in the miśra alphabet. The two diphthongs are quite common, while the "syllabic" ṛ is much rarer, and the "syllabic" ḷ is all but obsolete. The latter two are almost exclusively found in loanwords from Sanskrit.
The miśra ⟨ṛ⟩ can also be written with śuddha ⟨r⟩+⟨u⟩ or ⟨u⟩+⟨r⟩, which corresponds to the actual pronunciation. The miśra syllabic ⟨ḷ⟩ is obsolete, but can be rendered by śuddha ⟨l⟩+⟨i⟩. Miśra ⟨au⟩ is rendered as śuddha ⟨awu⟩, miśra ⟨ai⟩ as śuddha ⟨ayi⟩.
Note that the retroflex consonant ළ and the syllabic vowel ඏ are both transliterated ⟨ḷ⟩. This is not very problematic, as the second one is extremely scarce.
|Extra miśra plosives|
|Other additional miśra graphemes|
|aspirate affricates||ඡ||0DA1||cha||[tʃa]||ඣ||0DA3||jha||[dʒa]||aspirate affricates|
|other||ඞ||0D9E||ṅa||[ŋa]||ෆ||0DC6||fa||[fa, ɸa, pa]||other|
|other||ඦ||0DA6||n̆ja||[ndʒa]||fප||n/a||fa||[fa, ɸa, pa]||other|
Certain combinations of graphemes trigger special ligatures. Special signs exist for a ර (r) following a consonant (an inverted arch underneath), a ර (r) preceding a consonant (a loop above) and a ය (y) following a consonant (half a ය on the right). Furthermore, very frequent combinations are often written in one stroke, like ddh, kv or kṣ. In this case, the first consonant is not marked with a hal kirīma. For example, the glyph for śrī is composed of the letter ś with a ligature indicating the r below and the vowel ī marked above.
Most other conjunct consonants are made with an explicit virama, called al-lakuna or hal kirīma, and the zero-width joiner; some of the more common ones are displayed in the following table. Although modern Sinhala sounds are not aspirated, aspiration is marked in the table where it was historically present, to highlight the differences in modern spelling. All of the combinations are encoded with the al-lakuna (U+0DCA) first, followed by the zero-width joiner (U+200D), except for touching letters, which have the zero-width joiner (U+200D) first, followed by the al-lakuna (U+0DCA). Touching letters were used in ancient scriptures but are not used in modern Sinhala.
Vowels may be attached to any of the ligatures formed. They attach to the rightmost part of the glyph, except for vowels that use the kombuva: there the kombuva is written before the ligature or cluster, and the remainder of the vowel, if any, is attached to the rightmost part. In the table below, appending "o" (kombuva saha ælapilla, kombuva with ælapilla) to the cluster "ky" /kja/ adds only a single code point, but adds two vowel strokes, one each to the left and right of the consonant cluster.
|/kja/||U+0D9A U+0DCA U+0DBA||?||U+0D9A U+0DCA U+200D U+0DBA||yansaya|
|/kjo/||U+0D9A U+0DCA U+0DBA U+0DDC||?||U+0D9A U+0DCA U+200D U+0DBA U+0DDC||yansaya|
|/ɡja/||U+0D9C U+0DCA U+0DBA||?||U+0D9C U+0DCA U+200D U+0DBA||yansaya|
|/kra/||U+0D9A U+0DCA U+0DBB||?||U+0D9A U+0DCA U+200D U+0DBB||rakāransaya|
|/ɡra/||U+0D9C U+0DCA U+0DBB||?||U+0D9C U+0DCA U+200D U+0DBB||rakāransaya|
|/rka/||U+0DBB U+0DCA U+0D9A||?||U+0DBB U+0DCA U+200D U+0D9A||rēpaya|
|/rɡa/||U+0DBB U+0DCA U+0D9C||?||U+0DBB U+0DCA U+200D U+0D9C||rēpaya|
|/kjra/||U+0D9A U+0DCA U+0DBA U+0DCA U+0DBB||?||U+0D9A U+0DCA U+200D U+0DBA U+0DCA U+200D U+0DBB||yansaya + rakāransaya|
|/ɡjra/||U+0D9C U+0DCA U+0DBA U+0DCA U+0DBB||?||U+0D9C U+0DCA U+200D U+0DBA U+0DCA U+200D U+0DBB||yansaya + rakāransaya|
|/rkja/||U+0DBB U+0DCA U+0D9A U+0DCA U+0DBA||?||U+0DBB U+0DCA U+200D U+0D9A U+0DCA U+200D U+0DBA||rēpaya + yansaya|
|/rɡja/||U+0DBB U+0DCA U+0D9C U+0DCA U+0DBA||?||U+0DBB U+0DCA U+200D U+0D9C U+0DCA U+200D U+0DBA||rēpaya + yansaya|
|/kva/||U+0D9A U+0DCA U+0DC0||?||U+0D9A U+0DCA U+200D U+0DC0||conjunct|
|/kʂa/||U+0D9A U+0DCA U+0DC2||?||U+0D9A U+0DCA U+200D U+0DC2||conjunct|
|/?d?a/||U+0D9C U+0DCA U+0DB0||?||U+0D9C U+0DCA U+200D U+0DB0||conjunct|
|/ʈʈʰa/||U+0DA7 U+0DCA U+0DA8||?||U+0DA7 U+0DCA U+200D U+0DA8||conjunct|
|/t?ta/||U+0DAD U+0DCA U+0DAE||?||U+0DAD U+0DCA U+200D U+0DAE||conjunct|
|/t?va/||U+0DAD U+0DCA U+0DC0||?||U+0DAD U+0DCA U+200D U+0DC0||conjunct|
|/d?da/||U+0DAF U+0DCA U+0DB0||?||U+0DAF U+0DCA U+200D U+0DB0||conjunct|
|/d?va/||U+0DAF U+0DCA U+0DC0||?||U+0DAF U+0DCA U+200D U+0DC0||conjunct|
|/nd?a/||U+0DB1 U+0DCA U+0DAF||?||U+0DB1 U+0DCA U+200D U+0DAF||conjunct|
|/nda/||U+0DB1 U+0DCA U+0DB0||?||U+0DB1 U+0DCA U+200D U+0DB0||conjunct|
|/mma/||U+0DB8 U+0DCA U+0DB8||?||U+0DB8 U+200D U+0DCA U+0DB8||touching|
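The encoding order described above can be sketched as three small helpers: a plain sequence with a visible al-lakuna, a conjunct (al-lakuna then zero-width joiner) and a touching pair (zero-width joiner then al-lakuna). The function names are hypothetical, and whether a ligature is actually drawn depends on the font and rendering engine.

```python
AL_LAKUNA = "\u0DCA"  # virama (hal kirīma)
ZWJ = "\u200D"        # zero-width joiner

KA, YA, RA, MA = "\u0D9A", "\u0DBA", "\u0DBB", "\u0DB8"


def explicit(*consonants: str) -> str:
    """Join consonants with a visible al-lakuna and no joiner."""
    return AL_LAKUNA.join(consonants)


def conjunct(*consonants: str) -> str:
    """Join consonants as a conjunct: al-lakuna first, then ZWJ."""
    return (AL_LAKUNA + ZWJ).join(consonants)


def touching(*consonants: str) -> str:
    """Join consonants as touching letters: ZWJ first, then al-lakuna."""
    return (ZWJ + AL_LAKUNA).join(consonants)


def code_points(text: str) -> str:
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in text)


print(code_points(conjunct(KA, YA)))      # the /kja/ row
print(code_points(conjunct(KA, YA, RA)))  # the /kjra/ row
print(code_points(touching(MA, MA)))      # the /mma/ row
```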
The Sinhala śuddha graphemes are named in a uniform way by adding -yanna to the sound produced by the letter, including vocalic diacritics. The name for the letter අ is thus ayanna, for the letter ආ āyanna, for the letter ක kayanna, for the letter කා kāyanna, for the letter කෙ keyanna and so forth. For letters with hal kirīma, an epenthetic a is added for easier pronunciation: the name for the letter ක් is akyanna. Another naming convention is to use al- before a letter with a suppressed vowel, thus alkayanna.
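The naming pattern can be expressed as a few string rules over the romanized sound. This is only a sketch of the conventions described above; the function names and the `style` parameter are hypothetical.

```python
def letter_name(sound: str) -> str:
    """Name a letter by appending -yanna to its sound, e.g. 'ka' -> 'kayanna'."""
    return sound + "yanna"


def hal_letter_name(consonant: str, style: str = "epenthetic") -> str:
    """Name a consonant whose inherent vowel is suppressed.

    'epenthetic' prefixes an a to the bare consonant (k -> akyanna);
    'al' prefixes al- to the ordinary letter name (k -> alkayanna).
    """
    if style == "epenthetic":
        return "a" + consonant + "yanna"
    if style == "al":
        return "al" + letter_name(consonant + "a")
    raise ValueError(f"unknown style: {style}")


print(letter_name("ke"))           # keyanna
print(hal_letter_name("k"))        # akyanna
print(hal_letter_name("k", "al"))  # alkayanna
```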
Since the extra miśra letters are phonetically indistinguishable from the śuddha letters, proceeding in the same way would lead to confusion. Names of miśra letters are normally made up of the names of two śuddha letters pronounced as one word. The first one indicates the shape, the second one the sound. For example, the aspirated ඛ (kh) is called bayanu kayanna: kayanna indicates the sound, while bayanu indicates the shape, since ඛ (kh) is similar in shape to බ (b) (bayanu = like bayanna). Another method is to qualify the miśra aspirates by mahāprāna (ඛ: mahāprāna kayanna) and the miśra retroflexes by mūrdhaja (ළ: mūrdhaja layanna).
Each Sinhala letter has a specific stroke order and method of writing.
Sinhala Illakkam were used for writing numbers prior to the fall of the Kandyan Kingdom in 1815. These digits did not include a zero; instead, there were signs for 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 and 1000. These digits and numbers can be seen primarily in royal documents and artefacts.
Prior to the fall of the Kandyan Kingdom, all calculations were carried out using Lith digits. After the fall of the Kandyan Kingdom, Sinhala Lith Illakkam were used primarily for writing horoscopes, although there is evidence that they were also used for other purposes, such as writing page numbers. The tradition of writing the degrees and minutes of zodiac signs in horoscopes with different versions of Lith digits continued into the 20th century. Unlike the Sinhala Illakkam, Sinhala Lith Illakkam included a 0.
Layman's transliterations in Sri Lanka normally follow neither of these. Vowels are transliterated according to English spelling equivalences, which can yield a variety of spellings for a number of phonemes: /i:/, for instance, can be ⟨ee⟩, ⟨e⟩, ⟨ea⟩, ⟨i⟩, etc. A transliteration pattern peculiar to Sinhala, and facilitated by the absence of phonemic aspirates, is the use of ⟨th⟩ for the voiceless dental plosive and of ⟨t⟩ for the voiceless retroflex plosive. This is presumably because the retroflex plosive /ʈ/ is perceived as the same as the English alveolar plosive /t/, and the Sinhala dental plosive /t̪/ is equated with the English voiceless dental fricative /θ/. Dental and retroflex voiced plosives are always rendered as ⟨d⟩, though, presumably because ⟨dh⟩ is not found as a representation of /ð/ in English orthography.
Many of the oldest manuscripts in the Pali language are written in the Sinhala script. Miśra consonants are used to represent Pali phonemes that have no Sinhala counterpart. The following table lays out the Sinhala representations of Pali consonants with their standard academic Romanizations:
|velar||ක (ka)||ඛ (kha)||ග (ga)||ඝ (gha)||ඞ (ṅa)|
|palatal||ච (ca)||ඡ (cha)||ජ (ja)||ඣ (jha)||ඤ (ña)|
|retroflex||ට (ṭa)||ඨ (ṭha)||ඩ (ḍa)||ඪ (ḍha)||ණ (ṇa)|
|dental||ත (ta)||ථ (tha)||ද (da)||ධ (dha)||න (na)|
|labial||ප (pa)||ඵ (pha)||බ (ba)||භ (bha)||ම (ma)|
|unordered||ය (ya)||ර (ra)||ල (la)||ව (va)||ස (sa)||හ (ha)||ළ (ḷa)|
The vowels are a subset of those for writing Sinhala:
(on ? ka)
The niggahīta is represented with the sign ං. Consonant sequences may be combined in ligatures in a manner identical to that described above for Sinhala.
As an example, below is the first verse from the Dhammapada in Pali, given in Romanization:
Manopubbaṅgamā dhammā, manoseṭṭhā manomayā;
manasā ce paduṭṭhena bhāsati vā karoti vā;
tato naṃ dukkhamanveti cakkaṃva vahato padaṃ.
-- Yamaka-vaggo 1
Sinhala is one of the Brahmic scripts, and thus shares many similarities with other members of the family, such as the Kannada, Malayalam, Telugu and Tamil scripts and Devanāgarī. As a general example, /a/ is the inherent vowel in all these scripts. Other similarities include the diacritic for ⟨ai⟩, which resembles a doubled ⟨e⟩ in all scripts, and the diacritic for ⟨au⟩, which is composed of a preceding ⟨e⟩ and a following ⟨ḷ⟩.
Likewise, the combination of the diacritics for ⟨e⟩ and ⟨ā⟩ yields ⟨o⟩ in all these scripts.
The Sinhala alphabet differs from other Indo-Aryan alphabets in that it contains a pair of vowel sounds (U+0DD0 and U+0DD1 in the proposed Unicode Standard) that are unique to it. They are similar to the vowel sounds at the beginning of the English words at (æ) and ant (ǣ).
Another feature that distinguishes Sinhala from its sister Indo-Aryan languages is the presence of a set of five nasal sounds known as half-nasal or prenasalized stops.
Generally speaking, Sinhala support is less developed than support for Devanāgarī, for instance. A recurring problem is the rendering of diacritics which precede the consonant, and of diacritic signs which come in different shapes, like the one for ⟨u⟩.
Sinhala support does not come built in with Windows XP, unlike Tamil and Hindi. However, all versions of Windows Vista and Windows 10 come with Sinhala support by default, and do not require external fonts to be installed to read Sinhala script. Nirmala UI is the default Sinhala font in Windows 10. The newest version of Windows 10 has added support for Sinhala Archaic Numbers, which were not supported by default in the previous version.
For Linux, the IBus and SCIM input methods allow the use of Sinhala script in applications, with support for a number of key maps and techniques, such as traditional, phonetic and assisted techniques. In addition, newer versions of the Android mobile operating system also support both rendering and input of Sinhala script.
The main Unicode block for Sinhala is U+0D80-U+0DFF. Another block, Sinhala Archaic Numbers, was added to Unicode in version 7.0.0 in June 2014. Its range is U+111E0-U+111FF.
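As a small illustration, the two ranges above are enough to classify a character by block. The function name and the returned labels are hypothetical; only the ranges come from the text.

```python
from typing import Optional

SINHALA = range(0x0D80, 0x0DFF + 1)                    # main Sinhala block
SINHALA_ARCHAIC_NUMBERS = range(0x111E0, 0x111FF + 1)  # added in Unicode 7.0


def sinhala_block(ch: str) -> Optional[str]:
    """Return the name of the Sinhala block containing ch, or None."""
    cp = ord(ch)
    if cp in SINHALA:
        return "Sinhala"
    if cp in SINHALA_ARCHAIC_NUMBERS:
        return "Sinhala Archaic Numbers"
    return None


print(sinhala_block("\u0D9A"))      # Sinhala
print(sinhala_block("\U000111E1"))  # Sinhala Archaic Numbers
print(sinhala_block("a"))           # None
```

A range check says nothing about whether a code point inside the block is actually assigned; that would need a lookup such as `unicodedata.name`.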