Amitabha Chatterjee, in Elements of Information Organization and Dissemination, 2017

X.6.2 Transliteration

Transliteration means representation of words and phrases of one language by the alphabets of another keeping their pronunciation intact. Transliteration is required in documentation when the documents being processed and listed are in different languages. The problem of transliteration is quite serious in India where documents are produced in 15 languages and several dialects. Unless a standard pattern of transliteration is followed, there is always a chance of misplacement and hiding of entries. ISO was the first organization to bring out a standard in this field in 1955, viz., ISO/R-9 International System for the Transliteration of Cyrillic Characters, which was revised in 1968. It has so far brought out 16 standards for transliteration of different types of characters, including ISO 15919: 2001 Transliteration of Devanagari and Related Indian Scripts into Latin Characters. A British standard on the subject, BS 2978, was formulated in 1958. BIS has formulated IS 15341 Transliteration of the Indian Scripts to the Roman Script in 2003.

FIGURE 13.9. Entering “Qur'an” in Arabic and automatic prediction of the hamza diacritic.

Thus, all characters to be entered are assigned on the standard full keyboard. The keyboard layouts reflect the legacy of typewriters. There is no systematic phonetic similarity between Arabic/Hebrew and Roman characters on the same keys, nor does there seem to be any coordination between Hebrew and Arabic keyboard layouts. The assignment policy differs for Arabic and Hebrew, however, due to differences in how many allographs are used and the character form variations of each script. Many allographs are used in Arabic because each of the 28 consonantal characters has four different forms (initial, medial, final, and isolated), so the total number of character forms including diacritics exceeds 100. Therefore, in Arabic, the user enters consonantal characters ambiguously, without entering information on which form the character takes. The form will be automatically disambiguated by the text entry software.

Sornlertlamvanich (2001a, b) proposed an input method for Thai text using Roman transliteration. This gets around the problem of the large number of Thai characters as well as the variations in keypad layout. In order to implement “one click for one character,” the method has to deal with two types of ambiguity. First, the appropriate sequence of characters in the Roman transliterated form has to be selected, for which the character n-gram model of the Roman transliterated form is taken into account. Second, the appropriate word based on its pronunciation has to be selected, for which the word n-gram model of the Thai text is taken into account. Due to the nonstandard Roman transliteration of Thai words, all possible spellings are permitted in order to yield the same or similar pronunciation of the original word. For example, “kan,” “karn,” “khan,” and “kharn” can be interpreted as the same pronunciation. Figure 12.8 shows the possible words for the input sequence of “222.” As a result, the word “ ” is selected with “ ”, “ ”, “ ”, and “ ” listed as the alternative candidates.
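The two-stage disambiguation described for the Thai keypad input method can be illustrated with a small Python sketch. It is not the published implementation. It assumes a standard phone keypad letter layout and uses tiny hand-made probability tables, opaque placeholder words (`TH_WORD_A` and so on), and a word unigram score in place of a full word n-gram model. All names and numbers below are hypothetical.

```python
from itertools import product

# Assumed standard phone keypad layout (not specified in the text).
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}

# Hypothetical character-bigram log-probabilities for romanized Thai.
# "^" marks the start of a word; the values are invented for illustration.
CHAR_BIGRAM = {("^", "k"): -1.0, ("k", "a"): -0.5, ("a", "n"): -0.7,
               ("^", "l"): -2.0, ("l", "a"): -1.0, ("a", "o"): -1.5}
UNSEEN = -6.0  # penalty for bigrams missing from the toy table

# Hypothetical lexicon: romanized spelling -> [(word placeholder, log-probability)].
# A real system would hold Thai words and a word n-gram model here.
LEXICON = {"kan": [("TH_WORD_A", -2.0), ("TH_WORD_B", -3.0)],
           "lao": [("TH_WORD_C", -2.5)]}


def roman_candidates(digits):
    """Stage 1a: expand one key press per character into all letter strings."""
    return ["".join(p) for p in product(*(KEYPAD[d] for d in digits))]


def char_score(roman):
    """Stage 1b: score a romanized string with the character bigram table."""
    total, prev = 0.0, "^"
    for ch in roman:
        total += CHAR_BIGRAM.get((prev, ch), UNSEEN)
        prev = ch
    return total


def rank_words(digits, top_k=5):
    """Stage 2: combine character and word scores, best candidate first."""
    scored = []
    for roman in roman_candidates(digits):
        for word, word_lp in LEXICON.get(roman, []):
            scored.append((char_score(roman) + word_lp, word, roman))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(word, roman) for _, word, roman in scored[:top_k]]


if __name__ == "__main__":
    # "526" covers the letters k-a-n under the assumed layout.
    for word, roman in rank_words("526"):
        print(word, roman)
```

The first entry of the ranked list plays the role of the selected word, and the remaining entries correspond to the alternative candidates. Handling spelling variants such as “kan” and “karn” would additionally require mapping several romanized spellings to the same lexicon entry, which this sketch leaves out.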