Morphological analysis is one of the fundamental tasks in the computational processing of natural languages. It is the study of the rules of word construction through analysis of syntactic properties and morphological information. To perform this task, morphemes have to be separated from the original word, a process termed sandhi splitting. Sandhi splitting is important in the morphological analysis of agglutinative languages like Malayalam because of their richness in morphology, inflections and sandhi. Due to sandhi, many morphological changes occur at the position where morphemes join. Determining the morpheme boundaries therefore becomes a difficult task, especially in languages like Malayalam. In this paper, we propose a deep learning approach for automatically learning the rules that identify the morphemes and segment them from the original word. The individual morphemes can then be analysed further to identify the grammatical structure of the word.

Transliteration is the process of transcribing the script of one language into another, while backward or back-transliteration converts the transliterated text back into its original script. The highly technical phonetic system of Sanskrit seems to have made the preparation of a transliteration scheme quite arduous. This study focuses on the development of a rule-based, grapheme-model, character-alignment back-transliteration algorithm that converts Sanskrit transcribed in ASCII (American Standard Code for Information Interchange)-encoded English into Devanagari, pursuant to the Harvard-Kyoto (HK) convention. Accordingly, the paper presents the context in which such an algorithm is useful. It also describes the various standard schemes available for transcribing Devanagari into Roman. A survey of the evolution of scripts in India suggests the Brahmi script as the foundation from which variants like Devanagari originated. Since the nineteenth century, various transliteration schemes based on Roman script have evolved. The International Alphabet of Sanskrit Transliteration (IAST) scheme uses diacritics to disambiguate phonetic similarities, which seems to have made it strenuous for non-professionals. The ASCII-based HK scheme and its variant, the Indian Language Transliteration (ITRANS) scheme, do not use diacritics and are hence considered the simplest. Our rationale for using the HK scheme stems from its principal traits with respect to Sanskrit Unicode encoding. We also explain the Sanskrit alphabet and its classifications, which are incorporated into our proposed process. We appraise the complexity of our pseudo-coded algorithm and, finally, propose an extension of this work to the creation of similar tools for other Indian languages that use the Devanagari script, such as Hindi and Marathi.
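To make the idea of rule-based HK-to-Devanagari back-transliteration concrete, here is a minimal Python sketch. It is not the algorithm from the paper, whose pseudo-code is not reproduced here. It assumes a greedy longest-match tokenizer and a partial mapping table (the vocalic `lR`/`lRR` vowels, avagraha and digits are left out), and the names `VOWELS`, `CONSONANTS`, `MARKS` and `hk_to_devanagari` are hypothetical. The main rule it illustrates is that a consonant followed by a vowel takes the dependent vowel sign, while a consonant with no following vowel takes a virama.

```python
# Illustrative, partial Harvard-Kyoto -> Devanagari tables (assumed for this sketch).
# Each vowel maps to (independent form, dependent vowel sign); "a" has no sign.
VOWELS = {
    "a": ("अ", ""), "A": ("आ", "ा"), "i": ("इ", "ि"), "I": ("ई", "ी"),
    "u": ("उ", "ु"), "U": ("ऊ", "ू"), "R": ("ऋ", "ृ"), "RR": ("ॠ", "ॄ"),
    "e": ("ए", "े"), "ai": ("ऐ", "ै"), "o": ("ओ", "ो"), "au": ("औ", "ौ"),
}
CONSONANTS = {
    "k": "क", "kh": "ख", "g": "ग", "gh": "घ", "G": "ङ",
    "c": "च", "ch": "छ", "j": "ज", "jh": "झ", "J": "ञ",
    "T": "ट", "Th": "ठ", "D": "ड", "Dh": "ढ", "N": "ण",
    "t": "त", "th": "थ", "d": "द", "dh": "ध", "n": "न",
    "p": "प", "ph": "फ", "b": "ब", "bh": "भ", "m": "म",
    "y": "य", "r": "र", "l": "ल", "v": "व",
    "z": "श", "S": "ष", "s": "स", "h": "ह",
}
MARKS = {"M": "ं", "H": "ः"}  # anusvara and visarga
VIRAMA = "्"


def hk_to_devanagari(text):
    """Convert HK-encoded ASCII text to Devanagari (partial sketch)."""
    out = []
    i = 0
    pending = False  # True while the last consonant has not yet received a vowel
    while i < len(text):
        # Greedy longest match: try two-character tokens before one-character ones.
        tok = None
        for size in (2, 1):
            cand = text[i:i + size]
            if len(cand) == size and (
                cand in VOWELS or cand in CONSONANTS or cand in MARKS
            ):
                tok = cand
                break
        if tok is None:
            # Unknown character (space, punctuation): close any bare consonant
            # with a virama and copy the character through unchanged.
            if pending:
                out.append(VIRAMA)
                pending = False
            out.append(text[i])
            i += 1
            continue
        if tok in CONSONANTS:
            if pending:
                out.append(VIRAMA)  # previous consonant joins a cluster
            out.append(CONSONANTS[tok])
            pending = True
        elif tok in VOWELS:
            independent, sign = VOWELS[tok]
            out.append(sign if pending else independent)
            pending = False
        else:
            if pending:
                out.append(VIRAMA)
                pending = False
            out.append(MARKS[tok])
        i += len(tok)
    if pending:
        out.append(VIRAMA)  # word-final consonant without a vowel
    return "".join(out)


if __name__ == "__main__":
    print(hk_to_devanagari("saMskRtam"))
```

Under these assumptions, `hk_to_devanagari("saMskRtam")` yields संस्कृतम्. Because HK is plain ASCII and case-sensitive, a dictionary lookup with longest-match tokenization is enough for this sketch. A fuller implementation would need the remaining characters and a way to handle vowel hiatus, which the greedy match merges into a diphthong.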