DOIONLINE

DOIONLINE NO - IJAECS-IRAJ-DOIONLINE-11739

Publish In
International Journal of Advances in Electronics and Computer Science-IJAECS
Issue
Volume-5,Issue-4  ( Apr, 2018 )
Paper Title
Identification of Marathi and Sanskrit Compound and Non-Compound Word using Genetic Algorithm
Author Name
Sonal P. Patil, K. N. Jariwala
Affiliation
Ph.D. Research Scholar, Assistant Professor, Computer Engineering Department, S.V.N.I.T, Surat, India
Pages
23-27
Abstract
Text-based language recognition is the task of automatically recognizing the language of a given text document. Distinguishing languages within the same language family is harder than distinguishing languages from different families. This paper investigates the performance of statistical measures for text-based language identification, with emphasis on five languages used in India that are written in the Devanagari script: Marathi, Hindi, Sanskrit, Bhojpuri and Nepali. N-grams are used as the features for classification in the proposed system. Language identification is a key preprocessing step in several Natural Language Processing (NLP) tasks. There is wide scope for automatic language identification in a multilingual society like India, since it would be a fundamental step in bridging the digital divide between the Indian masses and the world.
Index Terms- Devanagari Script, Multilingual Computing, Wiener filter, Curvelet transform, Genetic algorithm
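To make the n-gram idea in the abstract concrete, here is a minimal Python sketch of character n-gram language identification. It is an illustration only and not the authors' method: the abstract does not specify the n-gram order, the statistical measure, or the role of the genetic algorithm, so the trigram order, the cosine similarity measure, the function names (`char_ngrams`, `cosine`, `identify`) and the one-sentence training samples are all assumptions chosen for brevity.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count overlapping character n-grams; words are separated by single spaces."""
    padded = " " + " ".join(text.split()) + " "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify(text, profiles, n=3):
    """Return the language whose profile is most similar to the text."""
    query = char_ngrams(text, n)
    return max(profiles, key=lambda lang: cosine(query, profiles[lang]))


# Toy one-sentence samples (assumed for illustration; a real system
# would build each profile from a large corpus per language).
samples = {
    "marathi": "मी शाळेत जातो आहे",
    "hindi": "मैं स्कूल जा रहा हूँ",
}
profiles = {lang: char_ngrams(text) for lang, text in samples.items()}

print(identify("मी घरी जातो", profiles))
```

With profiles this small the result is only indicative. Closely related Devanagari languages share many n-grams, which is why the choice of statistical measure matters in practice.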