A Survey on Morphological Feature Extraction for Gujarati
DOI: https://doi.org/10.47392/IRJAEM.2026.0115
Keywords: Computational Morphology, Gujarati Language, GujMORPH, Grammatical Feature Prediction, Bi-Directional LSTM, Gujarati-BERT, Hybrid Morphological Analyzer
Abstract
Morphology is the branch of linguistics that studies the internal structure and formation of words in terms of morphemes, the smallest meaningful units. It examines how prefixes, suffixes, and roots combine to form new words, shift grammatical categories (such as noun to adjective), or mark number and tense. This paper considers the following datasets for morphological analysis of Gujarati: GujMORPH, the TDIL-ILCI-II corpus, Rudhiprayog ane kahevatsangrah, Gujarati Lexicon, and the EMILLE corpus. In the reviewed work on a bidirectional LSTM-based morphological analyzer for Gujarati, the authors show that, across the major POS categories, the Bi-LSTM with Individual Label Representation method is more accurate than the Bi-LSTM monolithic and Bi-LSTM individual feature representation approaches. Accuracy increased from 68.27% (unsupervised) and 70.64% (individual feature representation) to 99.95% for nouns, from 12.95% and 16.18% to 78.76% for verbs, and from 25.72% and 85.85% to 99.84% for adjectives. Future research should focus on enlarging the current training datasets and on covering POS categories beyond nouns, verbs, and adjectives.
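The accuracy figures quoted in the abstract are easier to compare as percentage-point gains. The following is a minimal sketch that only redoes that arithmetic on the numbers stated above; the names `reported`, `gains`, and `point_gains` are hypothetical and chosen for illustration, and the baseline labels simply follow the abstract's wording.

```python
# Accuracies (%) copied from the abstract. The dictionary keys are
# illustrative labels, not identifiers from the surveyed work.
reported = {
    "noun": {"unsupervised": 68.27, "individual_feature": 70.64, "individual_label": 99.95},
    "verb": {"unsupervised": 12.95, "individual_feature": 16.18, "individual_label": 78.76},
    "adjective": {"unsupervised": 25.72, "individual_feature": 85.85, "individual_label": 99.84},
}


def gains(scores, best="individual_label"):
    """Return the percentage-point gain of `best` over each other method."""
    return {
        method: round(scores[best] - value, 2)
        for method, value in scores.items()
        if method != best
    }


# Gains per POS category, e.g. point_gains["verb"]["individual_feature"].
point_gains = {pos: gains(scores) for pos, scores in reported.items()}

for pos, per_method in point_gains.items():
    print(pos, per_method)
```

Read this way, the verb category still has the lowest final accuracy (78.76%) even though its gain over the individual feature representation baseline is large (62.58 percentage points).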
License
Copyright (c) 2026 International Research Journal on Advanced Engineering and Management (IRJAEM)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.