A survey on Morphological Feature Extraction for Gujarati

Authors

  • Jeenal Patel, Research Scholar, Parul University, 391760, India
  • Dr. Pooja Bhatt, Associate Professor, Parul University, 391760, India

DOI:

https://doi.org/10.47392/IRJAEM.2026.0115

Keywords:

Computational Morphology, Gujarati Language, GujMORPH, Grammatical Feature Prediction, Bi-Directional LSTM, Gujarati-BERT, Hybrid Morphological Analyzer

Abstract

Morphology is the branch of linguistics that studies the internal structure and formation of words in terms of morphemes, the smallest meaningful units. It examines how prefixes, suffixes, and roots combine to form new words, shift grammatical categories (such as noun to adjective), or mark number and tense. This paper considers the following datasets for morphological analysis of Gujarati: GujMORPH, the TDIL-ILCI-II corpus, Rudhiprayog ane kahevatsangrah, Gujarati Lexicon, and the EMILLE corpus. In the reviewed work on a bidirectional LSTM-based morphological analyzer for Gujarati, the authors show that, across the major POS categories, the Bi-LSTM with Individual Label Representation method is more accurate than the Bi-LSTM monolithic and Bi-LSTM individual feature representation approaches. Accuracy rose from 68.27% (unsupervised) and 70.64% (individual feature representation) to 99.95% for nouns, from 12.95% and 16.18% to 78.76% for verbs, and from 25.72% and 85.85% to 99.84% for adjectives. Future research should focus on enlarging the current training datasets and on covering POS categories beyond nouns, verbs, and adjectives.

Published

2026-04-06