Comparative Analysis of Machine Learning and Deep Learning Approaches for Predicting Closed Questions on Stack Overflow

Authors

  • Puranasree M S, UG, Artificial Intelligence and Data Science, Avinashilingam Institute for Home Science and Higher Education for Women, Coimbatore - 641043, Tamil Nadu, India.
  • Rithanyavarshikaa M, UG, Artificial Intelligence and Data Science, Avinashilingam Institute for Home Science and Higher Education for Women, Coimbatore - 641043, Tamil Nadu, India.
  • Sowndarya, UG, Artificial Intelligence and Data Science, Avinashilingam Institute for Home Science and Higher Education for Women, Coimbatore - 641043, Tamil Nadu, India.
  • Swetha P, UG, Artificial Intelligence and Data Science, Avinashilingam Institute for Home Science and Higher Education for Women, Coimbatore - 641043, Tamil Nadu, India.
  • Dr. D. Nithya, Associate Professor, CSE, Avinashilingam Institute for Home Science and Higher Education for Women, Coimbatore - 641043, Tamil Nadu, India.

DOI:

https://doi.org/10.47392/IRJAEM.2025.0106

Keywords:

Convolutional Neural Network, Deep Learning, Machine Learning, Stack Overflow, XGBoost Classifier

Abstract

Stack Overflow, a primary platform for sharing programming knowledge, faces ongoing challenges in maintaining content quality and managing duplicate questions. This research investigates two computational approaches, Machine Learning and Deep Learning, for predicting whether a question will be closed, with the aim of making question quality control more efficient. The methodology follows two parallel tracks: an XGBoost classifier trained on TF-IDF vectors, and a Convolutional Neural Network (CNN) architecture for semantic pattern recognition. The analysis uses a dataset of labelled Stack Overflow questions, and both approaches apply text cleaning, tag removal, and feature extraction in their respective pre-processing pipelines. Performance is evaluated with standard metrics: accuracy, precision, recall, F1-score, and the confusion matrix. The comparative analysis provides insight into the relative strengths and limitations of traditional machine learning versus deep learning, showing what each method contributes to identifying questions likely to be closed.
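
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary labelling (1 = closed, 0 = not closed) and uses made-up toy labels; the function names `confusion_counts` and `binary_metrics` are hypothetical, and nothing here reproduces the authors' code, data, or results.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn), the four cells of a binary confusion matrix."""
    # Count each (actual is positive, predicted is positive) combination.
    pairs = Counter((t == positive, p == positive) for t, p in zip(y_true, y_pred))
    tp = pairs[(True, True)]    # closed, predicted closed
    fp = pairs[(False, True)]   # not closed, predicted closed
    fn = pairs[(True, False)]   # closed, predicted not closed
    tn = pairs[(False, False)]  # not closed, predicted not closed
    return tp, fp, fn, tn


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only (assumption: 1 = closed, 0 = not closed).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_counts(y_true, y_pred))
print(binary_metrics(y_true, y_pred))
```

Reporting precision and recall alongside accuracy matters for a task like this when closed questions are a minority of the data, because accuracy alone can look high for a model that rarely predicts closure.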

Published

2025-03-22