Medical Insurance Price Prediction Using ML
DOI:
https://doi.org/10.47392/IRJAEM.2025.0333
Keywords:
Health Insurance, Machine Learning, Medical Costs Prediction, Regression Models, Random Forest, Insurance Pricing, COVID-19, Predictive Modeling, Policy Optimization, Healthcare Expenditure, Supervised Learning, Insurance Dataset, Cost Estimation, Algorithm Comparison, Insurance Policy Design
Abstract
The growing importance of health insurance in the aftermath of the COVID-19 pandemic has spurred numerous initiatives aimed at better understanding and managing medical insurance costs. This study presents a machine learning-based approach for predicting health insurance expenses using a dataset sourced from Kaggle. The primary objective is to develop a predictive system that helps individuals make cost-effective insurance decisions and supports policymakers in identifying and regulating high-cost providers. Various regression algorithms were explored to capture the complex relationships between individual and regional health factors and insurance costs. Seven models (Linear Regression, Ridge Regression, Support Vector Regression, Random Forest, XGBoost, Decision Tree, and k-Nearest Neighbors) were evaluated for performance. Among these, Random Forest served as the baseline model for predictions. The results highlight the potential of machine learning to improve insurance pricing transparency, reduce unnecessary expenditure, and make insurance policy formulation more efficient. Early cost prediction gives users the information they need to make informed choices and contributes to a more equitable healthcare system.
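The kind of model comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (not the Kaggle dataset used in the study), the feature names `age`, `bmi`, and `smoker` and the cost formula are hypothetical, and only a mean predictor and a small hand-written k-Nearest Neighbors regressor are compared rather than the seven models the study evaluated. It uses RMSE and R² on a held-out split, which are common choices but are not confirmed by the abstract as the study's metrics.

```python
import math
import random


def make_synthetic(n, seed=0):
    """Generate toy rows; the features and cost formula are hypothetical."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        age = rng.randint(18, 64)
        bmi = rng.uniform(18.0, 40.0)
        smoker = 1 if rng.random() < 0.2 else 0
        noise = rng.gauss(0.0, 2000.0)
        X.append([float(age), bmi, float(smoker)])
        y.append(250.0 * age + 300.0 * bmi + 20000.0 * smoker + noise)
    return X, y


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


class MeanPredictor:
    """Always predicts the training mean; a floor any real model should beat."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class KNNRegressor:
    """k-Nearest Neighbors regression on standardized features."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        n, d = len(X), len(X[0])
        self.mu_ = [sum(r[j] for r in X) / n for j in range(d)]
        # Fall back to 1.0 if a feature is constant, to avoid dividing by zero.
        self.sd_ = [
            math.sqrt(sum((r[j] - self.mu_[j]) ** 2 for r in X) / n) or 1.0
            for j in range(d)
        ]
        self.X_ = [self._scale(r) for r in X]
        self.y_ = list(y)
        return self

    def _scale(self, row):
        return [(v - m) / s for v, m, s in zip(row, self.mu_, self.sd_)]

    def predict(self, X):
        preds = []
        for row in X:
            q = self._scale(row)
            # Sort training points by Euclidean distance to the query.
            ranked = sorted((math.dist(q, p), t) for p, t in zip(self.X_, self.y_))
            nearest = ranked[: self.k]
            preds.append(sum(t for _, t in nearest) / len(nearest))
        return preds


# Rows are generated independently, so a simple positional split is adequate here.
X, y = make_synthetic(300)
X_train, y_train = X[:240], y[:240]
X_test, y_test = X[240:], y[240:]

models = {"mean": MeanPredictor(), "knn": KNNRegressor(k=5)}
results = {}
for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    results[name] = {"rmse": rmse(y_test, pred), "r2": r2(y_test, pred)}
    print(f"{name}: RMSE={results[name]['rmse']:.1f}  R2={results[name]['r2']:.3f}")
```

In practice, a library such as scikit-learn or XGBoost would supply the regressors, and the same loop over a dictionary of models would apply. The sketch says nothing about how the models ranked in the study itself.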
License
Copyright (c) 2025 International Research Journal on Advanced Engineering and Management (IRJAEM)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.