Cervical Cancer Detection Using a Hybrid CNN-Vision Transformer Model: A Comparative Study with EfficientNetB, DenseNet, Xception, and ResNet50
DOI: https://doi.org/10.47392/IRJAEM.2025.0220
Keywords: Vision Transformers, Hybrid Model, Feature Extraction, Deep Learning, Convolutional Neural Networks
Abstract
Cervical cancer remains one of the leading causes of cancer-related deaths among women worldwide, particularly in low-resource settings. Early detection is crucial for improving survival rates, and advances in deep learning have shown promise in automating this process. This paper proposes a novel hybrid model that combines Convolutional Neural Networks (CNNs) with Vision Transformers (ViTs) for cervical cancer detection. We integrate EfficientNetB, DenseNet, Xception, and ResNet50 as backbone CNN architectures to extract hierarchical features, followed by a Vision Transformer that captures long-range dependencies and global context. The proposed model is evaluated on a publicly available cervical cancer dataset and achieves state-of-the-art performance in terms of accuracy, sensitivity, and specificity. Our results demonstrate the effectiveness of combining CNNs and ViTs for medical image analysis, providing a robust framework for cervical cancer detection.
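The abstract describes a pipeline in which a CNN backbone extracts hierarchical feature maps and a transformer encoder then models long-range spatial dependencies before classification. The sketch below illustrates one plausible way to wire such a hybrid in PyTorch; it is a minimal illustration, not the authors' implementation. The choice of ResNet50 as the backbone, the embedding dimension, encoder depth, head count, and the two-class output are all assumptions made for the example, and it assumes 224x224 input images.

```python
# Minimal sketch of a hybrid CNN-ViT classifier (assumed PyTorch/torchvision setup).
# Backbone, token dimensions, and head sizes are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet50


class HybridCNNViT(nn.Module):
    def __init__(self, num_classes: int = 2, embed_dim: int = 256,
                 depth: int = 4, num_heads: int = 8, num_tokens: int = 50):
        super().__init__()
        # CNN backbone: keep the convolutional stages, drop avgpool and fc.
        backbone = resnet50(weights="IMAGENET1K_V2")
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # (B, 2048, H/32, W/32)
        # 1x1 conv projects CNN feature maps to transformer token embeddings.
        self.proj = nn.Conv2d(2048, embed_dim, kernel_size=1)
        # Learnable [CLS] token and positional embeddings
        # (num_tokens = 7*7 spatial tokens + 1 CLS for 224x224 inputs).
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_tokens, embed_dim))
        # Transformer encoder captures long-range dependencies across tokens.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.proj(self.cnn(x))             # (B, D, h, w)
        tokens = feats.flatten(2).transpose(1, 2)  # (B, h*w, D) spatial tokens
        cls = self.cls_token.expand(x.size(0), -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
        encoded = self.encoder(tokens)
        return self.head(encoded[:, 0])            # classify from the [CLS] token


# Usage: a single 224x224 RGB image tensor (e.g. a preprocessed cervical cytology image).
model = HybridCNNViT(num_classes=2)
logits = model(torch.randn(1, 3, 224, 224))
```

Swapping the backbone for DenseNet, Xception, or EfficientNet variants would follow the same pattern: take the convolutional feature extractor, project its output channels to the token dimension, and feed the resulting token sequence to the transformer encoder.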
Copyright (c) 2025 International Research Journal on Advanced Engineering and Management (IRJAEM). This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.