A Comparative Study of Deep Learning-Based CBIR Frameworks: From CNN Baselines to Explainable and Hybrid Models

Nagaraju. P.B; Gaddikoppula Anil Kumar

doi:10.62647/IJITCE2025V13I2PP1446-1452

Authors

Nagaraju. P.B, Research Scholar, Department of CSE, Bharatiya Engineering Science and Technology Innovation University (BESTIU), AP, and Asst. Professor, IT Department, S.R.K.R. Engineering College (A), Bhimavaram, AP, India
Gaddikoppula Anil Kumar, Principal and Professor of CSE, Scient Institute of Technology, Ibrahimpatnam, P.R. District, Telangana, India

DOI:

https://doi.org/10.62647/IJITCE2025V13I2PP1446-1452

Keywords:

Content-Based Image Retrieval, Deep Learning, Explainable AI, CNN-Transformer Fusion, Image Similarity

Abstract

Deep learning has proven to be a breakthrough technology for Content-Based Image Retrieval (CBIR). In this paper, we present a comparative study of three CBIR frameworks developed in a series of studies: a Modified CNN baseline; Explain CBIR-Net, which adds explainable AI through Grad-CAM; and HybridCBIRNet, which combines CNN and Transformer-based architectures with weighted feature fusion. We quantitatively evaluate the accuracy, precision, recall, and explainability of all three models on the Mini ImageNet dataset. The results underscore the balance among accuracy, interpretability, and feature richness. They also validate the superior efficacy of HybridCBIRNet while highlighting the value of contextual and explainable modelling for critical retrieval applications.
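
To make the idea of weighted feature fusion and retrieval scoring concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each image already has one CNN feature vector and one Transformer feature vector, that fusion is a weighted concatenation of the L2-normalized vectors, and that similarity is measured with cosine similarity. The function names and the weight `alpha` are hypothetical; the abstract does not specify the fusion rule or the similarity measure.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def fuse(cnn_feat, trans_feat, alpha=0.5):
    """Hypothetical fusion rule: weighted concatenation of normalized features.

    alpha weights the CNN branch and (1 - alpha) the Transformer branch.
    This is an assumption for illustration, not the rule used in the paper.
    """
    a = l2_normalize(cnn_feat)
    b = l2_normalize(trans_feat)
    return [alpha * x for x in a] + [(1.0 - alpha) * x for x in b]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def retrieve(query, gallery, k):
    """Return the indices of the k gallery vectors most similar to the query."""
    scores = [(cosine(query, g), i) for i, g in enumerate(gallery)]
    # Sort by descending similarity; break ties by gallery index.
    scores.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in scores[:k]]


def precision_recall_at_k(retrieved, relevant):
    """Precision and recall of a retrieved index list against a relevant set."""
    hits = sum(1 for i in retrieved if i in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy two-dimensional features, invented purely for illustration.
query = fuse([1.0, 0.0], [0.0, 1.0])
gallery = [
    fuse([1.0, 0.0], [0.0, 1.0]),  # matches the query in both branches
    fuse([0.0, 1.0], [1.0, 0.0]),  # differs in both branches
    fuse([1.0, 0.0], [1.0, 0.0]),  # matches only the CNN branch
]
top2 = retrieve(query, gallery, k=2)
print(top2, precision_recall_at_k(top2, relevant={0}))
```

In this toy setup, changing `alpha` shifts how much each branch contributes to the ranking, which is the kind of balance between feature sources that a weighted fusion scheme is meant to control.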