Image Translation and Reconstruction Using a Single Dual Mode Lightweight Encoder

Manik Rao patil; G. Rakesh; Koyyada Pooja; Parshaveni Nagaraju

doi:10.62647/

Authors

Manik Rao patil, Assistant Professor, Department of IT, Guru Nanak Institutions Technical Campus (Autonomous), India
G. Rakesh, B.Tech Student, Department of IT, Guru Nanak Institutions Technical Campus (Autonomous), India
Koyyada Pooja, B.Tech Student, Department of IT, Guru Nanak Institutions Technical Campus (Autonomous), India
Parshaveni Nagaraju, B.Tech Student, Department of IT, Guru Nanak Institutions Technical Campus (Autonomous), India

Abstract

In this work, we propose a novel Dual-Mode Web-Based Image Processor for image translation across modalities. Conventional computer vision models often rely on a single sensor modality, such as RGB or thermal imagery, and therefore cannot exploit the complementary strengths of both. Our architecture uses a single lightweight encoder that encodes both grayscale and thermal images into compact latent vectors. This shared encoding supports cross-modal image translation, including grayscale image colorization and thermal image reconstruction, so the same latent representation can serve multiple downstream tasks. The compact encoder reduces the computational burden while supporting both data compression and robust image translation under varied lighting conditions. The model trains four distinct generators and two discriminators in an adversarial framework, with reconstruction error terms added to enforce consistency and preserve contrast. Experimental results show competitive translation and reconstruction quality across lighting scenarios, evaluated with multiple metrics. Ablation studies confirm that the proposed loss terms improve model performance.
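
As an illustration of how an adversarial objective can be combined with a reconstruction error term, the following minimal Python sketch may help. It is not the authors' implementation: the abstract does not specify the loss formulas, so the L1 (mean absolute) reconstruction error, the non-saturating generator loss, the function names, and the weight `rec_weight` are all assumptions made here for illustration only.

```python
import math
from typing import Sequence


def l1_reconstruction_error(original: Sequence[float],
                            reconstructed: Sequence[float]) -> float:
    """Mean absolute error between two flattened images of equal size.

    Assumption: the abstract only says "reconstruction error terms";
    L1 is one common choice, not necessarily the one used in the paper.
    """
    if len(original) != len(reconstructed) or not original:
        raise ValueError("images must be non-empty and of equal size")
    total = sum(abs(a - b) for a, b in zip(original, reconstructed))
    return total / len(original)


def generator_adversarial_loss(disc_scores_on_fakes: Sequence[float]) -> float:
    """Non-saturating generator loss: -mean(log D(G(x))).

    Each score is the discriminator's probability (0..1) that a generated
    image is real. Assumption: the paper's exact adversarial loss is unknown.
    """
    if not disc_scores_on_fakes:
        raise ValueError("need at least one discriminator score")
    eps = 1e-12  # guards against log(0)
    logs = [math.log(max(s, eps)) for s in disc_scores_on_fakes]
    return -sum(logs) / len(logs)


def total_generator_loss(disc_scores_on_fakes: Sequence[float],
                         original: Sequence[float],
                         reconstructed: Sequence[float],
                         rec_weight: float = 10.0) -> float:
    """Adversarial term plus a weighted reconstruction term.

    rec_weight is a hypothetical hyperparameter chosen for this sketch.
    """
    adversarial = generator_adversarial_loss(disc_scores_on_fakes)
    reconstruction = l1_reconstruction_error(original, reconstructed)
    return adversarial + rec_weight * reconstruction
```

In a setup like the one the abstract describes, a term of this shape would be evaluated for each generator, with the reconstruction term penalizing outputs that drift from the input image while the adversarial term rewards outputs that the discriminator accepts as real.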