Image Translation and Reconstruction Using a Single Dual Mode Lightweight Encoder
DOI:
https://doi.org/10.62647/
Abstract
In this work, we propose a Dual-Mode Web-Based Image Processor for image translation across modalities. Traditional computer vision models often rely on a single sensor modality, such as RGB or thermal imagery, and therefore fail to exploit the complementary strengths of both. Our architecture uses a single lightweight encoder that encodes both grayscale and thermal images into compact latent vectors. This encoding enables cross-modal image translation, including grayscale image colorization and thermal image reconstruction, and gives flexibility in handling multiple downstream tasks. The compact encoder reduces the computational burden, and the model is optimized for both data compression and robust image translation under varied lighting conditions. The model employs four distinct generators and two discriminators in an adversarial framework and incorporates reconstruction error terms to enforce consistency and preserve contrast. Experimental results show competitive translation and reconstruction quality across various lighting scenarios, evaluated on multiple metrics. Ablation studies confirm that the proposed loss terms improve model performance.
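The abstract describes generators trained adversarially with added reconstruction error terms, but does not give the loss formulas. As a rough illustration only, the sketch below shows one common way such terms are combined: a non-saturating adversarial loss plus a weighted L1 reconstruction penalty. The choice of L1, the non-saturating form, the function names, and the weight `lambda_rec` are all assumptions made for this example and are not taken from the paper.

```python
import math


def l1_reconstruction(pred, target):
    """Mean absolute error between two equally sized flat pixel lists.

    Assumption: the paper's reconstruction term may differ (e.g. L2 or SSIM).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def generator_adversarial(disc_scores_on_fake):
    """Non-saturating generator loss: -mean(log D(G(z))).

    disc_scores_on_fake holds discriminator probabilities in (0, 1]
    assigned to generated images.
    """
    eps = 1e-12  # guards against log(0)
    total = sum(math.log(s + eps) for s in disc_scores_on_fake)
    return -total / len(disc_scores_on_fake)


def generator_objective(disc_scores_on_fake, pred, target, lambda_rec=10.0):
    """Adversarial term plus weighted reconstruction term.

    lambda_rec is a hypothetical weight, not a value from the paper.
    """
    adv = generator_adversarial(disc_scores_on_fake)
    rec = l1_reconstruction(pred, target)
    return adv + lambda_rec * rec
```

In a setup like the one described, an objective of this shape would presumably be evaluated once per generator, with each generator paired with the discriminator for its output modality. A larger `lambda_rec` favors pixel-level fidelity over fooling the discriminator.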
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.