
Abstract

Multimodal medical imaging plays a pivotal role in clinical diagnostics by integrating complementary anatomical and functional information from modalities such as Computed Tomography (CT), Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET), and Single-Photon Emission Computed Tomography (SPECT). Despite notable progress, existing fusion approaches continue to face persistent challenges. Convolutional Neural Network (CNN)-based methods often suffer from information loss due to convolutional down-sampling, while Transformer architectures, though effective at capturing global dependencies, incur high computational costs and rely on large-scale pretraining. Generative Adversarial Network (GAN)-based fusion models can generate visually realistic outputs but are prone to training instability and limited reproducibility. In addition, prior studies frequently adopt inconsistent evaluation metrics, with insufficient emphasis on clinical interpretability and robustness, hindering real-world deployment across heterogeneous datasets and institutions. To address these limitations, this study proposes a U-shaped Nested Network – Restoration Transformer (U2Net–Restormer) framework with a Dilated Dense Encoder–Decoder architecture for robust multimodal medical image fusion. The framework integrates hierarchical multiscale representation learning with residual global contextual refinement. To enhance discriminative capability, an optimized Haar-based feature selection strategy is introduced to preserve high-gradient structural and functional details while reducing feature redundancy. Furthermore, an attention-driven fusion mechanism adaptively weights modality-specific contributions, enabling effective integration of heterogeneous information. The proposed method is evaluated on the Augmented Alzheimer’s Neuroimaging Library (AANLIB) multimodal brain imaging dataset, covering CT–MRI, PET–MRI, and SPECT–MRI fusion tasks. Experimental results demonstrate consistent performance gains over state-of-the-art CNN-, Transformer-, and GAN-based methods, achieving Structural Similarity Index Measure (SSIM) up to 0.963, Peak Signal-to-Noise Ratio (PSNR) of 42.1 dB, Feature Mutual Information (FMI) of 0.86, and Edge Preservation Index (EPI) of 0.91, with improvements of at least 4%–6% across modalities. Subjective evaluations by radiologists and neurologists report Likert scores up to 4.8/5 for structural visibility, functional fidelity, and diagnostic value. Robustness analysis under Gaussian noise (σ = 15%) further confirms the method’s resilience. Overall, the proposed framework delivers high-fidelity, clinically interpretable multimodal fusion suitable for diverse imaging scenarios.
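
As an illustration of the attention-driven fusion idea described above, the following is a minimal PyTorch sketch, not the authors' implementation: the two-modality setting, the layer widths, and the shared convolutional gate are assumptions made only for the example.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Minimal sketch: a shared convolutional gate scores each modality's
    feature map, a softmax across modalities turns the scores into spatial
    weights, and the fused features are the weighted sum."""

    def __init__(self, channels: int):
        super().__init__()
        # hypothetical gating branch; channel widths are illustrative only
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels // 2, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 2, 1, kernel_size=1),
        )

    def forward(self, feat_mri: torch.Tensor, feat_pet: torch.Tensor) -> torch.Tensor:
        # per-modality spatial scores, shape (B, 1, H, W)
        scores = torch.cat([self.gate(feat_mri), self.gate(feat_pet)], dim=1)
        w = torch.softmax(scores, dim=1)  # adaptive modality weights summing to 1
        return w[:, :1] * feat_mri + w[:, 1:] * feat_pet

# usage: fuse encoder features of matching shape from two modalities
fuse = AttentionFusion(channels=64)
fused = fuse(torch.randn(1, 64, 128, 128), torch.randn(1, 64, 128, 128))
```

The objective scores quoted in the abstract (SSIM and PSNR) can be reproduced for any fused/reference image pair with standard tooling such as scikit-image; FMI and EPI usually require dedicated implementations and are omitted from this sketch.

```python
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def fusion_scores(reference: np.ndarray, fused: np.ndarray) -> dict:
    """SSIM and PSNR between a reference slice and the fused slice,
    both assumed to be 2-D arrays scaled to [0, 1]."""
    return {
        "SSIM": structural_similarity(reference, fused, data_range=1.0),
        "PSNR": peak_signal_noise_ratio(reference, fused, data_range=1.0),
    }
```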



Document information

Published on 22/03/26
Accepted on 15/01/26
Submitted on 10/11/25

Volume Online First, 2026
DOI: 10.23967/j.rimni.2026.10.75903
Licence: CC BY-NC-SA
