Emotion recognition is crucial for advancing human–computer interaction (HCI) by enabling systems to interpret complex affective states. While electroencephalogram (EEG) signals provide direct insight into neural activity, facial expressions offer external emotional cues. Unimodal systems, however, often struggle with robustness and generalization across diverse subjects. This study presents a Hierarchical Convolutional Neural Network (HCNN) framework that integrates EEG and facial expressions through multi-level convolutional feature extraction and feature-level fusion. The proposed model combines deep hierarchical representations with handcrafted temporal–frequency and texture-based descriptors to form a unified feature vector. Experiments on the MAHNOB-HCI and DEAP datasets show that the HCNN achieves accuracies of 91.40% and 88.09%, respectively, outperforming CNN-, LSTM-, and SVM-based methods. The results demonstrate the model's ability to capture complementary cross-modal correlations while reducing feature redundancy and computational complexity. The HCNN framework shows strong promise for real-time emotion recognition, offering a scalable, interpretable, and data-efficient solution for multimodal emotion recognition in next-generation HCI systems.
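The feature-level fusion described in the abstract can be illustrated with a minimal sketch: deep features from each modality branch are concatenated with handcrafted descriptors into one unified vector before classification. All dimensions and descriptor names below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality features (dimensions are assumptions, not from the paper).
eeg_deep = rng.standard_normal(128)       # deep EEG features from the conv branch
face_deep = rng.standard_normal(128)      # deep facial-expression features
eeg_handcrafted = rng.standard_normal(32)   # e.g. temporal-frequency (band-power) descriptors
face_handcrafted = rng.standard_normal(32)  # e.g. texture descriptors such as LBP histograms

# Feature-level fusion: concatenate all descriptors into a single unified vector,
# which would then feed a shared classifier head.
fused = np.concatenate([eeg_deep, face_deep, eeg_handcrafted, face_handcrafted])
print(fused.shape)  # (320,)
```

In practice the fused vector would typically be normalized per modality before concatenation so that no single descriptor family dominates the classifier.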
Published on 04/05/26
Accepted on 04/05/26
Submitted on 03/05/26
Volume Online First, 2026
DOI: 10.23967/j.rimni.2026.10.72094
Licence: CC BY-NC-SA license