Multimedia Systems

Papers
(The H4-Index of Multimedia Systems is 25. The table below lists the papers whose CrossRef citation counts are above that threshold, up to a maximum of 250 papers. It covers papers published in the past four years, i.e., from 2021-08-01 to 2025-08-01.)
Article | Citations
Unsupervised deep metric learning algorithm for crop disease images based on knowledge distillation networks | 97
Pseudo-global strategy-based visual comfort assessment considering attention mechanism | 85
A research for sound event localization and detection based on local–global adaptive fusion and temporal importance network | 78
A visual question answering model based on image captioning | 67
SS-CMT: a label independent cross-modal transferable adversarial video attack with sparse strategy | 64
Correction: STASiamRPN: visual tracking based on spatiotemporal and attention | 62
Recent advancement in haze removal approaches | 57
Face and voice cross-modal association with learning convex feature embedding | 55
Improving text-image cross-modal retrieval with contrastive loss | 44
User authentication method based on keystroke dynamics and mouse dynamics using HDA | 44
The segmented UEC Food-100 dataset with benchmark experiment on food detection | 44
Feature fusion and optimization integrated refined deep residual network for diabetic retinopathy severity classification using fundus image | 43
SS-YOLOv8: small-size object detection algorithm based on improved YOLOv8 for UAV imagery | 43
GVA: guided visual attention approach for automatic image caption generation | 40
SFRA: spatial fusion regression augmentation network for facial landmark detection | 38
Automatic lymph node segmentation using deep parallel squeeze & excitation and attention Unet | 38
CAPNet: tomato leaf disease detection network based on adaptive feature fusion and convolutional enhancement | 38
LMFE-RDD: a road damage detector with a lightweight multi-feature extraction network | 38
Segmentation-aware image super-resolution with generative adversarial networks | 37
Multi-level sentiment-aware clustering for denoising in multimodal sentiment analysis with ASR errors | 33
SEMNet: a simple and efficient MLP-based network for 3D Face point clouds landmarks localization | 31
Dual-branch spectral–spatial feature extraction network for multispectral image compression | 29
360° video quality assessment based on saliency-guided viewport extraction | 27
Model-based portrait video compression with spatial constraint and adaptive pose processing | 27
Deep Learning-based forgery detection and localization for compressed images using a hybrid optimization model | 26
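
As a rough illustration of how an index like this can be derived, the sketch below computes an h-index from the citation counts listed above. It assumes the H4-Index follows the usual h-index definition (the largest h such that at least h papers have h or more citations each), restricted to papers from the four-year window; the page itself does not spell out the formula. The function name `h_index` and the variable `counts` are illustrative only.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations.

    Assumption: this is the standard h-index definition; the source page
    does not state the exact formula behind its H4-Index.
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Citation counts copied from the table above, in listed order.
counts = [
    97, 85, 78, 67, 64, 62, 57, 55, 44, 44, 44, 43, 43,
    40, 38, 38, 38, 38, 37, 33, 31, 29, 27, 27, 26,
]

print(h_index(counts))
```

Run on the 25 listed counts alone, this gives 25, in line with the stated H4-Index. Papers below the threshold are not shown on the page, so the sketch cannot confirm how the full set of papers was counted.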