Multimedia Systems

Papers
(The H4-Index of Multimedia Systems is 28. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-01-01 to 2026-01-01.)
| Article | Citations |
| --- | --- |
| Unsupervised deep metric learning algorithm for crop disease images based on knowledge distillation networks | 133 |
| Pseudo-global strategy-based visual comfort assessment considering attention mechanism | 95 |
| SS-CMT: a label independent cross-modal transferable adversarial video attack with sparse strategy | 92 |
| DiffRA: universal restorative adversarial attack based on diffusion model | 86 |
| SFFN-YOLO for small object detection in aerial images | 78 |
| On-line monitoring of structural performance of scraper conveyor driven by digital twin | 78 |
| CHCoT-MSLU: a coupled hierarchical chain-of-thought prompt learning model for multi-intent spoken language understanding | 71 |
| The segmented UEC Food-100 dataset with benchmark experiment on food detection | 70 |
| A research for sound event localization and detection based on local–global adaptive fusion and temporal importance network | 62 |
| Dual-branch spectral–spatial feature extraction network for multispectral image compression | 51 |
| Face and voice cross-modal association with learning convex feature embedding | 47 |
| ConASD: Contrastive Few Shot Learning for Detecting Autism Spectrum Disorder via Eye Tracking Scanpath | 46 |
| LMFE-RDD: a road damage detector with a lightweight multi-feature extraction network | 45 |
| SEMNet: a simple and efficient MLP-based network for 3D Face point clouds landmarks localization | 41 |
| Feature fusion and optimization integrated refined deep residual network for diabetic retinopathy severity classification using fundus image | 41 |
| SFRA: spatial fusion regression augmentation network for facial landmark detection | 41 |
| User authentication method based on keystroke dynamics and mouse dynamics using HDA | 41 |
| Model-based portrait video compression with spatial constraint and adaptive pose processing | 39 |
| Real emotion seeker: recalibrating annotation for facial expression recognition | 35 |
| Improving text-image cross-modal retrieval with contrastive loss | 35 |
| Correction: STASiamRPN: visual tracking based on spatiotemporal and attention | 31 |
| Generalizing sentence-level lipreading to unseen speakers: a two-stream end-to-end approach | 30 |
| Automatic lymph node segmentation using deep parallel squeeze & excitation and attention Unet | 30 |
| Multi-level sentiment-aware clustering for denoising in multimodal sentiment analysis with ASR errors | 30 |
| A visual question answering model based on image captioning | 30 |
| Deep Learning-based forgery detection and localization for compressed images using a hybrid optimization model | 30 |
| BENet: bi-directional enhanced network for image captioning | 29 |
| CAPNet: tomato leaf disease detection network based on adaptive feature fusion and convolutional enhancement | 29 |
| SS-YOLOv8: small-size object detection algorithm based on improved YOLOv8 for UAV imagery | 28 |
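
The quoted H4-Index can be checked against the citation counts listed above. The sketch below assumes that "H4-Index" means the standard h-index (the largest h such that at least h papers have h or more citations each) restricted to the four-year window. The page does not spell out the definition, so this is an assumption. The function name `h_index` is illustrative, and the counts are copied from the listing.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations.

    Assumption: this is the standard h-index definition, applied only to
    papers inside the four-year window described on the page.
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Citation counts copied from the listing above, in order.
counts = [
    133, 95, 92, 86, 78, 78, 71, 70, 62, 51,
    47, 46, 45, 41, 41, 41, 41, 39, 35, 35,
    31, 30, 30, 30, 30, 30, 29, 29, 28,
]

print(h_index(counts))  # prints 28
```

Under that assumption the result agrees with the stated value of 28: the 28th-ranked paper has 29 citations, while the 29th has only 28. Papers with fewer citations than those listed cannot raise the index, so omitting them does not affect the check.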