IEEE Transactions on Multimedia

Papers
(The H4-Index of IEEE Transactions on Multimedia is 61. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-05-01 to 2025-05-01.)
Article | Citations
AMS-Net: Adaptive Multi-Scale Network for Image Compressive Sensing | 514
Self-Mining the Confident Prototypes for Source-Free Unsupervised Domain Adaptation in Image Segmentation | 288
Focusing on Subtle Differences: A Feature Disentanglement Model for Series Photo Selection | 273
Semantic-Aware Triplet Loss for Image Classification | 206
Bias-Correction Feature Learner for Semi-Supervised Instance Segmentation | 200
Weakly-Supervised Video Object Grounding via Learning Uni-Modal Associations | 194
Optimal Transport-Based Patch Matching for Image Style Transfer | 171
Adaptive Weight Generator for Multi-Task Image Recognition by Task Grouping Prompt | 171
Semi-Supervised Domain Adaptation via Joint Transductive and Inductive Subspace Learning | 154
Improving Pre-Trained Model-Based Speech Emotion Recognition From a Low-Level Speech Feature Perspective | 147
Efficient Cross-Modal Video Retrieval With Meta-Optimized Frames | 133
Rethinking Video Sentence Grounding From a Tracking Perspective With Memory Network and Masked Attention | 132
Rethinking Affine Transform for Efficient Image Enhancement: A Color Space Perspective | 127
Perceptual Image Hashing Using Feature Fusion of Orthogonal Moments | 120
SkyML: A MLaaS Federation Design for Multicloud-based Multimedia Analytics | 114
Disaggregation Distillation for Person Search | 113
Multi-Level Transitional Contrast Learning for Personalized Image Aesthetics Assessment | 108
Online Low-Light Sand-Dust Video Enhancement Using Adaptive Dynamic Brightness Correction and a Rolling Guidance Filter | 101
Guided Image-to-Image Translation by Discriminator-Generator Communication | 101
Pixel Bleach Network for Detecting Face Forgery Under Compression | 100
Vulnerability of Feature Extractors in 2D Image-Based 3D Object Retrieval | 99
One-shot Human Motion Transfer via Occlusion-Robust Flow Prediction and Neural Texturing | 98
A Total Variation With Joint Norms For Infrared and Visible Image Fusion | 98
Adaptive HEVC Video Steganography With High Performance Based on Attention-Net and PU Partition Modes | 98
SGG-Nets: Generic Rotation-Invariant Plugin Networks for Point Cloud Analysis | 97
Feature First: Advancing Image-Text Retrieval Through Improved Visual Features | 96
Improving Vision Anomaly Detection With the Guidance of Language Modality | 94
BMB: Balanced Memory Bank for Long-Tailed Semi-Supervised Learning | 92
Ensemble Prototype Networks for Unsupervised Cross-modal Hashing with Cross-Task Consistency | 91
Asymptotics-Aware Multi-View Subspace Clustering | 90
Deep Semantic-consistent Penalizing Hashing for Cross-modal Retrieval | 89
Neighborhood Contrastive Transformer for Change Captioning | 87
Annealing Genetic GAN for Imbalanced Web Data Learning | 85
Unsupervised Learning-Based Framework for Deepfake Video Detection | 81
Late Fusion Multiple Kernel Clustering With Local Kernel Alignment Maximization | 80
ICE: Interactive 3D Game Character Facial Editing via Dialogue | 78
Structured Attention Network for Referring Image Segmentation | 77
MGKsite: Multi-Modal Knowledge-Driven Site Selection via Intra and Inter-Modal Graph Fusion | 77
Dynamic Contrastive Distillation for Image-Text Retrieval | 77
SCSP: An Unsupervised Image-to-Image Translation Network Based on Semantic Cooperative Shape Perception | 74
Disentangled Graph Variational Auto-Encoder for Multimodal Recommendation With Interpretability | 73
Progressive Local Filter Pruning for Image Retrieval Acceleration | 72
Few-Shot Generative Model Adaptation via Style-Guided Prompt | 72
Bidirectional Translation Between UHD-HDR and HD-SDR Videos | 72
MHRN: A Multimodal Hierarchical Reasoning Network for Topic Detection | 72
Towards Fast and Robust Real Image Denoising With Attentive Neural Network and PID Controller | 70
Semi-Supervised Contrastive Learning With Similarity Co-Calibration | 69
Siamese Alignment Network for Weakly Supervised Video Moment Retrieval | 68
PhotoHelper: Portrait Photographing Guidance Via Deep Feature Retrieval and Fusion | 67
A Comprehensive Study on Deep Learning-Based Methods for Sign Language Recognition | 66
Semi-Supervised Domain Adaptation for Major Depressive Disorder Detection | 66
Skeleton-Based Action Recognition With Select-Assemble-Normalize Graph Convolutional Networks | 65
Hierarchical Equalization Loss for Long-Tailed Instance Segmentation | 64
Interpretable Graph Convolutional Network for Multi-View Semi-Supervised Learning | 64
Total Generate: Cycle in Cycle Generative Adversarial Networks for Generating Human Faces, Hands, Bodies, and Natural Scenes | 64
Dual-task Mutual Reinforcing Embedded Joint Video Paragraph Retrieval and Grounding | 63
Quality Assessment for DIBR-Synthesized Views Based on Wavelet Transform and Gradient Magnitude Similarity | 63
Hear Me, See Me, Understand Me: Audio-Visual Autism Behavior Recognition | 63
Exploring Kernel Transformations for Implicit Neural Representations | 63
STNet: Scale Tree Network With Multi-Level Auxiliator for Crowd Counting | 62
FoodSAM: Any Food Segmentation | 62
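
The H4-Index quoted at the top can be checked against the citation counts listed in the table. The sketch below assumes the H4-Index follows the usual h-index definition restricted to the four-year window, i.e., the largest h such that h papers have at least h citations each; the page itself does not spell out the definition. The function name `h_index` and the list `TABLE_COUNTS` are illustrative names, not part of the source.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` papers all have at least `rank` citations
        else:
            break      # counts only decrease from here, so h cannot grow further
    return h


# Citation counts copied from the table above, in the order listed.
TABLE_COUNTS = [
    514, 288, 273, 206, 200, 194, 171, 171, 154, 147,
    133, 132, 127, 120, 114, 113, 108, 101, 101, 100,
    99, 98, 98, 98, 97, 96, 94, 92, 91, 90,
    89, 87, 85, 81, 80, 78, 77, 77, 77, 74,
    73, 72, 72, 72, 72, 70, 69, 68, 67, 66,
    66, 65, 64, 64, 64, 63, 63, 63, 63, 62,
    62,
]

if __name__ == "__main__":
    print(len(TABLE_COUNTS), h_index(TABLE_COUNTS))
```

The table holds 61 papers, each with at least 62 citations, so under this assumed definition the listed counts alone yield an index of 61, which agrees with the stated value. Papers below the threshold are not listed on the page, so this check cannot rule out a different definition.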