IEEE Transactions on Multimedia

Papers
(The H4-Index of IEEE Transactions on Multimedia is 62. The table below lists the papers whose CrossRef citation counts meet or exceed that threshold [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-11-01 to 2024-11-01.)
| Article | Citations |
| --- | --- |
| StrongSORT: Make DeepSORT Great Again | 297 |
| Human Memory Update Strategy: A Multi-Layer Template Update Mechanism for Remote Visual Monitoring | 223 |
| Low-Light Image Enhancement With Semi-Decoupled Decomposition | 209 |
| AttentionFGAN: Infrared and Visible Image Fusion Using Attention-Based Generative Adversarial Networks | 208 |
| Extended Feature Pyramid Network for Small Object Detection | 205 |
| MFDNet: Collaborative Poses Perception and Matrix Fisher Distribution for Head Pose Estimation | 174 |
| Coarse-to-Fine CNN for Image Super-Resolution | 152 |
| DSLR: Deep Stacked Laplacian Restorer for Low-Light Image Enhancement | 147 |
| Image-to-Image Translation: Methods and Applications | 145 |
| Consensus Graph Learning for Multi-View Clustering | 137 |
| 3D Room Layout Estimation From a Single RGB Image | 137 |
| EAPT: Efficient Attention Pyramid Transformer for Image Processing | 131 |
| Parameter Sharing Exploration and Hetero-Center Triplet Loss for Visible-Thermal Person Re-Identification | 128 |
| Beyond Triplet Loss: Person Re-Identification With Fine-Grained Difference-Aware Pairwise Loss | 127 |
| SPA-GAN: Spatial Attention GAN for Image-to-Image Translation | 125 |
| Geometric Back-Projection Network for Point Cloud Classification | 119 |
| TBEFN: A Two-Branch Exposure-Fusion Network for Low-Light Image Enhancement | 119 |
| Adaptive Graph Completion Based Incomplete Multi-View Clustering | 119 |
| Learning Deep Multi-Level Similarity for Thermal Infrared Object Tracking | 116 |
| Spatio-Temporal Attention Networks for Action Recognition and Detection | 111 |
| Spatial-Temporal Cascade Autoencoder for Video Anomaly Detection in Crowded Scenes | 107 |
| VehicleNet: Learning Robust Visual Representation for Vehicle Re-Identification | 104 |
| Predicting the Perceptual Quality of Point Cloud: A 3D-to-2D Projection-Based Exploration | 100 |
| Exploiting Temporal Contexts With Strided Transformer for 3D Human Pose Estimation | 100 |
| Image-Text Multimodal Emotion Classification via Multi-View Attentional Network | 99 |
| CCAFNet: Crossflow and Cross-Scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images | 97 |
| YDTR: Infrared and Visible Image Fusion via Y-Shape Dynamic Transformer | 96 |
| Deep Multi-View Subspace Clustering With Unified and Discriminative Learning | 93 |
| Low-Rank Pairwise Alignment Bilinear Network For Few-Shot Fine-Grained Image Classification | 91 |
| Stacked U-Shape Network With Channel-Wise Attention for Salient Object Detection | 90 |
| Deep-IRTarget: An Automatic Target Detector in Infrared Imagery Using Dual-Domain Feature Extraction and Allocation | 88 |
| Multi-View Multi-Label Learning With Sparse Feature Selection for Image Annotation | 87 |
| Real-Time and Accurate UAV Pedestrian Detection for Social Distancing Monitoring in COVID-19 Pandemic | 86 |
| SiamCorners: Siamese Corner Networks for Visual Tracking | 86 |
| Aggregation-Based Graph Convolutional Hashing for Unsupervised Cross-Modal Retrieval | 85 |
| STNReID: Deep Convolutional Networks With Pairwise Spatial Transformer Networks for Partial Person Re-Identification | 85 |
| Kernelized Multiview Subspace Analysis By Self-Weighted Learning | 84 |
| Anti-Forensics for Face Swapping Videos via Adversarial Training | 83 |
| Learning Disentangled Representation Implicitly Via Transformer for Occluded Person Re-Identification | 82 |
| Multi-Channel Deep Networks for Block-Based Image Compressive Sensing | 81 |
| An Automated and Robust Image Watermarking Scheme Based on Deep Neural Networks | 79 |
| A Serial Image Copy-Move Forgery Localization Scheme With Source/Target Distinguishment | 78 |
| Luminance-Aware Pyramid Network for Low-Light Image Enhancement | 77 |
| A Recursive Reversible Data Hiding in Encrypted Images Method With a Very High Payload | 76 |
| Attribute Restoration Framework for Anomaly Detection | 75 |
| 3D Face Reconstruction From A Single Image Assisted by 2D Face Images in the Wild | 75 |
| Multi-Level Correlation Adversarial Hashing for Cross-Modal Retrieval | 74 |
| MFFENet: Multiscale Feature Fusion and Enhancement Network For RGB–Thermal Urban Road Scene Parsing | 72 |
| Cross-Domain Contrastive Learning for Unsupervised Domain Adaptation | 72 |
| Fast Intra Mode Decision Algorithm for Versatile Video Coding | 72 |
| EHPE: Skeleton Cues-Based Gaussian Coordinate Encoding for Efficient Human Pose Estimation | 72 |
| Deep Fusion Feature Representation Learning With Hard Mining Center-Triplet Loss for Person Re-Identification | 71 |
| Driver Yawning Detection Based on Subtle Facial Action Recognition | 70 |
| BVI-DVC: A Training Database for Deep Video Compression | 70 |
| Uncertainty-Aware Unsupervised Domain Adaptation in Object Detection | 69 |
| VPFNet: Improving 3D Object Detection With Virtual Point Based LiDAR and Stereo Data Fusion | 67 |
| Joint Contrast Enhancement and Exposure Fusion for Real-World Image Dehazing | 66 |
| cmSalGAN: RGB-D Salient Object Detection With Cross-View Generative Adversarial Networks | 65 |
| RelationTrack: Relation-Aware Multiple Object Tracking With Decoupled Representation | 64 |
| A Comprehensive Study on Deep Learning-Based Methods for Sign Language Recognition | 64 |
| Illumination-Adaptive Person Re-Identification | 63 |
| Cross View Capture for Stereo Image Super-Resolution | 62 |
| RGBT Salient Object Detection: A Large-Scale Dataset and Benchmark | 62 |
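
The H4-Index threshold quoted above the table can be sanity-checked against the listed citation counts. The sketch below assumes the H4-Index follows the usual h-index rule (the largest h such that h papers each have at least h citations) restricted to the four-year window; the page does not spell out its exact computation. The names `h_index` and `LISTED_COUNTS` are illustrative, and the list holds only the counts shown in the table, not the journal's full output.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h


# Citation counts copied from the table, in listing order (illustrative name).
LISTED_COUNTS = [
    297, 223, 209, 208, 205, 174, 152, 147, 145, 137,
    137, 131, 128, 127, 125, 119, 119, 119, 116, 111,
    107, 104, 100, 100, 99, 97, 96, 93, 91, 90,
    88, 87, 86, 86, 85, 85, 84, 83, 82, 81,
    79, 78, 77, 76, 75, 75, 74, 72, 72, 72,
    72, 71, 70, 70, 69, 67, 66, 65, 64, 64,
    63, 62, 62,
]

if __name__ == "__main__":
    print(h_index(LISTED_COUNTS))
```

For the 63 counts listed, the function returns 62: the 62nd-ranked paper has 62 citations, while the 63rd also has 62, which falls short of 63.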