ACM Transactions on Multimedia Computing Communications and Applications

Papers
(The TQCC of ACM Transactions on Multimedia Computing Communications and Applications is 5. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-02-01 to 2025-02-01.)
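The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_papers` is hypothetical, "above that threshold" is read as strictly greater than the TQCC, and the sample with 5 citations is invented to show the cutoff. The page does not say how the list is actually produced.

```python
from typing import List, Tuple

Paper = Tuple[str, int]  # (title, citation count)


def select_papers(papers: List[Paper], threshold: int, limit: int = 250) -> List[Paper]:
    """Keep papers cited more than `threshold` times, most cited first.

    Assumption: "above the threshold" means strictly greater than it.
    The result is capped at `limit` entries (250 on this page).
    """
    above = [paper for paper in papers if paper[1] > threshold]
    # sorted() is stable, so papers with equal counts keep their input order.
    ranked = sorted(above, key=lambda paper: paper[1], reverse=True)
    return ranked[:limit]


sample = [
    ("Person in Uniforms Re-Identification", 6),
    ("TripRes", 9),
    ("Hypothetical paper at the threshold", 5),  # invented entry, not in the list
]
selected = select_papers(sample, threshold=5)
```

With a threshold of 5, the invented five-citation entry is dropped and the remaining two are returned in descending order of citations.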
Article | Citations
Multi-Source Knowledge Reasoning Graph Network for Multi-Modal Commonsense Inference | 188
BiC-Net: Learning Efficient Spatio-temporal Relation for Text-Video Retrieval | 125
Tensorial Evolutionary Optimization for Natural Image Matting | 115
QuickCSGModeling: Quick CSG Operations Based on Fusing Signed Distance Fields for VR Modeling | 94
Unsupervised Domain Expansion for Visual Categorization | 84
Generative Adversarial Networks with Learnable Auxiliary Module for Image Synthesis | 83
Context-Based Novel Histogram Bin Stretching Algorithm for Automatic Contrast Enhancement | 73
UVC: An Unified Deep Video Compression Framework | 62
Discriminative Action Snippet Propagation Network for Weakly Supervised Temporal Action Localization | 62
Unleashing Creativity in the Metaverse: Generative AI and Multimodal Content | 62
Spatio-Temporal Attention for Text-Video Retrieval | 59
A 360-degree Video Player for Dynamic Video Editing Applications | 57
Residual-guided In-loop Filter Using Convolution Neural Network | 55
When Pairs Meet Triplets: Improving Low-Resource Captioning via Multi-Objective Optimization | 55
Deep Saliency Mapping for 3D Meshes and Applications | 49
PMAL: A Proxy Model Active Learning Approach for Vision Based Industrial Applications | 49
Correlation-aware Cross-modal Attention Network for Fashion Compatibility Modeling in UGC Systems | 48
Variational Autoencoder with CCA for Audio–Visual Cross-modal Retrieval | 46
Graph Attention Transformer Network for Multi-label Image Classification | 45
Incomplete Multiview Clustering via Semidiscrete Optimal Transport for Multimedia Data Mining in IoT | 42
Optimized Deep-Neural Network for Content-based Medical Image Retrieval in a Brownfield IoMT Network | 41
Hiding Message Using a Cycle Generative Adversarial Network | 40
Moment is Important: Language-Based Video Moment Retrieval via Adversarial Learning | 39
Boolean-based Two-in-One Secret Image Sharing by Adaptive Pixel Grouping | 37
Forgery Detection by Weighted Complementarity between Significant Invariance and Detail Enhancement | 36
Inner Knowledge-based Img2Doc Scheme for Visual Question Answering | 34
Bottom-up and Layerwise Domain Adaptation for Pedestrian Detection in Thermal Images | 33
Two-stream Multi-level Dynamic Point Transformer for Two-person Interaction Recognition | 32
CMAF: Cross-Modal Augmentation via Fusion for Underwater Acoustic Image Recognition | 32
Two-Stage Perceptual Quality Oriented Rate Control Algorithm for HEVC | 32
Semi-supervised Video Object Segmentation Via an Edge Attention Gated Graph Convolutional Network | 32
iDAM: Iteratively Trained Deep In-loop Filter with Adaptive Model Selection | 31
Towards Intelligent Attack Detection Using DNA Computing | 31
Multi-granularity Brushstrokes Network for Universal Style Transfer | 31
Paying Attention to Vehicles: A Systematic Review on Transformer-Based Vehicle Re-Identification | 31
FasterPose: A Faster Simple Baseline for Human Pose Estimation | 30
Context-Aware 3D Points of Interest Detection via Spatial Attention Mechanism | 29
Realtime Recognition of Dynamic Hand Gestures in Practical Applications | 28
Collocated Clothing Synthesis with GANs Aided by Textual Information: A Multi-Modal Framework | 27
Adaptive Cloud VR Gaming Optimized by Gamer QoE Models | 27
PLACE Dropout: A Progressive Layer-wise and Channel-wise Dropout for Domain Generalization | 26
Transform-Equivariant Consistency Learning for Temporal Sentence Grounding | 26
Deep Learning-based Smart Predictive Evaluation for Interactive Multimedia-enabled Smart Healthcare | 26
Facial-expression-aware Emotional Color Transfer Based on Convolutional Neural Network | 26
Compressed Point Cloud Quality Index by Combining Global Appearance and Local Details | 25
KF-VTON: Keypoints-Driven Flow Based Virtual Try-On Network | 25
Double Attention Based on Graph Attention Network for Image Multi-Label Classification | 25
Knowledge-integrated Multi-modal Movie Turning Point Identification | 24
Complementary Feature Pyramid Network for Object Detection | 24
Image-Based Personality Questionnaire Design | 22
Joint Source-Channel Decoding of Polar Codes for HEVC-Based Video Streaming | 22
A Multi-Level Consistency Network for High-Fidelity Virtual Try-On | 22
Exploiting Spatial-Temporal Context for Interacting Hand Reconstruction on Monocular RGB Video | 21
Table of Contents: Online Supplement Volume 17, Number 2s-3s | 21
Enhanced 3D Shape Reconstruction With Knowledge Graph of Category Concept | 20
DiRaC-I: Identifying Diverse and Rare Training Classes for Zero-Shot Learning | 20
Diversely-Supervised Visual Product Search | 20
Asymmetric Dual-Decoder U-Net for Joint Rain and Haze Removal | 20
On Content-Aware Post-Processing: Adapting Statistically Learned Models to Dynamic Content | 19
Lightweight Food Recognition via Aggregation Block and Feature Encoding | 19
COBIRAS: Offering a Continuous Bit Rate Slide to Maximize DASH Streaming Bandwidth Utilization | 19
Rethinking Feature Mining for Light Field Salient Object Detection | 18
Asymmetric Deformable Spatio-temporal Framework for Infrared Object Tracking | 18
Fine-Grained Text-to-Video Temporal Grounding from Coarse Boundary | 17
Introduction to the Special Section on Learning Representations, Similarity, and Associations in Dynamic Multimedia Environments | 17
Meta-learning Advisor Networks for Long-tail and Noisy Labels in Social Image Classification | 17
DAG-YOLO: A Context-Feature Adaptive fusion Rotating Detection Network in Remote Sensing Images | 17
Reconstruction-Free Image Compression for Machine Vision via Knowledge Transfer | 17
A DNA Based Colour Image Encryption Scheme Using A Convolutional Autoencoder | 16
Deep Learning Based Occluded Person Re-Identification: A Survey | 16
BMIF: Privacy-preserving Blockchain-based Medical Image Fusion | 16
Pose- and Attribute-consistent Person Image Synthesis | 16
Unsupervised Adversarial Example Detection of Vision Transformers for Trustworthy Edge Computing | 16
Attentional Composition Networks for Long-Tailed Human Action Recognition | 16
Semantic Embedding Guided Attention with Explicit Visual Feature Fusion for Video Captioning | 15
Machine Learning Based Content-Agnostic Viewport Prediction for 360-Degree Video | 15
Unsupervised Discovery and Manipulation of Continuous Disentangled Factors of Variation | 15
Quantum Fourier Convolutional Network | 15
Instance-Based Continual Learning: A Real-World Dataset and Baseline for Fresh Recognition | 15
Task-independent Recognition of Communication Skills in Group Interaction Using Time-series Modeling | 15
Animating Still Natural Images Using Warping | 15
Incomplete Cross-Modal Retrieval with Deep Correlation Transfer | 15
Exploring Neighbor Correspondence Matching for Multiple-hypotheses Video Frame Synthesis | 15
CAQoE: A Novel No-Reference Context-aware Speech Quality Prediction Metric | 15
Tensor-Empowered LSTM for Communication-Efficient and Privacy-Enhanced Cognitive Federated Learning in Intelligent Transportation Systems | 15
Online Cross-modal Hashing With Dynamic Prototype | 14
Leveraging Frame- and Feature-Level Progressive Augmentation for Semi-supervised Action Recognition | 14
Robust Searching-Based Gradient Collaborative Management in Intelligent Transportation System | 14
Characters Link Shots: Character Attention Network for Movie Scene Segmentation | 14
Perceptual Hashing of Deep Convolutional Neural Networks for Model Copy Detection | 14
Meetor: A Human-Centered Automatic Video Editing System for Meeting Recordings | 14
High Fidelity Makeup via 2D and 3D Identity Preservation Net | 14
Less Is More: Learning from Synthetic Data with Fine-Grained Attributes for Person Re-Identification | 14
Deep Variational Learning for 360° Adaptive Streaming | 14
MS-GDA: Improving Heterogeneous Recipe Representation via Multinomial Sampling Graph Data Augmentation | 13
Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation | 13
Identity Feature Disentanglement for Visible-Infrared Person Re-Identification | 13
Enhanced Video Super-Resolution Network towards Compressed Data | 13
CVLP-NaVD: Contrastive Visual-Language Pre-training Models for Non-annotated Visual Description | 13
JDAN: Joint Detection and Association Network for Real-Time Online Multi-Object Tracking | 13
Smart Director: An Event-Driven Directing System for Live Broadcasting | 13
Delay Threshold for Social Interaction in Volumetric eXtended Reality Communication | 13
CLIP-DFGS: A Hard Sample Mining Method for CLIP in Generalizable Person Re-Identification | 13
Y-Net: Dual-branch Joint Network for Semantic Segmentation | 13
Detection of Adversarial Facial Accessory Presentation Attacks Using Local Face Differential | 13
TinyPredNet: A Lightweight Framework for Satellite Image Sequence Prediction | 13
Interactive Garment Recommendation with User in the Loop | 13
Exploiting Backdoors of Face Synthesis Detection with Natural Triggers | 12
Domain-aware Multimodal Dialog Systems with Distribution-based User Characteristic Modeling | 12
Lightweight Multi-party Authentication and Key Agreement Protocol in IoT-based E-Healthcare Service | 12
AMC: Adaptive Multi-expert Collaborative Network for Text-guided Image Retrieval | 12
SSR-Net: A Spatial Structural Relation Network for Vehicle Re-identification | 12
Feature Extraction Matters More: An Effective and Efficient Universal Deepfake Disruptor | 12
A Quality-Aware and Obfuscation-Based Data Collection Scheme for Cyber-Physical Metaverse Systems | 12
Explainable AI: A Multispectral Palm-Vein Identification System with New Augmentation Features | 12
Interactive Search vs. Automatic Search | 12
Frequency-aware Camouflaged Object Detection | 12
Performance Evaluation in Multimedia Retrieval | 12
Invisible Adversarial Watermarking: A Novel Security Mechanism for Enhancing Copyright Protection | 12
Part-wise Spatio-temporal Attention Driven CNN-based 3D Human Action Recognition | 12
RDH-DES: Reversible Data Hiding over Distributed Encrypted-Image Servers Based on Secret Sharing | 12
Self-supervised Image-based 3D Model Retrieval | 12
Multi-Scale and Multi-Layer Lattice Transformer for Underwater Image Enhancement | 11
AGAR - Attention Graph-RNN for Adaptative Motion Prediction of Point Clouds of Deformable Objects | 11
Perceptual Quality Assessment of Omnidirectional Images: A Benchmark and Computational Model | 11
An Image Arbitrary-Scale Super-Resolution Network Using Frequency-domain Information | 11
Bi-manual Haptic-based Periodontal Simulation with Finger Support and Vibrotactile Feedback | 11
Instance-level Adversarial Source-free Domain Adaptive Person Re-identification | 11
DBGAN: Dual Branch Generative Adversarial Network for Multi-Modal MRI Translation | 11
An Optimal Edge-weighted Graph Semantic Correlation Framework for Multi-view Feature Representation Learning | 11
Voice-Face Homogeneity Tells Deepfake | 11
Meta-MMFNet: Meta-learning-based Multi-model Fusion Network for Micro-expression Recognition | 11
Learning Pixel Affinity Pyramid for Arbitrary-Shaped Text Detection | 11
From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation | 11
Psychology-Guided Environment Aware Network for Discovering Social Interaction Groups from Videos | 11
Beyond the Parts: Learning Coarse-to-Fine Adaptive Alignment Representation for Person Search | 11
Matching Faces and Attributes Between the Artistic and the Real Domain: the PersonArt Approach | 11
SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection | 11
Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining? | 10
Upsampling Algorithm for V-PCC-Coded 3D Point Clouds | 10
Rank-in-Rank Loss for Person Re-identification | 10
Joint Mixing Data Augmentation for Skeleton-based Action Recognition | 10
Hypercube Pooling for Visual Semantic Embedding | 10
A Hierarchically Discriminative Loss with Group Regularization for Fine-Grained Image Classification | 10
Blind 3D Video Stabilization with Spatio-Temporally Varying Motion Blur | 10
Self-Adaptive Representation Learning Model for Multi-Modal Sentiment and Sarcasm Joint Analysis | 10
Adaptive Compression for Online Computer Vision: An Edge Reinforcement Learning Approach | 10
Cross-Modal Contrastive Learning with a Style-Mixed Bridge for Single Image 3D Shape Retrieval | 10
Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding | 9
On Modality Bias Recognition and Reduction | 9
Hierarchical and Progressive Image Matting | 9
Leveraging Deep Statistics for Underwater Image Enhancement | 9
Learning Semantic Representation on Visual Attribute Graph for Person Re-identification and Beyond | 9
Rank-based Hashing for Effective and Efficient Nearest Neighbor Search for Image Retrieval | 9
Multimodal Cascaded Framework with Multimodal Latent Loss Functions Robust to Missing Modalities | 9
UID2021: An Underwater Image Dataset for Evaluation of No-Reference Quality Assessment Metrics | 9
Point Cloud Quality Assessment: Dataset Construction and Learning-based No-reference Metric | 9
A Multimodal Hierarchical Attentional Ordering Network | 9
A Comprehensive Survey on Methods for Image Integrity | 9
Multimodal Neurosymbolic Approach for Explainable Deepfake Detection | 9
HCNCT: A Cross-chain Interaction Scheme for the Blockchain-based Metaverse | 9
TripRes | 9
Music2Dance: DanceNet for Music-Driven Dance Generation | 9
Adaptive Prediction Structure for Learned Video Compression | 9
Backdoor Two-Stream Video Models on Federated Learning | 9
An Image Privacy Protection Algorithm Based on Adversarial Perturbation Generative Networks | 8
Age-Invariant Face Recognition by Multi-Feature Fusion and Decomposition with Self-attention | 8
Deep Learning for Logo Detection: A Survey | 8
Exploration of Speech and Music Information for Movie Genre Classification | 8
Disentangle Saliency Detection into Cascaded Detail Modeling and Body Filling | 8
ProtoRefine: Enhancing Prototypes with Similar Structure in Few-Shot Learning | 8
Occupancy Map Guided Attributes Artifacts Removal for Video-Based Point Cloud Compression | 8
Establishing Trust and Security in Decentralized Metaverse: A Web 3.0 Approach | 8
Semi-supervised Learning for Mars Imagery Classification and Segmentation | 8
Mirror Segmentation via Semantic-aware Contextual Contrasted Feature Learning | 8
A Novel Reversible Data Hiding Scheme Based on Pixel-Residual Histogram | 8
DPDFormer: A Coarse-to-Fine Model for Monocular Depth Estimation | 8
Mastering Deepfake Detection: A Cutting-edge Approach to Distinguish GAN and Diffusion-model Images | 8
Robust Hashing with Deep Features and Meixner Moments for Image Copy Detection | 8
Meta-Review on Brain-Computer Interface (BCI) in the Metaverse | 8
Category-Level Pose Estimation and Iterative Refinement for Monocular RGB-D Image | 8
A Siamese Inverted Residuals Network Image Steganalysis Scheme based on Deep Learning | 8
Fine-Grained Visual Textual Alignment for Cross-Modal Retrieval Using Transformer Encoders | 8
Disentangling Features for Fashion Recommendation | 8
Self-Supervised Monocular Depth Estimation via Binocular Geometric Correlation Learning | 8
Improving Scene Text Retrieval via Stylized Middle Modality | 8
Meta-Review of Wearable Devices for Healthcare in the Metaverse | 8
Label Consistent Flexible Matrix Factorization Hashing for Efficient Cross-modal Retrieval | 7
Domain-Separated Bottleneck Attention Fusion Framework for Multimodal Emotion Recognition | 7
Texture and Structure-Guided Dual-Attention Mechanism for Image Inpainting | 7
Modeling Long-range Dependencies and Epipolar Geometry for Multi-view Stereo | 7
Image Defogging Based on Regional Gradient Constrained Prior | 7
Boosting Relationship Detection in Images with Multi-Granular Self-Supervised Learning | 7
ART-UP: A Novel Method for Generating Scanning-Robust Aesthetic QR Codes | 7
A Deep Retinex-Based Low-Light Enhancement Network Fusing Rich Intrinsic Prior Information | 7
MixOOD: Improving Out-of-distribution Detection with Enhanced Data Mixup | 7
Inter-camera Identity Discrimination for Unsupervised Person Re-identification | 7
Dilated Convolution-based Feature Refinement Network for Crowd Localization | 7
High Efficiency Deep-learning Based Video Compression | 7
ISF-GAN: Imagine, Select, and Fuse with GPT-Based Text Enrichment for Text-to-Image Synthesis | 7
Aesthetic Attribute Assessment of Images Numerically on Mixed Multi-attribute Datasets | 7
Semi-supervised RGB-D Hand Gesture Recognition via Mutual Learning of Self-supervised Models | 7
AED-PADA: Improving Generalizability of Adversarial Example Detection via Principal Adversarial Domain Adaptation | 7
LiLTv2: Language-substitutable Layout-Image Transformer for Visual Information Extraction | 7
LogoDet-3K: A Large-scale Image Dataset for Logo Detection | 7
Medical Image Classification based on an Adaptive Size Deep Learning Model | 7
Audio-visual Saliency Prediction Model with Implicit Neural Representation | 7
Multi-Guidance CNNs for Salient Object Detection | 7
Introduction to the Special Issue on Fine-Grained Visual Recognition and Re-Identification | 7
Unbiased Semantic Representation Learning Based on Causal Disentanglement for Domain Generalization | 7
3D Tensor Auto-encoder with Application to Video Compression | 7
Expanding-Window Zigzag Decodable Fountain Codes for Scalable Multimedia Transmission | 7
Double High-Order Correlation Preserved Robust Multi-View Ensemble Clustering | 7
A Large-Scale Synthetic Gait Dataset Towards in-the-Wild Simulation and Comparison Study | 7
Boundary Attention Guided Sparse Feature Learning for Underwater Object Tracking in Edge Computing | 7
Multi-Grained Contrastive Learning for Text-supervised Open-vocabulary Semantic Segmentation | 7
EiMOL: A Secure Medical Image Encryption Algorithm based on Optimization and the Lorenz System | 7
Temporal Dropout for Weakly Supervised Action Localization | 7
Sparsity-guided Discriminative Feature Encoding for Robust Keypoint Detection | 7
The Price of Unlearning: Identifying Unlearning Risk in Edge Computing | 7
Deep Unsupervised Key Frame Extraction for Efficient Video Classification | 7
S2CL-LeafNet: Recognizing Leaf Images Like Human Botanists | 7
TP-FER: An Effective Three-phase Noise-tolerant Recognizer for Facial Expression Recognition | 7
Multimodal Graph for Unaligned Multimodal Sequence Analysis via Graph Convolution and Graph Pooling | 7
E-detector: Asynchronous Spatio-temporal for Event-based Object Detection in Intelligent Transportation System | 7
Pivot: Panoramic-Image-Based VR User Authentication against Side-Channel Attacks | 6
Person in Uniforms Re-Identification | 6
A Multi-feature and Time-aware-based Stress Evaluation Mechanism for Mental Status Adjustment | 6
Millimeter Wave and Free-space-optics for Future Dual-connectivity 6DOF Mobile Multi-user VR Streaming | 6
Fine-grained Image Classification via Multi-scale Selective Hierarchical Biquadratic Pooling | 6
Efficient Brain Tumor Segmentation with Lightweight Separable Spatial Convolutional Network | 6
Semantic-Consistency-guided Learning on Deep Features for Unsupervised Salient Object Detection | 6
An Explainable Deep Learning Ensemble Model for Robust Diagnosis of Diabetic Retinopathy Grading | 6
Perceptual Quality Assessment of Low-light Image Enhancement | 6
A Self-Defense Copyright Protection Scheme for NFT Image Art Based on Information Embedding | 6
Attention, Please! Adversarial Defense via Activation Rectification and Preservation | 6
HCMS: Hierarchical and Conditional Modality Selection for Efficient Video Recognition | 6
Robust Video Stabilization based on Motion Decomposition | 6
Artificial Intelligence Empowered Digital Twins for ECG Monitoring in a Smart Home | 6
Efficient Video Transformers via Spatial-temporal Token Merging for Action Recognition | 6
Immersive Multimedia Service Caching in Edge Cloud with Renewable Energy | 6
Distribution Aligned Multimodal and Multi-domain Image Stylization | 6
Spatiotemporal Inconsistency Learning and Interactive Fusion for Deepfake Video Detection | 6
Discard Significant Bits of Compressed Sensing: A Robust Image Coding for Resource-Limited Contexts | 6
Attention-guided Multi-modality Interaction Network for RGB-D Salient Object Detection | 6
Dynamic Weighted Adversarial Learning for Semi-Supervised Classification under Intersectional Class Mismatch | 6
Deep Active Context Estimation for Automated COVID-19 Diagnosis | 6
Learning Offset Probability Distribution for Accurate Object Detection | 6
Improving Face Anti-spoofing via Advanced Multi-perspective Feature Learning | 6
MCFNet: Multi-Attentional Class Feature Augmentation Network for Real-Time Scene Parsing | 6