ACM Transactions on Multimedia Computing Communications and Applications

Papers
(The TQCC of ACM Transactions on Multimedia Computing Communications and Applications is 7. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-05-01 to 2025-05-01.)
Article | Citations
Upsampling Algorithm for V-PCC-Coded 3D Point Clouds | 250
Establishing Trust and Security in Decentralized Metaverse: A Web 3.0 Approach | 148
AED-PADA: Improving Generalizability of Adversarial Example Detection via Principal Adversarial Domain Adaptation | 122
CVLP-NaVD: Contrastive Visual-Language Pre-training Models for Non-annotated Visual Description | 111
Facial-expression-aware Emotional Color Transfer Based on Convolutional Neural Network | 99
Semi-supervised Learning for Mars Imagery Classification and Segmentation | 86
Image-Based Personality Questionnaire Design | 80
JDAN: Joint Detection and Association Network for Real-Time Online Multi-Object Tracking | 79
From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation | 70
Unsupervised Discovery and Manipulation of Continuous Disentangled Factors of Variation | 69
Self-Adaptive Representation Learning Model for Multi-Modal Sentiment and Sarcasm Joint Analysis | 63
Tensorial Evolutionary Optimization for Natural Image Matting | 61
QuickCSGModeling: Quick CSG Operations Based on Fusing Signed Distance Fields for VR Modeling | 61
Reconstruction-Free Image Compression for Machine Vision via Knowledge Transfer | 60
Backdoor Two-Stream Video Models on Federated Learning | 60
Enhanced Video Super-Resolution Network towards Compressed Data | 58
Hypercube Pooling for Visual Semantic Embedding | 57
Attentional Composition Networks for Long-Tailed Human Action Recognition | 55
A Siamese Inverted Residuals Network Image Steganalysis Scheme based on Deep Learning | 53
Joint Mixing Data Augmentation for Skeleton-Based Action Recognition | 49
Discriminative Action Snippet Propagation Network for Weakly Supervised Temporal Action Localization | 48
Towards Intelligent Attack Detection Using DNA Computing | 47
Explainable AI: A Multispectral Palm-Vein Identification System with New Augmentation Features | 46
Image Cropping with Content and Composition Attribute-aware Global Relation Reasoning | 44
Leveraging Deep Statistics for Underwater Image Enhancement | 42
Fine-Grained Text-to-Video Temporal Grounding from Coarse Boundary | 41
Smart Director: An Event-Driven Directing System for Live Broadcasting | 41
Category-Level Pose Estimation and Iterative Refinement for Monocular RGB-D Image | 40
A Comprehensive Survey on Methods for Image Integrity | 40
Point Cloud Quality Assessment: Dataset Construction and Learning-based No-reference Metric | 38
BiC-Net: Learning Efficient Spatio-temporal Relation for Text-Video Retrieval | 38
Unsupervised Domain Expansion for Visual Categorization | 37
Quantum Fourier Convolutional Network | 36
Efficient Light Field Image Compression with Enhanced Random Access | 35
Using Four Hypothesis Probability Estimators for CABAC in Versatile Video Coding | 35
Psychology-Guided Environment Aware Network for Discovering Social Interaction Groups from Videos | 35
Rank-in-Rank Loss for Person Re-identification | 35
VISCOUNTH: A Large-scale Multilingual Visual Question Answering Dataset for Cultural Heritage | 35
Detection of Moving Object Using Superpixel Fusion Network | 34
Decoupling Deep Learning for Enhanced Image Recognition Interpretability | 34
Expanding-Window Zigzag Decodable Fountain Codes for Scalable Multimedia Transmission | 34
GMS-3DQA: Projection-Based Grid Mini-patch Sampling for 3D Model Quality Assessment | 32
(Compress and Restore)^N: A Robust Defense Against Adversarial Attacks on Image Classification | 32
The Price of Unlearning: Identifying Unlearning Risk in Edge Computing | 31
LogoDet-3K: A Large-scale Image Dataset for Logo Detection | 31
Immersive Multimedia Service Caching in Edge Cloud with Renewable Energy | 30
A Self-Defense Copyright Protection Scheme for NFT Image Art Based on Information Embedding | 30
A Multi-Task Adversarial Attack against Face Authentication | 29
Boundary Attention Guided Sparse Feature Learning for Underwater Object Tracking in Edge Computing | 27
HCMS: Hierarchical and Conditional Modality Selection for Efficient Video Recognition | 26
Robust Video Stabilization based on Motion Decomposition | 26
Visual-linguistic-stylistic Triple Reward for Cross-lingual Image Captioning | 25
Exploiting Instance-level Relationships in Weakly Supervised Text-to-Video Retrieval | 25
Universal Relocalizer for Weakly Supervised Referring Expression Grounding | 25
SNIPPET: A Framework for Subjective Evaluation of Visual Explanations Applied to DeepFake Detection | 25
Fine-grained Image Classification via Multi-scale Selective Hierarchical Biquadratic Pooling | 25
Image Defogging Based on Regional Gradient Constrained Prior | 25
New Metrics and Dataset for Biological Development Video Generation | 25
A Multi-feature and Time-aware-based Stress Evaluation Mechanism for Mental Status Adjustment | 24
ViCoFace: Learning Disentangled Latent Motion Representations for Visual-Consistent Face Reenactment | 24
GANonymization: A GAN-Based Face Anonymization Framework for Preserving Emotional Expressions | 23
A Quality of Experience and Visual Attention Evaluation for 360° Videos with Non-spatial and Spatial Audio | 23
Non-Acted Text and Keystrokes Database and Learning Methods to Recognize Emotions | 23
Introduction to the Special Issue on Advanced Approaches for Multiple Instance Learning on Multimedia Applications | 22
EiMOL: A Secure Medical Image Encryption Algorithm based on Optimization and the Lorenz System | 22
Precise No-Reference Image Quality Evaluation Based on Distortion Identification | 22
Visual Security Index Combining CNN and Filter for Perceptually Encrypted Light Field Images | 21
Visual Semantic-Based Representation Learning Using Deep CNNs for Scene Recognition | 21
LayoutEnc: Leveraging Enhanced Layout Representations for Transformer-based Complex Scene Synthesis | 21
A Multi-instance Multi-label Dual Learning Approach for Video Captioning | 21
Melody Generation from Lyrics with Local Interpretability | 21
TEVL: Trilinear Encoder for Video-language Representation Learning | 21
Temporal and Semantic Correlation Network for Weakly-Supervised Temporal Action Localization | 21
Gloss-driven Conditional Diffusion Models for Sign Language Production | 20
Human Selective Matting | 20
An Efficient and Accurate GPU-based Deep Learning Model for Multimedia Recommendation | 20
Similarity Regulation and Calibration Alignment for Weakly Supervised Text-Based Person Re-Identification | 20
Multi-Grained Point Cloud Geometry Compression via Dual-Model Prediction with Extended Octree | 20
ATMNet: Adaptive Texture Migration Network for Guided Depth Super-Resolution | 20
Cyclic Self-attention for Point Cloud Recognition | 19
Authentication of LINE Chat History Files by Information Hiding | 19
Principal Component Approximation Network for Image Compression | 19
Zero-shot Scene Graph Generation via Triplet Calibration and Reduction | 19
Pedestrian-Aware Panoramic Video Stitching Based on a Structured Camera Array | 19
Counterfactual Scenario-relevant Knowledge-enriched Multi-modal Emotion Reasoning | 19
Toward Egocentric Compositional Action Anticipation with Adaptive Semantic Debiasing | 19
Temporal Dynamic Concept Modeling Network for Explainable Video Event Recognition | 19
One-Bit Supervision for Image Classification: Problem, Solution, and Beyond | 19
Cross-modal Semantically Augmented Network for Image-text Matching | 19
Reversible Data Hiding in Shared JPEG Images | 18
StepNet: Spatial-temporal Part-aware Network for Isolated Sign Language Recognition | 18
Adversarial Sample Synthesis for Visual Question Answering | 18
Structure-aware Video Style Transfer with Map Art | 18
DATRA-MIV: Decoder-Adaptive Tiling and Rate Allocation for MPEG Immersive Video | 18
Motion-Aware Self-Supervised RGBT Tracking with Multi-Modality Hierarchical Transformers | 18
A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach | 18
Multigranularity Feature Aggregation and Cross-level Boundary Modeling for Temporal Action Detection | 17
Robust Unsupervised Gaze Calibration Using Conversation and Manipulation Attention Priors | 17
Learning Domain Invariant Features for Unsupervised Indoor Depth Estimation Adaptation | 17
Attack-Defending Contrastive Learning for Volumetric Medical Image Zero-Watermarking | 17
Maximizing Long-Term Task Completion Ratio of UAV-Enabled Wirelessly Powered MEC Systems | 17
Potential Features Fusion Network for Multimodal Fake News Detection | 17
BiRe-ID: Binary Neural Network for Efficient Person Re-ID | 17
Pansharpening Scheme Using Bi-dimensional Empirical Mode Decomposition and Neural Network | 17
Fully Unsupervised Person Re-Identification via Selective Contrastive Learning | 17
DISA: Disentangled Dual-Branch Framework for Affordance-Aware Human Insertion | 17
Mutually-Guided Hierarchical Multi-Modal Feature Learning for Referring Image Segmentation | 17
Dynamic Transfer Exemplar based Facial Emotion Recognition Model Toward Online Video | 17
GLPose: Global-Local Representation Learning for Human Pose Estimation | 16
Diversity-Representativeness Replay and Knowledge Alignment for Lifelong Vehicle Re-identification | 16
Deep Modular Co-Attention Shifting Network for Multimodal Sentiment Analysis | 16
Multiply Complementary Priors for Image Compressive Sensing Reconstruction in Impulsive Noise | 16
Query-Guided Prototype Learning with Decoder Alignment and Dynamic Fusion in Few-Shot Segmentation | 16
ReFID: Reciprocal Frequency-aware Generalizable Person Re-identification via Decomposition and Filtering | 16
Towards Integrating Image Encryption with Compression: A Survey | 15
Introduction to the Special Issue on Recent Trends in Medical Data Security for e-Health Applications | 15
Robust RGB-T Tracking via Adaptive Modality Weight Correlation Filters and Cross-modality Learning | 15
A Comprehensive Study of Deep Learning-based Covert Communication | 15
PADVG: A Simple Baseline of Active Protection for Audio-Driven Video Generation | 15
Triplet Contrastive Representation Learning for Unsupervised Vehicle Re-identification | 15
Generative Image Steganography Based on Guidance Feature Distribution | 15
Text-Guided Synthesis of Masked Face Images | 15
Shot Boundary Detection Using Color Clustering and Attention Mechanism | 15
Introduction to the Special Issue on Explainable Deep Learning for Medical Image Computing | 14
NSDIE: Noise Suppressing Dark Image Enhancement Using Multiscale Retinex and Low-Rank Minimization | 14
Cascaded Adaptive Graph Representation Learning for Image Copy-Move Forgery Detection | 14
Semantics and Non-fungible Tokens for Copyright Management on the Metaverse and Beyond | 14
Sentiment-Oriented Transformer-Based Variational Autoencoder Network for Live Video Commenting | 14
ProposalVLAD with Proposal-Intra Exploring for Temporal Action Proposal Generation | 14
Arbitrary Virtual Try-on Network: Characteristics Preservation and Tradeoff between Body and Clothing | 14
Online Correction of Camera Poses for the Surround-view System: A Sparse Direct Approach | 14
Content-Aware Selective Encryption for H.265/HEVC Using Deep Hashing Network and Steganography | 14
Unsupervised Domain Adaptation by Causal Learning for Biometric Signal-based HCI | 14
Generating Robust Adversarial Examples against Online Social Networks (OSNs) | 14
Robust Image Hashing via CP Decomposition and DCT for Copy Detection | 13
Self-supervised Multi-view Learning via Auto-encoding 3D Transformations | 13
Privacy-preserving Multi-source Cross-domain Recommendation Based on Knowledge Graph | 13
Multi-Modal Driven Pose-Controllable Talking Head Generation | 13
A Densely Connected Network Based on U-Net for Medical Image Segmentation | 13
Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications | 13
Where Are They Going? Predicting Human Behaviors in Crowded Scenes | 13
Domain-invariant and Patch-discriminative Feature Learning for General Deepfake Detection | 13
GAN-Assisted Road Segmentation from Satellite Imagery | 13
Temporal Scene Montage for Self-Supervised Video Scene Boundary Detection | 13
Mimicking Individual Media Quality Perception with Neural Network based Artificial Observers | 13
Semantic Completion and Filtration for Image–Text Retrieval | 13
SSAT: Active Authorization Control and User’s Fingerprint Tracking Framework for DNN IP Protection | 13
Learning Nighttime Semantic Segmentation the Hard Way | 13
Dual Dynamic Threshold Adjustment Strategy | 13
MFECN: Multi-level Feature Enhanced Cumulative Network for Scene Text Detection | 13
Action-aware Linguistic Skeleton Optimization Network for Non-autoregressive Video Captioning | 13
Tell, Imagine, and Search: End-to-end Learning for Composing Text and Image to Image Retrieval | 13
Transformer-Based Visual Grounding with Cross-Modality Interaction | 13
Quality Assessment in the Era of Large Models: A Survey | 12
EVASR: Edge-Based Salience-Aware Super-Resolution for Enhanced Video Quality and Power Efficiency | 12
Hyperspectral Image Reconstruction Using Multi-scale Fusion Learning | 12
Dual Scene Graph Convolutional Network for Motivation Prediction | 12
Self-supervised Calorie-aware Heterogeneous Graph Networks for Food Recommendation | 12
Multi-view Shape Generation for a 3D Human-like Body | 12
Quality Enhancement of Compressed 360-Degree Videos Using Viewport-based Deep Neural Networks | 12
A Real-Time Effective Fusion-Based Image Defogging Architecture on FPGA | 12
A Multiple Sieve Approach Based on Artificial Intelligent Techniques and Correlation Power Analysis | 12
Review and Analysis of RGBT Single Object Tracking Methods: A Fusion Perspective | 12
A Normalized Slicing-assigned Virtualization Method for 6G-based Wireless Communication Systems | 12
Progressive Transformer Machine for Natural Character Reenactment | 12
Skeleton-Aware Graph-Based Adversarial Networks for Human Pose Estimation from Sparse IMUs | 12
Language-guided Bias Generation Contrastive Strategy for Visual Question Answering | 12
Dual-Stream Structured Graph Convolution Network for Skeleton-Based Action Recognition | 12
Autoregressive GAN for Semantic Unconditional Head Motion Generation | 12
Noise-Resistance Learning via Multi-Granularity Consistency for Unsupervised Domain Adaptive Person Re-Identification | 12
Toward High-quality Face-Mask Occluded Restoration | 12
Learning the User’s Deeper Preferences for Multi-modal Recommendation Systems | 12
Robust Long-Term Tracking via Localizing Occluders | 11
Lifelog Image Retrieval Based on Semantic Relevance Mapping | 11
Boosting Few-shot Object Detection with Discriminative Representation and Class Margin | 11
Full-body Human Motion Reconstruction with Sparse Joint Tracking Using Flexible Sensors | 11
PMAL: A Proxy Model Active Learning Approach for Vision Based Industrial Applications | 11
A Hierarchically Discriminative Loss with Group Regularization for Fine-Grained Image Classification | 11
Task-independent Recognition of Communication Skills in Group Interaction Using Time-series Modeling | 11
GJFusion: A Channel-Level Correlation Construction Method for Multimodal Physiological Signal Fusion | 11
Generating and Evaluating Data of Daily Activities with an Autonomous Agent in a Virtual Smart Home | 11
Balanced and Accurate Pseudo-Labels for Semi-Supervised Image Classification | 11
Dynamic Weighted Gradient Reversal Network for Visible-infrared Person Re-identification | 11
Language-guided Residual Graph Attention Network and Data Augmentation for Visual Grounding | 11
A Convolutional Neural Network Model Using Weighted Loss Function to Detect Diabetic Retinopathy | 11
Pose- and Attribute-consistent Person Image Synthesis | 11
AMC: Adaptive Multi-expert Collaborative Network for Text-guided Image Retrieval | 11
InteractNet: Social Interaction Recognition for Semantic-rich Videos | 11
Smart City Construction and Management by Digital Twins and BIM Big Data in COVID-19 Scenario | 11
Joint Structure-Texture Scan-Order for Point Cloud Attribute Compression Using Affine Transformation | 11
VRVul-Discovery: BiLSTM-based Vulnerability Discovery for Virtual Reality Devices in Metaverse | 11
Residual-guided In-loop Filter Using Convolution Neural Network | 11
SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection | 11
Beyond the Parts: Learning Coarse-to-Fine Adaptive Alignment Representation for Person Search | 11
Enhanced 3D Shape Reconstruction With Knowledge Graph of Category Concept | 10
MLIC++: Linear Complexity Multi-Reference Entropy Modeling for Learned Image Compression | 10
Instance-level Adversarial Source-free Domain Adaptive Person Re-identification | 10
Boolean-based Two-in-One Secret Image Sharing by Adaptive Pixel Grouping | 10
Multi-Scale and Multi-Layer Lattice Transformer for Underwater Image Enhancement | 10
Optimized Deep-Neural Network for Content-based Medical Image Retrieval in a Brownfield IoMT Network | 10
Meetor: A Human-Centered Automatic Video Editing System for Meeting Recordings | 10
Part-wise Spatio-temporal Attention Driven CNN-based 3D Human Action Recognition | 10
Compressed Point Cloud Quality Index by Combining Global Appearance and Local Details | 10
Unsupervised Adversarial Example Detection of Vision Transformers for Trustworthy Edge Computing | 10
Inner Knowledge-based Img2Doc Scheme for Visual Question Answering | 10
Self-Supervised Monocular Depth Estimation via Binocular Geometric Correlation Learning | 10
Deep Differential Lifelong Cross-modal Hashing for Stream Medical Data Retrieval | 10
When Pairs Meet Triplets: Improving Low-Resource Captioning via Multi-Objective Optimization | 10
Learning Semantic Representation on Visual Attribute Graph for Person Re-identification and Beyond | 10
A Multimodal Hierarchical Attentional Ordering Network | 10
Invisible Adversarial Watermarking: A Novel Security Mechanism for Enhancing Copyright Protection | 10
iDAM: Iteratively Trained Deep In-loop Filter with Adaptive Model Selection | 9
A Review of Player Engagement Estimation in Video Games: Challenges and Opportunities | 9
DNA Computing-Based Multi-Source Data Storage Model in Digital Twins | 9
Detail-preserving Joint Image Upsampling | 9
Double High-Order Correlation Preserved Robust Multi-View Ensemble Clustering | 9
Privacy-Enhanced Prototype-Based Federated Cross-Modal Hashing for Cross-Modal Retrieval | 9
Hierarchical and Progressive Image Matting | 9
Context-Based Novel Histogram Bin Stretching Algorithm for Automatic Contrast Enhancement | 9
Dual-Modality-Shared Learning and Label Refinement for Unsupervised Visible-Infrared Person ReID | 9
LFIZW-GRHFMR: Robust Zero-Watermarking with GRHFMR for Light Field Image | 9
Beyond Songs: Analyzing User Sentiment through Music Playlists and Multimodal Data | 9
Multimodal Graph for Unaligned Multimodal Sequence Analysis via Graph Convolution and Graph Pooling | 9
SOEDiff: Efficient Distillation for Small Object Editing | 9
S²CL-Leaf Net: Recognizing Leaf Images Like Human Botanists | 9
Variational Autoencoder with CCA for Audio–Visual Cross-modal Retrieval | 9
DPDFormer: A Coarse-to-Fine Model for Monocular Depth Estimation | 9
FishFormer: Annulus Slicing-based Transformer for Fisheye Rectification | 9
Multimodal Cascaded Framework with Multimodal Latent Loss Functions Robust to Missing Modalities | 9
Style-FG: a style-based framework for film grain analysis and synthesis | 9
A Deep Retinex-Based Low-Light Enhancement Network Fusing Rich Intrinsic Prior Information | 9
Efficient Video Transformers via Spatial-temporal Token Merging for Action Recognition | 9
MF²ShrT: Multimodal Feature Fusion Using Shared Layered Transformer for Face Anti-spoofing | 9
Multi-Modal Sarcasm Detection via Knowledge-aware Focused Graph Convolutional Networks | 9
CAQoE: A Novel No-Reference Context-aware Speech Quality Prediction Metric | 9
Complementary Feature Pyramid Network for Object Detection | 9
Self-supervised Image-based 3D Model Retrieval | 9
Editorial to Special Issue on Multimedia Cognitive Computing for Intelligent Transportation System | 8
DEGAN: Detail-Enhanced Generative Adversarial Network for Monocular Depth-Based 3D Reconstruction | 8
Opportunistic Transmission for Video Streaming over Wild Internet | 8
Black-Box Diagnosis and Calibration on GAN Intra-Mode Collapse: A Pilot Study | 8
Introduction to the Special Issue on Deep Learning for Robust Human Body Language Understanding | 8
Ischemic Stroke Segmentation by Transformer and Convolutional Neural Network Using Few-Shot Learning | 8
TPTE: Text-Guided Patch Token Exploitation for Unsupervised Fine-Grained Representation Learning | 8
Compression Approaches for LiDAR Point Clouds and Beyond: A Survey | 8
Bottom-up and Top-down Object Inference Networks for Image Captioning | 8
TEC-CNN: Toward Efficient Compressing of Convolutional Neural Nets with Low-rank Tensor Decomposition | 8
Depth Matters: Spatial Proximity-Based Gaze Cone Generation for Gaze Following in Wild | 8
Invertible Grayscale with Sparsity Enforcing Priors | 8
RSUIGM: Realistic Synthetic Underwater Image Generation with Image Formation Model | 8