ACM Transactions on Multimedia Computing Communications and Applicatio

Papers
(The median citation count of ACM Transactions on Multimedia Computing Communications and Applicatio is 2. The table below lists the papers whose CrossRef citation counts exceed that threshold [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-11-01 to 2024-11-01.)
Article | Citations
Understanding and Creating Art with AI: Review and Outlook | 164
Depth Image Denoising Using Nuclear Norm and Learning Graph Model | 161
Age-Invariant Face Recognition by Multi-Feature Fusion and Decomposition with Self-attention | 117
Learning Adaptive Spatial-Temporal Context-Aware Correlation Filters for UAV Tracking | 109
Precise No-Reference Image Quality Evaluation Based on Distortion Identification | 89
A Fast Defogging Image Recognition Algorithm Based on Bilateral Hybrid Filtering | 84
Chinese Image Captioning via Fuzzy Attention-based DenseNet-BiLSTM | 78
Fine-Grained Visual Textual Alignment for Cross-Modal Retrieval Using Transformer Encoders | 78
TripRes | 73
Survey on Deep Multi-modal Data Analytics: Collaboration, Rivalry, and Fusion | 67
Smart City Construction and Management by Digital Twins and BIM Big Data in COVID-19 Scenario | 62
Wavelength-based Attributed Deep Neural Network for Underwater Image Restoration | 62
Conditional LSTM-GAN for Melody Generation from Lyrics | 59
A Weakly Supervised Semantic Segmentation Network by Aggregating Seed Cues: The Multi-Object Proposal Generation Perspective | 56
Deep Learning-based Smart Predictive Evaluation for Interactive Multimedia-enabled Smart Healthcare | 55
Cross-modal Graph Matching Network for Image-text Retrieval | 55
Music2Dance: DanceNet for Music-Driven Dance Generation | 55
Compatibility-Aware Web API Recommendation for Mashup Creation via Textual Description Mining | 54
Knowledge-aware Multi-modal Adaptive Graph Convolutional Networks for Fake News Detection | 49
Perceptual Quality Assessment of Low-light Image Enhancement | 49
Trust Mechanism of Feedback Trust Weight in Multimedia Network | 46
Towards Integrating Image Encryption with Compression: A Survey | 45
A Survey on Healthcare Data: A Security Perspective | 42
Spherical Convolution Empowered Viewport Prediction in 360 Video Multicast with Limited FoV Feedback | 42
Uncertainty-Aware Semi-Supervised Method Using Large Unlabeled and Limited Labeled COVID-19 Data | 41
A Review on Methods and Applications in Multimodal Deep Learning | 40
An Effective Forest Fire Detection Framework Using Heterogeneous Wireless Multimedia Sensor Networks | 40
Is the Reign of Interactive Search Eternal? Findings from the Video Browser Showdown 2020 | 39
Local Correlation Ensemble with GCN Based on Attention Features for Cross-domain Person Re-ID | 37
A Multimodal, Multimedia Point-of-Care Deep Learning Framework for COVID-19 Diagnosis | 37
UID2021: An Underwater Image Dataset for Evaluation of No-Reference Quality Assessment Metrics | 36
Integrating Scene Semantic Knowledge into Image Captioning | 34
Part-wise Spatio-temporal Attention Driven CNN-based 3D Human Action Recognition | 33
LogoDet-3K: A Large-scale Image Dataset for Logo Detection | 32
Automatic Assessment of Depression and Anxiety through Encoding Pupil-wave from HCI in VR Scenes | 32
Point Cloud Quality Assessment: Dataset Construction and Learning-based No-reference Metric | 32
A Deep Multi-level Attentive Network for Multimodal Sentiment Analysis | 32
Market2Dish: Health-aware Food Recommendation | 31
Multitarget Tracking Using Siamese Neural Networks | 31
Privacy-preserving Decentralized Learning Framework for Healthcare System | 30
Multi-Tier CloudVR | 29
Security and Privacy of Patient Information in Medical Systems Based on Blockchain Technology | 28
Fine-Grained Visual Computing Based on Deep Learning | 28
Label Consistent Flexible Matrix Factorization Hashing for Efficient Cross-modal Retrieval | 28
xCos: An Explainable Cosine Metric for Face Verification Task | 27
A Multi-agent Feature Selection and Hybrid Classification Model for Parkinson's Disease Diagnosis | 27
Bi-Directional Co-Attention Network for Image Captioning | 26
A DNA Based Colour Image Encryption Scheme Using A Convolutional Autoencoder | 26
An Explainable Deep Learning Ensemble Model for Robust Diagnosis of Diabetic Retinopathy Grading | 26
Zero-shot Cross-modal Retrieval by Assembling AutoEncoder and Generative Adversarial Network | 26
Bottom-up and Layerwise Domain Adaptation for Pedestrian Detection in Thermal Images | 26
Global-Local Enhancement Network for NMF-Aware Sign Language Recognition | 25
Cross-Modal Hybrid Feature Fusion for Image-Sentence Matching | 25
Hierarchical Multi-Attention Transfer for Knowledge Distillation | 24
Scenario-Aware Recurrent Transformer for Goal-Directed Video Captioning | 24
Deepfake Video Detection via Predictive Representation Learning | 24
Explanation-Driven HCI Model to Examine the Mini-Mental State for Alzheimer’s Disease | 22
Disentangling Features for Fashion Recommendation | 22
eDiaPredict: An Ensemble-based Framework for Diabetes Prediction | 22
Exploring Image Enhancement for Salient Object Detection in Low Light Images | 22
Decoupled Low-Light Image Enhancement | 21
Controlling Neural Learning Network with Multiple Scales for Image Splicing Forgery Detection | 21
Deep Convolutional Pooling Transformer for Deepfake Detection | 21
Voice-Face Homogeneity Tells Deepfake | 20
EiMOL: A Secure Medical Image Encryption Algorithm based on Optimization and the Lorenz System | 20
A Novel (t, s, k, n)-Threshold Visual Secret Sharing Scheme Based on Ac | 20
A Multimodal Framework for Large-Scale Emotion Recognition by Fusing Music and Electrodermal Activity Signals | 20
Lightweight Multi-party Authentication and Key Agreement Protocol in IoT-based E-Healthcare Service | 20
HCMSL: Hybrid Cross-modal Similarity Learning for Cross-modal Retrieval | 20
Attribute-wise Explainable Fashion Compatibility Modeling | 19
A Semi-supervised Learning Approach Based on Adaptive Weighted Fusion for Automatic Image Annotation | 19
3D Tooth Instance Segmentation Learning Objectness and Affinity in Point Cloud | 19
Deep Q Network–Driven Task Offloading for Efficient Multimedia Data Analysis in Edge Computing–Assisted IoV | 18
Pinball Loss Twin Support Vector Clustering | 18
Robust Secret Image Sharing Resistant to Noise in Shares | 17
Fine-grained Image Classification via Multi-scale Selective Hierarchical Biquadratic Pooling | 17
A Sorting Fuzzy Min-Max Model in an Embedded System for Atrial Fibrillation Detection | 17
Medical Image Classification based on an Adaptive Size Deep Learning Model | 17
Part-based Structured Representation Learning for Person Re-identification | 17
Correlation Discrepancy Insight Network for Video Re-identification | 16
A Novel Multi-Sample Generation Method for Adversarial Attacks | 16
RD-IOD: Two-Level Residual-Distillation-Based Triple-Network for Incremental Object Detection | 16
Knowledge-driven Egocentric Multimodal Activity Recognition | 16
Revisiting Local Descriptor for Improved Few-Shot Classification | 16
Skeleton Sequence and RGB Frame Based Multi-Modality Feature Fusion Network for Action Recognition | 16
Moment is Important: Language-Based Video Moment Retrieval via Adversarial Learning | 16
FasterPose: A Faster Simple Baseline for Human Pose Estimation | 15
Am I Done? Predicting Action Progress in Videos | 15
Entropy Slicing Extraction and Transfer Learning Classification for Early Diagnosis of Alzheimer Diseases with sMRI | 15
SDN Enabled QoE and Security Framework for Multimedia Applications in 5G Networks | 15
Gaussian Mixture Model Clustering with Incomplete Data | 15
Double Attention Based on Graph Attention Network for Image Multi-Label Classification | 15
Deep Semantic and Attentive Network for Unsupervised Video Summarization | 15
A Comprehensive Study of Deep Learning-based Covert Communication | 15
Towards Accurate Oriented Object Detection in Aerial Images with Adaptive Multi-level Feature Fusion | 15
Causal Inference with Knowledge Distilling and Curriculum Learning for Unbiased VQA | 15
GuessUNeed | 15
Detection of AI-Manipulated Fake Faces via Mining Generalized Features | 15
Learning to Fool the Speaker Recognition | 15
ECCNAS: Efficient Crowd Counting Neural Architecture Search | 14
Egocentric Early Action Prediction via Adversarial Knowledge Distillation | 14
ART-UP: A Novel Method for Generating Scanning-Robust Aesthetic QR Codes | 14
Fully Unsupervised Person Re-Identification via Selective Contrastive Learning | 14
Secure Chaff-less Fuzzy Vault for Face Identification Systems | 14
Distribution Aligned Multimodal and Multi-domain Image Stylization | 14
A Novel GAPG Approach to Automatic Property Generation for Formal Verification: The GAN Perspective | 14
Doctor's Dilemma: Evaluating an Explainable Subtractive Spatial Lightweight Convolutional Neural Network for Brain Tumor Diagnosis | 14
Answer Questions with Right Image Regions: A Visual Attention Regularization Approach | 14
Explainable AI: A Multispectral Palm-Vein Identification System with New Augmentation Features | 14
Leveraging Deep Statistics for Underwater Image Enhancement | 13
Introduction to the Special Issue on Recent Trends in Medical Data Security for e-Health Applications | 13
Hyper-node Relational Graph Attention Network for Multi-modal Knowledge Graph Completion | 13
Binary Representation via Jointly Personalized Sparse Hashing | 13
Dynamic Graph Learning Convolutional Networks for Semi-supervised Classification | 13
Smart Director: An Event-Driven Directing System for Live Broadcasting | 13
Generation of Realistic Synthetic Financial Time-series | 13
Multi-human Parsing with a Graph-based Generative Adversarial Model | 13
Generative Metric Learning for Adversarially Robust Open-world Person Re-Identification | 13
Clustering Matters: Sphere Feature for Fully Unsupervised Person Re-identification | 13
SADnet: Semi-supervised Single Image Dehazing Method Based on an Attention Mechanism | 13
Video Frame Interpolation: A Comprehensive Survey | 13
A Format-compatible Searchable Encryption Scheme for JPEG Images Using Bag-of-words | 13
Deep Illumination-Enhanced Face Super-Resolution Network for Low-Light Images | 13
Motion-Aware Structured Matrix Factorization for Foreground Detection in Complex Scenes | 12
Exploiting Attention-Consistency Loss For Spatial-Temporal Stream Action Recognition | 12
Toward Intelligent Fashion Design: A Texture and Shape Disentangled Generative Adversarial Network | 12
Multi-feature Fusion VoteNet for 3D Object Detection | 12
Multi-Guidance CNNs for Salient Object Detection | 12
Graph Attention Transformer Network for Multi-label Image Classification | 12
A Densely Connected Network Based on U-Net for Medical Image Segmentation | 12
MMFN | 12
Structure-aware Meta-fusion for Image Super-resolution | 12
Deep Self-Supervised Hyperspectral Image Reconstruction | 12
Deep Unsupervised Key Frame Extraction for Efficient Video Classification | 12
RDH-DES: Reversible Data Hiding over Distributed Encrypted-Image Servers Based on Secret Sharing | 12
Self-supervised Calorie-aware Heterogeneous Graph Networks for Food Recommendation | 12
Optimizing Performance of Federated Person Re-identification: Benchmarking and Analysis | 12
Unifying Dual-Attention and Siamese Transformer Network for Full-Reference Image Quality Assessment | 12
JoT-GAN: A Framework for Jointly Training GAN and Person Re-Identification Model | 12
Less Is More: Learning from Synthetic Data with Fine-Grained Attributes for Person Re-Identification | 12
Blockchain-Based Audio Watermarking Technique for Multimedia Copyright Protection in Distribution Networks | 11
Semantic Completion and Filtration for Image–Text Retrieval | 11
Shuffle-invariant Network for Action Recognition in Videos | 11
Deep Uncoupled Discrete Hashing via Similarity Matrix Decomposition | 11
Dilated Convolution-based Feature Refinement Network for Crowd Localization | 11
A Convolutional Neural Network Model Using Weighted Loss Function to Detect Diabetic Retinopathy | 11
An Explainable Framework for Diagnosis of COVID-19 Pneumonia via Transfer Learning and Discriminant Correlation Analysis | 11
Output-Bounded and RBFNN-Based Position Tracking and Adaptive Force Control for Security Tele-Surgery | 11
Hybrid Modality Metric Learning for Visible-Infrared Person Re-Identification | 11
Multi-granularity Brushstrokes Network for Universal Style Transfer | 11
Introduction to the Special Issue on Trustworthy Multimedia Computing and Applications in Urban Scenes | 11
Transform, Warp, and Dress: A New Transformation-guided Model for Virtual Try-on | 11
MILL: Channel Attention–based Deep Multiple Instance Learning for Landslide Recognition | 11
GreyReID: A Novel Two-stream Deep Framework with RGB-grey Information for Person Re-identification | 10
Tell, Imagine, and Search: End-to-end Learning for Composing Text and Image to Image Retrieval | 10
WTRPNet: An Explainable Graph Feature Convolutional Neural Network for Epileptic EEG Classification | 10
Alignment Enhancement Network for Fine-grained Visual Categorization | 10
iDAM: Iteratively Trained Deep In-loop Filter with Adaptive Model Selection | 10
Socializing the Videos: A Multimodal Approach for Social Relation Recognition | 10
Harmonious Multi-branch Network for Person Re-identification with Harder Triplet Loss | 10
Exploring Relations in Untrimmed Videos for Self-Supervised Learning | 10
Retrieval Augmented Convolutional Encoder-decoder Networks for Video Captioning | 10
MKVSE: Multimodal Knowledge Enhanced Visual-semantic Embedding for Image-text Retrieval | 10
Single-shot Semantic Matching Network for Moment Localization in Videos | 10
Where Are They Going? Predicting Human Behaviors in Crowded Scenes | 10
Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition | 10
Full-reference Screen Content Image Quality Assessment by Fusing Multilevel Structure Similarity | 10
An l½ and Graph Regularized Subspace Clustering Method for Robust Image Segmentation | 9
Progressive Localization Networks for Language-Based Moment Localization | 9
Sensor-based Human Activity Recognition Using Graph LSTM and Multi-task Classification Model | 9
On Modality Bias Recognition and Reduction | 9
Frequency-aware Camouflaged Object Detection | 9
Transformer-Based Visual Grounding with Cross-Modality Interaction | 9
A Survey on Temporal Sentence Grounding in Videos | 9
Efficient Light Field Image Compression with Enhanced Random Access | 9
Bidirectional Transformer GAN for Long-term Human Motion Prediction | 9
Rank-in-Rank Loss for Person Re-identification | 9
BMIF: Privacy-preserving Blockchain-based Medical Image Fusion | 9
Visual Semantic-Based Representation Learning Using Deep CNNs for Scene Recognition | 9
Rectified Meta-learning from Noisy Labels for Robust Image-based Plant Disease Classification | 9
Learning Video-Text Aligned Representations for Video Captioning | 9
Learning Semantic Representation on Visual Attribute Graph for Person Re-identification and Beyond | 9
Deep Learning Based Occluded Person Re-Identification: A Survey | 9
ProActive DeepFake Detection using GAN-based Visible Watermarking | 8
Scribble-Supervised Meibomian Glands Segmentation in Infrared Images | 8
Adaptive Attention-based High-level Semantic Introduction for Image Caption | 8
Accelerating Transform Algorithm Implementation for Efficient Intra Coding of 8K UHD Videos | 8
An End-to-end Heterogeneous Restraint Network for RGB-D Cross-modal Person Re-identification | 8
Affective Interaction: Attentive Representation Learning for Multi-Modal Sentiment Classification | 8
Modeling Long-range Dependencies and Epipolar Geometry for Multi-view Stereo | 8
Beyond the Parts: Learning Coarse-to-Fine Adaptive Alignment Representation for Person Search | 8
Boosting Scene Graph Generation with Visual Relation Saliency | 8
Mimicking Individual Media Quality Perception with Neural Network based Artificial Observers | 8
Automatic Comic Generation with Stylistic Multi-page Layouts and Emotion-driven Text Balloon Generation | 8
Synthesising Privacy by Design Knowledge Toward Explainable Internet of Things Application Designing in Healthcare | 8
MIS: A Multi-Identifier Management and Resolution System in the Metaverse | 8
Hypergraph Association Weakly Supervised Crowd Counting | 8
Meta-MMFNet: Meta-learning-based Multi-model Fusion Network for Micro-expression Recognition | 8
SPGAN: Face Forgery Using Spoofing Generative Adversarial Networks | 8
NumCap: A Number-controlled Multi-caption Image Captioning Network | 8
Web3 Metaverse: State-of-the-Art and Vision | 8
Equivariant Adversarial Network for Image-to-image Translation | 8
Data Augmentation-based Novel Deep Learning Method for Deepfaked Images Detection | 8
Non-Acted Text and Keystrokes Database and Learning Methods to Recognize Emotions | 8
A Security and Privacy Validation Methodology for e-Health Systems | 8
Interactive Search vs. Automatic Search | 8
HCNCT: A Cross-chain Interaction Scheme for the Blockchain-based Metaverse | 8
Millimeter Wave and Free-space-optics for Future Dual-connectivity 6DOF Mobile Multi-user VR Streaming | 7
A Decoupled Kernel Prediction Network Guided by Soft Mask for Single Image HDR Reconstruction | 7
Robust Searching-Based Gradient Collaborative Management in Intelligent Transportation System | 7
Multi-view Shape Generation for a 3D Human-like Body | 7
Multi-task Learning-based All-in-one Collaboration Framework for Degraded Image Super-resolution | 7
Robust Copyright Protection Technique with High-embedding Capacity for Color Images | 7
PRNU-based Image Forgery Localization with Deep Multi-scale Fusion | 7
Learning Transferable Perturbations for Image Captioning | 7
SETTI: A Self-supervised AdvErsarial Malware DeTection ArchiTecture i | 7
Using Four Hypothesis Probability Estimators for CABAC in Versatile Video Coding | 7
Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular Vision-Language Pre-training | 7
Attention-guided Multi-modality Interaction Network for RGB-D Salient Object Detection | 7
CUR Transformer: A Convolutional Unbiased Regional Transformer for Image Denoising | 7
A Multi-feature and Time-aware-based Stress Evaluation Mechanism for Mental Status Adjustment | 7
Perceptual Image Compression with Block-Level Just Noticeable Difference Prediction | 7
Perceptual Hashing of Deep Convolutional Neural Networks for Model Copy Detection | 7
Eye-based Recognition for User Identification on Mobile Devices | 7
AMSA: Adaptive Multimodal Learning for Sentiment Analysis | 7
ESRNet: Efficient Search and Recognition Network for Image Manipulation Detection | 7
Hypomimia Recognition in Parkinson’s Disease With Semantic Features | 7
Head Pose Estimation Patterns as Deepfake Detectors | 7
TT-TSVD: A Multi-modal Tensor Train Decomposition with Its Application in Convolutional Neural Networks for Smart Healthcare | 7
Detection of Moving Object Using Superpixel Fusion Network | 7
Robust Hashing via Global and Local Invariant Features for Image Copy Detection | 7
MMSUM Digital Twins: A Multi-view Multi-modality Summarization Framework for Sporting Events | 7
From Coarse to Fine: Hierarchical Structure-aware Video Summarization | 7
Hierarchical and Progressive Image Matting | 7
Multi-scale Edge-guided Learning for 3D Reconstruction | 7
Y-Net: Dual-branch Joint Network for Semantic Segmentation | 7
NR-CNN: Nested-Residual Guided CNN In-loop Filtering for Video Coding | 7
A Bayesian Quality-of-Experience Model for Adaptive Streaming Videos | 7
Spatial-temporal Regularized Multi-modality Correlation Filters for Tracking with Re-detection | 7
Fine-Grained Adversarial Semi-Supervised Learning | 7
Low-light Image Enhancement via a Frequency-based Model with Structure and Texture Decomposition | 7
Pedestrian-Aware Panoramic Video Stitching Based on a Structured Camera Array | 7
Adversarial Multi-Grained Embedding Network for Cross-Modal Text-Video Retrieval | 7
A Fast View Synthesis Implementation Method for Light Field Applications | 7
JDAN: Joint Detection and Association Network for Real-Time Online Multi-Object Tracking | 6
Adaptive Compression for Online Computer Vision: An Edge Reinforcement Learning Approach | 6
Facial-expression-aware Emotional Color Transfer Based on Convolutional Neural Network | 6
Variational Autoencoder with CCA for Audio–Visual Cross-modal Retrieval | 6
Urban Perception: Sensing Cities via a Deep Interactive Multi-task Learning Framework | 6
Bi-manual Haptic-based Periodontal Simulation with Finger Support and Vibrotactile Feedback | 6