IEEE Transactions on Pattern Analysis and Machine Intelligence

Papers
(The median citation count of IEEE Transactions on Pattern Analysis and Machine Intelligence is 6. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-11-01 to 2024-11-01.)
Article | Citations
OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields | 2297
Deep High-Resolution Representation Learning for Visual Recognition | 2045
Res2Net: A New Multi-Scale Backbone Architecture | 1742
Image Segmentation Using Deep Learning: A Survey | 1304
A Survey on Vision Transformer | 1246
Deep Learning for 3D Point Clouds: A Survey | 1024
Self-Supervised Visual Feature Learning With Deep Neural Networks: A Survey | 965
Deep Learning for Person Re-Identification: A Survey and Outlook | 911
Deep Learning for Image Super-Resolution: A Survey | 908
Event-Based Vision: A Survey | 907
GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild | 838
U2Fusion: A Unified Unsupervised Image Fusion Network | 829
Cascade R-CNN: High Quality Object Detection and Instance Segmentation | 779
ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning | 682
Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer | 585
Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection | 528
Residual Dense Network for Image Restoration | 528
Meta-Learning in Neural Networks: A Survey | 528
Image Super-Resolution Via Iterative Refinement | 507
A continual learning survey: Defying forgetting in classification tasks | 504
Normalizing Flows: An Introduction and Review of Current Methods | 465
Plug-and-Play Image Restoration With Deep Denoiser Prior | 419
Recent Advances in Open Set Recognition: A Survey | 410
Hierarchical Deep Click Feature Prediction for Fine-Grained Image Recognition | 374
A Style-Based Generator Architecture for Generative Adversarial Networks | 370
Salient Object Detection in the Deep Learning Era: An In-Depth Survey | 323
Diffusion Models in Vision: A Survey | 317
Detection and Tracking Meet Drones Challenge | 310
A Review of Domain Adaptation without Target Labels | 310
Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks | 304
Deep Multi-View Enhancement Hashing for Image Retrieval | 303
Imbalance Problems in Object Detection: A Review | 296
Multi-Task Learning for Dense Prediction Tasks: A Survey | 294
Contextual Transformer Networks for Visual Recognition | 292
Prior Guided Feature Enrichment Network for Few-Shot Segmentation | 275
Domain Generalization: A Survey | 272
Convolutional Networks with Dense Connectivity | 263
Concealed Object Detection | 255
NWPU-Crowd: A Large-Scale Benchmark for Crowd Counting and Localization | 252
Deep Audio-Visual Speech Recognition | 252
High Speed and High Dynamic Range Video with an Event Camera | 250
Dynamic Neural Networks: A Survey | 249
YOLACT++ Better Real-Time Instance Segmentation | 235
Image-Based 3D Object Reconstruction: State-of-the-Art and Trends in the Deep Learning Era | 233
Semi-Supervised Semantic Segmentation With High- and Low-Level Consistency | 231
Low-Light Image and Video Enhancement Using Deep Learning: A Survey | 225
InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs | 220
ResMLP: Feedforward Networks for Image Classification With Data-Efficient Training | 216
Deep Imbalanced Learning for Face Recognition and Attribute Prediction | 215
Beyond Self-Attention: External Attention Using Two Linear Layers for Visual Tasks | 214
Learning to Enhance Low-Light Image via Zero-Reference Deep Curve Estimation | 206
Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges | 204
Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models | 202
FakeCatcher: Detection of Synthetic Portrait Videos using Biological Signals | 201
MEMC-Net: Motion Estimation and Motion Compensation Driven Neural Network for Video Interpolation and Enhancement | 198
Revisiting Video Saliency Prediction in the Deep Learning Era | 191
CCNet: Criss-Cross Attention for Semantic Segmentation | 191
Human Action Recognition From Various Data Modalities: A Review | 190
ArcFace: Additive Angular Margin Loss for Deep Face Recognition | 186
Maximum Density Divergence for Domain Adaptation | 182
PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning | 178
Multiview Clustering: A Scalable and Parameter-Free Bipartite Graph Fusion Method | 178
Constructing Stronger and Faster Baselines for Skeleton-Based Action Recognition | 177
A Survey on Curriculum Learning | 174
Class-Incremental Learning: Survey and Performance Evaluation on Image Classification | 174
AbdomenCT-1K: Is Abdominal Organ Segmentation a Solved Problem? | 167
KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D | 166
Real-Time Scene Text Detection With Differentiable Binarization and Adaptive Scale Fusion | 163
AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time | 161
GAN Inversion: A Survey | 158
SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing | 157
Fine-Grained Image Analysis With Deep Learning: A Survey | 153
Effects of Image Degradation and Degradation Removal to CNN-Based Image Classification | 151
Spatiotemporal Co-Attention Recurrent Neural Networks for Human-Skeleton Motion Prediction | 150
Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes | 149
Video Anomaly Detection with Sparse Coding Inspired Deep Neural Networks | 148
A Survey on Deep Learning Techniques for Stereo-Based Depth Estimation | 146
PaMIR: Parametric Model-Conditioned Implicit Representation for Image-Based Human Reconstruction | 145
Transfer Learning in Deep Reinforcement Learning: A Survey | 141
The Emerging Trends of Multi-Label Learning | 141
Self-Correction for Human Parsing | 141
MFQE 2.0: A New Approach for Multi-Frame Quality Enhancement on Compressed Video | 140
Single Image Deraining: From Model-Based to Data-Driven and Beyond | 139
Learning Enriched Features for Fast Image Restoration and Enhancement | 138
Dynamical Hyperparameter Optimization via Deep Reinforcement Learning in Tracking | 137
Graph U-Nets | 137
Coherence Constrained Graph LSTM for Group Activity Recognition | 134
Weakly Supervised Object Localization and Detection: A Survey | 134
Robust Multi-View Clustering With Incomplete Information | 132
Densely Residual Laplacian Super-Resolution | 131
Direction-Aware Spatial Context Features for Shadow Detection and Removal | 131
MHF-Net: An Interpretable Deep Network for Multispectral and Hyperspectral Image Fusion | 130
SpectralGPT: Spectral Remote Sensing Foundation Model | 127
Small Data Challenges in Big Data Era: A Survey of Recent Progress on Unsupervised and Semi-Supervised Methods | 127
Neural Image Compression for Gigapixel Histopathology Image Analysis | 127
Robust Low-Rank Tensor Recovery with Rectification and Alignment | 127
MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval | 126
NAS-FAS: Static-Dynamic Central Difference Network Search for Face Anti-Spoofing | 126
From Show to Tell: A Survey on Deep Learning-Based Image Captioning | 124
Multimodal Learning With Transformers: A Survey | 124
Negation of the Quantum Mass Function for Multisource Quantum Information Fusion With its Application to Pattern Classification | 123
A Comprehensive Survey of Scene Graphs: Generation and Application | 123
Deep Long-Tailed Learning: A Survey | 122
MS-TCN++: Multi-Stage Temporal Convolutional Network for Action Segmentation | 122
Graph Neural Networks with Convolutional ARMA Filters | 122
CTNet: Context-Based Tandem Network for Semantic Segmentation | 120
DeepFake Detection Based on Discrepancies Between Faces and Their Context | 119
Learning Generalisable Omni-Scale Representations for Person Re-Identification | 118
Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images | 117
RGB-D SLAM in Dynamic Environments Using Point Correlations | 114
Disentangling Light Fields for Super-Resolution and Disparity Estimation | 114
Deep Convolutional Neural Network for Multi-Modal Image Restoration and Fusion | 113
Person Re-Identification by Contour Sketch Under Moderate Clothing Change | 112
UniFormer: Unifying Convolution and Self-Attention for Visual Recognition | 111
Deep Residual Correction Network for Partial Domain Adaptation | 110
A Novel Approach to Large-Scale Dynamically Weighted Directed Network Representation | 109
ZeroNAS: Differentiable Generative Adversarial Networks Search for Zero-Shot Learning | 109
Hierarchical Long Short-Term Concurrent Memory for Human Interaction Recognition | 108
Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds | 106
Salient Object Detection via Integrity Learning | 106
A Review of Generalized Zero-Shot Learning Methods | 106
XSleepNet: Multi-View Sequential Model for Automatic Sleep Staging | 106
Symbiotic Graph Neural Networks for 3D Skeleton-Based Human Action Recognition and Motion Prediction | 105
Explainability in Graph Neural Networks: A Taxonomic Survey | 105
An End-to-End Learning Framework for Video Compression | 105
Learning to Match Anchors for Visual Object Detection | 104
Siamese Network for RGB-D Salient Object Detection and Beyond | 104
A Review on Deep Learning Techniques for Video Prediction | 103
Trusted Multi-View Classification With Dynamic Evidential Fusion | 103
Towards Large-Scale Small Object Detection: Survey and Benchmarks | 102
Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation | 100
Video-based Facial Micro-Expression Analysis: A Survey of Datasets, Features and Algorithms | 98
Infinite Feature Selection: A Graph-based Feature Filtering Approach | 98
Self-Supervised Learning of Graph Neural Networks: A Unified Review | 98
A Lightweight Optical Flow CNN —Revisiting Data Fidelity and Regularization | 97
SEWA DB: A Rich Database for Audio-Visual Emotion and Sentiment Research in the Wild | 97
Normalization Techniques in Training DNNs: Methodology, Analysis and Application | 97
Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition | 96
Deep ROC Analysis and AUC as Balanced Average Accuracy, for Improved Classifier Selection, Audit and Explanation | 95
Segmenting Objects From Relational Visual Data | 95
Enhanced Tensor RPCA and its Application | 95
Structured Knowledge Distillation for Dense Prediction | 93
Saliency Prediction in the Deep Learning Era: Successes and Limitations | 93
Tensor Low-Rank Representation for Data Recovery and Clustering | 93
Auto-Pytorch: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL | 92
P2T: Pyramid Pooling Transformer for Scene Understanding | 92
Physics-Based Generative Adversarial Models for Image Restoration and Beyond | 91
Semi-Supervised Multi-View Deep Discriminant Representation Learning | 91
Deep Gait Recognition: A Survey | 91
Dual Encoding for Video Retrieval by Text | 91
Distilled Siamese Networks for Visual Tracking | 91
Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer | 90
Paying Attention to Video Object Pattern Understanding | 89
Adversarial Reciprocal Points Learning for Open Set Recognition | 89
Graph Neural Networks in Network Neuroscience | 89
Long-Term Visual Localization Revisited | 89
TransFuser: Imitation With Transformer-Based Sensor Fusion for Autonomous Driving | 88
High-Dimensional Dense Residual Convolutional Neural Network for Light Field Reconstruction | 88
SensitiveNets: Learning Agnostic Representations with Application to Face Images | 88
VOLO: Vision Outlooker for Visual Recognition | 88
End-to-End Optimized Versatile Image Compression With Wavelet-Like Transform | 88
Multiset Feature Learning for Highly Imbalanced Data Classification | 86
Neural Architecture Transfer | 85
Weakly Supervised Object Detection Using Proposal- and Semantic-Level Relationships | 85
Where and How to Transfer: Knowledge Aggregation-Induced Transferability Perception for Unsupervised Domain Adaptation | 85
Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition | 85
Deep Hough Transform for Semantic Line Detection | 84
DeepMIH: Deep Invertible Network for Multiple Image Hiding | 84
Restoring Vision in Adverse Weather Conditions With Patch-Based Denoising Diffusion Models | 83
Dawn of the Transformer Era in Speech Emotion Recognition: Closing the Valence Gap | 83
Parallax Attention for Unsupervised Stereo Correspondence Learning | 82
Learning a Fixed-Length Fingerprint Representation | 82
Augmentation Invariant and Instance Spreading Feature for Softmax Embedding | 81
PoolNet+: Exploring the Potential of Pooling for Salient Object Detection | 81
Fast and Robust Iterative Closest Point | 81
Heterogeneous Graph Attention Network for Unsupervised Multiple-Target Domain Adaptation | 81
Kernel-Based Density Map Generation for Dense Object Counting | 79
Line Graph Neural Networks for Link Prediction | 79
Multiple Video Frame Interpolation via Enhanced Deformable Separable Convolution | 79
Context-Aware Visual Policy Network for Fine-Grained Image Captioning | 78
A Bayesian Formulation of Coherent Point Drift | 78
Divergence-Agnostic Unsupervised Domain Adaptation by Adversarial Attacks | 78
Nonlinear Regression via Deep Negative Correlation Learning | 77
Hyperbolic Deep Neural Networks: A Survey | 77
Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses | 76
Stereo Matching Using Multi-Level Cost Volume and Multi-Scale Feature Constancy | 76
Bayesian Temporal Factorization for Multidimensional Time Series Prediction | 76
The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines | 76
Higher-Order Explanations of Graph Neural Networks via Relevant Walks | 76
P-CNN: Part-Based Convolutional Neural Networks for Fine-Grained Visual Categorization | 75
Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting | 74
A Survey on Deep Learning Technique for Video Segmentation | 74
Learning to Compose and Reason with Language Tree Structures for Visual Grounding | 73
Inferring Point Cloud Quality via Graph Similarity | 73
Widar3.0: Zero-Effort Cross-Domain Gesture Recognition with Wi-Fi | 72
Bias in Cross-Entropy-Based Training of Deep Survival Networks | 72
Deep Clustering: On the Link Between Discriminative Models and K-Means | 72
Decentralized Federated Averaging | 72
Affective Image Content Analysis: Two Decades Review and New Perspectives | 72
Learning Part-based Convolutional Features for Person Re-Identification | 71
Re-thinking Co-Salient Object Detection | 71
MobileSal: Extremely Efficient RGB-D Salient Object Detection | 71
Map-Guided Curriculum Domain Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation | 71
Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling | 71
Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation | 71
A Topological Loss Function for Deep-Learning Based Image Segmentation Using Persistent Homology | 70
TransCenter: Transformers With Dense Representations for Multiple-Object Tracking | 70
On Learning Disentangled Representations for Gait Recognition | 69
Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning | 69
Deep Learning-based Multi-focus Image Fusion: A Survey and A Comparative Study | 68
Learning End-to-End Lossy Image Compression: A Benchmark | 68
HGNN+: General Hypergraph Neural Networks | 68
Visual Camera Re-Localization from RGB and RGB-D Images Using DSAC | 68
Real-World Image Denoising with Deep Boosting | 67
Towards Robust Discriminative Projections Learning via Non-Greedy ℓ2,1-Norm MinMax | 67
PVNet: Pixel-Wise Voting Network for 6DoF Object Pose Estimation | 67
From Handcrafted to Deep Features for Pedestrian Detection: A Survey | 67
Uncertainty Inspired RGB-D Saliency Detection | 67
Hypergraph Learning: Methods and Practices | 67
Unsupervised Grouped Axial Data Modeling via Hierarchical Bayesian Nonparametric Models With Watson Distributions | 66
Support Vector Machine Classifier via Soft-Margin Loss | 66
DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement | 65
GaitSet: Cross-view Gait Recognition through Utilizing Gait as a Deep Set | 65
Image De-Raining Transformer | 65
Affinity Attention Graph Neural Network for Weakly Supervised Semantic Segmentation | 65
Contrastive Learning with Stronger Augmentations | 65
BDCN: Bi-Directional Cascade Network for Perceptual Edge Detection | 64
Learning to Model Relationships for Zero-Shot Video Classification | 64
Base and Meta: A New Perspective on Few-Shot Segmentation | 64
Deep Back-Projection Networks for Single Image Super-Resolution | 63
AlignSeg: Feature-Aligned Segmentation Networks | 63
Attention-Based Dropout Layer for Weakly Supervised Single Object Localization and Semantic Segmentation | 63
A Background-Agnostic Framework with Adversarial Training for Abnormal Event Detection in Video | 63
BlockQNN: Efficient Block-Wise Neural Network Architecture Generation | 62
Contrastive Adaptation Network for Single- and Multi-Source Domain Adaptation | 62
HiGCIN: Hierarchical Graph-Based Cross Inference Network for Group Activity Recognition | 62
Collaborative Video Object Segmentation by Multi-Scale Foreground-Background Integration | 62
SePiCo: Semantic-Guided Pixel Contrast for Domain Adaptive Semantic Segmentation | 61
Towards a Complete 3D Morphable Model of the Human Head | 61
CoRRN: Cooperative Reflection Removal Network | 61
Semantic Object Accuracy for Generative Text-to-Image Synthesis | 61
Self-Distillation: Towards Efficient and Compact Neural Networks | 61
What and How: Generalized Lifelong Spectral Clustering via Dual Memory | 61
NATS-Bench: Benchmarking NAS Algorithms for Architecture Topology and Size | 60
PMP-Net++: Point Cloud Completion by Transformer-Enhanced Multi-Step Point Moving Paths | 60
Cascaded Parsing of Human-Object Interaction Recognition | 60
Towards A Weakly Supervised Framework for 3D Point Cloud Object Detection and Annotation | 60
A Fully Automated Method for 3D Individual Tooth Identification and Segmentation in Dental CBCT | 60
MRA-Net: Improving VQA Via Multi-Modal Relation Attention Network | 59
End-to-End Handwritten Paragraph Text Recognition Using a Vertical Attention Network | 59