IEEE Transactions on Pattern Analysis and Machine Intelligence

(The median citation count of IEEE Transactions on Pattern Analysis and Machine Intelligence is 5. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-09-01 to 2024-09-01.)
Article | Citations
OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields | 2220
Deep High-Resolution Representation Learning for Visual Recognition | 1931
Res2Net: A New Multi-Scale Backbone Architecture | 1674
Image Segmentation Using Deep Learning: A Survey | 1249
A Survey on Vision Transformer | 1130
Deep Learning for 3D Point Clouds: A Survey | 985
Self-Supervised Visual Feature Learning With Deep Neural Networks: A Survey | 924
Deep Learning for Image Super-Resolution: A Survey | 866
Deep Learning for Person Re-Identification: A Survey and Outlook | 853
Event-Based Vision: A Survey | 834
NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding | 805
GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild | 799
U2Fusion: A Unified Unsupervised Image Fusion Network | 769
Cascade R-CNN: High Quality Object Detection and Instance Segmentation | 742
ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning | 634
Residual Dense Network for Image Restoration | 506
Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection | 502
Meta-Learning in Neural Networks: A Survey | 501
Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer | 486
A continual learning survey: Defying forgetting in classification tasks | 455
Normalizing Flows: An Introduction and Review of Current Methods | 445
Image Super-Resolution Via Iterative Refinement | 397
Recent Advances in Open Set Recognition: A Survey | 394
Plug-and-Play Image Restoration With Deep Denoiser Prior | 382
Hierarchical Deep Click Feature Prediction for Fine-Grained Image Recognition | 373
A Style-Based Generator Architecture for Generative Adversarial Networks | 344
Salient Object Detection in the Deep Learning Era: An In-Depth Survey | 316
A Review of Domain Adaptation without Target Labels | 303
Deep Multi-View Enhancement Hashing for Image Retrieval | 298
Detection and Tracking Meet Drones Challenge | 294
Imbalance Problems in Object Detection: A Review | 293
The ApolloScape Open Dataset for Autonomous Driving and Its Application | 292
Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks | 287
Multi-Task Learning for Dense Prediction Tasks: A Survey | 275
Contextual Transformer Networks for Visual Recognition | 269
Diffusion Models in Vision: A Survey | 259
Convolutional Networks with Dense Connectivity | 255
Prior Guided Feature Enrichment Network for Few-Shot Segmentation | 247
NWPU-Crowd: A Large-Scale Benchmark for Crowd Counting and Localization | 244
Dynamic Neural Networks: A Survey | 240
Deep Audio-Visual Speech Recognition | 234
Domain Generalization: A Survey | 231
High Speed and High Dynamic Range Video with an Event Camera | 230
Image-Based 3D Object Reconstruction: State-of-the-Art and Trends in the Deep Learning Era | 230
YOLACT++ Better Real-Time Instance Segmentation | 227
Concealed Object Detection | 226
Semi-Supervised Semantic Segmentation With High- and Low-Level Consistency | 215
Deep Imbalanced Learning for Face Recognition and Attribute Prediction | 213
Low-Light Image and Video Enhancement Using Deep Learning: A Survey | 209
InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs | 209
Beyond Self-Attention: External Attention Using Two Linear Layers for Visual Tasks | 200
ResMLP: Feedforward Networks for Image Classification With Data-Efficient Training | 197
FakeCatcher: Detection of Synthetic Portrait Videos using Biological Signals | 193
Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges | 192
MEMC-Net: Motion Estimation and Motion Compensation Driven Neural Network for Video Interpolation and Enhancement | 191
CCNet: Criss-Cross Attention for Semantic Segmentation | 187
Learning to Enhance Low-Light Image via Zero-Reference Deep Curve Estimation | 187
Revisiting Video Saliency Prediction in the Deep Learning Era | 186
Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models | 186
Weakly Supervised Learning with Multi-Stream CNN-LSTM-HMMs to Discover Sequential Parallelism in Sign Language Videos | 179
Maximum Density Divergence for Domain Adaptation | 178
Human Action Recognition From Various Data Modalities: A Review | 176
Multiview Clustering: A Scalable and Parameter-Free Bipartite Graph Fusion Method | 173
ArcFace: Additive Angular Margin Loss for Deep Face Recognition | 170
Constructing Stronger and Faster Baselines for Skeleton-Based Action Recognition | 167
PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning | 163
Learning Depth with Convolutional Spatial Propagation Network | 159
A Survey on Curriculum Learning | 153
Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding | 152
SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing | 151
A Comprehensive Analysis of Deep Regression | 151
AbdomenCT-1K: Is Abdominal Organ Segmentation a Solved Problem? | 148
Video Anomaly Detection with Sparse Coding Inspired Deep Neural Networks | 146
Spatiotemporal Co-Attention Recurrent Neural Networks for Human-Skeleton Motion Prediction | 146
Effects of Image Degradation and Degradation Removal to CNN-Based Image Classification | 145
GAN Inversion: A Survey | 144
Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes | 144
Class-Incremental Learning: Survey and Performance Evaluation on Image Classification | 142
Real-Time Scene Text Detection With Differentiable Binarization and Adaptive Scale Fusion | 141
Confidence Propagation through CNNs for Guided Sparse Depth Regression | 139
AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time | 139
KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D | 139
A Survey on Deep Learning Techniques for Stereo-Based Depth Estimation | 138
Fine-Grained Image Analysis With Deep Learning: A Survey | 138
Single Image Deraining: From Model-Based to Data-Driven and Beyond | 137
Dynamical Hyperparameter Optimization via Deep Reinforcement Learning in Tracking | 136
MFQE 2.0: A New Approach for Multi-Frame Quality Enhancement on Compressed Video | 136
Graph U-Nets | 132
Coherence Constrained Graph LSTM for Group Activity Recognition | 131
The Emerging Trends of Multi-Label Learning | 130
Self-Correction for Human Parsing | 128
Robust Low-Rank Tensor Recovery with Rectification and Alignment | 127
Densely Residual Laplacian Super-Resolution | 126
Weakly Supervised Object Localization and Detection: A Survey | 125
MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval | 125
Direction-Aware Spatial Context Features for Shadow Detection and Removal | 125
Neural Image Compression for Gigapixel Histopathology Image Analysis | 125
Robust Multi-View Clustering With Incomplete Information | 121
Negation of the Quantum Mass Function for Multisource Quantum Information Fusion With its Application to Pattern Classification | 121
NAS-FAS: Static-Dynamic Central Difference Network Search for Face Anti-Spoofing | 121
MHF-Net: An Interpretable Deep Network for Multispectral and Hyperspectral Image Fusion | 120
Learning Enriched Features for Fast Image Restoration and Enhancement | 120
PaMIR: Parametric Model-Conditioned Implicit Representation for Image-Based Human Reconstruction | 120
Transfer Learning in Deep Reinforcement Learning: A Survey | 118
A Comprehensive Survey of Scene Graphs: Generation and Application | 117
Small Data Challenges in Big Data Era: A Survey of Recent Progress on Unsupervised and Semi-Supervised Methods | 117
Graph Neural Networks with Convolutional ARMA Filters | 117
Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images | 116
Learning Generalisable Omni-Scale Representations for Person Re-Identification | 115
From Show to Tell: A Survey on Deep Learning-Based Image Captioning | 115
MS-TCN++: Multi-Stage Temporal Convolutional Network for Action Segmentation | 115
DeepFake Detection Based on Discrepancies Between Faces and Their Context | 114
CTNet: Context-Based Tandem Network for Semantic Segmentation | 111
RGB-D SLAM in Dynamic Environments Using Point Correlations | 111
Disentangling Light Fields for Super-Resolution and Disparity Estimation | 110
ZeroNAS: Differentiable Generative Adversarial Networks Search for Zero-Shot Learning | 109
A Novel Approach to Large-Scale Dynamically Weighted Directed Network Representation | 109
Hierarchical Long Short-Term Concurrent Memory for Human Interaction Recognition | 108
Person Re-Identification by Contour Sketch Under Moderate Clothing Change | 107
Deep Residual Correction Network for Partial Domain Adaptation | 107
Deep Convolutional Neural Network for Multi-Modal Image Restoration and Fusion | 106
Learning to Match Anchors for Visual Object Detection | 104
Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds | 103
Symbiotic Graph Neural Networks for 3D Skeleton-Based Human Action Recognition and Motion Prediction | 103
Siamese Network for RGB-D Salient Object Detection and Beyond | 102
An End-to-End Learning Framework for Video Compression | 102
XSleepNet: Multi-View Sequential Model for Automatic Sleep Staging | 102
UniFormer: Unifying Convolution and Self-Attention for Visual Recognition | 101
Salient Object Detection via Integrity Learning | 101
Deep Long-Tailed Learning: A Survey | 100
Explainability in Graph Neural Networks: A Taxonomic Survey | 100
Learning Representations for Neural Network-Based Classification Using the Information Bottleneck Principle | 98
SpectralGPT: Spectral Remote Sensing Foundation Model | 98
Multimodal Learning With Transformers: A Survey | 98
A Review on Deep Learning Techniques for Video Prediction | 98
Trusted Multi-View Classification With Dynamic Evidential Fusion | 97
A Review of Generalized Zero-Shot Learning Methods | 97
Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation | 97
Video-based Facial Micro-Expression Analysis: A Survey of Datasets, Features and Algorithms | 95
SEWA DB: A Rich Database for Audio-Visual Emotion and Sentiment Research in the Wild | 93
Segmenting Objects From Relational Visual Data | 93
Enhanced Tensor RPCA and its Application | 93
Self-Supervised Learning of Graph Neural Networks: A Unified Review | 92
Infinite Feature Selection: A Graph-based Feature Filtering Approach | 92
Structured Knowledge Distillation for Dense Prediction | 92
Tensor Low-Rank Representation for Data Recovery and Clustering | 92
Dual Encoding for Video Retrieval by Text | 91
Saliency Prediction in the Deep Learning Era: Successes and Limitations | 91
Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition | 91
Deep ROC Analysis and AUC as Balanced Average Accuracy, for Improved Classifier Selection, Audit and Explanation | 91
Semi-Supervised Multi-View Deep Discriminant Representation Learning | 90
Auto-Pytorch: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL | 90
Paying Attention to Video Object Pattern Understanding | 89
A Lightweight Optical Flow CNN - Revisiting Data Fidelity and Regularization | 88
Distilled Siamese Networks for Visual Tracking | 88
Physics-Based Generative Adversarial Models for Image Restoration and Beyond | 87
P2T: Pyramid Pooling Transformer for Scene Understanding | 87
High-Dimensional Dense Residual Convolutional Neural Network for Light Field Reconstruction | 87
Deep Gait Recognition: A Survey | 86
Multiset Feature Learning for Highly Imbalanced Data Classification | 86
Towards Large-Scale Small Object Detection: Survey and Benchmarks | 85
SensitiveNets: Learning Agnostic Representations with Application to Face Images | 85
End-to-End Optimized Versatile Image Compression With Wavelet-Like Transform | 85
Graph Neural Networks in Network Neuroscience | 85
Normalization Techniques in Training DNNs: Methodology, Analysis and Application | 83
Where and How to Transfer: Knowledge Aggregation-Induced Transferability Perception for Unsupervised Domain Adaptation | 83
Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition | 83
Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer | 82
Deep Hough Transform for Semantic Line Detection | 82
Long-Term Visual Localization Revisited | 82
Weakly Supervised Object Detection Using Proposal- and Semantic-Level Relationships | 82
Augmentation Invariant and Instance Spreading Feature for Softmax Embedding | 81
VOLO: Vision Outlooker for Visual Recognition | 81
Neural Architecture Transfer | 81
Adversarial Reciprocal Points Learning for Open Set Recognition | 81
Learning a Fixed-Length Fingerprint Representation | 79
DeepMIH: Deep Invertible Network for Multiple Image Hiding | 79
Heterogeneous Graph Attention Network for Unsupervised Multiple-Target Domain Adaptation | 78
Multiple Video Frame Interpolation via Enhanced Deformable Separable Convolution | 78
TransFuser: Imitation With Transformer-Based Sensor Fusion for Autonomous Driving | 78
Fast and Robust Iterative Closest Point | 77
Parallax Attention for Unsupervised Stereo Correspondence Learning | 77
Context-Aware Visual Policy Network for Fine-Grained Image Captioning | 77
A Bayesian Formulation of Coherent Point Drift | 77
Line Graph Neural Networks for Link Prediction | 76
Nonlinear Regression via Deep Negative Correlation Learning | 76
PoolNet+: Exploring the Potential of Pooling for Salient Object Detection | 76
Dawn of the Transformer Era in Speech Emotion Recognition: Closing the Valence Gap | 76
P-CNN: Part-Based Convolutional Neural Networks for Fine-Grained Visual Categorization | 74
Stereo Matching Using Multi-Level Cost Volume and Multi-Scale Feature Constancy | 73
Divergence-Agnostic Unsupervised Domain Adaptation by Adversarial Attacks | 73
Kernel-Based Density Map Generation for Dense Object Counting | 73
Hyperbolic Deep Neural Networks: A Survey | 73
Bayesian Temporal Factorization for Multidimensional Time Series Prediction | 72
The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines | 72
Multi-Source Causal Feature Selection | 72
Higher-Order Explanations of Graph Neural Networks via Relevant Walks | 71
Learning Part-based Convolutional Features for Person Re-Identification | 71
Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses | 70
Re-thinking Co-Salient Object Detection | 70
Deep Clustering: On the Link Between Discriminative Models and K-Means | 70
Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation | 70
MobileSal: Extremely Efficient RGB-D Salient Object Detection | 69
Inferring Point Cloud Quality via Graph Similarity | 69
Widar3.0: Zero-Effort Cross-Domain Gesture Recognition with Wi-Fi | 68
A Survey on Deep Learning Technique for Video Segmentation | 67
Decentralized Federated Averaging | 67
Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting | 66
Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling | 66
Learning to Compose and Reason with Language Tree Structures for Visual Grounding | 66
Bias in Cross-Entropy-Based Training of Deep Survival Networks | 66
Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning | 66
Uncertainty Inspired RGB-D Saliency Detection | 66
Map-Guided Curriculum Domain Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation | 66
Towards Robust Discriminative Projections Learning via Non-Greedy ℓ2,1-Norm MinMax | 65
PVNet: Pixel-Wise Voting Network for 6DoF Object Pose Estimation | 65
On Learning Disentangled Representations for Gait Recognition | 65
Learning End-to-End Lossy Image Compression: A Benchmark | 65
Deep Learning-based Multi-focus Image Fusion: A Survey and A Comparative Study | 65
A Topological Loss Function for Deep-Learning Based Image Segmentation Using Persistent Homology | 65
Unsupervised Grouped Axial Data Modeling via Hierarchical Bayesian Nonparametric Models With Watson Distributions | 65
Restoring Vision in Adverse Weather Conditions With Patch-Based Denoising Diffusion Models | 64
Support Vector Machine Classifier via Soft-Margin Loss | 64
Hypergraph Learning: Methods and Practices | 63
From Handcrafted to Deep Features for Pedestrian Detection: A Survey | 63
BDCN: Bi-Directional Cascade Network for Perceptual Edge Detection | 63
TransCenter: Transformers With Dense Representations for Multiple-Object Tracking | 63
Contrastive Learning with Stronger Augmentations | 63
Visual Camera Re-Localization from RGB and RGB-D Images Using DSAC | 62
AlignSeg: Feature-Aligned Segmentation Networks | 62
GaitSet: Cross-view Gait Recognition through Utilizing Gait as a Deep Set | 62
Attention-Based Dropout Layer for Weakly Supervised Single Object Localization and Semantic Segmentation | 61
Real-World Image Denoising with Deep Boosting | 61
HiGCIN: Hierarchical Graph-Based Cross Inference Network for Group Activity Recognition | 61
Learning to Model Relationships for Zero-Shot Video Classification | 61
BlockQNN: Efficient Block-Wise Neural Network Architecture Generation | 61
Affinity Attention Graph Neural Network for Weakly Supervised Semantic Segmentation | 61
The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers | 61
A Fully Automated Method for 3D Individual Tooth Identification and Segmentation in Dental CBCT | 61
Affective Image Content Analysis: Two Decades Review and New Perspectives | 61
HGNN+: General Hypergraph Neural Networks | 60
Cascaded Parsing of Human-Object Interaction Recognition | 60
Deep Back-Projection Networks for Single Image Super-Resolution | 60
Collaborative Video Object Segmentation by Multi-Scale Foreground-Background Integration | 59
MRA-Net: Improving VQA Via Multi-Modal Relation Attention Network | 59
DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement | 59
Towards a Complete 3D Morphable Model of the Human Head | 59
Semantic Object Accuracy for Generative Text-to-Image Synthesis | 58
What and How: Generalized Lifelong Spectral Clustering via Dual Memory | 58
Self-Distillation: Towards Efficient and Compact Neural Networks | 58
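
The selection rule stated in the note at the top of this list (publication window, median threshold, cap of 250 papers) can be sketched in Python as follows. This is a minimal illustration, not the site's actual procedure: the `Paper` record, the `select_top_cited` function, and the reading of "above that threshold" as strictly greater than the median are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass(frozen=True)
class Paper:
    """Hypothetical record; field names are assumptions, not a real schema."""
    title: str
    citations: int
    published: date


def select_top_cited(papers, start, end, max_papers=250):
    """Return papers published in [start, end] that are cited more than the
    median of that window, most cited first, capped at max_papers."""
    in_window = [p for p in papers if start <= p.published <= end]
    if not in_window:
        return []
    # Assumption: the median is taken over the same publication window.
    threshold = median(p.citations for p in in_window)
    # Assumption: "above that threshold" means strictly greater than it.
    above = [p for p in in_window if p.citations > threshold]
    above.sort(key=lambda p: p.citations, reverse=True)
    return above[:max_papers]
```

With 250 rows listed and a lowest count of 58, far above the stated median of 5, the cap rather than the median threshold appears to be what limits this particular list.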