International Journal of Computer Vision

Papers
(The TQCC of the International Journal of Computer Vision is 12. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-04-01 to 2024-04-01.)
Article | Citations
Knowledge Distillation: A Survey | 979
FairMOT: On the Fairness of Detection and Re-identification in Multiple Object Tracking | 595
BiSeNet V2: Bilateral Network with Guided Aggregation for Real-Time Semantic Segmentation | 565
Image Matching from Handcrafted to Deep Features: A Survey | 495
HOTA: A Higher Order Metric for Evaluating Multi-object Tracking | 326
Learning to Prompt for Vision-Language Models | 324
Beyond Brightening Low-light Images | 265
Rectifying Pseudo Label Learning via Uncertainty Estimation for Domain Adaptive Semantic Segmentation | 254
Scene Text Detection and Recognition: The Deep Learning Era | 216
Image Matching Across Wide Baselines: From Paper to Practice | 174
SDNet: A Versatile Squeeze-and-Decomposition Network for Real-Time Image Fusion | 173
Human Action Recognition and Prediction: A Survey | 171
The MVTec Anomaly Detection Dataset: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection | 154
Weakly-supervised Semantic Guided Hashing for Social Image Retrieval | 141
MOTChallenge: A Benchmark for Single-Camera Multiple Target Tracking | 137
Attention Guided Low-Light Image Enhancement with a Large Scale Low-Light Simulation Dataset | 131
OCNet: Object Context for Semantic Segmentation | 123
EfficientPS: Efficient Panoptic Segmentation | 119
You Only Look Yourself: Unsupervised and Untrained Single Image Dehazing Neural Network | 109
Benchmarking Low-Light Image Enhancement and Beyond | 107
Deep Image Deblurring: A Survey | 99
Rescaling Egocentric Vision: Collection, Pipeline and Challenges for EPIC-KITCHENS-100 | 98
Unsupervised Scale-Consistent Depth Learning from Video | 91
PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection | 87
Comparison of Full-Reference Image Quality Models for Optimization of Image Processing Systems | 85
Semantic Hierarchy Emerges in Deep Generative Representations for Scene Synthesis | 84
Pix2Vox++: Multi-scale Context-aware 3D Object Reconstruction from Single and Multiple Images | 83
Pixel-Wise Crowd Understanding via Synthetic Data | 75
LaSOT: A High-quality Large-scale Single Object Tracking Benchmark | 75
On the Arbitrary-Oriented Object Detection: Classification Based Approaches Revisited | 74
Unified Quality Assessment of in-the-Wild Videos with Mixed Datasets Training | 73
Unsupervised Deep Representation Learning for Real-Time Tracking | 71
JÂA-Net: Joint Facial Action Unit Detection and Face Alignment Via Adaptive Attention | 69
Curriculum Learning: A Survey | 65
A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains | 62
Reference Pose Generation for Long-term Visual Localization via Learned Features and View Synthesis | 62
Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild | 59
GhostNets on Heterogeneous Devices via Cheap Operations | 54
CLIP-Adapter: Better Vision-Language Models with Feature Adapters | 54
Deformable Kernel Networks for Joint Image Filtering | 53
Explainability of Deep Vision-Based Autonomous Driving Systems: Review and Challenges | 53
Structure-Measure: A New Way to Evaluate Foreground Maps | 52
Adaptive Channel Selection for Robust Visual Object Tracking with Discriminative Correlation Filters | 52
VPR-Bench: An Open-Source Visual Place Recognition Evaluation Framework with Quantifiable Viewpoint and Appearance Change | 51
Learning Adaptive Attribute-Driven Representation for Real-Time RGB-T Tracking | 50
ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond | 50
AutoScale: Learning to Scale for Crowd Counting | 50
Rain Rendering for Evaluating and Improving Robustness to Bad Weather | 48
3D-FUTURE: 3D Furniture Shape with TextURE | 47
An Exploration of Embodied Visual Exploration | 45
Occluded Video Instance Segmentation: A Benchmark | 44
Synthetic Humans for Action Recognition from Unseen Viewpoints | 43
Towards High Performance Human Keypoint Detection | 43
Compositional Convolutional Neural Networks: A Robust and Interpretable Model for Object Recognition Under Occlusion | 41
Vis-MVSNet: Visibility-Aware Multi-view Stereo Network | 41
The Fishyscapes Benchmark: Measuring Blind Spots in Semantic Segmentation | 41
A Comprehensive Benchmark Analysis of Single Image Deraining: Current Challenges and Future Perspectives | 41
Scale-Aware Domain Adaptive Faster R-CNN | 40
A Survey on Long-Tailed Visual Recognition | 40
AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild | 39
Train Sparsely, Generate Densely: Memory-Efficient Unsupervised Training of High-Resolution Temporal GAN | 39
Countering Malicious DeepFakes: Survey, Battleground, and Horizon | 39
Learning Adaptive Classifiers Synthesis for Generalized Few-Shot Learning | 37
Bridging Composite and Real: Towards End-to-End Deep Image Matting | 35
NormAttention-PSN: A High-frequency Region Enhanced Photometric Stereo Network with Normalized Attention | 35
Deep Nets: What have They Ever Done for Vision? | 35
Multi-level Motion Attention for Human Motion Prediction | 34
SensatUrban: Learning Semantics from Urban-Scale Photogrammetric Point Clouds | 33
Zero-Shot Object Detection: Joint Recognition and Localization of Novel Concepts | 33
Context Autoencoder for Self-supervised Representation Learning | 33
Twin Contrastive Learning for Online Clustering | 33
Progressive DARTS: Bridging the Optimization Gap for NAS in the Wild | 32
Learning to Reconstruct HDR Images from Events, with Applications to Depth and Flow Prediction | 32
Quo Vadis, Skeleton Action Recognition? | 32
Semantic Edge Detection with Diverse Deep Supervision | 32
Viewpoint and Scale Consistency Reinforcement for UAV Vehicle Re-Identification | 31
Successive Graph Convolutional Network for Image De-raining | 31
3DFaceGAN: Adversarial Nets for 3D Face Representation, Generation, and Translation | 30
Mimetics: Towards Understanding Human Actions Out of Context | 29
Compositional GAN: Learning Image-Conditional Binary Composition | 28
Benchmarking the Robustness of Semantic Segmentation Models with Respect to Common Corruptions | 28
Manhattan Room Layout Reconstruction from a Single 360° Image: A Comparative Study of State-of-the-Art Methods | 28
Polysemy Deciphering Network for Robust Human–Object Interaction Detection | 27
Recursive Context Routing for Object Detection | 27
MADAN: Multi-source Adversarial Domain Aggregation Network for Domain Adaptation | 27
Low-light Image Enhancement via Breaking Down the Darkness | 27
Continuous 3D Multi-Channel Sign Language Production via Progressive Transformers and Mixture Density Networks | 27
Hierarchical Domain-Adapted Feature Learning for Video Saliency Prediction | 27
Beyond Dents and Scratches: Logical Constraints in Unsupervised Anomaly Detection and Localization | 27
LAMP-HQ: A Large-Scale Multi-pose High-Quality Database and Benchmark for NIR-VIS Face Recognition | 26
SportsCap: Monocular 3D Human Motion Capture and Fine-Grained Understanding in Challenging Sports Videos | 26
Mitigating Demographic Bias in Facial Datasets with Style-Based Multi-attribute Transfer | 26
A Coarse-to-Fine Framework for Resource Efficient Video Recognition | 24
Separating Content from Style Using Adversarial Learning for Recognizing Text in the Wild | 24
Beyond Monocular Deraining: Parallel Stereo Deraining Network Via Semantic Prior | 23
Hadamard Matrix Guided Online Hashing | 23
Semantics-to-Signal Scalable Image Compression with Learned Revertible Representations | 23
Selective Wavelet Attention Learning for Single Image Deraining | 23
Multi-task Compositional Network for Visual Relationship Detection | 22
Parallel Single-Pixel Imaging: A General Method for Direct–Global Separation and 3D Shape Reconstruction Under Strong Global Illumination | 22
Learning JPEG Compression Artifacts for Image Manipulation Detection and Localization | 22
Fine-Grained Instance-Level Sketch-Based Image Retrieval | 22
Dual Convolutional Neural Networks for Low-Level Vision | 22
Unsupervised Domain Adaptation with Background Shift Mitigating for Person Re-Identification | 22
Feature Matching via Motion-Consistency Driven Probabilistic Graphical Model | 21
Talk2Nav: Long-Range Vision-and-Language Navigation with Dual Attention and Spatial Memory | 21
RePCD-Net: Feature-Aware Recurrent Point Cloud Denoising Network | 21
Memory-Augmented Deep Unfolding Network for Guided Image Super-resolution | 20
3D Semantic Scene Completion: A Survey | 20
SODA: Weakly Supervised Temporal Action Localization Based on Astute Background Response and Self-Distillation Learning | 20
Context-Enhanced Representation Learning for Single Image Deraining | 19
Learning Deep Patch representation for Probabilistic Graphical Model-Based Face Sketch Synthesis | 19
Zero-Shot Learning on 3D Point Cloud Objects and Beyond | 19
SRT3D: A Sparse Region-Based 3D Object Tracking Approach for the Real World | 19
Underwater Camera: Improving Visual Perception Via Adaptive Dark Pixel Prior and Color Correction | 19
Spatial–Temporal Relation Reasoning for Action Prediction in Videos | 19
Adaptive Deep Disturbance-Disentangled Learning for Facial Expression Recognition | 18
Joint Classification and Regression for Visual Tracking with Fully Convolutional Siamese Networks | 18
3D Object Detection for Autonomous Driving: A Comprehensive Survey | 18
Intra-Camera Supervised Person Re-Identification | 18
Dual-Constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior | 18
A Shape Transformation-based Dataset Augmentation Framework for Pedestrian Detection | 18
Dual-Attention-Guided Network for Ghost-Free High Dynamic Range Imaging | 18
REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets | 17
Towards Balanced Learning for Instance Recognition | 17
Enhanced 3D Human Pose Estimation from Videos by Using Attention-Based Neural Network with Dilated Convolutions | 17
Learning Self-supervised Low-Rank Network for Single-Stage Weakly and Semi-supervised Semantic Segmentation | 17
Cascaded Split-and-Aggregate Learning with Feature Recombination for Pedestrian Attribute Recognition | 17
Incorporating Side Information by Adaptive Convolution | 17
A General Framework for Deep Supervised Discrete Hashing | 17
Delving Deeper into Anti-Aliasing in ConvNets | 17
On Measuring and Controlling the Spectral Bias of the Deep Image Prior | 16
Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision | 16
Weakly-Supervised Semantic Segmentation with Visual Words Learning and Hybrid Pooling | 16
Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene Text Detection | 16
PhysFormer++: Facial Video-Based Physiological Measurement with SlowFast Temporal Difference Transformer | 16
Learning to Detect Instance-Level Salient Objects Using Complementary Image Labels | 16
Class-Difficulty Based Methods for Long-Tailed Visual Recognition | 15
SliderGAN: Synthesizing Expressive Face Images by Sliding 3D Blendshape Parameters | 15
OASIS: Only Adversarial Supervision for Semantic Image Synthesis | 15
Learning Regression and Verification Networks for Robust Long-term Tracking | 15
Vote-Based 3D Object Detection with Context Modeling and SOB-3DNMS | 15
A Survey on Intrinsic Images: Delving Deep into Lambert and Beyond | 15
Evaluation Metrics for Conditional Image Generation | 15
One-Shot Object Affordance Detection in the Wild | 15
Residual Dual Scale Scene Text Spotting by Fusing Bottom-Up and Top-Down Processing | 15
Label-Free Robustness Estimation of Object Detection CNNs for Autonomous Driving Applications | 14
Semantically Tied Paired Cycle Consistency for Any-Shot Sketch-Based Image Retrieval | 14
Beyond Covariance: SICE and Kernel Based Visual Feature Representation | 14
DeMoCap: Low-Cost Marker-Based Motion Capture | 14
Revisiting Consistency Regularization for Semi-Supervised Learning | 14
ShadingNet: Image Intrinsics by Fine-Grained Shading Decomposition | 14
Pre-Training Without Natural Images | 14
RIConv++: Effective Rotation Invariant Convolutions for 3D Point Clouds Deep Learning | 14
Learning the Clustering of Longitudinal Shape Data Sets into a Mixture of Independent or Branching Trajectories | 14
Face Image Reflection Removal | 14
Guided Attention in CNNs for Occluded Pedestrian Detection and Re-identification | 14
I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-Shaped Scene Text Detection | 13
A Benchmark and Evaluation of Non-Rigid Structure from Motion | 13
Object Priors for Classifying and Localizing Unseen Actions | 13
RoCGAN: Robust Conditional GAN | 13
CDTD: A Large-Scale Cross-Domain Benchmark for Instance-Level Image-to-Image Translation and Domain Adaptive Object Detection | 13
EAN: Event Adaptive Network for Enhanced Action Recognition | 13
Sparse Black-Box Video Attack with Reinforcement Learning | 13
Unsupervised Domain Adaptation in the Wild via Disentangling Representation Learning | 13
Delving into Inter-Image Invariance for Unsupervised Visual Representations | 13
Going Deeper than Tracking: A Survey of Computer-Vision Based Recognition of Animal Pain and Emotions | 12
AutoDet: Pyramid Network Architecture Search for Object Detection | 12
Pyramid Attention Network for Image Restoration | 12
Saliency Detection Inspired by Topological Perception Theory | 12
Attribute Prototype Network for Any-Shot Learning | 12
Visual Object Tracking in First Person Vision | 12
A Numerical Framework for Elastic Surface Matching, Comparison, and Interpolation | 12
Shape My Face: Registering 3D Face Scans by Surface-to-Surface Translation | 12
H-SegMed: A Hybrid Method for Prostate Segmentation in TRUS Images via Improved Closed Principal Curve and Improved Enhanced Machine Learning | 12
Artificial Intelligence for Dunhuang Cultural Heritage Protection: The Project and the Dataset | 12