Computer Vision and Image Understanding

Papers
(The TQCC of Computer Vision and Image Understanding is 5. The table below lists the papers whose CrossRef citation counts are at or above that threshold [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-03-01 to 2024-03-01.)
Article | Citations
UA-DETRAC: A new benchmark and protocol for multi-object detection and tracking | 258
Monocular human pose estimation: A survey of deep learning-based methods | 244
Skeleton-based action recognition via spatial and temporal transformer networks | 133
Deep 3D human pose estimation: A review | 123
Video anomaly detection and localization via Gaussian Mixture Fully Convolutional Variational Autoencoder | 120
Pyramid Channel-based Feature Attention Network for image dehazing | 108
Deep learning for deepfakes creation and detection: A survey | 80
Pros and cons of GAN evaluation measures: New developments | 77
A review of 3D human pose estimation algorithms for markerless motion capture | 74
Fake face detection via adaptive manipulation traces extraction network | 64
TCLR: Temporal contrastive learning for video representation | 57
Infrared and visible image fusion via gradientlet filter | 46
Single-image deblurring with neural networks: A comparative survey | 46
Adversarial autoencoders for compact representations of 3D point clouds | 43
High-level prior-based loss functions for medical image segmentation: A survey | 43
A comprehensive review of past and present image inpainting methods | 43
Knowledge distillation for incremental learning in semantic segmentation | 42
CUFD: An encoder–decoder network for visible and infrared image fusion based on common and unique feature decomposition | 38
Age estimation from faces using deep learning: A comparative analysis | 36
Nighttime image dehazing based on Retinex and dark channel prior using Taylor series expansion | 34
Visual complexity analysis using deep intermediate-layer features | 33
Multi-focus image fusion approach based on CNP systems in NSCT domain | 31
Detection of Face Recognition Adversarial Attacks | 29
Human action recognition in drone videos using a few aerial training examples | 29
Adversarial examples for replay attacks against CNN-based face recognition with anti-spoofing capability | 28
End-to-end deep learning-based fringe projection framework for 3D profiling of objects | 27
Learning deep edge prior for image denoising | 27
Curriculum self-paced learning for cross-domain object detection | 25
MTRNet++: One-stage mask-based scene text eraser | 24
Visual object tracking: A survey | 24
Video Deblurring via Spatiotemporal Pyramid Network and Adversarial Gradient Prior | 24
The synergy of double attention: Combine sentence-level and word-level attention for image captioning | 24
Cascade multi-head attention networks for action recognition | 23
Decoupled appearance and motion learning for efficient anomaly detection in surveillance video | 22
ICycleGAN: Single image dehazing based on iterative dehazing model and CycleGAN | 21
A survey on bias in visual datasets | 21
Predicting the future from first person (egocentric) vision: A survey | 21
Detail preserving image denoising with patch-based structure similarity via sparse representation and SVD | 20
Ghost Removal via Channel Attention in Exposure Fusion | 20
Uncertainty-aware consistency regularization for cross-domain semantic segmentation | 20
Enhanced discriminative graph convolutional network with adaptive temporal modelling for skeleton-based action recognition | 20
Multi-scale attention network for image inpainting | 18
Deep structural information fusion for 3D object detection on LiDAR–camera system | 18
Multimodal attention networks for low-level vision-and-language navigation | 17
Person re-identification with part prediction alignment | 16
SSMTL++: Revisiting self-supervised multi-task learning for video anomaly detection | 16
Hyperspectral image restoration via CNN denoiser prior regularized low-rank tensor recovery | 15
Real-time and accurate object detection in compressed video by long short-term feature aggregation | 15
SSDA-YOLO: Semi-supervised domain adaptive YOLO for cross-domain object detection | 15
Pruning CNN filters via quantifying the importance of deep visual representations | 15
JSNet: A simulation network of JPEG lossy compression and restoration for robust image watermarking against JPEG attack | 15
Scalable learning for bridging the species gap in image-based plant phenotyping | 14
An attention recurrent model for human cooperation detection | 14
Evaluate and improve the quality of neural style transfer | 14
Attentive deep network for blind motion deblurring on dynamic scenes | 14
Sejong face database: A multi-modal disguise face database | 14
Image dehazing based on a transmission fusion strategy by automatic image matting | 14
Deep learning-based single image face depth data enhancement | 14
Efficient dual attention SlowFast networks for video action recognition | 13
Task dependent deep LDA pruning of neural networks | 13
Automatic detection and localization of thighbone fractures in X-ray based on improved deep learning method | 13
Residual network with detail perception loss for single image super-resolution | 13
Product image recognition with guidance learning and noisy supervision | 13
Representation learning of image composition for aesthetic prediction | 13
Video action detection by learning graph-based spatio-temporal interactions | 13
PS-DeVCEM: Pathology-sensitive deep learning model for video capsule endoscopy based on weakly labeled data | 13
Cross-modal distillation for RGB-depth person re-identification | 12
Joint identification–verification for person re-identification: A four stream deep learning approach with improved quartet loss function | 12
A survey on RGB-D datasets | 12
Few-shot action recognition with implicit temporal alignment and pair similarity optimization | 12
Momental directional patterns for dynamic texture recognition | 12
Physics-based shading reconstruction for intrinsic image decomposition | 11
Embedding group and obstacle information in LSTM networks for human trajectory prediction in crowded scenes | 11
Comprehensive comparative evaluation of background subtraction algorithms in open sea environments | 11
Adaptive CNN filter pruning using global importance metric | 11
Frame-level refinement networks for skeleton-based gait recognition | 11
A novel shape matching descriptor for real-time static hand gesture recognition | 11
Graph-matching-based correspondence search for nonrigid point cloud registration | 11
Light-weight shadow detection via GCN-based annotation strategy and knowledge distillation | 11
Animal pose estimation: A closer look at the state-of-the-art, existing gaps and opportunities | 11
Unifying frame rate and temporal dilations for improved remote pulse detection | 11
Fully convolutional online tracking | 10
AC-VRNN: Attentive Conditional-VRNN for multi-future trajectory prediction | 10
Multi-modal semantic image segmentation | 10
SID: Incremental learning for anchor-free object detection via Selective and Inter-related Distillation | 10
Image retrieval with mixed initiative and multimodal feedback | 10
Robust real-world point cloud registration by inlier detection | 10
Lightweight adaptive weighted network for single image super-resolution | 10
Intelligent video analysis: A Pedestrian trajectory extraction method for the whole indoor space without blind areas | 10
Facial landmarks localization using cascaded neural networks | 10
Single image rain removal via multi-module deep grid network | 10
Investigating the significance of adversarial attacks and their relation to interpretability for radar-based human activity recognition systems | 9
Learning to locate for fine-grained image recognition | 9
Facial landmark points detection using knowledge distillation-based neural networks | 9
A data augmentation framework by mining structured features for fake face image detection | 9
Self-supervised on-line cumulative learning from video streams | 9
MC-Calib: A generic and robust calibration toolbox for multi-camera systems | 9
Guess where? Actor-supervision for spatiotemporal action localization | 9
Visual BMI estimation from face images using a label distribution based method | 9
Rotation invariant features based on three dimensional Gaussian Markov random fields for volumetric texture classification | 8
A multi-view-CNN framework for deep representation learning in image classification | 8
Multi-human Fall Detection and Localization in Videos | 8
BacklitNet: A dataset and network for backlit image enhancement | 8
When CNNs meet random RNNs: Towards multi-level analysis for RGB-D object and scene recognition | 8
Monocular 3D multi-person pose estimation via predicting factorized correction factors | 8
Accurate MR image super-resolution via lightweight lateral inhibition network | 8
Encoder and decoder network with ResNet-50 and global average feature pooling for local change detection | 8
Learning a confidence measure in the disparity domain from O(1) features | 7
FIFNET: A convolutional neural network for motion-based multiframe super-resolution using fusion of interpolated frames | 7
Context understanding in computer vision: A survey | 7
Periocular biometrics and its relevance to partially masked faces: A survey | 7
Model-image registration of a building’s facade based on dense semantic segmentation | 7
A comparison of methods for 3D scene shape retrieval | 7
Open cross-domain visual search | 7
MTCD: Cataract detection via near infrared eye images | 7
Self-attentive 3D human pose and shape estimation from videos | 7
Video scene parsing: An overview of deep learning methods and datasets | 7
Detecting abnormality with separated foreground and background: Mutual Generative Adversarial Networks for video abnormal event detection | 7
Rolling-Shutter-stereo-aware motion estimation and image correction | 7
Multiple instance learning on deep features for weakly supervised object detection with extreme domain shifts | 6
Multi-perspective cross-class domain adaptation for open logo detection | 6
Semantic segmentation from remote sensor data and the exploitation of latent learning for classification of auxiliary tasks | 6
Anti-jamming heart rate estimation using a spatial–temporal fusion network | 6
Pointly-supervised scene parsing with uncertainty mixture | 6
On the exact recovery conditions of 3D human motion from 2D landmark motion with sparse articulated motion | 6
Are 3D convolutional networks inherently biased towards appearance? | 6
Deep code operation network for multi-label image retrieval | 6
Multi-person 3D pose estimation from a single image captured by a fisheye camera | 6
Anchor pruning for object detection | 6
LSTM guided ensemble correlation filter tracking with appearance model pool | 6
E-ProSRNet: An enhanced progressive single image super-resolution approach | 6
Adversarial feature distribution alignment for semi-supervised learning | 6
Video frame interpolation via down–up scale generative adversarial networks | 6
Classifier-agnostic saliency map extraction | 6
MetaVD: A Meta Video Dataset for enhancing human action recognition datasets | 6
Camouflaged object detection via Neighbor Connection and Hierarchical Information Transfer | 6
HSGAN: Reducing mode collapse in GANs by the latent code distance of homogeneous samples | 6
Weakly supervised instance segmentation using multi-prior fusion | 6
Diff attention: A novel attention scheme for person re-identification | 5
Zero-shot sketch-based image retrieval with structure-aware asymmetric disentanglement | 5
Triplanar convolution with shared 2D kernels for 3D classification and shape retrieval | 5
Dissected 3D CNNs: Temporal skip connections for efficient online video processing | 5
One-class anomaly detection via novelty normalization | 5
Fine-grained facial landmark detection exploiting intermediate feature representations | 5
Human skeletons and change detection for efficient violence detection in surveillance videos | 5
Learning to teach and learn for semi-supervised few-shot image classification | 5
LiDARTouch: Monocular metric depth estimation with a few-beam LiDAR | 5
STURE: Spatial–Temporal Mutual Representation Learning for robust data association in online multi-object tracking | 5
Reliable shot identification for complex event detection via visual-semantic embedding | 5
TMF: Temporal Motion and Fusion for action recognition | 5
Adaptive Capsule Network | 5
Dynamic mode decomposition via convolutional autoencoders for dynamics modeling in videos | 5
Diversified text-to-image generation via deep mutual information estimation | 5
Single image super-resolution via hybrid resolution NSST prediction | 5
TransRPN: Towards the Transferable Adversarial Perturbations using Region Proposal Networks and Beyond | 5
3D semantic segmentation based on spatial-aware convolution and shape completion for augmented reality applications | 5
DenseNet-CTC: An end-to-end RNN-free architecture for context-free string recognition | 5
Learning representational invariances for data-efficient action recognition | 5
Infrared and visible image fusion using a guiding network to leverage perceptual similarity | 5
Learning transformer-based attention region with multiple scales for occluded person re-identification | 5
Pick-Object-Attack: Type-specific adversarial attack for object detection | 5
SAPS: Self-Attentive Pathway Search for weakly-supervised action localization with background-action augmentation | 5
SnapshotNet: Self-supervised feature learning for point cloud data segmentation using minimal labeled data | 5
A semantically driven self-supervised algorithm for detecting anomalies in image sets | 5
Feature reconstruction and metric based network for few-shot object detection | 5
Refining high-frequencies for sharper super-resolution and deblurring | 5
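
The selection rule stated in the note at the top of this list (a citation threshold, a four-year publication window, and a cap of 250 papers) can be sketched in Python as follows. This is a minimal illustration, not the site's actual implementation: the `Paper` record, the `select_papers` function, the inclusive treatment of both window dates, and the sample entries are all assumptions made for the example, and the sample titles and dates are hypothetical rather than taken from the list.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Paper:
    """Hypothetical record; the field names are assumptions for this sketch."""
    title: str
    citations: int
    published: date


def select_papers(papers, threshold, start, end, limit=250):
    """Keep papers published within [start, end] (assumed inclusive) whose
    citation count is at least `threshold`, sorted by citations in
    descending order and capped at `limit` entries."""
    eligible = [
        p for p in papers
        if start <= p.published <= end and p.citations >= threshold
    ]
    eligible.sort(key=lambda p: p.citations, reverse=True)
    return eligible[:limit]


# Made-up sample data, used only to exercise the function.
sample = [
    Paper("Hypothetical paper A", 12, date(2021, 5, 1)),
    Paper("Hypothetical paper B", 5, date(2023, 1, 15)),
    Paper("Hypothetical paper C", 4, date(2022, 7, 1)),     # below threshold
    Paper("Hypothetical paper D", 30, date(2019, 12, 31)),  # outside window
]

selected = select_papers(
    sample,
    threshold=5,
    start=date(2020, 3, 1),
    end=date(2024, 3, 1),
)
for paper in selected:
    print(f"{paper.title} | {paper.citations}")
```

With a threshold of 5, a paper with exactly 5 citations is kept, which matches the lowest counts shown in the list.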