Computer Vision and Image Understanding

Papers
(The TQCC of Computer Vision and Image Understanding is 4. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-02-01 to 2025-02-01.)
Article | Citations
Fake News Detection Based on BERT Multi-domain and Multi-modal Fusion Network | 178
Luminance prior guided Low-Light 4C catenary image enhancement | 165
Estimating 3D body mesh without SMPL annotations via alternating successive convex approximation | 161
Twin-SegNet: Dynamically coupled complementary segmentation networks for generalized medical image segmentation | 161
Periocular biometrics and its relevance to partially masked faces: A survey | 102
Target-aware and spatial-spectral discriminant feature joint correlation filters for hyperspectral video object tracking | 91
A novel fast combine-and-conquer object detector based on only one-level feature map | 78
Knowledge distillation for incremental learning in semantic segmentation | 74
Static graph convolution with learned temporal and channel-wise graph topology generation for skeleton-based action recognition | 66
FusionDiff: A unified image fusion network based on diffusion probabilistic models | 53
De2Net: Under- | 52
Improving the planarity and sharpness of monocularly estimated depth images using the Phong reflection model | 51
GAMA: Geometric analysis based motion-aware architecture for moving object segmentation | 42
Editorial Board | 42
Learning rotation equivalent scene representation from instance-level semantics: A novel top-down perspective | 42
Precondition and effect reasoning for action recognition | 38
MATTE: Multi-task multi-scale attention | 38
Editorial Board | 37
Efficient multi-stage network with pixel-wise degradation prediction for real-time motion deblurring | 36
Lightweight feature point detection network with channel enhancement | 33
Trimap-guided feature mining and fusion network for natural image matting | 28
Editorial Board | 28
Certifiable algorithms for the two-view planar triangulation problem | 27
Editorial Board | 27
Cost-free adversarial defense: Distance-based optimization for model robustness without adversarial training | 25
Adaptive gradients and weight projection based on quantized neural networks for efficient image classification | 24
Sparse coding with morphology segmentation and multi-label fusion for hyperspectral image super-resolution | 24
Feature reconstruction and metric based network for few-shot object detection | 24
Editorial Board | 22
MetaVD: A Meta Video Dataset for enhancing human action recognition datasets | 21
Online object tracking based interactive attention | 20
Editorial Board | 20
Streaming egocentric action anticipation: An evaluation scheme and approach | 20
Editorial Board | 20
Structural reasoning for image-based social relation recognition | 20
Towards explainable deep visual saliency models | 19
Efficient 6-DoF camera pose tracking with circular edges | 19
Multi-view clustering with Laplacian rank constraint based on symmetric and nonnegative low-rank representation | 19
Editorial Board | 18
LOFReg: An outlier-based regulariser for deep metric learning | 18
Simplifying open-set video domain adaptation with contrastive learning | 18
On the coherency of quantitative evaluation of visual explanations | 17
Dehazing cost volume for deep multi-view stereo in scattering media with airlight and scattering coefficient estimation | 17
Feature independent Filter Pruning by Successive Layers analysis | 17
Handling new target classes in semantic segmentation with domain adaptation | 17
Domain generalized federated learning for Person Re-identification | 17
Superclass-aware network for few-shot learning | 16
Physics-based shading reconstruction for intrinsic image decomposition | 16
Editorial Board | 16
Feature fine-tuning and attribute representation transformation for zero-shot learning | 15
Editorial Board | 15
Emerging image generation with flexible control of perceived difficulty | 15
Editorial Board | 15
Exploring using jigsaw puzzles for out-of-distribution detection | 15
Fréchet AutoEncoder Distance: A new approach for evaluation of Generative Adversarial Networks | 14
Deducing health cues from biometric data | 14
Self-supervision & meta-learning for one-shot unsupervised cross-domain detection | 14
Efficient cross-information fusion decoder for semantic segmentation | 14
Real-world efficient fall detection: Balancing performance and complexity with FDGA workflow | 14
SdcNet for object recognition | 14
Joint coupled dictionaries-based visible-infrared image fusion method via texture preservation structure in sparse domain | 14
3DF-FCOS: Small object detection with 3D features based on FCOS | 14
Frame-level refinement networks for skeleton-based gait recognition | 14
Editorial Board | 13
Multi-patch multi-scale model for motion deblurring with high-frequency information | 13
Minimum error adaptive RGB calibration in a context of colorimetric uncertainty for cultural heritage preservation | 13
Collaborative three-stream transformers for video captioning | 12
Image amodal completion: A survey | 12
LocoGAN — Locally convolutional GAN | 11
Siamese self-supervised learning for fine-grained visual classification | 11
Weakly supervised action segmentation with effective use of attention and self-attention | 11
Robust Teacher: Self-correcting pseudo-label-guided semi-supervised learning for object detection | 11
Anti-jamming heart rate estimation using a spatial–temporal fusion network | 11
Skeleton Cluster Tracking for robust multi-view multi-person 3D human pose estimation | 10
FedER: Federated Learning through Experience Replay and privacy-preserving data synthesis | 10
Grow-push-prune: Aligning deep discriminants for effective structural network compression | 10
Editorial Board | 10
Robust real-world point cloud registration by inlier detection | 10
AdvFAS: A robust face anti-spoofing framework against adversarial examples | 10
Learning to teach and learn for semi-supervised few-shot image classification | 10
Instance-level salient object segmentation | 10
DHBSR: A deep hybrid representation-based network for blind image super resolution | 10
Modality adaptation via feature difference learning for depth human parsing | 10
Discriminative object tracking by domain contrast | 10
EFSCNN: Encoded Feature Sphere Convolution Neural Network for fast non-rigid 3D models classification and retrieval | 9
View consistency aware holistic triangulation for 3D human pose estimation | 9
Bidirectional brain image translation using transfer learning from generic pre-trained models | 9
CT-VOS: Cutout prediction and tagging for self-supervised video object segmentation | 9
Confidence sharing adaptation for out-of-domain human pose and shape estimation | 9
MFCT: Multi-Frequency Cascade Transformers for no-reference SR-IQA | 9
An egocentric video and eye-tracking dataset for visual search in convenience stores | 9
MDC-Net: Multi-domain constrained kernel estimation network for blind image super resolution | 9
Dual cross perception network with texture and boundary guidance for camouflaged object detection | 9
A review of 3D human pose estimation algorithms for markerless motion capture | 8
Scene adaptive mechanism for action recognition | 8
End-to-end pedestrian trajectory prediction via Efficient Multi-modal Predictors | 8
Extending function mixture network for improved spectral super-resolution | 8
Deep video compression based on Long-range Temporal Context Learning | 8
Semantically accurate super-resolution Generative Adversarial Networks | 8
Exploring the differences in adversarial robustness between ViT- and CNN-based models using novel metrics | 8
3D semantic segmentation based on spatial-aware convolution and shape completion for augmented reality applications | 8
Decoupled appearance and motion learning for efficient anomaly detection in surveillance video | 8
SCA-Net: Spatial and channel attention-based network for 3D point clouds | 8
Anchor pruning for object detection | 8
SimpleCut: A simple and strong 2D model for multi-person pose estimation | 8
RFCNet: Enhancing urban segmentation using regularization, fusion, and completion | 8
Improved high dynamic range imaging using multi-scale feature flows balanced between task-orientedness and accuracy | 8
An unsupervised multi-focus image fusion method via dual-channel convolutional network and discriminator | 7
Multi-Scale Adaptive Skeleton Transformer for action recognition | 7
RetSeg3D: Retention-based 3D semantic segmentation for autonomous driving | 7
PMGNet: Disentanglement and entanglement benefit mutually for compositional zero-shot learning | 7
Spatial attention for human-centric visual understanding: An Information Bottleneck method | 7
Targeted adversarial attack on classic vision pipelines | 7
Seam estimation based on dense matching for parallax-tolerant image stitching | 7
CRML-Net: Cross-Modal Reasoning and Multi-Task Learning Network for tooth image segmentation | 7
A fast differential network with adaptive reference sample for gaze estimation | 7
FTM: The Face Truth Machine—Hand-crafted features from micro-expressions to support lie detection | 7
Local optimization cropping and boundary enhancement for end-to-end weakly-supervised segmentation network | 7
Facial landmarks localization using cascaded neural networks | 7
Dual stage semantic information based generative adversarial network for image super-resolution | 7
M-adapter: Multi-level image-to-video adaptation for video action recognition | 7
Delving into CLIP latent space for Video Anomaly Recognition | 7
3D scene generation for zero-shot learning using ChatGPT guided language prompts | 7
AWADA: Foreground-focused adversarial learning for cross-domain object detection | 7
Curriculum self-paced learning for cross-domain object detection | 6
End-to-end weakly-supervised single-stage multiple 3D hand mesh reconstruction from a single RGB image | 6
Deep learning-based single image face depth data enhancement | 6
Implicit and explicit commonsense for multi-sentence video captioning | 6
Deep-STaR: Classification of image time series based on spatio-temporal representations | 6
Adaptive semantic guidance network for video captioning | 6
Semantic-preserved point-based human avatar | 6
MAL-Net: Multiscale Attention Link Network for accurate eye center detection | 6
Enhancing image-based facial expression recognition through muscle activation-based facial feature extraction | 6
Empirical study on using adapters for debiased Visual Question Answering | 6
Single-image deblurring with neural networks: A comparative survey | 6
Vision and Structured-Language Pretraining for Cross-Modal Food Retrieval | 6
Full-body virtual try-on using top and bottom garments with wearing style control | 6
Incorporating structural prior for depth regularization in shape from focus | 6
AFA-Net: Adaptive Feature Attention Network in image deblurring and super-resolution for improving license plate recognition | 6
PPformer: Using pixel-wise and patch-wise cross-attention for low-light image enhancement | 6
Unifying frame rate and temporal dilations for improved remote pulse detection | 6
Evaluate and improve the quality of neural style transfer | 6
Rolling-Shutter-stereo-aware motion estimation and image correction | 6
Multi-label image classification using adaptive graph convolutional networks: From a single domain to multiple domains | 6
Embedding AI ethics into the design and use of computer vision technology for consumer’s behaviour understanding | 6
Rebalanced supervised contrastive learning with prototypes for long-tailed visual recognition | 6
UATST: Towards unpaired arbitrary text-guided style transfer with cross-space modulation | 6
Low-budget label query through domain alignment enforcement | 6
Editorial Board | 6
Prediction and Description of Near-Future Activities in Video | 5
CMGNet: Collaborative multi-modal graph network for video captioning | 5
Multi-person 3D pose estimation from a single image captured by a fisheye camera | 5
A semantically driven self-supervised algorithm for detecting anomalies in image sets | 5
Adversarial composite prediction of normal video dynamics for anomaly detection | 5
RS3Lip: Consis | 5
Cross-domain fashion cloth retrieval via novel attention-guided cascade neural network and clothing parsing | 5
RelFormer: Advancing contextual relations for transformer-based dense captioning | 5
Enhanced local distribution learning for real image super-resolution | 5
Video frame interpolation via down–up scale generative adversarial networks | 5
Deep structural information fusion for 3D object detection on LiDAR–camera system | 5
Editorial Board | 5
Action assessment in rehabilitation: Leveraging machine learning and vision-based analysis | 5
Building extraction from remote sensing images with deep learning: A survey on vision techniques | 5
3D detection transformer: Set prediction of objects using point clouds | 5
As-Global-As-Possible stereo matching with Sparse Depth Measurement Fusion | 5
Improved domain adaptive object detector via adversarial feature learning | 5
Indoor Synthetic Data Generation: A Systematic Review | 5
Multi-perspective cross-class domain adaptation for open logo detection | 5
Nonlocal Gaussian scale mixture modeling for hyperspectral image denoising | 5
Self-supervised vision transformers for semantic segmentation | 5
Editorial Board | 4
MERLIN-Seg: Self-supervised despeckling for label-efficient semantic segmentation | 4
Cutout with patch-loss augmentation for improving generative adversarial networks against instability | 4
Editorial Board | 4
Editorial Board | 4
TransRPN: Towards the Transferable Adversarial Perturbations using Region Proposal Networks and Beyond | 4
A new deep CNN for 3D text localization in the wild through shadow removal | 4
High frame rate optical flow estimation from event sensors via intensity estimation | 4
Improved Short-term Dense Bottleneck network for efficient scene analysis | 4
SIERRA: A robust bilateral feature upsampler for dense prediction | 4
Improving the robustness of adversarial attacks using an affine-invariant gradient estimator | 4
Disentangled generation network for enlarged license plate recognition and a unified dataset | 4
A distribution independence based method for 3D face shape decomposition | 4
GAFL: Global adaptive filtering layer for computer vision | 4
Class knowledge overlay to visual feature learning for zero-shot image classification | 4
Image retrieval with mixed initiative and multimodal feedback | 4
Semantic segmentation from remote sensor data and the exploitation of latent learning for classification of auxiliary tasks | 4
LKDA-GAN: Cross-modality image synthesis via Generative Adversarial Network aggregating large kernel decomposable attention bottleneck block | 4
A closer look at branch classifiers of multi-exit architectures | 4
3D object feature extraction and classification using 3D MF-DFA | 4
Unsupervised domain adaptation for semantic segmentation via cross-region alignment | 4
Hierarchical image peeling: A flexible scale-space filtering framework | 4
Lmser-pix2seq: Learning stable sketch representations for sketch healing | 4
A formal approach to good practices in Pseudo-Labeling for Unsupervised Domain Adaptive Re-Identification | 4
Single image super-resolution via hybrid resolution NSST prediction | 4
Editorial Board | 4
MAIN: Multi-Attention Instance Network for video segmentation | 4
Editorial Board | 4
Constituent Attention for Vision Transformers | 4