Computer Vision and Image Understanding

Papers
(The median citation count of Computer Vision and Image Understanding is 1. The table below lists the papers whose CrossRef citation counts are above that threshold [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-02-01 to 2025-02-01.)
Article | Citations
Fake News Detection Based on BERT Multi-domain and Multi-modal Fusion Network | 178
Luminance prior guided Low-Light 4C catenary image enhancement | 165
Twin-SegNet: Dynamically coupled complementary segmentation networks for generalized medical image segmentation | 161
Estimating 3D body mesh without SMPL annotations via alternating successive convex approximation | 161
Periocular biometrics and its relevance to partially masked faces: A survey | 102
Target-aware and spatial-spectral discriminant feature joint correlation filters for hyperspectral video object tracking | 91
A novel fast combine-and-conquer object detector based on only one-level feature map | 78
Knowledge distillation for incremental learning in semantic segmentation | 74
Static graph convolution with learned temporal and channel-wise graph topology generation for skeleton-based action recognition | 66
FusionDiff: A unified image fusion network based on diffusion probabilistic models | 53
De2Net: Under- | 52
Improving the planarity and sharpness of monocularly estimated depth images using the Phong reflection model | 51
Learning rotation equivalent scene representation from instance-level semantics: A novel top-down perspective | 42
GAMA: Geometric analysis based motion-aware architecture for moving object segmentation | 42
Editorial Board | 42
Precondition and effect reasoning for action recognition | 38
MATTE: Multi-task multi-scale attention | 38
Editorial Board | 37
Efficient multi-stage network with pixel-wise degradation prediction for real-time motion deblurring | 36
Lightweight feature point detection network with channel enhancement | 33
Editorial Board | 28
Trimap-guided feature mining and fusion network for natural image matting | 28
Certifiable algorithms for the two-view planar triangulation problem | 27
Editorial Board | 27
Cost-free adversarial defense: Distance-based optimization for model robustness without adversarial training | 25
Sparse coding with morphology segmentation and multi-label fusion for hyperspectral image super-resolution | 24
Feature reconstruction and metric based network for few-shot object detection | 24
Adaptive gradients and weight projection based on quantized neural networks for efficient image classification | 24
Editorial Board | 22
MetaVD: A Meta Video Dataset for enhancing human action recognition datasets | 21
Streaming egocentric action anticipation: An evaluation scheme and approach | 20
Editorial Board | 20
Structural reasoning for image-based social relation recognition | 20
Online object tracking based interactive attention | 20
Editorial Board | 20
Multi-view clustering with Laplacian rank constraint based on symmetric and nonnegative low-rank representation | 19
Towards explainable deep visual saliency models | 19
Efficient 6-DoF camera pose tracking with circular edges | 19
Simplifying open-set video domain adaptation with contrastive learning | 18
Editorial Board | 18
LOFReg: An outlier-based regulariser for deep metric learning | 18
Domain generalized federated learning for Person Re-identification | 17
On the coherency of quantitative evaluation of visual explanations | 17
Dehazing cost volume for deep multi-view stereo in scattering media with airlight and scattering coefficient estimation | 17
Feature independent Filter Pruning by Successive Layers analysis | 17
Handling new target classes in semantic segmentation with domain adaptation | 17
Editorial Board | 16
Superclass-aware network for few-shot learning | 16
Physics-based shading reconstruction for intrinsic image decomposition | 16
Emerging image generation with flexible control of perceived difficulty | 15
Editorial Board | 15
Exploring using jigsaw puzzles for out-of-distribution detection | 15
Feature fine-tuning and attribute representation transformation for zero-shot learning | 15
Editorial Board | 15
Efficient cross-information fusion decoder for semantic segmentation | 14
Real-world efficient fall detection: Balancing performance and complexity with FDGA workflow | 14
SdcNet for object recognition | 14
Joint coupled dictionaries-based visible-infrared image fusion method via texture preservation structure in sparse domain | 14
3DF-FCOS: Small object detection with 3D features based on FCOS | 14
Frame-level refinement networks for skeleton-based gait recognition | 14
Fréchet AutoEncoder Distance: A new approach for evaluation of Generative Adversarial Networks | 14
Deducing health cues from biometric data | 14
Self-supervision & meta-learning for one-shot unsupervised cross-domain detection | 14
Multi-patch multi-scale model for motion deblurring with high-frequency information | 13
Minimum error adaptive RGB calibration in a context of colorimetric uncertainty for cultural heritage preservation | 13
Editorial Board | 13
Image amodal completion: A survey | 12
Collaborative three-stream transformers for video captioning | 12
Robust Teacher: Self-correcting pseudo-label-guided semi-supervised learning for object detection | 11
Anti-jamming heart rate estimation using a spatial–temporal fusion network | 11
LocoGAN — Locally convolutional GAN | 11
Siamese self-supervised learning for fine-grained visual classification | 11
Weakly supervised action segmentation with effective use of attention and self-attention | 11
Instance-level salient object segmentation | 10
DHBSR: A deep hybrid representation-based network for blind image super resolution | 10
Modality adaptation via feature difference learning for depth human parsing | 10
Discriminative object tracking by domain contrast | 10
Skeleton Cluster Tracking for robust multi-view multi-person 3D human pose estimation | 10
FedER: Federated Learning through Experience Replay and privacy-preserving data synthesis | 10
Grow-push-prune: Aligning deep discriminants for effective structural network compression | 10
Editorial Board | 10
Robust real-world point cloud registration by inlier detection | 10
AdvFAS: A robust face anti-spoofing framework against adversarial examples | 10
Learning to teach and learn for semi-supervised few-shot image classification | 10
MFCT: Multi-Frequency Cascade Transformers for no-reference SR-IQA | 9
CT-VOS: Cutout prediction and tagging for self-supervised video object segmentation | 9
An egocentric video and eye-tracking dataset for visual search in convenience stores | 9
Confidence sharing adaptation for out-of-domain human pose and shape estimation | 9
Dual cross perception network with texture and boundary guidance for camouflaged object detection | 9
MDC-Net: Multi-domain constrained kernel estimation network for blind image super resolution | 9
EFSCNN: Encoded Feature Sphere Convolution Neural Network for fast non-rigid 3D models classification and retrieval | 9
Bidirectional brain image translation using transfer learning from generic pre-trained models | 9
View consistency aware holistic triangulation for 3D human pose estimation | 9
Deep video compression based on Long-range Temporal Context Learning | 8
Semantically accurate super-resolution Generative Adversarial Networks | 8
Exploring the differences in adversarial robustness between ViT- and CNN-based models using novel metrics | 8
3D semantic segmentation based on spatial-aware convolution and shape completion for augmented reality applications | 8
Decoupled appearance and motion learning for efficient anomaly detection in surveillance video | 8
SCA-Net: Spatial and channel attention-based network for 3D point clouds | 8
Anchor pruning for object detection | 8
SimpleCut: A simple and strong 2D model for multi-person pose estimation | 8
RFCNet: Enhancing urban segmentation using regularization, fusion, and completion | 8
Improved high dynamic range imaging using multi-scale feature flows balanced between task-orientedness and accuracy | 8
A review of 3D human pose estimation algorithms for markerless motion capture | 8
Scene adaptive mechanism for action recognition | 8
End-to-end pedestrian trajectory prediction via Efficient Multi-modal Predictors | 8
Extending function mixture network for improved spectral super-resolution | 8
Targeted adversarial attack on classic vision pipelines | 7
Seam estimation based on dense matching for parallax-tolerant image stitching | 7
CRML-Net: Cross-Modal Reasoning and Multi-Task Learning Network for tooth image segmentation | 7
A fast differential network with adaptive reference sample for gaze estimation | 7
FTM: The Face Truth Machine—Hand-crafted features from micro-expressions to support lie detection | 7
Local optimization cropping and boundary enhancement for end-to-end weakly-supervised segmentation network | 7
Facial landmarks localization using cascaded neural networks | 7
Dual stage semantic information based generative adversarial network for image super-resolution | 7
M-adapter: Multi-level image-to-video adaptation for video action recognition | 7
Delving into CLIP latent space for Video Anomaly Recognition | 7
3D scene generation for zero-shot learning using ChatGPT guided language prompts | 7
AWADA: Foreground-focused adversarial learning for cross-domain object detection | 7
An unsupervised multi-focus image fusion method via dual-channel convolutional network and discriminator | 7
Multi-Scale Adaptive Skeleton Transformer for action recognition | 7
RetSeg3D: Retention-based 3D semantic segmentation for autonomous driving | 7
PMGNet: Disentanglement and entanglement benefit mutually for compositional zero-shot learning | 7
Spatial attention for human-centric visual understanding: An Information Bottleneck method | 7
Enhancing image-based facial expression recognition through muscle activation-based facial feature extraction | 6
Empirical study on using adapters for debiased Visual Question Answering | 6
Single-image deblurring with neural networks: A comparative survey | 6
Vision and Structured-Language Pretraining for Cross-Modal Food Retrieval | 6
Full-body virtual try-on using top and bottom garments with wearing style control | 6
Incorporating structural prior for depth regularization in shape from focus | 6
AFA-Net: Adaptive Feature Attention Network in image deblurring and super-resolution for improving license plate recognition | 6
PPformer: Using pixel-wise and patch-wise cross-attention for low-light image enhancement | 6
Unifying frame rate and temporal dilations for improved remote pulse detection | 6
Evaluate and improve the quality of neural style transfer | 6
Rolling-Shutter-stereo-aware motion estimation and image correction | 6
Multi-label image classification using adaptive graph convolutional networks: From a single domain to multiple domains | 6
Embedding AI ethics into the design and use of computer vision technology for consumer’s behaviour understanding | 6
Rebalanced supervised contrastive learning with prototypes for long-tailed visual recognition | 6
UATST: Towards unpaired arbitrary text-guided style transfer with cross-space modulation | 6
Low-budget label query through domain alignment enforcement | 6
Editorial Board | 6
Curriculum self-paced learning for cross-domain object detection | 6
End-to-end weakly-supervised single-stage multiple 3D hand mesh reconstruction from a single RGB image | 6
Deep learning-based single image face depth data enhancement | 6
Implicit and explicit commonsense for multi-sentence video captioning | 6
Deep-STaR: Classification of image time series based on spatio-temporal representations | 6
Adaptive semantic guidance network for video captioning | 6
Semantic-preserved point-based human avatar | 6
MAL-Net: Multiscale Attention Link Network for accurate eye center detection | 6
Cross-domain fashion cloth retrieval via novel attention-guided cascade neural network and clothing parsing | 5
RelFormer: Advancing contextual relations for transformer-based dense captioning | 5
Enhanced local distribution learning for real image super-resolution | 5
Video frame interpolation via down–up scale generative adversarial networks | 5
Deep structural information fusion for 3D object detection on LiDAR–camera system | 5
Editorial Board | 5
Action assessment in rehabilitation: Leveraging machine learning and vision-based analysis | 5
Building extraction from remote sensing images with deep learning: A survey on vision techniques | 5
3D detection transformer: Set prediction of objects using point clouds | 5
As-Global-As-Possible stereo matching with Sparse Depth Measurement Fusion | 5
Improved domain adaptive object detector via adversarial feature learning | 5
Indoor Synthetic Data Generation: A Systematic Review | 5
Multi-perspective cross-class domain adaptation for open logo detection | 5
Nonlocal Gaussian scale mixture modeling for hyperspectral image denoising | 5
Self-supervised vision transformers for semantic segmentation | 5
Prediction and Description of Near-Future Activities in Video | 5
CMGNet: Collaborative multi-modal graph network for video captioning | 5
Multi-person 3D pose estimation from a single image captured by a fisheye camera | 5
A semantically driven self-supervised algorithm for detecting anomalies in image sets | 5
Adversarial composite prediction of normal video dynamics for anomaly detection | 5
RS3Lip: Consis | 5
Lmser-pix2seq: Learning stable sketch representations for sketch healing | 4
A formal approach to good practices in Pseudo-Labeling for Unsupervised Domain Adaptive Re-Identification | 4
Single image super-resolution via hybrid resolution NSST prediction | 4
Editorial Board | 4
Image retrieval with mixed initiative and multimodal feedback | 4
Constituent Attention for Vision Transformers | 4
Semantic segmentation from remote sensor data and the exploitation of latent learning for classification of auxiliary tasks | 4
Editorial Board | 4
MERLIN-Seg: Self-supervised despeckling for label-efficient semantic segmentation | 4
Cutout with patch-loss augmentation for improving generative adversarial networks against instability | 4
Editorial Board | 4
Editorial Board | 4
A new deep CNN for 3D text localization in the wild through shadow removal | 4
Improved Short-term Dense Bottleneck network for efficient scene analysis | 4
Editorial Board | 4
SIERRA: A robust bilateral feature upsampler for dense prediction | 4
MAIN: Multi-Attention Instance Network for video segmentation | 4
Improving the robustness of adversarial attacks using an affine-invariant gradient estimator | 4
Disentangled generation network for enlarged license plate recognition and a unified dataset | 4
A distribution independence based method for 3D face shape decomposition | 4
GAFL: Global adaptive filtering layer for computer vision | 4
Class knowledge overlay to visual feature learning for zero-shot image classification | 4
LKDA-GAN: Cross-modality image synthesis via Generative Adversarial Network aggregating large kernel decomposable attention bottleneck block | 4
A closer look at branch classifiers of multi-exit architectures | 4
TransRPN: Towards the Transferable Adversarial Perturbations using Region Proposal Networks and Beyond | 4
3D object feature extraction and classification using 3D MF-DFA | 4
High frame rate optical flow estimation from event sensors via intensity estimation | 4
Unsupervised domain adaptation for semantic segmentation via cross-region alignment | 4
Hierarchical image peeling: A flexible scale-space filtering framework | 4
Feature preserving 3D mesh denoising with a Dense Local Graph Neural Network | 3
Advancing Image Generation with Denoising Diffusion Probabilistic Model and ConvNeXt-V2: A novel approach for enhanced diversity and quality | 3
Enhanced local multi-windows attention network for lightweight image super-resolution | 3
ARCTIC: A knowledge distillation approach via attention-based relation matching and activation region constraint for RGB-to-Infrared videos action recognition | 3
Editorial Board | 3
TECD_Attention: Texture-enhanced and cross-domain attention modeling for visual place recognition | 3
Multi-focus image fusion approach based on CNP systems in NSCT domain | 3
Stacked Capsule Graph Autoencoders for geometry-aware 3D head pose estimation | 3
Opti-CAM: Optimizing saliency maps for interpretability | 3
Model-based inexact graph matching on top of DNNs for semantic scene understanding | 3
ICycleGAN: Single image dehazing based on iterative dehazing model and CycleGAN | 3
Spatial attention inference model for cascaded siamese tracking with dynamic residual update strategy | 3
Fourier analysis on robustness of graph convolutional neural networks for skeleton-based action recognition | 3
On the inductive biases of deep domain adaptation | 3
PGF-BIQA: Blind image quality assessment via probability multi-grained cascade forest | 3
Improving semantic video retrieval models by training with a relevance-aware online mining strategy | 3
Low-light image enhancement by deep learning network for improved illumination map | 3
A comprehensive review of past and present image inpainting methods | 3
MKP-Net: Memory knowledge propagation network for point-supervised temporal action localization in livestreaming | 3
Human skeletons and change detection for efficient violence detection in surveillance videos | 3
Multi-timescale boosting for efficient and improved event camera face pose alignment | 3
Towards efficient image and video style transfer via distillation and learnable feature transformation | 3
LLAFN-Generator: Learnable linear-attention with fast-normalization for large-scale image captioning | 3
Certifiable planar relative pose estimation with gravity prior | 3
Audio–visual deepfake detection using articulatory representation learning | 3
Sejong face database: A multi-modal disguise face database | 3
Adaptive feature denoising based deep convolutional network for single image super-resolution | 3
Identity-preserving editing of multiple facial attributes by learning global edit directions and local adjustments | 3
Learning geodesic-aware local features from RGB-D images | 3
Memory-efficient multi-scale residual dense network for single image rain removal | 3
Light-weight shadow detection via GCN-based annotation strategy and knowledge distillation | 3
Bridging the gap between object detection in close-up and high-resolution wide shots | 3
Adaptive semantic transfer network for unsupervised 2D image-based 3D model retrieval | 3
Robust detection of dehazed images via dual-stream CNNs with adaptive feature fusion | 3
Fake face detection via adaptive manipulation traces extraction network | 3
Adaptive CNN filter pruning using global importance metric | 3
Margin-based discriminant embedding guided sparse matrix regression for image supervised feature selection | 3
SANet: Selective Aggregation Network for unsupervised object re-identification | 3
Snow Mask Guided Adaptive Residual Network for Image Snow Removal | 3
Frequency aware face hallucination generative adversarial network with semantic structural constraint | 3
An image denoising method based on the nonlinear Schrödinger equation and spectral subband decomposition | 3
Deep unsupervised shadow detection with curriculum learning and self-training | 3
CPNet: Continuity Preservation Network for infrared video colorization | 3
Hi-ROS: Open-source multi-camera sensor fusion for real-time people tracking | 3
SlowFastFormer for 3D human pose estimation | 3
The MSR-Video to Text dataset with clean annotations | 3
Text to image synthesis with multi-granularity feature aware enhancement Generative Adversarial Networks | 3
View-aligned pixel-level feature aggregation for 3D shape classification | 3
MoMa: Skinned motion retargeting using masked pose modeling | 3
Recurrent context-aware multi-stage network for single image deraining | 3
Estimating the vertical direction in a photogrammetric 3D model, with application to visualization | 3
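
The selection rule stated in the note above the table (keep papers whose citation count is above the journal median, capped at 250 rows) can be sketched as follows. This is only an illustration under assumptions: the helper name `select_above_median`, the `(title, citations)` tuple layout, the strict `>` comparison, the descending sort, and the sample data are all hypothetical and are not taken from the site that produced this list.

```python
from statistics import median


def select_above_median(papers, max_rows=250):
    """Return (title, citations) pairs strictly above the median count.

    Hypothetical helper: `papers` is a list of (title, citations) tuples.
    The result is sorted by citations, highest first, and capped at max_rows.
    """
    threshold = median(count for _, count in papers)
    above = [paper for paper in papers if paper[1] > threshold]
    above.sort(key=lambda paper: paper[1], reverse=True)
    return above[:max_rows]


# Made-up sample data; the median of these counts is 1.
sample = [("A", 178), ("B", 3), ("C", 1), ("D", 0), ("E", 1)]
selected = select_above_median(sample)
```

With this sample, only "A" and "B" are strictly above the median, so they are the rows that would be listed.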