Image and Vision Computing

Papers
(The median citation count of Image and Vision Computing is 3. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-11-01 to 2025-11-01.)
Article | Citations
Active domain adaptation for semantic segmentation via dynamically balancing domainness and uncertainty | 266
Alignment and fusion for adaptive domain nighttime semantic segmentation | 173
Cross-scale global attention feature pyramid network for person search | 146
Single stage architecture for improved accuracy real-time object detection on mobile devices | 123
Few-shot classification with multisemantic information fusion network | 109
Modeling content-attribute preference for personalized image esthetics assessment | 101
Feature decoupling and interaction network for defending against adversarial examples | 97
Learning diverse and deep clues for person reidentification | 87
Editorial Board | 78
ABC: Aligning binary centers for single-stage monocular 3D object detection | 77
Hourglass cascaded recurrent stereo matching network | 67
CODNet: Context-based object detection network for multimodal image captioning and virtual question answering | 64
BF3D: Bi-directional fusion 3D detector with semantic sampling and geometric mapping | 63
Accurate and efficient salient object detection via position prior attention | 60
Synthetic lidar point cloud generation using deep generative models for improved driving scene object recognition | 55
FMD-Yolo: An efficient face mask detection method for COVID-19 prevention and control in public | 49
GLMambaNet: Mamba-based decoder with local detail enhancement for semantic segmentation of remote sensing imagery | 44
G-TRACE: Grouped temporal recalibration for video object segmentation | 44
HPD-Depth: High performance decoding network for self-supervised monocular depth estimation | 43
SRMA-KD: Structured relational multi-scale attention knowledge distillation for effective lightweight cardiac image segmentation | 41
Background debiased class incremental learning for video action recognition | 41
ADVC: Adversarial dense video captioning with unsupervised pretraining | 37
Learning an augmentation strategy for sparse datasets | 36
PU-GACNet: Graph Attention Convolution Network for Point Cloud Upsampling | 35
AI-powered trustable and explainable fall detection system using transfer learning | 34
MAFUNet: Mamba with adaptive fusion UNet for medical image segmentation | 34
Multi-information guided camouflaged object detection | 34
GAN-BodyPose: Real-time 3D human body pose data key point detection and quality assessment assisted by generative adversarial network | 34
DMNet: Image dehazing via Dual-Domain Modulation | 33
1D kernel distillation network for efficient image super-resolution | 32
Privacy-preserving explainable AI enable federated learning-based denoising fingerprint recognition model | 32
DeepArUco++: Improved detection of square fiducial markers in challenging lighting conditions | 32
RGB-T tracking by modality difference reduction and feature re-selection | 32
Deep learning with adaptive convolutions for classification of retinal diseases via optical coherence tomography | 32
PST-Mamba: Spatio-temporal selective state fusion for effective point cloud video understanding with state space models | 32
Lightweight multi-scale global attention enhancement network for image super-resolution | 32
SAGNet: Synergistic Attention-Graph Network For video salient object detection | 31
SAFENet: Semantic-Aware Feature Enhancement Network for unsupervised cross-domain road scene segmentation | 30
Two-stream transformer tracking with messengers | 29
Utilizing Inherent Bias for Memory Efficient Continual Learning: A Simple and Robust Baseline | 29
Memory-MambaNav: Enhancing object-goal navigation through integration of spatial–temporal scanning with state space models | 29
Learning accurate monocular 3D voxel representation via bilateral voxel transformer | 29
Depth assisted novel view synthesis using few images | 28
FSBI: Deepfake detection with frequency enhanced self-blended images | 28
A Point-2s reinforcement learning biomimetic model for estimating and analyzing human 3D motion posture | 27
Recent advances in deterministic human motion prediction: A review | 27
Underwater bubble plume image generative model based on noise prior and multi conditional labels | 27
MVPCC-Net: Multi-View Based Point Cloud Completion Network for MLS data | 27
Spatial likelihood voting with self-knowledge distillation for weakly supervised object detection | 27
ST-VTON: Self-supervised vision transformer for image-based virtual try-on | 27
Dual subspace clustering for spectral-spatial hyperspectral image clustering | 25
Multi-view dynamic facial action unit detection | 25
Visionary vigilance: Optimized YOLOV8 for fallen person detection with large-scale benchmark dataset | 25
CMS-net: Edge-aware multimodal MRI feature fusion for brain tumor segmentation | 24
Self-supervised Vision Transformers for 3D pose estimation of novel objects | 24
Enhanced residual network for burst image super-resolution using simple base frame guidance | 23
Burst image super-resolution via multi-cross attention encoding and multi-scan state-space decoding | 23
Frequency and content dual stream network for image dehazing | 23
Editorial Board | 23
TransMix: Crafting highly transferable adversarial examples to evade face recognition models | 23
RFSC-net: Re-parameterization forward semantic compensation network in low-light environments | 23
Editorial Board | 22
CNN and Transformer-based deep learning models for automated white blood cell detection | 22
Landmark-in-facial-component: Towards occlusion-robust facial landmark localization | 22
Feature alignment via mutual mapping for few-shot fine-grained visual classification | 22
A new multi-picture architecture for learned video deinterlacing and demosaicing with parallel deformable convolution and self-attention blocks | 21
Deep learning enhanced monocular visual odometry: Advancements in fusion mechanisms and training strategies | 21
SADGFeat: Learning local features with layer spatial attention and domain generalization | 21
Mixup Mask Adaptation: Bridging the gap between input saliency and representations via attention mechanism in feature mixup | 21
Distributed collaborative machine learning in real-world application scenario: A white blood cell subtypes classification case study | 20
Robust visual tracking via modified Harris hawks optimization | 20
A spatial-frequency domain multi-branch decoder method for real-time semantic segmentation | 20
Enhancing consistency in virtual try-on: A novel diffusion-based approach | 20
AGSAM-Net: UAV route planning and visual guidance model for bridge surface defect detection | 20
NPVForensics: Learning VA correlations in non-critical phoneme–viseme regions for deepfake detection | 20
Face deidentification with controllable privacy protection | 20
A multi-branch dual attention segmentation network for epiphyte drone images | 20
PatchMixer: Rethinking network design to boost generalization for 3D point cloud understanding | 20
Doctor-in-the-Loop: An explainable, multi-view deep learning framework for predicting pathological response in non-small cell lung cancer | 19
FastNet: Fast high-resolution network for human pose estimation | 19
Object tracking based on temporal and spatial context information | 19
STAFFormer: Spatio-temporal adaptive fusion transformer for efficient 3D human pose estimation | 18
A deep-shallow and global–local multi-feature fusion network for photometric stereo | 18
Multi-view self-supervised learning for 3D facial texture reconstruction from single image | 18
PAGML: Precise Alignment Guided Metric Learning for sketch-based 3D shape retrieval | 18
SDMNet: Spatially dilated multi-scale network for object detection for drone aerial imagery | 18
EMA-GS: Improving sparse point cloud rendering with EMA gradient and anchor upsampling | 18
Mitigating human fall injuries: A novel system utilizing 3D 4-stream convolutional neural networks and image fusion | 18
Enhancing brain tumor classification in MRI images: A deep learning-based approach for accurate diagnosis | 18
Intelligent deep learning based ethnicity recognition and classification using facial images | 18
Dual-branch adaptive attention transformer for occluded person re-identification | 18
Detection of anomaly in surveillance videos using quantum convolutional neural networks | 18
UHDNet: Unified multimodal fusion harmonization and hierarchical dependency learning for visible-infrared person re-identification | 18
Underwater image restoration based on light attenuation prior and color-contrast adaptive correction | 17
Contrast enhancement of region of interest of backlit image for surveillance systems based on multi-illumination fusion | 17
Adaptive scale matching for remote sensing object detection based on aerial images | 17
AHA-track: Aggregating hierarchical awareness features for single | 17
Intelligent facial expression recognition and classification using optimal deep transfer learning model | 17
CollaborativeBEV: Collaborative bird eye view for reconstructing crowded environment | 17
Video anomaly detection based on a multi-layer reconstruction autoencoder with a variance attention strategy | 17
SAMNet: Adapting segment anything model for accurate light field salient object detection | 17
Matte anything: Interactive natural image matting with segment anything model | 17
TQRFormer: Tubelet query recollection transformer for action detection | 17
DFG-HCEN: A distinctive-feature guided and hierarchical channel enhanced network-based infrared and visible image fusion | 17
Deep learning-based efficient diagnosis of periapical diseases with dental X-rays | 16
ECNet: An edge-guided and cross-image perception network for collaborative camouflaged object detection | 16
Adaptive graph reasoning network for object detection | 16
A novel facial expression recognition model based on harnessing complementary features in multi-scale network with attention fusion | 16
Distribution-modulated binary neural network for image classification | 16
PW-NeRF: Progressive wavelet-mask guided neural radiance fields view synthesis | 16
Estimating blood pressure using video-based PPG and deep learning | 16
Phase shift guided dynamic view synthesis from monocular video | 16
M2VAD: Multiview multi | 16
Self-knowledge distillation based on knowledge transfer from soft to hard examples | 15
Multi-axis interactive multidimensional attention network for vehicle re-identification | 15
Social robot in service of the cognitive therapy of elderly people: Exploring robot acceptance in a real-world scenario | 15
Source domain prior-assisted segment anything model for single domain generalization in medical image segmentation | 15
OFACD: An end-to-end change detection network for small UAVs remote sensing with viewpoint differences | 15
Real-time human-centric segmentation for complex video scenes | 15
Adaptive and fast image superpixel segmentation approach | 15
CRFormer: A cross-region transformer for shadow removal | 15
PixTention: Dynamic pixel-level adapter using attention maps | 15
FgbCNN: A unified bilinear architecture for learning a fine-grained feature representation in facial expression recognition | 15
WPE: Weighted prototype estimation for few-shot learning | 15
Anchor-based discriminative dual distribution calibration for transductive zero-shot learning | 15
Attentive spatial-temporal contrastive learning for self-supervised video representation | 15
Self-trained prediction model and novel anomaly score mechanism for video anomaly detection | 14
Enhancing small object tracking with reversible rescaling networks | 14
Face and body-shape integration model for cloth-changing person re-identification | 14
Your image generator is your new private dataset | 14
Online multi-object tracking with δ-GLMB filter based on occlusion and identity switch handling | 14
Real-time gait biometrics for surveillance applications: A review | 14
RLTNT: An explainable residual learning-based transformer model for kidney disease classification | 14
Stacked graph bone region U-net with bone representation for hand pose estimation and semi-supervised training | 14
Class-discriminative domain generalization for semantic segmentation | 14
Semantic-aware for point cloud domain adaptation with self-distillation learning | 14
PR-DETR: Extracting and utilizing prior knowledge for improved end-to-end object detection | 14
Few-shot class incremental learning via prompt transfer and knowledge distillation | 14
CVAD-GAN: Constrained video anomaly detection via generative adversarial network | 13
H-net: Unsupervised domain adaptation person re-identification network based on hierarchy | 13
Perceiving local relative motion and global correlations for weakly supervised group activity recognition | 13
Bridging efficiency and interpretability: Explainable AI for multi-classification of pulmonary diseases utilizing modified lightweight CNNs | 13
Corrigendum to “A novel framework for diverse video generation from a single video using frame-conditioned denoising diffusion probabilistic model and ConvNeXt-V2” [Image and Vision Computing 154 (202 | 13
Dynamic semantic prototype perception for text–video retrieval | 13
Optimal deep transfer learning based ethnicity recognition on face images | 13
Point-cloud-based hand gesture recognition using principal component analysis and boundary extraction | 13
Data-driven 2D-EWT based diabetic retinopathy identification using hybrid neural network | 13
An edge-aware high-resolution framework for camouflaged object detection | 13
Enhancing single-view 3D mesh reconstruction with the aid of implicit surface learning | 13
Resource-aware strategies for real-time multi-person pose estimation | 13
Incremental human action recognition with dual memory | 13
BCDPose: Diffusion-based 3D Human Pose Estimation with bone-chain prior knowledge | 13
Monocular contextual constraint for stereo matching with adaptive weights assignment | 12
Similarity verification of kinship pairs using metricized emphasis | 12
Optimizing multimodal personalized disease prediction accuracy using generated prompts and large language models | 12
Qualitative failures of image generation models and their application in detecting deepfakes | 12
SDE-RAE: CLIP-based realistic image reconstruction and editing network using stochastic differential diffusion | 12
AI4RDD: Artificial Intelligence and Rare Disease Diagnosis: A proposal to improve the anamnesis process | 12
Semantic scene graph generation based on an edge dual scene graph and message passing neural network | 12
Flexible multi-objective particle swarm optimization clustering with game theory to address human activity discovery fully unsupervised | 12
BTMTrack: Robust RGB-T tracking via dual-template bridging and temporal-modal candidate elimination | 12
GFFT: Global-local feature fusion transformers for facial expression recognition in the wild | 12
Weather-degraded image semantic segmentation with multi-task knowledge distillation | 12
A decision support system for acute lymphoblastic leukemia detection based on explainable artificial intelligence | 12
Efficient Mamba: Overcoming the visual limitations of Mamba with innovative structures | 12
CoHAtNet: An integrated convolutional-transformer architecture with hybrid self-attention for end-to-end camera localization | 12
Guest Editorial: Learning with Manifolds in Computer Vision | 12
Editorial Board | 12
Speaker independent VSR: A systematic review and futuristic applications | 11
Video object segmentation by multi-scale attention using bidirectional strategy | 11
An analytical proof on suitability of Cauchy-Schwarz Divergence as the aggregation criterion in Region Growing Algorithm | 11
Parameter efficient finetuning of text-to-image models with trainable self-attention layer | 11
Exploiting spatial and temporal context for online tracking with improved transformer | 11
Synthetic multi-view clustering with missing relationships and instances | 11
Geometric feature statistics histogram for both real-valued and binary feature representations of 3D local shape | 11
Multi-object tracking with adaptive measurement noise and information fusion | 11
External knowledge-assisted Transformer for image captioning | 11
Knowledge graph construction in hyperbolic space for automatic image annotation | 11
Twin relaxed least squares regression with classwise mean constraint for image classification | 11
Editorial Board | 11
Multi-granularity for knowledge distillation | 11
UIR-ES: An unsupervised underwater image restoration framework with equivariance and stein unbiased risk estimator | 11
OCUCFormer: An Over-Complete Under-Complete Transformer Network for accelerated MRI reconstruction | 11
Synthetic data sets for person Re-Identification: A critical analysis | 11
Gait recognition via View-aware Part-wise Attention and Multi-scale Dilated Temporal Extractor | 11
Deep hybrid learning for facial expression binary classifications and predictions | 11
Black-box reversible adversarial examples with invertible neural network | 11
Editorial Board | 11
Self-distillation guided Semantic Knowledge Feedback network for infrared–visible image fusion | 11
Editorial Board | 10
Robust ensemble person reidentification via orthogonal fusion with occlusion handling | 10
Feature extraction and fusion algorithm for infrared visible light images based on residual and generative adversarial network | 10
ECT: Fine-grained edge detection with learned cause tokens | 10
Boosting semi-supervised face recognition with raw faces | 10
Contrastive learning based facial action unit detection in children with hearing impairment for a socially assistive robot platform | 10
Cross-modal hybrid architectures for gastrointestinal tract image analysis: A systematic review and futuristic applications | 10
Depth awakens: A depth-perceptual attention fusion network for RGB-D camouflaged object detection | 10
Continual coarse-to-fine domain adaptation in semantic segmentation | 10
Transformer-based feature interactor for person re-identification with margin self-punishment loss | 10
A dedicated benchmark for contour-based corner detection evaluation | 10
DiPS: Discriminative pseudo-label sampling with self-supervised transformers for weakly supervised object localization | 10
Unified Volumetric Avatar: Enabling flexible editing and rendering of neural human representations | 10
ASF-YOLO: A novel YOLO model with attentional scale sequence fusion for cell instance segmentation | 10
Fuzzy set-based Bernoulli Random Noise Weighted Loss for unsupervised person re-identification | 10
Editorial Board | 10
Learning language to symbol and language to vision mapping for visual grounding | 10
RGB road scene material segmentation | 9
Multi-level feature disentanglement network for cross-dataset face forgery detection | 9
Robust visual tracking based on modified mayfly optimization algorithm | 9
Federated learning based nonlinear two-stage framework for full-reference image quality assessment: An application for biometric | 9
Multiscale parallel deep CNN (mpdCNN) architecture for the real low-resolution face recognition for surveillance | 9
AES-Net: An adapter and enhanced self-attention guided network for multi-stage glaucoma classification using fundus images | 9
CF-SOLT: Real-time and accurate traffic accident detection using correlation filter-based tracking | 9
A dual-channel network based on occlusion feature compensation for human pose estimation | 9
Efficient masked feature and group attention network for stereo image super-resolution | 9
SinWaveFusion: Learning a single image diffusion model in wavelet domain | 9
Improving defocus blur detection via adaptive supervision prior-tokens | 9
Dense open-set recognition based on training with noisy negative images | 9
A lightweight hash-directed global perception and self-calibrated multiscale fusion network for image super-resolution | 9
Generative feature-driven image replay for continual learning | 9
Attention guided multi-level feature aggregation network for camouflaged object detection | 9
Mobile-friendly and multi-feature aggregation via transformer for human pose estimation | 9
Drone-NeRF: Efficient NeRF based 3D scene reconstruction for large-scale drone survey | 9
SAKD: Sparse attention knowledge distillation | 9
Rethinking the sample relations for few-shot classification | 9
Noisy label facial expression recognition via face-specific label distribution learning | 9
SAMUNet: Enhancing pillar-based 3D object detection in autonomous driving with Shape-aware Mini-Unet | 9
Leveraging spatial-channel attention in U-Net for enhanced segmentation of martian dust storms | 9
Modal-aware contrastive learning for hyperspectral and LiDAR classification | 9
FRoundation: Are foundation models ready for face recognition? | 9
Effective hybrid attention network based on pseudo-color enhancement in ultrasound image segmentation | 9
Does explainable machine learning uncover the black box in vision applications? | 9
Editorial Board | 8
Image–text feature learning for unsupervised visible–infrared person re-identification | 8
Human activity recognition from UAV videos using a novel DMLC-CNN model | 8
MODE: Monocular omnidirectional depth estimation via consistent depth fusion | 8
Ricci curvature based volumetric segmentation | 8
Corrigendum to “STAFFormer: Spatio-temporal adaptive fusion transformer for efficient 3D human pose estimation” [Journal of Image and Vision Computing volume 149 (2024) 105142] | 8
Transferable dual multi-granularity semantic excavating for partially relevant video retrieval | 8
Wave-based cross-phase representation for weakly supervised classification | 8
A novel framework for diverse video generation from a single video using frame-conditioned denoising diffusion probabilistic model and ConvNeXt-V2 | 8
EatSense: Human centric, action recognition and localization dataset for understanding eating behaviors and quality of motion assessment | 8
An Active Transfer Learning framework for image classification based on Maximum Differentiation Classifier | 8
IRPE: Instance-level reconstruction-based 6D pose estimator | 8
CREAM: Few-shot Object Counting with Cross REfinement and Adaptive density Map | 8
Learning to disentangle scenes for person re-identification | 8
Markerless multi-view 3D human pose estimation: A survey | 8
EmbryoVision AI: An explainable deep learning framework for enhanced blastocyst selection in assisted reproductive technologies | 8
Editorial Board | 8
Assessing the noise robustness of Class Activation Maps: A framework for reliable model interpretability | 8