OOIR: Observatory of International Research

Papers

(The TQCC of Image and Vision Computing is 8. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-01-01 to 2026-01-01.)

Article	Citations
Active domain adaptation for semantic segmentation via dynamically balancing domainness and uncertainty	306
Learning diverse and deep clues for person reidentification	187
ABC: Aligning binary centers for single-stage monocular 3D object detection	154
Alignment and fusion for adaptive domain nighttime semantic segmentation	128
G-TRACE: Grouped temporal recalibration for video object segmentation	124
Hourglass cascaded recurrent stereo matching network	113
CODNet: Context-based object detection network for multimodal image captioning and virtual question answering	105
PST-Mamba: Spatio-temporal selective state fusion for effective point cloud video understanding with state space models	101
ADVC: Adversarial dense video captioning with unsupervised pretraining	91
RGB-T tracking by modality difference reduction and feature re-selection	83
Lightweight multi-scale global attention enhancement network for image super-resolution	76
Background debiased class incremental learning for video action recognition	65
SRMA-KD: Structured relational multi-scale attention knowledge distillation for effective lightweight cardiac image segmentation	65
Feature decoupling and interaction network for defending against adversarial examples	65
HPD-Depth: High performance decoding network for self-supervised monocular depth estimation	65
Modeling content-attribute preference for personalized image esthetics assessment	51
GAN-BodyPose: Real-time 3D human body pose data key point detection and quality assessment assisted by generative adversarial network	50
GLMambaNet: Mamba-based decoder with local detail enhancement for semantic segmentation of remote sensing imagery	49
Few-shot-based video generation via multimodal fusion and Fourier Spliter	46
Learning an augmentation strategy for sparse datasets	46
Window normalization: Enhancing point cloud understanding by unifying inconsistent point densities	45
PU-GACNet: Graph Attention Convolution Network for Point Cloud Upsampling	41
CAGS: Open-vocabulary 3D scene understanding with context-aware Gaussian splatting	40
Privacy-preserving explainable AI enable federated learning-based denoising fingerprint recognition model	39
Single stage architecture for improved accuracy real-time object detection on mobile devices	39
FMD-Yolo: An efficient face mask detection method for COVID-19 prevention and control in public	39
Accurate and efficient salient object detection via position prior attention	36
Few-shot classification with multisemantic information fusion network	35
Multi-information guided camouflaged object detection	35
DMNet: Image dehazing via Dual-Domain Modulation	34
MAFUNet: Mamba with adaptive fusion UNet for medical image segmentation	33
MVPCC-Net: Multi-View Based Point Cloud Completion Network for MLS data	33
BF3D: Bi-directional fusion 3D detector with semantic sampling and geometric mapping	33
AI-powered trustable and explainable fall detection system using transfer learning	33
DeepArUco++: Improved detection of square fiducial markers in challenging lighting conditions	33
Synthetic lidar point cloud generation using deep generative models for improved driving scene object recognition	33
FSBI: Deepfake detection with frequency enhanced self-blended images	32
Deep learning with adaptive convolutions for classification of retinal diseases via optical coherence tomography	31
Memory-MambaNav: Enhancing object-goal navigation through integration of spatial–temporal scanning with state space models	30
SAFENet: Semantic-Aware Feature Enhancement Network for unsupervised cross-domain road scene segmentation	30
Visionary vigilance: Optimized YOLOV8 for fallen person detection with large-scale benchmark dataset	30
SAGNet: Synergistic Attention-Graph Network For video salient object detection	30
Utilizing Inherent Bias for Memory Efficient Continual Learning: A Simple and Robust Baseline	29
CMS-net: Edge-aware multimodal MRI feature fusion for brain tumor segmentation	29
Burst image super-resolution via multi-cross attention encoding and multi-scan state-space decoding	29
Enhanced residual network for burst image super-resolution using simple base frame guidance	28
Two-stream transformer tracking with messengers	28
Underwater bubble plume image generative model based on noise prior and multi conditional labels	27
ST-VTON: Self-supervised vision transformer for image-based virtual try-on	27
A Point-2s reinforcement learning biomimetic model for estimating and analyzing human 3D motion posture	27
1D kernel distillation network for efficient image super-resolution	26
Learning accurate monocular 3D voxel representation via bilateral voxel transformer	26
Multi-view dynamic facial action unit detection	26
Depth assisted novel view synthesis using few images	26
Recent advances in deterministic human motion prediction: A review	26
Self-supervised Vision Transformers for 3D pose estimation of novel objects	25
Frequency and content dual stream network for image dehazing	25
Dual subspace clustering for spectral-spatial hyperspectral image clustering	25
Editorial Board	24
Editorial Board	24
TransMix: Crafting highly transferable adversarial examples to evade face recognition models	24
SADGFeat: Learning local features with layer spatial attention and domain generalization	23
A new multi-picture architecture for learned video deinterlacing and demosaicing with parallel deformable convolution and self-attention blocks	23
Robust visual tracking via modified Harris hawks optimization	23
Landmark-in-facial-component: Towards occlusion-robust facial landmark localization	23
Mixup Mask Adaptation: Bridging the gap between input saliency and representations via attention mechanism in feature mixup	23
A multi-branch dual attention segmentation network for epiphyte drone images	22
Contrast enhancement of region of interest of backlit image for surveillance systems based on multi-illumination fusion	22
Intelligent deep learning based ethnicity recognition and classification using facial images	22
EMA-GS: Improving sparse point cloud rendering with EMA gradient and anchor upsampling	22
Unsupervised Object Localization driven by self-supervised foundation models: A comprehensive review	21
FastNet: Fast high-resolution network for human pose estimation	21
PatchMixer: Rethinking network design to boost generalization for 3D point cloud understanding	21
Intelligent facial expression recognition and classification using optimal deep transfer learning model	21
Mitigating human fall injuries: A novel system utilizing 3D 4-stream convolutional neural networks and image fusion	20
Dual-branch adaptive attention transformer for occluded person re-identification	20
RFSC-net: Re-parameterization forward semantic compensation network in low-light environments	20
Feature alignment via mutual mapping for few-shot fine-grained visual classification	20
PAGML: Precise Alignment Guided Metric Learning for sketch-based 3D shape retrieval	20
Object tracking based on temporal and spatial context information	20
Face deidentification with controllable privacy protection	20
Enhancing consistency in virtual try-on: A novel diffusion-based approach	19
UHDNet: Unified multimodal fusion harmonization and hierarchical dependency learning for visible-infrared person re-identification	19
DynaGuide: A generalizable dynamic guidance framework for zero-shot guided unsupervised semantic segmentation	19
Deep learning enhanced monocular visual odometry: Advancements in fusion mechanisms and training strategies	19
STAFFormer: Spatio-temporal adaptive fusion transformer for efficient 3D human pose estimation	19
Doctor-in-the-Loop: An explainable, multi-view deep learning framework for predicting pathological response in non-small cell lung cancer	19
A spatial-frequency domain multi-branch decoder method for real-time semantic segmentation	19
Distributed collaborative machine learning in real-world application scenario: A white blood cell subtypes classification case study	19
AGSAM-Net: UAV route planning and visual guidance model for bridge surface defect detection	18
SDMNet: Spatially dilated multi-scale network for object detection for drone aerial imagery	18
Matte anything: Interactive natural image matting with segment anything model	18
NPVForensics: Learning VA correlations in non-critical phoneme–viseme regions for deepfake detection	18
CNN and Transformer-based deep learning models for automated white blood cell detection	18
Enhancing brain tumor classification in MRI images: A deep learning-based approach for accurate diagnosis	18
Underwater image restoration based on light attenuation prior and color-contrast adaptive correction	18
A deep-shallow and global–local multi-feature fusion network for photometric stereo	18
Detection of anomaly in surveillance videos using quantum convolutional neural networks	18
SAMNet: Adapting segment anything model for accurate light field salient object detection	17
A novel facial expression recognition model based on harnessing complementary features in multi-scale network with attention fusion	17
Real-time human-centric segmentation for complex video scenes	17
Distribution-modulated binary neural network for image classification	17
AHA-track: Aggregating hierarchical awareness features for single	17
Phase shift guided dynamic view synthesis from monocular video	17
PW-NeRF: Progressive wavelet-mask guided neural radiance fields view synthesis	17
TQRFormer: Tubelet query recollection transformer for action detection	17
E-Net for pansharpening: A super-resolution perspective	17
Estimating blood pressure using video-based PPG and deep learning	17
OFACD: An end-to-end change detection network for small UAVs remote sensing with viewpoint differences	17
CollaborativeBEV: Collaborative bird eye view for reconstructing crowded environment	16
PixTention: Dynamic pixel-level adapter using attention maps	16
Attentive spatial-temporal contrastive learning for self-supervised video representation	16
Adaptive graph reasoning network for object detection	16
Anchor-based discriminative dual distribution calibration for transductive zero-shot learning	16
Multi-axis interactive multidimensional attention network for vehicle re-identification	16
Self-trained prediction model and novel anomaly score mechanism for video anomaly detection	16
RLTNT: An explainable residual learning-based transformer model for kidney disease classification	16
Self-knowledge distillation based on knowledge transfer from soft to hard examples	16
Adaptive scale matching for remote sensing object detection based on aerial images	15
Deep learning-based efficient diagnosis of periapical diseases with dental X-rays	15
Video anomaly detection based on a multi-layer reconstruction autoencoder with a variance attention strategy	15
Source domain prior-assisted segment anything model for single domain generalization in medical image segmentation	15
ECNet: An edge-guided and cross-image perception network for collaborative camouflaged object detection	15
WPE: Weighted prototype estimation for few-shot learning	15
Real-time gait biometrics for surveillance applications: A review	15
FgbCNN: A unified bilinear architecture for learning a fine-grained feature representation in facial expression recognition	15
Social robot in service of the cognitive therapy of elderly people: Exploring robot acceptance in a real-world scenario	15
Online multi-object tracking with δ-GLMB filter based on occlusion and identity switch handling	15
Semantic-aware for point cloud domain adaptation with self-distillation learning	15
Class-discriminative domain generalization for semantic segmentation	15
DFG-HCEN: A distinctive-feature guided and hierarchical channel enhanced network-based infrared and visible image fusion	15
M2VAD: Multiview multi	15
Stacked graph bone region U-net with bone representation for hand pose estimation and semi-supervised training	15
CRFormer: A cross-region transformer for shadow removal	15
Enhancing small object tracking with reversible rescaling networks	15
Enhancing single-view 3D mesh reconstruction with the aid of implicit surface learning	14
Your image generator is your new private dataset	14
CVAD-GAN: Constrained video anomaly detection via generative adversarial network	14
Corrigendum to “A novel framework for diverse video generation from a single video using frame-conditioned denoising diffusion probabilistic model and ConvNeXt-V2” [Image and Vision Computing 154 (202	14
Dynamic semantic prototype perception for text–video retrieval	14
Data-driven 2D-EWT based diabetic retinopathy identification using hybrid neural network	14
Few-shot class incremental learning via prompt transfer and knowledge distillation	14
PR-DETR: Extracting and utilizing prior knowledge for improved end-to-end object detection	14
Point-cloud-based hand gesture recognition using principal component analysis and boundary extraction	14
BCDPose: Diffusion-based 3D Human Pose Estimation with bone-chain prior knowledge	14
H-net: Unsupervised domain adaptation person re-identification network based on hierarchy	14
Bridging efficiency and interpretability: Explainable AI for multi-classification of pulmonary diseases utilizing modified lightweight CNNs	14
Face and body-shape integration model for cloth-changing person re-identification	14
An edge-aware high-resolution framework for camouflaged object detection	14
A comprehensive survey on magnetic resonance image reconstruction	13
Similarity verification of kinship pairs using metricized emphasis	13
SDE-RAE: CLIP-based realistic image reconstruction and editing network using stochastic differential diffusion	13
Geometric feature statistics histogram for both real-valued and binary feature representations of 3D local shape	13
Editorial Board	13
Guest Editorial: Learning with Manifolds in Computer Vision	13
HMPFormer: Hierarchical vision transformer with multi-perspective feature learning for precise polyp segmentation	13
Editorial Board	13
Optimizing multimodal personalized disease prediction accuracy using generated prompts and large language models	13
Exploiting spatial and temporal context for online tracking with improved transformer	13
Semantic scene graph generation based on an edge dual scene graph and message passing neural network	13
Perceiving local relative motion and global correlations for weakly supervised group activity recognition	13
Resource-aware strategies for real-time multi-person pose estimation	13
Optimal deep transfer learning based ethnicity recognition on face images	13
Flexible multi-objective particle swarm optimization clustering with game theory to address human activity discovery fully unsupervised	13
Synthetic multi-view clustering with missing relationships and instances	13
External knowledge-assisted Transformer for image captioning	12
Efficient Mamba: Overcoming the visual limitations of Mamba with innovative structures	12
Speaker independent VSR: A systematic review and futuristic applications	12
Black-box reversible adversarial examples with invertible neural network	12
AI4RDD: Artificial Intelligence and Rare Disease Diagnosis: A proposal to improve the anamnesis process	12
Synthetic data sets for person Re-Identification: A critical analysis	12
GFFT: Global-local feature fusion transformers for facial expression recognition in the wild	12
A decision support system for acute lymphoblastic leukemia detection based on explainable artificial intelligence	12
Monocular contextual constraint for stereo matching with adaptive weights assignment	12
BTMTrack: Robust RGB-T tracking via dual-template bridging and temporal-modal candidate elimination	12
CoHAtNet: An integrated convolutional-transformer architecture with hybrid self-attention for end-to-end camera localization	12
Deep hybrid learning for facial expression binary classifications and predictions	12
Video object segmentation by multi-scale attention using bidirectional strategy	11
Boosting semi-supervised face recognition with raw faces	11
A dedicated benchmark for contour-based corner detection evaluation	11
Fuzzy set-based Bernoulli Random Noise Weighted Loss for unsupervised person re-identification	11
ECT: Fine-grained edge detection with learned cause tokens	11
Twin relaxed least squares regression with classwise mean constraint for image classification	11
Qualitative failures of image generation models and their application in detecting deepfakes	11
Weather-degraded image semantic segmentation with multi-task knowledge distillation	11
Unified Volumetric Avatar: Enabling flexible editing and rendering of neural human representations	11
Continual coarse-to-fine domain adaptation in semantic segmentation	11
Cross-modal hybrid architectures for gastrointestinal tract image analysis: A systematic review and futuristic applications	11
Editorial Board	11
Does explainable machine learning uncover the black box in vision applications?	11
Multi-object tracking with adaptive measurement noise and information fusion	11
Self-distillation guided Semantic Knowledge Feedback network for infrared–visible image fusion	11
Editorial Board	11
Parameter efficient finetuning of text-to-image models with trainable self-attention layer	11
DiPS: Discriminative pseudo-label sampling with self-supervised transformers for weakly supervised object localization	11
Editorial Board	11
Contrastive learning based facial action unit detection in children with hearing impairment for a socially assistive robot platform	11
Learning language to symbol and language to vision mapping for visual grounding	11
Gait recognition via View-aware Part-wise Attention and Multi-scale Dilated Temporal Extractor	11
UIR-ES: An unsupervised underwater image restoration framework with equivariance and stein unbiased risk estimator	11
OCUCFormer: An Over-Complete Under-Complete Transformer Network for accelerated MRI reconstruction	11
Rethinking the sample relations for few-shot classification	10
Drone-NeRF: Efficient NeRF based 3D scene reconstruction for large-scale drone survey	10
Attention guided multi-level feature aggregation network for camouflaged object detection	10
Editorial Board	10
Dense open-set recognition based on training with noisy negative images	10
Mobile-friendly and multi-feature aggregation via transformer for human pose estimation	10
MOT-STM: Maritime Object Tracking: A Spatial-Temporal and Metadata-based approach	10
Person re-identification: A taxonomic survey and the path ahead	10
Effective hybrid attention network based on pseudo-color enhancement in ultrasound image segmentation	10
Feature extraction and fusion algorithm for infrared visible light images based on residual and generative adversarial network	10
Efficient masked feature and group attention network for stereo image super-resolution	10
SinWaveFusion: Learning a single image diffusion model in wavelet domain	10
Robust visual tracking based on modified mayfly optimization algorithm	10
CF-SOLT: Real-time and accurate traffic accident detection using correlation filter-based tracking	10
SAKD: Sparse attention knowledge distillation	10
Transferable dual multi-granularity semantic excavating for partially relevant video retrieval	10
Generative feature-driven image replay for continual learning	10
Transformer-based feature interactor for person re-identification with margin self-punishment loss	10
ASF-YOLO: A novel YOLO model with attentional scale sequence fusion for cell instance segmentation	10
Knowledge graph construction in hyperbolic space for automatic image annotation	10
Robust ensemble person reidentification via orthogonal fusion with occlusion handling	10
RGB road scene material segmentation	10
Leveraging spatial-channel attention in U-Net for enhanced segmentation of martian dust storms	10
Hierarchical spatiotemporal Feature Interaction Network for video saliency prediction	10
Part-aware distillation and aggregation network for human parsing	10
RBGAN: Realistic-generation and balanced-utility GAN for face de-identification	10
EFDCNet: Encoding fusion and decoding correction network for RGB-D indoor semantic segmentation	10
FRoundation: Are foundation models ready for face recognition?	10
Depth awakens: A depth-perceptual attention fusion network for RGB-D camouflaged object detection	10
A dual-channel network based on occlusion feature compensation for human pose estimation	10
Multi-level feature disentanglement network for cross-dataset face forgery detection	9
Universal domain adaptation from multiple black-box sources	9
Combining complementary trackers for enhanced long-term visual object tracking	9
Image–text feature learning for unsupervised visible–infrared person re-identification	9
Text-augmented Multi-Modality contrastive learning for unsupervised visible-infrared person re-identification	9
Three dimensional tracking of rigid objects in motion using 2D optical flows	9
Machine learning applications in breast cancer prediction using mammography	9
A lightweight hash-directed global perception and self-calibrated multiscale fusion network for image super-resolution	9
Federated learning based nonlinear two-stage framework for full-reference image quality assessment: An application for biometric	9
A supervised approach for the detection of AM-FM signals’ interference regions in spectrogram images	9
Corrigendum to “STAFFormer: Spatio-temporal adaptive fusion transformer for efficient 3D human pose estimation” [Journal of Image and Vision Computing volume 149 (2024) 105142]	9
AES-Net: An adapter and enhanced self-attention guided network for multi-stage glaucoma classification using fundus images	9
Video prediction by efficient transformers	9
Editorial Board	9
Noisy label facial expression recognition via face-specific label distribution learning	9
Improving defocus blur detection via adaptive supervision prior-tokens	9
EMNet: Edge-guided multi-level network for salient object detection in low-light images	9
IRPE: Instance-level reconstruction-based 6D pose estimator	9
LELD: Learn enhancement by learning degradation	9