Image and Vision Computing

Papers
(The median citation count of Image and Vision Computing is 2. The table below lists the papers whose CrossRef citation counts exceed that median [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-09-01 to 2024-09-01.)
Article | Citations
Weighted boxes fusion: Ensembling boxes from different object detection models | 251
Deep learning-based object detection in low-altitude UAV datasets: A survey | 167
A comprehensive review on deep learning-based methods for video anomaly detection | 146
Application of the best evacuation model of deep learning in the design of public structures | 122
Deep multimodal fusion for semantic image segmentation: A survey | 110
FMD-Yolo: An efficient face mask detection method for COVID-19 prevention and control in public | 105
A framework of human action recognition using length control features fusion and weighted entropy-variances based feature selection | 103
Intelligent video anomaly detection and classification using faster RCNN with deep reinforcement learning model | 93
A review on 2D instance segmentation based on deep neural networks | 92
Deep learning-based detection from the perspective of small or tiny objects: A survey | 82
Deep learning-based person re-identification methods: A survey and outlook of recent works | 66
ReMOT: A model-agnostic refinement for multiple object tracking | 57
A review of deep learning techniques for 2D and 3D human pose estimation | 50
Intelligent detection of building cracks based on deep learning | 47
Intelligent deep learning based ethnicity recognition and classification using facial images | 46
Visual question answering model based on graph neural network and contextual attention | 42
Motion saliency based multi-stream multiplier ResNets for action recognition | 40
An improved YOLOv5 method for large objects detection with multi-scale feature cross-layer fusion network | 39
LSTM with bio inspired algorithm for action recognition in sports videos | 37
Person search: New paradigm of person re-identification: A survey and outlook of recent works | 37
Iris and periocular biometrics for head mounted displays: Segmentation, recognition, and synthetic data generation | 36
A Survey on Object Detection for the Internet of Multimedia Things (IoMT) using Deep Learning and Event-based Middleware: Approaches, Challenges, and Future Directions | 36
Exploring region relationships implicitly: Image captioning with visual relationship attention | 36
A survey of iris datasets | 35
Optimization of face recognition algorithm based on deep learning multi feature fusion driven by big data | 34
Synthetic data for face recognition: Current state and future prospects | 34
Generative adversarial networks and their application to 3D face generation: A survey | 30
A survey of micro-expression recognition | 28
Feedback-driven loss function for small object detection | 28
MEmoR: A Multimodal Emotion Recognition using affective biomarkers for smart prediction of emotional health for people analytics in smart industries | 27
Improved YOLOX-X based UAV aerial photography object detection algorithm | 27
Attention-guided chained context aggregation for semantic segmentation | 27
Multimodal facial biometrics recognition: Dual-stream convolutional neural networks with multi-feature fusion layers | 26
Application of 3D laser scanning technology for image data processing in the protection of ancient building sites through deep learning | 26
A two-stage real-time YOLOv2-based road marking detector with lightweight spatial transformation-invariant classification | 26
Robust biometric authentication system with a secure user template | 26
Revisiting crowd counting: State-of-the-art, trends, and future perspectives | 25
RoI Tanh-polar transformer network for face parsing in the wild | 25
Facial expression recognition using human machine interaction and multi-modal visualization analysis for healthcare applications | 25
CrossATNet - a novel cross-attention based framework for sketch-based image retrieval | 25
Projection-dependent input processing for 3D object recognition in human robot interaction systems | 25
Improving image captioning with Pyramid Attention and SC-GAN | 25
R4 Det: Refined single-stage detector with feature recursion and refinement for rotating object detection in aerial images | 25
Efficient pedestrian detection in top-view fisheye images using compositions of perspective view patches | 24
A survey of methods, datasets and evaluation metrics for visual question answering | 23
Learning to disentangle scenes for person re-identification | 23
Self-trained prediction model and novel anomaly score mechanism for video anomaly detection | 23
PCANet: Pyramid convolutional attention network for semantic segmentation | 23
Few-Shot learning for face recognition in the presence of image discrepancies for limited multi-class datasets | 22
FastNet: Fast high-resolution network for human pose estimation | 22
Improved generative adversarial network and its application in image oil painting style transfer | 21
Attention guided contextual feature fusion network for salient object detection | 21
Multi-stream slowFast graph convolutional networks for skeleton-based action recognition | 21
An unsupervised domain adaptation scheme for single-stage artwork recognition in cultural sites | 21
Multiscale parallel deep CNN (mpdCNN) architecture for the real low-resolution face recognition for surveillance | 21
Lightweight and computationally faster Hypermetropic Convolutional Neural Network for small size object detection | 20
An automated hyperparameter tuned deep learning model enabled facial emotion recognition for autonomous vehicle drivers | 20
Cluster adaptation networks for unsupervised domain adaptation | 20
Investigating bias in deep face analysis: The KANFace dataset and empirical study | 19
A deep-shallow and global–local multi-feature fusion network for photometric stereo | 19
IRANet: Identity-relevance aware representation for cloth-changing person re-identification | 19
SalFBNet: Learning pseudo-saliency distribution via feedback convolutional networks | 18
Multi-information-based convolutional neural network with attention mechanism for pedestrian trajectory prediction | 18
Cross-database and cross-attack Iris presentation attack detection using micro stripes analyses | 18
Boundary guidance network for camouflage object detection | 18
PU-GACNet: Graph Attention Convolution Network for Point Cloud Upsampling | 18
A neural network aided attuned scheme for gun detection in video surveillance images | 18
Face anti-spoofing detection based on multi-scale image quality assessment | 18
Generalizable deep features for ocular biometrics | 18
ERF-YOLO: A YOLO algorithm compatible with fewer parameters and higher accuracy | 18
E2E-VSDL: End-to-end video surveillance-based deep learning model to detect and prevent criminal activities | 18
Transformer models for enhancing AttnGAN based text to image generation | 18
Multi-view dynamic facial action unit detection | 17
Multimodal emotion recognition using cross modal audio-video fusion with attention and deep metric learning | 17
Facial expression recognition using densely connected convolutional neural network and hierarchical spatial attention | 17
Video prediction by efficient transformers | 17
Bald eagle search optimization with deep transfer learning enabled age-invariant face recognition model | 16
HPRNet: Hierarchical point regression for whole-body human pose estimation | 16
Multi-level refinement enriched feature pyramid network for object detection | 16
ASF-YOLO: A novel YOLO model with attentional scale sequence fusion for cell instance segmentation | 16
An efficient foreign objects detection network for power substation | 16
The effect of image recognition traffic prediction method under deep learning and naive Bayes algorithm on freeway traffic safety | 16
Detection of anomaly in surveillance videos using quantum convolutional neural networks | 16
Face mask detection using deep convolutional neural network and multi-stage image processing | 16
Deep hybrid learning for facial expression binary classifications and predictions | 16
Unsupervised face Frontalization for pose-invariant face recognition | 16
Synergetic reconstruction from 2D pose and 3D motion for wide-space multi-person video motion capture in the wild | 16
Few-shot object detection via baby learning | 15
SalED: Saliency prediction with a pithy encoder-decoder architecture sensing local and global information | 15
Beyond modality alignment: Learning part-level representation for visible-infrared person re-identification | 15
Point cloud completion using multiscale feature fusion and cross-regional attention | 15
Real-time semantic segmentation with weighted factorized-depthwise convolution | 15
Real-time semantic segmentation with local spatial pixel adjustment | 15
Intelligent facial expression recognition and classification using optimal deep transfer learning model | 15
Dual-path CNN with Max Gated block for text-based person re-identification | 14
An attention-based deep learning model for multiple pedestrian attributes recognition | 14
Certifiable relative pose estimation | 14
Multimodal assessment of apparent personality using feature attention and error consistency constraint | 14
Novel features for art movement classification of portrait paintings | 14
Collaborative representation of blur invariant deep sparse features for periocular recognition from smartphones | 14
A new perceptual hashing method for verification and identity classification of occluded faces | 14
Dense open-set recognition based on training with noisy negative images | 14
CAM: A fine-grained vehicle model recognition method based on visual attention model | 14
A calibration method of computer vision system based on dual attention mechanism | 14
Real-time gait biometrics for surveillance applications: A review | 14
Pose-guided part matching network via shrinking and reweighting for occluded person re-identification | 14
A study on attention-based LSTM for abnormal behavior recognition with variable pooling | 14
Intelligent multimodal pedestrian detection using hybrid metaheuristic optimization with deep learning model | 14
Point cloud classification with deep normalized Reeb graph convolution | 13
Explaining VQA predictions using visual grounding and a knowledge base | 13
Batch feature standardization network with triplet loss for weakly-supervised video anomaly detection | 13
Feature based video stabilization based on boosted HAAR Cascade and representative point matching algorithm | 13
Expression recognition with deep features extracted from holistic and part-based models | 13
Fusion of iris and sclera using phase intensive rubbersheet mutual exclusion for periocular recognition | 13
Single stage architecture for improved accuracy real-time object detection on mobile devices | 13
Variance-guided attention-based twin deep network for cross-spectral periocular recognition | 13
Edge supervision and multi-scale cost volume for stereo matching | 13
From known to the unknown: Transferring knowledge to answer questions about novel visual and semantic concepts | 13
MFC-Net: Multi-feature fusion cross neural network for salient object detection | 13
Emotion detection and face recognition of drivers in autonomous vehicles in IoT platform | 12
Using synthetic data for person tracking under adverse weather conditions | 12
R2Net: Residual refinement network for salient object detection | 12
Interactive multi-scale feature representation enhancement for small object detection | 12
Multi parallel U-net encoder network for effective polyp image segmentation | 12
Double anchor embedding for accurate multi-person 2D pose estimation | 12
I-SOCIAL-DB: A labeled database of images collected from websites and social media for Iris recognition | 12
Multi-level prediction Siamese network for real-time UAV visual tracking | 12
Combining complementary trackers for enhanced long-term visual object tracking | 12
ASPset: An outdoor sports pose video dataset with 3D keypoint annotations | 12
Cancelable Iris template generation by aggregating patch level ordinal relations with its holistically extended performance and security analysis | 12
Spatiotemporal module for video saliency prediction based on self-attention | 12
How robust are discriminatively trained zero-shot learning models? | 12
A novel co-attention computation block for deep learning based image co-segmentation | 12
Dense graph convolutional neural networks on 3D meshes for 3D object segmentation and classification | 11
Human object interaction detection: Design and survey | 11
PDA: Proxy-based domain adaptation for few-shot image recognition | 11
Does explainable machine learning uncover the black box in vision applications? | 11
Modeling graph-structured contexts for image captioning | 11
H-net: Unsupervised domain adaptation person re-identification network based on hierarchy | 11
Pose-guided counterfactual inference for occluded person re-identification | 11
Detection of panoramic vision pedestrian based on deep learning | 11
SiaTrans: Siamese transformer network for RGB-D salient object detection with depth image classification | 11
Geometric feature statistics histogram for both real-valued and binary feature representations of 3D local shape | 10
Image captioning via proximal policy optimization | 10
Handcrafted localized phase features for human action recognition | 10
Multistage temporal convolution transformer for action segmentation | 10
A motion model based on recurrent neural networks for visual object tracking | 10
Adaptive weight based on overlapping blocks network for facial expression recognition | 10
Whether normalized or not? Towards more robust iris recognition using dynamic programming | 10
MDCS with fully encoding the information of local shape description for 3D Rigid Data matching | 10
Engagement detection and enhancement for STEM education through computer vision, augmented reality, and haptics | 10
Boundary graph convolutional network for temporal action detection | 10
Context-based image explanations for deep neural networks | 10
Continual coarse-to-fine domain adaptation in semantic segmentation | 10
Single image dehazing using extended local dark channel prior | 10
Towards generalized morphing attack detection by learning residuals | 10
E2E-V2SResNet: Deep residual convolutional neural networks for end-to-end video driven speech synthesis | 10
Tracking fiducial markers with discriminative correlation filters | 10
Grassmann manifold based framework for automated fall detection from a camera | 10
Spatial–temporal graph attention network for video anomaly detection | 10
Edge-aware salient object detection network via context guidance | 10
Geometry consistency aware confidence evaluation for feature matching | 10
A novel micro-expression detection algorithm based on BERT and 3DCNN | 9
Enhancing single-view 3D mesh reconstruction with the aid of implicit surface learning | 9
RAMT-GAN: Realistic and accurate makeup transfer with generative adversarial network | 9
Generating facial expression adversarial examples based on saliency map | 9
ArCo: Attention-reinforced transformer with contrastive learning for image captioning | 9
Dual-branch adaptive attention transformer for occluded person re-identification | 9
Tackling multiple object tracking with complicated motions — Re-designing the integration of motion and appearance | 9
Crowd density detection method based on crowd gathering mode and multi-column convolutional neural network | 9
Clothing generation by multi-modal embedding: A compatibility matrix-regularized GAN model | 9
Omnidirectional stereo depth estimation based on spherical deep network | 9
Knowledge distillation methods for efficient unsupervised adaptation across multiple domains | 9
Texture classification-based feature processing for violence-based anomaly detection in crowded environments | 9
Triangulate geometric constraint combined with visual-flow fusion network for accurate 6DoF pose estimation | 9
Multi-scale interaction transformer for temporal action proposal generation | 9
Demographic classification through pupil analysis | 9
Camera pose estimation in multi-view environments: From virtual scenarios to the real world | 9
Learning an augmentation strategy for sparse datasets | 9
Composite recurrent network with internal denoising for facial alignment in still and video images in the wild | 9
Deep domain adaptation with ordinal regression for pain assessment using weakly-labeled videos | 9
AESPNet: Attention Enhanced Stacked Parallel Network to improve automatic Diabetic Foot Ulcer identification | 9
Short-term anchor linking and long-term self-guided attention for video object detection | 9
Lightweight boundary refinement module based on point supervision for semantic segmentation | 9
RGB-T tracking by modality difference reduction and feature re-selection | 8
Student behavior recognition for interaction detection in the classroom environment | 8
Weather-degraded image semantic segmentation with multi-task knowledge distillation | 8
Depth-guided saliency detection via boundary information | 8
Adversarial sliced Wasserstein domain adaptation networks | 8
Video-based person re-identification by intra-frame and inter-frame graph neural network | 8
Online multi-object tracking with δ-GLMB filter based on occlusion and identity switch handling | 8
Flow guided mutual attention for person re-identification | 8
Faster and finer pose estimation for multiple instance objects in a single RGB image | 8
Few-shot personalized saliency prediction using meta-learning | 8
A survey on computer vision based human analysis in the COVID-19 era | 8
A novel facial expression recognition algorithm using geometry β-skeleton in fusion based on deep CNN | 8
Distinguishing foreground and background alignment for unsupervised domain adaptative semantic segmentation | 8
Cross-modal feature extraction and integration based RGBD saliency detection | 8
Cross-view action recognition with small-scale datasets | 8
Authenticating and securing healthcare records: A deep learning-based zero watermarking approach | 8
Improving eye movement biometrics in low frame rate eye-tracking devices using periocular and eye blinking features | 8
ScPnP: A non-iterative scale compensation solution for PnP problems | 8
A Tibetan Thangka data set and relative tasks | 8
A pooling-based feature pyramid network for salient object detection | 8
A novel approach for breast cancer detection using optimized ensemble learning framework and XAI | 8
SyPer: Synthetic periocular data for quantized light-weight recognition in the NIR and visible domains | 8
Activity guided multi-scales collaboration based on scaled-CNN for saliency prediction | 8
View knowledge transfer network for multi-view action recognition | 8
Advances in deep learning-based image recognition of product packaging | 7
VQA as a factoid question answering problem: A novel approach for knowledge-aware and explainable visual question answering | 7
Person re-identification: A taxonomic survey and the path ahead | 7
AGLC-GAN: Attention-based global-local cycle-consistent generative adversarial networks for unpaired single image dehazing | 7
ThickSeg: Efficient semantic segmentation of large-scale 3D point clouds using multi-layer projection | 7
Spatial temporal and channel aware network for video-based person re-identification | 7
Aligning vision-language for graph inference in visual dialog | 7
AGA-GAN: Attribute Guided Attention Generative Adversarial Network with U-Net for face hallucination | 7
Multimodal image fusion based on point-wise mutual information | 7
Monocular contextual constraint for stereo matching with adaptive weights assignment | 7
Single-shot cuboids: Geodesics-based end-to-end Manhattan aligned layout estimation from spherical panoramas | 7
Local information fusion network for 3D shape classification and retrieval | 7
Building NAS: Automatic designation of efficient neural architectures for building extraction in high-resolution aerial images | 7
Dual guidance enhanced network for light field salient object detection | 7
Semantic-aligned reinforced attention model for zero-shot learning | 7
Attention-guided aggregation stereo matching network | 7
Multi-source material image optimized selection based multi-option composition | 7
A cross-modal crowd counting method combining CNN and cross-modal transformer | 7
Joint patch and instance discrimination learning for unsupervised person re-identification | 7
Future pedestrian location prediction in first-person videos for autonomous vehicles and social robots | 6
Lightweight multi-scale convolutional neural network for real time stereo matching | 6
Double cross-modality progressively guided network for RGB-D salient object detection | 6
3D-VDNet: Exploiting the vertical distribution characteristics of point clouds for 3D object detection and augmentation | 6
Interpretable visual reasoning: A survey | 6
Collaborative knowledge distillation for incomplete multi-view action prediction | 6
Bias alleviating generative adversarial network for generalized zero-shot classification | 6
Cuepervision: self-supervised learning for continuous domain adaptation without catastrophic forgetting | 6
Multi–feature fusion tracking algorithm based on peak–context learning | 6
Feature fusion for object detection at one map | 6
A dynamic keypoint selection network for 6DoF pose estimation | 6
LP-GAN: Learning perturbations based on generative adversarial networks for point cloud adversarial attacks | 6
Adversarial color projection: A projector-based physical-world attack to DNNs | 6
TENet: Accurate light-field salient object detection with a transformer embedding network | 6
GIFSL - grafting based improved few-shot learning | 6
Progressive ShallowNet for large scale dynamic and spontaneous facial behaviour analysis in children | 6
Reinforced pedestrian attribute recognition with group optimization reward | 6
Face deidentification with controllable privacy protection | 6
LTST: Long-term segmentation tracker with memory attention network | 6
Multi-view self-supervised learning for 3D facial texture reconstruction from single image | 6
Rich global feature guided network for monocular depth estimation | 6
Task-based parameter isolation for foreground segmentation without catastrophic forgetting using multi-scale region and edges fusion network | 6
A few-shot learning-based ischemic stroke segmentation system using weighted MRI fusion | 6