Image and Vision Computing

Papers
(The TQCC of Image and Vision Computing is 6. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-11-01 to 2024-11-01.)
Article | Citations
Weighted boxes fusion: Ensembling boxes from different object detection models | 252
Deep learning-based object detection in low-altitude UAV datasets: A survey | 173
A comprehensive review on deep learning-based methods for video anomaly detection | 148
Deep multimodal fusion for semantic image segmentation: A survey | 111
FMD-Yolo: An efficient face mask detection method for COVID-19 prevention and control in public | 108
A framework of human action recognition using length control features fusion and weighted entropy-variances based feature selection | 103
A review on 2D instance segmentation based on deep neural networks | 95
Intelligent video anomaly detection and classification using faster RCNN with deep reinforcement learning model | 93
Deep learning-based detection from the perspective of small or tiny objects: A survey | 85
Deep learning-based person re-identification methods: A survey and outlook of recent works | 67
ReMOT: A model-agnostic refinement for multiple object tracking | 58
A review of deep learning techniques for 2D and 3D human pose estimation | 49
Intelligent deep learning based ethnicity recognition and classification using facial images | 48
Intelligent detection of building cracks based on deep learning | 47
Visual question answering model based on graph neural network and contextual attention | 42
Motion saliency based multi-stream multiplier ResNets for action recognition | 41
An improved YOLOv5 method for large objects detection with multi-scale feature cross-layer fusion network | 40
Synthetic data for face recognition: Current state and future prospects | 37
A Survey on Object Detection for the Internet of Multimedia Things (IoMT) using Deep Learning and Event-based Middleware: Approaches, Challenges, and Future Directions | 37
LSTM with bio inspired algorithm for action recognition in sports videos | 37
Exploring region relationships implicitly: Image captioning with visual relationship attention | 36
Iris and periocular biometrics for head mounted displays: Segmentation, recognition, and synthetic data generation | 36
A survey of iris datasets | 35
Optimization of face recognition algorithm based on deep learning multi feature fusion driven by big data | 34
Improved YOLOX-X based UAV aerial photography object detection algorithm | 34
Generative adversarial networks and their application to 3D face generation: A survey | 30
Feedback-driven loss function for small object detection | 29
RoI Tanh-polar transformer network for face parsing in the wild | 29
ASF-YOLO: A novel YOLO model with attentional scale sequence fusion for cell instance segmentation | 28
Attention-guided chained context aggregation for semantic segmentation | 28
A survey of micro-expression recognition | 28
MEmoR: A Multimodal Emotion Recognition using affective biomarkers for smart prediction of emotional health for people analytics in smart industries | 27
Robust biometric authentication system with a secure user template | 27
Revisiting crowd counting: State-of-the-art, trends, and future perspectives | 26
Facial expression recognition using human machine interaction and multi-modal visualization analysis for healthcare applications | 25
Projection-dependent input processing for 3D object recognition in human robot interaction systems | 25
CrossATNet - a novel cross-attention based framework for sketch-based image retrieval | 25
Improving image captioning with Pyramid Attention and SC-GAN | 25
R4 Det: Refined single-stage detector with feature recursion and refinement for rotating object detection in aerial images | 25
A survey of methods, datasets and evaluation metrics for visual question answering | 24
Learning to disentangle scenes for person re-identification | 24
Self-trained prediction model and novel anomaly score mechanism for video anomaly detection | 24
Efficient pedestrian detection in top-view fisheye images using compositions of perspective view patches | 24
Few-Shot learning for face recognition in the presence of image discrepancies for limited multi-class datasets | 23
PCANet: Pyramid convolutional attention network for semantic segmentation | 23
Attention guided contextual feature fusion network for salient object detection | 22
FastNet: Fast high-resolution network for human pose estimation | 22
Improved generative adversarial network and its application in image oil painting style transfer | 21
An unsupervised domain adaptation scheme for single-stage artwork recognition in cultural sites | 21
Multiscale parallel deep CNN (mpdCNN) architecture for the real low-resolution face recognition for surveillance | 21
Multi-stream slowFast graph convolutional networks for skeleton-based action recognition | 21
A deep-shallow and global–local multi-feature fusion network for photometric stereo | 20
An automated hyperparameter tuned deep learning model enabled facial emotion recognition for autonomous vehicle drivers | 20
Multimodal emotion recognition using cross modal audio-video fusion with attention and deep metric learning | 20
Cluster adaptation networks for unsupervised domain adaptation | 20
Lightweight and computationally faster Hypermetropic Convolutional Neural Network for small size object detection | 20
IRANet: Identity-relevance aware representation for cloth-changing person re-identification | 19
Generalizable deep features for ocular biometrics | 19
Transformer models for enhancing AttnGAN based text to image generation | 19
PU-GACNet: Graph Attention Convolution Network for Point Cloud Upsampling | 19
Multi-information-based convolutional neural network with attention mechanism for pedestrian trajectory prediction | 19
Video prediction by efficient transformers | 19
A neural network aided attuned scheme for gun detection in video surveillance images | 18
E2E-VSDL: End-to-end video surveillance-based deep learning model to detect and prevent criminal activities | 18
SalFBNet: Learning pseudo-saliency distribution via feedback convolutional networks | 18
ERF-YOLO: A YOLO algorithm compatible with fewer parameters and higher accuracy | 18
Face anti-spoofing detection based on multi-scale image quality assessment | 18
Cross-database and cross-attack Iris presentation attack detection using micro stripes analyses | 18
Boundary guidance network for camouflage object detection | 18
Bald eagle search optimization with deep transfer learning enabled age-invariant face recognition model | 17
Synergetic reconstruction from 2D pose and 3D motion for wide-space multi-person video motion capture in the wild | 17
The effect of image recognition traffic prediction method under deep learning and naive Bayes algorithm on freeway traffic safety | 17
Multi-view dynamic facial action unit detection | 17
Multi-level refinement enriched feature pyramid network for object detection | 17
Facial expression recognition using densely connected convolutional neural network and hierarchical spatial attention | 17
Deep hybrid learning for facial expression binary classifications and predictions | 16
An efficient foreign objects detection network for power substation | 16
Beyond modality alignment: Learning part-level representation for visible-infrared person re-identification | 16
Multimodal assessment of apparent personality using feature attention and error consistency constraint | 16
Face mask detection using deep convolutional neural network and multi-stage image processing | 16
Detection of anomaly in surveillance videos using quantum convolutional neural networks | 16
HPRNet: Hierarchical point regression for whole-body human pose estimation | 16
Unsupervised face Frontalization for pose-invariant face recognition | 16
Dense open-set recognition based on training with noisy negative images | 16
SalED: Saliency prediction with a pithy encoder-decoder architecture sensing local and global information | 15
Point cloud completion using multiscale feature fusion and cross-regional attention | 15
Intelligent facial expression recognition and classification using optimal deep transfer learning model | 15
Few-shot object detection via baby learning | 15
Real-time gait biometrics for surveillance applications: A review | 15
Real-time semantic segmentation with weighted factorized-depthwise convolution | 15
Real-time semantic segmentation with local spatial pixel adjustment | 15
CAM: A fine-grained vehicle model recognition method based on visual attention model | 14
Novel features for art movement classification of portrait paintings | 14
Edge supervision and multi-scale cost volume for stereo matching | 14
Certifiable relative pose estimation | 14
Variance-guided attention-based twin deep network for cross-spectral periocular recognition | 14
A calibration method of computer vision system based on dual attention mechanism | 14
A new perceptual hashing method for verification and identity classification of occluded faces | 14
Fusion of iris and sclera using phase intensive rubbersheet mutual exclusion for periocular recognition | 14
Intelligent multimodal pedestrian detection using hybrid metaheuristic optimization with deep learning model | 14
Dual-path CNN with Max Gated block for text-based person re-identification | 14
Pose-guided part matching network via shrinking and reweighting for occluded person re-identification | 14
A study on attention-based LSTM for abnormal behavior recognition with variable pooling | 14
Expression recognition with deep features extracted from holistic and part-based models | 13
Batch feature standardization network with triplet loss for weakly-supervised video anomaly detection | 13
Emotion detection and face recognition of drivers in autonomous vehicles in IoT platform | 13
How robust are discriminatively trained zero-shot learning models? | 13
MFC-Net : Multi-feature fusion cross neural network for salient object detection | 13
Point cloud classification with deep normalized Reeb graph convolution | 13
From known to the unknown: Transferring knowledge to answer questions about novel visual and semantic concepts | 13
Single stage architecture for improved accuracy real-time object detection on mobile devices | 13
Multi-level prediction Siamese network for real-time UAV visual tracking | 12
Multi parallel U-net encoder network for effective polyp image segmentation | 12
ASPset: An outdoor sports pose video dataset with 3D keypoint annotations | 12
Cancelable Iris template generation by aggregating patch level ordinal relations with its holistically extended performance and security analysis | 12
I-SOCIAL-DB: A labeled database of images collected from websites and social media for Iris recognition | 12
Interactive multi-scale feature representation enhancement for small object detection | 12
Double anchor embedding for accurate multi-person 2D pose estimation | 12
Human object interaction detection: Design and survey | 12
A novel approach for breast cancer detection using optimized ensemble learning framework and XAI | 12
R2Net: Residual refinement network for salient object detection | 12
Combining complementary trackers for enhanced long-term visual object tracking | 12
Modeling graph-structured contexts for image captioning | 12
Using synthetic data for person tracking under adverse weather conditions | 12
Spatiotemporal module for video saliency prediction based on self-attention | 12
SiaTrans: Siamese transformer network for RGB-D salient object detection with depth image classification | 11
Pose-guided counterfactual inference for occluded person re-identification | 11
Dense graph convolutional neural networks on 3D meshes for 3D object segmentation and classification | 11
Detection of panoramic vision pedestrian based on deep learning | 11
PDA: Proxy-based domain adaptation for few-shot image recognition | 11
Does explainable machine learning uncover the black box in vision applications? | 11
H-net: Unsupervised domain adaptation person re-identification network based on hierarchy | 11
Multistage temporal convolution transformer for action segmentation | 11
Learning an augmentation strategy for sparse datasets | 11
Grassmann manifold based framework for automated fall detection from a camera | 10
Tackling multiple object tracking with complicated motions — Re-designing the integration of motion and appearance | 10
Continual coarse-to-fine domain adaptation in semantic segmentation | 10
Geometry consistency aware confidence evaluation for feature matching | 10
Edge-aware salient object detection network via context guidance | 10
Geometric feature statistics histogram for both real-valued and binary feature representations of 3D local shape | 10
Crowd density detection method based on crowd gathering mode and multi-column convolutional neural network | 10
Camera pose estimation in multi-view environments: From virtual scenarios to the real world | 10
Engagement detection and enhancement for STEM education through computer vision, augmented reality, and haptics | 10
Boundary graph convolutional network for temporal action detection | 10
Spatial–temporal graph attention network for video anomaly detection | 10
A motion model based on recurrent neural networks for visual object tracking | 10
Whether normalized or not? Towards more robust iris recognition using dynamic programming | 10
Authenticating and securing healthcare records: A deep learning-based zero watermarking approach | 10
Towards generalized morphing attack detection by learning residuals | 10
E2E-V2SResNet: Deep residual convolutional neural networks for end-to-end video driven speech synthesis | 10
Texture classification-based feature processing for violence-based anomaly detection in crowded environments | 10
Tracking fiducial markers with discriminative correlation filters | 10
Context-based image explanations for deep neural networks | 10
Clothing generation by multi-modal embedding: A compatibility matrix-regularized GAN model | 10
Single image dehazing using extended local dark channel prior | 10
RAMT-GAN: Realistic and accurate makeup transfer with generative adversarial network | 10
Adaptive weight based on overlapping blocks network for facial expression recognition | 10
MDCS with fully encoding the information of local shape description for 3D Rigid Data matching | 10
Image captioning via proximal policy optimization | 10
Deep domain adaptation with ordinal regression for pain assessment using weakly-labeled videos | 10
Handcrafted localized phase features for human action recognition | 10
Omnidirectional stereo depth estimation based on spherical deep network | 9
Video-based person re-identification by intra-frame and inter-frame graph neural network | 9
Triangulate geometric constraint combined with visual-flow fusion network for accurate 6DoF pose estimation | 9
Dual-branch adaptive attention transformer for occluded person re-identification | 9
A novel micro-expression detection algorithm based on BERT and 3DCNN | 9
Few-shot personalized saliency prediction using meta-learning | 9
Composite recurrent network with internal denoising for facial alignment in still and video images in the wild | 9
Distinguishing foreground and background alignment for unsupervised domain adaptative semantic segmentation | 9
Short-term anchor linking and long-term self-guided attention for video object detection | 9
Multi-scale interaction transformer for temporal action proposal generation | 9
Knowledge distillation methods for efficient unsupervised adaptation across multiple domains | 9
Enhancing single-view 3D mesh reconstruction with the aid of implicit surface learning | 9
Generating facial expression adversarial examples based on saliency map | 9
AESPNet: Attention Enhanced Stacked Parallel Network to improve automatic Diabetic Foot Ulcer identification | 9
ArCo: Attention-reinforced transformer with contrastive learning for image captioning | 9
Lightweight boundary refinement module based on point supervision for semantic segmentation | 9
View knowledge transfer network for multi-view action recognition | 9
Improving eye movement biometrics in low frame rate eye-tracking devices using periocular and eye blinking features | 8
ScPnP: A non-iterative scale compensation solution for PnP problems | 8
Weather-degraded image semantic segmentation with multi-task knowledge distillation | 8
A pooling-based feature pyramid network for salient object detection | 8
Cross-domain car detection model with integrated convolutional block attention mechanism | 8
Person re-identification: A taxonomic survey and the path ahead | 8
RGB-T tracking by modality difference reduction and feature re-selection | 8
AGLC-GAN: Attention-based global-local cycle-consistent generative adversarial networks for unpaired single image dehazing | 8
Student behavior recognition for interaction detection in the classroom environment | 8
A Tibetan Thangka data set and relative tasks | 8
SyPer: Synthetic periocular data for quantized light-weight recognition in the NIR and visible domains | 8
Advances in deep learning-based image recognition of product packaging | 8
Activity guided multi-scales collaboration based on scaled-CNN for saliency prediction | 8
Faster and finer pose estimation for multiple instance objects in a single RGB image | 8
Depth-guided saliency detection via boundary information | 8
A survey on computer vision based human analysis in the COVID-19 era | 8
A novel facial expression recognition algorithm using geometry β –skeleton in fusion based on deep CNN | 8
Online multi-object tracking with δ-GLMB filter based on occlusion and identity switch handling | 8
Cross-view action recognition with small-scale datasets | 8
Flow guided mutual attention for person re-identification | 8
Local information fusion network for 3D shape classification and retrieval | 7
Building NAS: Automatic designation of efficient neural architectures for building extraction in high-resolution aerial images | 7
VQA as a factoid question answering problem: A novel approach for knowledge-aware and explainable visual question answering | 7
AGA-GAN: Attribute Guided Attention Generative Adversarial Network with U-Net for face hallucination | 7
Attention-guided aggregation stereo matching network | 7
Monocular contextual constraint for stereo matching with adaptive weights assignment | 7
A cross-modal crowd counting method combining CNN and cross-modal transformer | 7
Joint patch and instance discrimination learning for unsupervised person re-identification | 7
Aligning vision-language for graph inference in visual dialog | 7
Interpretable visual reasoning: A survey | 7
Semantic-aligned reinforced attention model for zero-shot learning | 7
Multi-source material image optimized selection based multi-option composition | 7
Single-shot cuboids: Geodesics-based end-to-end Manhattan aligned layout estimation from spherical panoramas | 7
ThickSeg: Efficient semantic segmentation of large-scale 3D point clouds using multi-layer projection | 7
Spatial temporal and channel aware network for video-based person re-identification | 7
Dual guidance enhanced network for light field salient object detection | 7
Adversarial color projection: A projector-based physical-world attack to DNNs | 7
Multimodal image fusion based on point-wise mutual information | 7
Semantic video segmentation with dynamic keyframe selection and distortion-aware feature rectification | 7
COME: Clip-OCR and Master ObjEct for text image captioning | 7
Progressive ShallowNet for large scale dynamic and spontaneous facial behaviour analysis in children | 6
Automatic Deep Sparse Multi-Trial Vector-based Differential Evolution clustering with manifold learning and incremental technique | 6
Reinforced pedestrian attribute recognition with group optimization reward | 6
Face deidentification with controllable privacy protection | 6
LTST: Long-term segmentation tracker with memory attention network | 6
Multi-view self-supervised learning for 3D facial texture reconstruction from single image | 6
Rich global feature guided network for monocular depth estimation | 6
Bias alleviating generative adversarial network for generalized zero-shot classification | 6
Cuepervision: self-supervised learning for continuous domain adaptation without catastrophic forgetting | 6
Comparison of fine-tuning strategies for transfer learning in medical image classification | 6
Future pedestrian location prediction in first-person videos for autonomous vehicles and social robots | 6
Pixel-wise ordinal classification for salient object grading | 6
A deep feature fusion network with global context and cross-dimensional dependencies for classification of mild cognitive impairment from brain MRI | 6
Feature fusion for object detection at one map | 6
Double cross-modality progressively guided network for RGB-D salient object detection | 6
A dynamic keypoint selection network for 6DoF pose estimation | 6
Cross-stream contrastive learning for self-supervised skeleton-based action recognition | 6
TENet: Accurate light-field salient object detection with a transformer embedding network | 6
GIFSL - grafting based improved few-shot learning | 6
Qualitative failures of image generation models and their application in detecting deepfakes | 6
Face reenactment via generative landmark guidance | 6
Optimal deep transfer learning based ethnicity recognition on face images | 6
Lightweight multi-scale convolutional neural network for real time stereo matching | 6
Multi–feature fusion tracking algorithm based on peak–context learning | 6
PAGML: Precise Alignment Guided Metric Learning for sketch-based 3D shape retrieval | 6
3D-VDNet: Exploiting the vertical distribution characteristics of point clouds for 3D object detection and augmentation | 6
LP-GAN: Learning perturbations based on generative adversarial networks for point cloud adversarial attacks | 6
Collaborative knowledge distillation for incomplete multi-view action prediction | 6
Task-based parameter isolation for foreground segmentation without catastrophic forgetting using multi-scale region and edges fusion network | 6
A few-shot learning-based ischemic stroke segmentation system using weighted MRI fusion | 6
A unified RGB-T crowd counting learning framework | 6
Speech driven video editing via an audio-conditioned diffusion model | 6