IEEE Transactions on Circuits and Systems for Video Technology

Papers
(The TQCC of IEEE Transactions on Circuits and Systems for Video Technology is 19. The table below lists the papers whose CrossRef citation counts are above that threshold [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-12-01 to 2025-12-01.)
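The selection rule described above (a citation threshold, a publication window, and a cap of 250 entries) can be sketched as follows. This is only an illustration, not the site's actual implementation: the `Paper` record, the `select_papers` function, and its parameter names are hypothetical, and it assumes that "above that threshold" means strictly greater and that both window dates are inclusive.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Paper:
    """Hypothetical record for one row of the listing."""
    title: str
    citations: int
    published: date


def select_papers(papers, threshold, start, end, limit=250):
    """Return papers above the threshold, inside the window, most cited first.

    Assumptions (not confirmed by the source): the comparison is strict
    (citations > threshold) and the date window includes both endpoints.
    """
    eligible = [
        p for p in papers
        if p.citations > threshold and start <= p.published <= end
    ]
    # Sort by citation count, highest first, then apply the cap.
    eligible.sort(key=lambda p: p.citations, reverse=True)
    return eligible[:limit]
```

With the values stated above, a call would look like `select_papers(papers, threshold=19, start=date(2021, 12, 1), end=date(2025, 12, 1))`.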
Article | Citations
2022 Index IEEE Transactions on Circuits and Systems for Video Technology Vol. 32 | 372
IEEE Transactions on Circuits and Systems for Video Technology Publication Information | 354
Table of Contents | 325
IEEE Transactions on Circuits and Systems for Video Technology publication information | 302
USVTrack: A Benchmark for Multi-Object Tracking in Complex Water Surface Scenes | 294
IEEE Transactions on Circuits and Systems for Video Technology publication information | 294
Unsupervised Action Segmentation via Multi-scale Temporal-interaction Enhancement | 291
Pose-Guided Transformer for Fine-Grained Action Quality Assessment | 275
Scene Prior Constrained Self-Paced Learning for Unsupervised Satellite Video Vehicle Detection | 246
Multi-Modal Multi-Grained Embedding Learning for Generalized Zero-Shot Video Classification | 242
Dual Difficulty-Aware Adaptive Pseudo Labeling for Semi-Supervised CNV Segmentation | 240
SpiReco: Fast and Efficient Recognition of High-Speed Moving Objects With Spike Camera | 220
Deep Affine Motion Compensation Network for Inter Prediction in VVC | 219
Representation Robustness and Feature Expansion for Exemplar-Free Class-Incremental Learning | 208
Highly-Parallel Hardwired Deep Convolutional Neural Network for 1-ms Dual-Hand Tracking | 204
Ct-LVI: A Framework Toward Continuous-Time Laser-Visual-Inertial Odometry and Mapping | 202
Draw Like an Artist: Complex Scene Generation with Diffusion Model via Composition, Painting, and Retouching | 191
DS2VP: Dynamically-Selected Spatially Visual Prompting | 184
IEEE Circuits and Systems Society Information | 180
Table of Contents | 180
Guest Editorial Introduction to the Special Issue on Label-Efficient Learning on Video Data | 170
MEF-GD: Multimodal Enhancement and Fusion Network for Garment Designer | 169
A Format Compliant Framework for HEVC Selective Encryption After Encoding | 168
Semi-Supervised Crowd Counting via Multi-Task Pseudo-Label Self-Correction Strategy | 167
Push-and-Pull: A General Training Framework With Differential Augmentor for Domain Generalized Point Cloud Classification | 161
Toward Meta-Shape-Based Multi-View 3D Point Cloud Registration: An Evaluation | 160
Adversarial Dual-Student With Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation | 159
Multi-Stage Cross-Modality Feature Interaction for RGB-Thermal Multi-Object Tracking | 158
Learning Depth-Density Priors for Fourier-Based Unpaired Image Restoration | 156
Frequency Generation for Real-World Image Super-Resolution | 155
Truncated Robust Natural Watermarking With Hungarian Optimization | 146
TiGDistill-BEV: Multi-view BEV 3D Object Detection via Target Inner-Geometry Learning Distillation | 146
Filtering-and-Alternating-Calibration: Spatiotemporal Context Alternating Fusion for Event-based Monocular Depth Estimation | 145
Cross-Level Multi-Modal Features Learning With Transformer for RGB-D Object Recognition | 143
Equity in Unsupervised Domain Adaptation by Nuclear Norm Maximization | 142
Scalable and Robust Tensor Ring Decomposition for Large-Scale Data With Missing Data and Outliers | 142
EIFNet: An Explicit and Implicit Feature Fusion Network for Finger Vein Verification | 141
FastAL: Fast Evaluation Module for Efficient Dynamic Deep Active Learning Using Broad Learning System | 140
VPA: Multi-modal Virtual Point Augmentation for 3D Object Detection | 138
RT3DHVC: A Real-Time Human Holographic Video Conferencing System With a Consumer RGB-D Camera Array | 129
Block Diagonal Graph Embedded Discriminative Regression for Image Representation | 128
Future Feature-Based Supervised Contrastive Learning for Streaming Perception | 128
Convolutional Neural Networks for Omnidirectional Image Quality Assessment: A Benchmark | 127
CRP2-VCS: Contrast-Oriented Region-Based Progressive Probabilistic Visual Cryptography Schemes | 126
Semantic-Aware Late-Stage Supervised Contrastive Learning for Fine-Grained Action Recognition | 125
Viewport Prediction for Volumetric Video Streaming by Exploring Video Saliency and User Trajectory Information | 123
Synergistic Fusion Network of Microscopic Hyperspectral and RGB Images for Multi-perspective Segmentation | 121
Dependability Feature Learning based on Sample Generation for Unsupervised Text-to-Image Person Re-identification | 118
Stochastic Gradient Perturbation: An Implicit Regularizer for Person Re-Identification | 116
Multi-Level Feature Fusion Network for Shadow Removal Detection | 115
Learning Spatio-Temporal Sharpness Map for Video Deblurring | 115
Uni3DA: Universal 3D Domain Adaptation for Object Recognition | 115
MCCE-REC: MLLM-Driven Cross-Modal Contrastive Entropy Model for Zero-Shot Referring Expression Comprehension | 112
Crowd-Powered Photo Enhancement Featuring an Active Learning Based Local Filter | 111
A Clinically Guided Graph Convolutional Network for Assessment of Parkinsonian Pronation-Supination Movements of Hands | 110
Hierarchical Dynamic Programming Module for Human Pose Refinement | 109
Learning to Capture the Query Distribution for Few-Shot Learning | 108
Efficient Single-Object Tracker Based on Local-Global Feature Fusion | 108
Dual-Stream Transformer With Distribution Alignment for Visible-Infrared Person Re-Identification | 107
PhyDAA: Physiological Dataset Assessing Attention | 107
Negative Class Guided Spatial Consistency Network for Sparsely Supervised Semantic Segmentation of Remote Sensing Images | 106
Plausible Proxy Mining With Credibility for Unsupervised Person Re-Identification | 106
Harmony: An Eco-Friendly Adaptive Rate Control Scheme for Video-on-Demand in Low Earth Orbit Satellite Internet | 105
Joint Learning of Image Deblurring and Depth Estimation Through Adversarial Multi-Task Network | 103
Lightweight Neural Network for Enhancing Imaging Performance of Under-Display Camera | 102
DSC3D: Deformable Sampling Constraints in Stereo 3D Object Detection for Autonomous Driving | 101
Relative Comparison-Based Consensus Learning for Multi-View Subspace Clustering | 100
VDTR: Video Deblurring With Transformer | 100
Fuzzified Contrast Enhancement for Nearly Invisible Images | 99
Fully Unsupervised Domain-Agnostic Image Retrieval | 99
Few-Shot Temporal Sentence Grounding via Memory-Guided Semantic Learning | 99
PPIFuse: Physical Priors Injected Infrared and Visible Image Fusion | 98
UDTCWT-PHFMs Domain Statistical Image Watermarking Using Vector BW-Type R Distribution | 97
Local Attention Transformer-Based Full-View Finger-Vein Identification | 96
Exploring Explicitly Disentangled Features for Domain Generalization | 96
Towards Video Anomaly Detection in the Real World: A Binarization Embedded Weakly-Supervised Network | 96
Instance-Incremental Scene Graph Generation From Real-World Point Clouds via Normalizing Flows | 95
ASCFormer: An Adaptive Structure-aware Cascaded Transformer for 3D Object Detection | 94
Robust Image Watermarking With Synchronization Using Template Enhanced-Extracted Network | 94
Reversible Data Hiding Over Encrypted Images via Preprocessing-Free Matrix Secret Sharing | 93
Reversible Data Hiding in Encrypted Image via Secret Sharing Based on GF(p) and GF(2⁸) | 91
Image Super-Resolution With Self-Similarity Prior Guided Network and Sample-Discriminating Learning | 91
SMART: Semantic Matching Contrastive Learning for Partially View-Aligned Clustering | 90
Progressive Point Cloud Upsampling via Differentiable Rendering | 90
Learning Appearance-Motion Synergy via Memory-Guided Event Prediction for Video Anomaly Detection | 87
TPCM-SegNet: A Text-Prompted Dual-Path Convolution-Mamba Network for Anomaly Segmentation | 87
D3C2-Net: Dual-Domain Deep Convolutional Coding Network for Compressive Sensing | 87
Exploring and Exploiting High-Order Spatial–Temporal Dynamics for Long-Term Frame Prediction | 87
Iterative Self-Guided Image Filtering | 87
AirSOD: A Lightweight Network for RGB-D Salient Object Detection | 86
Pro-Tuning: Unified Prompt Tuning for Vision Tasks | 86
Single Image Haze Removal With Haze Map Optimization for Various Haze Concentrations | 85
UAMD-Net: A Unified Adaptive Multimodal Neural Network for Dense Depth Completion | 85
MMI-Det: Exploring Multi-Modal Integration for Visible and Infrared Object Detection | 85
Projected Generative Adversarial Network for Point Cloud Completion | 84
Enhancing Representation Learning With Spatial Transformation and Early Convolution for Reinforcement Learning-Based Small Object Detection | 83
Key Role Guided Transformer for Group Activity Recognition | 82
Graph-Guided Unsupervised Multiview Representation Learning | 82
Reliable Entropy-Induced Anchor Learning for Incomplete Multi-View Subspace Clustering | 81
Video Understanding with Large Language Models: A Survey | 81
Active Spatial Positions Based Hierarchical Relation Inference for Group Activity Recognition | 80
CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation | 80
Spatial Attention-Guided Light Field Salient Object Detection Network With Implicit Neural Representation | 79
Edge and Skeleton Guidance Network for Salient Object Detection in Optical Remote Sensing Images | 79
Spectral–Spatial Feature Extraction With Dual Graph Autoencoder for Hyperspectral Image Clustering | 79
Deep and Low-Rank Quaternion Priors for Color Image Processing | 78
SARGAN: Spatial Attention-Based Residuals for Facial Expression Manipulation | 78
Representing Boundary-Ambiguous Scene Online With Scale-Encoded Cascaded Grids and Radiance Field Deblurring | 78
DMRFlow: 4D Radar Scene Flow Estimation With Decoupled Matching and Refinement | 77
IEEE Transactions on Circuits and Systems for Video Technology publication information | 76
Multi-Modal Attribute Prompting for Vision-Language Models | 76
Relation-Aware Multi-Pass Comparison Deconfounded Network for Change Captioning | 76
IEEE Circuits and Systems Society Information | 75
IEEE Transactions on Circuits and Systems for Video Technology publication information | 75
OraL: An Observational Learning Paradigm for Unsupervised Hyperspectral Change Detection | 74
StreetSurfGS: Scalable Urban Street Surface Reconstruction With Planar-Based Gaussian Splatting | 74
Flow Visualization for Complex Fluid Flows via a Structure-Enhanced Motion Estimator | 73
An Efficient Algorithm for Generating Harmonized Stereoscopic 360° VR Images | 73
Holistic Prototype Attention Network for Few-Shot Video Object Segmentation | 72
ImagingNet: A New Learnable SAR Imaging Method via Hierarchical U-shaped Network | 71
FDNet: Frequency Decomposition Network for Learned Image Compression | 71
A Label-Free and Non-Monotonic Metric for Evaluating Denoising in Event Cameras | 71
Compensating for the Incomplete with the Complete: An Efficient Scene Text Detector | 70
Monocular Depth Estimation on Adverse Weathers With Curriculum Domain Distribution Alignment | 70
MixSSC: Forward-Backward Mixture for Vision-Based 3D Semantic Scene Completion | 70
Mesh2Animation: Unsupervised Animating for Quadruped 3D Objects | 70
Boosting Semi-Supervised Face Recognition With Noise Robustness | 69
FaceGCN: Structured Priors Inspired Graph Convolutional Networks for Face Restoration With Unknown Degradations | 69
FDAC: Federated Domain Adaptation via Dual Contrastive Learning | 69
A Novel Deep Learning Framework for Automatic Recognition of Thyroid Gland and Tissues of Neck in Ultrasound Image | 69
Efficiently Exploiting Spatially Variant Knowledge for Video Deblurring | 69
Enhancing Robustness of Multi-Object Trackers With Temporal Feature Mix | 68
Multi-Scale Explicit Matching and Mutual Subject Teacher Learning for Generalizable Person Re-Identification | 68
SMR: Spatial-Guided Model-Based Regression for 3D Hand Pose and Mesh Reconstruction | 68
Non-local Guided Neural Fields for 4D CT Reconstruction | 67
Texture-Aware Spherical Rotation for High Efficiency Omnidirectional Intra Video Coding | 67
MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation | 67
Learning Scene-invariant Distribution for Generalizable Blind Image Quality Assessment | 67
Task-Specific Loss for Robust Instance Segmentation With Noisy Class Labels | 67
Table of Contents | 66
Multimodal Industrial Anomaly Detection via Geometric Prior | 66
Dynamic Particle Filter Framework for Robust Object Tracking | 66
Appearance Matters, So Does Audio: Revealing the Hidden Face via Cross-Modality Transfer | 66
Table of Contents | 66
HyPSAM: Hybrid Prompt-driven Segment Anything Model for RGB-Thermal Salient Object Detection | 66
Touchless Finger Vein and Fingerprint Verification via Exploiting Attention-Based Cross-Domain Fusion | 65
WeaFU: Weather-Informed Image Blind Restoration via Multi-Weather Distribution Diffusion | 65
G2LP-Net: Global to Local Progressive Video Inpainting Network | 65
Errata to “Local-Global Temporal Difference Learning for Satellite Video Super-Resolution” | 65
Inter-Scale Similarity Guided Cost Aggregation for Stereo Matching | 65
Self-Supervised Adversarial Video Summarizer With Context Latent Sequence Learning | 64
A Novel Video Coding Strategy in HEVC for Object Detection | 64
Online Unsupervised Video Object Segmentation via Contrastive Motion Clustering | 64
VSOIQE: A Novel Viewport-Based Stitched 360° Omnidirectional Image Quality Evaluator | 64
Diverse Batch Steganography Using Model-Based Selection and Double-Layered Payload Assignment | 64
Fixing Defect of Photometric Loss for Self-Supervised Monocular Depth Estimation | 63
Flow-Edge Guided Unsupervised Video Object Segmentation | 63
Interlayer Restoration Deep Neural Network for Scalable High Efficiency Video Coding | 63
Blind Image Quality Index for Authentic Distortions With Local and Global Deep Feature Aggregation | 63
Feature Evaluation and Joint Interaction for Audio-Visual Emotion Recognition | 63
DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and Authentication | 63
Searching a Compact Architecture for Robust Multi-Exposure Image Fusion | 63
Efficient Non-Blind Image Deblurring With Discriminative Shrinkage Deep Networks | 62
Optical Flow Reusing for High-Efficiency Space-Time Video Super Resolution | 62
Surveillance Video-and-Language Understanding: From Small to Large Multimodal Models | 62
STAF: 3D Human Mesh Recovery From Video With Spatio-Temporal Alignment Fusion | 62
Question-Aware Global-Local Video Understanding Network for Audio-Visual Question Answering | 62
Dynamic Hypergraph Convolutional Network for No-Reference Point Cloud Quality Assessment | 62
Enhanced Spatial-Temporal Salience for Cross-View Gait Recognition | 62
DAHP: Deep Attention-Guided Hashing With Pairwise Labels | 61
Multi-Prior Driven Network for RGB-D Salient Object Detection | 61
Depth Estimation From a Single Image of Blast Furnace Burden Surface Based on Edge Defocus Tracking | 61
Robust Matrix Completion Based on Factorization and Truncated-Quadratic Loss Function | 61
Balanced Teacher for Source-Free Object Detection | 61
CNN-Transformer Based Generative Adversarial Network for Copy-Move Source/Target Distinguishment | 61
TAKD: Target-Aware Knowledge Distillation for Remote Sensing Scene Classification | 60
Semantic Disentanglement Adversarial Hashing for Cross-Modal Retrieval | 60
Learning With Noisy Labels by Semantic and Feature Space Collaboration | 60
Low-Rank Tensor Graph Learning for Multi-View Subspace Clustering | 59
A Universal Framework for Improving the Robustness of Coverless Image Steganography Based on Image Restoration | 59
Unsupervised Deep Hashing With Fine-Grained Similarity-Preserving Contrastive Learning for Image Retrieval | 59
Recent Advances in Rate Control: From Optimization to Implementation and Beyond | 59
Adaptive Mixture-of-Experts Distillation for Cross-Satellite Generalizable Incremental Remote Sensing Scene Classification | 59
All-Inclusive Image Enhancement for Degraded Images Exhibiting Low-Frequency Corruption | 59
Low-Resolution Object Recognition With Cross-Resolution Relational Contrastive Distillation | 59
Forgery-Aware Adaptive Learning With Vision Transformer for Generalized Face Forgery Detection | 59
Target-Aware Tracking With Spatial-Temporal Context Attention | 59
MSGA-Net: Progressive Feature Matching via Multi-Layer Sparse Graph Attention | 59
Transformer-Based Multimodal Emotional Perception for Dynamic Facial Expression Recognition in the Wild | 58
Meta-Learning Based Domain Prior With Application to Optical-ISAR Image Translation | 58
One for All: A Unified Generative Framework for Image Emotion Classification | 58
Laplacian Pyramid Fusion Network With Hierarchical Guidance for Infrared and Visible Image Fusion | 58
Cloth-Imbalanced Gait Recognition via Hallucination | 58
DEP-Former: Multimodal Depression Recognition Based on Facial Expressions and Audio Features via Emotional Changes | 57
Conditional Dual Diffusion for Multimodal Clustering of Optical and SAR Images | 57
VmambaIR: Visual State Space Model for Image Restoration | 57
Contrastive Learning With Enhancing Detailed Information for Pre-Training Vision Transformer | 56
Concept-Enhanced Relation Network for Video Visual Relation Inference | 56
Learning Physical-Spatio-Temporal Features for Video Shadow Removal | 56
Dense Crosstalk Feature Aggregation for Classification and Localization in Object Detection | 56
Learning Multi-View Stereo with Geometry-Aware Prior | 56
Dual Prototypes-Based Personalized Federated Adversarial Cross-Modal Hashing | 56
Table of Contents | 56
M3CS: Multi-Target Masked Point Modeling With Learnable Codebook and Siamese Decoders | 56
IEEE Transactions on Circuits and Systems for Video Technology publication information | 56
UNeLF: Unconstrained Neural Light Field for Self-Supervised Angular Super-Resolution | 55
StarPose: 3D Human Pose Estimation via Spatial-Temporal Autoregressive Diffusion | 55
Phrase Grounding-based Style Transfer for Single-Domain Generalized Object Detection | 55
Optical Flow-Based Spatiotemporal Sketch for Video Representation: A Novel Framework | 55
DilatedTAD: Enhancing Adaptability to Actions of Varying Durations for Temporal Action Detection | 55
Generative Augmentation Hashing for Few-shot Cross-Modal Retrieval | 54
PointOT: Interpretable Geometry-Inspired Point Cloud Generative Model via Optimal Transport | 54
Enhancing Transparent Object Matting Using Predicted Definite Foreground and Background | 54
Flexible Temperature Parallel Distillation for Dense Object Detection: Make Response-Based Knowledge Distillation Great Again | 54
Hypergraph Contrastive Learning for Large-Scale Hyperspectral Image Clustering | 54
Dual-Net: Dual Visual Spectral Affinity Monitoring Network for Hyperspectral Anomaly Detection | 54
Generative Latent Coding for Ultra-Low Bitrate Image and Video Compression | 54
Exploiting Multiperspective Driven Hierarchical Content-Aware Network for Finger Vein Verification | 53
Progressive Multi-Prompt Learning for Vision-Language Models | 53
ViMAEdit: Vision-guided and Mask-enhanced Adaptive Editing Algorithm for Prompt-based Image Editing | 53
MambaPTP: Exploring the Potential of Mamba for Pedestrian Trajectory Prediction | 53
A Pixel-Level Segmentation-Synthesis Framework for Dynamic Texture Video Compression | 52
Locality-Adaptive Structured Dictionary Learning for Cross-Domain Recognition | 52
Generalized Intra-Camera Supervised Person Re-Identification | 52
CodingHomo: Bootstrapping Deep Homography With Video Coding | 52
Content-Adaptive Rate Control Method for User-Generated Content Videos | 52
Point Cloud Completion via Self-Projected View Augmentation and Implicit Field Constraint | 52
Special Issue on Segment Anything for Videos and Beyond | 52
Semantic-Context Graph Network for Point-Based 3D Object Detection | 52
Modality Fused Class-Proxy With Knowledge Distillation for Zero-Shot Sketch-Based Image Retrieval | 51
Improving Zero-Shot Generalization for CLIP with Prompt Ensemble self-Distillation | 51
POS-Trends Dynamic-Aware Model for Video Caption | 51
Neuromorphic Imaging With Super-Resolution | 51
PCTrack: Accurate Object Tracking for Live Video Analytics on Resource-Constrained Edge Devices | 51
Table of Contents | 51
Diffusion-Based Hypotheses Generation and Joint-Level Hypotheses Aggregation for 3D Human Pose Estimation | 51
Enhancing Vision and Language Navigation With Prompt-Based Scene Knowledge | 51
SPCL: Semantic Polymorphism and Commonality Learning for Text-based Person Retrieval | 51
DBVC: An End-to-End 3-D Deep Biomedical Video Coding Framework | 51
Generative Image Steganography Based on Text-to-Image Multimodal Generative Model | 51
SmokePose: End-to-End Smoke Keypoint Detection | 51
Enhancing Skeleton-Based Action Recognition With Language Descriptions From Pre-Trained Large Multimodal Models | 50
CLSR: Cross-Layer Interaction Pyramid Super-Resolution Network | 50
Diffusion-Based Depth Inpainting for Transparent and Reflective Objects | 50
Bridging Inter-task Gap of Continual Self-supervised Learning with External Data | 50
Exploring Relational Knowledge for Source-Free Domain Adaptation | 49
CSTA: Spatial-Temporal Causal Adaptive Learning for Exemplar-Free Video Class-Incremental Learning | 49
Dual-Domain Feature Fusion and Multi-Level Memory-Enhanced Network for Spectral Compressive Imaging | 49
Class Activation Map Calibration for Weakly Supervised Semantic Segmentation | 49
Deep Video Super-Resolution Using Hybrid Imaging System | 49