IEEE-ACM Transactions on Audio Speech and Language Processing

Papers
(The H4-Index of IEEE-ACM Transactions on Audio Speech and Language Processing is 32. The table below lists the papers that meet or exceed that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-05-01 to 2025-05-01.)
Article	Citations
Generalizing Speaker Verification for Spoof Awareness in the Embedding Space	225
Interpretable Multimodal Capsule Fusion	189
Similarity Measurement of Segment-Level Speaker Embeddings in Speaker Diarization	160
Improvement of Accent Classification Models Through Grad-Transfer From Spectrograms and Gradient-Weighted Class Activation Mapping	94
Representation Learning With Hidden Unit Clustering for Low Resource Speech Applications	91
A User-Centric Approach for Deep Residual-Echo Suppression in Double-Talk	79
Decorrelation in Feedback Delay Networks	72
CET2: Modelling Topic Transitions for Coherent and Engaging Knowledge-Grounded Conversations	63
WDEA: The Structure and Semantic Fusion With Wasserstein Distance for Low-Resource Language Entity Alignment	53
Review of Methods for Automatic Speaker Verification	51
The VoxCeleb Speaker Recognition Challenge: A Retrospective	48
Refining Synthesized Speech Using Speaker Information and Phone Masking for Data Augmentation of Speech Recognition	46
Envelope-Based Multichannel Noise Reduction for Cochlear Implant Applications	45
Efficient Lightweight Speaker Verification With Broadcasting CNN-Transformer and Knowledge Distillation Training of Self-Attention Maps	45
SBSim: A Sentence-BERT Similarity-Based Evaluation Metric for Indian Language Neural Machine Translation Systems	42
Towards Generating Diverse Audio Captions via Adversarial Training	42
$\mathcal{P}$owMix: A Versatile Regularizer for Multimodal Sentiment Analysis	42
Multi-Channel to Multi-Channel Noise Reduction and Reverberant Speech Preservation in Time-Varying Acoustic Scenes for Binaural Reproduction	40
Audio-Only Phonetic Segment Classification Using Embeddings Learned From Audio and Ultrasound Tongue Imaging Data	39
MO-Transformer: Extract High-Level Relationship Between Words for Neural Machine Translation	37
The Harmonic Shift Algorithm for Efficient Multi-Pitch Detection	37
Reverberant Source Separation Using NTF With Delayed Subsources and Spatial Priors	36
DropAttack: A Random Dropped Weight Attack Adversarial Training for Natural Language Understanding	36
Multi-Level Time-Frequency Bins Selection for Direction of Arrival Estimation Using a Single Acoustic Vector Sensor	36
Inference Skipping for More Efficient Real-Time Speech Enhancement With Parallel RNNs	35
Comparison of Feature Extraction Methods for Sound-Based Classification of Honey Bee Activity	35
Enhancing Robustness of Speech Watermarking Using a Transformer-Based Framework Exploiting Acoustic Features	35
Learning Discriminative Representations and Decision Boundaries for Open Intent Detection	32
Learning Phone Recognition From Unpaired Audio and Phone Sequences Based on Generative Adversarial Network	32
Source Separation of Piano Concertos Using Musically Motivated Augmentation Techniques	32
Attention-Based Speech Enhancement Using Human Quality Perception Modeling	32
AudioLM: A Language Modeling Approach to Audio Generation	32
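
The stated H4-Index of 32 can be checked against the citation counts in the table above. The sketch below assumes the H4-Index is an h-index restricted to the four-year window, i.e., the largest h such that h papers each have at least h citations; the page does not define the metric, so that definition is an assumption. The function name h_index and the list name counts are illustrative and do not come from the source.

```python
def h_index(citations):
    """Return the largest h such that at least h entries are >= h.

    Assumption: this is the standard h-index definition, applied here to
    papers from the four-year window only.
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h


# Citation counts copied from the table, in listed order.
counts = [
    225, 189, 160, 94, 91, 79, 72, 63, 53, 51,
    48, 46, 45, 45, 42, 42, 42, 40, 39, 37,
    37, 36, 36, 36, 35, 35, 35, 32, 32, 32,
    32, 32,
]

print(h_index(counts))  # prints 32
```

The table holds 32 papers and the lowest count is 32, so the 32nd-ranked paper still has at least 32 citations. Papers left out of the table have fewer citations than that and could not raise the result.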