EURASIP Journal on Audio Speech and Music Processing

Papers
(The TQCC of EURASIP Journal on Audio Speech and Music Processing is 3. The table below lists the papers that meet or exceed that threshold based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-11-01 to 2024-11-01.)
| Article | Citations |
| --- | --- |
| A review of infant cry analysis and classification | 50 |
| Accent modification for speech recognition of non-native speakers using neural style transfer | 21 |
| Dynamically localizing multiple speakers based on the time-frequency domain | 18 |
| Automated audio captioning: an overview of recent progress and new challenges | 18 |
| End-to-end speech emotion recognition using a novel context-stacking dilated convolution neural network | 18 |
| Progressive loss functions for speech enhancement with deep neural networks | 17 |
| An overview of machine learning and other data-based methods for spatial audio capture, processing, and reproduction | 16 |
| Performance vs. hardware requirements in state-of-the-art automatic speech recognition | 16 |
| Auxiliary function-based algorithm for blind extraction of a moving speaker | 15 |
| dEchorate: a calibrated room impulse response dataset for echo-aware signal processing | 13 |
| Adversarial joint training with self-attention mechanism for robust end-to-end speech recognition | 12 |
| Acoustic DOA estimation using space alternating sparse Bayesian learning | 11 |
| Towards cross-modal pre-training and learning tempo-spatial characteristics for audio recognition with convolutional and recurrent neural networks | 11 |
| Components loss for neural networks in mask-based speech enhancement | 10 |
| MetaMGC: a music generation framework for concerts in metaverse | 10 |
| Transformer-based ensemble method for multiple predominant instruments recognition in polyphonic music | 9 |
| Text-to-speech system for low-resource language using cross-lingual transfer learning and data augmentation | 9 |
| Deep multiple instance learning for foreground speech localization in ambient audio from wearable devices | 9 |
| Geometry calibration in wireless acoustic sensor networks utilizing DoA and distance information | 9 |
| MYRiAD: a multi-array room acoustic database | 8 |
| Steerable differential beamformers with planar microphone arrays | 8 |
| Benefits of pre-trained mono- and cross-lingual speech representations for spoken language understanding of Dutch dysarthric speech | 8 |
| Improving low-resource Tibetan end-to-end ASR by multilingual and multilevel unit modeling | 8 |
| Comparison of semi-supervised deep learning algorithms for audio classification | 7 |
| AUC optimization for deep learning-based voice activity detection | 7 |
| Time–frequency scattering accurately models auditory similarities between instrumental playing techniques | 7 |
| Deep neural networks for automatic speech processing: a survey from large corpora to limited data | 7 |
| DOANet: a deep dilated convolutional neural network approach for search and rescue with drone-embedded sound source localization | 6 |
| Beyond the Big Five personality traits for music recommendation systems | 6 |
| Low-complexity artificial noise suppression methods for deep learning-based speech enhancement algorithms | 6 |
| Depression-level assessment from multi-lingual conversational speech data using acoustic and text features | 5 |
| Speech emotion recognition based on emotion perception | 5 |
| RPCA-DRNN technique for monaural singing voice separation | 5 |
| Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit | 5 |
| Single-channel speech enhancement based on joint constrained dictionary learning | 5 |
| NMF-weighted SRP for multi-speaker direction of arrival estimation: robustness to spatial aliasing while exploiting sparsity in the atom-time domain | 5 |
| A simulation study on optimal scores for speaker recognition | 5 |
| Review of methods for coding of speech signals | 5 |
| Estimation of playable piano fingering by pitch-difference fingering match model | 5 |
| Timestamp-aligning and keyword-biasing end-to-end ASR front-end for a KWS system | 5 |
| A CNN-based approach to identification of degradations in speech signals | 4 |
| Comparative evaluation of interpolation methods for the directivity of musical instruments | 4 |
| Paralinguistic singing attribute recognition using supervised machine learning for describing the classical tenor solo singing voice in vocal pedagogy | 4 |
| An online algorithm for echo cancellation, dereverberation and noise reduction based on a Kalman-EM Method | 4 |
| AAM: a dataset of Artificial Audio Multitracks for diverse music information retrieval tasks | 4 |
| Audio source separation by activity probability detection with maximum correlation and simplex geometry | 4 |
| A survey of technologies for automatic Dysarthric speech recognition | 4 |
| Stripe-Transformer: deep stripe feature learning for music source separation | 4 |
| Dynamic out-of-vocabulary word registration to language model for speech recognition | 3 |
| Trainable windows for SincNet architecture | 3 |
| Points2Sound: from mono to binaural audio using 3D point cloud scenes | 3 |
| U2-VC: one-shot voice conversion using two-level nested U-structure | 3 |
| Improving speech recognition systems for the morphologically complex Malayalam language using subword tokens for language modeling | 3 |
| Sparse pursuit and dictionary learning for blind source separation in polyphonic music recordings | 3 |
| Attention mechanism combined with residual recurrent neural network for sound event detection and localization | 3 |
| Supervised Attention Multi-Scale Temporal Convolutional Network for monaural speech enhancement | 3 |
| Robustness of ad hoc microphone clustering using speaker embeddings: evaluation under realistic and challenging scenarios | 3 |
| A large TV dataset for speech and music activity detection | 3 |
| Time-domain adaptive attention network for single-channel speech separation | 3 |