EURASIP Journal on Audio Speech and Music Processing

Papers
(The median citation count of EURASIP Journal on Audio Speech and Music Processing is 1. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-03-01 to 2024-03-01.)
Article | Citations
A review of infant cry analysis and classification | 40
Ensemble of convolutional neural networks to improve animal audio classification | 37
Multiclass audio segmentation based on recurrent neural networks for broadcast domain data | 24
End-to-end speech emotion recognition using a novel context-stacking dilated convolution neural network | 16
Progressive loss functions for speech enhancement with deep neural networks | 14
Performance vs. hardware requirements in state-of-the-art automatic speech recognition | 14
Dynamically localizing multiple speakers based on the time-frequency domain | 14
Accent modification for speech recognition of non-native speakers using neural style transfer | 14
A depthwise separable convolutional neural network for keyword spotting on an embedded system | 12
Towards cross-modal pre-training and learning tempo-spatial characteristics for audio recognition with convolutional and recurrent neural networks | 11
An overview of machine learning and other data-based methods for spatial audio capture, processing, and reproduction | 10
Auxiliary function-based algorithm for blind extraction of a moving speaker | 10
dEchorate: a calibrated room impulse response dataset for echo-aware signal processing | 9
Steerable differential beamformers with planar microphone arrays | 8
Geometry calibration in wireless acoustic sensor networks utilizing DoA and distance information | 8
Components loss for neural networks in mask-based speech enhancement | 8
Automated audio captioning: an overview of recent progress and new challenges | 8
Adversarial joint training with self-attention mechanism for robust end-to-end speech recognition | 8
Acoustic DOA estimation using space alternating sparse Bayesian learning | 7
Deep multiple instance learning for foreground speech localization in ambient audio from wearable devices | 7
Text-to-speech system for low-resource language using cross-lingual transfer learning and data augmentation | 7
DOANet: a deep dilated convolutional neural network approach for search and rescue with drone-embedded sound source localization | 6
Low-complexity artificial noise suppression methods for deep learning-based speech enhancement algorithms | 6
MetaMGC: a music generation framework for concerts in metaverse | 6
Joint speaker localization and array calibration using expectation-maximization | 6
A simulation study on optimal scores for speaker recognition | 6
Comparison of semi-supervised deep learning algorithms for audio classification | 6
Time–frequency scattering accurately models auditory similarities between instrumental playing techniques | 6
Depression-level assessment from multi-lingual conversational speech data using acoustic and text features | 5
RPCA-DRNN technique for monaural singing voice separation | 5
Transformer-based ensemble method for multiple predominant instruments recognition in polyphonic music | 5
Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit | 5
Deep neural networks for automatic speech processing: a survey from large corpora to limited data | 5
Improving low-resource Tibetan end-to-end ASR by multilingual and multilevel unit modeling | 5
Estimation of acoustic echoes using expectation-maximization methods | 5
Noise power spectral density scaled SNR response estimation with restricted range search for sound source localisation using unmanned aerial vehicles | 4
Estimation of playable piano fingering by pitch-difference fingering match model | 4
AUC optimization for deep learning-based voice activity detection | 4
MYRiAD: a multi-array room acoustic database | 4
Benefits of pre-trained mono- and cross-lingual speech representations for spoken language understanding of Dutch dysarthric speech | 4
Audio source separation by activity probability detection with maximum correlation and simplex geometry | 4
A CNN-based approach to identification of degradations in speech signals | 4
Single-channel speech enhancement based on joint constrained dictionary learning | 4
Discriminative features based on modified log magnitude spectrum for playback speech detection | 4
Paralinguistic singing attribute recognition using supervised machine learning for describing the classical tenor solo singing voice in vocal pedagogy | 4
An online algorithm for echo cancellation, dereverberation and noise reduction based on a Kalman-EM Method | 3
Motor data-regularized nonnegative matrix factorization for ego-noise suppression | 3
Timestamp-aligning and keyword-biasing end-to-end ASR front-end for a KWS system | 3
Trainable windows for SincNet architecture | 3
A large TV dataset for speech and music activity detection | 3
Comparative evaluation of interpolation methods for the directivity of musical instruments | 3
NMF-weighted SRP for multi-speaker direction of arrival estimation: robustness to spatial aliasing while exploiting sparsity in the atom-time domain | 3
Sparse pursuit and dictionary learning for blind source separation in polyphonic music recordings | 3
Dynamic out-of-vocabulary word registration to language model for speech recognition | 3
Stripe-Transformer: deep stripe feature learning for music source separation | 3
Beyond the Big Five personality traits for music recommendation systems | 2
Frequency-dependent auto-pooling function for weakly supervised sound event detection | 2
Automatic detection of attachment style in married couples through conversation analysis | 2
Multi-source localization by using offset residual weight | 2
Unsupervised domain adaptation for lip reading based on cross-modal knowledge distillation | 2
Feature compensation based on independent noise estimation for robust speech recognition | 2
Time-domain adaptive attention network for single-channel speech separation | 2
AAM: a dataset of Artificial Audio Multitracks for diverse music information retrieval tasks | 2
Quadratic approach for single-channel noise reduction | 2
Cross-corpus speech emotion recognition using subspace learning and domain adaption | 2
Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders | 2
Masked multi-center angular margin loss for language recognition | 2
An integrated MVDR beamformer for speech enhancement using a local microphone array and external microphones | 2
Data-based spatial audio processing | 2
Feature compensation based on the normalization of vocal tract length for the improvement of emotion-affected speech recognition | 2
Review of methods for coding of speech signals | 2
Automatic discrimination between front and back ensemble locations in HRTF-convolved binaural recordings of music | 1
A neural network-supported two-stage algorithm for lightweight dereverberation on hearing devices | 1
Black-box adversarial attacks through speech distortion for speech emotion recognition | 1
Forward-backward recursive expectation-maximization for concurrent speaker tracking | 1
Convolutional neural networks for the classification of guitar effects and extraction of the parameter settings of single and multi-guitar effects from instrument mixes | 1
Quantifying headphone listening experience in virtual sound environments using distraction | 1
A multichannel learning-based approach for sound source separation in reverberant environments | 1
Multi-task deep cross-attention networks for far-field speaker verification and keyword spotting | 1
Analysis of transition cost and model parameters in speaker diarization for meetings | 1
Attention mechanism combined with residual recurrent neural network for sound event detection and localization | 1
Neural network-based non-intrusive speech quality assessment using attention pooling function | 1
Correction to: An integrated MVDR beamformer for speech enhancement using a local microphone array and external microphones | 1
Language agnostic missing subtitle detection | 1
Pronunciation augmentation for Mandarin-English code-switching speech recognition | 1
Dual input neural networks for positional sound source localization | 1
Points2Sound: from mono to binaural audio using 3D point cloud scenes | 1
Speech emotion recognition based on emotion perception | 1
Robust single- and multi-loudspeaker least-squares-based equalization for hearing devices | 1
U2-VC: one-shot voice conversion using two-level nested U-structure | 1
Musical note onset detection based on a spectral sparsity measure | 1
Direction-of-arrival and power spectral density estimation using a single directional microphone and group-sparse optimization | 1
Robustness of ad hoc microphone clustering using speaker embeddings: evaluation under realistic and challenging scenarios | 1
Residual feedback suppression with extended model-based postfilters | 1
On the selection of the number of beamformers in beamforming-based binaural reproduction | 1
Nonlinear residual echo suppression based on dual-stream DPRNN | 1
Parallel processing of distributed beamforming and multichannel linear prediction for speech denoising and dereverberation in wireless acoustic sensor networks | 1
Multichannel speaker interference reduction using frequency domain adaptive filtering | 1
A noise PSD estimation algorithm using derivative-based high-pass filter in non-stationary noise conditions | 1
Paralinguistic and spectral feature extraction for speech emotion classification using machine learning techniques | 1
Improving sign-algorithm convergence rate using natural gradient for lossless audio compression | 1
A recursive expectation-maximization algorithm for speaker tracking and separation | 1
Learning-based robust speaker counting and separation with the aid of spatial coherence | 1
Three-stage training and orthogonality regularization for spoken language recognition | 1
Improving speech recognition systems for the morphologically complex Malayalam language using subword tokens for language modeling | 1
PlugSonic: a web- and mobile-based platform for dynamic and navigable binaural audio | 1