OOIR: Observatory of International Research

Papers

(The TQCC of EURASIP Journal on Audio, Speech, and Music Processing is 6. The table below lists the papers that meet or exceed that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-05-01 to 2026-05-01.)

Article	Citations
Learning domain-heterogeneous speaker recognition systems with personalized continual federated learning	47
Advancing guitar emotion recognition through audio data augmentation to enhance smart musical instruments	40
Hybrid lightweight temporal-frequency analysis network for multi-channel speech enhancement	40
MIRACLE—a microphone array impulse response dataset for acoustic learning	31
Generating chord progression from melody with flexible harmonic rhythm and controllable harmonic density	31
Domain-weighted transfer learning and discriminative embeddings for low-resource speaker verification	30
A simplified and controllable model of mode coupling for addressing nonlinear phenomena in sound synthesis processes	29
Speech-dependent data augmentation for own voice reconstruction with hearable microphones in noisy environments	29
AAM: a dataset of Artificial Audio Multitracks for diverse music information retrieval tasks	28
Supervised Attention Multi-Scale Temporal Convolutional Network for monaural speech enhancement	25
Compression of room impulse responses for compact storage and fast low-latency convolution	23
Parameter-efficient adaptation with multi-channel adversarial training for far-field speech recognition	20
Enhancing Speaker Recognition with CRET Model: a fusion of CONV2D, RESNET and ECAPA-TDNN	18
Three-stage training and orthogonality regularization for spoken language recognition	17
Parameter optimisation for a physical model of the vocal system	17
Attention mechanism combined with residual recurrent neural network for sound event detection and localization	17
Sound recurrence analysis for acoustic scene classification	15
Investigations on higher-order spherical harmonic input features for deep learning-based multiple speaker detection and localization	15
Benefits of pre-trained mono- and cross-lingual speech representations for spoken language understanding of Dutch dysarthric speech	15
Neural electric bass guitar synthesis framework enabling attack-sustain-representation-based technique control	14
Silent speech recognition using visual cascading fusion of tongue-lip movements based on pre-trained and fine-tuned model	14
Parallel processing of distributed beamforming and multichannel linear prediction for speech denoising and deverberation in wireless acoustic sensor networks	14
Multi-rate modulation encoding via unsupervised learning for audio event detection	14
Sound field reconstruction using neural processes with dynamic kernels	14
Real-time playing technique recognition embedded in a smart acoustic guitar	12
The whole is greater than the sum of its parts: improving music source separation by bridging networks	11
Vulnerability issues in Automatic Speaker Verification (ASV) systems	11
AudioSet-tools: a Python framework for taxonomy-aware AudioSet curation and reproducible audio research	11
Dance2Music-Diffusion: leveraging latent diffusion models for music generation from dance videos	11
Comparative performance analysis of end-to-end ASR models on Indo-Aryan and Dravidian languages within India’s linguistic landscape	11
Variational Autoencoders for chord sequence generation conditioned on Western harmonic music complexity	11
Data-based spatial audio processing	10
W2VC: WavLM representation based one-shot voice conversion with gradient reversal distillation and CTC supervision	10
An overview of machine learning and other data-based methods for spatial audio capture, processing, and reproduction	10
Automatic detection of attachment style in married couples through conversation analysis	9
Dual-branch attention module-based network with parameter sharing for joint sound event detection and localization	9
Correction: N-dimensional N-microphone sound source localization	9
Masked multi-center angular margin loss for language recognition	9
Recognition of target domain Japanese speech using language model replacement	8
Robust and early howling detection based on a sparsity measure	8
DOA-informed switching independent vector extraction and beamforming for speech enhancement in underdetermined situations	8
Performance evaluation of perceptible impulsive noise detection methods based on auditory models	8
Multi-scale Information Aggregation for Spoofing Detection	7
Comparative study of state-based neural networks for virtual analog audio effects modeling	7
Multilingual speech-to-vocal tract visualization using deep learning for pronunciation training	7
Guest editorial: AI for computational audition—sound and music processing	7
Training audio transformers for cover song identification	7
Significance of relative phase features for shouted and normal speech classification	7
Mi-Go: tool which uses YouTube as data source for evaluating general-purpose speech recognition machine learning models	7
AI-based Chinese-style music generation from video content: a study on cross-modal analysis and generation methods	6
Automatic dysarthria detection and severity level assessment using CWT-layered CNN model	6
Fake speech detection using VGGish with attention block	6
Single-microphone speaker separation and voice activity detection in noisy and reverberant environments	6
Data-driven room acoustic modeling via differentiable feedback delay networks with learnable delay lines	6