Speech Communication

Papers
(The H4-Index of Speech Communication is 17. The table below lists the papers above that threshold, based on CrossRef citation counts (max. 250 papers). It covers publications from the past four years, i.e., from 2020-04-01 to 2024-04-01.)
Article | Citations
Speech emotion recognition using fusion of three multi-task learning-based classifiers: HSF-DNN, MS-CNN and LLD-RNN | 89
Learning deep multimodal affective features for spontaneous speech emotion recognition | 51
Egyptian Arabic speech emotion recognition using prosodic, spectral and wavelet features | 49
Masked multi-head self-attention for causal speech enhancement | 45
CN-Celeb: Multi-genre speaker recognition | 41
Emotional voice conversion: Theory, databases and ESD | 36
Survey on bimodal speech emotion recognition from acoustic and linguistic information fusion | 33
The Hearing-Aid Speech Perception Index (HASPI) Version 2 | 30
Two-stage dimensional emotion recognition by fusing predictions of acoustic and text networks using SVM | 24
A review of multi-objective deep learning speech denoising methods | 21
Fusion of deep learning features with mixture of brain emotional learning for audio-visual emotion recognition | 21
Parallel Representation Learning for the Classification of Pathological Speech: Studies on Parkinson's Disease and Cleft Lip and Palate | 20
Unsupervised Automatic Speech Recognition: A review | 19
CyTex: Transforming speech to textured images for speech emotion recognition | 19
Multi-modal speech emotion recognition using self-attention mechanism and multi-scale fusion framework | 19
An Iterative Graph Spectral Subtraction Method for Speech Enhancement | 18
A time–frequency smoothing neural network for speech enhancement | 18
Improving generative adversarial networks for speech enhancement through regularization of latent representations | 17
Speech enhancement using a DNN-augmented colored-noise Kalman filter | 17
Automatic accent identification as an analytical tool for accent robust automatic speech recognition | 17
Learning affective representations based on magnitude and dynamic relative phase information for speech emotion recognition | 17