Speech Communication

Papers
(The TQCC of Speech Communication is 6. The table below lists the papers that meet or exceed that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-05-01 to 2025-05-01.)
Article | Citations
Progress of machine learning based automatic phoneme recognition and its prospect | 81
A comprehensive study on supervised single-channel noisy speech separation with multi-task learning | 68
Phase unwrapping based packet loss concealment using deep neural networks | 61
Psychoacoustic features explain creakiness classifications made by naive and non-naive listeners | 46
Facemask occlusion's impact on L2 listening comprehension | 45
Subband fusion of complex spectrogram for fake speech detection | 44
Editorial Board | 44
Fixed frequency range empirical wavelet transform based acoustic and entropy features for speech emotion recognition | 40
Editorial Board | 34
Editorial Board | 27
Data augmentation for speech separation | 27
Articulation rates’ inter-correlations and discriminating powers in an English speech corpus | 26
Editorial Board | 23
Read speech voice quality and disfluency in individuals with recent suicidal ideation or suicide attempt | 23
A corpus of audio-visual recordings of linguistically balanced, Danish sentences for speech-in-noise experiments | 22
A novel distortion-tolerant speech encryption scheme for secure voice communication | 20
A robust temporal map of speech monitoring from planning to articulation | 20
An introduction to pluricentric languages in speech science and technology | 20
NHSS: A speech and singing parallel database | 20
Perceptual asymmetry between pitch peaks and valleys | 19
Editorial Board | 17
Vocal emotion perception in Mandarin-speaking older adults with hearing loss | 17
Assessing child communication engagement and statistical speech patterns for American English via speech recognition in naturalistic active learning spaces | 17
Editorial Board | 17
The Relationship Between Turn-taking, Vocal Pitch Synchrony, and Rapport in Creative Problem-Solving Communication | 15
Neural speech-rate conversion with multispeaker WaveNet vocoder | 14
The prosody of theme, rheme and focus in Egyptian Arabic: A quantitative investigation of tunes, configurations and speaker variability | 14
Speech intelligibility deterioration for normal hearing and hearing impaired patients with different types of tinnitus | 14
Editorial Board | 14
Investigating a neural all pass warp in modern TTS applications | 14
Efficient acoustic feature transformation in mismatched environments using a Guided-GAN | 13
An automated integrated speech and face image analysis system for the identification of human emotions | 13
Two-stage UNet with channel and temporal-frequency attention for multi-channel speech enhancement | 13
Editorial Board | 13
Automatic Speech Recognition and Pronunciation Error Detection of Dutch Non-native Speech: cumulating speech resources in a pluricentric language | 13
The influence of task engagement on phonetic convergence | 13
Unsupervised Automatic Speech Recognition: A review | 12
HC-APNet: Harmonic Compensation Auditory Perception Network for low-complexity speech enhancement | 12
Learning and controlling the source-filter representation of speech with a variational autoencoder | 11
Deletion and insertion tampering detection for speech authentication based on fluctuating super vector of electrical network frequency | 11
Investigating prosodic entrainment from global conversations to local turns and tones in Mandarin conversations | 11
Frequent-words analysis for forensic speaker comparison | 11
Editorial Board | 10
Evaluating the effects of continuous pitch and speech tempo modifications on perceptual speaker verification performance by familiar and unfamiliar listeners | 10
The interplay of prosodic cues in the L2: How intonation, rhythm, and speech rate in speech by Spanish learners of Dutch contribute to L1 Dutch perceptions of accentedness and comprehensibility | 10
Blind Speech Separation and Dereverberation using neural beamforming | 10
Real-time intelligibility affects the realization of French word-final schwa | 10
A study of correlation between physiological process of articulation and emotions on Mandarin Chinese | 9
Multilingual speech recognition for GlobalPhone languages | 9
Vocal characteristics of accuracy in eyewitness testimony | 9
Effects of urgent speech and congruent/incongruent text on speech intelligibility for older adults in the presence of noise and reverberation | 9
A formant modification method for improved ASR of children’s speech | 9
Differences between listeners with early and late immersion age in spatial release from masking in various acoustic environments | 8
Editorial Board | 8
Effects of voice onset time and place of articulation on perception of dichotic Turkish syllables | 8
The effect of fluency strategy training on interpreter trainees’ speech fluency: Does content familiarity matter? | 8
A new universal camouflage attack algorithm for intelligent speech system | 8
Prosody in narratives: An exploratory study with children with sex chromosomes trisomies | 8
Sequential perception of tone and focus in parallel–A computational simulation | 8
Multi-modal co-learning for silent speech recognition based on ultrasound tongue images | 8
The Lombard intelligibility benefit of native and non-native speech for native and non-native listeners | 8
Analysis of acoustic and voice quality features for the classification of infant and mother vocalizations | 8
Speechformer-CTC: Sequential modeling of depression detection with speech temporal classification | 8
Prosodic alignment toward emotionally expressive speech: Comparing human and Alexa model talkers | 8
Oral configurations during vowel nasalization in English | 7
GM-TCNet: Gated Multi-scale Temporal Convolutional Network using Emotion Causality for Speech Emotion Recognition | 7
Tone-syllable synchrony in Mandarin: New evidence and implications | 7
Editorial Board | 7
Recognition of vocoded speech in English by Mandarin-speaking English-learners | 7
Coarse-to-fine speech separation method in the time-frequency domain | 7
Disordered speech recognition considering low resources and abnormal articulation | 7
Bangladeshi Bangla speech corpus for automatic speech recognition research | 7
Yanbian Korean speakers tend to merge /e/ and /ɛ/ when exposed to Seoul Korean | 7
Progressive channel fusion for more efficient TDNN on speaker verification | 7
Perceptual effects of interpolated Austrian and German standard varieties | 7
Prosody and fluency of Finland Swedish as a second language: Investigating global parameters for automated speaking assessment | 7
Using iterative adaptation and dynamic mask for child speech extraction under real-world multilingual conditions | 6
Nasal coarticulation in Lombard speech | 6
Pathological voice classification using MEEL features and SVM-TabNet model | 6
Arabic Automatic Speech Recognition: Challenges and Progress | 6
Cross-modal information fusion for voice spoofing detection | 6
The Second-Language Productivity of Two Mandarin Tone Sandhi Patterns | 6
Combined approach to dysarthric speaker verification using data augmentation and feature fusion | 6
Deep ad-hoc beamforming based on speaker extraction for target-dependent speech separation | 6
Enhancing bone-conducted speech with spectrum similarity metric in adversarial learning | 6
Efficient time-domain speech separation using short encoded sequence network | 6
Controllable speech synthesis by learning discrete phoneme-level prosodic representations | 6
One-shot emotional voice conversion based on feature separation | 6
Speech pause distribution as an early marker for Alzheimer’s disease | 6
Deep Gaussian process based multi-speaker speech synthesis with latent speaker representation | 6
Role of language familiarity in understanding speech in noise under various acoustic environments | 6
Speakers’ vocal expression of sexual orientation depends on experimenter gender | 6
Editorial Board | 6
Differential constant-beamwidth beamforming with cube arrays | 6
Modulation spectral features for speech emotion recognition using deep neural networks | 6