Speech Communication

Papers
(The TQCC of Speech Communication is 6. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-06-01 to 2026-06-01.)
Article | Citations
A comprehensive study on supervised single-channel noisy speech separation with multi-task learning | 69
Psychoacoustic features explain creakiness classifications made by naive and non-naive listeners | 63
Editorial Board | 58
Subband fusion of complex spectrogram for fake speech detection | 58
Automatic classification of vocal intensity categories from amplitude-normalized speech signals by comparing acoustic features and classifier models | 52
Editorial Board | 48
Data augmentation for speech separation | 38
A corpus of audio-visual recordings of linguistically balanced, Danish sentences for speech-in-noise experiments | 36
An introduction to pluricentric languages in speech science and technology | 32
Self-Supervised Learning for Speaker Recognition: A study and review | 32
Fixed frequency range empirical wavelet transform based acoustic and entropy features for speech emotion recognition | 31
A robust temporal map of speech monitoring from planning to articulation | 30
A novel distortion-tolerant speech encryption scheme for secure voice communication | 29
Vocal emotion perception in Mandarin-speaking older adults with hearing loss | 28
Editorial Board | 27
Editorial Board | 25
The prosody of theme, rheme and focus in Egyptian Arabic: A quantitative investigation of tunes, configurations and speaker variability | 23
Topological data analysis of human vowels: Persistent homologies across representation spaces | 22
Blood pressure monitoring from naturally recorded speech sounds: advancements and future prospects | 22
Editorial Board | 21
HC-APNet: Harmonic Compensation Auditory Perception Network for low-complexity speech enhancement | 20
Editorial Board | 18
Frequent-words analysis for forensic speaker comparison | 18
Investigating prosodic entrainment from global conversations to local turns and tones in Mandarin conversations | 18
Two-stage UNet with channel and temporal-frequency attention for multi-channel speech enhancement | 17
Efficient acoustic feature transformation in mismatched environments using a Guided-GAN | 17
Expectation of speech style improves audio-visual perception of English vowels | 17
"I said simPle, not symBol!" Is clear speech tailored to the listener's feedback | 17
Automatic Speech Recognition and Pronunciation Error Detection of Dutch Non-native Speech: cumulating speech resources in a pluricentric language | 16
Evaluating the effects of continuous pitch and speech tempo modifications on perceptual speaker verification performance by familiar and unfamiliar listeners | 15
Deletion and insertion tampering detection for speech authentication based on fluctuating super vector of electrical network frequency | 15
Paradigm fusion learning from overt and silent chinese speech based on pseudo-siamese multiscale capsule neural network | 14
Editorial Board | 14
Real-time intelligibility affects the realization of French word-final schwa | 14
A study of correlation between physiological process of articulation and emotions on Mandarin Chinese | 14
Learning and controlling the source-filter representation of speech with a variational autoencoder | 13
Hand gesture realisation of contrastive focus in real-time whisper-to-speech synthesis: Investigating the transfer from implicit to explicit control of intonation | 13
Towards unsupervised speech recognition without pronunciation models | 13
Influence of speech-in-noise perception, gender, and age on lipreading ability for monosyllabic words | 13
Vocal characteristics of accuracy in eyewitness testimony | 13
Dynamic graph learning with gated convolutions for single-channel speech separation | 12
Effects of voice onset time and place of articulation on perception of dichotic Turkish syllables | 12
Sequential perception of tone and focus in parallel–A computational simulation | 12
A new universal camouflage attack algorithm for intelligent speech system | 12
The effect of fluency strategy training on interpreter trainees’ speech fluency: Does content familiarity matter? | 12
Prosody in narratives: An exploratory study with children with sex chromosomes trisomies | 12
Yanbian Korean speakers tend to merge /e/ and /ɛ/ when exposed to Seoul Korean | 11
Coarse-to-fine speech separation method in the time-frequency domain | 11
Disordered speech recognition considering low resources and abnormal articulation | 11
Speechformer-CTC: Sequential modeling of depression detection with speech temporal classification | 11
Adaptive weighting in a transformer framework for multimodal emotion recognition | 11
Editorial Board | 11
Multi-modal co-learning for silent speech recognition based on ultrasound tongue images | 11
Exploiting Locality Sensitive Hashing - Clustering and gloss feature for sign language production | 11
Progressive channel fusion for more efficient TDNN on speaker verification | 10
Modulation spectral features for speech emotion recognition using deep neural networks | 10
FinnAffect: An affective speech corpus for spontaneous Finnish | 10
Using iterative adaptation and dynamic mask for child speech extraction under real-world multilingual conditions | 10
GM-TCNet: Gated Multi-scale Temporal Convolutional Network using Emotion Causality for Speech Emotion Recognition | 10
Nasal coarticulation in Lombard speech | 10
Perceptual effects of interpolated Austrian and German standard varieties | 10
Tone-syllable synchrony in Mandarin: New evidence and implications | 10
Prosody and fluency of Finland Swedish as a second language: Investigating global parameters for automated speaking assessment | 10
Speakers’ vocal expression of sexual orientation depends on experimenter gender | 9
Combined approach to dysarthric speaker verification using data augmentation and feature fusion | 9
A cross-modal attention model with contextual enhancements for speech emotion recognition | 9
Pathological voice classification using MEEL features and SVM-TabNet model | 9
Editorial Board | 9
Role of language familiarity in understanding speech in noise under various acoustic environments | 9
Efficient time-domain speech separation using short encoded sequence network | 9
Arabic Automatic Speech Recognition: Challenges and Progress | 9
Enhancing bone-conducted speech with spectrum similarity metric in adversarial learning | 9
Automatic speech recognition technology to evaluate an audiometric word recognition test: A preliminary investigation | 9
One-shot emotional voice conversion based on feature separation | 8
Accurate synthesis of dysarthric Speech for ASR data augmentation | 8
Editorial Board | 8
Differential constant-beamwidth beamforming with cube arrays | 8
Cross-modal information fusion for voice spoofing detection | 8
MC-Mamba: Cross-modal target speaker extraction model based on multiple consistency | 8
Editorial Board | 8
Space-and-speaker-aware acoustic modeling with effective data augmentation for recognition of multi-array conversational speech | 8
Learning transfer from singing to speech: Insights from vowel analyses in aging amateur singers and non-singers | 7
CSLNSpeech: Solving the extended speech separation problem with the help of Chinese sign language | 7
Prosodic characteristics of deceptive picture descriptions in Finnish: Acoustics, beliefs, self-evaluations, and deception theories | 7
Assessing Cancer-Related Cognitive Impairment for breast cancer survivors with speech analysis | 7
Controllable speech synthesis by learning discrete phoneme-level prosodic representations | 7
Advancing automatic speech recognition using feature fusion with self-supervised learning features: A case study on Fearless Steps Apollo corpus | 7
Editorial Board | 7
Exploring LoRA variants to adapt whisper models for robust recognition of children’s speech | 7
Categorization of patients affected with neurogenerative dysarthria among Hindi-speaking population and analyzing factors causing reduced speech intelligibility at the human-machine interface | 7
Addressing the semi-open set dialect recognition problem under resource-efficient considerations | 7
The perception of intonational peaks and valleys: The effects of plateaux, declination and experimental task | 7
Effect of individual characteristics on impressions of one’s own recorded voice | 6
Domain adaptation using non-parallel target domain corpus for self-supervised learning-based automatic speech recognition | 6
MaTSE: A hybrid Mamba-Transformer model for monaural Speech Enhancement | 6
Editorial Board | 6
An adaptive autoregressive pre-whitener for speech and acoustic signals based on parametric NMF | 6
The role of visual cues indicating onset times of target speech syllables in release from informational or energetic masking | 6
Chinese speech intelligibility and speech intelligibility index for the elderly | 6
Robustness of emotion recognition in dialogue systems: A study on third-party API integrations and black-box attacks | 6
Editorial Board | 6
Recursive Feature Diversity Network for audio super-resolution | 6
The Role of Auditory and Visual Cues in the Perception of Mandarin Emotional Speech in Male Drug Addicts | 6
Robust prosody modeling for synthetic speech detection | 6