OOIR: Observatory of International Research

Papers

(The TQCC of Speech Communication is 6. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-08-01 to 2025-08-01.)

Article	Citations
A comprehensive study on supervised single-channel noisy speech separation with multi-task learning	90
Phase unwrapping based packet loss concealment using deep neural networks	79
Psychoacoustic features explain creakiness classifications made by naive and non-naive listeners	73
Editorial Board	54
Progress of machine learning based automatic phoneme recognition and its prospect	50
Facemask occlusion's impact on L2 listening comprehension	48
Subband fusion of complex spectrogram for fake speech detection	45
Editorial Board	35
Editorial Board	35
Articulation rates’ inter-correlations and discriminating powers in an English speech corpus	31
Data augmentation for speech separation	30
An introduction to pluricentric languages in speech science and technology	29
Editorial Board	28
A corpus of audio-visual recordings of linguistically balanced, Danish sentences for speech-in-noise experiments	27
A novel distortion-tolerant speech encryption scheme for secure voice communication	26
Perceptual asymmetry between pitch peaks and valleys	25
NHSS: A speech and singing parallel database	22
A robust temporal map of speech monitoring from planning to articulation	22
Read speech voice quality and disfluency in individuals with recent suicidal ideation or suicide attempt	20
Fixed frequency range empirical wavelet transform based acoustic and entropy features for speech emotion recognition	19
Editorial Board	19
Assessing child communication engagement and statistical speech patterns for American English via speech recognition in naturalistic active learning spaces	19
Editorial Board	18
Editorial Board	18
Vocal emotion perception in Mandarin-speaking older adults with hearing loss	18
The prosody of theme, rheme and focus in Egyptian Arabic: A quantitative investigation of tunes, configurations and speaker variability	18
Speech intelligibility deterioration for normal hearing and hearing impaired patients with different types of tinnitus	17
Blood pressure monitoring from naturally recorded speech sounds: advancements and future prospects	17
Investigating a neural all pass warp in modern TTS applications	17
Efficient acoustic feature transformation in mismatched environments using a Guided-GAN	16
Editorial Board	16
Frequent-words analysis for forensic speaker comparison	15
Expectation of speech style improves audio-visual perception of English vowels	15
"I said simPle, not symBol!" Is clear speech tailored to the listener's feedback	15
HC-APNet: Harmonic Compensation Auditory Perception Network for low-complexity speech enhancement	15
Deletion and insertion tampering detection for speech authentication based on fluctuating super vector of electrical network frequency	14
Investigating prosodic entrainment from global conversations to local turns and tones in Mandarin conversations	14
Two-stage UNet with channel and temporal-frequency attention for multi-channel speech enhancement	14
The influence of task engagement on phonetic convergence	14
Unsupervised Automatic Speech Recognition: A review	13
Neural speech-rate conversion with multispeaker WaveNet vocoder	13
Automatic Speech Recognition and Pronunciation Error Detection of Dutch Non-native Speech: cumulating speech resources in a pluricentric language	13
Multilingual speech recognition for GlobalPhone languages	11
Real-time intelligibility affects the realization of French word-final schwa	11
Effects of urgent speech and congruent/incongruent text on speech intelligibility for older adults in the presence of noise and reverberation	11
Evaluating the effects of continuous pitch and speech tempo modifications on perceptual speaker verification performance by familiar and unfamiliar listeners	11
A study of correlation between physiological process of articulation and emotions on Mandarin Chinese	11
Editorial Board	11
A formant modification method for improved ASR of children’s speech	10
The interplay of prosodic cues in the L2: How intonation, rhythm, and speech rate in speech by Spanish learners of Dutch contribute to L1 Dutch perceptions of accentedness and comprehensibility	9
Effects of voice onset time and place of articulation on perception of dichotic Turkish syllables	9
Learning and controlling the source-filter representation of speech with a variational autoencoder	9
The Lombard intelligibility benefit of native and non-native speech for native and non-native listeners	9
Vocal characteristics of accuracy in eyewitness testimony	9
Blind Speech Separation and Dereverberation using neural beamforming	9
Sequential perception of tone and focus in parallel–A computational simulation	9
Analysis of acoustic and voice quality features for the classification of infant and mother vocalizations	8
Speechformer-CTC: Sequential modeling of depression detection with speech temporal classification	8
Yanbian Korean speakers tend to merge /e/ and /ɛ/ when exposed to Seoul Korean	8
Coarse-to-fine speech separation method in the time-frequency domain	8
GM-TCNet: Gated Multi-scale Temporal Convolutional Network using Emotion Causality for Speech Emotion Recognition	8
Using iterative adaptation and dynamic mask for child speech extraction under real-world multilingual conditions	8
Recognition of vocoded speech in English by Mandarin-speaking English-learners	8
Prosody in narratives: An exploratory study with children with sex chromosomes trisomies	8
A new universal camouflage attack algorithm for intelligent speech system	8
Multi-modal co-learning for silent speech recognition based on ultrasound tongue images	8
Disordered speech recognition considering low resources and abnormal articulation	8
Progressive channel fusion for more efficient TDNN on speaker verification	8
Bangladeshi Bangla speech corpus for automatic speech recognition research	8
Prosodic alignment toward emotionally expressive speech: Comparing human and Alexa model talkers	8
Differences between listeners with early and late immersion age in spatial release from masking in various acoustic environments	8
The effect of fluency strategy training on interpreter trainees’ speech fluency: Does content familiarity matter?	8
Exploiting Locality Sensitive Hashing - Clustering and gloss feature for sign language production	8
Editorial Board	8
Perceptual effects of interpolated Austrian and German standard varieties	8
The Second-Language Productivity of Two Mandarin Tone Sandhi Patterns	7
Enhancing bone-conducted speech with spectrum similarity metric in adversarial learning	7
Arabic Automatic Speech Recognition: Challenges and Progress	7
Editorial Board	7
Speech pause distribution as an early marker for Alzheimer’s disease	7
One-shot emotional voice conversion based on feature separation	7
Efficient time-domain speech separation using short encoded sequence network	7
Combined approach to dysarthric speaker verification using data augmentation and feature fusion	7
Deep Gaussian process based multi-speaker speech synthesis with latent speaker representation	7
Comparing the nativeness vs. intelligibility approach in prosody instruction for developing speaking skills by interpreter trainees: An experimental study	7
Tone-syllable synchrony in Mandarin: New evidence and implications	7
Modulation spectral features for speech emotion recognition using deep neural networks	7
Role of language familiarity in understanding speech in noise under various acoustic environments	7
Nasal coarticulation in Lombard speech	7
Prosody and fluency of Finland Swedish as a second language: Investigating global parameters for automated speaking assessment	7
Cross-modal information fusion for voice spoofing detection	7
Pathological voice classification using MEEL features and SVM-TabNet model	7
Speakers’ vocal expression of sexual orientation depends on experimenter gender	7
Deep ad-hoc beamforming based on speaker extraction for target-dependent speech separation	7
Differential constant-beamwidth beamforming with cube arrays	7
Controllable speech synthesis by learning discrete phoneme-level prosodic representations	6
Learning transfer from singing to speech: Insights from vowel analyses in aging amateur singers and non-singers	6
Editorial Board	6
Fundamental frequency feature warping for frequency normalization and data augmentation in child automatic speech recognition	6
Accurate synthesis of dysarthric Speech for ASR data augmentation	6
Addressing the semi-open set dialect recognition problem under resource-efficient considerations	6
Space-and-speaker-aware acoustic modeling with effective data augmentation for recognition of multi-array conversational speech	6
CSLNSpeech: Solving the extended speech separation problem with the help of Chinese sign language	6
Editorial Board	6