Computer Speech and Language

Papers
(The TQCC of Computer Speech and Language is 8. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-03-01 to 2024-03-01.)
Article | Citations
Voxceleb: Large-scale speaker verification in the wild | 238
ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech | 127
A review of speaker diarization: Recent advances with deep learning | 87
Attention-based BiLSTM fused CNN with gating mechanism model for Chinese long text classification | 79
Turn-taking in Conversational Systems and Human-Robot Interaction: A Review | 64
State-of-the-art speaker recognition with neural network embeddings in NIST SRE18 and Speakers in the Wild evaluations | 58
Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: Theory, implementation and analysis on standard tasks | 57
Deep reinforcement and transfer learning for abstractive text summarization: A review | 48
Transfer learning from adult to children for speech recognition: Evaluation, analysis and recommendations | 45
Hate speech detection on Twitter using transfer learning | 41
Combining context-relevant features with multi-stage attention network for short text classification | 39
Human evaluation of automatically generated text: Current trends and best practice guidelines | 38
Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition | 38
Generalized end-to-end detection of spoofing attacks to automatic speaker recognizers | 37
Spoken language interaction with robots: Recommendations for future research | 36
Adversarial attack and defense strategies for deep speaker recognition systems | 35
Non-negative matrix factorization-based time-frequency feature extraction of voice signal for Parkinson's disease prediction | 34
Multilingual stance detection in social media political debates | 34
Linguistic features and automatic classifiers for identifying mild cognitive impairment and dementia | 33
Enhancing Arabic aspect-based sentiment analysis using deep learning models | 33
Part-of-speech tagging for Arabic tweets using CRF and Bi-LSTM | 30
Generative adversarial networks for speech processing: A review | 29
MuST-C: A multilingual corpus for end-to-end speech translation | 28
Emotion recognition in low-resource settings: An evaluation of automatic feature selection methods | 27
Advances in subword-based HMM-DNN speech recognition across languages | 26
Hate speech and offensive language detection in Dravidian languages using deep ensemble framework | 26
An automatic Alzheimer’s disease classifier based on spontaneous spoken English | 25
tax2vec: Constructing Interpretable Features from Taxonomies for Short Text Classification | 25
Trajectory-based recognition of dynamic Persian sign language using hidden Markov model | 25
The VoicePrivacy 2020 Challenge: Results and findings | 24
BERT syntactic transfer: A computational experiment on Italian, French and English languages | 24
Two decades of speaker recognition evaluation at the national institute of standards and technology | 23
Automatic assessment of intelligibility in speakers with dysarthria from coded telephone speech using glottal features | 22
Investigations on speech recognition systems for low-resource dialectal Arabic–English code-switching speech | 21
A survey on automatic speech recognition systems for Portuguese language and its variations | 18
Offensive language detection in Tamil YouTube comments by adapters and cross-domain knowledge transfer | 17
Arabic speech recognition by end-to-end, modular systems and human | 17
TOP-Rank: A TopicalPostionRank for Extraction and Classification of Keyphrases in Text | 17
Detection of replay spoof speech using teager energy feature cues | 17
Replay spoofing countermeasure using autoencoder and siamese networks on ASVspoof 2019 challenge | 16
Optimization of the area under the ROC curve using neural network supervectors for text-dependent speaker verification | 16
Transfer fine-tuning of BERT with phrasal paraphrases | 16
Voice spoofing detection corpus for single and multi-order audio replays | 16
Verbal fluency in normal aging and cognitive decline: Results of a longitudinal study | 16
A Korean named entity recognition method using Bi-LSTM-CRF and masked self-attention | 16
BERT-hLSTMs: BERT and hierarchical LSTMs for visual storytelling | 16
Comprehensive analysis of aspect term extraction methods using various text embeddings | 15
Analysis of gender and identity issues in depression detection on de-identified speech | 15
Representation transfer learning from deep end-to-end speech recognition networks for the classification of health states from speech | 14
Cluster-based beam search for pointer-generator chatbot grounded by knowledge | 14
Towards the first Maithili part of speech tagger: Resource creation and system development | 14
Named entity recognition using neural language model and CRF for Hindi language | 14
Improving the potential of Enhanced Teager Energy Cepstral Coefficients (ETECC) for replay attack detection | 14
A question answering system in hadith using linguistic knowledge | 14
Analysis and classification of speech sounds of children with autism spectrum disorder using acoustic features | 14
Sequence labeling to detect stuttering events in read speech | 14
Vocal tract shaping of emotional speech | 14
The automatic detection of heart failure using speech signals | 13
Recurrent neural network language generation for spoken dialogue systems | 13
Overview of the seventh Dialog System Technology Challenge: DSTC7 | 13
NEC-TT System for Mixed-Bandwidth and Multi-Domain Speaker Recognition | 13
Assessing Parkinson's disease severity using speech analysis in non-native speakers | 13
Phase sensitive masking-based single channel speech enhancement using conditional generative adversarial network | 13
A Bayesian end-to-end model with estimated uncertainties for simple question answering over knowledge bases | 13
X-vector anonymization using autoencoders and adversarial training for preserving speech privacy | 12
On the effect of dropping layers of pre-trained transformer models | 12
Low resource end-to-end spoken language understanding with capsule networks | 11
Deep generative variational autoencoding for replay spoof detection in automatic speaker verification | 11
A multi-label emoji classification method using balanced pointwise mutual information-based feature selection | 11
Evaluating voice-assistant commands for dementia detection | 11
Accentron: Foreign accent conversion to arbitrary non-native speakers using zero-shot learning | 10
Hybrid-task learning for robust automatic speech recognition | 10
Multilingual and unsupervised subword modeling for zero-resource languages | 10
A speaker verification backend with robust performance across conditions | 9
Siamese networks for large-scale author identification | 9
Low-resource text classification using domain-adversarial learning | 9
A novel word sense disambiguation approach using WordNet knowledge graph | 9
Excitation modelling using epoch features for statistical parametric speech synthesis | 9
Discriminating speech traits of Alzheimer's disease assessed through a corpus of reading task for Spanish language | 9
A classification benchmark for Arabic alphabet phonemes with diacritics in deep neural networks | 8
Leveraging Linguistic Context in Dyadic Interactions to Improve Automatic Speech Recognition for Children | 8
Natural language processing for under-resourced languages: Developing a Welsh natural language toolkit | 8
Voice biometrics security: Extrapolating false alarm rate via hierarchical Bayesian modeling of speaker verification scores | 8
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis | 8
Exploring neural models for predicting dementia from language | 8
Dialect Identification using Chroma-Spectral Shape Features with Ensemble Technique | 8
Towards a unified assessment framework of speech pseudonymisation | 8
An online multi-source summarization algorithm for text readability in topic-based search | 8
QBSUM: A large-scale query-based document summarization dataset from real-world applications | 8
Joint emotion label space modeling for affect lexica | 8
Perceptions and reactions to conversational privacy initiated by a conversational user interface | 8
13 years of speaker recognition research at BUT, with longitudinal analysis of NIST SRE | 8
Towards a speech therapy support system based on phonological processes early detection | 8