Transactions of the Association for Computational Linguistics

Papers
(The median citation count of Transactions of the Association for Computational Linguistics is 3. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-11-01 to 2024-11-01.)
Article | Citations
SpanBERT: Improving Pre-training by Representing and Predicting Spans | 641
A Primer in BERTology: What We Know About How BERT Works | 441
Multilingual Denoising Pre-training for Neural Machine Translation | 348
Topic Modeling in Embedding Spaces | 326
How Can We Know What Language Models Know? | 313
KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation | 212
Efficient Content-Based Sparse Attention with Routing Transformers | 169
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks | 131
What BERT Is Not: Lessons from a New Suite of Psycholinguistic Diagnostics for Language Models | 126
SummEval: Re-evaluating Summarization Evaluation | 112
Sparse, Dense, and Attentional Representations for Text Retrieval | 108
A Survey on Automated Fact-Checking | 97
TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages | 88
A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation | 81
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT | 61
Nested Named Entity Recognition via Second-best Sequence Learning and Decoding | 61
ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models | 59
Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages | 51
Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations | 50
Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond | 49
BLiMP: The Benchmark of Linguistic Minimal Pairs for English | 47
oLMpics-On What Language Model Pre-training Captures | 46
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP | 42
Lost in the Middle: How Language Models Use Long Contexts | 42
Measuring and Improving Consistency in Pretrained Language Models | 42
Machine Learning–Driven Language Assessment | 41
Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation | 41
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs | 40
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers | 39
How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering | 38
SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization | 37
The Flores-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation | 36
Gender Bias in Machine Translation | 35
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets | 32
Theoretical Limitations of Self-Attention in Neural Sequence Models | 32
CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset | 32
Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension | 32
Benchmarking Large Language Models for News Summarization | 32
Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals | 32
An Empirical Survey of Data Augmentation for Limited Data Learning in NLP | 31
In-Context Retrieval-Augmented Language Models | 31
MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering | 30
PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them | 30
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies | 29
Membership Inference Attacks on Sequence-to-Sequence Models: Is My Data In Your Machine Translation System? | 29
Soloist: Building Task Bots at Scale with Transfer Learning and Machine Teaching | 28
Break It Down: A Question Understanding Benchmark | 28
Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering | 28
Canine: Pre-training an Efficient Tokenization-Free Encoder for Language Representation | 27
An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models | 27
Relevance-guided Supervision for OpenQA with ColBERT | 26
Extractive Opinion Summarization in Quantized Transformer Spaces | 26
Syntax-Guided Controlled Generation of Paraphrases | 25
Modeling Global and Local Node Contexts for Text Generation from Knowledge Graphs | 24
AMR-To-Text Generation with Graph Transformer | 24
Data-to-text Generation with Macro Planning | 23
Time-Aware Language Models as Temporal Knowledge Bases | 23
Beat the AI: Investigating Adversarial Human Annotation for Reading Comprehension | 23
Quantifying Social Biases in NLP: A Generalization and Empirical Comparison of Extrinsic Fairness Metrics | 22
Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision | 22
MasakhaNER: Named Entity Recognition for African Languages | 22
PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains | 21
A Survey on Cross-Lingual Summarization | 21
Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times? | 21
Does Syntax Need to Grow on Trees? Sources of Hierarchical Inductive Bias in Sequence-to-Sequence Networks | 20
TopiOCQA: Open-domain Conversational Question Answering with Topic Switching | 19
Explanation-Based Human Debugging of NLP Models: A Survey | 19
Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval | 19
Best-First Beam Search | 19
Planning with Learned Entity Prompts for Abstractive Summarization | 19
Efficient Methods for Natural Language Processing: A Survey | 19
Multilingual Autoregressive Entity Linking | 17
Data Weighted Training Strategies for Grammatical Error Correction | 17
FeTaQA: Free-form Table Question Answering | 16
Aligning Faithful Interpretations with their Social Attribution | 16
Acoustic-Prosodic and Lexical Cues to Deception and Trust: Deciphering How People Detect Lies | 16
Unsupervised Quality Estimation for Neural Machine Translation | 16
EDITOR: An Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints | 15
Visual Spatial Reasoning | 15
Context-aware Adversarial Training for Name Regularity Bias in Named Entity Recognition | 15
Target-Guided Structured Attention Network for Target-Dependent Sentiment Analysis | 15
Transformers for Tabular Data Representation: A Survey of Models and Applications | 15
VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups | 14
Hallucinations in Large Multilingual Translation Models | 14
Sentence Similarity Based on Contexts | 14
PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models | 14
Pre-train, Prompt, and Recommendation: A Comprehensive Survey of Language Modeling Paradigm Adaptations in Recommender Systems | 13
Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings | 13
ABNIRML: Analyzing the Behavior of Neural IR Models | 13
Generative Spoken Dialogue Language Modeling | 13
QED: A Framework and Dataset for Explanations in Question Answering | 12
Sketch-Driven Regular Expression Generation from Natural Language and Examples | 12
A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing | 12
Task-Oriented Dialogue as Dataflow Synthesis | 12
Locally Typical Sampling | 12
Reducing Conversational Agents’ Overconfidence Through Linguistic Calibration | 12
Adaptive Semiparametric Language Models | 11
How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty in Text Generation Using RAVEN | 11
Infusing Finetuning with Semantic Dependencies | 11
Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary | 11
Better Document-Level Machine Translation with Bayes’ Rule | 11
Let’s Play Mono-Poly: BERT Can Reveal Words’ Polysemy Level and Partitionability into Senses | 10
Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand? | 10
Differentiable Subset Pruning of Transformer Heads | 10
Efficient Long-Text Understanding with Short-Text Models | 10
FaithDial: A Faithful Benchmark for Information-Seeking Dialogue | 10
WikiAsp: A Dataset for Multi-domain Aspect-based Summarization | 10
Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark | 9
Revisiting Few-shot Relation Classification: Evaluation Data and Classification Schemes | 9
Phonotactic Complexity and Its Trade-offs | 9
Ultra-fine Entity Typing with Indirect Supervision from Natural Language Inference | 9
Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining | 9
Testing the Predictions of Surprisal Theory in 11 Languages | 9
Improving Candidate Generation for Low-resource Cross-lingual Entity Linking | 9
Decontextualization: Making Sentences Stand-Alone | 9
Compositional Evaluation on Japanese Textual Entailment and Similarity | 9
Generate, Annotate, and Learn: NLP with Synthetic Text | 9
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation | 8
Hate Speech Classifiers Learn Normative Social Stereotypes | 8
♫ MuSiQue: Multihop Questions via Single-hop Question Composition | 8
Neural Modeling for Named Entities and Morphology (NEMO2) | 8
True Few-Shot Learning with Prompts—A Real-World Perspective | 8
A Statistical Analysis of Summarization Evaluation Metrics Using Resampling Methods | 8
Revisiting Multi-Domain Machine Translation | 8
Augmenting Transformers with KNN-Based Composite Memory for Dialog | 8
Understanding and Detecting Hallucinations in Neural Machine Translation via Model Introspection | 8
It’s not Rocket Science: Interpreting Figurative Language in Narratives | 7
Pretraining the Noisy Channel Model for Task-Oriented Dialogue | 7
Unsupervised Bitext Mining and Translation via Self-Trained Contextual Embeddings | 7
Morphology Matters: A Multilingual Language Modeling Analysis | 7
Multi-task Active Learning for Pre-trained Transformer-based Models | 7
Automatically Correcting Large Language Models: Surveying the Landscape of Diverse Automated Correction Strategies | 7
Modeling Content and Context with Deep Relational Learning | 7
Decoding Brain Activity Associated with Literal and Metaphoric Sentence Comprehension Using Distributional Semantic Models | 7
Characterizing English Variation across Social Media Communities with BERT | 7
ParsiNLU: A Suite of Language Understanding Challenges for Persian | 7
Large Language Models Enable Few-Shot Clustering | 7
Reducing Confusion in Active Learning for Part-Of-Speech Tagging | 7
Syntactic Structure Distillation Pretraining for Bidirectional Encoders | 7
Idiomatic Expression Identification using Semantic Compatibility | 7
Dialogue State Tracking with Incremental Reasoning | 7
Evaluating Document Coherence Modeling | 6
Aggretriever: A Simple Approach to Aggregate Textual Representations for Robust Dense Passage Retrieval | 6
Controllable Summarization with Constrained Markov Decision Process | 6
ProoFVer: Natural Logic Theorem Proving for Fact Verification | 6
Document Summarization with Latent Queries | 6
A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings | 6
Coreference Resolution through a seq2seq Transition-Based System | 6
Abstractive Meeting Summarization: A Survey | 6
Relational Memory-Augmented Language Models | 6
Heterogeneous Supervised Topic Models | 6
End-to-end Argument Mining with Cross-corpora Multi-task Learning | 6
Self-supervised Regularization for Text Classification | 6
Latent Compositional Representations Improve Systematic Generalization in Grounded Question Answering | 6
Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations | 6
Deciphering Undersegmented Ancient Scripts Using Phonetic Prior | 6
Evaluating Explanations: How Much Do Explanations from the Teacher Aid Students? | 6
Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering | 6
Neural OCR Post-Hoc Correction of Historical Corpora | 6
AMR Similarity Metrics from Principles | 6
What Helps Transformers Recognize Conversational Structure? Importance of Context, Punctuation, and Labels in Dialog Act Recognition | 5
Investigating Reasons for Disagreement in Natural Language Inference | 5
DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon | 5
Temporal Effects on Pre-trained Models for Language Processing Tasks | 5
Czech Grammar Error Correction with a Large and Diverse Corpus | 5
Data-to-text Generation with Variational Sequential Planning | 5
Saturated Transformers are Constant-Depth Threshold Circuits | 5
A Computational Framework for Slang Generation | 5
On the Role of Negative Precedent in Legal Outcome Prediction | 5
LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation | 5
Erratum: “BLiMP: The Benchmark of Linguistic Minimal Pairs for English” | 5
Fact Checking with Insufficient Evidence | 5
Neuron-level Interpretation of Deep NLP Models: A Survey | 5
Supervised Gradual Machine Learning for Aspect-Term Sentiment Analysis | 4
Conversation Graph: Data Augmentation, Training, and Evaluation for Non-Deterministic Dialogue Management | 4
Uncertainty Estimation and Reduction of Pre-trained Models for Text Regression | 4
Formal Language Recognition by Hard Attention Transformers: Perspectives from Circuit Complexity | 4
Meta-Learning the Difference: Preparing Large Language Models for Efficient Adaptation | 4
Word Acquisition in Neural Language Models | 4
OPAL: Ontology-Aware Pretrained Language Model for End-to-End Task-Oriented Dialogue | 4
Reproducible and Efficient Benchmarks for Hyperparameter Optimization of Neural Machine Translation Systems | 4
Learning English with Peppa Pig | 4
High Quality Rather than High Model Probability: Minimum Bayes Risk Decoding with Neural Metrics | 4
On the Robustness of Dialogue History Representation in Conversational Question Answering: A Comprehensive Study and a New Prompt-based Method | 4
How Furiously Can Colorless Green Ideas Sleep? Sentence Acceptability in Context | 4
Consistent Transcription and Translation of Speech | 4
Meta-Learning a Cross-lingual Manifold for Semantic Parsing | 4
On the Effect of Anticipation on Reading Times | 4
Weisfeiler-Leman in the Bamboo: Novel AMR Graph Metrics and a Benchmark for AMR Graph Similarity | 4
Rank-Aware Negative Training for Semi-Supervised Text Classification | 4
What Do Self-Supervised Speech Models Know About Words? | 3
How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure | 3
Lexically Aware Semi-Supervised Learning for OCR Post-Correction | 3
What Does My QA Model Know? Devising Controlled Probes Using Expert Knowledge | 3
Break, Perturb, Build: Automatic Perturbation of Reasoning Paths Through Question Decomposition | 3
Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale | 3
Memory-Based Semantic Parsing | 3
Revisiting Negation in Neural Machine Translation | 3
Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement | 3
Model Compression for Domain Adaptation through Causal Effect Estimation | 3
MENLI: Robust Evaluation Metrics from Natural Language Inference | 3
Identity-Based Patterns in Deep Convolutional Networks: Generative Adversarial Phonology and Reduplication | 3
Strong Equivalence of TAG and CCG | 3
The Return of Lexical Dependencies: Neural Lexicalized PCFGs | 3
Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages | 3
An End-to-End Contrastive Self-Supervised Learning Framework for Language Understanding | 3
Partially Supervised Named Entity Recognition via the Expected Entity Ratio Loss | 3
On the Relationships Between the Grammatical Genders of Inanimate Nouns and Their Co-Occurring Adjectives and Verbs | 3
On the Difficulty of Translating Free-Order Case-Marking Languages | 3
A Multi-Level Optimization Framework for End-to-End Text Augmentation | 3
Neural Event Semantics for Grounded Language Understanding | 3
Reasoning over Public and Private Data in Retrieval-Based Systems | 3
Word Representation Learning in Multimodal Pre-Trained Transformers: An Intrinsic Evaluation | 3
Unsupervised Abstractive Opinion Summarization by Generating Sentences with Tree-Structured Topic Guidance | 3
Unit Testing for Concepts in Neural Networks | 3
Unsupervised Discourse Constituency Parsing Using Viterbi EM | 3
Questions Are All You Need to Train a Dense Passage Retriever | 3
Learning Lexical Subspaces in a Distributional Vector Space | 3
Compositional Generalization in Multilingual Semantic Parsing over Wikidata | 3
Robust Dialogue State Tracking with Weak Supervision and Sparse Data | 3
On Decoding Strategies for Neural Text Generators | 3