Transactions of the Association for Computational Linguistics

Papers
(The TQCC of Transactions of the Association for Computational Linguistics is 10. The table below lists the papers that meet or exceed that threshold based on CrossRef citation counts [max. 250 papers]. The list covers papers published in the past four years, i.e., from 2020-07-01 to 2024-07-01.)
Article	Citations
SpanBERT: Improving Pre-training by Representing and Predicting Spans	595
A Primer in BERTology: What We Know About How BERT Works	394
Multilingual Denoising Pre-training for Neural Machine Translation	304
Topic Modeling in Embedding Spaces	283
How Can We Know What Language Models Know?	260
KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation	188
Efficient Content-Based Sparse Attention with Routing Transformers	151
What BERT Is Not: Lessons from a New Suite of Psycholinguistic Diagnostics for Language Models	123
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks	119
Sparse, Dense, and Attentional Representations for Text Retrieval	96
SummEval: Re-evaluating Summarization Evaluation	86
A Survey on Automated Fact-Checking	80
A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation	78
TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages	78
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT	52
Nested Named Entity Recognition via Second-best Sequence Learning and Decoding	52
ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models	51
Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages	46
oLMpics-On What Language Model Pre-training Captures	44
Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond	42
Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations	41
BLiMP: The Benchmark of Linguistic Minimal Pairs for English	39
Machine Learning–Driven Language Assessment	37
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs	37
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers	37
Measuring and Improving Consistency in Pretrained Language Models	35
Gender Bias in Machine Translation	33
Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation	33
Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension	31
Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals	31
How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering	31
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP	30
The Flores-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation	30
CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset	29
Theoretical Limitations of Self-Attention in Neural Sequence Models	28
SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization	27
Break It Down: A Question Understanding Benchmark	27
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets	27
MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering	27
Membership Inference Attacks on Sequence-to-Sequence Models: Is My Data In Your Machine Translation System?	27
Soloist: Building Task Bots at Scale with Transfer Learning and Machine Teaching	26
PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them	26
Extractive Opinion Summarization in Quantized Transformer Spaces	25
An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models	25
Syntax-Guided Controlled Generation of Paraphrases	24
Relevance-guided Supervision for OpenQA with ColBERT	24
AMR-To-Text Generation with Graph Transformer	23
Modeling Global and Local Node Contexts for Text Generation from Knowledge Graphs	22
Data-to-text Generation with Macro Planning	22
An Empirical Survey of Data Augmentation for Limited Data Learning in NLP	22
Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering	20
Canine: Pre-training an Efficient Tokenization-Free Encoder for Language Representation	19
Quantifying Social Biases in NLP: A Generalization and Empirical Comparison of Extrinsic Fairness Metrics	19
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies	19
MasakhaNER: Named Entity Recognition for African Languages	19
Lost in the Middle: How Language Models Use Long Contexts	18
Does Syntax Need to Grow on Trees? Sources of Hierarchical Inductive Bias in Sequence-to-Sequence Networks	18
Time-Aware Language Models as Temporal Knowledge Bases	18
Beat the AI: Investigating Adversarial Human Annotation for Reading Comprehension	18
Data Weighted Training Strategies for Grammatical Error Correction	17
Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times?	17
Best-First Beam Search	17
Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval	16
A Survey on Cross-Lingual Summarization	16
PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains	16
TopiOCQA: Open-domain Conversational Question Answering with Topic Switching	16
Planning with Learned Entity Prompts for Abstractive Summarization	15
Acoustic-Prosodic and Lexical Cues to Deception and Trust: Deciphering How People Detect Lies	15
Explanation-Based Human Debugging of NLP Models: A Survey	15
Multilingual Autoregressive Entity Linking	15
Unsupervised Quality Estimation for Neural Machine Translation	15
Target-Guided Structured Attention Network for Target-Dependent Sentiment Analysis	14
EDITOR: An Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints	14
Context-aware Adversarial Training for Name Regularity Bias in Named Entity Recognition	14
PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models	14
Efficient Methods for Natural Language Processing: A Survey	14
Benchmarking Large Language Models for News Summarization	14
Aligning Faithful Interpretations with their Social Attribution	14
In-Context Retrieval-Augmented Language Models	13
FeTaQA: Free-form Table Question Answering	13
Transformers for Tabular Data Representation: A Survey of Models and Applications	12
QED: A Framework and Dataset for Explanations in Question Answering	12
Sentence Similarity Based on Contexts	12
Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings	12
VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups	12
ABNIRML: Analyzing the Behavior of Neural IR Models	11
A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing	11
Adaptive Semiparametric Language Models	10
Differentiable Subset Pruning of Transformer Heads	10
Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary	10
Infusing Finetuning with Semantic Dependencies	10
Sketch-Driven Regular Expression Generation from Natural Language and Examples	10
Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?	10
Task-Oriented Dialogue as Dataflow Synthesis	10
Better Document-Level Machine Translation with Bayes’ Rule	10