OOIR: Observatory of International Research

Papers

(The TQCC of Transactions of the Association for Computational Linguistics is 10. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-05-01 to 2024-05-01.)

Article	Citations
SpanBERT: Improving Pre-training by Representing and Predicting Spans	557
A Primer in BERTology: What We Know About How BERT Works	370
Multilingual Denoising Pre-training for Neural Machine Translation	278
Topic Modeling in Embedding Spaces	258
How Can We Know What Language Models Know?	229
KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation	166
Efficient Content-Based Sparse Attention with Routing Transformers	141
What BERT Is Not: Lessons from a New Suite of Psycholinguistic Diagnostics for Language Models	119
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks	116
Sparse, Dense, and Attentional Representations for Text Retrieval	91
SummEval: Re-evaluating Summarization Evaluation	79
TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages	72
A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation	70
A Survey on Automated Fact-Checking	64
Nested Named Entity Recognition via Second-best Sequence Learning and Decoding	49
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT	48
ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models	47
oLMpics - On What Language Model Pre-training Captures	41
Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages	37
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs	35
Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond	35
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers	34
BLiMP: The Benchmark of Linguistic Minimal Pairs for English	32
Machine Learning–Driven Language Assessment	31
Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations	31
Gender Bias in Machine Translation	30
Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension	30
Measuring and Improving Consistency in Pretrained Language Models	29
Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation	28
Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals	28
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP	27
CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset	27
Theoretical Limitations of Self-Attention in Neural Sequence Models	26
MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering	26
Soloist: Building Task Bots at Scale with Transfer Learning and Machine Teaching	25
Break It Down: A Question Understanding Benchmark	25
Extractive Opinion Summarization in Quantized Transformer Spaces	24
How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering	24
An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models	23
PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them	23
Membership Inference Attacks on Sequence-to-Sequence Models: Is My Data In Your Machine Translation System?	23
AMR-To-Text Generation with Graph Transformer	23
The Flores-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation	23
Syntax-Guided Controlled Generation of Paraphrases	22
Relevance-guided Supervision for OpenQA with ColBERT	22
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets	22
Data-to-text Generation with Macro Planning	21
Modeling Global and Local Node Contexts for Text Generation from Knowledge Graphs	20
SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization	20
MasakhaNER: Named Entity Recognition for African Languages	19
Canine: Pre-training an Efficient Tokenization-Free Encoder for Language Representation	18
Best-First Beam Search	17
Data Weighted Training Strategies for Grammatical Error Correction	17
Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval	16
Beat the AI: Investigating Adversarial Human Annotation for Reading Comprehension	16
Does Syntax Need to Grow on Trees? Sources of Hierarchical Inductive Bias in Sequence-to-Sequence Networks	16
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies	16
An Empirical Survey of Data Augmentation for Limited Data Learning in NLP	15
Unsupervised Quality Estimation for Neural Machine Translation	15
Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times?	15
Quantifying Social Biases in NLP: A Generalization and Empirical Comparison of Extrinsic Fairness Metrics	15
Planning with Learned Entity Prompts for Abstractive Summarization	15
Explanation-Based Human Debugging of NLP Models: A Survey	15
Target-Guided Structured Attention Network for Target-Dependent Sentiment Analysis	14
Context-aware Adversarial Training for Name Regularity Bias in Named Entity Recognition	14
Acoustic-Prosodic and Lexical Cues to Deception and Trust: Deciphering How People Detect Lies	14
Time-Aware Language Models as Temporal Knowledge Bases	13
PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains	13
EDITOR: An Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints	13
Multilingual Autoregressive Entity Linking	13
PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models	12
A Survey on Cross-Lingual Summarization	12
Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering	12
Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings	12
FeTaQA: Free-form Table Question Answering	11
TopiOCQA: Open-domain Conversational Question Answering with Topic Switching	11
ABNIRML: Analyzing the Behavior of Neural IR Models	11
Sentence Similarity Based on Contexts	10
QED: A Framework and Dataset for Explanations in Question Answering	10
Sketch-Driven Regular Expression Generation from Natural Language and Examples	10
Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary	10
Benchmarking Large Language Models for News Summarization	10
Infusing Finetuning with Semantic Dependencies	10
Aligning Faithful Interpretations with their Social Attribution	10
A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing	10
Task-Oriented Dialogue as Dataflow Synthesis	10
Differentiable Subset Pruning of Transformer Heads	10
Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?	10
VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups	10
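
The selection rule stated above the table (keep papers whose citation count reaches the journal's threshold of 10) can be sketched in a few lines of Python. This is only an illustration, not OOIR's actual pipeline: the function name `papers_at_or_above`, the tab-separated input layout, and the below-threshold sample row are assumptions made for the example.

```python
import csv
import io


def papers_at_or_above(tsv_text, threshold):
    """Return (title, citations) pairs whose count is >= threshold.

    Hypothetical helper: assumes one "title<TAB>count" record per line,
    as in the table above. Header and blank lines are skipped.
    """
    reader = csv.reader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    rows = []
    for record in reader:
        # Skip anything that is not exactly "title, integer count".
        if len(record) != 2 or not record[1].strip().isdigit():
            continue
        title, count = record[0].strip(), int(record[1])
        if count >= threshold:
            rows.append((title, count))
    # Most-cited first, matching the ordering of the table.
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Three rows copied from the table, plus one made-up row below the threshold.
sample = (
    "Article\tCitations\n"
    "SpanBERT: Improving Pre-training by Representing and Predicting Spans\t557\n"
    "FeTaQA: Free-form Table Question Answering\t11\n"
    "Sentence Similarity Based on Contexts\t10\n"
    "A hypothetical below-threshold paper\t9\n"
)

selected = papers_at_or_above(sample, threshold=10)
for title, count in selected:
    print(f"{count}\t{title}")
```

The comparison is inclusive (`>=`) because the table contains papers with exactly 10 citations.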