OOIR: Observatory of International Research

Papers

(The H4-Index of Language Resources and Evaluation is 14. The table below lists the papers that meet or exceed that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-08-01 to 2025-08-01.)

Article	Citations
Strategies for managing time and costs in speech corpus creation: insights from the Slovenian ARTUR corpus	61
Spelling errors made by people with dyslexia	38
Lahjoita puhetta: a large-scale corpus of spoken Finnish with some benchmarks	37
Commonsense based text mining on urban policy	28
From LIMA to DeepLIMA: following a new path of interoperability	26
Speech acts in the Dutch COVID-19 Press Conferences	23
A survey on geocoding: algorithms and datasets for toponym resolution	23
Hope speech detection in Spanish	19
Investigating the role of swear words in abusive language detection tasks	18
AC-IQuAD: Automatically Constructed Indonesian Question Answering Dataset by Leveraging Wikidata	17
The narratives of war (NoW) corpus of written testimonies of the Russia-Ukraine war	16
The Visual Language Research Corpus (VLRC): an annotated corpus of comics from Asia, Europe, and the United States	15
Brazilian Portuguese corpora for teaching and translation: the CoMET project	14
Prompting encoder models for zero-shot classification: a cross-domain study in Italian	14
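
The page does not spell out how the H4-Index is computed. As a minimal sketch, assuming it is the standard h-index restricted to papers from the four-year window (the largest h such that h papers have at least h citations each), the value of 14 can be checked against the counts in the table above. The function name `h_index` and the list `table_counts` are illustrative, not part of OOIR.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each.

    Assumption: the H4-Index is this standard h-index, applied only to
    papers published within the four-year window.
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Citation counts copied from the table above, in table order.
table_counts = [61, 38, 37, 28, 26, 23, 23, 19, 18, 17, 16, 15, 14, 14]

print(h_index(table_counts))  # 14
```

Papers with fewer than 14 citations are not listed, but under this definition they could not raise the result, so the 14 listed rows are enough to reproduce the index.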