Scientific Data

Papers
(The H4-Index of Scientific Data is 57. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-11-01 to 2024-11-01.)
Article | Citations
County-level CO2 emissions and sequestration in China during 1997–2017 | 523
MIMIC-IV, a freely accessible electronic health record dataset | 447
Dynamic World, Near real-time global 10 m land use land cover mapping | 336
Version 3 of the Global Aridity Index and Potential Evapotranspiration Database | 256
The World Checklist of Vascular Plants, a continuously updated resource for exploring global plant diversity | 241
A patient-centric dataset of images and metadata for identifying melanomas using clinical context | 202
MedMNIST v2 - A large-scale lightweight benchmark for 2D and 3D biomedical image classification | 193
Highly accurate long-read HiFi sequencing data for five complex genomes | 185
The 10-m crop type maps in Northeast China during 2017–2019 | 177
Systematic phenotyping and characterization of the 5xFAD mouse model of Alzheimer’s disease | 174
NASA Global Daily Downscaled Projections, CMIP6 | 172
The COUGHVID crowdsourcing dataset, a corpus for the study of large-scale cough analysis algorithms | 168
The human O-GlcNAcome database and meta-analysis | 161
Multiscale dynamic human mobility flow dataset in the U.S. during the COVID-19 epidemic | 154
A global record of annual terrestrial Human Footprint dataset from 2000 to 2018 | 150
Global 1 km × 1 km gridded revised real gross domestic product and electricity consumption during 1992–2019 based on calibrated nighttime light data | 150
Data sharing practices and data availability upon request differ across scientific disciplines | 143
Operationalizing the CARE and FAIR Principles for Indigenous data futures | 140
Carbon Monitor, a near-real-time daily dataset of global CO2 emission from fossil fuel and cement production | 137
Introducing the FAIR Principles for research software | 136
National contributions to climate change due to historical emissions of carbon dioxide, methane, and nitrous oxide since 1850 | 128
COVID-CT-MD, COVID-19 computed tomography scan dataset applicable in machine learning and deep learning | 122
VinDr-CXR: An open dataset of chest X-rays with radiologist’s annotations | 120
Kvasir-Capsule, a video capsule endoscopy dataset | 113
Building a knowledge graph to enable precision medicine | 105
Gridded daily weather data for North America with comprehensive uncertainty quantification | 102
Bias-corrected CMIP6 global dataset for dynamical downscaling of the historical and future climate (1979–2100) | 99
Chinese provincial multi-regional input-output database for 2012, 2015, and 2017 | 90
AusTraits, a curated plant trait database for the Australian flora | 90
PERSIANN-CCS-CDR, a 3-hourly 0.04° global precipitation climate data record for heavy precipitation studies | 88
GEOM, energy-annotated molecular conformations for property prediction and molecular generation | 83
The Amsterdam Open MRI Collection, a set of multimodal MRI datasets for individual difference analyses | 82
Hourly potential evapotranspiration at 0.1° resolution for the global land surface from 1981-present | 82
A whole-body FDG-PET/CT Dataset with manually annotated Tumor Lesions | 79
An automatic multi-tissue human fetal brain segmentation benchmark using the Fetal Tissue Annotation Dataset | 76
COVIDiSTRESS Global Survey dataset on psychological and behavioural consequences of the COVID-19 outbreak | 75
Refractiveindex.info database of optical constants | 73
CT-ORG, a new dataset for multiple organ segmentation in computed tomography | 73
DISPERSE, a trait database to assess the dispersal potential of European aquatic macroinvertebrates | 72
Global monthly gridded atmospheric carbon dioxide concentrations under the historical and future scenarios | 72
Global trends and forecasts of breast cancer incidence and deaths | 70
GlobSnow v3.0 Northern Hemisphere snow water equivalent dataset | 70
Expanded dataset of mechanical properties and observed phases of multi-principal element alloys | 68
Global daily 1 km land surface precipitation based on cloud cover-informed downscaling | 67
FIVES: A Fundus Image Dataset for Artificial Intelligence based Vessel Segmentation | 66
Global soil moisture data derived through machine learning trained with in-situ measurements | 65
A SARS-CoV-2 cytopathicity dataset generated by high-content screening of a large drug repurposing collection | 64
QM7-X, a comprehensive dataset of quantum-mechanical properties spanning the chemical space of small organic molecules | 64
High-resolution (1 km) Köppen-Geiger maps for 1901–2099 based on constrained CMIP6 projections | 64
ClimateEU, scale-free climate normals, historical time series, and future projections for Europe | 62
Downscaling GRACE total water storage change using partial least squares regression | 62
Probabilistic atlas for the language network based on precision fMRI data from >800 individuals | 62
Author Correction: The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data | 61
Local sea level trends, accelerations and uncertainties over 1993–2019 | 61
Gridded fossil CO2 emissions and related O2 combustion consistent with national inventories 1959–2018 | 60
A multi-site, multi-disorder resting-state magnetic resonance image database | 59
LCVP, The Leipzig catalogue of vascular plants, a new taxonomic reference list for all known vascular plants | 58
Vectorized rooftop area data for 90 cities in China | 57
A long term global daily soil moisture dataset derived from AMSR-E and AMSR2 (2002–2019) | 57
An update on global mining land use | 57
Creation and validation of a chest X-ray dataset with eye-tracking and report dictation for AI development | 57
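
The page does not say exactly how the H4-Index is computed. As a minimal sketch, assuming it is the standard h-index (the largest h such that h papers each have at least h citations) restricted to the four-year window, the value of 57 can be reproduced from the citation counts listed above. The function name `h_index` and the variable `TABLE_COUNTS` are illustrative, not taken from the site.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Citation counts copied from the table above, in listed order (61 papers).
TABLE_COUNTS = [
    523, 447, 336, 256, 241, 202, 193, 185, 177, 174,
    172, 168, 161, 154, 150, 150, 143, 140, 137, 136,
    128, 122, 120, 113, 105, 102, 99, 90, 90, 88,
    83, 82, 82, 79, 76, 75, 73, 73, 72, 72,
    70, 70, 68, 67, 66, 65, 64, 64, 64, 62,
    62, 62, 61, 61, 60, 59, 58, 57, 57, 57,
    57,
]

if __name__ == "__main__":
    print(h_index(TABLE_COUNTS))  # 57
```

Under this assumption the 57th-ranked paper has 58 citations and the 58th has 57, so the index stops at 57. The table has 61 rows rather than 57 because it also lists the papers tied at exactly 57 citations.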