Scientific Data

Papers
(The H4-Index of Scientific Data is 78. The table below lists the papers at or above that threshold based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-01-01 to 2026-01-01.)
Article | Citations
Author Correction: The Plegma dataset: Domestic appliance-level and aggregate electricity demand with metadata from Greece | 1794
Author Correction: Mobility networks in Greater Mexico City | 711
A database of seed plants on taxonomy, geography and ecology in the Qinling-Daba Mountains and adjacent areas | 706
Tsunami Runup Survey Data From The Taan Fjord Landslide Event | 623
Multi-proteomics and interactome dataset of tick-borne encephalitis virus infected host cells | 451
Linking Research Data with Physically Preserved Research Materials in Chemistry | 447
Chromosome-level genome assembly of the Rhizoctonia solani | 436
GARD-LENS: A downscaled large ensemble dataset for understanding future climate and its uncertainties | 426
Shotgun metagenomes from productive lakes in an urban region of Sweden | 375
Author Correction: GERDA: The German Election Database | 373
The interplay between brain and behavior during development: A multisite effort to generate and share simulated datasets | 313
Bioclimatic atlas of the terrestrial Arctic | 308
Occurrence of human infection with Salmonella Typhi in sub-Saharan Africa | 257
CreelCat, a Catalog of United States Inland Creel and Angler Survey Data | 251
In toto light sheet fluorescence microscopy live imaging datasets of Ceratitis capitata embryonic development | 250
A dataset of scientific dates from archaeological sites in eastern Africa spanning 5000 BCE to 1800 CE | 238
What’s the TEE: Metrics of Temperature Extremes in Europe NUTS Regions (1980-2024) | 201
Mediterranean marine sediment cores database: unlocking paleoclimatic signals for the last 20,000 years | 183
Dataset on the effects of psychological care on depression and suicide ideation in underrepresented children | 178
Near-complete reference genome assembly of Hoya carnosa | 178
A Field-Level Asset Mapping Dataset for England’s Agricultural Sector | 165
A Simulated Comprehensive Photon Flux Shielding Spectra Dataset for Advanced Radiation Safety Assessment | 157
Empowering open data sharing for social good: a privacy-aware approach | 156
Enrichment of lung cancer computed tomography collections with AI-derived annotations | 151
Chromosome-level assemblies of cultivated water chestnut Trapa bicornis and its wild relative Trapa incisa | 151
The first high-quality chromosome-level genome of Parupeneus biaculeatus using HiFi and Hi-C data | 149
A chromosome-scale assembly of Ormosia boluoensis (Fabaceae) | 133
Author Correction: Database covering the prayer movements which were not available previously | 128
A thermosurvey dataset: Older adults’ experiences and adaptation to urban heat and climate change | 128
A curated dataset of great ape genome diversity | 126
Generating FAIR research data in experimental tribology | 126
Global Ocean Particulate Organic Phosphorus, Carbon, Oxygen for Respiration, and Nitrogen (GO-POPCORN) | 125
An open dataset for oracle bone character recognition and decipherment | 124
Molecular landscape of respiratory infection: A large-scale, multi-centre blood transcriptome dataset | 123
Chromosome-level genome assembly of rock carp (Procypris rabaudi) | 122
PPB-Affinity: Protein-Protein Binding Affinity dataset for AI-based protein drug discovery | 119
A Frontal Ablation Dataset for 49 Tidewater Glaciers in Greenland | 119
ML-extendable framework for multiphysics-multiscale simulation workflow and data management using Kadi4Mat | 117
Students’ performance dataset for using machine learning technique in physics education research | 117
Multi-Domain Indoor Dataset for Visual Place Recognition and Anomaly Detection by Mobile Robots | 116
An open-access database of nature-based carbon offset project boundaries | 114
Statistical performance indicators and index—a new tool to measure country statistical capacity | 111
Author Correction: Whales from space dataset, an annotated satellite image dataset of whales for training machine learning models | 111
SDUST2023GRA_MSS: the new global marine gravity anomaly model determined from mean sea surface model | 107
A longitudinal cross-country dataset on agricultural productivity and welfare in Sub-Saharan Africa | 106
Canopy height model and NAIP imagery pairs across CONUS | 106
A global dataset of fossil fungi records from the Cenozoic | 105
The Latin American Legislators Dataset | 105
A database of steric and electronic properties of heteroaryl substituents | 104
One-year high-frequency environmental and behavioral data from ALAN experience in a French coastal area | 101
The Carbon Catalogue, carbon footprints of 866 commercial products from 8 industry sectors and 5 continents | 101
Spatio-temporal dataset (2009–2012) of Culicoides spp., vectors of livestock viruses, in France | 101
A haplotype-resolved chromosomal-level genome assembly of Oxalis articulata | 100
Pennsieve: A Collaborative Platform for Translational Neuroscience and Beyond | 100
F-DATA: A Fugaku Workload Dataset for Job-centric Predictive Modelling in HPC Systems | 97
Home monitoring with connected mobile devices for asthma attack prediction with machine learning | 94
Slovak database of speech affected by neurodegenerative diseases | 93
Multimodal Data for the Detection of Freezing of Gait in Parkinson’s Disease | 93
A focus groups study on data sharing and research data management | 92
A semantic approach to mapping the Provenance Ontology to Basic Formal Ontology | 89
The Superfund Research Program Analytics Portal: linking environmental chemical exposure to biological phenotypes | 89
An 8-model ensemble of CMIP6-derived ocean surface wave climate | 87
A dataset of the daily edge of each polynya in the Antarctic | 86
Ultra-deep sequencing data from a liquid biopsy proficiency study demonstrating analytic validity | 85
Optimizing drug combination and mechanism analysis based on risk pathway crosstalk in pan cancer | 84
A century-long eddy-resolving simulation of global oceanic large- and mesoscale state | 82
A VibV Dataset Integrating Vibration and Vision for Enhanced Safety in Self-Driving Tasks | 82
EEG Dataset for the Recognition of Different Emotions Induced in Voice-User Interaction | 82
Machine learning-ready remote sensing data for Maya archaeology | 81
Reinterpretation of prostate cancer pathology by Appl1, Sortilin and Syndecan-1 biomarkers | 81
A daily high-resolution (1 km) human thermal index collection over the North China Plain from 2003 to 2020 | 81
An Enhanced Phenology Dataset for Global Drylands from 2001 to 2019 | 81
T1DiabetesGranada: a longitudinal multi-modal dataset of type 1 diabetes mellitus | 81
Quantum computing dataset of maximum independent set problem on king lattice of over hundred Rydberg atoms | 81
NeuMa - the absolute Neuromarketing dataset en route to an holistic understanding of consumer behaviour | 81
M3OT: A Multi-Drone Multi-Modality dataset for Multi-Object Tracking | 80
Reconstructing high-quality ground-level ozone records from 1980 to 2012 in central and eastern China | 79
A Synthetic Dataset for Semantic Segmentation of Waterbodies in Out-of-Distribution Situations | 78
MarNemaFunDiv: a first comprehensive dataset of functional traits for marine nematodes | 78
FIGARO-E3: a high-resolution extended multi-regional input-output database consistent with official statistics | 78
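
The threshold used for this list can be illustrated with a short sketch. It assumes the H4-Index follows the usual h-index definition (the largest h such that at least h papers have h or more citations each), restricted to papers from the four-year window; the page itself does not spell out the formula. The function name `h_index` and the sample counts are hypothetical and not taken from the table.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations.

    Assumption: the H4-Index is this standard h-index applied only to
    papers published in the four-year window.
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` papers all have at least `rank` citations
        else:
            break      # counts only decrease from here, so h cannot grow
    return h


# Toy citation counts for illustration only (not from the table)
sample = [10, 8, 5, 4, 3]
print(h_index(sample))  # 4
```

Under this assumption, the counts listed here are consistent with the stated index: the 78th-ranked paper has 78 citations, while the 79th also has 78, which is below its rank.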