Scientific Data

Papers
(The TQCC of Scientific Data is 6. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-02-01 to 2025-02-01.)
Article | Citations
Chromosomal level genome assemblies of two Malus crabapple cultivars Flame and Royalty | 597
A biological ocean data reformatting effort | 381
A chromosome-level genome assembly of East Asia endemic minnow Zacco platypus | 279
HYADES - A Global Archive of Annual Maxima Daily Precipitation | 257
Physiological data for affective computing in HRI with anthropomorphic service robots: the AFFECT-HRI data set | 214
DNA barcode reference library for the West Sahara-Sahel reptiles | 198
Lake volume variation in the endorheic basin of the Tibetan Plateau from 1989 to 2019 | 190
An ocean front dataset for the Mediterranean sea and southwest Indian ocean | 187
In toto light sheet fluorescence microscopy live imaging datasets of Ceratitis capitata embryonic development | 182
The bii4africa dataset of faunal and floral population intactness estimates across Africa’s major land uses | 174
Integrated microbiome-metabolome-genome axis data of Laiwu and Lulai pigs | 168
Single-cell integrative analysis reveals consensus cancer cell states and clinical relevance in breast cancer | 165
SDUST2023GRA_MSS: the new global marine gravity anomaly model determined from mean sea surface model | 163
The global Multidimensional Poverty Index: Harmonised level estimates and their changes over time | 160
Recovery of nearly 3,000 archaeal genomes from 152 terrestrial geothermal spring metagenomes | 157
A National Synthetic Populations Dataset for the United States | 125
Linking Research Data with Physically Preserved Research Materials in Chemistry | 122
Propithecus verreauxi demography spanning 40 years at Bezà Mahafaly Special Reserve, southwest Madagascar | 120
Author Correction: Microbial Metagenomes Across a Complete Phytoplankton Bloom Cycle: High-Resolution Sampling Every 4 Hours Over 22 Days | 117
Australian automotive workers and community leaders interview dataset following 2017 assembly plant closures | 113
A time-varying index for agricultural suitability across Europe from 1500–2000 | 113
A high-quality chromosome-scale genome assembly of the Cherokee rose (Rosa laevigata) | 106
Chromosome-level genome assembly of tetraploid Chinese cherry (Prunus pseudocerasus) | 105
PCMMD: A Novel Dataset of Plasma Cells to Support the Diagnosis of Multiple Myeloma | 104
Dataset for developing deep learning models to assess crack width and self-healing progress in concrete | 101
A travelable area boundary dataset for visual navigation of field robots | 95
A catalogue of land-based adaptation and mitigation solutions to tackle climate change | 92
Mapping of 10-km daily diffuse solar radiation across China from reanalysis data and a Machine-Learning method | 91
An fMRI dataset for whole-body somatotopic mapping in humans | 91
Thirty years of volcano geodesy from space at Campi Flegrei caldera (Italy) | 89
Multi sequence average templates for aging and neurodegenerative disease populations | 83
Robotic monitoring of Alpine screes: a dataset from the EU Natura2000 habitat 8110 in the Italian Alps | 82
A Multimodal Dataset for Mixed Emotion Recognition | 77
Dataset on the effects of psychological care on depression and suicide ideation in underrepresented children | 76
NeuroLINCS Proteomics: Defining human-derived iPSC proteomes and protein signatures of pluripotency | 73
Automatic question answering for multiple stakeholders, the epidemic question answering dataset | 70
Preoperative CT and survival data for patients undergoing resection of colorectal liver metastases | 68
A proteomic and RNA-seq transcriptomic dataset of capsaicin-aggravated mouse chronic colitis model | 67
Inflation of test accuracy due to data leakage in deep learning-based classification of OCT images | 66
EEG Dataset for the Recognition of Different Emotions Induced in Voice-User Interaction | 66
Occurrence of human infection with Salmonella Typhi in sub-Saharan Africa | 66
Global Ocean Particulate Organic Phosphorus, Carbon, Oxygen for Respiration, and Nitrogen (GO-POPCORN) | 65
A comprehensive dataset of annotated brain metastasis MR images with clinical and radiomic data | 64
M4Raw: A multi-contrast, multi-repetition, multi-channel MRI k-space dataset for low-field MRI research | 64
An automatic multi-tissue human fetal brain segmentation benchmark using the Fetal Tissue Annotation Dataset | 63
The RETA Benchmark for Retinal Vascular Tree Analysis | 62
ISLES 2022: A multi-center magnetic resonance imaging stroke lesion segmentation dataset | 61
A Prolonged Artificial Nighttime-light Dataset of China (1984-2020) | 60
Optical emissivity dataset of multi-material heterogeneous designs generated with automated figure extraction | 60
Shear modulus reduction and damping ratios curves joined with engineering geological units in Italy | 60
CheXmask: a large-scale dataset of anatomical segmentation masks for multi-center chest x-ray images | 60
Three-Dimensional Motion Capture Data of a Movement Screen from 183 Athletes | 60
Chromosome-level genome assembly and annotation of the prickly nightshade Solanum rostratum Dunal | 59
Time series of useful energy consumption patterns for energy system modeling | 59
An open dataset for oracle bone character recognition and decipherment | 59
A western United States snow reanalysis dataset over the Landsat era from water years 1985 to 2021 | 59
Chromosome-level genome assembly of Oriental chestnut gall wasp (Dryocosmus kuriphilus) | 58
R code and downstream analysis objects for the scRNA-seq atlas of normal and tumorigenic human breast tissue | 57
A dataset of alternately located segments in protein crystal structures | 57
Coassembly and binning of a twenty-year metagenomic time-series from Lake Mendota | 57
Harmonized nitrogen and phosphorus concentrations in the Mississippi/Atchafalaya River Basin from 1980 to 2018 | 56
A Multi-Stain Breast Cancer Histological Whole-Slide-Image Data Set from Routine Diagnostics | 56
Single-cell RNA-sequencing of virus-specific cellular immune responses in chronic hepatitis B patients | 55
A database of chemical absorption in human skin with mechanistic modeling applications | 55
Automated BigSMILES conversion workflow and dataset for homopolymeric macromolecules | 55
High-resolution freshwater dissolved calcium and pH data layers for Canada and the United States | 55
A dataset for assessing phytolith data for implementation of the FAIR data principles | 54
Total irrigation by crop in the Continental United States from 2008 to 2020 | 54
Fluorescent Neuronal Cells v2: multi-task, multi-format annotations for deep learning in microscopy | 54
Towards Gender Harmony Dataset: Gender Beliefs and Gender Stereotypes in 62 Countries | 54
High-quality faba bean reference transcripts generated using PacBio and Illumina RNA-seq data | 54
A chromosome-level genome assembly of the spider mite Tetranychus piercei McGregor | 54
A global synthesis of high-resolution stable isotope data from benthic foraminifera of the last deglaciation | 53
Unified access to up-to-date residue-level annotations from UniProtKB and other biological databases for PDB data | 53
Soul: An OCTA dataset based on Human Machine Collaborative Annotation Framework | 51
The Avian Diet Database as a source of quantitative information on bird diets | 51
Genome-wide identification of accessible chromatin regions by ATAC-seq upon induction of the transcription factor bZIP11 in Arabidopsis | 51
Comprehensive energy demand and usage data for building automation | 50
A georeferenced rRNA amplicon database of aquatic microbiomes from South America | 50
A Python library to check the level of anonymity of a dataset | 50
Chromosome-level Genome Assembly of Theretra japonica (Lepidoptera: Sphingidae) | 49
Chromosome-level genome assembly of ridgetail white shrimp Exopalaemon carinicauda | 48
A high-quality genome assembly and annotation of Thielaviopsis punctulata DSM102798 | 48
Chromosome-level genome assembly of the flower thrips Frankliniella intonsa | 48
A haplotype-resolved genome assembly of Malus domestica ‘Red Fuji’ | 47
Type B Aortic Dissection CTA Collection with True and False Lumen Expert Annotations for the Development of AI-based Algorithms | 47
Chromosome-level genome assembly of bean flower thrips Megalurothrips usitatus (Thysanoptera: Thripidae) | 47
Effects of Candidatus Liberibacter asiaticus infection on metagenome of Diaphorina citri gut endosymbiont | 46
A unified dataset for the city-scale traffic assignment model in 20 U.S. cities | 46
Telomere-to-telomere genome assembly of sorghum | 46
Fluorescence microscopy and correlative brightfield videos of mitochondria and vesicles in H9c2 cardiomyoblasts | 46
De novo transcriptomes of cave and surface isopod crustaceans: insights from 11 species across three suborders | 46
Aci-bench: a Novel Ambient Clinical Intelligence Dataset for Benchmarking Automatic Visit Note Generation | 46
Generating FAIR research data in experimental tribology | 46
Dataset of solution-based inorganic materials synthesis procedures extracted from the scientific literature | 45
Shotgun metagenomes from productive lakes in an urban region of Sweden | 45
A literature-derived dataset on risk factors for dry eye disease | 45
A nineteenth-century urban Ottoman population micro dataset: Data extraction and relational database curation from the 1840s pre-census Bursa population registers | 45
Capturing the COVID-19 Crisis through Public Health and Social Measures Data Science | 44
A minimum data standard for vector competence experiments | 44
Aerodynamic characterisation of porous fairings: pressure drop and Laser Doppler Velocimetry measurements | 43
Optimizing drug combination and mechanism analysis based on risk pathway crosstalk in pan cancer | 43
Dataset of human-single neuron activity during a Sternberg working memory task | 42
A large-scale dataset for end-to-end table recognition in the wild | 42
An interprovincial input–output database distinguishing firm ownership in China from 1997 to 2017 | 42
Improving data archiving practices in ancient genomics | 41
Dataset of the rumen microbiota and epithelial transcriptomics and proteomics in goat affected by solid diets | 41
Small non-coding RNA transcriptomic profiling in adult and fetal human brain | 41
Long-term gridded land evapotranspiration reconstruction using Deep Forest with high generalizability | 41
Analysis of AlphaMissense data in different protein groups and structural context | 40
Multi-omics and single cell characterization of cancer immunosenescence landscape | 40
Motor evoked potentials for multiple sclerosis, a multiyear follow-up dataset | 40
A bankfull geometry dataset for major exorheic rivers on the Qinghai-Tibet Plateau | 40
LoDoPaB-CT, a benchmark dataset for low-dose computed tomography reconstruction | 40
Normative volumes and relaxation times at 3T during brain development | 40
High-content siRNA 3D co-cultures to identify myoepithelial cell-derived breast cancer suppressor proteins | 40
An improved dataset of force fields, electronic and physicochemical descriptors of metabolic substrates | 40
OSeMOSYS Global, an open-source, open data global electricity system model generator | 39
Multimodal Data for the Detection of Freezing of Gait in Parkinson’s Disease | 39
Phytoplankton optical fingerprint libraries for development of phytoplankton ocean color satellite products | 39
Chromosome-level genome assembly of chub mackerel (Scomber japonicus) from the Indo-Pacific Ocean | 37
A Time-of-Flight and Radar Dataset of a neonatal Thorax Simulator with synchronized Reference Sensor Signals for respiratory Rate Detection | 37
De novo transcriptomes of six calanoid copepods (Crustacea): a resource for the discovery of novel genes | 37
FIKElectricity: A Electricity Consumption Dataset from Three Restaurant Kitchens in Portugal | 37
A transcriptome dataset for gonadectomy-induced changes in rat spinal cord | 37
Charge balance calculations for mixed salt systems applied to a large dataset from the built environment | 37
The R package for DICOM to brain imaging data structure conversion | 36
Cancer-Alterome: a literature-mined resource for regulatory events caused by genetic alterations in cancer | 36
A dataset of micro biodiversity in benthic sediment at a global scale | 36
Annual Impervious Surface Data from 2001–2020 for West African Countries: Ghana, Togo, Benin and Nigeria | 36
CaliPopGen: A genetic and life history database for the fauna and flora of California | 36
Bridging the gap in fishing effort mapping: a spatially-explicit fisheries dataset for Campanian MPAs, Italy | 36
Extended-wavelength diffuse reflectance spectroscopy dataset of animal tissues for bone-related biomedical applications | 36
High-resolution climate projection dataset based on CMIP6 for Peru and Ecuador: BASD-CMIP6-PE | 36
Bioclimatic atlas of the terrestrial Arctic | 36
De novo chromosome-level genome assembly of Chinese motherwort (Leonurus japonicus) | 36
Enrichment of lung cancer computed tomography collections with AI-derived annotations | 35
High-resolution estimates of social distancing feasibility, mapped for urban areas in sub-Saharan Africa | 35
Datasets of in vitro clonogenic assays showing low dose hyper-radiosensitivity and induced radioresistance | 35
Attributes of the food and physical activity built environments from the Southern Cone of Latin America | 35
National-scale 1-km maps of hospital travel time and hospital accessibility in China | 34
City-scale Vehicle Trajectory Data from Traffic Camera Videos | 34
A Synthetic Dataset for Semantic Segmentation of Waterbodies in Out-of-Distribution Situations | 34
AI-Generated Annotations Dataset for Diverse Cancer Radiology Collections in NCI Image Data Commons | 34
Individual attendance data for over 30 years of international climate change talks | 34
Three genome assemblies of Opsariichthys bidens from Yangzte River, Pearl River and Qiantang River basins | 34
Telomere-to-telomere Genome Assembly of two representative Asian and European pear cultivars | 34
A global genome dataset for Salmonella Gallinarum recovered between 1920 and 2024 | 34
A high-quality genome of the early diverging tychoplanktonic diatom Paralia guyana | 33
HVSMR-2.0: A 3D cardiovascular MR dataset for whole-heart segmentation in congenital heart disease | 33
QUaternary fault strain INdicators database - QUIN 1.0 - first release from the Apennines of central Italy | 33
Monitoring the West Nile virus outbreaks in Italy using open access data | 33
Single-cell RNA-seq of primary bone marrow neutrophils from female and male adult mice | 33
The short-term mortality fluctuation data series, monitoring mortality shocks across time and space | 33
Ancient Yi Script Handwriting Sample Repository | 33
A chromosome-level genome assembly of an avivorous bat species (Nyctalus aviator) | 33
The role of plant functional groups mediating climate impacts on carbon and biodiversity of alpine grasslands | 33
Chromosome-level genome assembly of Korean holoparasitic plants, Orobanche coerulescens | 33
A large-scale fMRI dataset for the visual processing of naturalistic scenes | 32
Metagenome sequencing and 103 microbial genomes from ballast water and sediments | 32
High-resolution bathymetries and shorelines for the Great Lakes of the White Nile basin | 32
3D surgical instrument collection for computer vision and extended reality | 32
Ten lessons for data sharing with a data commons | 32
An integrated metagenomic, metabolomic and transcriptomic survey of Populus across genotypes and environments | 32
A near-complete genome assembly of Monochamus alternatus a major vector beetle of pinewood nematode | 32
A human lower-limb biomechanics and wearable sensors dataset during cyclic and non-cyclic activities | 32
Collegiate athlete brain data for white matter mapping and network neuroscience | 32
An EEG database for the cognitive assessment of motor imagery during walking with a lower-limb exoskeleton | 31
East Australian Current velocity, temperature and salinity data products | 31
Daily 1 km terrain resolving maps of surface fine particulate matter for the western United States 2003–2021 | 31
Anatomical structures, cell types, and biomarkers of the healthy human blood vasculature | 31
The PhanSST global database of Phanerozoic sea surface temperature proxy data | 31
Biological traits of marine benthic invertebrates in Northwest Europe | 31
Genome assembly and population genomic data of a pulmonate snail Ellobium chinense | 31
A publicly available newborn ear shape dataset for medical diagnosis of auricular deformities | 30
The first chromosome-level genome of the stag beetle Dorcus hopei Saunders, 1854 (Coleoptera: Lucanidae) | 30
Computed tomography reconstructions of burrow networks for the Opheliid polychaete, Armandia cirrhosa | 30
Making Mathematical Research Data FAIR: Pathways to Improved Data Sharing | 30
High-resolution transcriptome datasets during embryogenesis of plant-parasitic nematodes | 30
Chromosome-level genome assembly of Cnidium monnieri, a highly demanded traditional Chinese medicine | 30
An image-based screen for secreted proteins involved in breast cancer G0 cell cycle arrest | 29
Chromosome-level assemblies of cultivated water chestnut Trapa bicornis and its wild relative Trapa incisa | 29
De novo transcriptome reconstruction in aquacultured early life stages of the cephalopod Octopus vulgaris | 29
Telomere-to-telomere genome assembly of Eleocharis dulcis and expression profiles during corm development | 29
An update of skin permeability data based on a systematic review of recent research | 29
Spanish electoral archive. SEA database | 29
EStreams: An integrated dataset and catalogue of streamflow, hydro-climatic and landscape variables for Europe | 29
A comprehensive genomic and transcriptomic dataset of triple-negative breast cancers | 29
The Superfund Research Program Analytics Portal: linking environmental chemical exposure to biological phenotypes | 29
Hydrological model-based streamflow reconstruction for Indian sub-continental river basins, 1951–2021 | 28
A dataset for measuring the impact of research data and their curation | 28
Profile observations of the Arctic atmospheric boundary layer with the BELUGA tethered balloon during MOSAiC | 28
A Non-Laboratory Gait Dataset of Full Body Kinematics and Egocentric Vision | 28
Chromosome-level genome assembly of the threatened resource plant Cinnamomum chago | 28
Head model dataset for mixed reality navigation in neurosurgical interventions for intracranial lesions | 28
Autonomic aging – A dataset to quantify changes of cardiovascular autonomic function during healthy aging | 27
Pre- and post-surgery brain tumor multimodal magnetic resonance imaging data optimized for large scale computational modelling | 27
Harmonized geospatial data to support infrastructure siting feasibility planning for energy system transitions | 27
A chromosomal-level genome assembly of Serrognathus titanus Boisduval, 1835 (Coleoptera: Lucanidae) | 27
The chromosome-level genomes of the herbal magnoliids Warburgia ugandensis and Saururus chinensis | 27
Haplotype-resolved chromosomal-level assembly of wasabi (Eutrema japonicum) genome | 27
High frequency Lunar Penetrating Radar quality control, editing and processing of Chang’E-4 lunar mission | 27
Ultra-deep sequencing data from a liquid biopsy proficiency study demonstrating analytic validity | 27
The ImSURE phantoms: a digital dataset for radiomic software benchmarking and investigation | 27
HiFi chromosome-scale diploid assemblies of the grape rootstocks 110R, Kober 5BB, and 101–14 Mgt | 27
Exploring the electrophysiology of Parkinson’s disease with magnetoencephalography and deep brain recordings | 27
Near-real-time MODIS-derived vegetation index data products and online services for CONUS based on NASA LANCE | 26
A comprehensive multimodal dataset for contactless lip reading and acoustic analysis | 26
Mass cytometry analysis of blood from peanut-sensitized tolerant and clinically allergic infants | 26
A comprehensive genomic catalog from global cold seeps | 26
A century-long eddy-resolving simulation of global oceanic large- and mesoscale state | 26
A database of seed plants on taxonomy, geography and ecology in the Qinling-Daba Mountains and adjacent areas | 26
Passive microwave Arctic sea ice melt onset dates from the advanced horizontal range algorithm 1979–2022 | 26
Mass cytometric and transcriptomic profiling of epithelial-mesenchymal transitions in human mammary cell lines | 26
Database of SARS-CoV-2 and coronaviruses kinetics relevant for assessing persistence in food processing plants | 26
COVID-BEHAVE dataset: measuring human behaviour during the COVID-19 pandemic | 26
A subnational reproductive, maternal, newborn, child, and adolescent health and development atlas of India | 26
Reservoir inventory for China in 2016 and 2021 | 25
Haplotype-resolved genome assembly of Coriaria nepalensis a non-legume nitrogen-fixing shrub | 25
Chromosome-level genome assembly of the critically endangered Baer’s pochard (Aythya baeri) | 25
Realization times of energetic modernization measures for buildings based on interviews with craftworkers | 25
CreelCat, a Catalog of United States Inland Creel and Angler Survey Data | 25
Positive and Negative Affect Schedule in early COVID-19 pandemic | 25
EPIC: Annotated epileptic EEG independent components for artifact reduction | 25
Directional wave buoy data measured near Campbell Island, New Zealand | 25
A daily high-resolution (1 km) human thermal index collection over the North China Plain from 2003 to 2020 | 25
A labeled data set of underwater images of fish and crab species from five mesohabitats in Puget Sound WA USA | 25
Three-dimensional topology dataset of folded radar stratigraphy in northern Greenland | 25
A multi-modal panel dataset to understand the psychological impact of the pandemic | 25
SPRC19: A Database of State Policy Responses to COVID-19 in the United States | 25
U.S. national water and energy land dataset for integrated multisector dynamics research | 25
A dataset on energy efficiency grade of white goods in mainland China at regional and household levels | 24
Scientific echosounder data provide a predator’s view of Antarctic krill (Euphausia superba) | 24
Transcriptom and miRNA data of PUFA-enriched stimulated murine macrophage and human endothelial cell lines | 24
Open access dataset integrating EEG and fNIRS during Stroop tasks | 24
Stable isotope variations of dew under three different climates | 24
The Three Terms Task - an open benchmark to compare human and artificial semantic representations | 24
A dataset of human capital-weighted population estimates for 185 countries from 1970 to 2100 | 24
A nearly complete database on the records and ecology of the rarest boreal tiger moth from 1840s to 2020 | 24
Fisheries dataset on moulting patterns and shell quality of American lobsters H. americanus in Atlantic Canada | 24
The Simrad EK60 echosounder dataset from the Malaspina circumnavigation | 24
Flora diversity survey and establishment of a plant DNA barcode database of Lomas ecosystems in Peru | 24
Global WaterPack - The development of global surface water over the past 20 years at daily temporal resolution | 23
One high quality genome and two transcriptome datasets for new species of Mantamonas, a deep-branching eukaryote clade | 23
Vectorized dataset of silted land formed by check dams on the Chinese Loess Plateau | 23
SOIL-WATERGRIDS, mapping dynamic changes in soil moisture and depth of water table from 1970 to 2014 | 23
Home monitoring with connected mobile devices for asthma attack prediction with machine learning | 23
Statistical performance indicators and index—a new tool to measure country statistical capacity | 23
A consistent and corrected nighttime light dataset (CCNL 1992–2013) from DMSP-OLS data | 23
Chromosome-level genome assembly of the northern Pacific seastar Asterias amurensis | 23