Bioinformatics

Papers
(The TQCC of Bioinformatics is 15. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-11-01 to 2025-11-01.)
Article | Citations
The 2025 ISCB Accomplishments by a Senior Scientist Award—Dr Amos Bairoch | 1692
DivPro: diverse protein sequence design with direct structure recovery guidance | 1228
RVINN: a flexible modeling for inferring dynamic transcriptional and post-transcriptional regulation using physics-informed neural networks | 808
LPTD: a novel linear programming-based topology determination method for cryo-EM maps | 735
Integrated Genome Browser App Store | 706
PANPROVA: pangenomic prokaryotic evolution of full assemblies | 395
CondiS web app: imputation of censored lifetimes for machine learning-based survival analysis | 329
ATLIGATOR: editing protein interactions with an atlas-based approach | 298
TRANSDIRE: data-driven direct reprogramming by a pioneer factor-guided trans-omics approach | 247
CompareM2 is a genomes-to-report pipeline for comparing microbial genomes | 234
MRDagent: iterative and adaptive parameter optimization for stable ctDNA-based MRD detection in heterogeneous samples | 218
getDNB: identifying dynamic network biomarkers of hepatocellular carcinoma from time-varying gene regulations utilizing graph embedding techniques for anomaly detection | 199
MAFFIN: metabolomics sample normalization using maximal density fold change with high-quality metabolic features and corrected signal intensities | 157
Increasing confidence in proteomic spectral deconvolution through mass defect | 128
monaLisa: an R/Bioconductor package for identifying regulatory motifs | 127
Mocafe: a comprehensive Python library for simulating cancer development with Phase Field Models | 123
EvoAug-TF: extending evolution-inspired data augmentations for genomic deep learning to TensorFlow | 120
Statistical framework to determine indel-length distribution | 116
Correction to: GTExVisualizer: a web platform for supporting ageing studies | 111
Detecting spatially co-expressed gene clusters with functional coherence by graph-regularized convolutional neural network | 106
ProteinLIPs: a web server for identifying highly polar and poorly packed interfaces in proteins | 100
Icolos: a workflow manager for structure-based post-processing of de novo generated small molecules | 99
HelixGAN a deep-learning methodology for conditional de novo design of α-helix structures | 94
Accurate assembly of multiple RNA-seq samples with Aletsch | 94
deTELpy: Python package for high-throughput detection of amino acid substitutions in mass spectrometry datasets | 91
Memory-efficient, accelerated protein interaction inference with blocked, multi-GPU D-SCRIPT | 86
Scalable inference and identifiability of kinetic parameters for transcriptional bursting from single cell data | 86
Completing gene trees without species trees in sub-quadratic time | 86
Reconstructing tumor clonal lineage trees incorporating single-nucleotide variants, copy number alterations and structural variations | 85
SimPlot++: a Python application for representing sequence similarity and detecting recombination | 84
Idéfix: identifying accidental sample mix-ups in biobanks using polygenic scores | 82
The ENDS of assumptions: an online tool for the epistemic non-parametric drug–response scoring | 81
Fragmentstein—facilitating data reuse for cell-free DNA fragment analysis | 76
Estimation of cancer cell fractions and clone trees from multi-region sequencing of tumors | 75
Inference of 3D genome architecture by modeling overdispersion of Hi-C data | 74
Cross-species prediction of essential genes in insects | 74
trfermikit: a tool to discover VNTR-associated deletions | 74
Exploring automatic inconsistency detection for literature-based gene ontology annotation | 73
Response to the letter to the editor: On the feasibility of dynamical analysis of network models of biochemical regulation | 73
Random field modeling of multi-trait multi-locus association for detecting methylation quantitative trait loci | 70
Aclust2.0: a revamped unsupervised R tool for Infinium methylation beadchips data analyses | 70
skandiver: a divergence-based analysis tool for identifying intercellular mobile genetic elements | 66
MetBP: a software tool for detection of interaction between metal ion–RNA base pairs | 66
From high-throughput evaluation to wet-lab studies: advancing mutation effect prediction with a retrieval-enhanced model | 65
Harnessing deep learning for proteome-scale detection of amyloid signaling motifs | 65
hapCon: estimating contamination of ancient genomes by copying from reference haplotypes | 64
Floria: fast and accurate strain haplotyping in metagenomes | 64
Perceiver CPI: a nested cross-attention network for compound–protein interaction prediction | 64
CANTATA—prediction of missing links in Boolean networks using genetic programming | 64
Erratum to: GADGETS: a genetic algorithm for detecting epistasis using nuclear families | 63
DeepPerVar: a multi-modal deep learning framework for functional interpretation of genetic variants in personal genome | 63
RNAsolo: a repository of cleaned PDB-derived RNA 3D structures | 62
DeepSVP: integration of genotype and phenotype for structural variant prioritization using deep learning | 61
ADViSELipidomics: a workflow for analyzing lipidomics data | 59
Group-walk: a rigorous approach to group-wise false discovery rate analysis by target-decoy competition | 59
bollito: a flexible pipeline for comprehensive single-cell RNA-seq analyses | 59
MAGUS+eHMMs: improved multiple sequence alignment accuracy for fragmentary sequences | 58
ProSynAR: a reference aware read merger | 58
PyLiger: scalable single-cell multi-omic data integration in Python | 58
MICER: a pre-trained encoder–decoder architecture for molecular image captioning | 58
Decomposing mosaic tandem repeats accurately from long reads | 57
The phers R package: using phenotype risk scores based on electronic health records to study Mendelian disease and rare genetic variants | 57
RNAglib: a python package for RNA 2.5 D graphs | 57
Deep Local Analysis deconstructs protein–protein interfaces and accurately estimates binding affinity changes upon mutation | 56
The FASTQ+ format and PISA | 55
Oarfish: enhanced probabilistic modeling leads to improved accuracy in long read transcriptome quantification | 55
Evidential meta-model for molecular property prediction | 54
GMNN2CD: identification of circRNA–disease associations based on variational inference and graph Markov neural networks | 54
Deciphering high-order structures in spatial transcriptomes with graph-guided Tucker decomposition | 53
WMDS.net: a network control framework for identifying key players in transcriptome programs | 53
DRUMMER—rapid detection of RNA modifications through comparative nanopore sequencing | 53
hipFG: high-throughput harmonization and integration pipeline for functional genomics data | 52
COVID-19 Spread Mapper: a multi-resolution, unified framework and open-source tool | 52
HDMC: a novel deep learning-based framework for removing batch effects in single-cell RNA-seq data | 51
minoTour, real-time monitoring and analysis for nanopore sequencers | 50
Hierarchical reinforcement learning for automatic disease diagnosis | 50
Adaptive digital tissue deconvolution | 50
CFAGO: cross-fusion of network and attributes based on attention mechanism for protein function prediction | 49
Single-cell RNA sequencing data analysis based on non-uniform ε-neighborhood network | 49
Powerful molecule generation with simple ConvNet | 48
vaRHC: an R package for semi-automation of variant classification in hereditary cancer genes according to ACMG/AMP and gene-specific ClinGen guidelines | 48
XSI—a genotype compression tool for compressive genomics in large biobanks | 48
BATL: Bayesian annotations for targeted lipidomics | 48
From viral evolution to spatial contagion: a biologically modulated Hawkes model | 48
ViReMaShiny: an interactive application for analysis of viral recombination data | 47
Prediction and curation of missing biomedical identifier mappings with Biomappings | 46
LinkExplorer: predicting, explaining and exploring links in large biomedical knowledge graphs | 46
Transfer learning for drug–target interaction prediction | 45
Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES): a method for populating knowledge bases using zero-shot learning | 45
Estimating sparse regression models in multi-task learning and transfer learning through adaptive penalisation | 45
Delineating inter- and intra-antibody repertoire evolution with AntibodyForests | 45
Prediction of gene co-expression from chromatin contacts with graph attention network | 45
EDTox: an R Shiny application to predict the endocrine disruption potential of compounds | 44
High-sensitivity pattern discovery in large, paired multiomic datasets | 44
BridgeDPI: a novel Graph Neural Network for predicting drug–protein interactions | 44
ECCB2022: the 21st European Conference on Computational Biology | 44
RawHummus: an R Shiny app for automated raw data quality control in metabolomics | 43
Correction of image distortion in large-field ssEM stitching by an unsupervised intermediate-space solving network | 43
The minimizer Jaccard estimator is biased and inconsistent | 43
Microbench: automated metadata management for systems biology benchmarking and reproducibility in Python | 43
SL-Miner: a web server for mining evidence and prioritization of cancer-specific synthetic lethality | 42
A physics-informed neural SDE network for learning cellular dynamics from time-series scRNA-seq data | 42
Omnibus and robust deconvolution scheme for bulk RNA sequencing data integrating multiple single-cell reference sets and prior biological knowledge | 42
Geometry-complete perceptron networks for 3D molecular graphs | 42
Comprehensive comparison of two types of algorithm for circRNA detection from short-read RNA-Seq | 41
Avoiding C-hacking when evaluating survival distribution predictions with discrimination measures | 41
Spectral clustering of single-cell multi-omics data on multilayer graphs | 41
Single-cell mutation calling and phylogenetic tree reconstruction with loss and recurrence | 41
scGrapHiC: deep learning-based graph deconvolution for Hi-C using single cell gene expression | 40
MS-Decipher: a user-friendly proteome database search software with an emphasis on deciphering the spectra of O-linked glycopeptides | 40
Trustworthy causal biomarker discovery: a multiomics brain imaging genetics-based approach | 40
Functional characterization of co-phosphorylation networks | 39
AdenPredictor: accurate prediction of the adenylation domain specificity of nonribosomal peptide biosynthetic gene clusters in microbial genomes | 39
SpecieScan: semi-automated taxonomic identification of bone collagen peptides from MALDI-ToF-MS | 38
OMEN: network-based driver gene identification using mutual exclusivity | 38
Cell type matching across species using protein embeddings and transfer learning | 38
Modified RNAs and predictions with the ViennaRNA Package | 38
Tightly integrated multiomics-based deep tensor survival model for time-to-event prediction | 37
RNA threading with secondary structure and sequence profile | 37
LipidOne: user-friendly lipidomic data analysis tool for a deeper interpretation in a systems biology scenario | 37
CProMG: controllable protein-oriented molecule generation with desired binding affinity and drug-like properties | 36
mHapTk: a comprehensive toolkit for the analysis of DNA methylation haplotypes | 34
Generating synthetic genotypes using diffusion models | 34
Conformal inference for reliable single cell RNA-seq annotation | 34
Accessible, uniform protein property prediction with a scikit-learn based toolset AIDE | 34
PractiCPP: a deep learning approach tailored for extremely imbalanced datasets in cell-penetrating peptide prediction | 33
BAV-LLPS: a database of bacterial, archaea, and virus liquid–liquid phase separation proteins | 33
LOCAN: a python library for analyzing single-molecule localization microscopy data | 33
PiLSL: pairwise interaction learning-based graph neural network for synthetic lethality prediction in human cancers | 33
Quantifying and correcting slide-to-slide variation in multiplexed immunofluorescence images | 33
Conumee 2.0: enhanced copy-number variation analysis from DNA methylation arrays for humans and mice | 33
tcplfit2: an R-language general purpose concentration–response modeling package | 32
Using the UK Biobank as a global reference of worldwide populations: application to measuring ancestry diversity from GWAS summary statistics | 32
Graph-theoretical prediction of biological modules in quaternary structures of large protein complexes | 32
Mining literature and pathway data to explore the relations of ketamine with neurotransmitters and gut microbiota using a knowledge-graph | 32
PST-PRNA: prediction of RNA-binding sites using protein surface topography and deep learning | 32
PltDB: a blood platelets-based gene expression database for disease investigation | 32
Multi-level attention graph neural network based on co-expression gene modules for disease diagnosis and prognosis | 32
Efficient gradient boosting for prognostic biomarker discovery | 31
The 2024 ISCB Overton Prize Award—Dr Martin Steinegger | 31
Scbean: a python library for single-cell multi-omics data analysis | 31
2023 ISCB Overton Prize: Jingyi Jessica Li | 31
Computational modeling of mRNA degradation dynamics using deep neural networks | 31
Galaxy Helm chart: a standardized method for deploying production Galaxy servers | 31
SCONCE: a method for profiling copy number alterations in cancer evolution using single-cell whole genome sequencing | 31
spatialTIME and iTIME: R package and Shiny application for visualization and analysis of immunofluorescence data | 30
iSFun: an R package for integrative dimension reduction analysis | 30
Multistage attention-based extraction and fusion of protein sequence and structural features for protein function prediction | 30
HAMPLE: deciphering TF-DNA binding mechanism in different cellular environments by characterizing higher-order nucleotide dependency | 30
PeakBot: machine-learning-based chromatographic peak picking | 30
Forseti: a mechanistic and predictive model of the splicing status of scRNA-seq reads | 30
Improving dictionary-based named entity recognition with deep learning | 30
An automated multi-modal graph-based pipeline for mouse genetic discovery | 30
MIAMI: mutual information-based analysis of multiplex imaging data | 30
Deep graph representations embed network information for robust disease marker identification | 29
ToxIBTL: prediction of peptide toxicity based on information bottleneck and transfer learning | 29
MIO: microRNA target analysis system for immuno-oncology | 29
STAAR workflow: a cloud-based workflow for scalable and reproducible rare variant analysis | 29
Semi-supervised data-integrated feature importance enhances performance and interpretability of biological classification tasks | 29
StructuralDPPIV: a novel deep learning model based on atom structure for predicting dipeptidyl peptidase-IV inhibitory peptides | 29
Phenotype prediction from single-cell RNA-seq data using attention-based neural networks | 29
scSGL: kernelized signed graph learning for single-cell gene regulatory network inference | 28
MSNet-4mC: learning effective multi-scale representations for identifying DNA N4-methylcytosine sites | 28
scHiCPTR: unsupervised pseudotime inference through dual graph refinement for single-cell Hi-C data | 28
Prediction of recovery from multiple organ dysfunction syndrome in pediatric sepsis patients | 28
Globally Accessible Distributed Data Sharing (GADDS): a decentralized FAIR platform to facilitate data sharing in the life sciences | 28
GEnView: a gene-centric, phylogeny-based comparative genomics pipeline for bacterial genomes and plasmids | 28
An approachable, flexible and practical machine learning workshop for biologists | 28
AbDiver: a tool to explore the natural antibody landscape to aid therapeutic design | 28
SBGNview: towards data analysis, integration and visualization on all pathways | 27
scanMiR: a biochemically based toolkit for versatile and efficient microRNA target prediction | 27
Nezzle: an interactive and programmable visualization of biological networks in Python | 27
statgenMPP: an R package implementing an IBD-based mixed model approach for QTL mapping in a wide range of multi-parent populations | 27
KCOSS: an ultra-fast k-mer counter for assembled genome analysis | 27
ELIXIR biovalidator for semantic validation of life science metadata | 27
dsMTL: a computational framework for privacy-preserving, distributed multi-task machine learning | 27
Prediction of HIV sensitivity to monoclonal antibodies using aminoacid sequences and deep learning | 27
SPRISS: approximating frequent k-mers by sampling reads, and applications | 27
2022 ISCB Accomplishments by a Senior Scientist Award: Ron Shamir | 27
CellAnn: a comprehensive, super-fast, and user-friendly single-cell annotation web server | 26
CATH-ddG: towards robust mutation effect prediction on protein–protein interactions out of CATH homologous superfamily | 26
Foreign RNA spike-ins enable accurate allele-specific expression analysis at scale | 26
The 2025 ISCB Overton Prize Award—Dr James Zou | 26
MixingDTA: improved drug–target affinity prediction by extending mixup with guilt-by-association | 26
Powerful and interpretable control of false discoveries in two-group differential expression studies | 26
LoRA-DR-suite: adapted embeddings predict intrinsic and soft disorder from protein sequences | 26
GADGETS: a genetic algorithm for detecting epistasis using nuclear families | 26
Balancing complexity and clarity—towards clinician-ready antibiotic resistance prediction models | 25
HyperGraphs.jl: representing higher-order relationships in Julia | 25
A novel pipeline for computerized mouse spermatogenesis staging | 25
AHoJ: rapid, tailored search and retrieval of apo and holo protein structures for user-defined ligands | 25
ORT: a workflow linking genome-scale metabolic models with reactive transport codes | 25
ARTEMIS integrates autoencoders and Schrödinger Bridges to predict continuous dynamics of gene expression, cell population, and perturbation from time-series single-cell data | 25
Optimal phylogenetic reconstruction of insertion and deletion events | 25
Joint inference of cell lineage and mitochondrial evolution from single-cell sequencing data | 25
Polyphest: fast polyploid phylogeny estimation | 24
A unified mediation analysis framework for integrative cancer proteogenomics with clinical outcomes | 24
SEPA: signaling entropy-based algorithm to evaluate personalized pathway activation for survival analysis on pan-cancer data | 24
DeepProtein: deep learning library and benchmark for protein sequence learning | 24
IMPACT: interpretable microbial phenotype analysis via microbial characteristic traits | 24
Driver gene detection through Bayesian network integration of mutation and expression profiles | 24
Position-Specific Enrichment Ratio Matrix scores predict antibody variant properties from deep sequencing data | 24
PDMDA: predicting deep-level miRNA–disease associations with graph neural networks and sequence features | 24
ReadItAndKeep: rapid decontamination of SARS-CoV-2 sequencing reads | 24
Determining epitope specificity of T-cell receptors with transformers | 24
RAxML Grove: an empirical phylogenetic tree database | 24
Comparing transmembrane protein structures with ATOLL | 23
Unsupervised construction of computational graphs for gene expression data with explicit structural inductive biases | 23
InterpolatedXY: a two-step strategy to normalize DNA methylation microarray data avoiding sex bias | 23
SNIKT: sequence-independent adapter identification and removal in long-read shotgun sequencing data | 23
SPEAR: Systematic ProtEin AnnotatoR | 23
ATHENA: analysis of tumor heterogeneity from spatial omics measurements | 23
TopHap: rapid inference of key phylogenetic structures from common haplotypes in large genome collections with limited diversity | 23
CODEX: COunterfactual Deep learning for the in silico EXploration of cancer cell line perturbations | 23
Explainable multimodal machine learning model for classifying pregnancy drug safety | 22
DDAffinity: predicting the changes in binding affinity of multiple point mutations using protein 3D structure | 22
3D GAN image synthesis and dataset quality assessment for bacterial biofilm | 22
Thermometer: a webserver to predict protein thermal stability | 22
Bayesian inference of fitness landscapes via tree-structured branching processes | 22
ViTAL: Vision TrAnsformer based Low coverage SARS-CoV-2 lineage assignment | 22
Towards a reproducible interactome: semantic-based detection of redundancies to unify protein–protein interaction databases | 22
Expanding the coverage of spatial proteomics: a machine learning approach | 22
Overcoming biases in causal inference of molecular interactions | 22
CIBRA identifies genomic alterations with a system-wide impact on tumor biology | 22
Somatic mutation effects diffused over microRNA dysregulation | 22
Biological Random Walks: multi-omics integration for disease gene prioritization | 22
3D Optical Coherence Tomography image processing in BISCAP: characterization of biofilm structure and properties | 22
REUNION: transcription factor binding prediction and regulatory association inference from single-cell multi-omics data | 22
BSDE: barycenter single-cell differential expression for case–control studies | 21
Struct-f4: a Rcpp package for ancestry profile and population structure inference from f4-statistics | 21
Pycallingcards: an integrated environment for visualizing, analyzing, and interpreting Calling Cards data | 21
NFTest: automated testing of Nextflow pipelines | 21
BACPI: a bi-directional attention neural network for compound–protein interaction and binding affinity prediction | 21
TEspeX: consensus-specific quantification of transposable element expression preventing biases from exonized fragments | 21
HOMELETTE: a unified interface to homology modelling software | 21
ODGI: understanding pangenome graphs | 21
SimBu: bias-aware simulation of bulk RNA-seq data with variable cell-type composition | 21
Phylogenetic diversity statistics for all clades in a phylogeny | 21
Looking at the BiG picture: incorporating bipartite graphs in drug response prediction | 21
PlasmoFAB: a benchmark to foster machine learning for Plasmodium falciparum protein antigen candidate prediction | 20
RiboGraph: an interactive visualization system for ribosome profiling data at read length resolution | 20
konnect2prot: a web application to explore the protein properties in a functional protein–protein interaction network | 20
M-Ionic: prediction of metal-ion-binding sites from sequence using residue embeddings | 20
End-to-end learning of evolutionary models to find coding regions in genome alignments | 20
scGAC: a graph attentional architecture for clustering single-cell RNA-seq data | 20
HiCARN: resolution enhancement of Hi-C data using cascading residual networks | 20
MolCL-SP: a multimodal contrastive learning framework with non-overlapping substructure perturbations for molecular property prediction | 20
GASTON-Mix: a unified model of spatial gradients and domains using spatial mixture-of-experts | 20
massNet: integrated processing and classification of spatially resolved mass spectrometry data using deep learning for rapid tumor delineation | 20
A penalized linear mixed model with generalized method of moments estimators for complex phenotype prediction | 20