Bioinformatics

Papers
(The TQCC of Bioinformatics is 12. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-01-01 to 2025-01-01.)
Article | Citations
EPS: automated feature selection in case–control studies using extreme pseudo-sampling | 961
PAX2GRAPHML: a python library for large-scale regulation network analysis using BioPAX | 771
On the stability of log-rank test under labeling errors | 570
PaIntDB: network-based omics integration and visualization using protein–protein interactions in Pseudomonas aeruginosa | 552
ACES: Analysis of Conservation with an Extensive list of Species | 502
ViTAL: Vision TrAnsformer based Low coverage SARS-CoV-2 lineage assignment | 460
Selection among site-dependent structurally constrained substitution models of protein evolution by approximate Bayesian computation | 451
Utilizing image and caption information for biomedical document classification | 376
PrISM: precision for integrative structural models | 325
ASES: visualizing evolutionary conservation of alternative splicing in proteins | 258
xGAP: a python based efficient, modular, extensible and fault tolerant genomic analysis pipeline for variant discovery | 247
An OMICs-based meta-analysis to support infection state stratification | 217
Corrigendum to: HRIBO: high-throughput analysis of bacterial ribosome profiling data | 190
Gene Tracer: a smart, interactive, voice-controlled Alexa skill For gene information retrieval and browsing, mutation annotation and network visualization | 187
2023 ISCB innovator award: Dana Pe’er | 169
wenda_gpu: fast domain adaptation for genomic data | 168
ATLIGATOR: editing protein interactions with an atlas-based approach | 153
A weighted distance-based approach for deriving consensus tumor evolutionary trees | 149
XRRpred: accurate predictor of crystal structure quality from protein sequence | 140
webSCST: an interactive web application for single-cell RNA-sequencing data and spatial transcriptomic data integration | 130
Predicting spatially resolved gene expression via tissue morphology using adaptive spatial GNNs | 122
CIBRA identifies genomic alterations with a system-wide impact on tumor biology | 111
learnMSA2: deep protein multiple alignments with large language and hidden Markov models | 107
PsiNorm: a scalable normalization for single-cell RNA-seq data | 105
Metagenomic functional profiling: to sketch or not to sketch? | 104
Polypharmacy side-effect prediction with enhanced interpretability based on graph feature attention network | 104
StructuralVariantAnnotation: a R/Bioconductor foundation for a caller-agnostic structural variant software ecosystem | 104
Identification of cell-type-specific spatially variable genes accounting for excess zeros | 104
fastISM: performant in silico saturation mutagenesis for convolutional neural networks | 98
monaLisa: an R/Bioconductor package for identifying regulatory motifs | 97
Target–Decoy MineR for determining the biological relevance of variables in noisy datasets | 96
GNN-based embedding for clustering scRNA-seq data | 95
Telogator: a method for reporting chromosome-specific telomere lengths from long reads | 94
SOMDE: a scalable method for identifying spatially variable genes with self-organizing map | 91
Non-parametric modelling of temporal and spatial counts data from RNA-seq experiments | 87
Cooperative sequence clustering and decoding for DNA storage system with fountain codes | 84
Clustering single-cell RNA-seq data by rank constrained similarity learning | 84
scWMC: weighted matrix completion-based imputation of scRNA-seq data via prior subspace information | 82
Nebulosa recovers single-cell gene expression signals by kernel density estimation | 80
On the stability of log-rank test under labeling errors | 78
Erratum to: EpiGraphDB: a database and data mining platform for health data science | 77
An incrementally updatable and scalable system for large-scale sequence search using the Bentley–Saxe transformation | 76
scKINETICS: inference of regulatory velocity with single-cell transcriptomics data | 75
DIMPL: a bioinformatics pipeline for the discovery of structured noncoding RNA motifs in bacteria | 75
ISMB 2022 proceedings | 74
SharePro: an accurate and efficient genetic colocalization method accounting for multiple causal signals | 73
miRe2e: a full end-to-end deep model based on transformers for prediction of pre-miRNAs | 72
Metaball skinning of synthetic astroglial morphologies into realistic mesh models for in silico simulations and visual analytics | 71
SPOT: a web-tool enabling swift profiling of transcriptomes | 70
Graph2MDA: a multi-modal variational graph embedding model for predicting microbe–drug associations | 69
IsoMiRmap: fast, deterministic and exhaustive mining of isomiRs from short RNA-seq datasets | 69
Tree2GD: a phylogenomic method to detect large-scale gene duplication events | 68
Effective knowledge graph embeddings based on multidirectional semantics relations for polypharmacy side effects prediction | 68
On the feasibility of dynamical analysis of network models of biochemical regulation | 66
BSDE: barycenter single-cell differential expression for case–control studies | 62
CAFE: a software suite for analysis of paired-sample transposon insertion sequencing data | 62
A cross-level information transmission network for hierarchical omics data integration and phenotype prediction from a new genotype | 62
OpenPhi: an interface to access Philips iSyntax whole slide images for computational pathology | 61
An integrative pipeline for circular RNA quantitative trait locus discovery with application in human T cells | 61
GuidePro: a multi-source ensemble predictor for prioritizing sgRNAs in CRISPR/Cas9 protein knockouts | 61
PathwAX II: network-based pathway analysis with interactive visualization of network crosstalk | 60
Reconstructing tumor clonal lineage trees incorporating single-nucleotide variants, copy number alterations and structural variations | 60
Characterizing domain-specific open educational resources by linking ISCB Communities of Special Interest to Wikipedia | 59
The 2024 Outstanding Contributions to ISCB Award—Dr Scott Markel | 59
MMGraph: a multiple motif predictor based on graph neural network and coexisting probability for ATAC-seq data | 59
Benchmarking table recognition performance on biomedical literature on neurological disorders | 58
REET: robustness evaluation and enhancement toolbox for computational pathology | 58
OMAmer: tree-driven and alignment-free protein assignment to subfamilies outperforms closest sequence approaches | 57
preciseTAD: a transfer learning framework for 3D domain boundary prediction at base-pair resolution | 56
Expanding the coverage of spatial proteomics: a machine learning approach | 55
ShinyArchR.UiO: user-friendly, integrative and open-source tool for visualization of single-cell ATAC-seq data using ArchR | 55
ConsensuSV—from the whole-genome sequencing data to the complete variant list | 54
GAMIBHEAR: whole-genome haplotype reconstruction from Genome Architecture Mapping data | 54
LPTD: a novel linear programming-based topology determination method for cryo-EM maps | 53
Comparing transmembrane protein structures with ATOLL | 51
PANPROVA: pangenomic prokaryotic evolution of full assemblies | 50
TMQuery: a database of precomputed template modeling scores for assessment of protein structural similarity | 50
Integrated Genome Browser App Store | 50
BioThings SDK: a toolkit for building high-performance data APIs in biomedical research | 49
Comparison of structural variants detected by optical mapping with long-read next-generation sequencing | 49
GEInter: an R package for robust gene–environment interaction analysis | 48
Multimodal medical image fusion using adaptive co-occurrence filter-based decomposition optimization model | 48
ASURAT: functional annotation-driven unsupervised clustering of single-cell transcriptomes | 48
Bios2cor: an R package integrating dynamic and evolutionary correlations to identify functionally important residues in proteins | 48
NaViA: a program for the visual analysis of complex mass spectra | 47
Exploring parallel MPI fault tolerance mechanisms for phylogenetic inference with RAxML-NG | 46
AlphaMap: an open-source Python package for the visual annotation of proteomics data with sequence-specific knowledge | 46
ExplorATE: a new pipeline to explore active transposable elements from RNA-seq data | 45
CHIT: an allele-specific method for testing the association between molecular quantitative traits and phenotype–genotype interaction | 45
scAMACE: model-based approach to the joint analysis of single-cell data on chromatin accessibility, gene expression and methylation | 45
EDGE COVID-19: a web platform to generate submission-ready genomes from SARS-CoV-2 sequencing efforts | 44
3DPolyS-LE: an accessible simulation framework to model the interplay between chromatin and loop extrusion | 44
A novel constrained genetic algorithm-based Boolean network inference method from steady-state gene expression data | 44
Deciphering associations between gut microbiota and clinical factors using microbial modules | 44
3D GAN image synthesis and dataset quality assessment for bacterial biofilm | 44
ASHLEYS: automated quality control for single-cell Strand-seq data | 44
Isoform function prediction by Gene Ontology embedding | 43
SCIntRuler: guiding the integration of multiple single-cell RNA-seq datasets with a novel statistical metric | 43
Embeddings of genomic region sets capture rich biological associations in lower dimensions | 43
CondiS web app: imputation of censored lifetimes for machine learning-based survival analysis | 42
Somatic mutation effects diffused over microRNA dysregulation | 42
DiffChIPL: a differential peak analysis method for high-throughput sequencing data with biological replicates based on limma | 42
MAFFIN: metabolomics sample normalization using maximal density fold change with high-quality metabolic features and corrected signal intensities | 42
GdClean: removal of Gadolinium contamination in mass cytometry data | 41
SPEAR: Systematic ProtEin AnnotatoR | 41
FASTRAL: improving scalability of phylogenomic analysis | 41
SNIKT: sequence-independent adapter identification and removal in long-read shotgun sequencing data | 41
Correction to: GSpace: an exact coalescence simulator of recombining genomes under isolation by distance | 41
Median and small parsimony problems on RNA trees | 40
pKPDB: a protein data bank extension database of pKa and pI theoretical values | 40
Large-scale inference of population structure in presence of missingness using PCA | 40
Top-Down Crawl: a method for the ultra-rapid and motif-free alignment of sequences with associated binding metrics | 39
Trap spaces of multi-valued networks: definition, computation, and applications | 39
Explainable multimodal machine learning model for classifying pregnancy drug safety | 39
I2b2-etl: Python application for importing electronic health data into the informatics for integrating biology and the bedside platform | 38
RCandy: an R package for visualizing homologous recombinations in bacterial genomes | 38
methyLImp2: faster missing value estimation for DNA methylation data | 38
Unsupervised construction of computational graphs for gene expression data with explicit structural inductive biases | 38
GORetriever: reranking protein-description-based GO candidates by literature-driven deep information retrieval for protein function annotation | 38
Optimization of drug–target affinity prediction methods through feature processing schemes | 38
VeTra: a tool for trajectory inference based on RNA velocity | 37
COVID-19 Knowledge Graph from semantic integration of biomedical literature and databases | 37
Toward comprehensive functional analysis of gene lists weighted by gene essentiality scores | 37
CTISL: a dynamic stacking multi-class classification approach for identifying cell types from single-cell RNA-seq data | 37
NeoFox: annotating neoantigen candidates with neoantigen features | 37
Idéfix: identifying accidental sample mix-ups in biobanks using polygenic scores | 36
Integrative survival analysis of breast cancer with gene expression and DNA methylation data | 36
Accurate large-scale phylogeny-aware alignment using BAli-Phy | 36
Predicting anti-cancer drug response by finding optimal subset of drugs | 36
IIFDTI: predicting drug–target interactions through interactive and independent features based on attention mechanism | 36
Statistical framework to determine indel-length distribution | 36
TEspeX: consensus-specific quantification of transposable element expression preventing biases from exonized fragments | 36
Multi-instance learning of graph neural networks for aqueous pKa prediction | 36
Disease gene prediction with privileged information and heteroscedastic dropout | 35
LaGAT: link-aware graph attention network for drug–drug interaction prediction | 35
OMAMO: orthology-based alternative model organism selection | 35
SPDE: a multi-functional software for sequence processing and data extraction | 35
AWOT and CWOT for genotype and genotype-by-treatment interaction joint analysis in pharmacogenetics GWAS | 35
HOMELETTE: a unified interface to homology modelling software | 35
IntelliPy: a GUI for analyzing IntelliCage data | 35
ExEmPLAR (Extracting, Exploring, and Embedding Pathways Leading to Actionable Research): a user-friendly interface for knowledge graph mining | 34
Computing optimal factories in metabolic networks with negative regulation | 34
StAmP-DB: a platform for structures of polymorphic amyloid fibril cores | 34
AutoDC: an automatic machine learning framework for disease classification | 33
Fec: a fast error correction method based on two-rounds overlapping and caching | 33
Mocafe: a comprehensive Python library for simulating cancer development with Phase Field Models | 33
AutoCAT: automated cancer-associated TCRs discovery from TCR-seq data | 33
The DOMINO web-server for active module identification analysis | 32
ORFLine: a bioinformatic pipeline to prioritize small open reading frames identifies candidate secreted small proteins from lymphocytes | 32
3Dscript.server: true server-side 3D animation of microscopy images using a natural language-based syntax | 32
RAPPPID: towards generalizable protein interaction prediction with AWD-LSTM twin networks | 32
IRIS-FGM: an integrative single-cell RNA-Seq interpretation system for functional gene module analysis | 31
A heuristic algorithm solving the mutual-exclusivity-sorting problem | 31
Prior knowledge facilitates low homologous protein secondary structure prediction with DSM distillation | 31
ExperimentSubset: an R package to manage subsets of Bioconductor Experiment objects | 31
TPWshiny: an interactive R/Shiny app to explore cell line transcriptional responses to anti-cancer drugs | 31
AMC: accurate mutation clustering from single-cell DNA sequencing data | 31
Pycallingcards: an integrated environment for visualizing, analyzing, and interpreting Calling Cards data | 31
MNHN-Tree-Tools: a toolbox for tree inference using multi-scale clustering of a set of sequences | 30
AutoCCS: automated collision cross-section calculation software for ion mobility spectrometry–mass spectrometry | 30
medna-metadata: an open-source data management system for tracking environmental DNA samples and metadata | 30
Probabilistic thermodynamic analysis of metabolic networks | 30
RNAloops: a database of RNA multiloops | 30
Phylogenetic diversity statistics for all clades in a phylogeny | 30
Improving and evaluating deep learning models of cellular organization | 30
BART3D: inferring transcriptional regulators associated with differential chromatin interactions from Hi-C data | 29
massDatabase: utilities for the operation of the public compound and pathway database | 29
SigTools: exploratory visualization for genomic signals | 29
RoDiCE: robust differential protein co-expression analysis for cancer complexome | 29
MitoVisualize: a resource for analysis of variants in human mitochondrial RNAs and DNA | 29
dsRBPBind: modeling the effect of RNA secondary structure on double-stranded RNA–protein binding | 29
MuWU: Mutant-seq library analysis and annotation | 29
Phytest: quality control for phylogenetic analyses | 29
HPOFiller: identifying missing protein–phenotype associations by graph convolutional network | 29
Correction to: GTExVisualizer: a web platform for supporting ageing studies | 29
BioCCP.jl: collecting coupons in combinatorial biotechnology | 29
Scoring protein sequence alignments using deep learning | 29
seqgra: principled selection of neural network architectures for genomics prediction tasks | 28
Rapid T-cell receptor interaction grouping with ting | 28
Detecting spatially co-expressed gene clusters with functional coherence by graph-regularized convolutional neural network | 28
DelaySSAToolkit.jl: stochastic simulation of reaction systems with time delays in Julia | 28
On the relation between input and output distributions of scRNA-seq experiments | 28
Biomedical evidence engineering for data-driven discovery | 28
Practical selection of representative sets of RNA-seq samples using a hierarchical approach | 28
CAMML with the Integration of Marker Proteins (ChIMP) | 28
CACONET: a novel classification framework for microbial correlation networks | 28
Overcoming selection bias in synthetic lethality prediction | 27
Thermometer: a webserver to predict protein thermal stability | 27
AMICI: high-performance sensitivity analysis for large ordinary differential equation models | 27
SWIMmeR: an R-based software to unveiling crucial nodes in complex biological networks | 27
The topology of data: opportunities for cancer research | 27
3D Optical Coherence Tomography image processing in BISCAP: characterization of biofilm structure and properties | 27
Compact and evenly distributed k-mer binning for genomic sequences | 27
Automated exploitation of deep learning for cancer patient stratification across multiple types | 27
KG4SL: knowledge graph neural network for synthetic lethality prediction in human cancers | 27
SpaceX: gene co-expression network estimation for spatial transcriptomics | 27
Toward the assessment of predicted inter-residue distance | 27
ELIXIR: providing a sustainable infrastructure for life science data at European scale | 27
KIMGENS: a novel method to estimate kinship in organisms with mixed haploid diploid genetic systems robust to population structure | 27
Improved allele-specific single-cell copy number estimation in low-coverage DNA-sequencing | 26
Survival analysis on rare events using group-regularized multi-response Cox regression | 26
GraphLoc: a graph neural network model for predicting protein subcellular localization from immunohistochemistry images | 26
GLIDER: function prediction from GLIDE-based neighborhoods | 26
tSFM 1.0: tRNA Structure–Function Mapper | 26
Ribo-ODDR: oligo design pipeline for experiment-specific rRNA depletion in Ribo-seq | 26
MS1Connect: a mass spectrometry run similarity measure | 26
TRANSDIRE: data-driven direct reprogramming by a pioneer factor-guided trans-omics approach | 26
PTMint database of experimentally verified PTM regulation on protein–protein interaction | 26
Phylogenomic branch length estimation using quartets | 26
ODGI: understanding pangenome graphs | 25
BFF and cellhashR: analysis tools for accurate demultiplexing of cell hashing data | 25
clinker & clustermap.js: automatic generation of gene cluster comparison figures | 25
InterARTIC: an interactive web application for whole-genome nanopore sequencing analysis of SARS-CoV-2 and other viruses | 25
Systematic replication enables normalization of high-throughput imaging assays | 25
POIBM: batch correction of heterogeneous RNA-seq datasets through latent sample matching | 25
A method for subtype analysis with somatic mutations | 25
Tandem repeat interval pattern identifies animal taxa | 25
MetaNorm: incorporating meta-analytic priors into normalization of NanoString nCounter data | 24
Haplotype-based membership inference from summary genomic data | 24
Bioframe: operations on genomic intervals in Pandas dataframes | 24
Overcoming biases in causal inference of molecular interactions | 24
Statistical approaches for differential expression analysis in metatranscriptomics | 24
Icolos: a workflow manager for structure-based post-processing of de novo generated small molecules | 24
Towards a reproducible interactome: semantic-based detection of redundancies to unify protein–protein interaction databases | 24
PyJAMAS: open-source, multimodal segmentation and analysis of microscopy images | 24
MultiBaC: an R package to remove batch effects in multi-omic experiments | 23
An algorithm for decoy-free false discovery rate estimation in XL-MS/MS proteomics | 23
Chromosomal imbalances detected via RNA-sequencing in 28 cancers | 23
Scalable de novo classification of antibiotic resistance of Mycobacterium tuberculosis | 23
DDAffinity: predicting the changes in binding affinity of multiple point mutations using protein 3D structure | 23
A count-based model for delineating cell–cell interactions in spatial transcriptomics data | 23
A variant selection framework for genome graphs | 23
GBZ file format for pangenome graphs | 23
Predicting protein functions using positive-unlabeled ranking with ontology-based priors | 23
Conway–Bromage–Lyndon (CBL): an exact, dynamic representation of k-mer sets | 23
AttentionPert: accurately modeling multiplexed genetic perturbations with multi-scale effects | 23
TimiRGeN: R/Bioconductor package for time series microRNA–mRNA integration and analysis | 23
Efficient permutation-based genome-wide association studies for normal and skewed phenotypic distributions | 23
Accurate assembly of multiple RNA-seq samples with Aletsch | 23
Learning locality-sensitive bucketing functions | 23
Biomarker identification by interpretable maximum mean discrepancy | 23
Increasing confidence in proteomic spectral deconvolution through mass defect | 22
Effective drug–target interaction prediction with mutual interaction neural network | 22
HelixGAN a deep-learning methodology for conditional de novo design of α-helix structures | 22
SD2: spatially resolved transcriptomics deconvolution through integration of dropout and spatial information | 22
ASpli: integrative analysis of splicing landscapes through RNA-Seq assays | 22
TopHap: rapid inference of key phylogenetic structures from common haplotypes in large genome collections with limited diversity | 22
Completing gene trees without species trees in sub-quadratic time | 22
Deep Subspace Mutual Learning for cancer subtypes prediction | 22
CoGO: a contrastive learning framework to predict disease similarity based on gene network and ontology structure | 22