Bioinformatics

Papers
(The median citation count of Bioinformatics is 6. The table below lists the papers whose CrossRef citation counts are above that threshold [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-11-01 to 2024-11-01.)
Article | Citations
YaHS: yet another Hi-C scaffolding tool | 923
clinker & clustermap.js: automatic generation of gene cluster comparison figures | 740
GTDB-Tk v2: memory friendly classification with the genome taxonomy database | 552
New strategies to improve minimap2 alignment accuracy | 538
Analysing high-throughput sequencing data in Python with HTSeq 2.0 | 484
Liftoff: accurate mapping of gene annotations | 451
DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome | 434
ProteinBERT: a universal deep-learning model of protein sequence and function | 352
LDpred2: better, faster, stronger | 350
CAFE 5 models variation in evolutionary rates among gene families | 324
STREME: accurate and versatile sequence motif discovery | 313
CoV-Spectrum: analysis of globally shared SARS-CoV-2 data to identify and characterize new variants | 236
Gfastats: conversion, evaluation and manipulation of genome sequences using assembly graphs | 231
fastsimcoal2: demographic inference under complex evolutionary scenarios | 217
DeepPurpose: a deep learning library for drug–target interaction prediction | 198
LocusZoom.js: interactive and embeddable visualization of genetic association study results | 190
microbiomeMarker: an R/Bioconductor package for microbiome marker identification and visualization | 176
glmGamPoi: fitting Gamma-Poisson generalized linear models on single cell count data | 168
Nebulosa recovers single-cell gene expression signals by kernel density estimation | 168
ggtranscript: an R package for the visualization and interpretation of transcript isoforms using ggplot2 | 160
Fast and sensitive taxonomic assignment to metagenomic contigs | 153
UCSC Cell Browser: visualize your single-cell data | 149
Colour deconvolution: stain unmixing in histological imaging | 135
propeller: testing for differences in cell type proportions in single cell data | 130
DeepCDR: a hybrid graph convolutional network for predicting cancer drug response | 129
ShinyCell: simple and sharable visualization of single-cell gene expression data | 122
plotsr: visualizing structural similarities and rearrangements between multiple genomes | 122
The VEGA suite of programs: an versatile platform for cheminformatics and drug design projects | 122
MGIDI: toward an effective multivariate selection in biological experiments | 120
FlaGs and webFlaGs: discovering novel biology through the analysis of gene neighbourhood conservation | 116
PhyKIT: a broadly applicable UNIX shell toolkit for processing and analyzing phylogenomic data | 111
dittoSeq: universal user-friendly single-cell and bulk RNA sequencing visualization toolkit | 109
Make Interactive Complex Heatmaps in R | 107
ProDy 2.0: increased scale and scope after 10 years of protein dynamics modelling with Python | 105
BERT4Bitter: a bidirectional encoder representations from transformers (BERT)-based model for improving the prediction of bitter peptides | 104
MUFFIN: multi-scale feature fusion for drug–drug interaction prediction | 104
TITAN: T-cell receptor specificity prediction with bimodal attention networks | 104
Accurate, scalable cohort variant calls using DeepVariant and GLnexus | 102
SoluProt: prediction of soluble protein expression in Escherichia coli | 99
POKY: a software suite for multidimensional NMR and 3D structure calculation of biomolecules | 97
Structure-aware protein–protein interaction site prediction using deep graph convolutional network | 96
Impact of protein conformational diversity on AlphaFold predictions | 95
ABlooper: fast accurate antibody CDR loop structure prediction with accuracy estimation | 94
BWA-MEME: BWA-MEM emulated with a machine learning approach | 92
Cellsnp-lite: an efficient tool for genotyping single cells | 91
HyperAttentionDTI: improving drug–protein interaction prediction by sequence-based deep learning with attention mechanism | 87
SumGNN: multi-typed drug interaction prediction via efficient knowledge graph summarization | 84
Prediction of antimicrobial resistance based on whole-genome sequencing and machine learning | 84
PyMod 3: a complete suite for structural bioinformatics in PyMOL | 83
UCSCXenaShiny: an R/CRAN package for interactive analysis of UCSC Xena data | 82
ToxIBTL: prediction of peptide toxicity based on information bottleneck and transfer learning | 80
CellProfiler Analyst 3.0: accessible data exploration and machine learning for image analysis | 78
Protein interaction interface region prediction by geometric deep learning | 76
Conditional out-of-distribution generation for unpaired data using transfer VAE | 76
Bacteriophage classification for assembled contigs using graph convolutional network | 75
Subtype-GAN: a deep learning approach for integrative cancer subtyping of multi-omics data | 75
StainedGlass: interactive visualization of massive tandem repeat structures with identity heatmaps | 74
Deuteros 2.0: peptide-level significance testing of data from hydrogen deuterium exchange mass spectrometry | 73
MOVICS: an R package for multi-omics integration and visualization in cancer subtyping | 73
DeepSurf: a surface-based deep learning approach for the prediction of ligand binding sites on proteins | 72
RNA-SeQC 2: efficient RNA-seq quality control and quantification for large cohorts | 71
TALE: Transformer-based protein function Annotation with joint sequence–Label Embedding | 69
V-pipe: a computational pipeline for assessing viral genetic diversity from high-throughput data | 68
MDeePred: novel multi-channel protein featurization for deep learning-based binding affinity prediction in drug discovery | 68
Humanization of antibodies using a machine learning approach on large-scale repertoire data | 68
Mutalyzer 2: next generation HGVS nomenclature checker | 66
DeepGraphGO: graph neural network for large-scale, multispecies protein function prediction | 66
iEnhancer-XG: interpretable sequence-based enhancers and their strength predictor | 65
BP4RNAseq: a babysitter package for retrospective and newly generated RNA-seq data analyses using both alignment-based and alignment-free quantification method | 64
lncLocator 2.0: a cell-line-specific subcellular localization predictor for long non-coding RNAs with interpretable deep learning | 62
VPF-Class: taxonomic assignment and host prediction of uncultivated viruses based on viral protein families | 62
SVIM-asm: structural variant detection from haploid and diploid genome assemblies | 62
ImmuCellAI-mouse: a tool for comprehensive prediction of mouse immune cell abundance and immune microenvironment depiction | 62
MBG: Minimizer-based sparse de Bruijn Graph construction | 61
SpatialExperiment: infrastructure for spatially-resolved transcriptomics data in R using Bioconductor | 61
MIB2: metal ion-binding site prediction and modeling server | 61
ODGI: understanding pangenome graphs | 61
Extended connectivity interaction features: improving binding affinity prediction through chemical description | 61
stPlus: a reference-based method for the accurate enhancement of spatial transcriptomics | 60
Evaluating single-cell cluster stability using the Jaccard similarity index | 60
COVID-19 Knowledge Graph: a computable, multi-modal, cause-and-effect knowledge model of COVID-19 pathophysiology | 59
HunFlair: an easy-to-use tool for state-of-the-art biomedical named entity recognition | 59
Current structure predictors are not learning the physics of protein folding | 59
Improved RNA secondary structure and tertiary base-pairing prediction using evolutionary profile, mutational coupling and two-dimensional transfer learning | 58
Interfacing Seurat with the R tidy universe | 58
MVGCN: data integration through multi-view graph convolutional network for predicting links in biomedical bipartite networks | 57
HierCC: a multi-level clustering scheme for population assignments based on core genome MLST | 57
Plotgardener: cultivating precise multi-panel figures in R | 57
eMPRess: a systematic cophylogeny reconciliation tool | 57
amPEPpy 1.0: a portable and accurate antimicrobial peptide prediction tool | 57
Tiara: deep learning-based classification system for eukaryotic sequences | 56
DamageProfiler: fast damage pattern calculation for ancient DNA | 55
Manifold alignment for heterogeneous single-cell multi-omics data integration using Pamona | 55
Cellinker: a platform of ligand–receptor interactions for intercellular communication analysis | 54
GMNN2CD: identification of circRNA–disease associations based on variational inference and graph Markov neural networks | 54
BridgeDPI: a novel Graph Neural Network for predicting drug–protein interactions | 53
MobiDB-lite 3.0: fast consensus annotation of intrinsic disorder flavors in proteins | 52
Identification of sub-Golgi protein localization by use of deep representation learning features | 51
BERN2: an advanced neural biomedical named entity recognition and normalization tool | 51
Swarm v3: towards tera-scale amplicon clustering | 50
MungeSumstats: a Bioconductor package for the standardization and quality control of many GWAS summary statistics | 50
DLAB: deep learning methods for structure-based virtual screening of antibodies | 50
PROSS 2: a new server for the design of stable and highly expressed protein variants | 49
RCSB Protein Data Bank: improved annotation, search and visualization of membrane protein structures archived in the PDB | 49
GPDBN: deep bilinear network integrating both genomic data and pathological images for breast cancer prognosis prediction | 49
Real-time mapping of nanopore raw signals | 48
BACPI: a bi-directional attention neural network for compound–protein interaction and binding affinity prediction | 48
Generating property-matched decoy molecules using deep learning | 48
Predicting protein–peptide binding residues via interpretable deep learning | 48
Stitching and registering highly multiplexed whole-slide images of tissues and tumors using ASHLAR | 47
Ensembling graph attention networks for human microbe–drug association prediction | 47
mlr3proba: an R package for machine learning in survival analysis | 47
Supervised graph co-contrastive learning for drug–target interaction prediction | 46
MAGUS: Multiple sequence Alignment using Graph clUStering | 46
FL-QSAR: a federated learning-based QSAR prototype for collaborative drug discovery | 46
SCIM: universal single-cell matching with unpaired feature sets | 45
SpacePHARER: sensitive identification of phages from CRISPR spacers in prokaryotic hosts | 45
SimPlot++: a Python application for representing sequence similarity and detecting recombination | 45
MultiDTI: drug–target interaction prediction based on multi-modal representation learning to bridge the gap between new chemical entities and known heterogeneous network | 45
coronaSPAdes: from biosynthetic gene clusters to RNA viral assemblies | 44
monaLisa: an R/Bioconductor package for identifying regulatory motifs | 44
Topsy-Turvy: integrating a global view into sequence-based PPI prediction | 44
sepal: identifying transcript profiles with spatial patterns by diffusion-based modeling | 44
Advanced graph and sequence neural networks for molecular property prediction and drug discovery | 44
AMICI: high-performance sensitivity analysis for large ordinary differential equation models | 44
SOMDE: a scalable method for identifying spatially variable genes with self-organizing map | 43
Socket2: a program for locating, visualizing and analyzing coiled-coil interfaces in protein structures | 43
orfipy: a fast and flexible tool for extracting ORFs | 43
Deep cross-omics cycle attention model for joint analysis of single-cell multi-omics data | 42
ASpli: integrative analysis of splicing landscapes through RNA-Seq assays | 42
Modeling multi-scale data via a network of networks | 42
BERT-Kcr: prediction of lysine crotonylation sites by a transfer learning method with pre-trained BERT models | 42
Deep graph learning of inter-protein contacts | 41
Deep learning models for RNA secondary structure prediction (probably) do not generalize across families | 41
Federated Random Forests can improve local performance of predictive models for various healthcare applications | 41
scGate: marker-based purification of cell types from heterogeneous single-cell RNA-seq datasets | 41
BERTMHC: improved MHC–peptide class II interaction prediction with transformer and multiple instance learning | 40
Cross-dependent graph neural networks for molecular property prediction | 40
Bayesian modeling of spatial molecular profiling data via Gaussian process | 40
Learning embedding features based on multisense-scaled attention architecture to improve the predictive performance of anticancer peptides | 39
PhosIDN: an integrated deep neural network for improving protein phosphorylation site prediction by combining sequence and protein–protein interaction information | 39
Multi-omics data integration by generative adversarial network | 39
cytomapper: an R/Bioconductor package for visualization of highly multiplexed imaging data | 39
Increasing the accuracy of single sequence prediction methods using a deep semi-supervised learning framework | 39
Adversarial deconfounding autoencoder for learning robust gene expression embeddings | 39
ELIXIR: providing a sustainable infrastructure for life science data at European scale | 38
CLNN-loop: a deep learning model to predict CTCF-mediated chromatin loops in the different cell lines and CTCF-binding sites (CBS) pair types | 38
Pre-training graph neural networks for link prediction in biomedical networks | 38
GNN-based embedding for clustering scRNA-seq data | 38
FraGAT: a fragment-oriented multi-scale graph attention model for molecular property prediction | 38
Effective drug–target interaction prediction with mutual interaction neural network | 38
Node similarity-based graph convolution for link prediction in biological networks | 38
E-MAGMA: an eQTL-informed method to identify risk genes using genome-wide association study summary statistics | 37
scGAC: a graph attentional architecture for clustering single-cell RNA-seq data | 37
A mixture model for determining SARS-Cov-2 variant composition in pooled samples | 37
TPpred-ATMV: therapeutic peptide prediction by adaptive multi-view tensor learning model | 37
DeepViral: prediction of novel virus–host interactions from protein sequences and infectious disease phenotypes | 36
Clustering spatial transcriptomics data | 36
DeepGOZero: improving protein function prediction from sequence and zero-shot learning based on ontology axioms | 36
DRUMMER—rapid detection of RNA modifications through comparative nanopore sequencing | 36
KG4SL: knowledge graph neural network for synthetic lethality prediction in human cancers | 36
CUT&RUNTools 2.0: a pipeline for single-cell and bulk-level CUT&RUN and CUT&Tag data analysis | 36
Statistical approaches for differential expression analysis in metatranscriptomics | 36
MOMA: a multi-task attention learning algorithm for multi-omics data interpretation and classification | 35
DeepTrio: a ternary prediction system for protein–protein interaction using mask multiple parallel convolutional neural networks | 35
Graph2MDA: a multi-modal variational graph embedding model for predicting microbe–drug associations | 35
Sparse and skew hashing of K-mers | 35
Transfer learning via multi-scale convolutional neural layers for human–virus protein–protein interaction prediction | 35
Methylartist: tools for visualizing modified bases from nanopore sequence data | 35
SPOT-Contact-LM: improving single-sequence-based prediction of protein contact map using a transformer language model | 35
High-performance single-cell gene regulatory network inference at scale: the Inferelator 3.0 | 34
ClusTCR: a python interface for rapid clustering of large sets of CDR3 sequences with unknown antigen specificity | 34
MIMOSA2: a metabolic network-based tool for inferring mechanism-supported relationships in microbiome-metabolome data | 34
Identifying signaling genes in spatial single-cell expression data | 34
Interpretable-ADMET: a web service for ADMET prediction and optimization based on deep neural representation | 33
GeneGPT: augmenting large language models with domain tools for improved access to biomedical information | 33
ReadBouncer: precise and scalable adaptive sampling for nanopore sequencing | 33
EpiGraphDB: a database and data mining platform for health data science | 33
GNN-SubNet: disease subnetwork detection with explainable graph neural networks | 33
FBA reveals guanylate kinase as a potential target for antiviral therapies against SARS-CoV-2 | 33
DisoLipPred: accurate prediction of disordered lipid-binding residues in protein sequences with deep recurrent networks and transfer learning | 33
PecanPy: a fast, efficient and parallelized Python implementation of node2vec | 32
CrossTalkeR: analysis and visualization of ligand–receptor networks | 32
AOP-helpFinder webserver: a tool for comprehensive analysis of the literature to support adverse outcome pathways development | 32
Cooperative sequence clustering and decoding for DNA storage system with fountain codes | 32
panRGP: a pangenome-based method to predict genomic islands and explore their diversity | 31
A deep dilated convolutional residual network for predicting interchain contacts of protein homodimers | 31
HFBSurv: hierarchical multimodal fusion with factorized bilinear models for cancer survival prediction | 31
Drug repurposing against breast cancer by integrating drug-exposure expression profiles and drug–drug links based on graph neural network | 31
LLPSDB v2.0: an updated database of proteins undergoing liquid–liquid phase separation in vitro | 31
Non-parametric modelling of temporal and spatial counts data from RNA-seq experiments | 31
Integrating genome-scale metabolic modelling and transfer learning for human gene regulatory network reconstruction | 31
Quantification of aneuploidy in targeted sequencing data using ASCETS | 31
Autoencoder-based drug–target interaction prediction by preserving the consistency of chemical properties and functions of drugs | 30
NerLTR-DTA: drug–target binding affinity prediction based on neighbor relationship and learning to rank | 30
TransformerGO: predicting protein–protein interactions by modelling the attention between sets of gene ontology terms | 30
Annotating high-impact 5′untranslated region variants with the UTRannotator | 30
CSM-AB: graph-based antibody–antigen binding affinity prediction and docking scoring function | 30
Accurate spliced alignment of long RNA sequencing reads | 30
Scaling multi-instance support vector machine to breast cancer detection on the BreaKHis dataset | 30
BERT-GT: cross-sentence n-ary relation extraction with BERT and Graph Transformer | 30
i6mA-Caps: a CapsuleNet-based framework for identifying DNA N6-methyladenine sites | 29
iDNA-ABT: advanced deep learning model for detecting DNA methylation with adaptive features and transductive information maximization | 29
DeMaSk: a deep mutational scanning substitution matrix and its use for variant impact prediction | 29
LongPhase: an ultra-fast chromosome-scale phasing algorithm for small and large variants | 29
SMILE: mutual information learning for integration of single-cell omics data | 29
Recognition of small molecule–RNA binding sites using RNA sequence and structure | 29
CBNplot: Bayesian network plots for enrichment analysis | 29
DeepAc4C: a convolutional neural network model with hybrid features composed of physicochemical patterns and distributed representation information for identification of N4-acetylcytidine in mRNA | 29
Accurate deep learning off-target prediction with novel sgRNA-DNA sequence encoding in CRISPR-Cas9 gene editing | 29
DeepUMQA: ultrafast shape recognition-based protein model quality assessment using deep learning | 29
DFBP: a comprehensive database of food-derived bioactive peptides for peptidomics research | 29
Graph attention network for link prediction of gene regulations from single-cell RNA-sequencing data | 29
EPSOL: sequence-based protein solubility prediction using multidimensional embedding | 28
PhenoTagger: a hybrid method for phenotype concept recognition using human phenotype ontology | 28
FrustratometeR: an R-package to compute local frustration in protein structures, point mutants and MD simulations | 28
phyLoSTM: a novel deep learning model on disease prediction from longitudinal microbiome data | 28
ksrates: positioning whole-genome duplications relative to speciation events in KS distributions | 28
RNAcmap: a fully automatic pipeline for predicting contact maps of RNAs by evolutionary coupling analysis | 28
NetSolP: predicting protein solubility in Escherichia coli using language models | 28
PICS2: next-generation fine mapping via probabilistic identification of causal SNPs | 28
Perceiver CPI: a nested cross-attention network for compound–protein interaction prediction | 27
Multi-instance learning of graph neural networks for aqueous pKa prediction | 27
ASTRAL-Pro 2: ultrafast species tree reconstruction from multi-copy gene family trees | 27
SubtypeDrug: a software package for prioritization of candidate cancer subtype-specific drugs | 27
Graph contextualized attention network for predicting synthetic lethality in human cancers | 27
Virtifier: a deep learning-based identifier for viral sequences from metagenomes | 27
Weakly supervised learning of RNA modifications from low-resolution epitranscriptome data | 27
Neuron tracing from light microscopy images: automation, deep learning and bench testing | 27
APSCALE: advanced pipeline for simple yet comprehensive analyses of DNA metabarcoding data | 27
A span-graph neural model for overlapping entity relation extraction in biomedical texts | 27
Identifying interactions in omics data for clinical biomarker discovery using symbolic regression | 27
Improving circRNA–disease association prediction by sequence and ontology representations with convolutional and recurrent neural networks | 27
Multi-way relation-enhanced hypergraph representation learning for anti-cancer drug synergy prediction | 27
Normalization of single-cell RNA-seq counts by log(x + 1) or log(1 + x) | 27
Viral Host Range database, an online tool for recording, analyzing and disseminating virus–host interactions | 27
MMpred: a distance-assisted multimodal conformation sampling for de novo protein structure prediction | 27
The COVID-19 Ontology | 27
Multi-level attention graph neural network based on co-expression gene modules for disease diagnosis and prognosis | 27
No one tool to rule them all: prokaryotic gene prediction tool annotations are highly dependent on the organism of study | 26
Newt: a comprehensive web-based tool for viewing, constructing and analyzing biological maps | 26
PMLB v1.0: an open-source dataset collection for benchmarking machine learning methods | 26
SPOT-1D-Single: improving the single-sequence-based prediction of protein secondary structure, backbone angles, solvent accessibility and half-sphere exposures using a large training set and ensembled | 26
Deep feature extraction of single-cell transcriptomes by generative adversarial network | 26
Random forest of perfect trees: concept, performance, applications and perspectives | 26
Genozip: a universal extensible genomic data compressor | 26
BioDynaMo: a modular platform for high-performance agent-based simulation | 26
High-sensitivity pattern discovery in large, paired multiomic datasets | 26
PathCNN: interpretable convolutional neural networks for survival prediction and pathway analysis applied to glioblastoma | 26
The iPPI-DB initiative: a community-centered database of protein–protein interaction modulators | 26