Statistical Analysis and Data Mining

Papers
(The TQCC of Statistical Analysis and Data Mining is 1. The table below lists the papers that meet or exceed that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-05-01 to 2025-05-01.)
Article | Citations
[title missing] | 415
Predictive models with end user preference | 26
Evaluating causal‐based feature selection for fuel property prediction models | 15
A practical extension of the recursive multi‐fidelity model for the emulation of hole closure experiments | 13
Semi‐Parametric Least‐Area Linear‐Circular Regression Through Möbius Transformation | 12
Modeling and inference for mixtures of simple symmetric exponential families of ‐dimensional distributions for vectors with binary coordinates | 12
Data‐driven sparse partial least squares | 12
Sample selection bias in evaluation of prediction performance of causal models | 12
Survival trees based on heterogeneity in time‐to‐event and censoring distributions using parameter instability test | 11
Randomized multiarm bandits: An improved adaptive data collection method | 11
[title missing] | 9
CLADAG 2021 special issue: Selected papers on classification and data analysis | 9
Some Bayesian biclustering methods: Modeling and inference | 8
Kernel learning with nonconvex ramp loss | 8
Model Averaging for Regression Kink Models | 7
Data Twinning | 7
Issue Information | 7
BayesMultiomics: An R Package for Bayesian Shrinkage Models for Integration and Analysis of Multi‐Platform High‐Dimensional Genomics Data | 7
Optimal ratio for data splitting | 6
Bayesian shrinkage models for integration and analysis of multiplatform high‐dimensional genomics data | 5
[title missing] | 5
Weighted AutoEncoding recommender system | 5
An efficient k‐modes algorithm for clustering categorical datasets | 5
Negative binomial graphical model with excess zeros | 5
Tracking clusters and anomalies in evolving data streams | 4
Bayesian inference for nonprobability samples with nonignorable missingness | 3
An Improved D2GAN‐based oversampling algorithm for imbalanced data classification | 3
Issue Information | 3
Robust deep neural network surrogate models with uncertainty quantification via adversarial training | 3
A tree‐based gene–environment interaction analysis with rare features | 3
Multivariate contaminated normal mixture regression modeling of longitudinal data based on joint mean‐covariance model | 3
A finely tuned deep transfer learning algorithm to compare outsole images | 3
Multi‐node Expectation–Maximization algorithm for finite mixture models | 3
On difference‐based gradient estimation in nonparametric regression | 3
Model‐Based Recursive Partitioning for Discrete Event Times | 3
[title missing] | 3
Bayesian modeling of location, scale, and shape parameters in skew‐normal regression models | 3
Input‐response space‐filling designs incorporating response uncertainty | 3
Integrative learning of structured high‐dimensional data from multiple datasets | 3
Estimating basis functions in massive fields under the spatial mixed effects model | 3
Comparison of merging strategies for building machine learning models on multiple independent gene expression data sets | 3
Sparse Bayesian variable selection in high‐dimensional logistic regression models with correlated priors | 2
A new formulation of sparse multiple kernel k‐means clustering and its applications | 2
Local influence analysis for the sliced average third‐moment estimation | 2
Issue Information | 2
Development and validation of models for two‐week mortality of inpatients with COVID‐19 infection: A large prospective cohort study | 2
Cost‐sensitive classification with time constraint on incomplete data | 2
Issue Information | 2
Sketched Stochastic Dictionary Learning for large‐scale data and application to high‐throughput mass spectrometry | 2
The analysis of association rules: Latent class analysis | 2
The fairness‐accuracy Pareto front | 2
Interaction Tests With Covariate‐Adaptive Randomization | 2
Bayesian Hybrid Model Search and Averaging for Sparse Gaussian Process Regression | 2
[title missing] | 2
Adversarially robust subspace learning in the spiked covariance model | 2
eRPCA: Robust Principal Component Analysis for Exponential Family Distributions | 2
Issue Information | 2
Nonparametric clustering of RNA‐sequencing data | 2
Driving mode analysis—How uncertain functional inputs propagate to an output | 2
Biclustering high‐frequency financial time series based on information theory | 2
A Novel Approach for APT Detection Based on Ensemble Learning Model | 2
Semiparametric estimation of average treatment effects in observational studies | 1
Density estimation via measure transport: Outlook for applications in the biological sciences | 1
Gaussian process selections in semiparametric multi‐kernel machine regression for multi‐pathway analysis | 1
Bayesian Posterior Interval Calibration to Improve the Interpretability of Observational Studies | 1
Multi‐scale affinities with missing data: Estimation and applications | 1
Estimation of disease progression for ischemic heart disease using latent Markov with covariates | 1
[title missing] | 1
Bayesian batch optimization for molybdenum versus tungsten inertial confinement fusion double shell target design | 1
Modeling matrix variate time series via hidden Markov models with skewed emissions | 1
Issue Information | 1
[title missing] | 1
A Homogeneity Test for Ordinal Receiver Operating Characteristic Regression With Application to Facial Recognition Accuracy Assessment | 1
Simplicial depth and its median: Selected properties and limitations | 1
Weighted pivot coordinates for partial least squares‐based marker discovery in high‐throughput compositional data | 1
A deep learning factor analysis model based on importance‐weighted variational inference and normalizing flow priors: Evaluation within a set of multidimensional performance assessments in youth elite | 1
[title missing] | 1
Confidence bounds for threshold similarity graph in random variable network | 1
Regrouped design in privacy analysis for multinomial microdata | 1
A study of the impact of COVID‐19 on the Chinese stock market based on a new textual multiple ARMA model | 1
Efficient importance sampling imputation algorithms for quantile and composite quantile regression | 1
Coupled support tensor machine classification for multimodal neuroimaging data | 1
A new parametric approach to gender gap with application to EUSILC data in Poland and Italy | 1
Convolutional Sparse Coding for Time Series Via a ℓ0 Penalty: An Efficient Algorithm With Statistical Guarantees | 1
A Conversational Assistant for Democratization of Data Visualization: A Comparative Study of Two Approaches of Interaction | 1
Data‐driven stochastic model for quantifying the interplay between amyloid‐beta and calcium levels in Alzheimer's disease | 1
Semiparametric detection of changepoints in location, scale, and copula | 1
High‐dimensional classification based on nonparametric maximum likelihood estimation under unknown and inhomogeneous variances | 1
Weighted validation of heteroscedastic regression models for better selection | 1
Corrigendum | 1
Cluster analysis via random partition distributions | 1
Quantifying Epistemic Uncertainty in Binary Classification via Accuracy Gain | 1
A family of mixture models for biclustering | 1
[title missing] | 1
An automated alignment algorithm for identification of the source of footwear impressions with common class characteristics | 1
[title missing] | 1