Statistical Analysis and Data Mining

(The median citation count of Statistical Analysis and Data Mining is 0. The table below lists those papers that are above that threshold based on CrossRef citation counts [max. 250 papers]. The publications cover those that have been published in the past four years, i.e., from 2021-02-01 to 2025-02-01.)
Issue Information340
Semi‐Parametric Least‐Area Linear‐Circular Regression Through Möbius Transformation22
Sequence Outlier Detection and Application of Gated Recurrent Unit Autoencoder Gaussian Mixture Model Based on Various Loss Optimization18
A practical extension of the recursive multi‐fidelity model for the emulation of hole closure experiments18
Issue Information15
Predictive models with end user preference13
Issue Information11
Issue Information9
Confidence bounds for threshold similarity graph in random variable network9
Modeling and inference for mixtures of simple symmetric exponential families of ‐dimensional distributions for vectors with binary coordinates9
Factor analysis for high‐dimensional time series: Consistent estimation and efficient computation8
Regrouped design in privacy analysis for multinomial microdata8
Analyzing relevance vector machines using a single penalty approach8
Adaptive boosting for ordinal target variables using neural networks7
Estimation of disease progression for ischemic heart disease using latent Markov with covariates7
eRPCA: Robust Principal Component Analysis for Exponential Family Distributions6
Marginal clustered multistate models for longitudinal progressive processes with informative cluster size6
Randomized multiarm bandits: An improved adaptive data collection method6
Bilateral‐Weighted Online Adaptive Isolation Forest for anomaly detection in streaming data5
Online learning for streaming data classification in nonstationary environments5
Nonparametric clustering of RNA‐sequencing data5
Evaluating causal‐based feature selection for fuel property prediction models5
Survival trees based on heterogeneity in time‐to‐event and censoring distributions using parameter instability test4
Application of the Cox proportional hazards model and competing risks models to critical illness insurance data4
Buckley–James estimation of generalized additive accelerated lifetime model with ultrahigh‐dimensional data4
Data‐driven sparse partial least squares3
Imputed quantile vector autoregressive model for multivariate spatial–temporal data3
Rarity updated ensemble with oversampling: An ensemble approach to classification of imbalanced data streams3
Cluster analysis via random partition distributions3
Sketched Stochastic Dictionary Learning for large‐scale data and application to high‐throughput mass spectrometry3
The fairness‐accuracy Pareto front3
Subsampling under distributional constraints3
Issue Information3
High‐dimensional classification based on nonparametric maximum likelihood estimation under unknown and inhomogeneous variances3
Weighted validation of heteroscedastic regression models for better selection3
Specifying composites in structural equation modeling: A refinement of the Henseler–Ogasawara specification3
Issue Information3
A new logarithmic multiplicative distortion for correlation analysis3
Machine learning and neural network based model predictions of soybean export shares from US Gulf to China3
Sample selection bias in evaluation of prediction performance of causal models3
Subsampling from features in large regression to find “winning features”2
Conformal Multi‐Target Hyperrectangles2
Gaussian process selections in semiparametric multi‐kernel machine regression for multi‐pathway analysis2
Sparse Bayesian variable selection in high‐dimensional logistic regression models with correlated priors2
Smart data augmentation: One equation is all you need2
The generalized hyperbolic family and automatic model selection through the multiple‐choice LASSO2
Two‐sample testing for random graphs2
Assessment of the real‐time pattern recognition capability of machine learning algorithms2
Modal linear regression models with multiplicative distortion measurement errors2
Individualized image region detection with total variation2
Local influence analysis for the sliced average third‐moment estimation2
Association rules and decision rules2
A new formulation of sparse multiple kernel k‐means clustering and its applications2
The analysis of association rules: Latent class analysis2
Characterizing climate pathways using feature importance on echo state networks2
Bayesian modeling of location, scale, and shape parameters in skew‐normal regression models2
An automated alignment algorithm for identification of the source of footwear impressions with common class characteristics1
Development and validation of models for two‐week mortality of inpatients with COVID‐19 infection: A large prospective cohort study1
Optimal ratio for data splitting1
Portability analysis of data mining models for fog events forecasting1
Error‐controlled feature selection for ultrahigh‐dimensional and highly correlated feature space using deep learning1
Issue Information1
Issue Information1
A Novel Approach for APT Detection Based on Ensemble Learning Model1
CLADAG 2021 special issue: Selected papers on classification and data analysis1
Multi‐scale affinities with missing data: Estimation and applications1
Issue Information1
BayesMultiomics: An R Package for Bayesian Shrinkage Models for Integration and Analysis of Multi‐Platform High‐Dimensional Genomics Data1
Unsupervised random forests1
Coupled support tensor machine classification for multimodal neuroimaging data1
Efficient importance sampling imputation algorithms for quantile and composite quantile regression1
Convolutional Sparse Coding for Time Series Via a ℓ0 Penalty: An Efficient Algorithm With Statistical Guarantees1
Interaction Tests With Covariate‐Adaptive Randomization1
Driving mode analysis—How uncertain functional inputs propagate to an output1
Kernel learning with nonconvex ramp loss1
A new parametric approach to gender gap with application to EUSILC data in Poland and Italy1
Some Bayesian biclustering methods: Modeling and inference1
Issue Information1
Model‐based clustering of time‐dependent categorical sequences with application to the analysis of major life event patterns1
Nonparametric Expectile Regression Meets Deep Neural Networks: A Robust Nonlinear Variable Selection method1
Fourier neural networks as function approximators and differential equation solvers1
An efficient k‐modes algorithm for clustering categorical datasets0
Simplicial depth: Characterization and reconstruction0
A novel Bayesian method for variable selection and estimation in binary quantile regression0
Bag of little bootstraps for massive and distributed longitudinal data0
Adversarially robust subspace learning in the spiked covariance model0
On Algorithms and Approximations for Progressively Type‐I Censoring Schemes0
Semiparametric estimation of average treatment effects in observational studies0
Towards accelerating particle‐resolved direct numerical simulation with neural operators0
Ensemble learning for score likelihood ratios under the common source problem0
Coefficient tree regression for generalized linear models0
Boosting diversity in regression ensembles0
Frequentist model averaging for zero‐inflated Poisson regression models0
A machine learning oracle for parameter estimation0
Robustifying Marginal Linear Models for Correlated Responses Using a Constructive Multivariate Huber Distribution0
Spatially‐correlated time series clustering using location‐dependent Dirichlet process mixture model0
Hub‐aware random walk graph embedding methods for classification0
A Conversational Assistant for Democratization of Data Visualization: A Comparative Study of Two Approaches of Interaction0
Comparison of merging strategies for building machine learning models on multiple independent gene expression data sets0
Multivariate Gaussian RBF‐net for smooth function estimation and variable selection0
An Improved D2GAN‐based oversampling algorithm for imbalanced data classification0
Trees, forests, chickens, and eggs: when and why to prune trees in a random forest0
An approach to characterizing spatial aspects of image system blur0
Quantifying Epistemic Uncertainty in Binary Classification via Accuracy Gain0
Issue Information0
A modified least angle regression algorithm for interaction selection with heredity0
Issue Information0
Approximation error of Fourier neural networks0
Power grid frequency prediction using spatiotemporal modeling0
Integrative learning of structured high‐dimensional data from multiple datasets0
Ensembled sparse‐input hierarchical networks for high‐dimensional datasets0
Issue Information0
Nonparametric Bayesian functional clustering with applications to racial disparities in breast cancer0
Traditional kriging versus modern Gaussian processes for large‐scale mining data0
Model selection with bootstrap validation0
Bayesian shrinkage models for integration and analysis of multiplatform high‐dimensional genomics data0
Supervised compression of big data0
An Adaptive Microbiome‐Based Truncated Test0
A linear time method for the detection of collective and point anomalies0
Factor analysis of mixed data for anomaly detection0
Doubly robust estimation for non‐probability samples with modified intertwined probabilistic factors decoupling0
Imbalanced classification: A paradigm‐based review0
Issue Information0
Compositional variable selection in quantile regression for microbiome data with false discovery rate control0
Non‐uniform active learning for Gaussian process models with applications to trajectory informed aerodynamic databases0
Simplicial depth and its median: Selected properties and limitations0
Neural‐network transformation models for counting processes0
Handwriting identification using random forests and score‐based likelihood ratios0
On difference‐based gradient estimation in nonparametric regression0
Feature screening of ultrahigh dimensional longitudinal data based on the C‐statistic0
Modeling matrix variate time series via hidden Markov models with skewed emissions0
Weighted pivot coordinates for partial least squares‐based marker discovery in high‐throughput compositional data0
Prior effective sample size for exponential family distributions with multiple parameters0
Share density‐based clustering of income data0
A general iterative clustering algorithm0
Considerations in Bayesian agent‐based modeling for the analysis of COVID‐19 data0
Online embedding and clustering of evolving data streams0
Robust multitask learning in high dimensions under memory constraint0
Residual's influence index (RINFIN), bad leverage and unmasking in high dimensional L2‐regression0
Visual diagnostics of an explainer model: Tools for the assessment of LIME explanations0
Revisiting Winnow: A modified online feature selection algorithm for efficient binary classification0
A deep learning factor analysis model based on importance‐weighted variational inference and normalizing flow priors: Evaluation within a set of multidimensional performance assessments in youth elite0
Categorical classifiers in multiclass classification with imbalanced datasets0
Data Twinning0
Randomized algorithms for tensor response regression0
A treeless absolutely random forest with closed‐form estimators of expected proximities0
Persistent Classification: Understanding Adversarial Attacks by Studying Decision Boundary Dynamics0
Local support vector machine based dimension reduction0
Issue Information0
Residuals and diagnostics for multinomial regression models0
Emulated order identification for models of big time series data0
A study of the impact of COVID‐19 on the Chinese stock market based on a new textual multiple ARMA model0
A deep learning approach for the comparison of handwritten documents using latent feature vectors0
An initial exploration of Bayesian model calibration for estimating the composition of rocks and soils on Mars0
Transfer learning under the Cox model with interval‐censored data0
Erratum to “Data‐driven dimension reduction in functional principal component analysis identifying the change‐point in functional data”0
Issue Information0
A finely tuned deep transfer learning algorithm to compare outsole images0
A fast and efficient Modal EM algorithm for Gaussian mixtures0
Precision aggregated local models0
Negative binomial graphical model with excess zeros0
Regression‐based Bayesian estimation and structure learning for nonparanormal graphical models0
Using Neural Networks to Identify Mixture Components in Hyperspectral Reflectance Data0
Semi‐supervised multi‐label learning with missing labels by exploiting feature‐label correlations0
A tree‐based gene–environment interaction analysis with rare features0
Neural interval‐censored survival regression with feature selection0
Lq regularization for fair artificial intelligence robust to covariate shift0
Study of a bounded interval perks distribution with quantile regression analysis0
A comparison of Gaussian processes and neural networks for computer model emulation and calibration0
Sequential metamodel‐based approaches to level‐set estimation under heteroscedasticity0
Tracking clusters and anomalies in evolving data streams0
Bayesian Posterior Interval Calibration to Improve the Interpretability of Observational Studies0
An auxiliary Part‐of‐Speech tagger for blog and microblog cyber‐slang0
Density estimation via measure transport: Outlook for applications in the biological sciences0
Estimating basis functions in massive fields under the spatial mixed effects model0
The finite mixture model for the tails of distribution: Monte Carlo experiment and empirical applications0
Penalized composite likelihood for colored graphical Gaussian models0
Weighted AutoEncoding recommender system0
Node Centrality Inference via Hypothesis Testing0
Issue Information0
Multi‐node Expectation–Maximization algorithm for finite mixture models0
A neutral zone classifier for three classes with an application to text mining0
The Classification Algorithm Based on Functional Logistic Regression Model With Spatial Effects and Its Application in Air Quality Analysis0
Bayesian batch optimization for molybdenum versus tungsten inertial confinement fusion double shell target design0
Issue Information0
Exponential calibration for correlation coefficient with additive distortion measurement errors0
Data‐driven stochastic model for quantifying the interplay between amyloid‐beta and calcium levels in Alzheimer's disease0
Adaptive batching for Gaussian process surrogates with application in noisy level set estimation0
On an Empirical Likelihood Based Solution to the Approximate Bayesian Computation Problem0
Input‐response space‐filling designs incorporating response uncertainty0
Bayesian inference for nonprobability samples with nonignorable missingness0
Noise‐Augmented ℓ0 Regularization of Tensor Regression With Tucker Decomposition0
Robust deep neural network surrogate models with uncertainty quantification via adversarial training0
Measure inducing classification and regression trees for functional data0
Parallel coordinate order for high‐dimensional data0
Application of nonparametric quantifiers for online handwritten signature verification: A statistical learning approach0
Modeling subpopulations for hierarchically structured data0
Intuitively adaptable outlier detector0
Biclustering high‐frequency financial time series based on information theory0
Hierarchy‐assisted gene expression regulatory network analysis0
Generalized mixed‐effects random forest: A flexible approach to predict university student dropout0
Issue Information0
Issue Information0
An Efficient Filtering Approach for Model Estimation in Sparse Regression0
Distributed dimension reduction with nearly oracle rate0
Expert‐in‐the‐loop design of integral nuclear data experiments0
CLADAG 2019 Special Issue: Selected Papers on Classification and Data Analysis0
Feature selection for imbalanced data with deep sparse autoencoders ensemble0
Issue Information0
A novel two‐step extrapolation‐insertion risk model based on the Expectile under the Pareto‐type distribution0
Issue Information0
Greenwood Statistic Under Distortion Measurement Errors0
Semiparametric detection of changepoints in location, scale, and copula0
Out‐of‐bag stability estimation for k‐means clustering0
A random forest approach for interval selection in functional regression0
Evaluation and interpretation of driving risks: Automobile claim frequency modeling with telematics data0
Issue Information0
Cost‐sensitive classification with time constraint on incomplete data0
Nonparametric mean and variance adaptive classification rule for high‐dimensional data with heteroscedastic variances0
Bayesian relative composite quantile regression approach of ordinal latent regression model with L1/2 regularization0
Markov chain to analyze web usability of a university website using eye tracking data0
A tutorial on generative adversarial networks with application to classification of imbalanced data0
Multivariate contaminated normal mixture regression modeling of longitudinal data based on joint mean‐covariance model0
A family of mixture models for biclustering0
Issue Information0