Data Mining and Knowledge Discovery

Papers
(The median citation count of Data Mining and Knowledge Discovery is 2. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-06-01 to 2025-06-01.)
Article | Citations
A probabilistic model for API contract specification retrieval focusing on the openAPI standard | 154
Joint dynamic topic model for recognition of lead-lag relationship in two text corpora | 121
Who can receive the pass? A computational model for quantifying availability in soccer | 113
Counterfactual explanations as interventions in latent space | 98
Knowledge graph completion based on asymmetric translation and automatic entity type representation | 66
Exploiting sensor data in professional road cycling: personalized data-driven approach for frequent fitness monitoring | 61
Representing ensembles of networks for fuzzy cluster analysis: a case study | 61
Discord-based counterfactual explanations for time series classification | 50
Thompson sampling-based recursive block elimination for dynamic assignment under limited budget in pure-exploration | 49
TCMI: a non-parametric mutual-dependence estimator for multivariate continuous distributions | 49
Correction: Marginal effects for non-linear prediction functions | 49
The grammar of interactive explanatory model analysis | 43
Traffic forecasting on new roads using spatial contrastive pre-training (SCPT) | 39
Hydra: competing convolutional kernels for fast and accurate time series classification | 38
Wisdom of the contexts: active ensemble learning for contextual anomaly detection | 37
VEM²L: an easy but effective framework for fusing text and structure knowledge on sparse knowledge graph completion | 36
Reflective-net: learning from explanations | 35
MMA: metadata supported multi-variate attention for onset detection and prediction | 35
Dynamic cyber risk estimation with competitive quantile autoregression | 28
Neural content-aware collaborative filtering for cold-start music recommendation | 24
Correction: Bake off redux: a review and experimental evaluation of recent time series classification algorithms | 21
SALτ: efficiently stopping TAR by improving priors estimates | 20
TenGAN: adversarially generating multiplex tensor graphs | 20
On computing exact means of time series using the move-split-merge metric | 20
Improving neural network’s robustness on tabular data with D-layers | 20
Approximation trees: statistical reproducibility in model distillation | 19
Explainable decomposition of nested dense subgraphs | 19
AA-forecast: anomaly-aware forecast for extreme events | 17
Generalized core maintenance of dynamic bipartite graphs | 17
Robust explainer recommendation for time series classification | 17
Interpretable representations in explainable AI: from theory to practice | 17
MultiRocket: multiple pooling operators and transformations for fast and effective time series classification | 16
Contextualization of soccer analysis with tactical periodization and machine learning | 16
On GNN explainability with activation rules | 16
What do anomaly scores actually mean? Dynamic characteristics beyond accuracy | 16
On the evaluation of outlier detection and one-class classification: a comparative study of algorithms, model selection, and ensembles | 16
Multilayer horizontal visibility graphs for multivariate time series analysis | 16
Exploiting second-order dissimilarity representations for hierarchical clustering and visualization | 15
Robust and sparse multinomial regression in high dimensions | 15
Explainable and interpretable machine learning and data mining | 15
Topic change point detection using a mixed Bayesian model | 15
Explanatory artificial intelligence (YAI): human-centered explanations of explainable AI and complex data | 15
EmbAssi: embedding assignment costs for similarity search in large graph databases | 14
Hyperbolic node embedding for temporal networks | 14
Sky-signatures: detecting and characterizing recurrent behavior in sequential data | 14
Efficient algorithms for fair clustering with a new notion of fairness | 14
Random walks with variable restarts for negative-example-informed label propagation | 13
A comprehensive taxonomy for explainable artificial intelligence: a systematic survey of surveys on methods and concepts | 13
Coupled block diagonal regularization for multi-view subspace clustering | 13
Algorithmic fairness datasets: the story so far | 13
BROCCOLI: overlapping and outlier-robust biclustering through proximal stochastic gradient descent | 12
PAC-Bayesian lifelong learning for multi-armed bandits | 12
Bounding the family-wise error rate in local causal discovery using Rademacher averages | 12
Unsupervised feature based algorithms for time series extrinsic regression | 12
Hypercore decomposition for non-fragile hyperedges: concepts, algorithms, observations, and applications | 11
Randomnet: clustering time series using untrained deep neural networks | 11
Mondrian forest for data stream classification under memory constraints | 11
NICE: an algorithm for nearest instance counterfactual explanations | 11
Link prediction in dynamic networks using random dot product graphs | 11
Unsupervised domain adaptation with non-stochastic missing data | 10
Missing value replacement in strings and applications | 10
Synwalk: community detection via random walk modelling | 10
K-plex cover pooling for graph neural networks | 10
Structural learning of simple staged trees | 10
Making clusterings fairer by post-processing: algorithms, complexity results and experiments | 10
Inferring tie strength in temporal networks | 10
An eager splitting strategy for online decision trees in ensembles | 10
When graph convolution meets double attention: online privacy disclosure detection with multi-label text classification | 10
Dynamic self-paced sampling ensemble for highly imbalanced and class-overlapped data classification | 9
Temporal state change Bayesian networks for modeling of evolving multivariate state sequences: model, structure discovery and parameter estimation | 9
Model-agnostic feature importance and effects with dependent features: a conditional subgroup approach | 9
Locality adaptive incomplete multi-view subspace clustering | 9
Detach-ROCKET: sequential feature selection for time series classification with random convolutional kernels | 9
ClaSP: parameter-free time series segmentation | 9
Grouped feature importance and combined features effect plot | 9
Robust subgroup discovery | 8
PETSC: pattern-based embedding for time series classification | 8
A Lagrangian-based score for assessing the quality of pairwise constraints in semi-supervised clustering | 8
Central node identification via weighted kernel density estimation | 8
Intersectional fair ranking via subgroup divergence | 8
Modelling event sequence data by type-wise neural point process | 7
Knowledge graph embedding closed under composition | 7
i-Align: an interpretable knowledge graph alignment model | 7
Marginal effects for non-linear prediction functions | 7
MrTF: model refinery for transductive federated learning | 7
Structural iterative lexicographic autoencoded node representation | 7
Sentiment analysis in tweets: an assessment study from classical to modern word representation models | 7
Binary quantification and dataset shift: an experimental investigation | 7
Bake off redux: a review and experimental evaluation of recent time series classification algorithms | 7
Continuous treatment effect estimation via generative adversarial de-confounding | 7
One-shot relational learning for extrapolation reasoning on temporal knowledge graphs | 7
A tale of two roles: exploring topic-specific susceptibility and influence in cascade prediction | 7
An external stability audit framework to test the validity of personality prediction in AI hiring | 7
Benchmarking and survey of explanation methods for black box models | 6
ArcMatch: high-performance subgraph matching for labeled graphs by exploiting edge domains | 6
Sequential query prediction based on multi-armed bandits with ensemble of transformer experts and immediate feedback | 6
Strengthening ties towards a highly-connected world | 6
Data-driven learning optimal K values for K-nearest neighbour matching in causal inference | 6
Session-based recommendation by exploiting substitutable and complementary relationships from multi-behavior data | 6
On regime changes in text data using hidden Markov model of contaminated vMF distribution | 6
Community detection in interval-weighted networks | 6
Z-Time: efficient and effective interpretable multivariate time series classification | 6
Online concept evolution detection based on active learning | 6
BDRI: block decomposition based on relational interaction for knowledge graph completion | 5
MIRACLE: Malware image recognition and classification by layered extraction | 5
HyEED: embedding learning of knowledge graphs with entity description in hyperbolic space | 5
SOKNL: A novel way of integrating K-nearest neighbours with adaptive random forest regression for data streams | 5
OEC: an online ensemble classifier for mining data streams with noisy labels | 5
Mint: MDL-based approach for Mining INTeresting Numerical Pattern Sets | 5
Regularization-based methods for ordinal quantification | 5
Large scale K-means clustering using GPUs | 5
Inferring range of information diffusion based on historical frequent items | 5
Fairness in vulnerable attribute prediction on social media | 5
Improving position encoding of transformers for multivariate time series classification | 5
Exploring potential biases towards blockbuster items in ranking-based recommendations | 5
GeoRF: a geospatial random forest | 5
Knowledge graph embedding methods for entity alignment: experimental review | 5
ContE: contextualized knowledge graph embedding for circular relations | 4
Differentially Private Distance Learning in Categorical Data | 4
MultiETSC: automated machine learning for early time series classification | 4
Robust regression via error tolerance | 4
Efficient outlier detection in numerical and categorical data | 4
Attention based adversarially regularized learning for network embedding | 4
Explainable contextual anomaly detection using quantile regression forests | 4
Mining sequences with exceptional transition behaviour of varying order using quality measures based on information-theoretic scoring functions | 4
INK: knowledge graph embeddings for node classification | 4
Counterfactual inference with latent variable and its application in mental health care | 4
Negative-sample-free knowledge graph embedding | 4
Matrix sketching for supervised classification with imbalanced classes | 4
Model-agnostic variable importance for predictive uncertainty: an entropy-based approach | 4
Conclusive local interpretation rules for random forests | 4
A multi-scale time series forecasting framework with temporal hierarchical information fusion and reconciliation | 4
Multi-neighbor social recommendation with attentional graph convolutional network | 4
Improving graph-based recommendation with unraveled graph learning | 4
Explaining deep convolutional models by measuring the influence of interpretable features in image classification | 4
Using differential evolution for an attribute-weighted inverted specific-class distance measure for nominal attributes | 4
An attention matrix for every decision: faithfulness-based arbitration among multiple attention-based interpretations of transformers in text classification | 4
Isolation kernel: the X factor in efficient and effective large scale online kernel learning | 4
Effective interpretable learning for large-scale categorical data | 4
ConvMOS: climate model output statistics with deep learning | 4
Improving the core resilience of real-world hypergraphs | 4
Sequential stratified regeneration: MCMC for large state spaces with an application to subgraph count estimation | 4
Uplift modeling with quasi-loss-functions | 4
MERLIN++: parameter-free discovery of time series anomalies | 4
Scalable classifier-agnostic channel selection for multivariate time series classification | 3
Early abandoning and pruning for elastic distances including dynamic time warping | 3
Bias characterization, assessment, and mitigation in location-based recommender systems | 3
LoCoMotif: discovering time-warped motifs in time series | 3
A spatiotemporal deep neural network for fine-grained multi-horizon wind prediction | 3
Design and evaluation of highly accurate smart contract code vulnerability detection framework | 3
Fast block-wise partitioning for extreme multi-label classification | 3
Correction to: Studying bias in visual features through the lens of optimal transport | 3
An alternating nonmonotone projected Barzilai–Borwein algorithm of nonnegative factorization of big matrices | 3
A systematic review of deep learning for structural geological interpretation | 3
A combinatorial multi-armed bandit approach to correlation clustering | 3
Cost-sensitive ensemble learning: a unifying framework | 3
Tractable probabilistic models and computational complexity | 3
Interplay between topology and edge weights in real-world graphs: concepts, patterns, and an algorithm | 3
kNN matrix profile for knowledge discovery from time series | 3
Differentiated matching for individual and average treatment effect estimation | 3
Series2vec: similarity-based self-supervised representation learning for time series classification | 3
Structure-aware decoupled imputation network for multivariate time series | 3
Multiple-input neural networks for time series forecasting incorporating historical and prospective context | 3
CSCN: an efficient snapshot ensemble learning based sparse transformer model for long-range spatial-temporal traffic flow prediction | 3
Enhancing cluster analysis via topological manifold learning | 3
ARL: analogical reinforcement learning for knowledge graph reasoning | 3
Provable randomized rounding for minimum-similarity diversification | 3
The impact of variable ordering on Bayesian network structure learning | 3
Regularized impurity reduction: accurate decision trees with complexity guarantees | 3
A two-step anomaly detection based method for PU classification in imbalanced data sets | 3
Proximity forest 2.0: a new effective and scalable similarity-based classifier for time series | 3
Deep anomaly detection with partition contrastive learning for tabular data | 2
Proxy-enhanced cross-domain sequential recommendation | 2
A comparative evaluation of clustering-based outlier detection | 2
Enforcing fairness using ensemble of diverse Pareto-optimal models | 2
Sparse oblique decision trees: a tool to understand and manipulate neural net features | 2
Exploring uplift modeling with high class imbalance | 2
Controlling hallucinations at word level in data-to-text generation | 2
An anomaly aware network embedding framework for unsupervised anomalous link detection | 2
MODE-Bi-GRU: orthogonal independent Bi-GRU model with multiscale feature extraction | 2
Hybrid Bayesian network discovery with latent variables by scoring multiple interventions | 2
An efficient procedure for mining egocentric temporal motifs | 2
Chebyshev approaches for imbalanced data streams regression models | 2
Reciprocity in directed hypergraphs: measures, findings, and generators | 2
A hyperbolic approach for learning communities on graphs | 2
The art of centering without centering for robust principal component analysis | 2
The Hadamard decomposition problem | 2
Weighted sparse simplex representation: a unified framework for subspace clustering, constrained clustering, and active learning | 2
Generalized density attractor clustering for incomplete data | 2
Transfer how much: a fine-grained measure of the knowledge transferability of user behavior sequences in social network | 2
Simplification of genetic programs: a literature survey | 2
Data debiasing via causal diffusion model | 2
Fake review detection on online E-commerce platforms: a systematic literature review | 2
FRAPPE: fast rank approximation with explainable features for tensors | 2
Implicit consensus clustering from multiple graphs | 2
Correction: FRAPPE: fast rank approximation with explainable features for tensors | 2
Exploring the diverse world of SAX-based methodologies | 2
Enhancing racism classification: an automatic multilingual data annotation system using self-training and CNN | 2
VFC-SMOTE: very fast continuous synthetic minority oversampling for evolving data streams | 2
trie-nlg: trie context augmentation to improve personalized query auto-completion for short and unseen prefixes | 2
Social norm bias: residual harms of fairness-aware algorithms | 2
Clustering using GRASP and path relinking for the max k-cut problem | 2
Time series clustering with random convolutional kernels | 2
AURORA: A Unified fRamework fOR Anomaly detection on multivariate time series | 2