Data Mining and Knowledge Discovery

Papers
(The median citation count of Data Mining and Knowledge Discovery is 2. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-10-01 to 2024-10-01.)
Article | Citations
The great multivariate time series classification bake off: a review and experimental evaluation of recent algorithmic advances | 244
Counterfactual explanations and how to find them: literature review and benchmarking | 100
A survey of community detection methods in multilayer networks | 93
A comprehensive taxonomy for explainable artificial intelligence: a systematic survey of surveys on methods and concepts | 80
MultiRocket: multiple pooling operators and transformations for fast and effective time series classification | 77
Time series extrinsic regression | 50
Deep graph similarity learning: a survey | 44
Fake review detection on online E-commerce platforms: a systematic literature review | 41
Forecast evaluation for data scientists: common pitfalls and best practices | 38
Time series motifs discovery under DTW allows more robust discovery of conserved structure | 34
Improving embedded knowledge graph multi-hop question answering by introducing relational chain reasoning | 34
Benchmarking and survey of explanation methods for black box models | 34
Relational Learning Analysis of Social Politics using Knowledge Graph Embedding | 34
End-to-end deep representation learning for time series clustering: a comparative study | 33
XEM: An explainable-by-design ensemble method for multivariate time series classification | 32
Smoothed dilated convolutions for improved dense prediction | 32
Algorithmic fairness datasets: the story so far | 31
Word-class embeddings for multiclass text classification | 26
Data-driven detection of counterpressing in professional football | 25
Graph convolutional networks for traffic forecasting with missing values | 22
Improving position encoding of transformers for multivariate time series classification | 20
Dataset2Vec: learning dataset meta-features | 19
The area under the ROC curve as a measure of clustering quality | 19
Hydra: competing convolutional kernels for fast and accurate time series classification | 19
A framework for deep constrained clustering | 18
Multi-label learning with missing and completely unobserved labels | 18
Model-agnostic feature importance and effects with dependent features: a conditional subgroup approach | 18
Grouped feature importance and combined features effect plot | 18
Efficient set-valued prediction in multi-class classification | 17
Boosting house price predictions using geo-spatial network embedding | 16
User preference and embedding learning with implicit feedback for recommender systems | 16
Hierarchical message-passing graph neural networks | 15
VFC-SMOTE: very fast continuous synthetic minority oversampling for evolving data streams | 15
Cost-sensitive ensemble learning: a unifying framework | 15
A survey of deep network techniques all classifiers can adopt | 15
Detecting virtual concept drift of regressors without ground truth values | 15
INK: knowledge graph embeddings for node classification | 15
Expected passes | 14
Sequence graph transform (SGT): a feature embedding function for sequence data mining | 13
Explanatory artificial intelligence (YAI): human-centered explanations of explainable AI and complex data | 12
Extending greedy feature selection algorithms to multiple solutions | 12
Simplification of genetic programs: a literature survey | 12
Feature extraction from unequal length heterogeneous EHR time series via dynamic time warping and tensor decomposition | 12
Bake off redux: a review and experimental evaluation of recent time series classification algorithms | 12
Knowledge graph embedding methods for entity alignment: experimental review | 12
Sequential recommendation with metric models based on frequent sequences | 12
POI recommendation with queuing time and user interest awareness | 11
A deep multimodal model for bug localization | 11
Who can receive the pass? A computational model for quantifying availability in soccer | 11
An efficient procedure for mining egocentric temporal motifs | 11
Robust subgroup discovery | 10
BROCCOLI: overlapping and outlier-robust biclustering through proximal stochastic gradient descent | 10
Interpretability, personalization and reliability of a machine learning based clinical decision support system | 10
PETSC: pattern-based embedding for time series classification | 10
Chebyshev approaches for imbalanced data streams regression models | 10
Controlling hallucinations at word level in data-to-text generation | 10
A recurrent neural network architecture to model physical activity energy expenditure in older people | 9
Synwalk: community detection via random walk modelling | 9
ClaSP: parameter-free time series segmentation | 9
Stable and actionable explanations of black-box models through factual and counterfactual rules | 9
Sufficient dimension reduction for average causal effect estimation | 9
Predictive modeling of infant mortality | 9
What’s in a name? – gender classification of names with character based machine learning models | 9
Novel features for time series analysis: a complex networks approach | 9
SPEck: mining statistically-significant sequential patterns efficiently with exact sampling | 9
The minimum description length principle for pattern mining: a survey | 9
Informative pseudo-labeling for graph neural networks with few labels | 9
Early abandoning and pruning for elastic distances including dynamic time warping | 9
Detecting singleton spams in reviews via learning deep anomalous temporal aspect-sentiment patterns | 9
Time series clustering in linear time complexity | 8
Mining full, inner and tail periodic patterns with perfect, imperfect and asynchronous periodicity simultaneously | 8
Natural language techniques supporting decision modelers | 8
NICE: an algorithm for nearest instance counterfactual explanations | 8
Mining communities and their descriptions on attributed graphs: a survey | 8
SMILE: a feature-based temporal abstraction framework for event-interval sequence classification | 8
The grammar of interactive explanatory model analysis | 8
Adversarial balancing-based representation learning for causal effect inference with observational data | 8
On GNN explainability with activation rules | 8
Neural content-aware collaborative filtering for cold-start music recommendation | 8
Fast and robust video-based exercise classification via body pose tracking and scalable multivariate time series classifiers | 8
Recurring concept memory management in data streams: exploiting data stream concept evolution to improve performance and transparency | 7
Individualized passenger travel pattern multi-clustering based on graph regularized tensor latent dirichlet allocation | 7
Interpreting deep learning models with marginal attribution by conditioning on quantiles | 7
Sentiment analysis in tweets: an assessment study from classical to modern word representation models | 7
An external stability audit framework to test the validity of personality prediction in AI hiring | 7
An overlap sensitive neural network for class imbalanced data | 7
An adaptive meta-heuristic for music plagiarism detection based on text similarity and clustering | 7
Continuous treatment effect estimation via generative adversarial de-confounding | 6
Federated singular value decomposition for high-dimensional data | 6
Implicit consensus clustering from multiple graphs | 6
Fair detection of poisoning attacks in federated learning on non-i.i.d. data | 6
The network-untangling problem: from interactions to activity timelines | 6
Sparse randomized shortest paths routing with Tsallis divergence regularization | 6
Isolation kernel: the X factor in efficient and effective large scale online kernel learning | 6
Parameterizing the cost function of dynamic time warping with application to time series classification | 6
Conclusive local interpretation rules for random forests | 6
Better trees: an empirical study on hyperparameter tuning of classification decision tree induction algorithms | 6
CURIE: a cellular automaton for concept drift detection | 5
Reflective-net: learning from explanations | 5
Navigating the metric maze: a taxonomy of evaluation metrics for anomaly detection in time series | 5
K-plex cover pooling for graph neural networks | 5
Exploiting sensor data in professional road cycling: personalized data-driven approach for frequent fitness monitoring | 5
Counterfactual explanations as interventions in latent space | 5
FuseRec: fusing user and item homophily modeling with temporal recommender systems | 5
Wisdom of the contexts: active ensemble learning for contextual anomaly detection | 5
Fairness in vulnerable attribute prediction on social media | 5
Dynamic self-paced sampling ensemble for highly imbalanced and class-overlapped data classification | 5
Scalable classifier-agnostic channel selection for multivariate time series classification | 5
An alternating nonmonotone projected Barzilai–Borwein algorithm of nonnegative factorization of big matrices | 5
DAMP: accurate time series anomaly detection on trillions of datapoints and ultra-fast arriving data streams | 5
Multi-view metro station clustering based on passenger flows: a functional data-edged network community detection approach | 5
Preventing deception with explanation methods using focused sampling | 5
SOKNL: A novel way of integrating K-nearest neighbours with adaptive random forest regression for data streams | 5
Homophily outlier detection in non-IID categorical data | 5
Fast, accurate and explainable time series classification through randomization | 5
Link prediction in dynamic networks using random dot product graphs | 5
Using p-values for the comparison of classifiers: pitfalls and alternatives | 4
Generalized core maintenance of dynamic bipartite graphs | 4
MERLIN++: parameter-free discovery of time series anomalies | 4
Large scale K-means clustering using GPUs | 4
An eager splitting strategy for online decision trees in ensembles | 4
Z-Time: efficient and effective interpretable multivariate time series classification | 4
Hyperbolic node embedding for temporal networks | 4
Handling imbalance in hierarchical classification problems using local classifiers approaches | 4
AURORA: A Unified fRamework fOR Anomaly detection on multivariate time series | 4
MultiETSC: automated machine learning for early time series classification | 4
Inferring range of information diffusion based on historical frequent items | 4
Extended missing data imputation via GANs for ranking applications | 4
Practical joint human-machine exploration of industrial time series using the matrix profile | 4
Correction: Bake off redux: a review and experimental evaluation of recent time series classification algorithms | 4
Mining sequences with exceptional transition behaviour of varying order using quality measures based on information-theoretic scoring functions | 4
Robust regression via error tolerance | 4
When graph convolution meets double attention: online privacy disclosure detection with multi-label text classification | 4
Mint: MDL-based approach for Mining INTeresting Numerical Pattern Sets | 4
Methods for explaining Top-N recommendations through subgroup discovery | 3
Joint leaf-refinement and ensemble pruning through $L_1$ regularization | 3
Shapley values for cluster importance | 3
Tackling ordinal regression problem for heterogeneous data: sparse and deep multi-task learning approaches | 3
PAC-Bayesian lifelong learning for multi-armed bandits | 3
A methodology for refined evaluation of neural code completion approaches | 3
Datasets, tasks, and training methods for large-scale hypergraph learning | 3
Regularized impurity reduction: accurate decision trees with complexity guarantees | 3
Approximation trees: statistical reproducibility in model distillation | 3
Fast computation of Katz index for efficient processing of link prediction queries | 3
CrashNet: an encoder–decoder architecture to predict crash test outcomes | 3
Enforcing fairness using ensemble of diverse Pareto-optimal models | 3
Variational auto-encoder based Bayesian Poisson tensor factorization for sparse and imbalanced count data | 3
Decision tree boosted varying coefficient models | 3
Robust and sparse multinomial regression in high dimensions | 3
Learning tractable probabilistic models for moral responsibility and blame | 3
Hybrid Bayesian network discovery with latent variables by scoring multiple interventions | 3
Counterfactual inference with latent variable and its application in mental health care | 3
Social explorative attention based recommendation for content distribution platforms | 3
A graph convolutional fusion model for community detection in multiplex networks | 3
Exploring potential biases towards blockbuster items in ranking-based recommendations | 3
Exploring uplift modeling with high class imbalance | 3
Mining Pareto-optimal counterfactual antecedents with a branch-and-bound model-agnostic algorithm | 3
A case study of improving a non-technical losses detection system through explainability | 3
Selego: robust variate selection for accurate time series forecasting | 3
A Lagrangian-based score for assessing the quality of pairwise constraints in semi-supervised clustering | 3
A comparative study of methods for estimating model-agnostic Shapley value explanations | 2
Transfer how much: a fine-grained measure of the knowledge transferability of user behavior sequences in social network | 2
Regression tree-based active learning | 2
Structural learning of simple staged trees | 2
One-shot relational learning for extrapolation reasoning on temporal knowledge graphs | 2
Pseudoinverse graph convolutional networks | 2
Mining explainable local and global subgraph patterns with surprising densities | 2
Bias characterization, assessment, and mitigation in location-based recommender systems | 2
Efficient binary embedding of categorical data using BinSketch | 2
kNN matrix profile for knowledge discovery from time series | 2
Category tree distance: a taxonomy-based transaction distance for web user analysis | 2
Differentially private tree-based redescription mining | 2
Explaining deep convolutional models by measuring the influence of interpretable features in image classification | 2
An attention matrix for every decision: faithfulness-based arbitration among multiple attention-based interpretations of transformers in text classification | 2
Can local explanation techniques explain linear additive models? | 2
Matrix sketching for supervised classification with imbalanced classes | 2
Reciprocity in directed hypergraphs: measures, findings, and generators | 2
Personalised meta-path generation for heterogeneous graph neural networks | 2
Topic change point detection using a mixed Bayesian model | 2
Hypercore decomposition for non-fragile hyperedges: concepts, algorithms, observations, and applications | 2
Characterizing attitudinal network graphs through frustration cloud | 2
Functional classwise principal component analysis: a classification framework for functional data analysis | 2
Random walk with restart on hypergraphs: fast computation and an application to anomaly detection | 2
ROhAN: Row-order agnostic null models for statistically-sound knowledge discovery | 2
ContE: contextualized knowledge graph embedding for circular relations | 2
ConvMOS: climate model output statistics with deep learning | 2
Dynamic cyber risk estimation with competitive quantile autoregression | 2
Efficient algorithms for fair clustering with a new notion of fairness | 2
Structure learning for relational logistic regression: an ensemble approach | 2
Ranking with submodular functions on a budget | 2
Differentiated matching for individual and average treatment effect estimation | 2
Multiple-input neural networks for time series forecasting incorporating historical and prospective context | 2
Correlations between random projections and the bivariate normal | 2
Predicting consumer choice from raw eye-movement data using the RETINA deep learning architecture | 2
On the evaluation of outlier detection and one-class classification: a comparative study of algorithms, model selection, and ensembles | 2
Robust explainer recommendation for time series classification | 2