Data Mining and Knowledge Discovery

Papers
(The median citation count of Data Mining and Knowledge Discovery is 2. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-04-01 to 2024-04-01.)
Article | Citations
InceptionTime: Finding AlexNet for time series classification | 562
ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels | 364
The great multivariate time series classification bake off: a review and experimental evaluation of recent algorithmic advances | 195
A survey of community detection methods in multilayer networks | 76
Challenges in benchmarking stream learning algorithms with real-world data | 60
Counterfactual explanations and how to find them: literature review and benchmarking | 58
MultiRocket: multiple pooling operators and transformations for fast and effective time series classification | 49
Deep graph similarity learning: a survey | 37
Deep soccer analytics: learning an action-value function for evaluating soccer players | 36
An efficient K-means clustering algorithm for tall data | 36
A comprehensive taxonomy for explainable artificial intelligence: a systematic survey of surveys on methods and concepts | 35
Time series extrinsic regression | 34
Matrix profile goes MAD: variable-length motif and discord discovery in data series | 34
TEASER: early and accurate time series classification | 32
Relational Learning Analysis of Social Politics using Knowledge Graph Embedding | 30
Fake review detection on online E-commerce platforms: a systematic literature review | 30
Smoothed dilated convolutions for improved dense prediction | 29
Time series motifs discovery under DTW allows more robust discovery of conserved structure | 28
ColluEagle: collusive review spammer detection using Markov random fields | 26
Scalable attack on graph data by injecting vicious nodes | 26
Comparison of novelty detection methods for multispectral images in rover-based planetary exploration missions | 24
Forecast evaluation for data scientists: common pitfalls and best practices | 24
Word-class embeddings for multiclass text classification | 23
End-to-end deep representation learning for time series clustering: a comparative study | 22
XEM: An explainable-by-design ensemble method for multivariate time series classification | 22
Data-driven detection of counterpressing in professional football | 22
Active learning for hierarchical multi-label classification | 19
Improving embedded knowledge graph multi-hop question answering by introducing relational chain reasoning | 19
ABBA: adaptive Brownian bridge-based symbolic aggregation of time series | 17
Gaussian bandwidth selection for manifold learning and classification | 16
Algorithmic fairness datasets: the story so far | 16
Treant: training evasion-aware decision trees | 16
A framework for deep constrained clustering | 15
Multi-label learning with missing and completely unobserved labels | 15
Efficient mining of the most significant patterns with permutation testing | 15
User preference and embedding learning with implicit feedback for recommender systems | 15
Benchmarking and survey of explanation methods for black box models | 14
struc2gauss: Structural role preserving network embedding via Gaussian embedding | 14
INK: knowledge graph embeddings for node classification | 14
MIDIA: exploring denoising autoencoders for missing data imputation | 14
A survey of deep network techniques all classifiers can adopt | 14
The area under the ROC curve as a measure of clustering quality | 14
Grouped feature importance and combined features effect plot | 13
Dataset2Vec: learning dataset meta-features | 13
Graph convolutional networks for traffic forecasting with missing values | 13
Extending greedy feature selection algorithms to multiple solutions | 12
Sequential recommendation with metric models based on frequent sequences | 12
Efficient set-valued prediction in multi-class classification | 12
Cost-sensitive ensemble learning: a unifying framework | 11
Expected passes | 11
Detecting virtual concept drift of regressors without ground truth values | 11
PETSC: pattern-based embedding for time series classification | 10
VFC-SMOTE: very fast continuous synthetic minority oversampling for evolving data streams | 10
A deep multimodal model for bug localization | 10
Boosting house price predictions using geo-spatial network embedding | 10
An ultra-fast time series distance measure to allow data mining in more complex real-world deployments | 10
An efficient procedure for mining egocentric temporal motifs | 10
Hydra: competing convolutional kernels for fast and accurate time series classification | 10
Large-scale network motif analysis using compression | 9
Bayesian mean-parameterized nonnegative binary matrix factorization | 9
BROCCOLI: overlapping and outlier-robust biclustering through proximal stochastic gradient descent | 9
For real: a thorough look at numeric attributes in subgroup discovery | 9
Controlling hallucinations at word level in data-to-text generation | 9
Robust subgroup discovery | 9
Who can receive the pass? A computational model for quantifying availability in soccer | 9
Hierarchical message-passing graph neural networks | 8
Simplification of genetic programs: a literature survey | 8
Mining communities and their descriptions on attributed graphs: a survey | 8
Chebyshev approaches for imbalanced data streams regression models | 8
SPEck: mining statistically-significant sequential patterns efficiently with exact sampling | 8
Feature extraction from unequal length heterogeneous EHR time series via dynamic time warping and tensor decomposition | 8
Interpretability, personalization and reliability of a machine learning based clinical decision support system | 7
Natural language techniques supporting decision modelers | 7
Early abandoning and pruning for elastic distances including dynamic time warping | 7
On GNN explainability with activation rules | 7
Simple and effective neural-free soft-cluster embeddings for item cold-start recommendations | 7
Introducing time series snippets: a new primitive for summarizing long time series | 7
Model-agnostic feature importance and effects with dependent features: a conditional subgroup approach | 7
Time series clustering in linear time complexity | 6
Detecting singleton spams in reviews via learning deep anomalous temporal aspect-sentiment patterns | 6
An overlap sensitive neural network for class imbalanced data | 6
POI recommendation with queuing time and user interest awareness | 6
Neural content-aware collaborative filtering for cold-start music recommendation | 6
What’s in a name? – gender classification of names with character based machine learning models | 6
Improving position encoding of transformers for multivariate time series classification | 6
Sequence graph transform (SGT): a feature embedding function for sequence data mining | 6
Explanatory artificial intelligence (YAI): human-centered explanations of explainable AI and complex data | 6
Sufficient dimension reduction for average causal effect estimation | 6
Mining full, inner and tail periodic patterns with perfect, imperfect and asynchronous periodicity simultaneously | 6
SMILE: a feature-based temporal abstraction framework for event-interval sequence classification | 6
TEAGS: time-aware text embedding approach to generate subgraphs | 6
Novel features for time series analysis: a complex networks approach | 6
A recurrent neural network architecture to model physical activity energy expenditure in older people | 6
Recurring concept memory management in data streams: exploiting data stream concept evolution to improve performance and transparency | 6
The minimum description length principle for pattern mining: a survey | 6
ClaSP: parameter-free time series segmentation | 6
Adversarial balancing-based representation learning for causal effect inference with observational data | 6
Interpreting deep learning models with marginal attribution by conditioning on quantiles | 5
Implicit consensus clustering from multiple graphs | 5
An adaptive meta-heuristic for music plagiarism detection based on text similarity and clustering | 5
An external stability audit framework to test the validity of personality prediction in AI hiring | 5
Predictive modeling of infant mortality | 5
CrawlSN: community-aware data acquisition with maximum willingness in online social networks | 5
An alternating nonmonotone projected Barzilai–Borwein algorithm of nonnegative factorization of big matrices | 5
The network-untangling problem: from interactions to activity timelines | 5
Isolation kernel: the X factor in efficient and effective large scale online kernel learning | 5
Synwalk: community detection via random walk modelling | 5
Continuous treatment effect estimation via generative adversarial de-confounding | 4
CURIE: a cellular automaton for concept drift detection | 4
Robust regression via error tolerance | 4
Exploiting sensor data in professional road cycling: personalized data-driven approach for frequent fitness monitoring | 4
Link prediction in dynamic networks using random dot product graphs | 4
Online summarization of dynamic graphs using subjective interestingness for sequential data | 4
Sparse randomized shortest paths routing with Tsallis divergence regularization | 4
Recency-based sequential pattern mining in multiple event sequences | 4
An eager splitting strategy for online decision trees in ensembles | 4
Stable and actionable explanations of black-box models through factual and counterfactual rules | 4
FuseRec: fusing user and item homophily modeling with temporal recommender systems | 4
Sentiment analysis in tweets: an assessment study from classical to modern word representation models | 4
Informative pseudo-labeling for graph neural networks with few labels | 4
Homophily outlier detection in non-IID categorical data | 4
Large scale K-means clustering using GPUs | 3
Fair detection of poisoning attacks in federated learning on non-i.i.d. data | 3
Fast, accurate and explainable time series classification through randomization | 3
Hyperbolic node embedding for temporal networks | 3
Hybrid Bayesian network discovery with latent variables by scoring multiple interventions | 3
Individualized passenger travel pattern multi-clustering based on graph regularized tensor latent dirichlet allocation | 3
MultiETSC: automated machine learning for early time series classification | 3
Fairness in vulnerable attribute prediction on social media | 3
Dynamic self-paced sampling ensemble for highly imbalanced and class-overlapped data classification | 3
CrashNet: an encoder–decoder architecture to predict crash test outcomes | 3
Tackling ordinal regression problem for heterogeneous data: sparse and deep multi-task learning approaches | 3
Fast and robust video-based exercise classification via body pose tracking and scalable multivariate time series classifiers | 3
Scalable classifier-agnostic channel selection for multivariate time series classification | 3
Handling imbalance in hierarchical classification problems using local classifiers approaches | 3
Conclusive local interpretation rules for random forests | 3
Preventing deception with explanation methods using focused sampling | 3
Social explorative attention based recommendation for content distribution platforms | 3
Enforcing fairness using ensemble of diverse Pareto-optimal models | 3
SOKNL: A novel way of integrating K-nearest neighbours with adaptive random forest regression for data streams | 3
Exploring potential biases towards blockbuster items in ranking-based recommendations | 3
Extended missing data imputation via GANs for ranking applications | 3
Mint: MDL-based approach for Mining INTeresting Numerical Pattern Sets | 3
Federated singular value decomposition for high-dimensional data | 3
Mining sequences with exceptional transition behaviour of varying order using quality measures based on information-theoretic scoring functions | 3
Reflective-net: learning from explanations | 3
Wisdom of the contexts: active ensemble learning for contextual anomaly detection | 3
The grammar of interactive explanatory model analysis | 3
K-plex cover pooling for graph neural networks | 3
Pseudoinverse graph convolutional networks | 2
Counterfactual inference with latent variable and its application in mental health care | 2
Matrix sketching for supervised classification with imbalanced classes | 2
Knowledge graph embedding methods for entity alignment: experimental review | 2
Counterfactual explanations as interventions in latent space | 2
kNN matrix profile for knowledge discovery from time series | 2
Selego: robust variate selection for accurate time series forecasting | 2
ROhAN: Row-order agnostic null models for statistically-sound knowledge discovery | 2
Structure learning for relational logistic regression: an ensemble approach | 2
AURORA: A Unified fRamework fOR Anomaly detection on multivariate time series | 2
Visualizing image content to explain novel image discovery | 2
ConvMOS: climate model output statistics with deep learning | 2
Using p-values for the comparison of classifiers: pitfalls and alternatives | 2
Fast computation of Katz index for efficient processing of link prediction queries | 2
Mining explainable local and global subgraph patterns with surprising densities | 2
Shapley values for cluster importance | 2
Exploring uplift modeling with high class imbalance | 2
Robust and sparse multinomial regression in high dimensions | 2
Regularized impurity reduction: accurate decision trees with complexity guarantees | 2
Multi-view metro station clustering based on passenger flows: a functional data-edged network community detection approach | 2
Personalised meta-path generation for heterogeneous graph neural networks | 2
Generalized core maintenance of dynamic bipartite graphs | 2
DAMP: accurate time series anomaly detection on trillions of datapoints and ultra-fast arriving data streams | 2
Joint leaf-refinement and ensemble pruning through $L_1$ regularization | 2
A case study of improving a non-technical losses detection system through explainability | 2
Dynamic cyber risk estimation with competitive quantile autoregression | 2
Inferring range of information diffusion based on historical frequent items | 2
PAC-Bayesian lifelong learning for multi-armed bandits | 2
Variational auto-encoder based Bayesian Poisson tensor factorization for sparse and imbalanced count data | 2
Characterizing attitudinal network graphs through frustration cloud | 2
Learning tractable probabilistic models for moral responsibility and blame | 2
Parameterizing the cost function of dynamic time warping with application to time series classification | 2
Topic change point detection using a mixed Bayesian model | 2
A Lagrangian-based score for assessing the quality of pairwise constraints in semi-supervised clustering | 2
Decision tree boosted varying coefficient models | 2
ContE: contextualized knowledge graph embedding for circular relations | 2