VLDB Journal

Papers
(The median citation count of VLDB Journal papers is 2. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-10-01 to 2024-10-01.)
Article | Citations
Data collection and quality challenges in deep learning: a data-centric AI perspective | 113
Fast-adapting and privacy-preserving federated recommender system | 52
Managing bias and unfairness in data for decision support: a survey of machine learning and data engineering approaches to identify and mitigate bias and unfairness within data management and analytic | 46
Fairness in rankings and recommendations: an overview | 45
A survey of RDF stores & SPARQL engines for querying knowledge graphs | 41
A model and query language for temporal graph databases | 37
MDDE: multitasking distributed differential evolution for privacy-preserving database fragmentation | 35
A survey on outlier explanations | 32
A benchmark and comprehensive survey on knowledge graph entity alignment via representation learning | 32
A survey on deep learning approaches for text-to-SQL | 30
Unsupervised and scalable subsequence anomaly detection in large data series | 29
LineageChain: a fine-grained, secure and efficient data provenance system for blockchains | 29
Data dependencies for query optimization: a survey | 27
Location- and keyword-based querying of geo-textual data: a survey | 24
Towards efficient solutions of bitruss decomposition for large-scale bipartite graphs | 23
Cross-chain deals and adversarial commerce | 23
Memory-aware framework for fast and scalable second-order random walk over billion-edge natural graphs | 22
Tidy Tuples and Flying Start: fast compilation and fast execution of relational queries in Umbra | 22
Distributed temporal graph analytics with GRADOOP | 19
Privacy and efficiency guaranteed social subgraph matching | 17
Dragoon: a hybrid and efficient big trajectory management system for offline and online analytics | 16
Visually aware recommendation with aesthetic features | 15
A survey on semantic schema discovery | 15
eRiskCom: an e-commerce risky community detection platform | 14
Answering reachability and K-reach queries on large graphs with label constraints | 14
CDBTune+: An efficient deep reinforcement learning-based automatic cloud database tuning system | 14
On entity alignment at scale | 14
PrefixFPM: a parallel framework for general-purpose mining of frequent and closed patterns | 14
I/O efficient k-truss community search in massive graphs | 13
GeoSparkViz: a cluster computing system for visualizing massive-scale geospatial data | 13
A dataspace-based framework for OLAP analyses in a high-variety multistore | 13
A survey on the evolution of stream processing systems | 13
In-Memory Interval Joins | 13
HFUL: a hybrid framework for user account linkage across location-aware social networks | 12
Fast subgraph query processing and subgraph matching via static and dynamic equivalences | 12
Efficient kNN query for moving objects on time-dependent road networks | 12
Efficient and effective ER with progressive blocking | 11
ByShard: sharding in a Byzantine environment | 10
Algorithms for the discovery of embedded functional dependencies | 10
Parallel mining of large maximal quasi-cliques | 10
Data distribution debugging in machine learning pipelines | 10
Efficient Hop-constrained s-t Simple Path Enumeration | 10
G-thinker: a general distributed framework for finding qualified subgraphs in a big graph with load balancing | 9
A cost model for random access queries in document stores | 9
Better database cost/performance via batched I/O on programmable SSD | 9
Fast data series indexing for in-memory data | 9
Pivot selection algorithms in metric spaces: a survey and experimental study | 9
Model averaging in distributed machine learning: a case study with Apache Spark | 8
VolcanoML: speeding up end-to-end AutoML via scalable search space decomposition | 8
Leveraging range joins for the computation of overlap joins | 8
Unified route representation learning for multi-modal transportation recommendation with spatiotemporal pre-training | 8
Accelerated butterfly counting with vertex priority on bipartite graphs | 8
Effective entity matching with transformers | 8
Semantic embedding for regions of interest | 7
Cache-efficient sweeping-based interval joins for extended Allen relation predicates | 7
(p,q)-biclique counting and enumeration for large sparse bipartite graphs | 7
A design space for RDF data representations | 7
Efficient distributed discovery of bidirectional order dependencies | 7
ProS: data series progressive k-NN similarity search and classification with probabilistic quality guarantees | 7
Complex event forecasting with prediction suffix trees | 6
Resource-aware adaptive indexing for in situ visual exploration and analytics | 6
A meta-level analysis of online anomaly detectors | 6
PM-LSH: a fast and accurate in-memory framework for high-dimensional approximate NN and closest pair search | 6
ABSTAT-HD: a scalable tool for profiling very large knowledge graphs | 6
Deep entity matching with adversarial active learning | 6
Maximum and top-k diversified biclique search at scale | 6
RDFFrames: knowledge graph access for machine learning tools | 6
ExactSim: benchmarking single-source SimRank algorithms with high-precision ground truths | 6
ICS-GNN+: lightweight interactive community search via graph neural network | 6
Picket: guarding against corrupted data in tabular data during learning and inference | 6
Span-reachability querying in large temporal graphs | 6
Comparison and evaluation of state-of-the-art LSM merge policies | 5
Interactively discovering and ranking desired tuples by data exploration | 5
Application-driven graph partitioning | 5
Payment behavior prediction on shared parking lots with TR-GCN | 5
Cleaning timestamps with temporal constraints | 5
Continuous monitoring of moving skyline and top-k queries | 5
Accelerating multi-way joins on the GPU | 5
Fast fully dynamic labelling for distance queries | 5
Toward maintenance of hypercores in large-scale dynamic hypergraphs | 5
Distributed detection of sequential anomalies in univariate time series | 5
Local dampening: differential privacy for non-numeric queries via local sensitivity | 5
Anchored coreness: efficient reinforcement of social networks | 4
An authorization model for query execution in the cloud | 4
HINT: a hierarchical interval index for Allen relationships | 4
General graph generators: experiments, analyses, and improvements | 4
Multi-constraint shortest path using forest hop labeling | 4
ABC of order dependencies | 4
Learned sketch for subgraph counting: a holistic approach | 4
Simple and automated negative sampling for knowledge graph embedding | 4
Robust and scalable content-and-structure indexing | 4
Formal semantics and high performance in declarative machine learning using Datalog | 4
Anytime bottom-up rule learning for large-scale knowledge graph completion | 3
A Pareto optimal Bloom filter family with hash adaptivity | 3
In-order sliding-window aggregation in worst-case constant time | 3
Micro-architectural analysis of in-memory OLTP: Revisited | 3
Survey of window types for aggregation in stream processing systems | 3
A fractional memory-efficient approach for online continuous-time influence maximization | 3
Making graphs compact by lossless contraction | 3
Diversifying recommendations on sequences of sets | 3
Practical planning and execution of groupjoin and nested aggregates | 3
An analysis of one-to-one matching algorithms for entity resolution | 3
Survey of vector database management systems | 3
Deep treatment-adaptive network for causal inference | 3
Label-constrained shortest path query processing on road networks | 3
Fast, exact, and parallel-friendly outlier detection algorithms with proximity graph in metric spaces | 3
Internal and external memory set containment join | 3
Reverse spatial top-k keyword queries | 3
Detecting rumours with latency guarantees using massive streaming data | 2
P²CG: a privacy preserving collaborative graph neural network training framework | 2
Opportunities for optimism in contended main-memory multicore transactions | 2
Coalition-based task assignment with priority-aware fairness in spatial crowdsourcing | 2
Accelerating directed densest subgraph queries with software and hardware approaches | 2
Efficient detection of multivariate correlations with different correlation measures | 2
Time-topology analysis on temporal graphs | 2
Data-induced predicates for sideways information passing in query optimizers | 2
Learning-based query optimization for multi-probe approximate nearest neighbor search | 2
SQUID: subtrajectory query in trillion-scale GPS database | 2
Efficient structural node similarity computation on billion-scale graphs | 2
Distance labeling: on parallelism, compression, and ordering | 2
Tabular data synthesis with generative adversarial networks: design space and optimizations | 2
A survey on transactional stream processing | 2
BugDoc | 2
Privacy-preserving worker allocation in crowdsourcing | 2
Efficient exploratory clustering analyses in large-scale exploration processes | 2
Have query optimizers hit the wall? | 2
Exploiting domain knowledge to address class imbalance and a heterogeneous feature space in multi-class classification | 2
Fine-grained semantic type discovery for heterogeneous sources using clustering | 2