VLDB Journal

(The TQCC of VLDB Journal is 6. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-02-01 to 2025-02-01.)
Paper | Citations
Towards flexibility and robustness of LSM trees | 159
Incremental discovery of denial constraints | 52
Algorithms for the discovery of embedded functional dependencies | 48
Answering reachability and K-reach queries on large graphs with label constraints | 45
Accelerating multi-way joins on the GPU | 41
To share or not to share vector registers? | 37
Optimizing navigational graph queries | 35
HeteroStamp: leveraging heterogeneous social interactions for mobility prediction-enhanced cost-aware spatiotemporal crowdsensing | 33
An in-depth analysis of pre-trained embeddings for entity resolution | 32
A survey of multimodal event detection based on data fusion | 32
A graph pattern mining framework for large graphs on GPU | 30
Reconciling tuple and attribute timestamping for temporal data warehouses | 27
ProS: data series progressive k-NN similarity search and classification with probabilistic quality guarantees | 26
Efficient kNN query for moving objects on time-dependent road networks | 23
Special issue on responsible data management and data science | 23
PM-LSH: a fast and accurate in-memory framework for high-dimensional approximate NN and closest pair search | 22
Effective entity matching with transformers | 22
Tempura: a general cost-based optimizer framework for incremental data processing (Journal Version) | 19
A design space for RDF data representations | 17
When hierarchy meets 2-hop-labeling: efficient shortest distance and path queries on road networks | 16
Maximum and top-k diversified biclique search at scale | 15
A model and query language for temporal graph databases | 15
Optimizing RPQs over a compact graph representation | 14
Anchored coreness: efficient reinforcement of social networks | 14
VolcanoML: speeding up end-to-end AutoML via scalable search space decomposition | 14
F-IVM: analytics over relational databases under updates | 14
Exploiting domain knowledge to address class imbalance and a heterogeneous feature space in multi-class classification | 14
Efficient and effective algorithms for densest subgraph discovery and maintenance | 13
Application-driven graph partitioning | 13
Data distribution tailoring revisited: cost-efficient integration of representative data | 13
Discovering approximate implicit domain orders through order dependencies | 13
HERMES: data placement and schema optimization for enterprise knowledge bases | 12
Efficient detection of multivariate correlations with different correlation measures | 12
RDFFrames: knowledge graph access for machine learning tools | 12
Picket: guarding against corrupted data in tabular data during learning and inference | 12
Tabular data synthesis with generative adversarial networks: design space and optimizations | 11
DumpyOS: A data-adaptive multi-ary index for scalable data series similarity search | 10
Local dampening: differential privacy for non-numeric queries via local sensitivity | 10
Anytime bottom-up rule learning for large-scale knowledge graph completion | 10
Span-reachability querying in large temporal graphs | 10
Lero: applying learning-to-rank in query optimizer | 10
Privacy and efficiency guaranteed social subgraph matching | 9
Data distribution debugging in machine learning pipelines | 9
A benchmark and comprehensive survey on knowledge graph entity alignment via representation learning | 9
Information Resilience: the nexus of responsible and agile approaches to information use | 9
Survey of vector database management systems | 8
Correction to: TurboLift: fast accuracy lifting for historical data recovery | 8
Data dependencies for query optimization: a survey | 8
Correction to: Survey of window types for aggregation in stream processing systems | 8
Correction to: Data dependencies for query optimization: a survey | 8
Managing bias and unfairness in data for decision support: a survey of machine learning and data engineering approaches to identify and mitigate bias and unfairness within data management and analytic | 8
Practical planning and execution of groupjoin and nested aggregates | 7
MDDE: multitasking distributed differential evolution for privacy-preserving database fragmentation | 7
Generating highly customizable python code for data processing with large language models | 7
PARROT: pattern-based correlation exploitation in big partitioned data series | 7
PrefixFPM: a parallel framework for general-purpose mining of frequent and closed patterns | 7
Pivot selection algorithms in metric spaces: a survey and experimental study | 7
Distributed detection of sequential anomalies in univariate time series | 6
Cardinality estimation using normalizing flow | 6
Cache-efficient sweeping-based interval joins for extended Allen relation predicates | 6
General graph generators: experiments, analyses, and improvements | 6
WavingSketch: an unbiased and generic sketch for finding top-k items in data streams | 6
A fractional memory-efficient approach for online continuous-time influence maximization | 6
Efficient distributed discovery of bidirectional order dependencies | 6
Continuous monitoring of moving skyline and top-k queries | 6
Eris: efficiently measuring discord in multidimensional sources | 6
Internal and external memory set containment join | 6
Distance labeling: on parallelism, compression, and ordering | 6
A quantitative evaluation of persistent memory hash indexes | 6