VLDB Journal

Papers
(The median citation count for VLDB Journal papers is 2. The table below lists the papers at or above that threshold, based on CrossRef citation counts (maximum 250 papers). It covers papers published in the past four years, i.e., from 2021-05-01 to 2025-05-01.)
Article | Citations
To share or not to share vector registers? | 231
Correction to: Data dependencies for query optimization: a survey | 80
Generating highly customizable python code for data processing with large language models | 66
PM-LSH: a fast and accurate in-memory framework for high-dimensional approximate NN and closest pair search | 57
Optimizing navigational graph queries | 57
Anchored coreness: efficient reinforcement of social networks | 52
The full story of 1000 cores | 51
PrefixFPM: a parallel framework for general-purpose mining of frequent and closed patterns | 48
Third and Boyce–Codd normal form for property graphs | 43
A fractional memory-efficient approach for online continuous-time influence maximization | 38
Hu-Fu: efficient and secure spatial queries over data federation | 33
Reverse spatial top-k keyword queries | 25
Learned sketch for subgraph counting: a holistic approach | 24
Efficient and robust active learning methods for interactive database exploration | 24
Formal semantics and high performance in declarative machine learning using Datalog | 22
On efficient 3D object retrieval | 22
A new window Clause for SQL++ | 21
Hyper-distance oracles in hypergraphs | 20
In-database query optimization on SQL with ML predicates | 19
Cross-chain deals and adversarial commerce | 19
Fast subgraph query processing and subgraph matching via static and dynamic equivalences | 18
SQUID: subtrajectory query in trillion-scale GPS database | 17
Special issue on the best papers of DaMoN 2020 | 17
An authorization model for query execution in the cloud | 17
An update-intensive LSM-based R-tree index | 16
GPU-based butterfly counting | 16
Approximation and inapproximability results on computing optimal repairs | 16
Discovering critical vertices for reinforcement of large-scale bipartite networks | 14
Ontological databases with faceted queries | 14
Efficient top-k spatial-range-constrained approximate nearest neighbor search on geo-tagged high-dimensional vectors | 13
ABSTAT-HD: a scalable tool for profiling very large knowledge graphs | 13
Special issue on big graph data management and processing | 13
DB-BERT: making database tuning tools “read” the manual | 13
Parallel mining of large maximal quasi-cliques | 13
HFUL: a hybrid framework for user account linkage across location-aware social networks | 13
P²CG: a privacy preserving collaborative graph neural network training framework | 13
Multi-constraint shortest path using forest hop labeling | 12
Correction to: BugDoc Iterative debugging and explanation of pipeline executions | 12
ByShard: sharding in a Byzantine environment | 12
Fast data series indexing for in-memory data | 12
Efficient and scalable huge embedding model training via distributed cache management | 11
Picket: guarding against corrupted data in tabular data during learning and inference | 11
Efficient and effective algorithms for densest subgraph discovery and maintenance | 10
Incremental discovery of denial constraints | 10
Efficient detection of multivariate correlations with different correlation measures | 10
DumpyOS: A data-adaptive multi-ary index for scalable data series similarity search | 10
Towards flexibility and robustness of LSM trees | 10
Accelerating multi-way joins on the GPU | 9
LIST: learning to index spatio-textual data for embedding based spatial keyword queries | 9
A graph pattern mining framework for large graphs on GPU | 9
Assisted design of data science pipelines | 8
AutoML in heavily constrained applications | 8
A survey on semantic schema discovery | 8
Anytime bottom-up rule learning for large-scale knowledge graph completion | 8
Eris: efficiently measuring discord in multidimensional sources | 8
Accelerating directed densest subgraph queries with software and hardware approaches | 7
Special issue: modern hardware | 7
A survey on the evolution of stream processing systems | 7
ICS-GNN⁺: lightweight interactive community search via graph neural network | 7
A survey on outlier explanations | 7
Survey of window types for aggregation in stream processing systems | 7
A survey on deep learning approaches for text-to-SQL | 7
xDBTagger: explainable natural language interface to databases using keyword mappings and schema graph | 6
Performant almost-latch-free data structures using epoch protection in more depth | 6
Scalable decoupling graph neural network with feature-oriented optimization | 6
A multi-facet analysis of BERT-based entity matching models | 6
HINT: a hierarchical interval index for Allen relationships | 6
How good are machine learning clouds? Benchmarking two snapshots over 5 years | 6
A survey of RDF stores & SPARQL engines for querying knowledge graphs | 6
Morphtree: a polymorphic main-memory learned index for dynamic workloads | 6
Have query optimizers hit the wall? | 6
Resource-aware adaptive indexing for in situ visual exploration and analytics | 6
Time-topology analysis on temporal graphs | 5
Highly distributed and privacy-preserving queries on personal data management systems | 5
HPCache: memory-efficient OLAP through proportional caching revisited | 5
Join optimization revisited: a novel DP algorithm for join&sort order selection | 5
FlexpushdownDB: rethinking computation pushdown for cloud OLAP DBMSs | 5
BatchHL⁺: batch dynamic labelling for distance queries on large-scale networks | 5
Netherite: efficient execution of serverless workflows | 5
BugDoc | 5
VUS: effective and efficient accuracy measures for time-series anomaly detection | 5
A near-optimal approach to edge connectivity-based hierarchical graph decomposition | 5
A generic framework for efficient computation of top-k diverse results | 5
Identifying similar-bicliques in bipartite graphs | 5
Tee-based key-value stores: a survey | 5
Editorial: Special Issue for Selected Papers of VLDB 2021 | 5
Answering reachability and K-reach queries on large graphs with label constraints | 4
General graph generators: experiments, analyses, and improvements | 4
Span-reachability querying in large temporal graphs | 4
eRiskCom: an e-commerce risky community detection platform | 4
Table integration in data lakes unleashed: pairwise integrability judgment, integrable set discovery, and multi-tuple conflict resolution | 4
Efficient distributed discovery of bidirectional order dependencies | 4
Application-driven graph partitioning | 4
HERMES: data placement and schema optimization for enterprise knowledge bases | 4
Efficient kNN query for moving objects on time-dependent road networks | 4
Cardinality estimation using normalizing flow | 4
Leveraging user itinerary to improve personalized deep matching at Fliggy | 4
Managing bias and unfairness in data for decision support: a survey of machine learning and data engineering approaches to identify and mitigate bias and unfairness within data management and analytic | 4
Data collection and quality challenges in deep learning: a data-centric AI perspective | 4
Continuous monitoring of moving skyline and top-k queries | 4
AutoCTS++: zero-shot joint neural architecture and hyperparameter search for correlated time series forecasting | 4
MinJoin++: a fast algorithm for string similarity joins under edit distance | 4
A powerful reducing framework for accelerating set intersections over graphs | 3
RNE: computing shortest paths using road network embedding | 3
Efficient exploratory clustering analyses in large-scale exploration processes | 3
A systematic evaluation of machine learning on serverless infrastructure | 3
Temporal graph patterns by timed automata | 3
Butterfly counting and bitruss decomposition on uncertain bipartite graphs | 3
C5: cloned concurrency control that always keeps up | 3
Ingress: an automated incremental graph processing system | 3
Adaptive algorithms for crowd-aided categorization | 3
Editorial for S.I.: VLDB 2020 | 3
Reliability evaluation of individual predictions: a data-centric approach | 3
Fast, exact, and parallel-friendly outlier detection algorithms with proximity graph in metric spaces | 3
Hypergraph motifs and their extensions beyond binary | 3
Similarity-driven and task-driven models for diversity of opinion in crowdsourcing markets | 3
Flexible grouping of linear segments for highly accurate lossy compression of time series data | 3
SWOOP: top-k similarity joins over set streams | 3
Complex event forecasting with prediction suffix trees | 2
Enhancing domain-aware multi-truth data fusion using copy-based source authority and value similarity | 2
Raster interval object approximations for spatial intersection joins | 2
Data distribution tailoring revisited: cost-efficient integration of representative data | 2
Detecting rumours with latency guarantees using massive streaming data | 2
Accelerating maximum biplex search over large bipartite graphs | 2
Efficient algorithms for reachability and path queries on temporal bipartite graphs | 2
Efficient indexing and searching of constrained core in hypergraphs | 2
ProS: data series progressive k-NN similarity search and classification with probabilistic quality guarantees | 2
In-order sliding-window aggregation in worst-case constant time | 2
Making graphs compact by lossless contraction | 2
Correction to: TurboLift: fast accuracy lifting for historical data recovery | 2
Reconciling tuple and attribute timestamping for temporal data warehouses | 2