Proceedings of the VLDB Endowment

Papers
(The TQCC of Proceedings of the VLDB Endowment is 9. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-12-01 to 2025-12-01.)
Article | Citations
IsoBugView | 613
Approximating probabilistic group steiner trees in graphs | 398
Timestamp as a Service, Not an Oracle | 244
Cardinality Estimation for Having-Clauses | 115
DuckDB-wasm | 111
Privacy for Free: Leveraging Local Differential Privacy Perturbed Data from Multiple Services | 103
QPJVis Demo: Quality-Boost Progressive Join Query Processing System | 94
Solver-In-The-Loop Cluster Resource Management for Database-as-a-Service | 89
VeriBench: Analyzing the Performance of Database Systems with Verifiability | 75
Neighborhood-Based Hypergraph Core Decomposition | 75
How to Optimize SQL Queries? A Comparison Between Split, Holistic, and Hybrid Approaches | 74
Unraveling the Impact of Window Semantics: Optimizing Join Order for Efficient Stream Processing | 73
Shifting Transaction Isolation on Graphs: From Systems to Data | 68
Efficient Graph Data Access for Out-of-Memory GPU Streaming Graph Processing | 64
Relational Data Models for Genetic VCF data | 61
Cloudy with a Chance of JSON | 59
Accelerating Subgraph Matching through Fine-Grained and Powerful Equivalences | 59
PARQO: Penalty-Aware Robust Plan Selection in Query Optimization | 58
Unify: A System For Unstructured Data Analytics | 58
TSB-AutoAD: Towards Automated Solutions for Time-Series Anomaly Detection | 56
Efficient Discovery of Relaxed Functional Dependencies | 56
SkyStore: Cost-Optimized Object Storage Across Regions and Clouds | 55
GaussDB: A Cloud-Native Multi-Primary Database with Compute-Memory-Storage Disaggregation | 54
Influential Community Search over Large Heterogeneous Information Networks | 52
SpaceSaving± | 52
Spectrum: Speedy and Strictly-Deterministic Smart Contract Transactions for Blockchain Ledgers | 51
Galvatron | 50
Efficient Distributed Transaction Processing in Heterogeneous Networks | 49
Algorithm and system co-design for efficient subgraph-based graph representation learning | 49
Towards Designing and Learning Piecewise Space-Filling Curves | 49
OmniSketch: Efficient Multi-Dimensional High-Velocity Stream Analytics with Arbitrary Predicates | 48
Opportunities for Quantum Acceleration of Databases: Optimization of Queries and Transaction Schedules | 48
Reliable community search in dynamic networks | 48
A Reproducible Tutorial on Reproducibility in Database Systems Research | 48
G-tran | 47
Motiflets | 47
Breathing New Life into an Old Tree: Resolving Logging Dilemma of B+-tree on Modern Computational Storage Drives | 46
DyHealth | 46
Fries | 45
Differentially Private Stream Processing at Scale | 45
LION: Fast and High-Resolution Network Kernel Density Visualization | 43
POEM | 42
HyperBlocker: Accelerating Rule-Based Blocking in Entity Resolution Using GPUs | 41
DoppelGanger++ in Action: A Database Replay System with Fast Dependency Graph Generation | 41
Hardware-Efficient Data Imputation through DBMS Extensibility | 41
Making CRDTs Not So Eventual | 40
Exploiting the Power of Equality-Generating Dependencies in Ontological Reasoning | 40
Improving matrix-vector multiplication via lossless grammar-compressed matrices | 39
SAIL: A Voyage to Symbolic Approximation Solutions for Time-Series Analysis | 39
A Comprehensive Survey and Experimental Study of Learning-Based Community Search | 39
Demonstrating Waffle: A Self-Driving Grid Index | 38
Efficient Non-Learning Similar Subtrajectory Search | 37
DARKER: Efficient Transformer with Data-Driven Attention Mechanism for Time Series | 37
LIDER | 36
VeLP: Vehicle Loading Plan Learning from Human Behavior in Nationwide Logistics System | 35
LITS: An Optimized Learned Index for Strings | 35
SUFF: Accelerating Subgraph Matching with Historical Data | 35
SQL Engines Excel at the Execution of Imperative Programs | 34
LogLite: Lightweight Plug-and-Play Streaming Log Compression | 34
Bonspiel: Low Tail Latency Transactions in Geo-Distributed Databases | 34
DPXPlain | 33
IsoVista: Black-Box Checking Database Isolation Guarantees | 33
TsQuality: Measuring Time Series Data Quality in Apache IoTDB | 32
Trie memtables in cassandra | 32
Approximate Queries over Concurrent Updates | 31
PSFQ: A Blockchain-Based Privacy-Preserving and Verifiable Student Feedback Questionnaire Platform | 31
Hermes: Off-the-Shelf Real-Time Transactional Analytics | 31
Federated Data Distribution Shift Estimation | 30
Seiden: Revisiting Query Processing in Video Database Systems | 30
Databases Unbound: Querying All of the World's Bytes with AI | 30
TSB-UAD | 30
Plush | 30
Cuckoo Heavy Keeper and the Balancing Act of Maintaining Heavy Hitters in Stream Processing | 30
SingleStore-V: An Integrated Vector Database System in SingleStore | 30
HAIChart: Human and AI Paired Visualization System | 30
Succinct graph representations as distance oracles | 29
Scalable Reasoning on Document Stores via Instance-Aware Query Rewriting | 29
A demonstration of multi-region CockroachDB | 29
Instance-Optimal Acyclic Join Processing Without Regret: Engineering the Yannakakis Algorithm in Column Stores | 29
VIDEX: A Disaggregated and Extensible Virtual Index for the Cloud and AI Era | 28
Simulating a Transactional Server for Multi-Model Systems | 28
Dealing with Acronyms, Abbreviations, and Typos in Real-World Entity Matching | 28
GalaxyWeaver: Autonomous Table-to-Graph Conversion and Schema Optimization with Large Language Models | 28
Saving Money for Analytical Workloads in the Cloud | 27
HADES: Range-Filtered Private Aggregation on Public Data | 27
Oasis: An Optimal Disjoint Segmented Learned Range Filter | 27
SparkCAD | 27
A Practical Theory of Generalization in Selectivity Learning | 27
OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance from Database Query Event Logs | 26
Decentralized Actor Scheduling and Reference-Based Storage in Xorbits: A Native Scalable Data Science Engine | 26
XDB in Action: Decentralized Cross-Database Query Processing for Black-Box DBMSes | 26
Vive la Différence: Practical Diff Testing of Stateful Applications | 26
Optimal Sharding for Scalable Blockchains with Deconstructed SMR | 26
FSMDTW: A Fast Index-Free Subsequence Matching Algorithm for Dynamic Time Warping | 26
FastMosaic in Action: A New Mosaic Operator for Array DBMSs | 25
Design trade-offs for a robust dynamic hybrid hash join | 25
ACTA: Autonomy and Coordination Task Assignment in Spatial Crowdsourcing Platforms | 25
Enhancing Accuracy for Super Spreader Identification in High-Speed Data Streams | 25
KGNav: A Knowledge Graph Navigational Visual Query System | 25
Kora: A Cloud-Native Event Streaming Platform for Kafka | 25
Enriching Relations with Additional Attributes for ER | 25
FS-Real: A Real-World Cross-Device Federated Learning Platform | 25
RICH: Real-Time Identification of Negative Cycles for High-Efficiency Arbitrage | 25
Discovering Leitmotifs in Multidimensional Time Series | 25
BURST: Rendering Clustering Techniques Suitable for Evolving Streams | 25
Efficient Approximation of Certain and Possible Answers for Ranking and Window Queries over Uncertain Data | 25
DINOMO | 24
FARGO: Fast Maximum Inner Product Search via Global Multi-Probing | 24
Sparcle: Boosting the Accuracy of Data Cleaning Systems through Spatial Awareness | 24
Petabyte-Scale Row-Level Operations in Data Lakehouses | 24
From Zero to Hero: Detecting Leaked Data through Synthetic Data Injection and Model Querying | 23
Hercules against data series similarity search | 23
CMixing: An Efficient Coin Mixing Platform to Enhance Anonymity in Cryptocurrency Transactions | 23
PerMA-bench | 23
Federated matrix factorization with privacy guarantee | 23
Expanding Reverse Nearest Neighbors | 23
CoroGraph: Bridging Cache Efficiency and Work Efficiency for Graph Algorithm Execution | 23
Sphinteract: Resolving Ambiguities in NL2SQL through User Interaction | 23
ALECE: An Attention-based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads | 22
Serving deep learning models with deduplication from relational databases | 22
ContTune: Continuous Tuning by Conservative Bayesian Optimization for Distributed Stream Data Processing Systems | 22
Win-Win: On Simultaneous Clustering and Imputing over Incomplete Data | 22
Navigating Data Repositories: Utilizing Line Charts to Discover Relevant Datasets | 22
MLP-Mixer based Masked Autoencoders are Effective, Explainable and Robust for Time Series Anomaly Detection | 22
Toward Quantity-of-Interest Preserving Lossy Compression for Scientific Data | 22
Optimizing machine learning inference queries with correlative proxy models | 22
Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines | 22
ETC: Efficient Training of Temporal Graph Neural Networks over Large-Scale Dynamic Graphs | 22
LavaStore: ByteDance's Purpose-Built, High-Performance, Cost-Effective Local Storage Engine for Cloud Services | 21
Cloud data systems | 21
Anomaly detection in time series | 21
Efficient and Accurate SimRank-Based Similarity Joins: Experiments, Analysis, and Improvement | 21
Window Function Expression: Let the Self-Join Enter | 21
Less is More: Efficient Time Series Dataset Condensation via Two-Fold Modal Matching | 21
Dalton | 21
CORE-Sketch: On Exact Computation of Median Absolute Deviation with Limited Space | 21
A Case for Graphics-Driven Query Processing | 20
From Scale-Up to Scale-Out: PolarDB's Journey to Achieving 2 Billion tpmC | 20
Selective data acquisition in the wild for model charging | 20
QuoteInspector: Gaining Insight about Social Media Discussions | 20
Quantifying Point Contributions: A Lightweight Framework for Efficient and Effective Query-Driven Trajectory Simplification | 20
RCRank: Multimodal Ranking of Root Causes of Slow Queries in Cloud Database Systems | 20
Fused Gromov-Wasserstein Alignment for Graph Edit Distance Computation and Beyond | 20
Datamap-Driven Tabular Coreset Selection for Classifier Training | 20
Demo of QueryBooster: Supporting Middleware-Based SQL Query Rewriting as a Service | 20
TuskFlow: An Efficient Graph Database for Long-Running Transactions | 20
Authenticated Aggregate Queries with Boolean Range Predicates on Blockchains | 20
Unleash the Power of Ellipsis: Accuracy-Enhanced Sparse Vector Technique with Exponential Noise | 20
Resource Management in Aurora Serverless | 19
GENTI: GPU-Powered Walk-Based Subgraph Extraction for Scalable Representation Learning on Dynamic Graphs | 19
ResLake: Towards Minimum Job Latency and Balanced Resource Utilization in Geo-Distributed Job Scheduling | 19
DAFDiscover: Robust Mining Algorithm for Dynamic Approximate Functional Dependencies on Dirty Data | 19
On More Efficiently and Versatilely Querying Historical k-Cores | 19
Polyglot data management | 19
AeonG: An Efficient Built-in Temporal Support in Graph Databases | 19
TimeCSL: Unsupervised Contrastive Learning of General Shapelets for Explorable Time Series Analysis | 19
Fast neural ranking on bipartite graph indices | 19
Efficient Fault Tolerance for Recommendation Model Training via Erasure Coding | 19
Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads | 19
Accelerating Maximal Clique Enumeration via Graph Reduction | 19
Minimum Strongly Connected Subgraph Collection in Dynamic Graphs | 19
Falcon: Advancing Asynchronous BFT Consensus for Lower Latency and Enhanced Throughput | 19
Starry | 19
SCompression: Enhancing Database Knob Tuning Efficiency Through Slice-Based OLTP Workload Compression | 19
DPSUR: Accelerating Differentially Private Stochastic Gradient Descent Using Selective Update and Release | 18
ELPIS: Graph-Based Similarity Search for Scalable Data Science | 18
PRICE: A Pretrained Model for Cross-Database Cardinality Estimation | 18
Themis: A GPU-Accelerated Relational Query Execution Engine | 18
Combining Small Language Models and Large Language Models for Zero-Shot NL2SQL | 18
Nuhuo: An Effective Estimation Model for Traffic Speed Histogram Imputation on A Road Network | 18
Hyper-tune | 18
Scalable and Robust Snapshot Isolation for High-Performance Storage Engines | 18
Composable Data Management: An Execution Overview | 18
TranAD | 18
YeSQL | 18
Pyneapple-G: Scalable Spatial Grouping Queries | 18
L2chain | 18
GQL and SQL/PGQ: Theoretical Models and Expressive Power | 18
A Hierarchical Grouping Algorithm for the Multi-Vehicle Dial-a-Ride Problem | 18
Efficient Algorithms for Pseudoarboricity Computation in Large Static and Dynamic Graphs | 18
Efficient Discovery of Significant Patterns with Few-Shot Resampling | 18
Computing Rule-Based Explanations by Leveraging Counterfactuals | 18
Uldp-FL: Federated Learning with Across-Silo User-Level Differential Privacy | 18
CEDA: Learned Cardinality Estimation with Domain Adaptation | 18
PGE | 17
QTCS: Efficient Query-Centered Temporal Community Search | 17
Vortex: Overcoming Memory Capacity Limitations in GPU-Accelerated Large-Scale Data Analytics | 17
KEIGO: Co-Designing Log-Structured Merge Key-Value Stores with a Non-Volatile, Concurrency-Aware Storage Hierarchy | 17
Lingua Manga: A Generic Large Language Model Centric System for Data Curation | 17
The case for distributed shared-memory databases with RDMA-enabled memory disaggregation | 17
To UDFs and Beyond: Demonstration of a Fully Decomposed Data Processor for General Data Wrangling Tasks | 17
Anarchy in the Database: A Survey and Evaluation of Database Management System Extensibility | 17
DBMS annihilator | 17
LOGER: A Learned Optimizer Towards Generating Efficient and Robust Query Execution Plans | 17
Sancus | 17
Ganos Aero: A Cloud-Native System for Big Raster Data Management and Processing | 17
Tigger: A Database Proxy That Bounces with User-Bypass | 17
MiCS | 16
Differentially Private Data Generation with Missing Data | 16
Machine Learning for Graph Data Management and Query Processing | 16
Dynamic Graph Databases with Out-of-Order Updates | 16
Task: An Efficient Framework for Instant Error-Tolerant Spatial Keyword Queries on Road Networks | 16
TIGER: Training Inductive Graph Neural Network for Large-Scale Knowledge Graph Reasoning | 16
Towards distributed bitruss decomposition on bipartite graphs | 16
GRewriter: Practical Query Rewriting with Automatic Rule Set Expansion in GaussDB | 16
Fast approximate denial constraint discovery | 16
Efficient Triangle-Connected Truss Community Search in Dynamic Graphs | 16
Cents: A Flexible and Cost-Effective Framework for LLM-Based Table Understanding | 16
ELEET: Efficient Learned Query Execution over Text and Tables | 16
Skellam mixture mechanism | 16
Bridging Disciplines in Data Management Research to Solve Complex Data Problems | 16
ArcheType: A Novel Framework for Open-Source Column Type Annotation Using Large Language Models | 16
TGL | 16
Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes | 15
Efficient kNN Search in Public Transportation Networks | 15
ImDiffusion: Imputed Diffusion Models for Multivariate Time Series Anomaly Detection | 15
FB+-Tree: A Memory-Optimized B+-Tree with Latch-Free Update | 15
Efficient Execution of User-Defined Functions in SQL Queries | 15
PIM-Tree | 15
Distributed learning of fully connected neural networks using independent subnet training | 15
Streaming Time Series Subsequence Anomaly Detection: A Glance and Focus Approach | 15
OFL-W3: A One-Shot Federated Learning System on Web 3.0 | 15
DataRinse: Semantic Transforms for Data Preparation Based on Code Mining | 15
Simpler is More: Efficient Top-K Nearest Neighbors Search on Large Road Networks | 15
OceanBase Paetica: A Hybrid Shared-Nothing/Shared-Everything Database for Supporting Single Machine and Distributed Cluster | 15
Reimagining Deep Learning Systems through the Lens of Data Systems | 15
Tiresias | 15
Access Control for Information-Theoretically Secure Data | 14
NeutronStream: A Dynamic GNN Training Framework with Sliding Window for Graph Streams | 14
Improving DBMS Scheduling Decisions with Accurate Performance Prediction on Concurrent Queries | 14
Blink-hash: An Adaptive Hybrid Index for In-Memory Time-Series Databases | 14
Exploiting Cloud Object Storage for High-Performance Analytics | 14
An Experimental Evaluation of Anomaly Detection in Time Series | 14
Cracking Vector Search Indexes | 14
Breaking It Down: An In-Depth Study of Index Advisors | 14
Generating Succinct Descriptions of Database Schemata for Cost-Efficient Prompting of Large Language Models | 14
Fair Transaction Processing for Multi-Tenant Databases | 14
Design and Modular Verification of Distributed Transactions in MongoDB | 14
FlowWalker: A Memory-Efficient and High-Performance GPU-Based Dynamic Graph Random Walk Framework | 14
Explaining Differentially Private Query Results with DPXPlain | 14
AMRAS | 14
SecretFlow-SCQL: A Secure Collaborative Query Platform | 14
ABC | 14
Decentralized crowdsourcing for human intelligence tasks with efficient on-chain cost | 14
MD-MVCC: Multi-Version Concurrency Control for Schema Changes in Azure SQL Database | 14
Efficient Black-Box Checking of Snapshot Isolation in Databases | 14
Troubles with nulls, views from the users | 14
Machine Learning for Subgraph Extraction: Methods, Applications and Challenges | 14
POEM: Pattern-Oriented Explanations of Convolutional Neural Networks | 14
Incremental Detection of Denial Constraint Violations | 14