Proceedings of the VLDB Endowment

Papers
(The TQCC of Proceedings of the VLDB Endowment is 8. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-05-01 to 2025-05-01.)
Article | Citations
Approximating probabilistic group steiner trees in graphs | 388
LES3 | 262
Motiflets | 147
Solver-In-The-Loop Cluster Resource Management for Database-as-a-Service | 103
Efficient Distributed Transaction Processing in Heterogeneous Networks | 90
IsoBugView | 73
Kamino | 70
Opportunities for Quantum Acceleration of Databases: Optimization of Queries and Transaction Schedules | 65
Fries | 63
PARQO: Penalty-Aware Robust Plan Selection in Query Optimization | 62
Cardinality Estimation for Having-Clauses | 60
VeriBench: Analyzing the Performance of Database Systems with Verifiability | 57
Timestamp as a Service, Not an Oracle | 56
SpaceSaving± | 55
GaussDB: A Cloud-Native Multi-Primary Database with Compute-Memory-Storage Disaggregation | 54
Differentially Private Stream Processing at Scale | 53
QPJVis Demo: Quality-Boost Progressive Join Query Processing System | 52
Algorithm and system co-design for efficient subgraph-based graph representation learning | 51
Reliable community search in dynamic networks | 47
DuckDB-wasm | 47
Breathing New Life into an Old Tree: Resolving Logging Dilemma of B+-tree on Modern Computational Storage Drives | 47
Accelerating recommendation system training by leveraging popular choices | 45
A Reproducible Tutorial on Reproducibility in Database Systems Research | 45
PerfGuard | 45
DyHealth | 44
Spectrum: Speedy and Strictly-Deterministic Smart Contract Transactions for Blockchain Ledgers | 42
G-tran | 42
Galvatron | 42
Neighborhood-Based Hypergraph Core Decomposition | 41
OmniSketch: Efficient Multi-Dimensional High-Velocity Stream Analytics with Arbitrary Predicates | 41
Influential Community Search over Large Heterogeneous Information Networks | 41
Towards Designing and Learning Piecewise Space-Filling Curves | 41
SingleStore-V: An Integrated Vector Database System in SingleStore | 40
DoppelGanger++ in Action: A Database Replay System with Fast Dependency Graph Generation | 40
PSFQ: A Blockchain-Based Privacy-Preserving and Verifiable Student Feedback Questionnaire Platform | 39
LION: Fast and High-Resolution Network Kernel Density Visualization | 39
Approximate Queries over Concurrent Updates | 39
Demonstrating Waffle: A Self-Driving Grid Index | 39
Efficient Non-Learning Similar Subtrajectory Search | 38
POEM | 38
TsQuality: Measuring Time Series Data Quality in Apache IoTDB | 38
MultiCategory | 38
SQL Engines Excel at the Execution of Imperative Programs | 38
Plush | 37
Improving matrix-vector multiplication via lossless grammar-compressed matrices | 37
SKT | 37
Exploiting the Power of Equality-Generating Dependencies in Ontological Reasoning | 37
VeLP: Vehicle Loading Plan Learning from Human Behavior in Nationwide Logistics System | 36
DPXPlain | 36
Horizon | 36
Hardware-Efficient Data Imputation through DBMS Extensibility | 35
Making CRDTs Not So Eventual | 35
LITS: An Optimized Learned Index for Strings | 35
HyperBlocker: Accelerating Rule-Based Blocking in Entity Resolution Using GPUs | 35
Seiden: Revisiting Query Processing in Video Database Systems | 34
DARKER: Efficient Transformer with Data-Driven Attention Mechanism for Time Series | 34
Trie memtables in cassandra | 34
SUFF: Accelerating Subgraph Matching with Historical Data | 33
Unconstrained submodular maximization with modular costs | 33
Incremental partitioning for efficient spatial data analytics | 33
LIDER | 33
Databases Unbound: Querying All of the World's Bytes with AI | 33
TSB-UAD | 33
Fangorn | 32
IsoVista: Black-Box Checking Database Isolation Guarantees | 32
A demonstration of multi-region CockroachDB | 32
Pre-training summarization models of structured datasets for cardinality estimation | 32
HAIChart: Human and AI Paired Visualization System | 32
Succinct graph representations as distance oracles | 31
Scalable Reasoning on Document Stores via Instance-Aware Query Rewriting | 31
ContTune: Continuous Tuning by Conservative Bayesian Optimization for Distributed Stream Data Processing Systems | 31
Kora: A Cloud-Native Event Streaming Platform for Kafka | 31
Enriching Relations with Additional Attributes for ER | 31
Hercules against data series similarity search | 30
Butterfly-core community search over labeled graphs | 30
Federated matrix factorization with privacy guarantee | 30
Ember | 30
Enabling SQL-based training data debugging for federated learning | 29
FARGO: Fast Maximum Inner Product Search via Global Multi-Probing | 29
CoroGraph: Bridging Cache Efficiency and Work Efficiency for Graph Algorithm Execution | 28
MLP-Mixer based Masked Autoencoders are Effective, Explainable and Robust for Time Series Anomaly Detection | 28
Design trade-offs for a robust dynamic hybrid hash join | 28
Toward Quantity-of-Interest Preserving Lossy Compression for Scientific Data | 28
Optimizing machine learning inference queries with correlative proxy models | 27
Expanding Reverse Nearest Neighbors | 27
Dalton | 27
DINOMO | 26
Oasis: An Optimal Disjoint Segmented Learned Range Filter | 26
ETC: Efficient Training of Temporal Graph Neural Networks over Large-Scale Dynamic Graphs | 26
Serving deep learning models with deduplication from relational databases | 25
ACTA: Autonomy and Coordination Task Assignment in Spatial Crowdsourcing Platforms | 25
ALECE: An Attention-based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads | 25
GRAIN | 25
SparkCAD | 25
PerMA-bench | 25
Errata for "teseo and the analysis of structural dynamic graphs" | 24
Subgraph matching over graph federation | 24
Win-Win: On Simultaneous Clustering and Imputing over Incomplete Data | 24
Saving Money for Analytical Workloads in the Cloud | 24
XDB in Action: Decentralized Cross-Database Query Processing for Black-Box DBMSes | 24
Enhancing Accuracy for Super Spreader Identification in High-Speed Data Streams | 24
OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance from Database Query Event Logs | 24
FS-Real: A Real-World Cross-Device Federated Learning Platform | 23
Less is More: Efficient Time Series Dataset Condensation via Two-Fold Modal Matching | 23
Efficient and Accurate SimRank-Based Similarity Joins: Experiments, Analysis, and Improvement | 23
From Zero to Hero: Detecting Leaked Data through Synthetic Data Injection and Model Querying | 23
Sparcle: Boosting the Accuracy of Data Cleaning Systems through Spatial Awareness | 23
Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines | 23
FastMosaic in Action: A New Mosaic Operator for Array DBMSs | 23
KGNav: A Knowledge Graph Navigational Visual Query System | 23
Discovering Leitmotifs in Multidimensional Time Series | 23
Navigating Data Repositories: Utilizing Line Charts to Discover Relevant Datasets | 22
Efficient Approximation of Certain and Possible Answers for Ranking and Window Queries over Uncertain Data | 22
LavaStore: ByteDance's Purpose-Built, High-Performance, Cost-Effective Local Storage Engine for Cloud Services | 22
Petabyte-Scale Row-Level Operations in Data Lakehouses | 22
CMixing: An Efficient Coin Mixing Platform to Enhance Anonymity in Cryptocurrency Transactions | 22
SlimChain | 21
Dealing with Acronyms, Abbreviations, and Typos in Real-World Entity Matching | 21
Anomaly detection in time series | 21
CORE-Sketch: On Exact Computation of Median Absolute Deviation with Limited Space | 21
A comprehensive survey and experimental comparison of graph-based approximate nearest neighbor search | 21
Polyglot data management | 20
AeonG: An Efficient Built-in Temporal Support in Graph Databases | 20
Cloud data systems | 20
Nuhuo: An Effective Estimation Model for Traffic Speed Histogram Imputation on A Road Network | 20
A Case for Graphics-Driven Query Processing | 20
Demonstration of generating explanations for black-box algorithms using Lewis | 20
Accelerating Maximal Clique Enumeration via Graph Reduction | 20
DAFDiscover: Robust Mining Algorithm for Dynamic Approximate Functional Dependencies on Dirty Data | 19
Starry | 19
Towards an optimized GROUP by abstraction for large-scale machine learning | 19
ResLake: Towards Minimum Job Latency and Balanced Resource Utilization in Geo-Distributed Job Scheduling | 19
Not black-box anymore! | 19
AUTOGR | 19
GENTI: GPU-Powered Walk-Based Subgraph Extraction for Scalable Representation Learning on Dynamic Graphs | 19
Pyneapple-G: Scalable Spatial Grouping Queries | 19
Machine learning for databases | 19
Uldp-FL: Federated Learning with Across-Silo User-Level Differential Privacy | 19
Fast neural ranking on bipartite graph indices | 19
Window Function Expression: Let the Self-Join Enter | 19
QuoteInspector: Gaining Insight about Social Media Discussions | 19
L2chain | 19
Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads | 18
Estimating spread of contact-based contagions in a population through sub-sampling | 18
Unleash the Power of Ellipsis: Accuracy-Enhanced Sparse Vector Technique with Exponential Noise | 18
Datamap-Driven Tabular Coreset Selection for Classifier Training | 18
Hyper-tune | 18
Efficient Fault Tolerance for Recommendation Model Training via Erasure Coding | 18
Quantifying Point Contributions: A Lightweight Framework for Efficient and Effective Query-Driven Trajectory Simplification | 18
Resource Management in Aurora Serverless | 18
Computing Rule-Based Explanations by Leveraging Counterfactuals | 18
Minimum Strongly Connected Subgraph Collection in Dynamic Graphs | 18
Combining Small Language Models and Large Language Models for Zero-Shot NL2SQL | 18
Demo of QueryBooster: Supporting Middleware-Based SQL Query Rewriting as a Service | 17
Missing value imputation on multidimensional time series | 17
Lingua Manga: A Generic Large Language Model Centric System for Data Curation | 17
DPSUR: Accelerating Differentially Private Stochastic Gradient Descent Using Selective Update and Release | 17
TranAD | 17
To UDFs and Beyond: Demonstration of a Fully Decomposed Data Processor for General Data Wrangling Tasks | 17
TimeCSL: Unsupervised Contrastive Learning of General Shapelets for Explorable Time Series Analysis | 17
Selective data acquisition in the wild for model charging | 17
CEDA: Learned Cardinality Estimation with Domain Adaptation | 17
A Hierarchical Grouping Algorithm for the Multi-Vehicle Dial-a-Ride Problem | 16
MiCS | 16
PGE | 16
YeSQL | 16
Scalable and Robust Snapshot Isolation for High-Performance Storage Engines | 16
Composable Data Management: An Execution Overview | 16
Differentially Private Data Generation with Missing Data | 16
Cquirrel | 16
DeFiHap | 16
Ganos Aero: A Cloud-Native System for Big Raster Data Management and Processing | 16
The case for distributed shared-memory databases with RDMA-enabled memory disaggregation | 15
Efficient Algorithms for Pseudoarboricity Computation in Large Static and Dynamic Graphs | 15
DataRinse: Semantic Transforms for Data Preparation Based on Code Mining | 15
ELEET: Efficient Learned Query Execution over Text and Tables | 15
ParChain | 15
TIGER: Training Inductive Graph Neural Network for Large-Scale Knowledge Graph Reasoning | 15
Skellam mixture mechanism | 15
Dynamic Graph Databases with Out-of-Order Updates | 15
Efficient kNN Search in Public Transportation Networks | 15
Trident | 15
Themis: A GPU-Accelerated Relational Query Execution Engine | 15
Towards distributed bitruss decomposition on bipartite graphs | 15
DBMS annihilator | 15
Simpler is More: Efficient Top-K Nearest Neighbors Search on Large Road Networks | 15
Fast approximate denial constraint discovery | 15
Efficient Discovery of Significant Patterns with Few-Shot Resampling | 15
QTCS: Efficient Query-Centered Temporal Community Search | 14
OceanBase Paetica: A Hybrid Shared-Nothing/Shared-Everything Database for Supporting Single Machine and Distributed Cluster | 14
Sancus | 14
Sound of databases | 14
Machine Learning for Subgraph Extraction: Methods, Applications and Challenges | 14
PIM-Tree | 14
PRICE: A Pretrained Model for Cross-Database Cardinality Estimation | 14
Efficient Triangle-Connected Truss Community Search in Dynamic Graphs | 14
Distributed learning of fully connected neural networks using independent subnet training | 14
ArcheType: A Novel Framework for Open-Source Column Type Annotation Using Large Language Models | 14
xFraud | 14
Cryptanalysis of an encrypted database in SIGMOD '14 | 14
TGL | 14
Detecting layout templates in complex multiregion files | 14
ImDiffusion: Imputed Diffusion Models for Multivariate Time Series Anomaly Detection | 14
ELPIS: Graph-Based Similarity Search for Scalable Data Science | 14
Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes | 14
LOGER: A Learned Optimizer Towards Generating Efficient and Robust Query Execution Plans | 14
Breaking It Down: An In-Depth Study of Index Advisors | 14
Designing production-friendly machine learning | 14
AMRAS | 14
Decomposed bounded floats for fast compression and queries | 14
Tigger: A Database Proxy That Bounces with User-Bypass | 14
Task: An Efficient Framework for Instant Error-Tolerant Spatial Keyword Queries on Road Networks | 14
Reimagining Deep Learning Systems through the Lens of Data Systems | 13
Exploiting Cloud Object Storage for High-Performance Analytics | 13
GraphScope | 13
WebMILE | 13
ABC | 13
Mixed Covers of Keys and Functional Dependencies for Maintaining the Integrity of Data under Updates | 13
Efficient Black-Box Checking of Snapshot Isolation in Databases | 13
NeutronStream: A Dynamic GNN Training Framework with Sliding Window for Graph Streams | 13
A queueing-theoretic framework for vehicle dispatching in dynamic car-hailing | 13
Angel-PTM: A Scalable and Economical Large-Scale Pre-Training System in Tencent | 13
Points-of-interest relationship inference with spatial-enriched graph neural networks | 13
ChainDash: An Ad-Hoc Blockchain Data Analytics System | 13
OFL-W3: A One-Shot Federated Learning System on Web 3.0 | 13
POEM: Pattern-Oriented Explanations of Convolutional Neural Networks | 13
Blink-hash: An Adaptive Hybrid Index for In-Memory Time-Series Databases | 13
Density Personalized Group Query | 13
Demonstration of accelerating machine learning inference queries with correlative proxy models | 13
Data and AI Model Markets: Opportunities for Data and Model Sharing, Discovery, and Integration | 13
Decentralized crowdsourcing for human intelligence tasks with efficient on-chain cost | 13
Explaining Differentially Private Query Results with DPXPlain | 13
Tiresias | 13
Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V | 13
In the land of data streams where synopses are missing, one framework to bring them all | 13
SecretFlow-SCQL: A Secure Collaborative Query Platform | 13
Efficient Execution of User-Defined Functions in SQL Queries | 13
FlowWalker: A Memory-Efficient and High-Performance GPU-Based Dynamic Graph Random Walk Framework | 13
Optimal Matrix Sketching over Sliding Windows | 13
No Repetition | 13
POLAR: Adaptive and Non-invasive Join Order Selection via Plans of Least Resistance | 13
Generating Succinct Descriptions of Database Schemata for Cost-Efficient Prompting of Large Language Models | 13
An Experimental Evaluation of Anomaly Detection in Time Series | 13
Troubles with nulls, views from the users | 13
MT-teql | 13
DeepJoin: Joinable Table Discovery with Pre-Trained Language Models | 12
The power of summarization in graph mining and learning | 12
Napa | 12
Hu-Fu | 12
Hu-fu | 12