Proceedings of the VLDB Endowment

Papers
(The median citation count of Proceedings of the VLDB Endowment is 3. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-04-01 to 2024-04-01.)
Article | Citations
TranAD | 149
PyTorch distributed | 142
Deep entity matching with pre-trained language models | 124
TiDB | 113
Cloudburst | 113
The PGM-index | 107
Anomaly detection in time series | 96
A benchmarking study of embedding-based entity alignment for knowledge graphs | 88
Dash | 78
Privacy preserving vertical federated learning for tree-based models | 76
Delta lake | 71
NeuroCard | 69
Data market platforms | 60
Benchmarking learned indexes | 58
uTree | 52
Maximum biclique search at billion scale | 50
MyRocks | 48
A comprehensive survey and experimental comparison of graph-based approximate nearest neighbor search | 48
LedgerDB | 47
Data collection and quality challenges for deep learning | 46
Sato | 46
Are we ready for learned cardinality estimation? | 46
Responsible data management | 45
Diagnosing root causes of intermittent slow queries in cloud databases | 45
SlimChain | 44
Tsunami | 44
Effectively learning spatial indices | 44
Query performance prediction for concurrent queries using graph embedding | 43
Series2Graph | 43
Apache IoTDB | 43
Analyzing and mitigating data stalls in DNN training | 43
TURL | 42
Viper | 42
Pangolin | 42
Magic mirror in my hand, which is the best in the land? | 41
Natural language to SQL | 40
Decoupled dynamic spatial-temporal graph neural network for traffic forecasting | 40
Towards scalable dataframe systems | 40
SAND | 40
openGauss | 39
AGL | 39
Hypergraph motifs | 37
Cerebro | 36
Atomic commitment across blockchains | 36
Single machine graph analytics on massive datasets using Intel optane DC persistent memory | 35
Baran | 35
An inquiry into machine learning-based automatic configuration tuning services on real-world database management systems | 35
Aria | 35
Updatable learned index with precise positions | 34
TSB-UAD | 34
Incrementalization of graph partitioning algorithms | 33
ATHENA++ | 32
Efficient algorithms for budgeted influence maximization on massive social networks | 32
ByShard | 31
Understanding the idiosyncrasies of real persistent memory | 31
Learned cardinality estimation | 31
FACE | 30
Quantifying TPC-H choke points and their optimizations | 30
Fair task assignment in spatial crowdsourcing | 30
Rank aggregation algorithms for fair consensus | 29
ARDA | 29
tf.data | 29
Leaper | 28
Fauce | 28
TRACE | 28
HET | 28
SPORES | 27
Deep learning for blocking in entity matching | 27
Hierarchical core maintenance on large dynamic graphs | 26
Identifying insufficient data coverage in databases with multiple relations | 26
AnalyticDB-V | 26
Dual-objective fine-tuning of BERT for entity matching | 25
A learned query rewrite system using Monte Carlo tree search | 25
GeCo | 25
Real-time distance-based outlier detection in data streams | 25
Exathlon | 25
KClist++ | 24
Randomized error removal for online spread estimation in data streaming | 24
Auctus | 24
RPT | 24
RapidMatch | 24
Accelerating truss decomposition on heterogeneous processors | 24
Multi-modal transportation recommendation with unified route representation learning | 24
Dealer | 24
ICS-GNN | 24
Large graph convolutional network training with GPU-oriented data communication architecture | 24
VHP | 24
Flow-loss | 23
Efficient size-bounded community search over large networks | 23
Building enclave-native storage engines for practical encrypted databases | 23
CGPTuner | 23
On-off sketch | 23
An analysis of concurrency control protocols for in-memory databases with CCBench | 23
Dremel | 23
DeepTRANS | 23
G³ | 23
Frequency estimation under local differential privacy | 23
CodexDB | 22
Missing value imputation on multidimensional time series | 22
ADnEV | 22
Adopting worst-case optimal joins in relational database systems | 22
SAQE | 22
FREDE | 22
Anytime stochastic routing with hybrid learning | 22
Oracle AutoML | 21
GraphAn | 21
Improving reproducibility of data science pipelines through transparent provenance capture | 21
Jointly optimizing preprocessing and inference for DNN-based visual analytics | 21
Hybrid blockchain database systems | 21
Understanding the effect of data center resource disaggregation on production DBMSs | 21
Data management in microservices | 21
F1 lightning | 21
AutoCTS | 21
ByteGNN | 21
Efficiently approximating selectivity functions using low overhead regression models | 21
Relational data synthesis using generative adversarial networks | 20
Scrutinizer | 20
Lux | 20
Automated generation of materialized views in Oracle | 20
Scaling attributed network embedding to massive graphs | 20
FLAT | 20
FlexPushdownDB | 20
Hu-Fu | 20
Efficiently answering reachability and path queries on temporal bipartite graphs | 19
Real-world trajectory sharing with local differential privacy | 19
EMOGI | 19
Effective and efficient relational community detection and search in large dynamic heterogeneous information networks | 19
GraphScope | 19
Spitz | 19
Efficient bi-triangle counting for large bipartite networks | 19
Distributed hop-constrained s-t simple path enumeration at billion scale | 19
TGL | 19
Cardinality estimation in DBMS | 19
Can Foundation Models Wrangle Your Data? | 19
MDTP | 19
Constructing and analyzing the LSM compaction design space | 18
Machine learning for databases | 18
APEX | 18
Elle | 18
The computation of optimal subset repairs | 18
Persistent memory hash indexes | 18
Efficient oblivious database joins | 18
RONIN | 18
Fairly evaluating and scoring items in a data set | 18
Selective data acquisition in the wild for model charging | 17
DeepTrack: Monitoring and exploring spatio-temporal data | 17
Capturing and querying fine-grained provenance of preprocessing pipelines in data science | 17
Pricing influential nodes in online social networks | 17
Data synthesis via differentially private markov random fields | 17
From natural language processing to neural databases | 17
Butterfly-core community search over labeled graphs | 17
The art of balance | 17
Ordering heuristics for k-clique listing | 17
FINEdex | 17
Efficient join algorithms for large database tables in a multi-GPU environment | 17
POLARIS | 17
The simpler the better | 17
Volume under the surface | 17
Stingy sketch | 16
Topic-based community search over spatial-social networks | 16
Building high throughput permissioned blockchain fabrics | 16
VolcanoML | 16
Epoch-based commit and replication in distributed OLTP databases | 16
Capturing associations in graphs | 16
FSST | 16
Answering multi-dimensional range queries under local differential privacy | 16
Fine-grained lineage for safer notebook interactions | 16
CGM | 16
Stable learned bloom filters for data streams | 16
Watermarks in stream processing systems | 16
UDO | 16
Unsupervised time series outlier detection with diversity-driven convolutional ensembles | 16
Towards cost-effective and elastic cloud database deployment via memory disaggregation | 16
On the efficiency of K-means clustering | 16
Accelerating large scale real-time GNN inference using channel pruning | 16
Efficient and effective similar subtrajectory search with deep reinforcement learning | 16
NeuChain | 15
Fast subtrajectory similarity search in road networks under weighted edit distance constraints | 15
Sancus | 15
Revisiting the design of LSM-tree Based OLTP storage engine with persistent memory | 15
Efficient maximal biclique enumeration for large sparse bipartite graphs | 15
Nearest neighbor classifiers over incomplete information | 15
xFraud | 15
Query driven-graph neural networks for community search | 15
ConnectIt | 15
Cost-Based or Learning-Based? | 15
Projected federated averaging with heterogeneous differential privacy | 15
Automated feature engineering for algorithmic fairness | 14
KDV-explorer | 14
Massively parallel algorithms for personalized pagerank | 14
Teseo and the analysis of structural dynamic graphs | 14
Distributed subgraph counting | 14
Decomposed bounded floats for fast compression and queries | 14
Distributed deep learning on data systems | 14
Efficient label-constrained shortest path queries on road networks | 14
Finding large diverse communities on networks | 14
Zen | 14
Astrid | 14
DSB | 14
Tensors | 14
Approximate denial constraints | 14
Accelerating recommendation system training by leveraging popular choices | 14
Conversational BI | 13
Comprehensive and efficient workload compression | 13
(p,q)-biclique counting and enumeration for large sparse bipartite graphs | 13
Towards crowd-aware indoor path planning | 13
Obi-Wan | 13
Refiner | 13
New trends in high-D vector similarity search | 13
Tensor relational algebra for distributed machine learning system design | 13
iDEC | 13
Towards instance-optimized data systems | 13
Netherite | 13
FederatedScope: A Flexible Federated Learning Platform for Heterogeneity | 13
SChain | 13
Optimizing inference serving on serverless platforms | 13
OceanBase | 13
CALYPSO | 13
Data acquisition for improving machine learning models | 13
ThunderRW | 13
DeepTEA | 13
Integrating Data Lake Tables | 13
HydraList | 13
Do the best cloud configurations grow on trees? | 12
MorphStore | 12
Adaptive data augmentation for supervised learning over missing data | 12
Seagull | 12
AutoToken | 12
The relational data borg is learning | 12
SmartBench | 12
Guided exploration of user groups | 12
Hitting set enumeration with partial information for unique column combination discovery | 12
Bagua | 12
Shortest paths and centrality in uncertain networks | 12
Facilitating database tuning with hyper-parameter optimization | 12
Moneyball | 12
ODIN | 12
Butterfly counting on uncertain bipartite graphs | 12
Helios | 12
Tailoring data source distributions for fairness-aware data integration | 12
Monarch | 12
GRAIN | 12
NBTree | 12
Crystal | 12
Sage | 12
On analyzing graphs with motif-paths | 12
Demand-based sensor data gathering with multi-query optimization | 12
Optimal algorithms for ranked enumeration of answers to full conjunctive queries | 11
An experimental evaluation and guideline for path finding in weighted dynamic network | 11
Kamino | 11