VLDB Journal

Papers
(The median citation count of VLDB Journal is 2. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2020-11-01 to 2024-11-01.)
Article | Citations
Data collection and quality challenges in deep learning: a data-centric AI perspective | 138
Fast-adapting and privacy-preserving federated recommender system | 52
Managing bias and unfairness in data for decision support: a survey of machine learning and data engineering approaches to identify and mitigate bias and unfairness within data management and analytic | 48
Fairness in rankings and recommendations: an overview | 45
A survey of RDF stores & SPARQL engines for querying knowledge graphs | 41
A model and query language for temporal graph databases | 37
MDDE: multitasking distributed differential evolution for privacy-preserving database fragmentation | 35
A survey on outlier explanations | 32
A benchmark and comprehensive survey on knowledge graph entity alignment via representation learning | 32
A survey on deep learning approaches for text-to-SQL | 30
Unsupervised and scalable subsequence anomaly detection in large data series | 29
LineageChain: a fine-grained, secure and efficient data provenance system for blockchains | 29
Data dependencies for query optimization: a survey | 27
Location- and keyword-based querying of geo-textual data: a survey | 26
Cross-chain deals and adversarial commerce | 23
Towards efficient solutions of bitruss decomposition for large-scale bipartite graphs | 23
Memory-aware framework for fast and scalable second-order random walk over billion-edge natural graphs | 22
Tidy Tuples and Flying Start: fast compilation and fast execution of relational queries in Umbra | 22
Distributed temporal graph analytics with GRADOOP | 19
Privacy and efficiency guaranteed social subgraph matching | 17
Dragoon: a hybrid and efficient big trajectory management system for offline and online analytics | 16
Visually aware recommendation with aesthetic features | 15
A survey on semantic schema discovery | 15
PrefixFPM: a parallel framework for general-purpose mining of frequent and closed patterns | 14
Answering reachability and K-reach queries on large graphs with label constraints | 14
eRiskCom: an e-commerce risky community detection platform | 14
On entity alignment at scale | 14
CDBTune+: An efficient deep reinforcement learning-based automatic cloud database tuning system | 14
I/O efficient k-truss community search in massive graphs | 13
In-Memory Interval Joins | 13
A dataspace-based framework for OLAP analyses in a high-variety multistore | 13
A survey on the evolution of stream processing systems | 13
GeoSparkViz: a cluster computing system for visualizing massive-scale geospatial data | 13
Fast subgraph query processing and subgraph matching via static and dynamic equivalences | 12
Data distribution debugging in machine learning pipelines | 12
HFUL: a hybrid framework for user account linkage across location-aware social networks | 12
Efficient kNN query for moving objects on time-dependent road networks | 12
Efficient and effective ER with progressive blocking | 11
ByShard: sharding in a Byzantine environment | 10
Algorithms for the discovery of embedded functional dependencies | 10
Parallel mining of large maximal quasi-cliques | 10
A cost model for random access queries in document stores | 10
Efficient Hop-constrained s-t Simple Path Enumeration | 10
Better database cost/performance via batched I/O on programmable SSD | 9
Fast data series indexing for in-memory data | 9
Pivot selection algorithms in metric spaces: a survey and experimental study | 9
G-thinker: a general distributed framework for finding qualified subgraphs in a big graph with load balancing | 9
Leveraging range joins for the computation of overlap joins | 8
Accelerated butterfly counting with vertex priority on bipartite graphs | 8
Unified route representation learning for multi-modal transportation recommendation with spatiotemporal pre-training | 8
Effective entity matching with transformers | 8
Model averaging in distributed machine learning: a case study with Apache Spark | 8
VolcanoML: speeding up end-to-end AutoML via scalable search space decomposition | 8
Semantic embedding for regions of interest | 7
Efficient distributed discovery of bidirectional order dependencies | 7
(p,q)-biclique counting and enumeration for large sparse bipartite graphs | 7
A design space for RDF data representations | 7
Cache-efficient sweeping-based interval joins for extended Allen relation predicates | 7
ProS: data series progressive k-NN similarity search and classification with probabilistic quality guarantees | 7
Complex event forecasting with prediction suffix trees | 6
Resource-aware adaptive indexing for in situ visual exploration and analytics | 6
ICS-GNN+: lightweight interactive community search via graph neural network | 6
PM-LSH: a fast and accurate in-memory framework for high-dimensional approximate NN and closest pair search | 6
ABSTAT-HD: a scalable tool for profiling very large knowledge graphs | 6
A meta-level analysis of online anomaly detectors | 6
Maximum and top-k diversified biclique search at scale | 6
RDFFrames: knowledge graph access for machine learning tools | 6
ExactSim: benchmarking single-source SimRank algorithms with high-precision ground truths | 6
Deep entity matching with adversarial active learning | 6
Picket: guarding against corrupted data in tabular data during learning and inference | 6
Span-reachability querying in large temporal graphs | 6
Interactively discovering and ranking desired tuples by data exploration | 5
Application-driven graph partitioning | 5
Fast fully dynamic labelling for distance queries | 5
HINT: a hierarchical interval index for Allen relationships | 5
Distributed detection of sequential anomalies in univariate time series | 5
Accelerating multi-way joins on the GPU | 5
Cleaning timestamps with temporal constraints | 5
Toward maintenance of hypercores in large-scale dynamic hypergraphs | 5
Continuous monitoring of moving skyline and top-k queries | 5
Local dampening: differential privacy for non-numeric queries via local sensitivity | 5
Payment behavior prediction on shared parking lots with TR-GCN | 5
Comparison and evaluation of state-of-the-art LSM merge policies | 5
Multi-constraint shortest path using forest hop labeling | 4
Learned sketch for subgraph counting: a holistic approach | 4
Zen+: a robust NUMA-aware OLTP engine optimized for non-volatile main memory | 4
Simple and automated negative sampling for knowledge graph embedding | 4
Robust and scalable content-and-structure indexing | 4
Formal semantics and high performance in declarative machine learning using Datalog | 4
General graph generators: experiments, analyses, and improvements | 4
Anchored coreness: efficient reinforcement of social networks | 4
An authorization model for query execution in the cloud | 4
ABC of order dependencies | 4
Anytime bottom-up rule learning for large-scale knowledge graph completion | 4
An analysis of one-to-one matching algorithms for entity resolution | 3
Making graphs compact by lossless contraction | 3
In-order sliding-window aggregation in worst-case constant time | 3
Micro-architectural analysis of in-memory OLTP: Revisited | 3
Internal and external memory set containment join | 3
A fractional memory-efficient approach for online continuous-time influence maximization | 3
A Pareto optimal Bloom filter family with hash adaptivity | 3
Fast, exact, and parallel-friendly outlier detection algorithms with proximity graph in metric spaces | 3
Survey of window types for aggregation in stream processing systems | 3
Practical planning and execution of groupjoin and nested aggregates | 3
Survey of vector database management systems | 3
Deep treatment-adaptive network for causal inference | 3
Label-constrained shortest path query processing on road networks | 3
Diversifying recommendations on sequences of sets | 3
Reverse spatial top-k keyword queries | 3
A survey on transactional stream processing | 2
Data-induced predicates for sideways information passing in query optimizers | 2
P²CG: a privacy preserving collaborative graph neural network training framework | 2
Efficient exploratory clustering analyses in large-scale exploration processes | 2
Opportunities for optimism in contended main-memory multicore transactions | 2
Accelerating directed densest subgraph queries with software and hardware approaches | 2
Exploiting domain knowledge to address class imbalance and a heterogeneous feature space in multi-class classification | 2
Time-topology analysis on temporal graphs | 2
Fine-grained semantic type discovery for heterogeneous sources using clustering | 2
Privacy-preserving worker allocation in crowdsourcing | 2
SQUID: subtrajectory query in trillion-scale GPS database | 2
Efficient structural node similarity computation on billion-scale graphs | 2
Distance labeling: on parallelism, compression, and ordering | 2
Efficient detection of multivariate correlations with different correlation measures | 2
Detecting rumours with latency guarantees using massive streaming data | 2
BugDoc | 2
Learning-based query optimization for multi-probe approximate nearest neighbor search | 2
Coalition-based task assignment with priority-aware fairness in spatial crowdsourcing | 2
Have query optimizers hit the wall? | 2
Tabular data synthesis with generative adversarial networks: design space and optimizations | 2