OOIR: Observatory of International Research

Papers

(The median citation count of VLDB Journal is 3. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2022-05-01 to 2026-05-01.)

Article	Citations
Threshold queries in theory and in the wild	441
An efficient and scalable graph database with built-in temporal support	128
Beyond influence: voting theory for opinion maximization	111
Generating highly customizable python code for data processing with large language models	85
Optimizing navigational graph queries	80
Efficiently Counting Four-Node Motifs in Large-Scale Temporal Graphs	35
Transactional panorama: a conceptual framework for user perception in analytical visual interfaces (extended version)	33
Missing Value Imputation in Tabular Data Lakes Unleashed: A Hybrid Approach	31
Reverse spatial top-k keyword queries	27
FOSS: A learned doctor for query optimization	26
Efficient and robust active learning methods for interactive database exploration	26
BioGITOM: Matching Biomedical Ontologies with Graph Isomorphism Transformer	25
Model reusability in Reinforcement Learning	25
Third and Boyce–Codd normal form for property graphs	20
Hyper-distance oracles in hypergraphs	20
Hu-Fu: efficient and secure spatial queries over data federation	20
Fast subgraph query processing and subgraph matching via static and dynamic equivalences	19
Efficient graph embedding at scale: optimizing CPU-GPU-SSD integration	19
In-database query optimization on SQL with ML predicates	19
On efficient 3D object retrieval	17
A new window Clause for SQL++	15
Learned sketch for subgraph counting: a holistic approach	15
PTSSP: privacy-preserving top-k spatial keyword similarity query with priority matching	15
Special issue on the best papers of DaMoN 2020	14
GPU-based butterfly counting	14
Discovering critical vertices for reinforcement of large-scale bipartite networks	14
LEON+: towards robust ML-aided query optimization	13
Efficient top-k spatial-range-constrained approximate nearest neighbor search on geo-tagged high-dimensional vectors	13
An update-intensive LSM-based R-tree index	12
DB-BERT: making database tuning tools “read” the manual	11
The Status-Quo in nested data processing for high-energy physics	11
P²CG: a privacy preserving collaborative graph neural network training framework	11
Multi-constraint shortest path using forest hop labeling	11
DBSP: automatic incremental view maintenance for rich query languages	11
SQUID: subtrajectory query in trillion-scale GPS database	11
Correction to: BugDoc: Iterative debugging and explanation of pipeline executions	10
On Querying Historical Connectivity in Large-scale Temporal Graphs	10
LIST: learning to index spatio-textual data for embedding based spatial keyword queries	10
ByShard: sharding in a Byzantine environment	10
DIST: Efficient k-Clique Listing via Induced Subgraph Trie	10
Efficient and effective algorithms for densest subgraph discovery and maintenance	9
Privacy-Utility Balanced Cooperative Online Matching in Spatial Crowdsourcing	9
Efficient detection of multivariate correlations with different correlation measures	9
DumpyOS: A data-adaptive multi-ary index for scalable data series similarity search	9
Towards flexibility and robustness of LSM trees	9
Anytime bottom-up rule learning for large-scale knowledge graph completion	8
A graph pattern mining framework for large graphs on GPU	8
Efficient and scalable huge embedding model training via distributed cache management	8
Incremental discovery of denial constraints	8
Generating adversarial SQL queries for evaluating cardinality estimators	8
Eris: efficiently measuring discord in multidimensional sources	7
Assisted design of data science pipelines	7
Tiered-Indexing: Optimizing Access Methods for Skew	7
AutoML in heavily constrained applications	7
Accelerating directed densest subgraph queries with software and hardware approaches	7
Efficient discovery of co-movement patterns from video data	7
A survey on deep learning approaches for text-to-SQL	6
Special issue: modern hardware	6
A survey on the evolution of stream processing systems	6
ICS-GNN⁺: lightweight interactive community search via graph neural network	6
Performant almost-latch-free data structures using epoch protection in more depth	6
Survey of window types for aggregation in stream processing systems	6
Editorial for Special Issue: VLDB 2022	6
Efficient Algorithms for Uncertain Restricted Skyline Query Processing	5
Morphtree: a polymorphic main-memory learned index for dynamic workloads	5
xDBTagger: explainable natural language interface to databases using keyword mappings and schema graph	5
Lamba: A pretrained model for latency prediction over distributed databases	5
A multi-facet analysis of BERT-based entity matching models	5
Scalable decoupling graph neural network with feature-oriented optimization	5
HINT: a hierarchical interval index for Allen relationships	5
How good are machine learning clouds? Benchmarking two snapshots over 5 years	5
BatchHL⁺: batch dynamic labelling for distance queries on large-scale networks	4
An Evaluation of B-tree Compression Techniques	4
Editorial: Special Issue for Selected Papers of VLDB 2021	4
VUS: effective and efficient accuracy measures for time-series anomaly detection	4
Data collection and quality challenges in deep learning: a data-centric AI perspective	4
Netherite: efficient execution of serverless workflows	4
HPCache: memory-efficient OLAP through proportional caching revisited	4
Tee-based key-value stores: a survey	4
FlexpushdownDB: rethinking computation pushdown for cloud OLAP DBMSs	4
Time-topology analysis on temporal graphs	4
Highly distributed and privacy-preserving queries on personal data management systems	4
Scalable lighting-fast temporal indexing	4
Join optimization revisited: a novel DP algorithm for join&sort order selection	4
A near-optimal approach to edge connectivity-based hierarchical graph decomposition	4
Identifying similar-bicliques in bipartite graphs	4
Efficient Task Assignment for Multi-Workerset Crowdsourcing with Time and Expense Considerations	4
MSAD: A deep dive into model selection for time series anomaly detection	4
A generic framework for efficient computation of top-k diverse results	4
Efficient algorithms for reachability and path queries on temporal bipartite graphs	3
Density decomposition on large static and dynamic graphs: algorithms and applications	3
A powerful reducing framework for accelerating set intersections over graphs	3
A systematic evaluation of machine learning on serverless infrastructure	3
MinJoin++: a fast algorithm for string similarity joins under edit distance	3
Efficient kNN query for moving objects on time-dependent road networks	3
HERMES: data placement and schema optimization for enterprise knowledge bases	3
Similarity-driven and task-driven models for diversity of opinion in crowdsourcing markets	3
Flexible grouping of linear segments for highly accurate lossy compression of time series data	3
Butterfly counting and bitruss decomposition on uncertain bipartite graphs	3
Measuring approximate functional dependencies: a comparative study	3
Accelerating maximum biplex search over large bipartite graphs	3
C5: cloned concurrency control that always keeps up	3
Leveraging user itinerary to improve personalized deep matching at Fliggy	3
Finding Locally Densest Subgraphs: Convex Programming with Edge and Triangle Density	3
Ingress: an automated incremental graph processing system	3
Detecting rumours with latency guarantees using massive streaming data	3
Enhancing domain-aware multi-truth data fusion using copy-based source authority and value similarity	3
Temporal graph patterns by timed automata	3
SWOOP: top-k similarity joins over set streams	3
Hypergraph motifs and their extensions beyond binary	3
AutoCTS++: zero-shot joint neural architecture and hyperparameter search for correlated time series forecasting	3
Table integration in data lakes unleashed: pairwise integrability judgment, integrable set discovery, and multi-tuple conflict resolution	3
Cardinality estimation using normalizing flow	3
Reliability evaluation of individual predictions: a data-centric approach	3
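
The selection rule described in the note above the table (keep the papers whose citation count reaches the journal median, cap the list at 250) might be sketched as follows. This is only an illustration, not OOIR's actual pipeline: the function names `parse_table` and `select_papers` are hypothetical, the sample titles are made up, the median is assumed to be computed over all rows passed in, and the comparison is assumed to be "at or above" because the table includes papers with exactly 3 citations.

```python
from statistics import median


def parse_table(text):
    """Parse 'Article<TAB>Citations' lines, skipping the header and blank lines."""
    rows = []
    for line in text.splitlines():
        # Split on the last tab so titles containing tabs would still parse.
        title, sep, count = line.rpartition("\t")
        if sep and count.strip().isdigit():
            rows.append((title.strip(), int(count)))
    return rows


def select_papers(rows, max_papers=250):
    """Keep papers at or above the median citation count, most cited first.

    Assumption: `rows` holds every paper in the time window, so the median
    of its counts stands in for the journal median quoted on the page.
    """
    if not rows:
        return []
    threshold = median(count for _, count in rows)
    kept = [(title, count) for title, count in rows if count >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:max_papers]


# Hypothetical sample data, not taken from the table above.
sample = "Article\tCitations\nPaper A\t10\nPaper B\t3\nPaper C\t1\n"
rows = parse_table(sample)
selected = select_papers(rows)
```

With the sample input the median is 3, so `selected` keeps "Paper A" and "Paper B" and drops "Paper C".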