Journal of Big Data

Papers
(The median citation count of Journal of Big Data is 3. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-02-01 to 2025-02-01.)
Article | Citations
Domain-relevance of influence: characterizing variations in online influence across multiple domains on social media | 3559
Detecting unregistered users through semi-supervised anomaly detection with similarity datasets | 417
DD-KARB: data-driven compliance to quality by rule based benchmarking | 269
A new dimensionality reduction technique based on the Wavelet Transform for cancer classification | 243
A machine learning-based credit risk prediction engine system using a stacked classifier and a filter-based feature selection method | 182
A scheduling algorithm to maximize storm throughput in heterogeneous cluster | 171
Prognostic stratification based on HIF-1α signaling for evaluating hypoxia status and immune landscape in hepatocellular carcinoma | 166
Dissimilarity space reinforced with manifold learning and latent space modeling for improved pattern classification | 152
Gaussian transformation enhanced semi-supervised learning for sleep stage classification | 117
Poisson logit hurdle model with associated factors of perinatal mortality in Ethiopia | 114
Digital social innovation based on Big Data Analytics for health and well-being of society | 113
Missing values compensation in duplicates detection using hot deck method | 113
Dual channel and multi-scale adaptive morphological methods for infrared small targets | 107
Efficient pollen grain classification using pre-trained Convolutional Neural Networks: a comprehensive study | 95
PCJ Java library as a solution to integrate HPC, Big Data and Artificial Intelligence workloads | 79
Modeling and tracking Covid-19 cases using Big Data analytics on HPCC system platform | 67
An enhanced random forest approach using CoClust clustering: MIMIC-III and SMS spam collection application | 65
Estimating the carbon content of oceans using satellite sensor data | 64
RTiSR: a review-driven time interval-aware sequential recommendation method | 58
Machine learning approach for predicting production delays: a quarry company case study | 58
GB-AFS: graph-based automatic feature selection for multi-class classification via Mean Simplified Silhouette | 54
Education on quality assurance and assessment in teaching quality of high school instructors | 53
Exploring the form of big data products and the supporting systems | 49
Evaluation is key: a survey on evaluation measures for synthetic time series | 49
Designing and evaluating a big data analytics approach for predicting students’ success factors | 49
A proposed hybrid framework to improve the accuracy of customer churn prediction in telecom industry | 47
Tabular and latent space synthetic data generation: a literature review | 45
Hyperdimensional computing: a framework for stochastic computation and symbolic AI | 44
Error and optimism bias regularization | 43
Quality assurance strategies for machine learning applications in big data analytics: an overview | 41
New custom rating for improving recommendation system performance | 40
Identification of tumor antigens and anoikis-based molecular subtypes in the hepatocellular carcinoma immune microenvironment: implications for mRNA vaccine development and precision treatment | 40
An analysis of COVID-19 economic measures and attitudes: evidence from social media mining | 39
A fuel consumption-based method for developing local-specific CO2 emission rate database using open-source big data | 39
Automated segmentation of choroidal neovascularization on optical coherence tomography angiography images of neovascular age-related macular degeneration patients based on deep learning | 38
Autoencoder-kNN meta-model based data characterization approach for an automated selection of AI algorithms | 37
Early prediction of MODS interventions in the intensive care unit using machine learning | 37
Data analysis for vague contingency data | 36
Classification of long-term clinical course of Parkinson’s disease using clustering algorithms on social support registry database | 35
Title2Vec: a contextual job title embedding for occupational named entity recognition and other applications | 34
Real-time spatio-temporal event detection on geotagged social media | 34
CTGAN-ENN: a tabular GAN-based hybrid sampling method for imbalanced and overlapped data in customer churn prediction | 34
Social media analysis of car parking behavior using similarity based clustering | 33
Remote patient monitoring and classifying using the internet of things platform combined with cloud computing | 33
The differences in gastric cancer epidemiological data between SEER and GBD: a joinpoint and age-period-cohort analysis | 32
Free trade as domestic, economic, and strategic issues: a big data analytics approach | 32
Network intrusion detection using data dimensions reduction techniques | 32
Gene selection via improved nuclear reaction optimization algorithm for cancer classification in high-dimensional data | 31
Click-through rate prediction model integrating user interest and multi-head attention mechanism | 31
Defining user spectra to classify Ethereum users based on their behavior | 31
An empirical study on the evaluation of the RDF storage systems | 31
Machine learning concepts for correlated Big Data privacy | 30
Cartographies of warfare in the Indian subcontinent: Contextualizing archaeological and historical analysis through big data approaches | 29
A hybrid Hadoop-based sentiment analysis classifier for tweets associated with COVID-19 utilizing two machine learning algorithms: CNN, and fuzzy C4.5 | 29
Context-aware prediction of active and passive user engagement: Evidence from a large online social platform | 29
Iterative cleaning and learning of big highly-imbalanced fraud data using unsupervised learning | 29
Skyline query under multidimensional incomplete data based on classification tree | 29
An LSTM and GRU based trading strategy adapted to the Moroccan market | 28
Towards a folksonomy graph-based context-aware recommender system of annotated books | 28
The stability of different aggregation techniques in ensemble feature selection | 28
Watch and learn: event-domain term extraction from social networks | 28
Operationalizing and automating Data Governance | 27
Exploring the state of the art in legal QA systems | 26
Detection of fickle trolls in large-scale online social networks | 26
Memetic multilabel feature selection using pruned refinement process | 26
Hemorrhage semantic segmentation in fundus images for the diagnosis of diabetic retinopathy by using a convolutional neural network | 26
Anomaly detection and community detection in networks | 26
Algorithm for generating neutrosophic data using accept-reject method | 26
Unsupervised hyperspectral image segmentation of films: a hierarchical clustering-based approach | 26
ASENN: attention-based selective embedding neural networks for road distress prediction | 25
A survey of graph convolutional networks (GCNs) in FPGA-based accelerators | 25
A review on lung disease recognition by acoustic signal analysis with deep learning networks | 25
Profitability trend prediction in crypto financial markets using Fibonacci technical indicator and hybrid CNN model | 25
Low-level turbulence risk assessment and visualization using temporal rate of change of headwind of an aircraft | 25
Usability enhancement model for unstructured text in big data | 24
Spatial heterogeneities in acute lower respiratory infections prevalence and determinants across Ethiopian administrative zones | 24
Data-driven multinomial random forest: a new random forest variant with strong consistency | 24
Automatic identification and classification of pediatric glomerulonephritis on ultrasound images based on deep learning and radiomics | 24
A data value metric for quantifying information content and utility | 24
De-occlusion and recognition of frontal face images: a comparative study of multiple imputation methods | 24
High-performance computing in healthcare: An automatic literature analysis perspective | 24
Large language models, social demography, and hegemony: comparing authorship in human and synthetic text | 23
Sentiment analysis of Indonesian datasets based on a hybrid deep-learning strategy | 23
Accuracy improvements for cold-start recommendation problem using indirect relations in social networks | 22
Apply machine learning techniques to detect malicious network traffic in cloud computing | 21
Web crawling based context aware recommender system using optimized deep recurrent neural network | 21
Array databases: concepts, standards, implementations | 21
A universal approach for multi-model schema inference | 21
Data reduction techniques for highly imbalanced medicare Big Data | 20
A novel sensitivity-based method for feature selection | 20
A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications | 20
Detecting cybersecurity attacks across different network features and learners | 20
The state of metaverse research: a bibliometric visual analysis based on CiteSpace | 19
Deep learning enhancing banking services: a hybrid transaction classification and cash flow prediction approach | 19
Breast cancer prediction using gated attentive multimodal deep learning | 19
Diabetes emergency cases identification based on a statistical predictive model | 18
Big data quality framework: a holistic approach to continuous quality management | 18
Comprehensive study of driver behavior monitoring systems using computer vision and machine learning techniques | 18
A literature review on one-class classification and its potential applications in big data | 18
Exploration of issues, challenges and latest developments in autonomous cars | 18
Robust visual tracking using very deep generative model | 18
Hybrid beluga whale optimization algorithm with multi-strategy for functions and engineering optimization problems | 18
Text Data Augmentation for Deep Learning | 18
Optimizing poultry audio signal classification with deep learning and burn layer fusion | 18
Text based personality prediction from multiple social media data sources using pre-trained language model and model averaging | 18
Big data processing using hybrid Gaussian mixture model with salp swarm algorithm | 17
Online listing data and their interaction with market dynamics: evidence from Singapore during COVID-19 | 17
Correlation-based feature selection of single cell transcriptomics data from multiple sources | 17
Out-of-distribution- and location-aware PointNets for real-time 3D road user detection without a GPU | 17
Dual-weight decay mechanism and Nelder-Mead simplex boosted RIME algorithm for optimal power flow | 17
Correction to: Arabic text summarization using deep learning approach | 17
A survey of dimension reduction and classification methods for RNA-Seq data on malaria vector | 17
Leveraging fine-grained mobile data for churn detection through Essence Random Forest | 17
Artificial intelligence for improving Nitrogen Dioxide forecasting of Abu Dhabi environment agency ground-based stations | 17
An integrated multistage ensemble machine learning model for fraudulent transaction detection | 17
Pre-trained transformer-based language models for Sundanese | 16
Assessing the effects of hyperparameters on knowledge graph embedding quality | 16
Improving efficiency for discovering business processes containing invisible tasks in non-free choice | 16
Classification of SSVEP-based BCIs using Genetic Algorithm | 16
A large-scale sentiment analysis of tweets pertaining to the 2020 US presidential election | 16
RILS-ROLS: robust symbolic regression via iterated local search and ordinary least squares | 15
MuSe: a multi-level storage scheme for big RDF data using MapReduce | 15
Topological variable neighborhood search | 15
Detection and prevention of SQLI attacks and developing compressive framework using machine learning and hybrid techniques | 15
Performance evaluation of deep learning techniques for DoS attacks detection in wireless sensor network | 15
Modelling customers credit card behaviour using bidirectional LSTM neural networks | 15
An adaptive hybrid african vultures-aquila optimizer with Xgb-Tree algorithm for fake news detection | 15
Scalable and space-efficient Robust Matroid Center algorithms | 14
On the development of an information system for monitoring user opinion and its role for the public | 14
Potential for the use of large unstructured data resources by public innovation support institutions | 14
Bayesian multilevel model on maternal mortality in Ethiopia | 14
Internal dynamics of patent reference networks using the Bray–Curtis dissimilarity measure | 14
Multi-level lag scheme significantly improves training efficiency in deep learning: a case study in air quality alert service over sub-tropical area | 14
Assessing the current landscape of AI and sustainability literature: identifying key trends, addressing gaps and challenges | 14
Efficient microservices offloading for cost optimization in diverse MEC cloud networks | 14
Can we predict multi-party elections with Google Trends data? Evidence across elections, data windows, and model classes | 13
15 years of Big Data: a systematic literature review | 13
Review of deep learning methods for remote sensing satellite images classification: experimental survey and comparative analysis | 13
Introducing Mplots: scaling time series recurrence plots to massive datasets | 13
Fast cluster-based computation of exact betweenness centrality in large graphs | 13
Discovering top-weighted k-truss communities in large graphs | 13
IGRF-RFE: a hybrid feature selection method for MLP-based network intrusion detection on UNSW-NB15 dataset | 13
AI sees beyond humans: automated diagnosis of myopia based on peripheral refraction map using interpretable deep learning | 13
Chromatin state distribution of residue-specific histone acetylation in early myoblast differentiation | 13
Social media analysis of Twitter tweets related to ASD in 2019–2020, with particular attention to COVID-19: topic modelling and sentiment analysis | 13
Expanded graph embedding for joint network alignment and link prediction | 13
Machine learning techniques to predict daily rainfall amount | 12
RPf-GCNs: reciprocal perspective driven fused GCNs for rumor detection on social media | 12
Deep reinforcement learning for data-efficient weakly supervised business process anomaly detection | 12
EXABSUM: a new text summarization approach for generating extractive and abstractive summaries | 12
Accelerating neural network training with distributed asynchronous and selective optimization (DASO) | 12
Deep learning based deep-sea automatic image enhancement and animal species classification | 12
DEMFFA: a multi-strategy modified Fennec Fox algorithm with mixed improved differential evolutionary variation strategies | 12
Data analysis for sequential contingencies under uncertainty | 12
The adaptive community-response (ACR) method for collecting misinformation on social media | 12
A parallelization model for performance characterization of Spark Big Data jobs on Hadoop clusters | 11
A service-categorized security scheme with physical unclonable functions for internet of vehicles | 11
Machine learning-based turbulence-risk prediction method for the safe operation of aircrafts | 11
Using social media for sub-event detection during disasters | 11
VeilGraph: incremental graph stream processing | 11
A guide to creating an effective big data management framework | 11
VEDAS: an efficient GPU alternative for store and query of large RDF data sets | 11
Design, development and performance analysis of cognitive assisting aid with multi sensor fused navigation for visually impaired people | 11
Deep-Eware: spatio-temporal social event detection using a hybrid learning model | 11
An unsupervised method for social network spammer detection based on user information interests | 11
Distributed fuzzy clustering algorithm for mixed-mode data in Apache SPARK | 11
Mapping and 3D modelling using quadrotor drone and GIS software | 10
A systematic literature review of neuroimaging coupled with machine learning approaches for diagnosis of attention deficit hyperactivity disorder | 10
Towards a deep learning-based outlier detection approach in the context of streaming data | 10
Comparing traditional news and social media with stock price movements; which comes first, the news or the price change? | 10
Improving lookup and query execution performance in distributed Big Data systems using Cuckoo Filter | 10
An efficient weighted slime mould algorithm for engineering optimization | 10
Time series data analysis under indeterminacy | 10
On data efficiency of univariate time series anomaly detection models | 10
A model for investment type recommender system based on the potential investors based on investors and experts feedback using ANFIS and MNN | 10
Characterizing patent big data upon IPC: a survey of triadic patent families and PCT applications | 10
Enhancing K-nearest neighbor algorithm: a comprehensive review and performance analysis of modifications | 10
Traffic and road conditions monitoring system using extracted information from Twitter | 10
Development of a regional voice dataset and speaker classification based on machine learning | 10
Development and evaluation of a deep learning model for automatic segmentation of non-perfusion area in fundus fluorescein angiography | 10
eBF: an enhanced Bloom Filter for intrusion detection in IoT | 10
Introducing the enterprise data marketplace: a platform for democratizing company data | 9
Enhancing the quality of communication of cellular networks using big data applications | 9
Transfer learning approach based on satellite image time series for the crop classification problem | 9
Crude oil price forecasting using K-means clustering and LSTM model enhanced by dense-sparse-dense strategy | 9
Exploration of the investment patterns of potential retail banking customers using two-stage cluster analysis | 9
Artificial intelligence models for prediction of monthly rainfall without climatic data for meteorological stations in Ethiopia | 9
An empirical comparison of the performances of single structure columnar in-memory and disk-resident data storage techniques using healthcare big data | 9
Accurate identification of cashmere and wool fibers based on enhanced ShuffleNetV2 and transfer learning | 9
Transforming the generative pretrained transformer into augmented business text writer | 8
Data-driven prediction of soccer outcomes using enhanced machine and deep learning techniques | 8
Readers’ affect: predicting and understanding readers’ emotions with deep learning | 8
Cyberbullying detection: advanced preprocessing techniques & deep learning architecture for Roman Urdu data | 8
A survey on bandwidth-aware geo-distributed frameworks for big-data analytics | 8
Risk and UCON-based access control model for healthcare big data | 8
A semi-supervised short text sentiment classification method based on improved Bert model from unlabelled data | 8
DiabSense: early diagnosis of non-insulin-dependent diabetes mellitus using smartphone-based human activity recognition and diabetic retinopathy analysis with Graph Neural Network | 8
Big data: an optimized approach for cluster initialization | 8
Practical ANN prediction models for the axial capacity of square CFST columns | 8
Detecting web attacks using random undersampling and ensemble learners | 8
A review on adversarial–based deep transfer learning mechanical fault diagnosis | 8
The application of adaptive group LASSO imputation method with missing values in personal income compositional data | 8
A real-time predicting online tool for detection of people’s emotions from Arabic tweets based on big data platforms | 8
The use of Big Data Analytics in healthcare | 8
Main memory controller with multiple media technologies for big data workloads | 8
Quantum-inspired framework for big data analytics: evaluating the impact of movie trailers and its financial returns | 7
An efficient binary spider wasp optimizer for multi-dimensional knapsack instances: experimental validation and analysis | 7
A problem-agnostic approach to feature selection and analysis using SHAP | 7
Data science approach to stock prices forecasting in Indonesia during Covid-19 using Long Short-Term Memory (LSTM) | 7
Improve data classification performance in diagnosing diabetes using the Binary Exchange Market Algorithm | 7
Exploring AI-driven approaches for unstructured document analysis and future horizons | 7
Performance-efficient distributed transfer and transformation of big spatial histopathology datasets in the cloud | 7
Damped weighted erasable itemset mining with time sensitive dynamic environments | 7
Alzheimer’s disease diagnosis by 3D-SEConvNeXt | 7
Optimizing IoT intrusion detection system: feature selection versus feature extraction in machine learning | 7
Neural network training with limited precision and asymmetric exponent | 7
Exploring investor-business-market interplay for business success prediction | 7
Uncertainty-driven generation of neutrosophic random variates from the Weibull distribution | 7
Hypoxia within tumor microenvironment characterizes distinct genomic patterns and aids molecular subtyping for guiding individualized immunotherapy | 7
The use of generative adversarial networks to alleviate class imbalance in tabular data: a survey | 7
Multivariate cryptocurrency prediction: comparative analysis of three recurrent neural networks approaches | 7
Predicting LQ45 financial sector indices using RNN-LSTM | 7
Fitcam: detecting and counting repetitive exercises with deep learning | 7
Spoofing keystroke dynamics authentication through synthetic typing pattern extracted from screen-recorded video | 7
Unsupervised outlier detection for time-series data of indoor air quality using LSTM autoencoder with ensemble method | 7
On hierarchical clustering-based approach for RDDBS design | 6
Contrastive self-supervised representation learning framework for metal surface defect detection | 6
The impact of ensemble learning on surgical tools classification during laparoscopic cholecystectomy | 6
Vehicle routing problems based on Harris Hawks optimization | 6
User profile correlation-based similarity (UPCSim) algorithm in movie recommendation system | 6
Enhancing argumentation component classification using contextual language model | 6
Cochran’s Q test for analyzing categorical data under uncertainty | 6
Evaluating latent content within unstructured text: an analytical methodology based on a temporal network of associated topics | 6
Testing coverage criteria for optimized deep belief network with search and rescue | 6
Efficient parallel derivation of short distinguishing sequences for nondeterministic finite state machines using MapReduce | 6
Predicting startup success using two bias-free machine learning: resolving data imbalance using generative adversarial networks | 6
Computational methods for predicting the outcome of thoracic transplantation | 6
An exploratory content and sentiment analysis of the guardian metaverse articles using leximancer and natural language processing | 6
Novel mathematical model for the classification of music and rhythmic genre using deep neural network | 6
An efficient annealing-assisted differential evolution for multi-parameter adaptive latent factor analysis | 6
Evolutionary computation-based self-supervised learning for image processing: a big data-driven approach to feature extraction and fusion for multispectral object detection | 6
Dissecting tumor antigens and immune subtypes for mRNA vaccine development in breast cancer | 6
Time series modeling of road traffic accidents in Amhara Region | 6
Why polls fail to predict elections | 5
IDC: quantitative evaluation benchmark of interpretation methods for deep text classification models | 5
Analyzing the worldwide perception of the Russia-Ukraine conflict through Twitter | 5
Investigating the impact of pre-processing techniques and pre-trained word embeddings in detecting Arabic health information on social media | 5
New distributed-topsis approach for multi-criteria decision-making problems in a big data context | 5
Internet of things and ensemble learning-based mental and physical fatigue monitoring for smart construction sites | 5
Application of microservices patterns to big data systems | 5