Journal of Big Data

Papers
(The TQCC of Journal of Big Data is 15. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-12-01 to 2025-12-01.)
Article | Citations
Prognostic stratification based on HIF-1α signaling for evaluating hypoxia status and immune landscape in hepatocellular carcinoma | 558
Domain-relevance of influence: characterizing variations in online influence across multiple domains on social media | 397
Integrating deep learning and transfer learning: optimizing white blood cells classification in medical educational institutions | 384
Long-term survival prediction in patients with acute brain lesions using ensemble machine learning algorithms: a cohort study with combined national health insurance service and its self-run hospital | 355
An artificial intelligence platform for predicting postoperative complications in metastatic spinal surgery: development and validation study | 302
A universal approach for multi-model schema inference | 268
Context-aware prediction of active and passive user engagement: Evidence from a large online social platform | 237
Data provider research overview from a public management perspective: a bibliometric analysis utilizing CiteSpace | 233
Gene selection via improved nuclear reaction optimization algorithm for cancer classification in high-dimensional data | 219
Identification of tumor antigens and anoikis-based molecular subtypes in the hepatocellular carcinoma immune microenvironment: implications for mRNA vaccine development and precision treatment | 200
A proposed hybrid framework to improve the accuracy of customer churn prediction in telecom industry | 199
Efficient pollen grain classification using pre-trained Convolutional Neural Networks: a comprehensive study | 160
FONDUE—Fine-Tuned Optimization: Nurturing Data Usability & Efficiency | 153
Designing and evaluating a big data analytics approach for predicting students’ success factors | 151
Value-at-risk student prescription trees for price personalization | 139
A new dimensionality reduction technique based on the Wavelet Transform for cancer classification | 135
Comprehensive study of driver behavior monitoring systems using computer vision and machine learning techniques | 134
GB-AFS: graph-based automatic feature selection for multi-class classification via Mean Simplified Silhouette | 132
Defining user spectra to classify Ethereum users based on their behavior | 128
The stability of different aggregation techniques in ensemble feature selection | 127
Hybrid beluga whale optimization algorithm with multi-strategy for functions and engineering optimization problems | 125
Breast cancer prediction using gated attentive multimodal deep learning | 120
Big data in human behavior research: a contextual turn | 113
Exploring differential privacy in CNNs, LSTMs, GRUs, and RNNs for heartbeat detection from multimodal data | 112
The adaptive community-response (ACR) method for collecting misinformation on social media | 98
Pre-trained transformer-based language models for Sundanese | 96
Deep-Eware: spatio-temporal social event detection using a hybrid learning model | 90
DiabSense: early diagnosis of non-insulin-dependent diabetes mellitus using smartphone-based human activity recognition and diabetic retinopathy analysis with Graph Neural Network | 89
Traffic and road conditions monitoring system using extracted information from Twitter | 88
An efficient binary spider wasp optimizer for multi-dimensional knapsack instances: experimental validation and analysis | 86
Survey on terminology extraction from texts | 79
A model for investment type recommender system based on the potential investors based on investors and experts feedback using ANFIS and MNN | 79
An adaptive k-means clustering algorithm based on grid and domain centroid weights for digital twins in the context of digital transformation | 78
A unified IoT architectural model for smart hospitals: enhancing interoperability, security, and efficiency through clinical information systems (CIS) | 78
Social media analysis of Twitter tweets related to ASD in 2019–2020, with particular attention to COVID-19: topic modelling and sentiment analysis | 77
Artificial intelligence for improving Nitrogen Dioxide forecasting of Abu Dhabi environment agency ground-based stations | 74
Advancing multimodal emotion recognition in big data through prompt engineering and deep adaptive learning | 73
Artificial intelligence models for prediction of monthly rainfall without climatic data for meteorological stations in Ethiopia | 71
Machine learning techniques to predict daily rainfall amount | 69
Review of deep learning methods for remote sensing satellite images classification: experimental survey and comparative analysis | 67
Risk and UCON-based access control model for healthcare big data | 64
The use of Big Data Analytics in healthcare | 63
Distributed fuzzy clustering algorithm for mixed-mode data in Apache SPARK | 62
Surface defect detection on bolt surface using a real-time fine-tuned YOLOv6 model | 62
SMT efficiency in supervised ML methods: a throughput and interference analysis | 62
Developing insights from the collective voice of target users in Twitter | 59
Fast agglomerative clustering using approximate traveling salesman solutions | 55
Novel mathematical model for the classification of music and rhythmic genre using deep neural network | 54
Scalable approach for high-resolution land cover: a case study in the Mediterranean Basin | 53
Xai-driven knowledge distillation of large language models for efficient deployment on low-resource devices | 53
Advancing stock price prediction through the development of hybrid ensembles: a comprehensive comparative analysis of machine learning approaches | 52
Contrastive self-supervised representation learning framework for metal surface defect detection | 51
Short-term photovoltaic power production forecasting based on novel hybrid data-driven models | 49
Machine learning based customer churn prediction in home appliance rental business | 48
Predicting startup success using two bias-free machine learning: resolving data imbalance using generative adversarial networks | 48
Traffic flow prediction based on depthwise separable convolution fusion network | 48
Hajj pilgrimage abnormal crowd movement monitoring using optical flow and FCNN | 47
Efficient surface crack segmentation for industrial and civil applications based on an enhanced YOLOv8 model | 47
Advancing hospital healthcare: achieving IoT-based secure health monitoring through multilayer machine learning | 47
Air-pollution prediction in smart city, deep learning approach | 47
IoT information theft prediction using ensemble feature selection | 46
Hybrid wrapper feature selection method based on genetic algorithm and extreme learning machine for intrusion detection | 46
Part of speech tagging: a systematic review of deep learning and machine learning approaches | 46
Disaggregating IMERG satellite precipitation over Czech Republic: an innovative approach using hybrid Extreme Gradient Boosting based on Fuzzy Spatial-Temporal Multivariate Clustering | 45
Meta-transformer: leveraging metaheuristic algorithms for agricultural commodity price forecasting | 44
From distributed machine to distributed deep learning: a comprehensive survey | 40
Optimizing IoT intrusion detection system: feature selection versus feature extraction in machine learning | 40
Modeling the impact of BDA-AI on sustainable innovation ambidexterity and environmental performance | 40
Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction | 40
Empowering sentiment analysis in social media: a comprehensive approach to enhance the classification of abusive Tamil comments using transformer models | 39
Transforming OpenAPI Specification 3.0 documents into RDF-based semantic web services | 39
Efficient spatial data partitioning for distributed kNN joins | 38
Siamese Graph Convolutional Split-Attention Network with NLP based Social Sentimental Data for enhanced stock price predictions | 38
PoLYTC: a novel BERT-based classifier to detect political leaning of YouTube videos based on their titles | 38
Helformer: an attention-based deep learning model for cryptocurrency price forecasting | 37
A novel ST-iTransformer model for spatio-temporal ambient air pollution forecasting | 37
Enhancing academic performance prediction with temporal graph networks for massive open online courses | 37
Fuzzy deep learning architecture for cucumber plant disease detection and classification | 37
Metamorphosing forex: advancements in volatility forecasting using a modified fuzzy time series framework | 36
Churn management in hospitality | 36
Emotion AWARE: an artificial intelligence framework for adaptable, robust, explainable, and multi-granular emotion analysis | 36
Enhanced ransomware attacks detection using feature selection, sensitivity analysis, and optimized hybrid model | 36
Governance and sustainability of distributed continuum systems: a big data approach | 35
Advanced multilevel feature fusion framework for enhanced image retrieval using convolutional neural network and benchmark datasets | 35
Enhancing cardiac diagnostics: a deep learning ensemble approach for precise ECG image classification | 35
Comparative analysis of binary and one-class classification techniques for credit card fraud data | 34
The use of class imbalanced learning methods on ULSAM data to predict the case–control status in genome-wide association studies | 34
Machine learning model for malaria risk prediction based on mutation location of large-scale genetic variation data | 34
Self-organizing maps to evaluate optimal strategies for balancing binary class distributions: a methodological approach | 32
Image captioning model using attention and object features to mimic human image understanding | 32
Multi combination pattern labeling by using deep learning for chameleon rotary machine environment | 32
Multi-sample ζ-mixup: richer, more realistic synthetic samples from a p-series interpolant | 31
Privacy preserved incremental record linkage | 31
A deep learning-based framework for large-scale plant disease detection using big data analytics in precision agriculture | 30
Ramifications of incorrect image segmentations; emphasizing on the potential effects on deep learning methods failure | 30
Deep features fusion for KCF-based moving object tracking | 29
Liquid biopsy-based identification of prognostic and immunotherapeutically relevant gene signatures in lower grade glioma | 28
Optimizing group utility in itinerary planning: a strategic and crowd-aware approach | 28
Determinating clusters with a higher proportion of long-term care discharges from hospitals: a nationwide Portuguese study using clustering and decision tree methods | 28
Generative AI in depth: A survey of recent advances, model variants, and real-world applications | 27
Uncertainty-aware approach for multiple imputation using conventional and machine learning models: a real-world data study | 27
Data augmentation for dense passage retrieval using corpus-passage frequency-based token deletion | 27
Adaptive multiple imputations of missing values using the class center | 27
Data pipeline approaches in serverless computing: a taxonomy, review, and research trends | 26
A deep contrastive learning-based image retrieval system for automatic detection of infectious cattle diseases | 26
Online variational Gaussian process for time series data | 26
HepScope: CNN-based single-cell discrimination of malignant hepatocytes | 26
Opinion mining for national security: techniques, domain applications, challenges and research opportunities | 26
Comprehensive review of artificial intelligence applications in renewable energy systems: current implementations and emerging trends | 26
Distinguishing novel coronavirus influenza A virus pneumonia with CT radiomics and clinical features | 26
Big Data Analytics-based life cycle sustainability assessment for sustainable manufacturing enterprises evaluation | 26
Research on sentiment analysis method of opinion mining based on multi-model fusion transfer learning | 26
Sentiment analysis classification system using hybrid BERT models | 26
Deep learning for component fault detection in electricity transmission lines | 25
A novel sub-network level ensemble deep neural network with a regularized loss function to improve prediction performance | 25
Identification of key drought-tolerant genes in soybean using an integrative data-driven feature engineering pipeline | 24
Tumor antigens and immune subtypes of glioblastoma: the fundamentals of mRNA vaccine and individualized immunotherapy development | 24
A systematic review on big data applications and scope for industrial processing and healthcare sectors | 24
Optical electrocardiogram based heart disease prediction using hybrid deep learning | 24
A systematic review of AI-enhanced techniques in credit card fraud detection | 24
Federated Freeze BERT for text classification | 23
Plant disease detection and classification techniques: a comparative study of the performances | 23
Extended version of decision making model for industrial robot selection via fractional continuous fuzzy information | 23
The differences in gastric cancer epidemiological data between SEER and GBD: a joinpoint and age-period-cohort analysis | 23
Enhancing data discovery with contextual pre-filtering | 23
Data analysis for vague contingency data | 23
An enhanced machine learning framework for accurate diagnosis of tuberculous pleural effusion | 23
Text summarization based on semantic graphs: an abstract meaning representation graph-to-text deep learning approach | 23
Stress detection using natural language processing and machine learning over social interactions | 23
Utilizing AI models to identify and predict phase transition patterns of bipolar disorder patients | 23
Unsupervised label generation for severely imbalanced fraud data | 23
Spatial heterogeneities in acute lower respiratory infections prevalence and determinants across Ethiopian administrative zones | 22
Text-to-video generators: a comprehensive survey | 22
Multi strategy Horned Lizard Optimization Algorithm for complex optimization and advanced feature selection problems | 22
FEL-FRN: fusion ECA long-CLIP feature reconstruction network for few-shot classification | 22
The evolution of the European football transfer network | 21
The state of metaverse research: a bibliometric visual analysis based on CiteSpace | 21
Operationalizing and automating Data Governance | 21
Big data processing using hybrid Gaussian mixture model with salp swarm algorithm | 21
Hemorrhage semantic segmentation in fundus images for the diagnosis of diabetic retinopathy by using a convolutional neural network | 21
A fuel consumption-based method for developing local-specific CO2 emission rate database using open-source big data | 21
Iterative cleaning and learning of big highly-imbalanced fraud data using unsupervised learning | 21
Unsupervised hyperspectral image segmentation of films: a hierarchical clustering-based approach | 21
De-occlusion and recognition of frontal face images: a comparative study of multiple imputation methods | 20
Evaluation is key: a survey on evaluation measures for synthetic time series | 20
Block-level masking and feature importance-based adversarial example generation | 20
A machine learning-based credit risk prediction engine system using a stacked classifier and a filter-based feature selection method | 20
An enhanced random forest approach using CoClust clustering: MIMIC-III and SMS spam collection application | 19
Hyperdimensional computing: a framework for stochastic computation and symbolic AI | 19
Deep learning enhancing banking services: a hybrid transaction classification and cash flow prediction approach | 19
Main memory controller with multiple media technologies for big data workloads | 19
Data reduction techniques for highly imbalanced medicare Big Data | 19
Evaluation of predictive performance of modeling hyperuricemia using medical big data: comparison of data preprocessing methods | 18
Potential for the use of large unstructured data resources by public innovation support institutions | 18
Machine learning-based interactive dynamic resilience assessment for complex hydropower systems | 18
Scalable and space-efficient Robust Matroid Center algorithms | 18
Readers’ affect: predicting and understanding readers’ emotions with deep learning | 18
Combining review elements for modelling various multi-criteria collaborative recommendation models | 18
Multi-level lag scheme significantly improves training efficiency in deep learning: a case study in air quality alert service over sub-tropical area | 18
Data analysis for sequential contingencies under uncertainty | 18
Capturing research literature attitude towards sustainable development goals: an LLM-based topic modeling approach | 18
Axial compressive behavior of reinforced concrete-filled circular steel tubular columns: finite element and machine learning modelling | 18
A real-time predicting online tool for detection of people’s emotions from Arabic tweets based on big data platforms | 17
RILS-ROLS: robust symbolic regression via iterated local search and ordinary least squares | 17
Information preservation-based hashing for image retrieval | 17
Deep reinforcement learning for data-efficient weakly supervised business process anomaly detection | 17
Accelerating neural network training with distributed asynchronous and selective optimization (DASO) | 17
Machine learning-based turbulence-risk prediction method for the safe operation of aircrafts | 17
Introducing Mplots: scaling time series recurrence plots to massive datasets | 17
Big data: an optimized approach for cluster initialization | 17
Bilingual hate speech detection on social media: Amharic and Afaan Oromo | 17
IGRF-RFE: a hybrid feature selection method for MLP-based network intrusion detection on UNSW-NB15 dataset | 16
An optimized hybrid ensemble machine learning model combining multiple classifiers for detecting advanced persistent threats in networks | 16
Photograph-based machine learning approach for automated detection and differentiation of aerial blight disease in soybean crops | 16
Towards a deep learning-based outlier detection approach in the context of streaming data | 16
Breast cancer diagnosis with MFF-HistoNet: a multi-modal feature fusion network integrating CNNs and quantum tensor networks | 16
Transfer learning approach based on satellite image time series for the crop classification problem | 16
An efficient weighted slime mould algorithm for engineering optimization | 16
Enhancing public art communication through emotional intelligence based on type-2 fractional fuzzy sets | 15
IDC: quantitative evaluation benchmark of interpretation methods for deep text classification models | 15
Blind Federated Learning without initial model | 15
Fitcam: detecting and counting repetitive exercises with deep learning | 15
Data engineering for sustainable agriculture: developments, challenges, and case studies of a novel IoRT architecture | 15
A unified representation and transformation of multi-model data using category theory | 15
Leveraging ensemble learning-based stock preselection with multiobjective investment optimization for stepwise decision-supported portfolio management | 15
Computational methods for predicting the outcome of thoracic transplantation | 15
A computational analysis of aspect-based sentiment analysis research through bibliometric mapping and topic modeling | 15
Exploring AI-driven approaches for unstructured document analysis and future horizons | 15
Application of supervised machine learning models in human emotion classification using Tsallis entropy as a feature | 15
Student academic performance prediction via hypergraph and TabNet | 15
Decision support system for handling control decisions and decision-maker related to supply chain | 15
Learning manifolds from non-stationary streams | 15
Modeling the public attitude towards organic foods: a big data and text mining approach | 15
The wisdom of the lexicon crowds: leveraging on decades of lexicon-based sentiment analysis for improved results | 15