Journal of Big Data

Papers
(The median citation count of Journal of Big Data is 4. The table below lists the papers above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-05-01 to 2025-05-01.)
Article	Citations
A proposed hybrid framework to improve the accuracy of customer churn prediction in telecom industry	531
GB-AFS: graph-based automatic feature selection for multi-class classification via Mean Simplified Silhouette	353
Identification of tumor antigens and anoikis-based molecular subtypes in the hepatocellular carcinoma immune microenvironment: implications for mRNA vaccine development and precision treatment	341
Domain-relevance of influence: characterizing variations in online influence across multiple domains on social media	284
Context-aware prediction of active and passive user engagement: Evidence from a large online social platform	267
Prognostic stratification based on HIF-1α signaling for evaluating hypoxia status and immune landscape in hepatocellular carcinoma	228
Long-term survival prediction in patients with acute brain lesions using ensemble machine learning algorithms: a cohort study with combined national health insurance service and its self-run hospital	219
Value-at-risk student prescription trees for price personalization	169
A new dimensionality reduction technique based on the Wavelet Transform for cancer classification	167
Designing and evaluating a big data analytics approach for predicting students’ success factors	161
Gene selection via improved nuclear reaction optimization algorithm for cancer classification in high-dimensional data	149
Efficient pollen grain classification using pre-trained Convolutional Neural Networks: a comprehensive study	132
Defining user spectra to classify Ethereum users based on their behavior	127
Hybrid beluga whale optimization algorithm with multi-strategy for functions and engineering optimization problems	116
Breast cancer prediction using gated attentive multimodal deep learning	107
Missing values compensation in duplicates detection using hot deck method	104
Accuracy improvements for cold-start recommendation problem using indirect relations in social networks	95
The stability of different aggregation techniques in ensemble feature selection	93
Comprehensive study of driver behavior monitoring systems using computer vision and machine learning techniques	91
A universal approach for multi-model schema inference	85
A novel sensitivity-based method for feature selection	81
The adaptive community-response (ACR) method for collecting misinformation on social media	79
Distributed fuzzy clustering algorithm for mixed-mode data in Apache SPARK	79
Deep-Eware: spatio-temporal social event detection using a hybrid learning model	77
A model for investment type recommender system based on the potential investors based on investors and experts feedback using ANFIS and MNN	77
Traffic and road conditions monitoring system using extracted information from Twitter	74
Modelling customers credit card behaviour using bidirectional LSTM neural networks	70
Big data in human behavior research: a contextual turn	67
Pre-trained transformer-based language models for Sundanese	65
Fast cluster-based computation of exact betweenness centrality in large graphs	63
Classification of SSVEP-based BCIs using Genetic Algorithm	62
An efficient binary spider wasp optimizer for multi-dimensional knapsack instances: experimental validation and analysis	61
DiabSense: early diagnosis of non-insulin-dependent diabetes mellitus using smartphone-based human activity recognition and diabetic retinopathy analysis with Graph Neural Network	58
Survey on terminology extraction from texts	57
Risk and UCON-based access control model for healthcare big data	56
Artificial intelligence models for prediction of monthly rainfall without climatic data for meteorological stations in Ethiopia	55
Artificial intelligence for improving Nitrogen Dioxide forecasting of Abu Dhabi environment agency ground-based stations	54
Review of deep learning methods for remote sensing satellite images classification: experimental survey and comparative analysis	54
Machine learning techniques to predict daily rainfall amount	53
The use of Big Data Analytics in healthcare	53
IoT information theft prediction using ensemble feature selection	52
Social media analysis of Twitter tweets related to ASD in 2019–2020, with particular attention to COVID-19: topic modelling and sentiment analysis	52
Predicting startup success using two bias-free machine learning: resolving data imbalance using generative adversarial networks	51
Xai-driven knowledge distillation of large language models for efficient deployment on low-resource devices	49
SMT efficiency in supervised ML methods: a throughput and interference analysis	49
Short-term photovoltaic power production forecasting based on novel hybrid data-driven models	48
Hajj pilgrimage abnormal crowd movement monitoring using optical flow and FCNN	46
Hybrid wrapper feature selection method based on genetic algorithm and extreme learning machine for intrusion detection	46
Intrusion detection systems using long short-term memory (LSTM)	45
Part of speech tagging: a systematic review of deep learning and machine learning approaches	45
Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction	44
From distributed machine to distributed deep learning: a comprehensive survey	44
Contrastive self-supervised representation learning framework for metal surface defect detection	43
Advancing hospital healthcare: achieving IoT-based secure health monitoring through multilayer machine learning	43
Fast agglomerative clustering using approximate traveling salesman solutions	42
Traffic flow prediction based on depthwise separable convolution fusion network	42
Developing insights from the collective voice of target users in Twitter	42
Efficient surface crack segmentation for industrial and civil applications based on an enhanced YOLOv8 model	42
Novel mathematical model for the classification of music and rhythmic genre using deep neural network	41
Air-pollution prediction in smart city, deep learning approach	40
Machine learning based customer churn prediction in home appliance rental business	40
Scalable approach for high-resolution land cover: a case study in the Mediterranean Basin	40
Modeling the impact of BDA-AI on sustainable innovation ambidexterity and environmental performance	39
Optimizing IoT intrusion detection system: feature selection versus feature extraction in machine learning	39
Emotion AWARE: an artificial intelligence framework for adaptable, robust, explainable, and multi-granular emotion analysis	38
PoLYTC: a novel BERT-based classifier to detect political leaning of YouTube videos based on their titles	38
Metamorphosing forex: advancements in volatility forecasting using a modified fuzzy time series framework	37
Transforming OpenAPI Specification 3.0 documents into RDF-based semantic web services	36
Big data fuzzy C-means algorithm based on bee colony optimization using an Apache Hbase	36
Multi combination pattern labeling by using deep learning for chameleon rotary machine environment	36
Helformer: an attention-based deep learning model for cryptocurrency price forecasting	35
The use of class imbalanced learning methods on ULSAM data to predict the case–control status in genome-wide association studies	35
Enhancing cardiac diagnostics: a deep learning ensemble approach for precise ECG image classification	35
Efficient spatial data partitioning for distributed kNN joins	34
Enhancing academic performance prediction with temporal graph networks for massive open online courses	34
Comparative analysis of binary and one-class classification techniques for credit card fraud data	33
Image captioning model using attention and object features to mimic human image understanding	33
A novel ST-iTransformer model for spatio-temporal ambient air pollution forecasting	33
Machine learning model for malaria risk prediction based on mutation location of large-scale genetic variation data	32
Automatic analysis of social media images to identify disaster type and infer appropriate emergency response	31
Siamese Graph Convolutional Split-Attention Network with NLP based Social Sentimental Data for enhanced stock price predictions	31
Ramifications of incorrect image segmentations; emphasizing on the potential effects on deep learning methods failure	31
Exploring halal tourism tweets on social media	31
A graph-based big data optimization approach using hidden Markov model and constraint satisfaction problem	31
Governance and sustainability of distributed continuum systems: a big data approach	31
Multi-sample ζ-mixup: richer, more realistic synthetic samples from a p-series interpolant	31
Research on sentiment analysis method of opinion mining based on multi-model fusion transfer learning	29
Integration of image segmentation and fuzzy theory to improve the accuracy of damage detection areas in traffic accidents	29
The use of knowledge extraction in predicting customer churn in B2B	29
Privacy preserved incremental record linkage	28
Adaptive multiple imputations of missing values using the class center	28
Normalization and outlier removal in class center-based firefly algorithm for missing value imputation	27
Uncertainty-aware approach for multiple imputation using conventional and machine learning models: a real-world data study	27
Deep features fusion for KCF-based moving object tracking	27
Big Data Analytics-based life cycle sustainability assessment for sustainable manufacturing enterprises evaluation	27
Liquid biopsy-based identification of prognostic and immunotherapeutically relevant gene signatures in lower grade glioma	25
Opinion mining for national security: techniques, domain applications, challenges and research opportunities	25
HepScope: CNN-based single-cell discrimination of malignant hepatocytes	25
Unsupervised outlier detection in multidimensional data	25
Data pipeline approaches in serverless computing: a taxonomy, review, and research trends	25
A distributed Content-Based Video Retrieval system for large datasets	25
A deep contrastive learning-based image retrieval system for automatic detection of infectious cattle diseases	25
Online variational Gaussian process for time series data	24
A systematic review of AI-enhanced techniques in credit card fraud detection	24
Distinguishing novel coronavirus influenza A virus pneumonia with CT radiomics and clinical features	24
Novel sensitivity method for evaluating the first derivative of the feed-forward neural network outputs	24
Identification of key drought-tolerant genes in soybean using an integrative data-driven feature engineering pipeline	23
The forecast of COVID-19 spread risk at the county level	23
Stress detection using natural language processing and machine learning over social interactions	22
A novel time efficient learning-based approach for smart intrusion detection system	22
Sentiment analysis classification system using hybrid BERT models	22
Federated Freeze BERT for text classification	22
Tumor antigens and immune subtypes of glioblastoma: the fundamentals of mRNA vaccine and individualized immunotherapy development	22
Deep learning for emotion analysis in Arabic tweets	22
Text summarization based on semantic graphs: an abstract meaning representation graph-to-text deep learning approach	22
Optical electrocardiogram based heart disease prediction using hybrid deep learning	22
Plant disease detection and classification techniques: a comparative study of the performances	22
The differences in gastric cancer epidemiological data between SEER and GBD: a joinpoint and age-period-cohort analysis	21
Deep learning for component fault detection in electricity transmission lines	21
Data analysis for vague contingency data	21
A systematic review on big data applications and scope for industrial processing and healthcare sectors	21
Operationalizing and automating Data Governance	20
Unsupervised label generation for severely imbalanced fraud data	20
Dissimilarity space reinforced with manifold learning and latent space modeling for improved pattern classification	20
De-occlusion and recognition of frontal face images: a comparative study of multiple imputation methods	20
A fuel consumption-based method for developing local-specific CO2 emission rate database using open-source big data	20
Hyperdimensional computing: a framework for stochastic computation and symbolic AI	20
Iterative cleaning and learning of big highly-imbalanced fraud data using unsupervised learning	19
Spatial heterogeneities in acute lower respiratory infections prevalence and determinants across Ethiopian administrative zones	19
Data reduction techniques for highly imbalanced medicare Big Data	19
Unsupervised hyperspectral image segmentation of films: a hierarchical clustering-based approach	19
An enhanced random forest approach using CoClust clustering: MIMIC-III and SMS spam collection application	19
Real-time spatio-temporal event detection on geotagged social media	19
An enhanced machine learning framework for accurate diagnosis of tuberculous pleural effusion	19
The state of metaverse research: a bibliometric visual analysis based on CiteSpace	18
Evaluation is key: a survey on evaluation measures for synthetic time series	18
Utilizing AI models to identify and predict phase transition patterns of bipolar disorder patients	18
Hemorrhage semantic segmentation in fundus images for the diagnosis of diabetic retinopathy by using a convolutional neural network	17
Deep learning enhancing banking services: a hybrid transaction classification and cash flow prediction approach	17
Big data processing using hybrid Gaussian mixture model with salp swarm algorithm	17
FEL-FRN: fusion ECA long-CLIP feature reconstruction network for few-shot classification	17
Breast cancer diagnosis with MFF-HistoNet: a multi-modal feature fusion network integrating CNNs and quantum tensor networks	17
A machine learning-based credit risk prediction engine system using a stacked classifier and a filter-based feature selection method	17
An efficient weighted slime mould algorithm for engineering optimization	17
Bilingual hate speech detection on social media: Amharic and Afaan Oromo	16
Potential for the use of large unstructured data resources by public innovation support institutions	16
Machine learning-based turbulence-risk prediction method for the safe operation of aircrafts	16
A real-time predicting online tool for detection of people’s emotions from Arabic tweets based on big data platforms	16
Scalable and space-efficient Robust Matroid Center algorithms	16
Evaluation of predictive performance of modeling hyperuricemia using medical big data: comparison of data preprocessing methods	16
Data analysis for sequential contingencies under uncertainty	16
Accelerating neural network training with distributed asynchronous and selective optimization (DASO)	16
Main memory controller with multiple media technologies for big data workloads	16
Introducing Mplots: scaling time series recurrence plots to massive datasets	16
Machine learning-based interactive dynamic resilience assessment for complex hydropower systems	15
Multi-level lag scheme significantly improves training efficiency in deep learning: a case study in air quality alert service over sub-tropical area	15
Big data: an optimized approach for cluster initialization	15
Deep reinforcement learning for data-efficient weakly supervised business process anomaly detection	15
Towards a deep learning-based outlier detection approach in the context of streaming data	15
Readers’ affect: predicting and understanding readers’ emotions with deep learning	15
Transfer learning approach based on satellite image time series for the crop classification problem	15
RILS-ROLS: robust symbolic regression via iterated local search and ordinary least squares	15
Decision support system for handling control decisions and decision-maker related to supply chain	14
IDC: quantitative evaluation benchmark of interpretation methods for deep text classification models	14
Artifact-free fat-water separation in Dixon MRI using deep learning	14
A computational analysis of aspect-based sentiment analysis research through bibliometric mapping and topic modeling	14
A unified representation and transformation of multi-model data using category theory	14
Learning manifolds from non-stationary streams	14
Computational methods for predicting the outcome of thoracic transplantation	14
Exploring AI-driven approaches for unstructured document analysis and future horizons	14
IoT Big Data provenance scheme using blockchain on Hadoop ecosystem	14
IGRF-RFE: a hybrid feature selection method for MLP-based network intrusion detection on UNSW-NB15 dataset	14
Fitcam: detecting and counting repetitive exercises with deep learning	14
Cyberattack detection in wireless sensor networks using a hybrid feature reduction technique with AI and machine learning methods	14
Time series modeling of road traffic accidents in Amhara Region	14
Awareness routing algorithm in vehicular ad-hoc networks (VANETs)	13
An integrated model for evaluation of big data challenges and analytical methods in recommender systems	13
Advanced machine learning techniques for cardiovascular disease early detection and diagnosis	13
Predicting clinical outcomes of radiotherapy for head and neck squamous cell carcinoma patients using machine learning algorithms	13
Optimization-based convolutional neural model for the classification of white blood cells	13
Blind Federated Learning without initial model	13
Architecture for determining the cleanliness in shared vehicles using an integrated machine vision and indoor air quality-monitoring system	13
Addressing big data variety using an automated approach for data characterization	13
Modeling the public attitude towards organic foods: a big data and text mining approach	13
Unified platform for storing, retrieving, and analysing biomechanical applications data using graph database	13
Big social data as a service (BSDaaS): a service composition framework for social media analysis	13
Application of deep learning technique in next generation sequence experiments	12
Enhanced credit card fraud detection based on attention mechanism and LSTM deep model	12
Feature selection strategies: a comparative analysis of SHAP-value and importance-based methods	12
Trends in real-time artificial intelligence methods in sports: a systematic review	12
Aspect-level sentiment classification with fused local and global context	12
A novel approach for detecting deep fake videos using graph neural network	12
Advances in ECG and PCG-based cardiovascular disease classification: a review of deep learning and machine learning methods	12
DAPS diagrams for defining Data Science projects	12
Attribute annotation and bias evaluation in visual datasets for autonomous driving	12
Separable convolutional neural networks for facial expressions recognition	12
DLA-E: a deep learning accelerator for endoscopic images classification	12
A scalable association rule learning and recommendation algorithm for large-scale microarray datasets	12
Retinal photograph-based deep learning system for detection of hyperthyroidism: a multicenter, diagnostic study	12
A canonical model for seasonal climate prediction using Big Data	12
‘Everything is data’: towards one big data ecosystem using multiple sources of data on higher education in Indonesia	12
Simulating imprecise data: sine–cosine and convolution methods with neutrosophic normal distribution	11
Unsupervised feature learning-based encoder and adversarial networks	11
HyGraph: a subgraph isomorphism algorithm for efficiently querying big graph databases	11
An ensemble method for estimating the number of clusters in a big data set using multiple random samples	11
Toward a smart health: big data analytics and IoT for real-time miscarriage prediction	11
Detecting Denial of Service attacks using machine learning algorithms	11
A multi-manifold learning based instance weighting and under-sampling for imbalanced data classification problems	11
B-CAT: a model for detecting botnet attacks using deep attack behavior analysis on network traffic flows	11
CMMamba: channel mixing Mamba for time series forecasting	11
Developing a negative speech emotion recognition model for safety systems using deep learning	11
Adapting transformer-based language models for heart disease detection and risk factors extraction	11
Bilingual video captioning model for enhanced video retrieval	11
Survey of transformers and towards ensemble learning using transformers for natural language processing	11
The power of big data mining to improve the health care system in the United Arab Emirates	11
Minimum threshold determination method based on dataset characteristics in association rule mining	11
Where you go is who you are: a study on machine learning based semantic privacy attacks	10
Using passive Wi-Fi for community crowd sensing during the COVID-19 pandemic	10
Predictive analytics using Big Data for the real estate market during the COVID-19 pandemic	10
Evaluation of the trends in jobs and skill-sets using data analytics: a case study	10
Data generation and application using the neutrosophic Erlang distribution	10
Sentiment-based predictive models for online purchases in the era of marketing 5.0: a systematic review	10
Automatic diagnosis of keratitis using object localization combined with cost-sensitive deep attention convolutional neural network	10
Identification of mRNA vaccines and conserved ferroptosis related immune landscape for individual precision treatment in bladder cancer	10
A multi-dimensional hierarchical evaluation system for data quality in trustworthy AI	10
Arabic aspect sentiment polarity classification using BERT	10
Forex market forecasting using machine learning: Systematic Literature Review and meta-analysis	9
Twitter sentiment analysis using hybrid gated attention recurrent network	9
Exploring the form of big data products and the supporting systems	9
ASENN: attention-based selective embedding neural networks for road distress prediction	9
A novel intelligent approach for flight delay prediction	9
Determining threshold value on information gain feature selection to increase speed and prediction accuracy of random forest	9
Exploring the state of the art in legal QA systems	9
Classification of long-term clinical course of Parkinson’s disease using clustering algorithms on social support registry database	9
Shielding networks: enhancing intrusion detection with hybrid feature selection and stack ensemble learning	9
Automatic identification and classification of pediatric glomerulonephritis on ultrasound images based on deep learning and radiomics	9
New custom rating for improving recommendation system performance	9
A scheduling algorithm to maximize storm throughput in heterogeneous cluster	9
A parallelization model for performance characterization of Spark Big Data jobs on Hadoop clusters	8
Sentiment analysis of Indonesian datasets based on a hybrid deep-learning strategy	8
A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications	8
Topological variable neighborhood search	8
Chromatin state distribution of residue-specific histone acetylation in early myoblast differentiation	8
Apply machine learning techniques to detect malicious network traffic in cloud computing	8
Remote patient monitoring and classifying using the internet of things platform combined with cloud computing	8
An empirical comparison of the performances of single structure columnar in-memory and disk-resident data storage techniques using healthcare big data	8
A literature review on one-class classification and its potential applications in big data	8
Towards a folksonomy graph-based context-aware recommender system of annotated books	8
An LSTM and GRU based trading strategy adapted to the Moroccan market	8
Alzheimer’s disease diagnosis by 3D-SEConvNeXt	8