International Journal of High Performance Computing Applications

Papers
(The median citation count of International Journal of High Performance Computing Applications is 2. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-12-01 to 2025-12-01.)
Article | Citations
Visualization at exascale: Making it all work with VTK-m | 62
Dynamic spawning of MPI processes applied to malleability | 62
Enhancing scalability of a matrix-free eigensolver for studying many-body localization | 57
HPL-MxP benchmark: Mixed-precision algorithms, iterative refinement, and scalable data generation | 35
Automatizing the creation of specialized high-performance computing containers | 34
Large-scale ab initio simulation of light–matter interaction at the atomic scale in Fugaku | 32
Refining HPCToolkit for application performance analysis at exascale | 21
HPC I/O innovations in the exascale era | 16
Compressed basis GMRES on high-performance graphics processing units | 15
Running ahead of evolution—AI-based simulation for predicting future high-risk SARS-CoV-2 variants | 14
Accelerating atmospheric physics parameterizations using graphics processing units | 13
Modeling, evaluating, and orchestrating heterogeneous environmental leverages for large-scale data center management | 13
HDF5 in the exascale era: Delivering efficient and scalable parallel I/O for exascale applications | 12
Scalable multilevel Monte Carlo methods exploiting parallel redistribution on coarse levels | 11
Direct numerical simulations for hybrid rocket boundary layers: Performance modeling and scaling | 11
Julia versus C++ Kokkos for performance portable Cartesian CFD solvers on heterogeneous architectures | 10
Orchestration of materials science workflows for heterogeneous resources at large scale | 9
Ginkgo - A math library designed to accelerate Exascale Computing Project science applications | 9
General framework for re-assuring numerical reliability in parallel Krylov solvers: A case of bi-conjugate gradient stabilized methods | 9
Special issue introduction | 8
Retraction Notice | 8
Preparing MPICH for exascale | 8
Hypergraph-based locality-enhancing methods for graph operations in Big Data applications | 8
A tale of two codes: CUDA vs OpenACC for mass-zero constrained dynamics | 8
An elastic framework for ensemble-based large-scale data assimilation | 7
Integrating ytopt and libEnsemble to autotune OpenMC | 7
Massively parallel nodal discontinous Galerkin finite element method simulator for room acoustics | 7
GPU-based molecular dynamics of fluid flows: Reaching for turbulence | 7
Accelerated dynamic data reduction using spatial and temporal properties | 7
Performance of explicit and IMEX MRI multirate methods on complex reactive flow problems within modern parallel adaptive structured grid frameworks | 7
A study on the performance of distributed training of data-driven CFD simulations | 7
Fast truncated SVD of sparse and dense matrices on graphics processors | 6
PeleC: An adaptive mesh refinement solver for compressible reacting flows | 6
Cache blocking of distributed-memory parallel matrix power kernels | 5
Heterogeneous programming using OpenMP and CUDA/HIP for hybrid CPU-GPU scientific applications | 5
TransGRU-X – A fusion Seq2Seq network enhanced with multiresolution analysis and gating for forecasting of AI/ML workloads in cloud environments | 5
Data-driven scalable pipeline using national agent-based models for real-time pandemic response and decision support | 5
Preparing the TAU performance system for exascale and beyond | 5
Fair-sharing simulator: Toward fair scheduling in batch computing systems | 5
NUMA-aware parallel sparse LU factorization for SPICE-based circuit simulators on ARM multi-core processors | 5
Experiences with nested parallelism in task-parallel applications using malleable BLAS on multicore processors | 5
Special issue: Introduction | 5
Clacc: OpenACC for C/C++ in Clang | 5
HOPPS: A performance portable spectral difference solver for high-fidelity computational fluid dynamics | 5
HPC-AI coupling methodology for scientific applications | 4
Semi-Lagrangian 4d, 5d, and 6d kinetic plasma simulation on large-scale GPU-equipped supercomputers | 4
An integrated three-dimensional aeromechanical analysis for the prediction of stresses on modern coaxial rotors | 4
Feynman and computation: From Los Alamos to quantum computers | 4
Sequence length scaling in vision transformers for scientific images on frontier | 4
Understanding power and energy utilization in large scale production physics simulation codes | 4
Asynchronous-many-task systems: Challenges and opportunities - Scaling an AMR astrophysics code on exascale machines using Kokkos and HPX | 4
Cache-optimized and low-overhead implementations of additive Schwarz methods for high-order FEM multigrid computations | 4
Accelerating cluster dynamics simulation of fission gas behavior in nuclear fuel on deep computing unit–based heterogeneous architecture supercomputer | 4
Abisko: Deep codesign of an architecture for spiking neural networks using novel neuromorphic materials | 4
Bricks: A high-performance portability layer for computations on block-structured grids | 4
Performance analysis of relaxation Runge–Kutta methods | 4
Advances in ArborX to support exascale applications | 3
MAGMA: Enabling exascale performance with accelerated BLAS and LAPACK for diverse GPU architectures | 3
Enhancing data locality of the conjugate gradient method for high-order matrix-free finite-element implementations | 3
Guest editors note: Special issue on clusters, clouds, and data for scientific computing | 3
Simulation-based machine learning for real-time assessment of side-branch hemodynamics in coronary bifurcation lesions | 3
Performance evaluation of mixed-precision Runge–Kutta methods for the solution of partial differential equations | 3
ECP libraries and tools: An overview | 3
UMap: An application-oriented user level memory mapping library | 3
IO-aware Job-Scheduling: Exploiting the Impacts of Workload Characterizations to select the Mapping Strategy | 3
PaRSEC: Scalability, flexibility, and hybrid architecture support for task-based applications in ECP | 3
Black-box statistical prediction of lossy compression ratios for scientific data | 3
An HPC benchmark survey and taxonomy for characterization | 3
Corrigendum to large-scale direct numerical simulations of turbulence using GPUs and modern Fortran | 3
PoCL-R: An open standard based heterogeneous offloading layer with server side scalability | 3
The ECP ALPINE project: In situ and post hoc visualization infrastructure and analysis capabilities for exascale | 3
Exploiting mesh structure to improve multigrid performance for saddle-point problems | 3
Scalable cosmic AI inference using cloud serverless computing | 3
Fault-tolerant numerical iterative algorithms at scale | 3
Efficient solution of batched band linear systems on GPUs | 3
Mixed precision LU factorization on GPU tensor cores: reducing data movement and memory footprint | 2
Detecting interference between applications and improving the scheduling using malleable application clones | 2
Task-parallel in situ temporal compression of large-scale computational fluid dynamics data | 2
High performance computing seismic redatuming by inversion with algebraic compression and multiple precisions | 2
Performance portability in a real world application: PHAST applied to Caffe | 2
An implicit barotropic mode solver for MPAS-ocean using a modern Fortran solver interface | 2
Evolution of the SLATE linear algebra library | 2
Deep learning foundation and pattern models: Challenges in hydrological time series | 2
Fixed-work versus fixed-time checkpointing on large-scale failure-prone platforms | 2
End-to-end GPU acceleration of low-order-refined preconditioning for high-order finite element discretizations | 2
Result-scalability: Following the evolution of selected social impact of HPC | 2
SWARM: Reimagining scientific workflow management systems in a distributed world | 2
High-performance conjugate gradient benchmark: A comprehensive survey | 2
Role-shifting threads: Increasing OpenMP malleability to address load imbalance at MPI and OpenMP | 2
#COVIDisAirborne: AI-enabled multiscale computational microscopy of delta SARS-CoV-2 in a respiratory aerosol | 2
A fine-grained parallelization of the immersed boundary method | 2