IEEE Computer Architecture Letters

Papers
(The TQCC of IEEE Computer Architecture Letters is 2. The table below lists the papers that meet or exceed that threshold based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-02-01 to 2025-02-01.)
Article | Citations
A Model for Scalable and Balanced Accelerators for Graph Processing | 20
Characterization and Implementation of Radar System Applications on a Reconfigurable Dataflow Architecture | 12
Baobab Merkle Tree for Efficient Secure Memory | 9
High-Performance Winograd Based Accelerator Architecture for Convolutional Neural Network | 8
IntervalSim++: Enhanced Interval Simulation for Unbalanced Processor Designs | 8
FullPack: Full Vector Utilization for Sub-Byte Quantized Matrix-Vector Multiplication on General Purpose CPUs | 8
GPU-centric Memory Tiering for LLM Serving with NVIDIA Grace Hopper Superchip | 8
Address Scaling: Architectural Support for Fine-Grained Thread-Safe Metadata Management | 7
BTB-X: A Storage-Effective BTB Organization | 7
A Case Study of a DRAM-NVM Hybrid Memory Allocator for Key-Value Stores | 7
Redundant Array of Independent Memory Devices | 7
Toward Practical 128-Bit General Purpose Microarchitectures | 6
Intelligent SSD Firmware for Zero-Overhead Journaling | 6
Speculative Multi-Level Access in LSM Tree-Based KV Store | 6
Revisiting Browser Performance Benchmarking From an Architectural Perspective | 6
SmartIndex: Learning to Index Caches to Improve Performance | 6
Runtime Support for Accelerating CNN Models on Digital DRAM Processing-in-Memory Hardware | 5
Enhancing DNN Training Efficiency Via Dynamic Asymmetric Architecture | 5
Structured Combinators for Efficient Graph Reduction | 5
Guessing Outputs of Dynamically Pruned CNNs Using Memory Access Patterns | 5
Approximate Multiplier Design With LFSR-Based Stochastic Sequence Generators for Edge AI | 4
Near-Data Processing in Memory Expander for DNN Acceleration on GPUs | 4
Analysis of Data Transfer Bottlenecks in Commercial PIM Systems: A Study With UPMEM-PIM | 4
DeMM: A Decoupled Matrix Multiplication Engine Supporting Relaxed Structured Sparsity | 4
Open-Source Hardware Memory Protection Engine Integrated With NVMM Simulator | 4
Hungarian Qubit Assignment for Optimized Mapping of Quantum Circuits on Multi-Core Architectures | 4
R.I.P. Geomean Speedup Use Equal-Work (Or Equal-Time) Harmonic Mean Speedup Instead | 4
Smart Memory: Deep Learning Acceleration in 3D-Stacked Memories | 4
On Variable Strength Quantum ECC | 4
Hashing ATD Tags for Low-Overhead Safe Contention Monitoring | 4
The Importance of Generalizability in Machine Learning for Systems | 4
Proactive Embedding on Cold Data for Deep Learning Recommendation Model Training | 3
A Case for Hardware Memoization in Server CPUs | 3
An Area Efficient Architecture of a Novel Chaotic System for High Randomness Security in e-Health | 3
GATe: Streamlining Memory Access and Communication to Accelerate Graph Attention Network With Near-Memory Processing | 3
GraNDe: Near-Data Processing Architecture With Adaptive Matrix Mapping for Graph Convolutional Networks | 3
Computational CXL-Memory Solution for Accelerating Memory-Intensive Applications | 3
NoHammer: Preventing Row Hammer With Last-Level Cache Management | 3
Learned Performance Model for SSD | 3
Accelerating Programmable Bootstrapping Targeting Contemporary GPU Microarchitecture | 2
SPAM: Streamlined Prefetcher-Aware Multi-Threaded Cache Covert-Channel Attack | 2
Exploiting Intrinsic Redundancies in Dynamic Graph Neural Networks for Processing Efficiency | 2
BayesTuner: Leveraging Bayesian Optimization For DNN Inference Configuration Selection | 2
UDIR: Towards a Unified Compiler Framework for Reconfigurable Dataflow Architectures | 2
Hardware-Implemented Lightweight Accelerator for Large Integer Polynomial Multiplication | 2
Decoupled SSD: Reducing Data Movement on NAND-Based Flash SSD | 2
Efficient Implementation of Knuth Yao Sampler on Reconfigurable Hardware | 2
Towards Improved Power Management in Cloud GPUs | 2
HAMMER: Hardware-Friendly Approximate Computing for Self-Attention With Mean-Redistribution And Linearization | 2
Managing Prefetchers With Deep Reinforcement Learning | 2
Unleashing the Potential of PIM: Accelerating Large Batched Inference of Transformer-Based Generative Models | 2
eDKM: An Efficient and Accurate Train-Time Weight Clustering for Large Language Models | 2
Characterization and Analysis of Text-to-Image Diffusion Models | 2
Accelerating Page Migrations in Operating Systems with Intel DSA | 2
LADIO: Leakage-Aware Direct I/O for I/O-Intensive Workloads | 2
Towards an Accelerator for Differential and Algebraic Equations Useful to Scientists | 2
Simulating Our Way to Safer Software: A Tale of Integrating Microarchitecture Simulation and Leakage Estimation Modeling | 2
Exploring PIM Architecture for High-Performance Graph Pattern Mining | 2