IEEE Computer Architecture Letters

(The TQCC of IEEE Computer Architecture Letters is 3. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers]. Only papers published in the past four years, i.e., from 2020-04-01 to 2024-04-01, are included.)
Title | Citations
DRAMsim3: A Cycle-Accurate, Thermal-Capable DRAM Simulator | 95
SmartSSD: FPGA Accelerated Near-Storage Data Analytics on SSD | 40
RAMBO: Resource Allocation for Microservices Using Bayesian Optimization | 29
GPU-NEST: Characterizing Energy Efficiency of Multi-GPU Inference Servers | 27
pPIM: A Programmable Processor-in-Memory Architecture With Precision-Scaling for Deep Learning | 23
The Entangling Instruction Prefetcher | 17
Lightweight Hardware Implementation of Binary Ring-LWE PQC Accelerator | 16
MultiPIM: A Detailed and Configurable Multi-Stack Processing-In-Memory Simulator | 14
A Cross-Stack Approach Towards Defending Against Cryptojacking | 13
Flexion: A Quantitative Metric for Flexibility in DNN Accelerators | 12
Rebasing Instruction Prefetching: An Industry Perspective | 11
HBM3 RAS: Enhancing Resilience at Scale | 10
Cryogenic PIM: Challenges & Opportunities | 9
TRiM: Tensor Reduction in Memory | 8
STONNE: Enabling Cycle-Level Microarchitectural Simulation for DNN Inference Accelerators | 8
Heterogeneity-Aware Scheduling on SoCs for Autonomous Vehicles | 8
Reorder Buffer Contention: A Forward Speculative Interference Attack for Speculation Invariant Instructions | 8
A Day In the Life of a Quantum Error | 7
Characterizing and Understanding End-to-End Multi-Modal Neural Networks on GPUs | 7
MCsim: An Extensible DRAM Memory Controller Simulator | 6
Harnessing Pairwise-Correlating Data Prefetching With Runahead Metadata | 6
BTB-X: A Storage-Effective BTB Organization | 6
Understanding the Implication of Non-Volatile Memory for Large-Scale Graph Neural Network Training | 6
Accelerating Concurrent Priority Scheduling Using Adaptive in-Hardware Task Distribution in Multicores | 6
Instruction Criticality Based Energy-Efficient Hardware Data Prefetching | 5
Dagger: Towards Efficient RPCs in Cloud Microservices With Near-Memory Reconfigurable NICs | 5
DRAM-CAM: General-Purpose Bit-Serial Exact Pattern Matching | 5
A Lightweight Memory Access Pattern Obfuscation Framework for NVM | 5
Computational CXL-Memory Solution for Accelerating Memory-Intensive Applications | 5
FPGA-Based AI Smart NICs for Scalable Distributed AI Training Systems | 5
LT-PIM: An LUT-Based Processing-in-DRAM Architecture With RowHammer Self-Tracking | 5
DAM: Deadblock Aware Migration Techniques for STT-RAM-Based Hybrid Caches | 4
Dynamic Optimization of On-Chip Memories for HLS Targeting Many-Accelerator Platforms | 4
Characterizing and Understanding HGNNs on GPUs | 4
GraNDe: Near-Data Processing Architecture With Adaptive Matrix Mapping for Graph Convolutional Networks | 4
Row-Streaming Dataflow Using a Chaining Buffer and Systolic Array+ Structure | 4
Zero-Copying I/O Stack for Low-Latency SSDs | 4
Hardware Acceleration for GCNs via Bidirectional Fusion | 4
A First-Order Model to Assess Computer Architecture Sustainability | 4
Adaptive Web Browsing on Mobile Heterogeneous Multi-cores | 4
Deep Partitioned Training From Near-Storage Computing to DNN Accelerators | 4
Decoupled SSD: Reducing Data Movement on NAND-Based Flash SSD | 4
OpenMDS: An Open-Source Shell Generation Framework for High-Performance Design on Xilinx Multi-Die FPGAs | 3
Infinity Stream: Enabling Transparent and Automated In-Memory Computing | 3
Last-Level Cache Insertion and Promotion Policy in the Presence of Aggressive Prefetching | 3
Hungarian Qubit Assignment for Optimized Mapping of Quantum Circuits on Multi-Core Architectures | 3
WPC: Whole-Picture Workload Characterization Across Intermediate Representation, ISA, and Microarchitecture | 3
Characterizing and Understanding Distributed GNN Training on GPUs | 3
Data-Aware Compression of Neural Networks | 3
Near-Data Processing in Memory Expander for DNN Acceleration on GPUs | 3
Making a Better Use of Caches for GCN Accelerators with Feature Slicing and Automatic Tile Morphing | 3
Managing Prefetchers With Deep Reinforcement Learning | 3
Characterization and Implementation of Radar System Applications on a Reconfigurable Dataflow Architecture | 3