IEEE Computer Architecture Letters

Papers
(The TQCC of IEEE Computer Architecture Letters is 2. The table below lists the papers that meet or exceed that threshold based on CrossRef citation counts [max. 250 papers]. It covers papers published in the past four years, i.e., from 2021-08-01 to 2025-08-01.)
Article | Citations
Speculative Multi-Level Access in LSM Tree-Based KV Store | 37
Accelerating Programmable Bootstrapping Targeting Contemporary GPU Microarchitecture | 32
The Architectural Sustainability Indicator | 15
A Characterization of Generative Recommendation Models: Study of Hierarchical Sequential Transduction Unit | 15
Characterization and Analysis of Text-to-Image Diffusion Models | 15
Toward Practical 128-Bit General Purpose Microarchitectures | 13
Old is Gold: Optimizing Single-Threaded Applications With ExGen-Malloc | 13
SCALES: SCALable and Area-Efficient Systolic Accelerator for Ternary Polynomial Multiplication | 12
Time Series Machine Learning Models for Precise SSD Access Latency Prediction | 11
2021 Index IEEE Computer Architecture Letters Vol. 20 | 10
Straw: A Stress-Aware WL-Based Read Reclaim Technique for High-Density NAND Flash-Based SSDs | 10
SoCurity: A Design Approach for Enhancing SoC Security | 10
Improving Energy-Efficiency of Capsule Networks on Modern GPUs | 9
RouteReplies: Alleviating Long Latency in Many-Chip-Module GPUs | 8
A Flexible Embedding-Aware Near Memory Processing Architecture for Recommendation System | 8
OASIS: Outlier-Aware KV Cache Clustering for Scaling LLM Inference in CXL Memory Systems | 8
In-Memory Versioning (IMV) | 8
Security Helper Chiplets: A New Paradigm for Secure Hardware Monitoring | 7
A Case for In-Memory Random Scatter-Gather for Fast Graph Processing | 7
Exploiting Intel Advanced Matrix Extensions (AMX) for Large Language Model Inference | 7
Exploring the DIMM PIM Architecture for Accelerating Time Series Analysis | 7
Accelerating Deep Reinforcement Learning via Phase-Level Parallelism for Robotics Applications | 6
QuArch: A Question-Answering Dataset for AI Agents in Computer Architecture | 6
Mitigating Timing-Based NoC Side-Channel Attacks With LLC Remapping | 6
PUDTune: Multi-Level Charging for High-Precision Calibration in Processing-Using-DRAM | 6
LADIO: Leakage-Aware Direct I/O for I/O-Intensive Workloads | 5
DeMM: A Decoupled Matrix Multiplication Engine Supporting Relaxed Structured Sparsity | 5
NoHammer: Preventing Row Hammer With Last-Level Cache Management | 5
Managing Prefetchers With Deep Reinforcement Learning | 5
pNet-gem5: Full-System Simulation With High-Performance Networking Enabled by Parallel Network Packet Processing | 5
SparseLeakyNets: Classification Prediction Attack Over Sparsity-Aware Embedded Neural Networks Using Timing Side-Channel Information | 5
High-Performance Winograd Based Accelerator Architecture for Convolutional Neural Network | 5
Enhancing the Reach and Reliability of Quantum Annealers by Pruning Longer Chains | 4
A Flexible Hybrid Interconnection Design for High-Performance and Energy-Efficient Chiplet-Based Systems | 4
Memory-Centric MCM-GPU Architecture | 4
ZoneBuffer: An Efficient Buffer Management Scheme for ZNS SSDs | 4
Adaptive Web Browsing on Mobile Heterogeneous Multi-cores | 4
SSD Offloading for LLM Mixture-of-Experts Weights Considered Harmful in Energy Efficiency | 4
PreGNN: Hardware Acceleration to Take Preprocessing Off the Critical Path in Graph Neural Networks | 4
Primate: A Framework to Automatically Generate Soft Processors for Network Applications | 4
Fast Performance Prediction for Efficient Distributed DNN Training | 3
Exploring Volatile FPGAs Potential for Accelerating Energy-Harvesting IoT Applications | 3
T-CAT: Dynamic Cache Allocation for Tiered Memory Systems With Memory Interleaving | 3
A Quantum Computer Trusted Execution Environment | 3
Guard Cache: Creating Noisy Side-Channels | 3
FPGA-Accelerated Data Preprocessing for Personalized Recommendation Systems | 3
Architectural Implications of GNN Aggregation Programming Abstractions | 3
Accelerators & Security: The Socket Approach | 3
SSE: Security Service Engines to Accelerate Enclave Performance in Secure Multicore Processors | 3
Camulator: a Lightweight and Extensible Trace-Driven Cache Simulator for Embedded Multicore SoCs | 3
Direct-Coding DNA With Multilevel Parallelism | 3
R.I.P. Geomean Speedup Use Equal-Work (Or Equal-Time) Harmonic Mean Speedup Instead | 2
DRAM-CAM: General-Purpose Bit-Serial Exact Pattern Matching | 2
Overcoming Memory Capacity Wall of GPUs With Heterogeneous Memory Stack | 2
Energy-Efficient Bayesian Inference Using Bitstream Computing | 2
Minimal Counters, Maximum Insight: Simplifying System Performance With HPC Clusters for Optimized Monitoring | 2
A Case Study of a DRAM-NVM Hybrid Memory Allocator for Key-Value Stores | 2
EgDiff: An Enhanced Global Load Value Predictor | 2
PINSim: A Processing In- and Near-Sensor Simulator to Model Intelligent Vision Sensors | 2
A First-Order Model to Assess Computer Architecture Sustainability | 2
FPGA-Based AI Smart NICs for Scalable Distributed AI Training Systems | 2
FullPack: Full Vector Utilization for Sub-Byte Quantized Matrix-Vector Multiplication on General Purpose CPUs | 2
Accelerating Page Migrations in Operating Systems With Intel DSA | 2
Analyzing and Exploiting Memory Hierarchy Parallelism With MLP Stacks | 2
SEMS: Scalable Embedding Memory System for Accelerating Embedding-Based DNNs | 2
Reducing the Silicon Area Overhead of Counter-Based Rowhammer Mitigations | 2
gem5-accel: A Pre-RTL Simulation Toolchain for Accelerator Architecture Validation | 2
Per-Row Activation Counting on Real Hardware: Demystifying Performance Overheads | 2
IntervalSim++: Enhanced Interval Simulation for Unbalanced Processor Designs | 2