IEEE Computer Architecture Letters

Papers
(The median citation count of IEEE Computer Architecture Letters is 1. The table below lists the papers at or above that threshold, based on CrossRef citation counts [max. 250 papers], covering publications from the past four years, i.e., from 2020-11-01 to 2024-11-01. A minimal sketch of this selection rule follows the table.)
Article | Citations
RAMBO: Resource Allocation for Microservices Using Bayesian Optimization | 33
Lightweight Hardware Implementation of Binary Ring-LWE PQC Accelerator | 20
MultiPIM: A Detailed and Configurable Multi-Stack Processing-In-Memory Simulator | 16
Flexion: A Quantitative Metric for Flexibility in DNN Accelerators | 14
HBM3 RAS: Enhancing Resilience at Scale | 12
Cryogenic PIM: Challenges & Opportunities | 11
STONNE: Enabling Cycle-Level Microarchitectural Simulation for DNN Inference Accelerators | 9
TRiM: Tensor Reduction in Memory | 9
A Day In the Life of a Quantum Error | 9
FPGA-Based AI Smart NICs for Scalable Distributed AI Training Systems | 8
Reorder Buffer Contention: A Forward Speculative Interference Attack for Speculation Invariant Instructions | 8
Heterogeneity-Aware Scheduling on SoCs for Autonomous Vehicles | 8
DRAM-CAM: General-Purpose Bit-Serial Exact Pattern Matching | 7
LT-PIM: An LUT-Based Processing-in-DRAM Architecture With RowHammer Self-Tracking | 7
Accelerating Concurrent Priority Scheduling Using Adaptive in-Hardware Task Distribution in Multicores | 7
GraNDe: Near-Data Processing Architecture With Adaptive Matrix Mapping for Graph Convolutional Networks | 7
Characterizing and Understanding End-to-End Multi-Modal Neural Networks on GPUs | 7
BTB-X: A Storage-Effective BTB Organization | 7
Understanding the Implication of Non-Volatile Memory for Large-Scale Graph Neural Network Training | 6
Instruction Criticality Based Energy-Efficient Hardware Data Prefetching | 6
Characterizing and Understanding HGNNs on GPUs | 6
Characterizing and Understanding Distributed GNN Training on GPUs | 6
Computational CXL-Memory Solution for Accelerating Memory-Intensive Applications | 6
Infinity Stream: Enabling Transparent and Automated In-Memory Computing | 5
Towards Improved Power Management in Cloud GPUs | 5
A First-Order Model to Assess Computer Architecture Sustainability | 5
Deep Partitioned Training From Near-Storage Computing to DNN Accelerators | 5
X-ray: Discovering DRAM Internal Structure and Error Characteristics by Issuing Memory Commands | 5
Hardware Acceleration for GCNs via Bidirectional Fusion | 5
Canal: A Flexible Interconnect Generator for Coarse-Grained Reconfigurable Arrays | 4
WPC: Whole-Picture Workload Characterization Across Intermediate Representation, ISA, and Microarchitecture | 4
Zero-Copying I/O Stack for Low-Latency SSDs | 4
Decoupled SSD: Reducing Data Movement on NAND-Based Flash SSD | 4
Mitigating Timing-Based NoC Side-Channel Attacks With LLC Remapping | 4
Row-Streaming Dataflow Using a Chaining Buffer and Systolic Array+ Structure | 4
Adaptive Web Browsing on Mobile Heterogeneous Multi-cores | 4
Last-Level Cache Insertion and Promotion Policy in the Presence of Aggressive Prefetching | 4
Characterization and Implementation of Radar System Applications on a Reconfigurable Dataflow Architecture | 4
OpenMDS: An Open-Source Shell Generation Framework for High-Performance Design on Xilinx Multi-Die FPGAs | 4
DAM: Deadblock Aware Migration Techniques for STT-RAM-Based Hybrid Caches | 4
Dynamic Optimization of On-Chip Memories for HLS Targeting Many-Accelerator Platforms | 4
Managing Prefetchers With Deep Reinforcement Learning | 4
Hungarian Qubit Assignment for Optimized Mapping of Quantum Circuits on Multi-Core Architectures | 4
Accelerating Graph Processing With Lightweight Learning-Based Data Reordering | 3
A Quantum Computer Trusted Execution Environment | 3
Near-Data Processing in Memory Expander for DNN Acceleration on GPUs | 3
MQSim-E: An Enterprise SSD Simulator | 3
Data-Aware Compression of Neural Networks | 3
gem5-accel: A Pre-RTL Simulation Toolchain for Accelerator Architecture Validation | 3
Making a Better Use of Caches for GCN Accelerators with Feature Slicing and Automatic Tile Morphing | 3
Unleashing the Potential of PIM: Accelerating Large Batched Inference of Transformer-Based Generative Models | 3
DAMARU: A Denial-of-Service Attack on Randomized Last-Level Caches | 2
Hardware-Implemented Lightweight Accelerator for Large Integer Polynomial Multiplication | 2
BayesTuner: Leveraging Bayesian Optimization For DNN Inference Configuration Selection | 2
Fine-Grained Scheduling in Heterogeneous-ISA Architectures | 2
Balancing Performance Against Cost and Sustainability in Multi-Chip-Module GPUs | 2
FPGA-Accelerated Data Preprocessing for Personalized Recommendation Systems | 2
The Case for Replication-Aware Memory-Error Protection in Disaggregated Memory | 2
PreGNN: Hardware Acceleration to Take Preprocessing Off the Critical Path in Graph Neural Networks | 2
The Case for Dynamic Bias in Global Adaptive Routing | 2
A Case for Speculative Strength Reduction | 2
A Case Study of a DRAM-NVM Hybrid Memory Allocator for Key-Value Stores | 2
Reducing the Silicon Area Overhead of Counter-Based Rowhammer Mitigations | 2
Multi-Prediction Compression: An Efficient and Scalable Memory Compression Framework for GP-GPU | 2
Accelerators & Security: The Socket Approach | 2
Ramulator 2.0: A Modern, Modular, and Extensible DRAM Simulator | 2
Scale-Model Simulation | 2
Advancing Compilation of DNNs for FPGAs Using Operation Set Architectures | 2
Exploring PIM Architecture for High-Performance Graph Pattern Mining | 2
MPU-Sim: A Simulator for In-DRAM Near-Bank Processing Architectures | 2
Stride Equality Prediction for Value Speculation | 2
XLA-NDP: Efficient Scheduling and Code Generation for Deep Learning Model Training on Near-Data Processing Memory | 2
A Flexible Embedding-Aware Near Memory Processing Architecture for Recommendation System | 2
Characterization and Analysis of Deep Learning for 3D Point Cloud Analytics | 1
DVFaaS: Leveraging DVFS for FaaS Workflows | 1
Modeling Periodic Energy-Harvesting Computing Systems | 1
LV: Latency-Versatile Floating-Point Engine for High-Performance Deep Neural Networks | 1
A Pre-Silicon Approach to Discovering Microarchitectural Vulnerabilities in Security Critical Applications | 1
DRAMA: Commodity DRAM Based Content Addressable Memory | 1
TokenSmart: Distributed, Scalable Power Management in the Many-Core Era | 1
Runtime Support for Accelerating CNN Models on Digital DRAM Processing-in-Memory Hardware | 1
Overcoming Memory Capacity Wall of GPUs With Heterogeneous Memory Stack | 1
Energy-Efficient Bayesian Inference Using Bitstream Computing | 1
SSE: Security Service Engines to Accelerate Enclave Performance in Secure Multicore Processors | 1
FlexScore: Quantifying Flexibility | 1
Modeling DRAM Timing in Parallel Simulators With Immediate-Response Memory Model | 1
Hardware Trojan Threats to Cache Coherence in Modern 2.5D Chiplet Systems | 1
Improving Energy-Efficiency of Capsule Networks on Modern GPUs | 1
Hardware Accelerated Reusable Merkle Tree Generation for Bitcoin Blockchain Headers | 1
Analysis of Data Transfer Bottlenecks in Commercial PIM Systems: A Study With UPMEM-PIM | 1
Approximate Multiplier Design With LFSR-Based Stochastic Sequence Generators for Edge AI | 1
Direct-Coding DNA With Multilevel Parallelism | 1
DNA Pre-Alignment Filter Using Processing Near Racetrack Memory | 1
Ensuring Data Confidentiality in eADR-Based NVM Systems | 1
ADT: Aggressive Demotion and Promotion for Tiered Memory | 1
RouteReplies: Alleviating Long Latency in Many-Chip-Module GPUs | 1
SmaQ: Smart Quantization for DNN Training by Exploiting Value Clustering | 1
By-Software Branch Prediction in Loops | 1
LADIO: Leakage-Aware Direct I/O for I/O-Intensive Workloads | 1
A Model for Scalable and Balanced Accelerators for Graph Processing | 1
NoHammer: Preventing Row Hammer With Last-Level Cache Management | 1
Architectural Implications of GNN Aggregation Programming Abstractions | 1
T-CAT: Dynamic Cache Allocation for Tiered Memory Systems With Memory Interleaving | 1
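The selection rule described above can be expressed as a short filter. The sketch below is illustrative only, not the actual CrossRef-backed pipeline behind this page; the record layout (title, citations, published) and the at-or-above-median comparison are assumptions.

    # Minimal sketch of the selection rule (assumed data layout, hypothetical helper).
    from datetime import date

    MEDIAN_CITATIONS = 1                              # median citation count reported for IEEE CAL
    WINDOW = (date(2020, 11, 1), date(2024, 11, 1))   # past-four-years window from the note above
    MAX_PAPERS = 250

    def select_papers(papers):
        """papers: iterable of dicts with 'title', 'citations', and 'published' (a date)."""
        eligible = [
            p for p in papers
            if WINDOW[0] <= p["published"] <= WINDOW[1]
            and p["citations"] >= MEDIAN_CITATIONS    # table includes papers at the median (1 citation)
        ]
        # Rank by citation count, most-cited first, and cap the list length.
        eligible.sort(key=lambda p: p["citations"], reverse=True)
        return eligible[:MAX_PAPERS]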