About

Introduction

I am a PhD candidate in the School of Computer Science at Peking University, advised by Prof. Yun Liang. My research interests include programming languages, compiler design, high-performance computing, and system-level optimization for machine learning.

Education

  • 2021~   | PhD Candidate, School of Computer Science, Peking University, Beijing, China
  • 2017~2021 | Bachelor of Science, School of Computer Science, Peking University, Beijing, China

Awards

  •    2024 | ByteDance Scholarship, ByteDance, China
  • 2022~2023 | Schlumberger Scholarship, School of Computer Science, Peking University, Beijing, China
  • 2019~2020 | PKU Second Class Scholarship, Peking University, Beijing, China
  • 2019~2020 | Merit Student, Peking University, Beijing, China
  • 2017~2018 | Merit Student, Peking University, Beijing, China

Publications

(“*” means equal contribution)

Venue | Title & Links | Author List
DATE’26 | LATIAS: A General Architecture-Operator Model for Spatial Accelerators with Complex Topology and Memory Hierarchy | Chengrui Zhang, Liancheng Jia, Chu Wang, Tianqi Li, Renze Chen, Xiuping Cui, Size Zheng, Shengen Yan, Xiuhong Li, Yu Wang, Xiang Chen and Yun Liang
NIPS’24 | ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction [site][code][paper][slides][poster] | Renze Chen, Zhuofeng Wang, Beiquan Cao, Tong Wu, Size Zheng, Xiuhong Li, Xuechao Wei, Shengen Yan, Meng Li, Yun Liang
ICCAD’24 | MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers [site][paper] | Zebin Yang*, Renze Chen*, Taiqiang Wu, Ngai Wong, Yun Liang, Runsheng Wang, Ru Huang, Meng Li
ICCAD’24 | FlexHE: a Flexible Kernel Generation Framework for Homomorphic Encryption-Based Private Inference [site][paper] | Jiangrui Yu, Wenxuan Zeng, Tianshi Xu, Renze Chen, Yun Liang, Runsheng Wang, Ru Huang, Meng Li
DAC’24 | MoteNN: Memory Optimization via Fine-grained Scheduling for Deep Neural Networks on Tiny Devices [site][paper][slides][poster] | Renze Chen, Zijian Ding, Size Zheng, Meng Li, and Yun Liang
MLSys’24 | vMCU: Coordinated Memory Management and Kernel Optimization for DNN Inference on MCUs [site][paper][slides] | Size Zheng*, Renze Chen*, Meng Li, Zihao Ye, Luis Ceze, and Yun Liang
ASPLOS’24 | MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN [site][code][paper][slides][poster] | Renze Chen, Zijian Ding, Size Zheng, Chengrui Zhang, Jingwen Leng, Xuanzhe Liu, and Yun Liang
HPCA’23 | Chimera: An Analytical Optimizing Framework for Effective Compute-intensive Operators Fusion [site][paper] | Size Zheng, Siyuan Chen, Peidi Song, Renze Chen, Xiuhong Li, Shengen Yan, Dahua Lin, Jingwen Leng, and Yun Liang
ISCA’22 | AMOS: Enabling Automatic Mapping for Tensor Computations On Spatial Accelerators with Hardware Abstraction [site][code][paper] | Size Zheng, Renze Chen, Anjiang Wei, Yicheng Jin, Qin Han, Liqiang Lu, Bingyang Wu, Xiuhong Li, Shengen Yan, and Yun Liang
TPDS’22 | NeoFlow: A Flexible Framework for Enabling Efficient Compilation for High Performance DNN Training [site][paper] | Size Zheng, Renze Chen, Yicheng Jin, Anjiang Wei, Bingyang Wu, Xiuhong Li, Shengen Yan, and Yun Liang
ASPLOS’20 | FlexTensor: An Automatic Schedule Exploration and Optimization Framework for Tensor Computation on Heterogeneous System [site][code][paper] | Size Zheng, Yun Liang, Shuo Wang, Renze Chen, and Kaiwen Sheng

Projects

Name | Description
Triton-Distributed | Distributed Compiler based on Triton for Parallel Systems
ArkVale | ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS’24).
MAGIS | MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS’24).
AMOS | AMOS: Enabling Automatic Mapping for Tensor Computations On Spatial Accelerators with Hardware Abstraction (ISCA’22).
FlexTensor | FlexTensor: An Automatic Schedule Exploration and Optimization Framework for Tensor Computation on Heterogeneous System (ASPLOS’20).
CCTV | CCTV: C++ Compile-Time EValuator for Scheme LISP. An interpreter for a tiny dialect of the Scheme LISP language, implemented with C++ template meta-programming.
CppyList | A C++ library providing a Python-like heterogeneous list.
CppFP | A C++ library for “curry”, “partial”, and other functional programming combinators.
ECAIA | Implementation of “Essentials of Compilation: An Incremental Approach” in Racket.
AVL-Tree | An AVL-Tree implementation with a linear-time merge operation.
CoCo | A simple symmetric coroutine library for POSIX C.

Coursework

Name | Description
Rust-Linear | Project of Design Principles of Programming Language (2022). Implementation of a flexible linear type system in Rust.
Shift-Reset | Project of Design Principles of Programming Language (2020). Implementation of the paper “Selective CPS Transformation for shift and reset” in OCaml.
MIT-JOS | Lab. of Operating System (Honor Track). Implementation of a tiny micro-kernel OS.
MiniC | Lab. of Practice for Compiler Design. Implementation of a C compiler from scratch.