Jingwen's Homepage

Publications

  • Architecture for Deep Learning

[AAAI’22]
BlockSkim: Efficient Question Answering for Transformer

[ASPLOS’22]
VELTAIR: Towards High-Performance Multi-Tenant Deep Learning Services via Adaptive Compilation and Scheduling

[HPCA’22]
Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS

[SC’21]
Enable Simultaneous DNN Services Based on Deterministic Operator Overlap and Precise Latency Prediction

[IISWC’21]
Characterizing and Demystifying the Implicit Convolution Algorithm on Commercial Matrix-Multiplication Accelerators

[ICCD’21]
Exploiting Intra-SM Parallelism in GPUs via Persistent and Elastic Blocks

  • Resiliency and Efficiency

  • Cloud Computing

[ASPLOS’22]
Astraea: Towards QoS-Aware and Resource-Efficient Multi-stage GPU Services

[IPDPS’21]
AlphaR: Learning-Powered Resource Management for Irregular, Dynamic Microservice Graph

  • Misc.