<html><head><meta http-equiv="content-type" content="text/html; charset=us-ascii"></head><body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><font face="PingFangSC-Regular" style="font-size: 14px;">Hi ADAPTers,</font><div><font face="PingFangSC-Regular" style="font-size: 14px;"><br></font></div><div><font face="PingFangSC-Regular" style="font-size: 14px;">Language models (LMs) are becoming the foundation for almost all major language technologies. New large language models (LLMs) keep appearing, so it is useful for users and researchers to understand how they differ in performance. My presentation today covers a work on the evaluation of LMs that aims to improve the transparency of language models. The work taxonomizes the vast design space of language model evaluation into scenarios and metrics, <span style="caret-color: rgb(16, 18, 20); color: rgb(16, 18, 20); white-space: pre-wrap; background-color: rgb(255, 255, 255);">covers most of the scenarios, and runs experiments on </span>30 language models. More details are available at </font><a href="https://crfm.stanford.edu/helm/v0.1.0/?">https://crfm.stanford.edu/helm/v0.1.0/?</a>.</div><div><font face="PingFangSC-Regular" style="font-size: 14px;"><br></font></div><div><font face="PingFangSC-Regular" style="font-size: 14px;"><br></font></div><div><font face="PingFangSC-Regular" style="font-size: 14px;">Hope you enjoy it.</font></div><div><font face="PingFangSC-Regular" style="font-size: 14px;">Chunhao</font></div></body></html>