<meta http-equiv="Content-Type" content="text/html; charset=GB18030"><div><div style="">Hi Adapters,</div><div style=""><br></div><div style=""><p class="MsoNormal" style="margin: 0pt 0pt 0.0001pt; text-align: justify;"><font face="Calibri">Large Language Models (LLMs) arguably open the way to general-purpose systems. However, evaluating these systems is an open problem: given their emerging capabilities, LLMs are breaking AI benchmarks at an ever-increasing rate. Furthermore, open-ended generation generally requires human or model-based evaluation. Human evaluation will become less and less feasible as task complexity increases. Model-based evaluations, on the other hand, depend by construction on stronger models and hence cannot evaluate new state-of-the-art models. Overall, evaluating new AI systems requires rethinking benchmarks.</font></p><p class="MsoNormal" style="margin: 0pt 0pt 0.0001pt; text-align: justify;"><br></p></div><div style="">Hope you find this talk interesting.</div><div style=""><br></div><div style=""><br></div><div style="">Time: Wed 10 am. 
- 11:30 am.<br>Meeting link: <a href="https://teams.microsoft.com/l/meetup-join/19%3ameeting_M2VmMTU5MzgtODUzOC00NmU4LTg0MzktNGFjNDdiMmIwYTI1%40thread.v2/0?context=%7b%22Tid%22%3a%225cdc5b43-d7be-4caa-8173-729e3b0a62d9%22%2c%22Oid%22%3a%221a8b9fa0-af57-4a1c-9390-22d1c201d622%22%7d" rel="noopener" target="_blank" style="outline: none; cursor: pointer; color: rgb(30, 84, 148);">https://teams.microsoft.com/l/meetup-join/19%3ameeting_M2VmMTU5MzgtODUzOC00NmU4LTg0MzktNGFjNDdiMmIwYTI1%40thread.v2/0?context=%7b%22Tid%22%3a%225cdc5b43-d7be-4caa-8173-729e3b0a62d9%22%2c%22Oid%22%3a%221a8b9fa0-af57-4a1c-9390-22d1c201d622%22%7d</a><br></div><div style=""><br></div><div style=""><br></div><div style="">Best wishes,</div><div style="">minghao</div></div>