[Adapt] [Seminar] Model Evaluation

黄姗姗 798508656 at qq.com
Wed Sep 16 00:03:33 CST 2020


Hi Adapters,


     Although measuring held-out accuracy has been the primary approach to evaluating generalization, it often overestimates the performance of NLP models.




     In this seminar, I want to present a task-agnostic methodology for testing NLP models, called CheckList. In the paper's user study, NLP practitioners using CheckList created twice as many tests and found almost three times as many bugs as users without it. I will introduce this method and my work in this area.
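To give a flavor of what CheckList-style behavioral testing looks like, here is a minimal sketch of two of its test types, a Minimum Functionality Test (MFT) and an Invariance test (INV). The `toy_sentiment` model below is a hypothetical stand-in for a real NLP model, not part of the CheckList library; the helpers are simplified illustrations of the idea, not the library's actual API.

```python
# Sketch of CheckList-style behavioral tests for a sentiment model.
# `toy_sentiment` is a hypothetical keyword-based classifier used only
# to make the example self-contained.

def toy_sentiment(text: str) -> str:
    """Hypothetical model: predicts 'neg' if a negative keyword appears."""
    negative = {"bad", "terrible", "awful"}
    return "neg" if any(w in text.lower() for w in negative) else "pos"

def mft(model, cases):
    """Minimum Functionality Test: simple (input, expected) pairs the
    model should always get right. Returns the failing inputs."""
    return [text for text, expected in cases if model(text) != expected]

def inv(model, pairs):
    """Invariance test: a label-preserving perturbation (e.g. swapping a
    named entity) must not flip the prediction. Returns failing pairs."""
    return [(a, b) for a, b in pairs if model(a) != model(b)]

# MFT: templated examples targeting a basic capability (negation-free sentiment).
mft_failures = mft(toy_sentiment, [
    ("The flight was terrible.", "neg"),
    ("The flight was great.", "pos"),
])

# INV: changing the city name should not change the predicted sentiment.
inv_failures = inv(toy_sentiment, [
    ("The service in Boston was awful.", "The service in Chicago was awful."),
])

print(len(mft_failures), len(inv_failures))
```

The real library additionally offers templating over lexicons and a third test type (Directional Expectation Tests), but the pattern is the same: generate many simple cases per linguistic capability and count failures, instead of relying on one held-out accuracy number.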


Related papers:

Beyond Accuracy: Behavioral Testing of NLP Models with CheckList




Hope you will gain a fresh perspective from the talk.


Time: Wed 4:30pm

Venue: SEIEE 3-414

Best regards,

Shanshan