[Adapt] [Seminar] Model Evaluation
Shanshan Huang
798508656 at qq.com
Wed Sep 16 00:03:33 CST 2020
Hi Adapters,
Although measuring held-out accuracy has been the primary approach for evaluating generalization, it often overestimates the performance of NLP models.
In this seminar, I want to present CheckList, a task-agnostic methodology for testing NLP models. In the authors' user study, NLP practitioners with CheckList created twice as many tests and found almost three times as many bugs as users without it. I will introduce this method and my work in this area.
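To give a flavor of what such behavioral tests look like: CheckList organizes tests into types such as Minimum Functionality Tests (MFTs, simple targeted cases) and invariance tests (label-preserving perturbations). The sketch below illustrates these two ideas in plain Python; it is not the authors' code, and `predict_sentiment` is a hypothetical stand-in model used only so the example runs.

```python
# Illustrative sketch of CheckList-style behavioral tests (not the CheckList library).
# `predict_sentiment` is a hypothetical stand-in model returning "pos" or "neg".

def predict_sentiment(text):
    # Naive keyword classifier, used here only so the tests are runnable.
    return "neg" if any(w in text.lower() for w in ("bad", "terrible", "not good")) else "pos"

def mft_negation():
    """Minimum Functionality Test: simple negation should flip the sentiment."""
    cases = [("The food is good.", "pos"), ("The food is not good.", "neg")]
    # Return the cases where the model's prediction disagrees with the expectation.
    return [(t, e) for t, e in cases if predict_sentiment(t) != e]

def inv_typo():
    """Invariance test: a harmless typo should not change the prediction."""
    pairs = [("The service was terrible.", "The servvice was terrible.")]
    # Return the pairs where the perturbation changed the model's output.
    return [(a, b) for a, b in pairs if predict_sentiment(a) != predict_sentiment(b)]

print("MFT failures:", mft_negation())
print("INV failures:", inv_typo())
```

A real CheckList suite generates many such cases from templates and aggregates failure rates per capability; the point of the sketch is only the test structure, not the toy model.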
Related papers:
Beyond Accuracy: Behavioral Testing of NLP Models with CheckList
Hope you can gain a fresh perspective after the talk.
Time: Wed 4:30pm
Venue: SEIEE 3-414
Best regards,
Shanshan