[Adapt] [Seminar] Model Evaluation
Shanshan Huang
798508656 at qq.com
Wed Sep 16 00:03:33 CST 2020
Hi Adapters,
Although measuring held-out accuracy has been the primary approach for evaluating generalization, it often overestimates the performance of NLP models.
In this seminar, I want to present CheckList, a task-agnostic methodology for testing NLP models. In the authors' user study, NLP practitioners with CheckList created twice as many tests and found almost three times as many bugs as users without it. I will introduce this method and my work in this area.
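To give a flavor of what such behavioral tests look like: CheckList organizes tests into types such as Minimum Functionality Tests (MFTs, simple targeted cases) and invariance tests (label-preserving perturbations). The sketch below illustrates these two ideas in plain Python; it is not the authors' code, and `predict_sentiment` is a hypothetical stand-in model used only so the example runs.

```python
# Illustrative sketch of CheckList-style behavioral tests (not the CheckList library).
# `predict_sentiment` is a hypothetical stand-in model returning "pos" or "neg".

def predict_sentiment(text):
    # Naive keyword classifier, used here only so the tests are runnable.
    return "neg" if any(w in text.lower() for w in ("bad", "terrible", "not good")) else "pos"

def mft_negation():
    """Minimum Functionality Test: simple negation should flip the sentiment."""
    cases = [("The food is good.", "pos"), ("The food is not good.", "neg")]
    # Return the cases where the model's prediction disagrees with the expectation.
    return [(t, e) for t, e in cases if predict_sentiment(t) != e]

def inv_typo():
    """Invariance test: a harmless typo should not change the prediction."""
    pairs = [("The service was terrible.", "The servvice was terrible.")]
    # Return the pairs where the perturbation changed the model's output.
    return [(a, b) for a, b in pairs if predict_sentiment(a) != predict_sentiment(b)]

print("MFT failures:", mft_negation())
print("INV failures:", inv_typo())
```

A real CheckList suite generates many such cases from templates and aggregates failure rates per capability; the point of the sketch is only the test structure, not the toy model.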
Related papers:
Beyond Accuracy: Behavioral Testing of NLP Models with CheckList
Hope you can gain a fresh perspective after the talk.
Time: Wed 4:30pm
Venue: SEIEE 3-414
Best regards,
Shanshan