[Adapt] [Seminar] Introduction to a study on transformer feed-forward layers

于赟皓 frankyu2017 at sjtu.edu.cn
Wed Nov 9 09:52:26 CST 2022


Hi Adapters,
 
Previous work has found that pretrained language models (PLMs) know certain facts, like “Dante was born in Florence”. There have also been studies on whether pretrained language models can serve as knowledge bases. But how do PLMs memorize such facts? What internal mechanisms allow them to do so?

In this seminar, I will introduce a potential answer to this question. My talk is mainly based on the paper “Transformer Feed-Forward Layers Are Key-Value Memories”, which studies the feed-forward layers in transformers.
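As a teaser: the paper's core observation is that a feed-forward layer can be read as FFN(x) = f(x · K^T) · V, where the rows of the first weight matrix act as "keys" that detect input patterns and the rows of the second matrix act as "values" that are summed according to the match scores. Below is a minimal numpy sketch of that view (my own illustration, not code from the paper; the names K, V, and the dimensions are just placeholders):

    import numpy as np

    d, d_ff = 8, 32                    # hidden size and inner (memory) size, arbitrary here
    rng = np.random.default_rng(0)

    K = rng.normal(size=(d_ff, d))     # "keys": one pattern detector per memory cell (first FFN matrix)
    V = rng.normal(size=(d_ff, d))     # "values": what each cell contributes back (second FFN matrix)

    def ffn(x):
        # memory coefficients: how strongly each key matches the input (ReLU(x W1))
        m = np.maximum(x @ K.T, 0.0)
        # output: weighted sum of the value vectors (ReLU(x W1) W2)
        return m @ V

    x = rng.normal(size=(d,))
    print(ffn(x).shape)                # (8,) -- same dimensionality as the input

In the talk I will go through how the paper uses this view to analyze what individual keys and values actually capture.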
 
Hope you enjoy it!

Time: Wed 4:00 pm
Venue: SEIEE 3-414

Cheers,
Yunhao
