[Adapt] [Seminar] Introduction to a study on transformer feed-forward layers
于赟皓
frankyu2017 at sjtu.edu.cn
Wed Nov 9 09:52:26 CST 2022
Hi Adapters,
Previous work has found that pretrained language models (PLMs) store certain facts, such as “Dante was born in Florence”. There have also been studies on whether PLMs can serve as knowledge bases. But how do PLMs memorize such facts? Is there an inner mechanism that enables this?
In this seminar, I will present a potential answer to this question. My talk is mainly based on the paper “Transformer Feed-Forward Layers Are Key-Value Memories”, which studies the feed-forward layers of transformers.
Hope you enjoy it!
Time: Wed 4:00 pm
Venue: SEIEE 3-414
Cheers,
Yunhao