[Adapt] [Seminar] Reinforcement learning from human feedback in LLMs

王宇飞 arthur-w at sjtu.edu.cn
Tue Sep 26 16:23:42 CST 2023


Hi, Adapters,

Reinforcement learning from human feedback (RLHF) is a technique for training large language models that has been critical to ChatGPT, Claude, Llama 2, and other recent systems.
Over the past few years, these models have shown impressive capabilities, generating diverse and compelling text from human prompts. However, what makes text "good" is inherently hard to define, since it is subjective and context-dependent. RLHF offers a way to align a model trained on a general corpus of text with complex human values.
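To give a flavor of the core machinery before the talk: the reward model at the heart of RLHF is usually trained on human preference pairs with a pairwise ranking loss. Below is a minimal, illustrative sketch of that objective, assuming PyTorch; the scores here are random placeholders standing in for a real reward model's outputs, not anyone's production code.

    import torch
    import torch.nn.functional as F

    # Placeholder scalar rewards for a batch of 8 preference pairs:
    # each pair is (human-preferred completion, rejected completion)
    # for the same prompt. In practice these would come from a learned
    # reward model's scalar head, not from torch.randn.
    r_chosen = torch.randn(8, requires_grad=True)
    r_rejected = torch.randn(8, requires_grad=True)

    # Pairwise (Bradley-Terry) ranking loss: push the preferred
    # completion's score above the rejected one's.
    loss = -F.logsigmoid(r_chosen - r_rejected).mean()
    loss.backward()  # in training, this would update the reward model

The trained reward model then supplies the scalar signal that an RL algorithm such as PPO maximizes, typically alongside a KL penalty that keeps the fine-tuned model close to the original.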
In this talk, I will introduce RLHF and how it is implemented in LLMs.
Hope you will enjoy it. 

Time: Wed, 10:00 am - 11:30 am
Meeting link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_M2VmMTU5MzgtODUzOC00NmU4LTg0MzktNGFjNDdiMmIwYTI1%40thread.v2/0?context=%7b%22Tid%22%3a%225cdc5b43-d7be-4caa-8173-729e3b0a62d9%22%2c%22Oid%22%3a%221a8b9fa0-af57-4a1c-9390-22d1c201d622%22%7d

Best wishes,
Arthur




