[Adapt] [Seminar] Handle Classification Tasks with Large Output Dimension: Word Embedding as Example

Luo Kangqi luo.kangqi at qq.com
Wed May 4 01:03:10 CST 2016

Hi Adapters,

This time I will introduce some techniques for handling classification tasks when the output dimension is large.

As a typical task, a language model tries to predict the correct word given its context words, for example, predicting the word "?" in the sentence "the quick fox ? over the lazy dog" from its surrounding words. This is essentially a classification task, but the number of classes can be huge: since a vocabulary may contain millions of distinct words, traditional learning algorithms run into time-complexity problems.
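To make the cost concrete, here is a minimal sketch (not from the talk; the vocabulary size, hidden dimension, and random weights are all illustrative): a full softmax output layer scores every word in the vocabulary, so each prediction takes O(V * d) time and the output weight matrix has V rows.

```python
import numpy as np

# Illustrative sizes only; real vocabularies reach millions of words.
V = 50_000   # vocabulary size
d = 64       # hidden dimension

rng = np.random.default_rng(0)
hidden = rng.standard_normal(d)               # context representation
output_weights = rng.standard_normal((V, d))  # one weight row per word

logits = output_weights @ hidden        # O(V * d): one dot product per word
probs = np.exp(logits - logits.max())   # numerically stable exponentiation
probs /= probs.sum()                    # normalization touches all V scores
```

The normalization step is the bottleneck: even to score a single candidate word, the partition function requires summing over the entire vocabulary.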
In this talk, I will introduce two techniques, Hierarchical Softmax and Noise-Contrastive Estimation, and show you the intuitions behind these models. Note that this is the second time NCE appears in our seminars; if it confused you last time, I hope this talk makes clear how it works.
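As a rough preview of the two techniques (my own illustrative sketch, not the talk's code; the tree layout, sign convention, and function names below are all assumptions): Hierarchical Softmax arranges words as leaves of a binary tree and computes a word's probability as a product of binary decisions along its root-to-leaf path, costing O(log V) per word; NCE replaces the V-way softmax with a binary classifier that distinguishes the observed word from k samples drawn from a noise distribution Pn, needing only k+1 scores per example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hierarchical Softmax: P(word | context) is a product of sigmoid
# decisions at the internal nodes on the word's root-to-leaf path.
def hs_word_prob(hidden, path_node_vecs, path_dirs):
    """path_node_vecs: one vector per internal node on the path;
    path_dirs: +1 for a left turn, -1 for a right turn (our convention)."""
    prob = 1.0
    for v, d in zip(path_node_vecs, path_dirs):
        s = sigmoid(v @ hidden)
        prob *= s if d == 1 else 1.0 - s
    return prob

# Noise-Contrastive Estimation: binary logistic loss telling the
# observed word apart from k noise samples; log_pn_* are log-probs
# of the words under the noise distribution Pn.
def nce_loss(score_data, log_pn_data, scores_noise, log_pn_noise, k):
    p_data = sigmoid(score_data - np.log(k) - log_pn_data)
    p_noise = sigmoid(scores_noise - np.log(k) - log_pn_noise)
    return -(np.log(p_data) + np.sum(np.log(1.0 - p_noise)))
```

A sanity check on the hierarchical-softmax sketch: summing `hs_word_prob` over all leaves of a complete tree gives exactly 1, since each internal node's two branches contribute s and 1 - s.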

The seminar will take place at 4:30 pm at SEIEE 3-528 today. See you then!

Luo Kangqi
ADAPT Lab, SEIEE 3-341
Shanghai Jiao Tong University