Can someone explain this line? #21

teucer · 2018-07-09T14:06:20Z

If my understanding is correct, this finds the positions where the delimiter occurs and filters for them. How does this help with training?

pytorch-openai-transformer-lm/model_pytorch.py

Line 207 in 253ca42

clf_h = clf_h[flat == self.clf_token, :]

rodgzilla · 2018-07-10T15:46:10Z

When the information reaches the classification head, there is one vector of dimension n_embd associated with each position of each input. If you want a single prediction per input (as is the case with classification tasks), you have to select one of these vectors.

As the transformer network is auto-regressive, the vector you select has to be the rightmost one, which corresponds to clf_token, since the input is built like this:

x12 = [start] + x1[:max_len] + [delimiter] + x2[:max_len] + [clf_token]
x13 = [start] + x1[:max_len] + [delimiter] + x3[:max_len] + [clf_token]
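
To make the selection concrete, here is a minimal sketch of what the masking line does. It uses NumPy as a stand-in for the PyTorch tensors (boolean-mask indexing behaves the same way for this case), and the token ids, shapes, and the value of clf_token are made up for illustration; the real model has an extra dimension for the candidate inputs.

```python
import numpy as np

n_embd = 4
clf_token = 99  # hypothetical id for the classification token

# Two token sequences of length 5, each ending with clf_token (ids are made up).
x = np.array([
    [1, 10, 50, 11, 99],
    [1, 12, 50, 13, 99],
])
batch, seq_len = x.shape

# Stand-in for the transformer output: one n_embd vector per position.
h = np.arange(batch * seq_len * n_embd, dtype=np.float32)
h = h.reshape(batch, seq_len, n_embd)

# Flatten the positions, then keep only the rows whose token id is clf_token.
clf_h = h.reshape(-1, n_embd)
flat = x.reshape(-1)
clf_h = clf_h[flat == clf_token, :]

print(clf_h.shape)  # (2, 4): one vector per input sequence
```

Because clf_token appears exactly once per sequence, and in the last position, the mask keeps exactly one hidden state per input, which is the same as taking h[:, -1, :] in this simplified setup.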

teucer · 2018-07-10T16:11:35Z

@rodgzilla Thanks a lot for the explanation. It makes a lot of sense! Out of curiosity, why can't all the values be used?

thomwolf · 2018-07-18T08:47:56Z

Well, for a classifier we usually want a fixed-length representation of the sentence, so we can't really use a varying number of values. Starting from that, the last hidden state is the most logical summary of the sentence. But there are other possible options of course, feel free to try your ideas!

mehdimashayekhi · 2018-07-19T23:56:12Z

In the original OpenAI code (https://github.com/openai/finetune-transformer-lm/blob/bd1cf7d678926041e6d19193cab7e5cd8ce2fce6/train.py#L191), in the model function in train.py, there is this line: clf_logits = clf(clf_h, 1, train=train). Why is ny 1? Shouldn't it be 2, since we have two classes? Is there a reason to use 1 and then later reshape the second dimension of the logits to 2? I'd really appreciate your help.
