
How to create transforms for entailment task? #40

Open
lordzuko opened this issue Sep 9, 2018 · 12 comments

Comments

@lordzuko commented Sep 9, 2018

Given the transformation method for the ROC Stories dataset, which is:

def transform_roc(X1, X2, X3):
    n_batch = len(X1)
    xmb = np.zeros((n_batch, 2, n_ctx, 2), dtype=np.int32)
    mmb = np.zeros((n_batch, 2, n_ctx), dtype=np.float32)
    start = encoder['_start_']
    delimiter = encoder['_delimiter_']
    for i, (x1, x2, x3) in enumerate(zip(X1, X2, X3)):
        # Two candidate sequences per example: [start] story [delimiter] ending [clf_token]
        x12 = [start] + x1[:max_len] + [delimiter] + x2[:max_len] + [clf_token]
        x13 = [start] + x1[:max_len] + [delimiter] + x3[:max_len] + [clf_token]
        l12 = len(x12)
        l13 = len(x13)
        xmb[i, 0, :l12, 0] = x12
        xmb[i, 1, :l13, 0] = x13
        # The mask marks the real (non-padded) token positions
        mmb[i, 0, :l12] = 1
        mmb[i, 1, :l13] = 1
    # Position information that is added to the input embeddings in the TransformerModel
    xmb[:, :, :, 1] = np.arange(n_vocab + n_special, n_vocab + n_special + n_ctx)
    return xmb, mmb

I have created the following transform for the entailment task:

def transform_entailment(X1, X2):
    n_batch = len(X1)
    xmb = np.zeros((n_batch, 1, n_ctx, 2), dtype=np.int32)
    mmb = np.zeros((n_batch, 1, n_ctx), dtype=np.float32)
    start = encoder['_start_']
    delimiter = encoder['_delimiter_']
    for i, (x1, x2) in enumerate(zip(X1, X2)):
        # One sequence per example: [start] premise [delimiter] hypothesis [clf_token]
        x12 = [start] + x1[:max_len] + [delimiter] + x2[:max_len] + [clf_token]
        l12 = len(x12)
        xmb[i, 0, :l12, 0] = x12
        # The mask marks the real (non-padded) token positions
        mmb[i, 0, :l12] = 1

    # Position information that is added to the input embeddings in the TransformerModel
    xmb[:, :, :, 1] = np.arange(n_vocab + n_special, n_vocab + n_special + n_ctx)
    return xmb, mmb

Using this, I am getting the following error during loss computation:

Namespace(afn='gelu', analysis=False, attn_pdrop=0.1, b1=0.9, b2=0.999, bpe_path='data/dataset_tweet_encode/vocab_40000.bpe', clf_pdrop=0.1, data_dir='data/', dataset=None, desc=None, e=1e-08, embd_pdrop=0.1, encoder_path='data/dataset_tweet_encode/encoder_bpe_40000.json', l2=0.01, lm_coef=0.5, log_dir='log/', lr=6.25e-05, lr_schedule='warmup_linear', lr_warmup=0.002, max_grad_norm=1, n_batch=1, n_ctx=512, n_embd=768, n_head=12, n_iter=3, n_layer=12, n_transfer=12, n_valid=0.1, opt='adam', resid_pdrop=0.1, save_dir='save/', seed=42, submission_dir='submission/', submit=False, vector_l2=False)


Traceback (most recent call last):                                              
  File "train.py", line 225, in <module>
    run_epoch()
  File "train.py", line 83, in run_epoch
    compute_loss_fct(XMB, YMB, MMB, clf_logits, lm_logits)
  File "/home/lordzuko/PycharmProjects/Transformer-Pytorch/loss.py", line 53, in __call__
    lm_losses = self.lm_criterion(lm_logits, x_shifted)
  File "/home/lordzuko/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lordzuko/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 862, in forward
    ignore_index=self.ignore_index, reduction=self.reduction)
  File "/home/lordzuko/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/functional.py", line 1550, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/home/lordzuko/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/functional.py", line 1405, in nll_loss
    .format(input.size(0), target.size(0)))
ValueError: Expected input batch_size (66) to match target batch_size (0).

Can anyone please guide me through this?

@artemisart

Do you use ClassificationLossCompute?
Because I had the same problem when I used it for a classification task, but it turns out it computes the same loss (cross-entropy) as MultipleChoiceLossCompute; only the views (reshapes) are different, and they are bugged in ClassificationLossCompute. So you just have to replace ClassificationLossCompute with MultipleChoiceLossCompute and it should work.
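For anyone hitting this later, a minimal sketch of the swap in train.py. The argument names (criterion, args.lm_coef, model_opt) are assumptions based on this thread; keep whatever arguments your existing ClassificationLossCompute call already passes:

import torch.nn as nn
from loss import MultipleChoiceLossCompute  # instead of ClassificationLossCompute

criterion = nn.CrossEntropyLoss(reduce=False)  # per-token losses, not reduced

# Before (reshapes are bugged for this setup):
# compute_loss_fct = ClassificationLossCompute(criterion, criterion, args.lm_coef, model_opt)

# After: same arguments, different class. It computes the same cross entropy;
# only the internal views/reshapes differ.
compute_loss_fct = MultipleChoiceLossCompute(criterion, criterion, args.lm_coef, model_opt)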

@lordzuko (Author)

@artemisart Yes, I was using the same; I made the changes you suggested and now it's working.
Thank you!

@davidefiocco

Do you use ClassificationLossCompute?
Because I had the same problem when I used it for a classification task, but it turns out it computes the same loss (cross-entropy) as MultipleChoiceLossCompute; only the views (reshapes) are different, and they are bugged in ClassificationLossCompute. So you just have to replace ClassificationLossCompute with MultipleChoiceLossCompute and it should work.

Hi @artemisart @lordzuko! If there is a bug in ClassificationLossCompute, do you think it's worth opening an issue specifically on it?

@p-null commented Oct 25, 2018

xmb = np.zeros((n_batch, 2, n_ctx, 2), dtype=np.int32)
mmb = np.zeros((n_batch, 2, n_ctx), dtype=np.float32)

Hi, can anyone tell me what the 2 between n_batch and n_ctx means?

@artemisart

From my understanding, it's the number of texts to be processed in parallel (for each example) by the Transformer, so 1 for text classification, 2 for the Story Cloze Test (ROCStories), n for multiple choice, etc.
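To make the shapes concrete, a small sketch. The variable names follow the transforms quoted above; the concrete values of n_batch and n_ctx are just placeholders:

import numpy as np

n_batch, n_ctx = 4, 512

# One sequence per example (e.g. classification, or entailment with
# premise + hypothesis packed into a single sequence):
xmb_clf = np.zeros((n_batch, 1, n_ctx, 2), dtype=np.int32)

# Two candidate sequences per example (ROCStories: story + ending 1, story + ending 2):
xmb_roc = np.zeros((n_batch, 2, n_ctx, 2), dtype=np.int32)

# Last axis: index 0 holds the token ids, index 1 holds the position ids.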

@rodgzilla (Contributor)

Hi

@artemisart is correct.

@p-null commented Oct 29, 2018

I uploaded openai-gpt for the classification task and it can reproduce the results reported in the original paper.

@zhipeng-fan

What is the use of mmb?

@zhipeng-fan

What is the use of mmb?

I see, it is the mask.
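For context, mmb is the sequence mask. A minimal sketch of how such a mask is typically applied to the per-token LM losses; this mirrors what the loss-compute classes do, though the exact reshapes in this repo's loss.py may differ:

import torch

# lm_losses: per-token cross-entropy, shape (batch * n_choices, seq_len - 1)
# mask:      1.0 for real tokens, 0.0 for padding, same shape after dropping the first position
def masked_lm_loss(lm_losses, mask):
    lm_losses = lm_losses * mask  # zero out the padded positions
    # average only over the real tokens of each sequence
    return lm_losses.sum(1) / torch.clamp(mask.sum(1), min=1.0)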

@aayushee

@artemisart Yes, I was using the same; I made the changes you suggested and now it's working.
Thank you!

Hi,
I am also trying to use the transformer model for the entailment task and to replicate the SNLI results from the paper. Could you please let me know what changes you made to the train.py file for the entailment task?

@BUPTHYP commented May 29, 2019

Do you use ClassificationLossCompute?
Because I had the same problem when I used it for a classification task, but it turns out it computes the same loss (cross-entropy) as MultipleChoiceLossCompute; only the views (reshapes) are different, and they are bugged in ClassificationLossCompute. So you just have to replace ClassificationLossCompute with MultipleChoiceLossCompute and it should work.

Hi @artemisart @lordzuko! If there is a bug in ClassificationLossCompute, do you think it's worth opening an issue specifically on it?

Hello @davidefiocco, could you please upload the dataset you have processed? I am also a newcomer who wants to see the whole process of running this program. If it is convenient for you, you could also send the two files to [email protected]. Thank you very much. I saw your thoughts and answers, and they helped me a lot.

@BUPTHYP commented May 29, 2019

@artemisart Yes, I was using the same; I made the changes you suggested and now it's working.
Thank you!

Thanks very much! Reading your conversation with them helped me a lot.
