
Open-LLaMA-3B results are much worse than reported in this repo #68

Open
XinnuoXu opened this issue Jul 6, 2023 · 5 comments

XinnuoXu commented Jul 6, 2023

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| anli_r1 | 0 | acc | 0.3330 | ± 0.0149 |
| anli_r2 | 0 | acc | 0.3320 | ± 0.0149 |
| anli_r3 | 0 | acc | 0.3367 | ± 0.0136 |
| arc_challenge | 0 | acc | 0.2099 | ± 0.0119 |
| | | acc_norm | 0.2705 | ± 0.0130 |
| arc_easy | 0 | acc | 0.2542 | ± 0.0089 |
| | | acc_norm | 0.2517 | ± 0.0089 |
| hellaswag | 0 | acc | 0.2621 | ± 0.0044 |
| | | acc_norm | 0.2741 | ± 0.0045 |
| openbookqa | 0 | acc | 0.1800 | ± 0.0172 |
| | | acc_norm | 0.2500 | ± 0.0194 |
| piqa | 0 | acc | 0.5147 | ± 0.0117 |
| | | acc_norm | 0.5011 | ± 0.0117 |
| record | 0 | f1 | 0.2017 | ± 0.0040 |
| | | em | 0.1964 | ± 0.0040 |
| rte | 0 | acc | 0.4946 | ± 0.0301 |
| truthfulqa_mc | 1 | mc1 | 0.2375 | ± 0.0149 |
| | | mc2 | 0.4767 | ± 0.0169 |
| wic | 0 | acc | 0.5000 | ± 0.0198 |
| winogrande | 0 | acc | 0.5099 | ± 0.0140 |

XinnuoXu (Author) commented Jul 6, 2023

It seems that the anli_* and truthfulqa_mc results are similar to those reported, but the rest are roughly 20% worse. I'm wondering whether the results reported in this repo for hellaswag and arc_* are zero-shot (few-shot = 0) or not?

young-geng (Contributor) commented

Everything reported here is zero-shot. Did you turn off the fast tokenizer when evaluating? There is a bug in recent releases of the transformers library that causes the auto-converted fast tokenizer to output different tokens than the original tokenizer. Therefore, when evaluating OpenLLaMA, you need to turn off the fast tokenizer.
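For reference, a minimal sketch of the workaround: passing `use_fast=False` to `AutoTokenizer.from_pretrained` makes transformers load the original (slow) sentencepiece tokenizer instead of the auto-converted fast one. The model name and helper function are assumptions for illustration, not from this thread.

```python
# Hypothetical helper illustrating the workaround described above:
# load OpenLLaMA's original slow tokenizer rather than the auto-converted
# fast tokenizer, which produces different tokens in affected
# transformers releases.

MODEL = "openlm-research/open_llama_3b"  # assumed model id for illustration


def load_slow_tokenizer(model_name: str = MODEL):
    """Return the slow (sentencepiece) tokenizer for the given model."""
    from transformers import AutoTokenizer  # imported lazily

    # use_fast=False is the key flag: it disables the fast tokenizer.
    return AutoTokenizer.from_pretrained(model_name, use_fast=False)
```

The same flag can typically be forwarded through whatever evaluation harness you use, as long as it exposes tokenizer keyword arguments.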

buzzCraft commented

Is that bug still present? I thought I read somewhere that it had been fixed.

young-geng (Contributor) commented

@buzzCraft It was fixed on the main branch of transformers, but there hasn't been a release with that fix yet.
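Since the bug manifests as the fast tokenizer emitting different tokens than the slow one, a quick way to check whether your installed transformers version is affected is to encode the same text with both and compare. This is a hypothetical sanity check, not something from the thread; the function name and text are illustrative.

```python
# Hypothetical sanity check: if the slow and fast tokenizers disagree on
# the same input, the installed transformers version likely still has the
# auto-conversion bug discussed above.


def tokenizers_agree(model_name: str, text: str) -> bool:
    """Return True if the slow and fast tokenizers produce identical ids."""
    from transformers import AutoTokenizer  # imported lazily

    slow = AutoTokenizer.from_pretrained(model_name, use_fast=False)
    fast = AutoTokenizer.from_pretrained(model_name, use_fast=True)
    return slow.encode(text) == fast.encode(text)
```

If this returns False for your model, stick with `use_fast=False` (or install transformers from the main branch) until a release containing the fix ships.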

buzzCraft commented

@young-geng OK, since we are on the bleeding edge of the LLM field, I usually go with the dev branch.

I also want to thank you and the team for the amazing work you have done. ❤️
