Fix custom token in train.py by naufalso · Pull Request #246 · tatsu-lab/stanford_alpaca

naufalso · 2023-04-26T01:18:40Z

After fine-tuning the LLaMA model with the existing training code, I noticed that the model never outputs the EOS token, so generation does not stop until `max_new_tokens` is reached.

I debugged the code and found that `tokenizer.eos_token`, `tokenizer.bos_token`, and `tokenizer.unk_token` are all `''` (empty string).

Since `''` (empty string) is not equal to `None`, the training code's check for missing tokens never fires, and the custom tokens are not added. I suggest fixing this with the changes in this PR.

I have verified that a model trained with the modified code outputs the EOS token correctly.
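To illustrate the difference between the two kinds of check, here is a minimal sketch rather than the actual diff. It uses a stub object in place of the real tokenizer; the helper `missing_special_tokens`, the `DEFAULTS` values, and the assumption that `pad_token` is `None` are all hypothetical and chosen only for illustration.

```python
from types import SimpleNamespace

# Hypothetical default token strings, for illustration only.
DEFAULTS = {
    "pad_token": "[PAD]",
    "eos_token": "</s>",
    "bos_token": "<s>",
    "unk_token": "<unk>",
}


def missing_special_tokens(tokenizer, strict_none_check):
    """Return the special tokens that would be added for this tokenizer."""
    to_add = {}
    for name, default in DEFAULTS.items():
        current = getattr(tokenizer, name, None)
        if strict_none_check:
            # Only None counts as missing, so '' slips through.
            is_missing = current is None
        else:
            # Falsy check: both None and '' count as missing.
            is_missing = not current
        if is_missing:
            to_add[name] = default
    return to_add


# Stand-in for the tokenizer state described above: eos/bos/unk are ''.
# pad_token=None is an assumption made for this sketch.
stub = SimpleNamespace(pad_token=None, eos_token="", bos_token="", unk_token="")

print(missing_special_tokens(stub, strict_none_check=True))
print(missing_special_tokens(stub, strict_none_check=False))
```

With the strict check, only the token that is actually `None` is scheduled to be added; with the falsy check, the empty-string tokens are picked up as well.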

naufalso wants to merge 1 commit into tatsu-lab:main from naufalso:patch-1
