Fix custom token in train.py#246

Open
naufalso wants to merge 1 commit into tatsu-lab:main from naufalso:patch-1

@naufalso
After fine-tuning the LLaMA model with the existing training code, I noticed that the model never outputs the EOS token, so generation does not stop until `max_new_tokens` is reached.

I debugged the code and found that `tokenizer.eos_token`, `tokenizer.bos_token`, and `tokenizer.unk_token` are all `''` (empty string).

Since `''` (empty string) is not equal to `None`, the `None` checks in the training code never trigger, so the custom tokens are not added. I suggest fixing this with the changes in this PR.

I have tested that, after training with the modified code, the model outputs the EOS token correctly.
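
To illustrate the difference, here is a minimal sketch that compares an `is None` check with a falsy check. It is not the actual diff in this PR: the helper names `missing_tokens_strict` and `missing_tokens_falsy`, the `DEFAULT_TOKENS` values, and the stand-in tokenizer object are assumptions made only for this example.

```python
from types import SimpleNamespace

# Hypothetical default special tokens, used only for this illustration.
DEFAULT_TOKENS = {
    "pad_token": "[PAD]",
    "eos_token": "</s>",
    "bos_token": "<s>",
    "unk_token": "<unk>",
}


def missing_tokens_strict(tokenizer):
    """Collect defaults only for tokens that are exactly None."""
    return {
        name: value
        for name, value in DEFAULT_TOKENS.items()
        if getattr(tokenizer, name) is None
    }


def missing_tokens_falsy(tokenizer):
    """Collect defaults for tokens that are None or an empty string."""
    return {
        name: value
        for name, value in DEFAULT_TOKENS.items()
        if not getattr(tokenizer, name)
    }


# Stand-in for the tokenizer state described above: eos, bos, and unk are
# empty strings rather than None. pad_token=None is an assumption.
tokenizer = SimpleNamespace(pad_token=None, eos_token="", bos_token="", unk_token="")

strict = missing_tokens_strict(tokenizer)
falsy = missing_tokens_falsy(tokenizer)

print(sorted(strict))  # only the token that is None
print(sorted(falsy))   # also includes the empty-string tokens
```

With the strict check, the empty-string tokens are skipped, which matches the behavior described above. The falsy check treats `''` and `None` the same way, so all of the defaults would be added.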
