Training error for DistilBERT (pages 575-582) #42

Answered by rasbt
labdmitriy asked this question in Q&A

Yeah, so the way I understand it is: if you initialize a generic tokenizer, it doesn't know what max_length you need, and you have to specify it manually.

In this case, we use a tokenizer specific to the model we want to fine-tune, so it sets the appropriate max_length value for us. Sure, we could hard-code max_length=512, but then the code would no longer work for arbitrary models (although I think 99% of the HF models have a model max_length of 512).
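For illustration, here is a minimal sketch of what I mean (assuming the Hugging Face transformers package; the DistilBERT checkpoint name is just an example):

```python
from transformers import AutoTokenizer

# Load the tokenizer that matches the model we plan to fine-tune;
# it picks up model_max_length from the model's config (512 for DistilBERT).
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
print(tokenizer.model_max_length)  # 512

# Because the tokenizer already knows the model's limit, we don't have to
# hard-code max_length=512 when tokenizing; truncation uses that value.
batch = tokenizer(
    ["An example movie review."],
    truncation=True,
    padding=True,
    return_tensors="pt",
)
print(batch["input_ids"].shape)
```

Swapping in a different checkpoint name would automatically use that model's own limit, which is the advantage over hard-coding 512.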

This discussion was converted from issue #41 on March 24, 2022 15:34.