Recently ModernBERT was launched with an 8k context length: https://huggingface.co/blog/modernbert
Does Flair support fine-tuning and inference of models with a context length higher than 512?
Thanks!
I was able to run it; you just need to set the `model_max_length` tokenizer parameter:
```python
from flair.embeddings import TransformerWordEmbeddings

embeddings = TransformerWordEmbeddings(
    model="answerdotai/ModernBERT-base",
    layers="all",
    subtoken_pooling="first",
    fine_tune=True,
    use_context=True,
    transformers_tokenizer_kwargs={"model_max_length": 8192},
)
```
Remember to install transformers from the main branch, plus flash-attn and triton:

```shell
pip install git+https://github.com/huggingface/transformers.git
pip install flash-attn --no-build-isolation
pip install triton
```
ModernBERT will be included in v4.48.0 of transformers. Until then, it requires installing transformers from main.
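For completeness, the embeddings above can then be plugged into a Flair `SequenceTagger` for fine-tuning. Here is a minimal sketch; the corpus (`CONLL_03`), the tagger settings, and the training hyperparameters are illustrative assumptions on my side, not something from this thread:

```python
# Sketch: fine-tuning a Flair SequenceTagger on top of ModernBERT embeddings.
# The corpus (CONLL_03) and all hyperparameters below are illustrative
# assumptions, not taken from the original thread.
from flair.datasets import CONLL_03
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Load a CoNLL-style NER corpus and build the label dictionary.
corpus = CONLL_03()
label_dict = corpus.make_label_dictionary(label_type="ner")

# ModernBERT embeddings with the extended 8k context, as in the snippet above.
embeddings = TransformerWordEmbeddings(
    model="answerdotai/ModernBERT-base",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
    use_context=True,
    transformers_tokenizer_kwargs={"model_max_length": 8192},
)

# A plain linear tagging head on top of the transformer (no CRF, no RNN),
# which is the usual setup when fine-tuning the transformer itself.
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "resources/modernbert-ner",
    learning_rate=5e-5,
    mini_batch_size=16,
)
```

Note that this downloads the corpus and the model weights on first run, and flash-attn support requires a compatible GPU.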
Hey @heukirne ,
if you plan to use ModernBERT for Token Classification tasks, please have a look at my NER repo:
https://github.com/stefan-it/modern-bert-ner
ModernBERT currently has a tokenizer issue, which (probably) does not affect text classification tasks...