
General help for custom token classification endtask #3

Open
JulesBelveze opened this issue Mar 8, 2021 · 6 comments

@JulesBelveze

Hey all,

Thanks for the great repo, that's exactly what I was looking for 😄

I have been through the examples and documentation you provided, and I am now attempting to use the library for token classification (specifically NER).
I have my own datasets.Dataset and a custom BERT model, and I am not using a HF Trainer.

I have tried to follow the steps provided here but they are quite confusing to me...
@madlag Could you by any chance give me further hints/notebooks on how I could use the library to reach my end goal?

Thanks a lot for your help,
Cheers,
Jules

@madlag
Contributor

madlag commented Mar 9, 2021

Hello,

That's cool. I have not tested on NER yet; it will be interesting to check.
Yes, I will expand the steps into a real example so you can see what needs to be done (give me a day or two).
Is your code visible somewhere? If so, I can check that my steps are correct.

Regards,

François

@JulesBelveze
Author

Hey @madlag ,

Thanks for your answer 😄
Awesome! I have started adapting it to my use case, but I'm not 100% sure I'm doing it right.

Yep, I just sent you an invite to the repo (I removed a bunch of files from the original library). Let me know what you think of it, and if you have any suggestions!

Really appreciate your help,
Cheers,
Jules

@madlag
Contributor

madlag commented Mar 10, 2021

Hi Jules!

I pushed a new branch "madlag_fix", but I could not test it completely: I tried to run the experiment, but I lack the dataset files and don't know where to find them. If you tell me where to get them, I can check it more properly ;-)

Regards,

François

@JulesBelveze
Author

Hey @madlag ,

Awesome!! Thanks a lot for your help! I just tried it and training went well! 😄

However, the size of the saved model is identical to the one without fine-pruning... I saw that you shared a notebook in #5; I will have a look at it and let you know if I need further help!

Thanks again François,
Cheers,
Jules

@madlag
Contributor

madlag commented Mar 12, 2021

Hi!
Yes, #5 contains useful stuff for you too.
You will need to tune the sparse parameters to control the sparsity.
When everything goes well, some heads are pruned and the file size of the model is reduced.
But for compatibility with transformers, I cannot change the size of the FFN layers in the disk-serialized version, because transformers would refuse to load it.
So you have to call "optimize_model" after loading to cut out the empty parts, as in the example notebook.
Don't hesitate to ping me: the library still lacks documentation, finding the right parameters is not completely trivial right now, and it will actually help me to know which parts most need documenting!
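To illustrate why the saved model stays the same size until the empty parts are cut: zeroed rows in a weight matrix are still serialized in full, and only physically dropping them shrinks the file. The snippet below is a plain-Python conceptual stand-in for that post-load optimization step, not the library's actual API; serialized_size and strip_empty_rows are hypothetical helpers.

```python
# Toy stand-in for a pruned FFN weight matrix: zeroed rows vs. removed rows.

def serialized_size(matrix):
    """Number of stored floats; a stand-in for the on-disk size."""
    return sum(len(row) for row in matrix)

def strip_empty_rows(matrix):
    """Stand-in for post-load optimization: drop rows that are entirely zero."""
    return [row for row in matrix if any(v != 0.0 for v in row)]

# A 4x3 weight matrix where pruning zeroed out two rows.
pruned = [
    [0.5, -0.2, 0.1],
    [0.0, 0.0, 0.0],   # pruned row, but still stored on disk
    [0.3, 0.7, -0.4],
    [0.0, 0.0, 0.0],   # pruned row, but still stored on disk
]

print(serialized_size(pruned))      # 12 floats serialized, zeros included
optimized = strip_empty_rows(pruned)
print(serialized_size(optimized))   # 6 floats once empty rows are cut out
```

The real optimization step works on the loaded model's tensors, but the principle is the same: the checkpoint keeps the original shapes for loader compatibility, and the shrinking happens in memory after loading.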

Regards,

François

@JulesBelveze
Author

Hey Francois,

Awesome, I'm now playing around with the parameters to check how they affect performance and how much they shrink the model! Next step: investigate how to add distillation to it :)

Cheers,
Jules
