
Some questions about the EOG model in the paper #7

Open
crystal-xu opened this issue Jun 27, 2020 · 3 comments

crystal-xu commented Jun 27, 2020

Hi,

I am also trying to adapt the EoG model to the DocRED dataset. I converted the DocRED data to the expected input format, set the batch size to 1, the number of epochs to 200, and the learning rate to 0.01, and kept the PubMed word embeddings used by the original model. However, after the 200 training epochs finished, the result on the dev set was as follows:
TEST | LOSS = 0.20529, ACC = 0.9715 , MICRO P/R/F1 = 0.6678 0.0356 0.0676 | TP/ACTUAL/PRED = 410 /11518 /614 , TOTAL 396866 | 0h 00m 28s

It seems that the recall is very low. I suspect the root cause is the small batch size (effectively SGD in this case). However, setting the batch size to 2 leads to a "CUDA out of memory" error, since the original model only supports a single-GPU setup.

I am considering adapting the original model to a multi-GPU setup, but I am not sure whether that would work. Would you mind telling me whether you made any special modifications to the original model? Or did you only tune the hyperparameters, or use different pre-trained word embeddings, e.g. GloVe?
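
What I have in mind is roughly the sketch below, using torch.nn.DataParallel. The model here is only a toy placeholder, not the actual EoG code, and I have not tested this:

```python
import torch
import torch.nn as nn

# Toy placeholder standing in for the EoG model.
model = nn.Linear(10, 2)

# Replicate the model on every visible GPU; each forward pass splits the
# batch across devices and gathers outputs (and gradients) on GPU 0.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
model = model.to("cuda")

inputs = torch.randn(4, 10).to("cuda")  # a batch of 4 is sharded across the GPUs
outputs = model(inputs)                 # shape: (4, 2)
```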

Thanks very much.

@nanguoshun (Owner)

Hi @crystal-xu, it also took us a lot of time to understand the EoG code, and we only reused its data preprocessing part. I suggest asking Fenia for the details of the EoG model. For the "CUDA out of memory" issue, you can use gradient accumulation if you want to increase the effective batch size on a small GPU.
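
Roughly, gradient accumulation looks like the minimal PyTorch sketch below (the model, loader, and optimizer are toy placeholders, not our actual EoG/DocRED code):

```python
import torch
import torch.nn as nn

# Toy stand-ins; in practice these would be the EoG model and the DocRED loader.
model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loader = [(torch.randn(1, 10), torch.randint(0, 2, (1,))) for _ in range(8)]

accumulation_steps = 4  # effective batch size = per-step batch size * 4

optimizer.zero_grad()
for step, (inputs, labels) in enumerate(loader):
    loss = criterion(model(inputs), labels) / accumulation_steps  # scale so the update averages over the accumulated steps
    loss.backward()                                               # gradients accumulate in each parameter's .grad
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()        # one optimizer update per `accumulation_steps` mini-batches
        optimizer.zero_grad()   # reset the accumulated gradients
```

This keeps the per-step memory footprint of batch size 1 while the parameter updates behave like a larger batch.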

We tried both the PubMed embeddings and the GloVe embeddings and got very similar F1 scores on CDR.


crystal-xu commented Jul 6, 2020

Hi @nanguoshun, thanks for your reply.

What I intend to do is to try EoG on DocRED directly. I noticed that your paper reports an F1 score of about 0.52 for the EoG model. So you implemented the same model yourself as your baseline, rather than just tuning the hyperparameters? I have reached an F1 score of about 0.47 so far, and I am wondering how you achieved such a good score.

@nanguoshun (Owner)

Hi @crystal-xu. Yes, we adapted the EoG model to the DocRED dataset. We may release the code in the future; it will take some time to clean it up.
