I am also trying to adapt the EoG model to the DocRED dataset. I converted DocRED to the required data format, set the batch size to 1, the number of epochs to 200, and the learning rate to 0.01, and just used the PubText word embeddings from the original model. However, after the 200 training epochs finished, the result on the dev set is as follows:
TEST | LOSS = 0.20529, ACC = 0.9715 , MICRO P/R/F1 = 0.6678 0.0356 0.0676 | TP/ACTUAL/PRED = 410 /11518 /614 , TOTAL 396866 | 0h 00m 28s
It seems that the recall is very low. I suspect the root cause is the small batch size (effectively plain SGD in this case). However, setting the batch size to 2 leads to a "CUDA out of memory" error, since the original model only supports a single GPU.
I am considering adapting the original model to a multi-GPU setup, but I am not sure whether that would work. Would you mind telling me whether you made any special modifications to the original model? Or did you just tune the hyperparameters to proper values, or use different pre-trained word embeddings, e.g. GloVe?
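For context, the multi-GPU change I have in mind is roughly the following (a minimal sketch with a placeholder model, not the actual EoG code); as far as I understand, DataParallel splits each batch across GPUs, so it only helps once the batch size is greater than 1:

```python
import torch
import torch.nn as nn

# Placeholder model (not the actual EoG network), just to show the wrapping step.
# nn.DataParallel replicates the model and splits each input batch across the
# visible GPUs, so it does not reduce the memory needed for a single example.
model = nn.Linear(10, 1)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # wrap for multi-GPU data parallelism
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
```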
Thanks very much.
Hi @crystal-xu, it also took us a lot of time to understand the EoG code, and we only reused its data preprocessing part. I suggest you ask Fenia about the details of the EoG model. For the "CUDA out of memory" issue, you can use gradient accumulation if you want to increase the effective batch size on a small GPU.
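A minimal sketch of what I mean by gradient accumulation (the tiny linear model and random data below are placeholders, not the EoG training loop): with a per-step batch size of 1 and accum_steps = 4, you get an effective batch size of 4 without any extra GPU memory.

```python
import torch
import torch.nn as nn

# Placeholder model and data, only to make the sketch self-contained.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

accum_steps = 4  # effective batch size = accum_steps * per-step batch size
optimizer.zero_grad()
for step in range(16):
    x = torch.randn(1, 10)           # one example per step, like batch_size = 1
    y = torch.randn(1, 1)
    loss = loss_fn(model(x), y)
    (loss / accum_steps).backward()  # accumulate scaled gradients across steps
    if (step + 1) % accum_steps == 0:
        optimizer.step()             # one parameter update per accum_steps steps
        optimizer.zero_grad()
```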
We tried both the PubMed embeddings and the GloVe embeddings and got very similar F1 scores on CDR.
What I intend to do is to try EoG on DocRED directly. I notice that in your paper you report an F1 score of about 0.52 for the EoG model. Did you re-implement the same model yourself as your baseline, rather than just tuning the hyperparameters? I have only reached about 0.47 F1 so far, so I am wondering how you achieved such a good score.