I am also trying to adapt the EoG model to the DocRED dataset. I converted DocRED to the required data format, set the batch size to 1, the number of epochs to 200, and the learning rate to 0.01, and just used the PubText word embeddings from the original model. However, after the 200 training epochs finished, the result on the dev set is as follows:
TEST | LOSS = 0.20529, ACC = 0.9715 , MICRO P/R/F1 = 0.6678 0.0356 0.0676 | TP/ACTUAL/PRED = 410 /11518 /614 , TOTAL 396866 | 0h 00m 28s
It seems that the recall is very low. I suspect the root cause is the small batch size (effectively plain SGD in this case). However, setting the batch size to 2 leads to a "CUDA out of memory" error, since the original model only supports a single GPU.
I am considering adapting the original model to a multi-GPU setup, but I am not sure whether that would work. Would you mind telling me whether you made any special modifications to the original model? Or did you just tune the hyperparameters to proper values, or use different pre-trained word embeddings, e.g. GloVe?
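For context, the multi-GPU change I have in mind is roughly the following (a minimal sketch with a placeholder model, not the actual EoG code); as far as I understand, DataParallel splits each batch across GPUs, so it only helps once the batch size is greater than 1:

```python
import torch
import torch.nn as nn

# Placeholder model (not the actual EoG network), just to show the wrapping step.
# nn.DataParallel replicates the model and splits each input batch across the
# visible GPUs, so it does not reduce the memory needed for a single example.
model = nn.Linear(10, 1)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # wrap for multi-GPU data parallelism
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
```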
Thanks very much.
Hi @crystal-xu, it also took us a lot of time to understand the EoG code, and we only reused its data preprocessing part. I suggest you ask Fenia about the details of the EoG model. For the "CUDA out of memory" issue, you can use gradient accumulation if you want to increase the effective batch size on a small GPU.
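A minimal sketch of what I mean by gradient accumulation (the tiny linear model and random data below are placeholders, not the EoG training loop): with a per-step batch size of 1 and accum_steps = 4, you get an effective batch size of 4 without any extra GPU memory.

```python
import torch
import torch.nn as nn

# Placeholder model and data, only to make the sketch self-contained.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

accum_steps = 4  # effective batch size = accum_steps * per-step batch size
optimizer.zero_grad()
for step in range(16):
    x = torch.randn(1, 10)           # one example per step, like batch_size = 1
    y = torch.randn(1, 1)
    loss = loss_fn(model(x), y)
    (loss / accum_steps).backward()  # accumulate scaled gradients across steps
    if (step + 1) % accum_steps == 0:
        optimizer.step()             # one parameter update per accum_steps steps
        optimizer.zero_grad()
```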
We tried both the PubMed embeddings and the GloVe embeddings and got very similar F1 scores on CDR.
What I intend to do is to try EoG on DocRED directly. I notice that in your paper you report an F1 score of about 0.52 for the EoG model. Did you re-implement the same model yourself as your baseline, rather than just tuning the hyperparameters? I have only reached about 0.47 F1 so far, so I am wondering how you achieved such a good score.