About Finetune Dataset? #42

LizzieOneDay · 2020-01-16T02:40:21Z

Hello, was your final model finetuned only on the 1000 ICDAR 2015 training images, or on ICDAR 2015 plus the 229 ICDAR 2013 training images, as described in the paper?

Pay20Y · 2020-01-17T11:16:41Z

Yes, I only use ICDAR15 here. In fact, the paper also used ICDAR2017MLT to pretrain the model, which differs from how my final model was trained.

LizzieOneDay · 2020-01-17T12:44:42Z

I followed your training steps: first training on SynthText, then finetuning on the ICDAR 2015 dataset, but my test results are a little worse than yours.

Pay20Y · 2020-01-18T06:21:39Z

Training an end-to-end model is complicated, and I'm still trying to reproduce the results reported in the paper.

LizzieOneDay · 2020-02-22T09:34:22Z

@Pay20Y Hello, when you finetuned on ICDAR2015, did you train the detection branch and the recognition branch separately? Also, could you give some advice on finetuning? Thank you very much.

Pay20Y · 2020-02-26T12:46:02Z

Sorry for the late response. I think multi-stage training is unnecessary when finetuning. Using detection results (not gt) as the input to RoI Rotate is important, but I haven't implemented it.

LizzieOneDay · 2020-02-26T13:07:52Z

Thank you~~. Do you think that simply replacing the inputs of roi_rotate_part ("pad_rois = roi_rotate_part.roi_rotate_tensor_pad(shared_feature, input_transform_matrix, input_box_masks, input_box_widths)") with values generated from the detected polys would improve training performance?

Pay20Y · 2020-02-27T07:06:44Z

Yes, but it's not a simple replacement. You need to match the detection results to the gt boxes; only then can the recognition branch be optimized.
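For readers following along, here is a minimal sketch of what "matching detection results to the gt boxes" could look like. It is not code from this repository: the function names, the 0.5 IoU threshold, and the use of axis-aligned enclosing rectangles instead of true rotated-quadrilateral overlap are all assumptions made for illustration. The idea is that a detected box only becomes usable for the recognition loss once it inherits the transcription of the gt box it overlaps.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Quad = Sequence[Point]                      # four (x, y) corners of a text box
Rect = Tuple[float, float, float, float]    # (x_min, y_min, x_max, y_max)


def enclosing_rect(quad: Quad) -> Rect:
    """Axis-aligned rectangle enclosing a quadrilateral (a simplification)."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return min(xs), min(ys), max(xs), max(ys)


def rect_iou(a: Rect, b: Rect) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_detections_to_gt(
    det_quads: Sequence[Quad],
    gt_quads: Sequence[Quad],
    gt_texts: Sequence[str],
    iou_thresh: float = 0.5,  # hypothetical threshold, not taken from the repo
) -> List[Tuple[Quad, str]]:
    """Greedily pair each detection with its best-overlapping unused gt box.

    Detections without a sufficiently overlapping gt box are dropped, since
    they have no transcription to supervise the recognition branch.
    """
    matched: List[Tuple[Quad, str]] = []
    used = set()
    for det in det_quads:
        det_rect = enclosing_rect(det)
        best_iou, best_idx = 0.0, None
        for i, gt in enumerate(gt_quads):
            if i in used:
                continue
            iou = rect_iou(det_rect, enclosing_rect(gt))
            if iou > best_iou:
                best_iou, best_idx = iou, i
        if best_idx is not None and best_iou >= iou_thresh:
            used.add(best_idx)
            matched.append((det, gt_texts[best_idx]))
    return matched
```

The matched detected quads, rather than the gt quads, would then be used to build the transform matrices, masks, and widths passed to RoI Rotate, with the paired gt text as the recognition label. For strongly rotated text a real polygon IoU would be more appropriate than enclosing rectangles.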

LizzieOneDay · 2020-02-28T02:39:43Z

Thank you. But the paper (Section 3.3, RoIRotate) has this sentence: "so we use ground truth text regions instead of predicted text regions during training."

Pay20Y · 2020-02-28T07:57:35Z

Yes, the paper did not use multi-stage training either, but multi-stage training really does make sense. You could train the model on the datasets mentioned in the paper (MLT17 + IC13 + IC15) to verify the performance.

LizzieOneDay · 2020-02-28T08:11:47Z

I think the paper says to use gt when applying RoI Rotate.
