
question for incremental training #15

Open
AIvisionman opened this issue Jan 4, 2025 · 2 comments

Comments

@AIvisionman

Hi, are the old parameters frozen (kept fixed) when doing incremental training?

@Haiyang-W
Owner

In my experiments, training the old parameters simultaneously yields slightly better final performance, but training becomes more unstable and the loss is prone to exploding.
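For reference, the "freeze old, train new" setup under discussion can be sketched in PyTorch. This is a hypothetical toy layer, not the repository's actual implementation: the class name `IncrementalPattention`, the plain softmax attention over parameter tokens, and all dimensions are illustrative assumptions. The point it demonstrates is that setting `requires_grad=False` on the pretrained parameter tokens preserves them exactly, so gradients only flow into the newly appended tokens.

```python
import torch
import torch.nn as nn

class IncrementalPattention(nn.Module):
    """Toy sketch (hypothetical) of attention over parameter tokens,
    where old tokens are frozen and only new tokens are trained."""

    def __init__(self, dim, n_old, n_new):
        super().__init__()
        # Pretrained parameter tokens: frozen to preserve past knowledge.
        self.key_old = nn.Parameter(torch.randn(n_old, dim), requires_grad=False)
        self.value_old = nn.Parameter(torch.randn(n_old, dim), requires_grad=False)
        # Newly appended parameter tokens: trainable (zero-init is one common choice).
        self.key_new = nn.Parameter(torch.randn(n_new, dim) * 0.02)
        self.value_new = nn.Parameter(torch.zeros(n_new, dim))

    def forward(self, x):
        # Concatenate old and new parameter tokens, then attend over them.
        keys = torch.cat([self.key_old, self.key_new], dim=0)
        values = torch.cat([self.value_old, self.value_new], dim=0)
        attn = torch.softmax(x @ keys.t() / keys.shape[-1] ** 0.5, dim=-1)
        return attn @ values

layer = IncrementalPattention(dim=8, n_old=4, n_new=2)
x = torch.randn(3, 8)
out = layer(x)
out.sum().backward()

# Only the new parameter tokens accumulate gradients; the old ones stay untouched.
assert layer.key_old.grad is None and layer.key_new.grad is not None
```

Unfreezing the old tokens (i.e., leaving `requires_grad=True` on them) corresponds to the "train everything simultaneously" variant described above, which trades a small performance gain for less stable training.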

@AIvisionman
Author

I intuitively believe that Tokenformer has incremental capabilities because the parameters learned in the past can be retained, and past knowledge is thereby preserved. But if the old parameters are also updated on a new task, I'm having trouble understanding how the model can scale in size without losing capability. Could you clarify this further, please?
