
question for incremental training #15

Open
AIvisionman opened this issue Jan 4, 2025 · 2 comments

Comments

@AIvisionman

Hi, are the old parameters frozen (kept fixed) when doing incremental training?

@Haiyang-W
Owner

In my experiments, training the old parameters simultaneously yields slightly better final performance, but training becomes more unstable and the loss is prone to exploding.
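For reference, the "freeze old, train new" setup under discussion can be sketched in PyTorch. This is a hypothetical toy layer, not the repository's actual implementation: the class name `IncrementalPattention`, the plain softmax attention over parameter tokens, and all dimensions are illustrative assumptions. The point it demonstrates is that setting `requires_grad=False` on the pretrained parameter tokens preserves them exactly, so gradients only flow into the newly appended tokens.

```python
import torch
import torch.nn as nn

class IncrementalPattention(nn.Module):
    """Toy sketch (hypothetical) of attention over parameter tokens,
    where old tokens are frozen and only new tokens are trained."""

    def __init__(self, dim, n_old, n_new):
        super().__init__()
        # Pretrained parameter tokens: frozen to preserve past knowledge.
        self.key_old = nn.Parameter(torch.randn(n_old, dim), requires_grad=False)
        self.value_old = nn.Parameter(torch.randn(n_old, dim), requires_grad=False)
        # Newly appended parameter tokens: trainable (zero-init is one common choice).
        self.key_new = nn.Parameter(torch.randn(n_new, dim) * 0.02)
        self.value_new = nn.Parameter(torch.zeros(n_new, dim))

    def forward(self, x):
        # Concatenate old and new parameter tokens, then attend over them.
        keys = torch.cat([self.key_old, self.key_new], dim=0)
        values = torch.cat([self.value_old, self.value_new], dim=0)
        attn = torch.softmax(x @ keys.t() / keys.shape[-1] ** 0.5, dim=-1)
        return attn @ values

layer = IncrementalPattention(dim=8, n_old=4, n_new=2)
x = torch.randn(3, 8)
out = layer(x)
out.sum().backward()

# Only the new parameter tokens accumulate gradients; the old ones stay untouched.
assert layer.key_old.grad is None and layer.key_new.grad is not None
```

Unfreezing the old tokens (i.e., leaving `requires_grad=True` on them) corresponds to the "train everything simultaneously" variant described above, which trades a small performance gain for less stable training.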

@AIvisionman
Author

I intuitively believe that Tokenformer has incremental capabilities because the parameters learned in the past can be retained, and past knowledge is thereby preserved. But if the old parameters are also updated on a new task, I'm having trouble understanding how the model can scale in size without losing capability. Could you clarify this further, please?
