A stupid question: Why do researchers call such methods "scaling test-time computation"? #1

Open
huanranchen opened this issue Jan 26, 2025 · 1 comment

Comments

@huanranchen

Thanks for your great work!
I'm curious about why researchers refer to such methods as "scaling test-time computation." From your blog, it seems there isn't an explicit scaling of test-time computation. It appears that models trained with simple PPO tend to generate longer answers.

@Benjoyo

Benjoyo commented Jan 26, 2025

Because the models produce better results when they generate many "thinking" tokens than when they answer directly. Hence, more inference compute is traded for better accuracy.
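To make the compute trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes the common approximation that a forward pass costs about 2 FLOPs per parameter per generated token (attention cost ignored). The 7B model size and the token counts are hypothetical numbers chosen for illustration, not figures from the blog.

```python
def inference_flops(n_params: float, n_tokens: int) -> float:
    """Rough generation cost: ~2 FLOPs per parameter per generated token.

    This is an approximation that ignores attention over the context.
    """
    return 2.0 * n_params * n_tokens


# Hypothetical numbers, for illustration only.
n_params = 7e9            # assumed 7B-parameter model
direct_tokens = 50        # short, direct answer
thinking_tokens = 2000    # long answer with "thinking" tokens

direct = inference_flops(n_params, direct_tokens)
with_thinking = inference_flops(n_params, thinking_tokens)
ratio = with_thinking / direct

print(f"direct answer:  {direct:.2e} FLOPs")
print(f"with thinking:  {with_thinking:.2e} FLOPs")
print(f"compute ratio:  {ratio:.0f}x")
```

Under this approximation, the test-time compute grows linearly with the number of generated tokens, so a model that learns to write longer answers is spending more compute at inference even though nobody set a compute budget explicitly.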
