Thanks for your great work!
I'm curious why researchers refer to such methods as "scaling test-time computation." From your blog, it seems there is no explicit scaling of test-time computation; rather, models trained with plain PPO simply tend to generate longer answers.
Because the model produces better results by generating many "thinking" tokens before answering, compared to answering ad hoc. Hence: more inference compute is traded for better accuracy.
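To make the trade-off concrete, here is a minimal sketch: vary the generation budget and check whether the answer is recovered. This assumes a Hugging Face causal LM; the model name and the toy string-match correctness check are illustrative, not from the blog.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model choice; any instruction-tuned causal LM works.
MODEL_NAME = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

prompt = "What is 17 * 24? Think step by step, then give the final answer."
inputs = tokenizer(prompt, return_tensors="pt")

# The generation budget is the knob that scales test-time compute:
# too few tokens and the reasoning chain is cut off before the answer.
for budget in (32, 128, 512):
    output = model.generate(
        **inputs,
        max_new_tokens=budget,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
    completion = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=True,
    )
    # Toy check: 17 * 24 = 408 must appear in the completion.
    print(f"budget={budget:4d} tokens -> correct={'408' in completion}")
```

The point of the sketch is that nothing about the model changes between runs; only the number of tokens it is allowed to "think" for does, which is exactly the inference-compute-for-accuracy trade described above.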