Test-time scaling (TTS) has emerged as a new frontier for scaling the performance of Large Language Models (LLMs). In test-time scaling, LLMs use more computational resources during inference to improve their reasoning process and task performance. Several approaches to TTS have emerged, such as distilling reasoning traces from another model or exploring the vast decoding search space with the help of a verifier. External verifiers or self-verification are crucial for test-time scaling, as they help guide the search over a large reasoning space. Verification for test-time scaling comprises the mechanisms or scoring functions used to evaluate the quality or plausibility of different reasoning paths or solutions produced by the language model during inference, enabling efficient search or selection among them without access to ground-truth labels. This paradigm has emerged as a superior approach owing to parameter-free scaling at inference time and high performance gains. Verifiers can be prompt-based or fine-tuned as discriminative or generative models, and they can verify process paths, outcomes, or both. Despite their widespread adoption, there is no detailed collection, clear categorization, or discussion of the diverse verification approaches and their training mechanisms. In this survey, we cover the diverse approaches in the literature and present a unified view of verifier training, verifier types, and their utility in test-time scaling.
In parallel scaling, the model generates multiple independent outputs simultaneously, often by varying sampling temperature or prompt exemplars to induce diversity (Levy et al., 2023; Brown et al., 2024). These outputs form a candidate set $S = \{s_1, \ldots, s_k\}$, from which a selection mechanism $V$ identifies the final answer $s^* = V(S)$.
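The selection step $s^* = V(S)$ can be illustrated with a minimal Python sketch. It assumes that $V$ picks the candidate with the highest verifier score, which is only one possible selection mechanism. The names `parallel_scale`, `toy_generate`, and `toy_verifier` are hypothetical and do not come from the survey; the generator and verifier are fixed lookup tables standing in for an LLM call and a learned verifier.

```python
from typing import Callable, List, Tuple


def parallel_scale(
    prompt: str,
    generate: Callable[[str, float], str],
    verifier: Callable[[str], float],
    temperatures: List[float],
) -> Tuple[str, List[str]]:
    """Sample one candidate per temperature, then return the best-scored one."""
    # Candidate set S = {s_1, ..., s_k}: each call is independent of the others.
    candidates = [generate(prompt, t) for t in temperatures]
    # Selection mechanism V (assumed here): argmax of the verifier score.
    best = max(candidates, key=verifier)
    return best, candidates


def toy_generate(prompt: str, temperature: float) -> str:
    # Hypothetical stand-in for an LLM call: a fixed answer per temperature.
    return {0.2: "5", 0.7: "4", 1.0: "22"}[temperature]


def toy_verifier(candidate: str) -> float:
    # Hypothetical stand-in for a learned verifier: fixed plausibility scores.
    return {"5": 0.2, "4": 0.9, "22": 0.1}[candidate]


best, candidates = parallel_scale(
    "What is 2 + 2?", toy_generate, toy_verifier, [0.2, 0.7, 1.0]
)
```

Here the verifier scores each candidate in isolation; a set-level mechanism such as majority voting would instead take the whole list `candidates` as input.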
Sequential scaling, in contrast, decomposes a problem into intermediate steps or sub-questions. Each step builds on the previous one, producing a sequence $\{sq_1, \ldots, sq_T\}$ where each $sq_t = \mathrm{LLM}(sq_{t-1}, c_t)$ depends on the prior reasoning step and contextual information $c_t$.
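The recurrence $sq_t = \mathrm{LLM}(sq_{t-1}, c_t)$ can be sketched as a simple loop. This is an illustration under stated assumptions rather than the survey's implementation: the initial state $sq_0$ is assumed to be the original question, and `sequential_scale` and `toy_step` are hypothetical names, with `toy_step` standing in for an LLM call.

```python
from typing import Callable, List


def sequential_scale(
    question: str,
    step_fn: Callable[[str, str], str],
    contexts: List[str],
) -> List[str]:
    """Produce the steps sq_1..sq_T, each conditioned on the previous step."""
    steps: List[str] = []
    previous = question  # Assumption: sq_0 is the original question.
    for context in contexts:  # context plays the role of c_t
        current = step_fn(previous, context)  # sq_t = LLM(sq_{t-1}, c_t)
        steps.append(current)
        previous = current
    return steps


def toy_step(previous: str, context: str) -> str:
    # Hypothetical stand-in for an LLM call: append the context to the trace.
    return f"{previous} -> {context}"


trace = sequential_scale("q", toy_step, ["a", "b"])
```

Unlike the parallel case, the calls cannot run concurrently, because each step needs the output of the one before it.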
Test-time Scaling Paper Summary
@misc{venkteshvverifiers,
title={Trust but Verify! A Survey on Verification Design for Test-time Scaling},
author={Venktesh V and Mandeep Rathee and Avishek Anand},
year={2025},
eprint={2503.24235},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={},
}