Back to homepage Published April 28th 2026.

Compute Allocation for AI Discovery and Search

Dmitry Rybin

Graphic illustration for compute allocation

TL;DR: I argue that AI systems for search and discovery should be designed around three compute budgets: \(C_{train}\), \(C_{infer}\), and \(C_{exact}\).

Inference-Time Compute

Discussion of inference-time compute has become mainstream since the rise of reasoning LLMs (o1, R1, o3). However, inference-time compute scaling is a classical idea that dates back to early AI systems. It is explicitly mentioned in:

  • Claude Shannon's 1950 essay "Programming a Computer for Playing Chess"1;
  • Monte Carlo Tree Search in AlphaGo by DeepMind in 20162;
  • Inference-time search for Poker Bots, by Noam Brown in 20183.

As scaling laws were gaining popularity, researchers started to explicitly include \(C_{infer}\) budget in system design. Experimentally, inference-time compute turned out to be as important as train-time compute. Scaling laws predict that theoretically optimal allocation has a constant trade-off between \(C_{train}\) and \(C_{infer}\).4

AI Discovery and Search Systems

Now consider AI systems designed to find the best solution in a large search space, discover a natural-language proof of a theorem, or find the fastest implementation of matrix multiplication on a B200 GPU. In the design of such AI systems for search and discovery, there is another compute component \(C_{exact}\). This compute is spent on executing exact algorithms and external tools. I propose to explicitly consider three components of the compute budget in such system design:

  • \(C_{train}\) - compute allocated to neural network training;
  • \(C_{infer}\) - compute allocated to improving model outputs at inference time, including inference-time search, Chain-of-Thought, majority voting, transformer layer looping, and multi-agent scaling;
  • \(C_{exact}\) - compute allocated to exact algorithms, code execution, string matching, compilers, formal verification, exact solvers.

This explicit decomposition changes how we design and compare AI systems. A system may improve not because the model became better, but because more compute was moved into exact verification, execution, and search. Conversely, a strong system can underperform if \(C_{exact}\) is ignored. See examples:

Application \(C_{train}\) \(C_{infer}\) \(C_{exact}\)
Chess learning approximations of policy and value functions MCTS, lookahead, policy rollout pruning of suboptimal moves, forced mate search with \(K\) moves, endgame database
Coding Agents LLM pre-training, post-training, RL Chain-of-thought, consensus, looped transformers, multi-agent search fixed agent logic, code execution, tool calls, string matching, compilers, verifiers
Theorem Proving Agents LLM pre-training, post-training, RL Chain-of-thought, multi-agent search enumeration of approaches5, numerical experiments, tool calls
Combinatorial Search learning heuristics, branching, value estimates tree search scaling, lookahead, rollout SAT/MIP solvers, presolve, exact solution of small sub-problems

It is tempting to treat \(C_{exact}\) as a subset of \(C_{infer}\). I would argue against it. \(C_{infer}\) relies on trained model weights to improve model output: looping of middle transformer layers, generating more reasoning tokens, majority voting and other inference-time strategies. Unlike \(C_{exact}\), it can scale independently of what tools and algorithms we have access to.

Readers coming from symbolic AI may think that \(C_{exact}\) is the same as \(C_{symbolic}\), but I think it is misleading. For example, specialized tool calls may not rely on symbolic reasoning.

Scaling Laws across three components

How should a fixed compute budget be allocated across these components? I propose a very strong conjecture.

Master Conjecture: For every fixed problem class \(\mathcal{P}\), the optimal allocation of the total compute budget \(C\) is a fixed-ratio allocation

\(C_{train}^{*}, C_{infer}^{*}, C_{exact}^{*} = \alpha_{\mathcal{P}}C, \beta_{\mathcal{P}}C, \gamma_{\mathcal{P}}C\)

where \(\alpha_{\mathcal{P}} + \beta_{\mathcal{P}} + \gamma_{\mathcal{P}} = 1\) are constants that depend only on the problem class \(\mathcal{P}\).

In my PhD Thesis "Machine Learning-based Search in Combinatorial Optimization" I prove the Master Conjecture for some tree-search models. These models are highly simplified, and it might be true that the optimal allocation has a different behavior depending on the problem class \(\mathcal{P}\). In some cases it may be true that \(C_{exact}^{*}\) scales as \(O(C)\) but in others only as \(O(C / \log C)\) or even \(O(\log C)\). Nevertheless, it is interesting that empirical experiments in combinatorial games like Hex and Chess show the same form of scaling laws.

Example 1: Chess AI Engine

Public numerical data on Chess Engines like Stockfish reports an \(\approx 26\) ELO gain from doubling of train compute (as described in SFNNv6), \(\approx 110\) ELO gain from doubling of inference-time compute (obtained from data on 20% time odds ELO change), and \(\approx 3\) ELO gain from doubling of exact compute (Syzygy tablebases). It follows that the optimal ratios can be computed by maximizing the ELO:

\(\mathrm{Maximize\;\;} E(C) \approx 26 \log_2 C_{train} + 110 \log_2 C_{infer} + 3 \log_2 C_{exact}\)

\(\mathrm{s.t.\;\;\;} C_{train} + C_{infer} + C_{exact} = C\)

Solving this optimization problem gives:

\(C_{train} : C_{infer} : C_{exact} \approx 18\% : 80\% : 2\%\)

Note that training happens once, while test-time inference and exact algorithms are used every game. In practice, people don't allocate 4 times more compute on inference for a single game than on training. If you know the number of games \(N\) in advance, it is easy to find that the optimal allocation budget changes to \((18N : 80 : 2)\). Note that in scientific discovery applications, inference may happen only a couple of times.

Example 2: AI for Science, AlphaTensor, AlphaEvolve

DeepMind has built many AI systems for search and discovery: AlphaZero, AlphaFold, AlphaTensor6, AlphaEvolve7, AlphaProof8. Some of these results were later improved9 by adding classical CS algorithms and search techniques. Here I provide a table that shows compute allocation in these systems, highlights potential design improvements, and shows how \(C_{exact}\) clarifies complex design decisions.

System Allocation Structure Possible Improvements
AlphaTensor Large \(C_{train}\) and \(C_{infer}\) for neural network training and tree search. \(C_{exact}\) is near zero, only used for data augmentation and permutations. A natural extension would be to combine tensors and expressions proposed by a neural network with exact classical algorithms such as MILP solvers and flip graph search.
AlphaEvolve Large \(C_{train}\) for LLM pre-training, post-training, RL. Large \(C_{infer}\) for reasoning model inference. Task-dependent \(C_{exact}\) budget. Initially FunSearch was used to sample programs that generate solutions. With this design, \(C_{exact}\) would only be used for program evaluation. Instead, in AlphaEvolve DeepMind introduced a new kind of \(C_{exact}\) scaling: they sample search algorithms, which are then executed with a significant compute budget.
AlphaProof Large \(C_{train}\) for training. Large \(C_{infer}\) for reasoning models and tree search. Task-dependent \(C_{exact}\) for compilation with Lean kernel, tactics execution. Recent Lean Coding agents by Harmonic and AxiomMath likely include significantly more \(C_{exact}\) budget for: fixing Lean compilation errors, decomposing into Lemmas, and stronger compilation engines.

Connections to Other Topics

The idea of \(C_{exact}\) compute allocation is related to many other ongoing discussions and phenomena in AI:

  • François Chollet's ideas on Program Synthesis and Neurosymbolic AI;
  • Sebastian Pokutta's blog "Not every discovery needs an LLM"10;
  • Leaked Claude Code implementation showed reliance on many exact string matching and context processing algorithms;
  • Whenever frontier AI labs report math benchmarks, the numbers with tools and without tools differ substantially. This can be viewed as gains from \(C_{exact}\).

Notes

  1. Claude E. Shannon, Programming a computer for playing chess, Philosophical Magazine, 1950.
  2. David Silver et al., Mastering the game of Go with deep neural networks and tree search, Nature, 2016. David Silver et al., Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, arXiv, 2017.
  3. Noam Brown and Tuomas Sandholm, Superhuman AI for heads-up no-limit poker: Libratus beats top professionals, Science, 2018.
  4. Charlie Snell et al., Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters, arXiv, 2024. Andy L. Jones, Scaling Scaling Laws with Board Games, arXiv, 2021.
  5. Michael P. Brenner, Vincent Cohen-Addad, and David P. Woodruff, Solving an Open Problem in Theoretical Physics using AI-Assisted Discovery, arXiv, 2026.
  6. Alhussein Fawzi et al., Discovering faster matrix multiplication algorithms with reinforcement learning, Nature, 2022.
  7. Alexander Novikov et al., AlphaEvolve: A coding agent for scientific and algorithmic discovery, arXiv, 2025. Google DeepMind, AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms, May 14, 2025.
  8. Thomas Hubert et al., Olympiad-level formal mathematical reasoning with reinforcement learning, Nature, 2026. Google DeepMind, AI achieves silver-medal standard solving International Mathematical Olympiad problems, July 25, 2024.
  9. Timo Berthold, FICO Xpress Optimization Surpasses AlphaEvolve's Achievements, FICO Blog, June 2025. Manuel Kauers and Jakob Moosbauer, The FBHHRBNRSSSHK-Algorithm for Multiplication in \(\mathbb{Z}_2^{5\times5}\) is still not the end of the story, arXiv, 2022.
  10. Sebastian Pokutta, Not every discovery needs an LLM, blog post.

If you want to cite this work

@misc{Rybin2026ComputeAllocation,
  author = {Rybin, Dmitry},
  title = {Compute Allocation for AI Discovery and Search},
  year = {2026},
  howpublished = {\url{https://rybindmitry.github.io/blogs/compute-allocation.html}},
  note = {Blog post}
}