Cadence - Evolving Code with LLMs for NP-Hard Problems

Introduction

There are very few papers that have moved me as much as [AlphaEvolve](). The paper pushed past the RL-pilled frontier of the current AI research landscape and proved, by discovering novel algorithms, just how much performance we can still squeeze out of LLMs by improving solutions to hard problems and evolving them to discover new ones. Here's what @karpathy has to say about it, and it is more or less what AlphaEvolve does:

>Scaling up RL is all the rage right now. I'm fairly certain RL will continue to yield more intermediate gains, but I also don't expect it to be the full story.

>...

>There are significantly more bits of supervision we could extract per rollout via a review/reflect stage. The human mechanism of improvement is along the lines of "what went well? what didn't go so well? what should I try next time?" etc., and the lessons from this stage feel explicit, like a new string to be added to the system prompt for the future, optionally to be distilled into weights (/intuition) later, a bit like sleep.

>...

>Example algorithm: given a task, do a few rollouts, stuff them all into one context window (along with the reward in each case), use a meta-prompt to review/reflect on what went well or not to obtain string "lesson", to be added to system prompt (or more generally modify the current lessons database). Many blanks to fill in, many tweaks possible, not obvious.
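The loop sketched in that quote can be written down directly. Here is a rough, minimal sketch with `run_rollout` and `reflect` as hypothetical stand-ins for the actual LLM calls (these names and the loop structure are my own filling of the "many blanks", not a reference implementation):

```python
def evolve_lessons(task, run_rollout, reflect, lessons=None,
                   iterations=3, rollouts_per_iter=4):
    """Karpathy-style loop: do rollouts, reflect on them, grow a lessons database."""
    lessons = list(lessons or [])
    for _ in range(iterations):
        # Do a few rollouts under the current lessons, keeping each attempt + reward.
        results = [run_rollout(task, lessons) for _ in range(rollouts_per_iter)]
        # Meta-prompt step: review all (attempt, reward) pairs to extract a string "lesson".
        lesson = reflect(task, results, lessons)
        if lesson:
            lessons.append(lesson)  # added to the system prompt / lessons database
    return lessons
```

The interesting design freedom is all inside `reflect`: it sees every rollout and its reward in one context window, which is exactly where the extra "bits of supervision per rollout" come from.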

When I was going through the paper, something struck me. The paradigms introduced in AlphaEvolve can be used to make progress on *optimization problems* where there is no exact "algorithmic breakthrough" to be had, but rather a heuristic or cleverness breakthrough, which is not something that can be learned through the pattern recognition prevalent in LLMs.

NP-hard problems, by definition, have no known polynomial-time solutions. They are practically unsolvable for larger instances and call for approaches cleverer than brute force. Since exact algorithms are computationally prohibitive, the two main strategies for attacking NP-hard problems are *approximation* and *heuristics*. Neither front is technically easier for LLMs (or any other system) to excel on, as both require extensive knowledge of the problem domain and often demand manual tuning.
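As one concrete illustration of the heuristic route (my own example, not one from the paper), the classic nearest-neighbour construction for TSP builds a tour in O(n²) time with no optimality guarantee, which is exactly the kind of starting point an evolutionary loop might then improve:

```python
import math

def tour_length(points, tour):
    """Total length of a closed tour over 2-D points."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def nearest_neighbour_tour(points, start=0):
    """Greedy O(n^2) heuristic: always visit the closest unvisited city next."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda j: math.dist(points[last], points[j]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour
```

The greedy choice is cheap but myopic: it can paint itself into a corner and leave one long closing edge, which is why such heuristics need the domain knowledge and tuning mentioned above.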

"Cadence" is an attempt to understand evolutionary problem solving by studying how solutions evolve over generations of improvement, and whether or not that evolution can lead to newer, improved solutions for NP-hard problems. The solutions generated by LLMs feed directly into the idea of Asymmetric Verifiability: for TSP, a candidate tour is much easier to verify than to find. Asymmetric verification and programming problems go hand in hand, as it's tedious to read code and check its correctness, but if you have test cases with ample coverage, you can quickly check any given solution.
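To make the asymmetry concrete, here is a minimal sketch on a toy instance of my own (not from the post): checking a proposed tour takes linear time, while exact solving by enumeration takes factorial time and becomes prohibitive beyond roughly ten cities:

```python
import itertools
import math

def is_valid_tour(n, tour):
    """O(n) verification: a tour is valid iff it visits every city exactly once."""
    return sorted(tour) == list(range(n))

def tour_length(points, tour):
    """O(n) scoring: total length of the closed tour."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def brute_force_tsp(points):
    """Exact solve by enumerating all (n-1)! tours fixing city 0 as the start."""
    n = len(points)
    candidates = ((0,) + p for p in itertools.permutations(range(1, n)))
    return min(candidates, key=lambda t: tour_length(points, t))
```

A verifier like this is all the evolutionary loop needs: any LLM-generated tour can be validated and scored instantly, even when finding a better one is hard.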

For the purposes of this blog, I chose the *Travelling Salesman Problem* (TSP) and analyzed how its solutions evolve. TSP is a classical optimization problem whose exact solutions traditionally require exponential time to compute (but far less time to verify!). [According to Jason Wei](https://www.jasonwei.net/blog/asymmetry-of-verification-and-verifiers-law), all tasks that have the properties defined below will be solved by AI (*Verifier's Law*):
