Go is a good example of your point because it proves that deep learning and scale are *not* all you need.
There's only one algorithm today that plays Go at a superhuman level: Monte Carlo tree search, and it was invented specifically for Go. In the 90s people used Monte Carlo methods for Go bots, in the 00s people invented Monte Carlo tree search for Go bots, and when AlphaGo finally became superhuman it was using MCTS plus neural networks. So all this Go-specific algorithmic research really did pay off.
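For anyone who hasn't seen it, here's a minimal sketch of the MCTS loop (select, expand, simulate, backpropagate). It runs on a toy Nim game as a stand-in for Go, since a real Go implementation wouldn't fit in a comment; the `NimState` class and all names here are made up for illustration, and this is nothing like AlphaGo's actual implementation, which replaces the random playouts with policy/value networks:

```python
import math
import random

# Toy game so the sketch actually runs: Nim with 10 stones, take 1-3,
# the player who takes the last stone wins. A hypothetical stand-in for Go.
class NimState:
    def __init__(self, stones=10, player=1):
        self.stones, self.player = stones, player
    def legal_moves(self):
        return [n for n in (1, 2, 3) if n <= self.stones]
    def play(self, move):
        return NimState(self.stones - move, -self.player)
    def is_terminal(self):
        return self.stones == 0
    def winner(self):
        return -self.player  # the player who just took the last stone wins

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.visits, self.wins = [], 0, 0.0
        self.untried = state.legal_moves()
    def ucb1(self, c=1.4):
        # Exploitation term plus exploration bonus for rarely visited children.
        return self.wins / self.visits + c * math.sqrt(math.log(self.parent.visits) / self.visits)

def mcts(root_state, iterations=2000):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend via UCB1 until a node with untried moves.
        while not node.untried and node.children:
            node = max(node.children, key=Node.ucb1)
        # 2. Expansion: add one child for an untried move.
        if node.untried:
            move = node.untried.pop()
            node.children.append(Node(node.state.play(move), node, move))
            node = node.children[-1]
        # 3. Simulation: random playout to the end of the game
        #    (AlphaGo replaces this with neural network evaluation).
        state = node.state
        while not state.is_terminal():
            state = state.play(random.choice(state.legal_moves()))
        winner = state.winner()
        # 4. Backpropagation: credit each node from the perspective
        #    of the player who moved into it.
        while node:
            node.visits += 1
            if node.parent and winner == node.parent.state.player:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda n: n.visits).move

print(mcts(NimState()))  # should find the optimal move, 2 (leaving a multiple of 4)
```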
A pure learning approach, like just a policy network with no search, is okay at Go but not at the level of top humans.
Similarly, the best chess algorithm today is alpha-beta tree search, which was invented specifically for chess.
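And for symmetry, a textbook alpha-beta sketch in negamax form, on the same toy Nim game as above. Real engines like Stockfish layer a lot on top of this (move ordering, transposition tables, and nowadays a neural evaluation), so this is just the bare pruning idea, not how Stockfish is written:

```python
import math

def negamax(stones, alpha=-math.inf, beta=math.inf):
    """Alpha-beta search on the Nim toy game (take 1-3 stones, taking
    the last stone wins). Returns +1 if the side to move wins with
    perfect play, -1 otherwise."""
    if stones == 0:
        return -1  # side to move has nothing left: the previous player won
    best = -math.inf
    for move in (1, 2, 3):
        if move > stones:
            break
        best = max(best, -negamax(stones - move, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # cutoff: the opponent will never allow this line
    return best

print(negamax(10))  # +1: the first player wins 10-stone Nim
```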
The part that seems to always get replaced as AI-based systems scale is "feature engineering", not the algorithmic search design.
Do you have a citation for alpha-beta tree search being SOTA at chess? I thought MCTS was! (But I'm not familiar with the chess literature.)
I don't get how search is a counterexample. I think that I (and Rich) would both consider search to be a great example of a method that scales with compute, which is exactly the point of the Bitter Lesson. The Bitter Lesson says nothing about deep learning!
Stockfish is generally winning TCEC over Leela; see https://en.wikipedia.org/wiki/Top_Chess_Engine_Championship. There is not much "literature" on it; they just routinely hold competitions.
I don't mean to bring all this up as a counterexample; I'm agreeing with your "point two", that "the idea that deep learning and scale are all we need" is not correct.
To me it breaks down like this: designing specialized *search algorithms* for a domain has proven to scale, while specialized *feature engineering* gets surpassed by generic AI techniques.
Ah yeah, Stockfish, of course.
The exponential scaling hype is due for its sinusoidal AI winter at some unpredictable point, in a system where Nietzsche envy is converted to ambition, which is then converted to money in the hyper-leveraged economy? Say HI to your dad for ME. Cheers.
How can an essay be great if it is misunderstood so often? That’s pretty much the definition of a bad essay.
Further, the essay says almost verbatim what you claim it does not: compute is set up against human knowledge. E.g., quoting: "Time spent on one is time not spent on the other."
That being said, the essay makes use of many cherry-picked examples, and some of them are not faithful to the actual state of the art. For example, many problems in CV don't use learning at all.