Go is a good example of your point because it proves that deep learning and scale are *not* all you need.
There's only one algorithm today that plays Go at a superhuman level: Monte Carlo tree search, and it was invented specifically for Go. In the 90s people used Monte Carlo methods for Go bots, in the 00s people invented Monte Carlo tree search for Go bots, and when AlphaGo finally became superhuman it was using MCTS plus neural networks. So all this Go-specific algorithmic research really did pay off.
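For anyone who hasn't seen it, here's a minimal sketch of the MCTS loop (select, expand, simulate, backpropagate). It runs on a toy Nim game as a stand-in for Go, since a real Go implementation wouldn't fit in a comment; the `NimState` class and all names here are made up for illustration, and this is nothing like AlphaGo's actual implementation, which replaces the random playouts with policy/value networks:

```python
import math
import random

# Toy game so the sketch actually runs: Nim with 10 stones, take 1-3,
# the player who takes the last stone wins. A hypothetical stand-in for Go.
class NimState:
    def __init__(self, stones=10, player=1):
        self.stones, self.player = stones, player
    def legal_moves(self):
        return [n for n in (1, 2, 3) if n <= self.stones]
    def play(self, move):
        return NimState(self.stones - move, -self.player)
    def is_terminal(self):
        return self.stones == 0
    def winner(self):
        return -self.player  # the player who just took the last stone wins

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.visits, self.wins = [], 0, 0.0
        self.untried = state.legal_moves()
    def ucb1(self, c=1.4):
        # Exploitation term plus exploration bonus for rarely visited children.
        return self.wins / self.visits + c * math.sqrt(math.log(self.parent.visits) / self.visits)

def mcts(root_state, iterations=2000):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend via UCB1 until a node with untried moves.
        while not node.untried and node.children:
            node = max(node.children, key=Node.ucb1)
        # 2. Expansion: add one child for an untried move.
        if node.untried:
            move = node.untried.pop()
            node.children.append(Node(node.state.play(move), node, move))
            node = node.children[-1]
        # 3. Simulation: random playout to the end of the game
        #    (AlphaGo replaces this with neural network evaluation).
        state = node.state
        while not state.is_terminal():
            state = state.play(random.choice(state.legal_moves()))
        winner = state.winner()
        # 4. Backpropagation: credit each node from the perspective
        #    of the player who moved into it.
        while node:
            node.visits += 1
            if node.parent and winner == node.parent.state.player:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda n: n.visits).move

print(mcts(NimState()))  # should find the optimal move, 2 (leaving a multiple of 4)
```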
A pure learning approach, like just a policy network with no search, is okay at Go but not at the level of top humans.
Similarly, the best chess algorithm today is alpha-beta tree search, which was invented specifically for chess.
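And for symmetry, a textbook alpha-beta sketch in negamax form, on the same toy Nim game as above. Real engines like Stockfish layer a lot on top of this (move ordering, transposition tables, and nowadays a neural evaluation), so this is just the bare pruning idea, not how Stockfish is written:

```python
import math

def negamax(stones, alpha=-math.inf, beta=math.inf):
    """Alpha-beta search on the Nim toy game (take 1-3 stones, taking
    the last stone wins). Returns +1 if the side to move wins with
    perfect play, -1 otherwise."""
    if stones == 0:
        return -1  # side to move has nothing left: the previous player won
    best = -math.inf
    for move in (1, 2, 3):
        if move > stones:
            break
        best = max(best, -negamax(stones - move, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # cutoff: the opponent will never allow this line
    return best

print(negamax(10))  # +1: the first player wins 10-stone Nim
```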
The part that seems to always get replaced as AI-based systems scale is "feature engineering", not the algorithmic search design.
Do you have a citation for alpha-beta tree search being SOTA at chess? I thought MCTS was! (But I'm not familiar with the chess literature.)
I don't get how search is a counterexample. I think that I (and Rich) would both consider search to be a great example of a method that scales with compute, which is exactly the point of the Bitter Lesson. The Bitter Lesson says nothing about deep learning!
Stockfish is generally winning TCEC over Leela; see https://en.wikipedia.org/wiki/Top_Chess_Engine_Championship. There is not much "literature" on it; they just routinely hold competitions.
I don't mean to bring all this up as a counterexample; I'm agreeing with your "point two", that "the idea that deep learning and scale are all we need" is not correct.
To me it breaks down like this: designing specialized *search algorithms* for a domain has proven to scale, while specialized *feature engineering* gets surpassed by generic AI techniques.
Ah yeah, Stockfish, of course.
The exponential scaling hype is due for its sinusoidal AI winter at some unpredictable point, in a system where Nietzsche envy is converted to ambition, which is then converted to money in the hyper-leveraged economy? Say HI to your dad for ME. Cheers.
How can an essay be great if it is misunderstood so often? That’s pretty much the definition of a bad essay.
Further, the essay says almost verbatim what you claim it does not: compute is set up against human knowledge. E.g., quoting: "Time spent on one is time not spent on the other."
That being said, the essay makes use of many cherry-picked examples, and some of them are not faithful to the actual state of the art. For example, many problems in CV don't use learning at all.