Ethan Caballero–Scale is All You Need

The Inside View

51:54

Ethan is known on Twitter as the edgiest person at MILA. We discuss all the gossip around scaling large language models in what will later be known as the Edward Snowden moment of Deep Learning. In his free time, Ethan is a Master's student at MILA in Montreal, and has published papers on out-of-distribution generalization and robustness, accepted as oral and spotlight presentations at ICML and NeurIPS. He has recently been thinking about scaling laws, both as an organizer of and speaker at the 1st Neural Scaling Laws Workshop.

Transcript: https://theinsideview.github.io/ethan

YouTube: https://youtu.be/UPlv-lFWITI

Michaël: https://twitter.com/MichaelTrazzi

Ethan: https://twitter.com/ethancaballero

Outline

(00:00) highlights

(00:50) who is Ethan, scaling laws T-shirts

(02:30) scaling, upstream, downstream, alignment and AGI

(05:58) AI timelines, AlphaCode, Math scaling, PaLM

(07:56) Chinchilla scaling laws

(11:22) limits of scaling, Copilot, generative coding, code data

(15:50) YouTube scaling laws, contrastive-type thing

(20:55) AGI race, funding, supercomputers

(24:00) Scaling at Google

(25:10) gossips, private research, GPT-4

(27:40) why Ethan did not update on PaLM, hardware bottleneck

(29:56) the fastest path, the best funding model for supercomputers

(31:14) EA, OpenAI, Anthropic, publishing research, GPT-4

(33:45) a zillion language model startups from ex-Googlers

(38:07) Ethan's journey in scaling, early days

(40:08) making progress on an academic budget, scaling laws research

(41:22) all alignment is inverse scaling problems

(45:16) predicting scaling laws, useful AI alignment research

(47:16) nitpicks about Ajeya Cotra's report, compute trends

(50:45) optimism, conclusion on alignment
