Yannic Kilcher
 
I make videos about machine learning research papers, programming, and issues of the AI community, and the broader impact of AI in society. Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar (preferred to Patreon): https://www.subscribestar.com/yannickilcher Patre ...
 
 
#llm #ai #chatgpt How does one run inference for a generative autoregressive language model that has been trained with a fixed context size? Streaming LLMs keep the efficiency of windowed attention but avoid its drop in performance by using attention sinks - an interesting phenomenon where the token at position 0 acts as an absorber of "extra"…
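The cache-eviction idea behind streaming inference can be sketched in a few lines. This is a toy illustration, not the paper's implementation: it operates on token positions instead of real key/value tensors, and the sink count and window size are made-up numbers.

```python
# Toy sketch of StreamingLLM-style cache eviction, operating on token
# positions instead of real key/value tensors; sink count and window size
# are made-up numbers.

def evict_cache(cache_positions, n_sinks=4, window=8):
    """Keep the attention-sink tokens plus a sliding window of recent tokens."""
    if len(cache_positions) <= n_sinks + window:
        return list(cache_positions)
    sinks = cache_positions[:n_sinks]    # position 0 (and a few after it) absorb "extra" attention
    recent = cache_positions[-window:]   # plain sliding window over recent context
    return sinks + recent

# Streaming generation: the cache stays bounded no matter how long we run.
kept = evict_cache(list(range(100)))
# kept == [0, 1, 2, 3] + positions 92..99
```

Keeping the first few tokens is the whole trick: with a pure sliding window, evicting the sink token is what causes the performance drop.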
 
#ai #promptengineering #evolution Promptbreeder is a self-improving self-referential system for automated prompt engineering. Give it a task description and a dataset, and it will automatically come up with appropriate prompts for the task. This is achieved by an evolutionary algorithm where not only the prompts, but also the mutation-prompts are i…
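A toy caricature of that evolutionary loop might look as follows. Everything here is an illustrative stand-in: in the real system an LLM applies a mutation-prompt to a task-prompt, fitness is accuracy on the dataset, and the mutation-prompts themselves are evolved as well.

```python
import random

# Toy caricature of a Promptbreeder-style loop. All functions are stand-ins:
# the real system uses an LLM to apply mutation-prompts, and evolves the
# mutation-prompts too.

def fitness(prompt):
    # Hypothetical scorer: pretend "step"-laden prompts do better on the task.
    return prompt.count("step")

def mutate(prompt, mutation_prompt, rng):
    # Stand-in for LLM(mutation_prompt + prompt); in this toy, the chosen
    # mutation-prompt does not actually steer the string edit.
    return prompt + " " + rng.choice(["step", "by", "carefully"])

def evolve(seed_prompt, generations=20, pop_size=8, seed=0):
    rng = random.Random(seed)
    mutation_prompts = ["Rephrase:", "Make more detailed:"]
    population = [seed_prompt] * pop_size
    for _ in range(generations):
        # binary tournament: the child replaces its parent only if it scores higher
        population = [
            max(p, mutate(p, rng.choice(mutation_prompts), rng), key=fitness)
            for p in population
        ]
    return max(population, key=fitness)

best = evolve("Solve the task.")
```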
 
#ai #retnet #transformers Retention is an alternative to Attention in Transformers that can be written both in a parallel and in a recurrent fashion. This means the architecture achieves training parallelism while maintaining low-cost inference. Experiments in the paper look very promising.
OUTLINE:
0:00 - Intro
2:40 - The impossible triangle
6:55 - Pa…
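The parallel/recurrent duality can be checked with a scalar toy model. This is a sketch of the idea under strong assumptions (head dimension 1, no normalization or gating), not RetNet itself: both forms compute the same decayed sum of key-value products.

```python
# Scalar sketch of Retention's dual form: the parallel (attention-like,
# training-friendly) and recurrent (O(1)-state, inference-friendly)
# computations produce identical outputs.

def retention_parallel(q, k, v, gamma):
    n = len(q)
    return [
        sum(gamma ** (t - m) * q[t] * k[m] * v[m] for m in range(t + 1))
        for t in range(n)
    ]

def retention_recurrent(q, k, v, gamma):
    out, state = [], 0.0
    for qt, kt, vt in zip(q, k, v):
        state = gamma * state + kt * vt   # constant-size recurrent state
        out.append(qt * state)
    return out

q, k, v = [1.0, 2.0, 0.5], [0.3, -1.0, 2.0], [1.0, 0.5, -2.0]
par = retention_parallel(q, k, v, gamma=0.9)
rec = retention_recurrent(q, k, v, gamma=0.9)
# par and rec agree up to floating point
```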
 
#ai #rlhf #llm ReST uses a bootstrap-like method to produce its own extended dataset and trains on ever higher-quality subsets of it to improve its own reward. The method allows for re-using the same generated data multiple times and thus has an efficiency advantage over online RL techniques like PPO.
Paper: https://arxiv.org/abs/2308.0899…
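A minimal caricature of the Grow/Improve loop. Assumptions throughout: the "policy" is just a Gaussian mean, "reward" is the sample value itself, and "fine-tuning" is a plain average over the kept samples; none of this is the paper's setup, it only shows the shape of the algorithm.

```python
import random

# Toy caricature of ReST's Grow/Improve loop (illustrative assumptions only).

def rest_loop(steps=3, n=1000, seed=0):
    rng = random.Random(seed)
    mean = 0.3  # stand-in for policy parameters
    for _ in range(steps):
        # Grow: sample a dataset once from the current policy (reusable offline)
        data = [min(1.0, max(0.0, rng.gauss(mean, 0.2))) for _ in range(n)]
        # Improve: keep only the top-quality subset; the 80th-percentile
        # threshold rises on each pass as the policy improves
        threshold = sorted(data)[int(0.8 * n)]
        kept = [x for x in data if x >= threshold]
        mean = sum(kept) / len(kept)  # "fine-tune" toward the filtered data
    return mean

final = rest_loop()
# the policy's average reward improves over the initial 0.3
```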
 
#mlnews #llama2 #openai Your regular irregular update on the world of Machine Learning.
References:
https://twitter.com/ylecun/status/1681336284453781505
https://ai.meta.com/llama/
https://about.fb.com/news/2023/07/llama-2-statement-of-support/
https://247wallst.com/special-report/2023/08/12/this-is-the-biggest-social-media-platform-ranking-the-worlds-l…
 
#cybercrime #chatgpt #security An interview with Sergey Shykevich, Threat Intelligence Group Manager at Check Point, about how models like ChatGPT have impacted the realm of cyber crime.
https://threatmap.checkpoint.com/
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https…
 
#llm #safety #gpt4 A prime example of intellectual dishonesty of journalists and AI critics.
Article: https://gizmodo.com/paknsave-ai-savey-recipe-bot-chlorine-gas-1850725057
My Recipe AI: https://github.com/yk/recipe-ai
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https:…
 
#ai #diffusion #stabilityai An interview with DeepFloyd members Misha Konstantinov and Daria Bakshandaeva on the release of the model IF, an open-source model following Google's implementation of Imagen.
References:
https://www.deepfloyd.ai/deepfloyd-if
https://huggingface.co/DeepFloyd
https://twitter.com/_gugutse_
https://twitter.com/_bra_ket
Links:
Home…
 
#gpt4 #mit #ai A new paper claims to use GPT-4 to solve 100% of a set of MIT university exercises. Some people are skeptical, and their investigations reveal more than one problem with this paper...
OUTLINE:
0:00 - ChatGPT gives out Windows 10 keys
0:30 - MIT exam paper
2:50 - Prompt engineering
5:30 - Automatic grading
6:45 - Response by other MIT students…
 
#stablediffusion #ai #watermark Watermarking the outputs of generative models is usually done as a post-processing step on the model outputs. Tree-Ring Watermarks are applied in the latent space at the beginning of a diffusion process, which makes them nearly undetectable, robust to strong distortions, and only recoverable by the model author. It i…
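Conceptually, the scheme relies on generation being invertible by the model author. A toy illustration with an invertible map standing in for the (DDIM) diffusion sampler; everything here is made up, including planting the key in fixed coordinates rather than in a Fourier-space ring:

```python
# Toy illustration of the Tree-Ring idea: the watermark lives in the initial
# latent, and the model author, who can invert generation, recovers it.

KEY = [1.0, -1.0, 1.0, 1.0]  # hypothetical watermark pattern

def embed(noise):
    return KEY + noise[len(KEY):]            # plant the key into the initial latent

def diffuse(latent):
    return [2.0 * x + 1.0 for x in latent]   # stand-in for the generation process

def invert(image):
    return [(x - 1.0) / 2.0 for x in image]  # only the model author can run this

def detect(image, tol=1e-9):
    recovered = invert(image)[:len(KEY)]
    return all(abs(a - b) < tol for a, b in zip(recovered, KEY))

noise = [0.2, 0.5, -0.3, 0.1, 0.7, -0.2]
image = diffuse(embed(noise))   # watermarked output
clean = diffuse(noise)          # unwatermarked output
# detect(image) is True, detect(clean) is False
```

Because the mark is planted before generation rather than stamped on afterwards, distortions of the output image degrade it far less than they would a post-hoc watermark.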
 
#gpt4 #rwkv #transformer We take a look at RWKV, a highly scalable architecture between Transformers and RNNs.
Fully Connected (June 7th in SF) Promo Link: https://www.fullyconnected.com/?promo=ynnc
OUTLINE:
0:00 - Introduction
1:50 - Fully Connected In-Person Conference in SF June 7th
3:00 - Transformers vs RNNs
8:00 - RWKV: Best of both worlds
12:30 - L…
 
#gpt4 #ai #prompt Tree-of-Thought improves prompting of large language models (LLMs) by generalizing the concept of Chain-of-Thought prompting and introduces a tree search across language model thoughts, including state evaluation and backtracking. Experiments on toy tasks show large improvements over both classic and Chain-of-Thought prompting.OUT…
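A minimal Tree-of-Thought-style search, with simple functions standing in for the LLM proposer and state evaluator of the paper. The toy task (pick digits that sum to a target) and the heuristics are illustrative only:

```python
# Minimal Tree-of-Thought-style breadth-first search.
# Toy task: choose `length` digits that sum exactly to `target`.

def propose(state):
    return [state + [d] for d in range(10)]          # candidate next thoughts

def value(state, target, length):
    remaining = target - sum(state)
    slots = length - len(state)
    return -abs(remaining - 4.5 * slots)             # prefer "on track" partial sums

def tot_search(target, length, beam=3):
    frontier = [[]]
    for _ in range(length):
        candidates = [s for st in frontier for s in propose(st)]
        candidates = [s for s in candidates if sum(s) <= target]  # prune dead ends
        candidates.sort(key=lambda s: value(s, target, length), reverse=True)
        frontier = candidates[:beam]                 # keep only promising states
    for s in frontier:
        if sum(s) == target:
            return s
    return None

sol = tot_search(20, 4)
# sol is a list of 4 digits summing to 20
```

Chain-of-Thought corresponds to `beam=1` with no pruning: one linear sequence of thoughts. The tree adds breadth (several candidate thoughts per step) and the evaluator adds the ability to drop bad branches.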
 
#ai #openai #gpt4 US Senate hearing on AI regulation.
MLST video on the hearing: https://www.youtube.com/watch?v=DeSXnESGxr4
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/i…
 
#google #openai #mlnews Updates from the world of Machine Learning and AI
Great AI memes here: https://twitter.com/untitled01ipynb
OUTLINE:
0:00 - Google I/O 2023: Generative AI in everything
0:20 - Anthropic announces 100k tokens context
0:35 - Intro
1:20 - Geoff Hinton leaves Google
7:00 - Google memo leaked: we have no moat
11:30 - OpenAI loses 540M
12:3…
 
#ai #transformer #gpt4 This paper promises to scale transformers to 1 million tokens and beyond. We take a look at the technique behind it, the Recurrent Memory Transformer, and what its strengths and weaknesses are.
OUTLINE:
0:00 - Intro
2:15 - Transformers on long sequences
4:30 - Tasks considered
8:00 - Recurrent Memory Transformer
19:40 - Experiments…
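Schematically, the recurrence over segments looks like this. A stand-in summarizer replaces the Transformer, and the memory layout is made up; the real model's write-memory token outputs become the next segment's read-memory inputs.

```python
# Schematic of the Recurrent Memory Transformer's segment loop.

def transformer(read_mem, segment):
    # Stand-in for a Transformer: here the "memory" carries a running
    # sum and count across segments, and tokens can read from memory.
    total = read_mem[0] + sum(segment)
    count = read_mem[1] + len(segment)
    outputs = [t + read_mem[0] for t in segment]   # tokens attend to memory
    write_mem = [total, count]                     # becomes next read memory
    return outputs, write_mem

def rmt_forward(long_sequence, seg_len=4):
    memory = [0, 0]                                # initial memory tokens
    outputs = []
    for i in range(0, len(long_sequence), seg_len):
        segment = long_sequence[i:i + seg_len]
        outputs_seg, memory = transformer(memory, segment)  # recurrence
        outputs.extend(outputs_seg)
    return outputs, memory

seq = list(range(10))
outs, mem = rmt_forward(seq)
# mem == [45, 10]: information from every segment flowed through the memory
```

Each segment is processed with full attention at a fixed cost, so the sequence length is limited only by how much the small recurrent memory can carry, which is also where the weaknesses show up.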
 
#openassistant #chatgpt #mlnews Try the chat: https://open-assistant.io/chat
Homepage: https://open-assistant.io
Dataset: https://huggingface.co/datasets/OpenAssistant/oasst1
Code: https://github.com/LAION-AI/Open-Assistant
Paper (temporary): https://ykilcher.com/oa-paper
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: htt…
 
#openassistant #chatgpt #gpt4
https://open-assistant.io/chat
https://huggingface.co/OpenAssistant
https://github.com/LAION-AI/Open-Assistant
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.…
 
#mlnews #gpt4 #copilot
Your weekly news all around the AI world
Check out W&B courses (free): https://wandb.courses/
OUTLINE:
0:00 - Intro
0:20 - GPT-4 announced!
4:30 - GigaGAN: The comeback of Generative Adversarial Networks
7:55 - ChoppedAI: AI Recipes
8:45 - Samsung accused of faking space zoom effect
14:00 - Weights & Biases courses are free
16:55 - Dat…
 
#gpt4 #chatgpt #openai References:
https://openai.com/product/gpt-4
https://openai.com/research/gpt-4
https://cdn.openai.com/papers/gpt-4.pdf
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www…
 
#mlnews #chatgpt #llama
ChatGPT goes around the world and is finally available via API. Stunning mind-reading performed using fMRI and Stable Diffusion. LLaMA weights leak and hilarity ensues. GTC23 is around the corner!
ERRATA: It's a 4090, not a 4090 ti 🙃
OUTLINE:
0:00 - Introduction
0:20 - GTC 23 on March 20
1:55 - ChatGPT API is out!
4:50 - OpenAI bec…
 
#ai #meta #languagemodel LLaMA is a series of large language models from 7B to 65B parameters, trained by Meta AI. They train for longer on more data and show that a model like GPT-3 can be outperformed by significantly smaller models when trained like this. Meta also releases the trained models to the research community.
OUTLINE:
0:00 - Introducti…
 
#ai #huggingface #coding Join me as I build streaming inference into the Hugging Face text generation server, going through CUDA, Python, Rust, gRPC, websockets, server-sent events, and more...
Original repo is here: https://github.com/huggingface/text-generation-inference
OpenAssistant repo is here: https://github.com/LAION-AI/Open-Assistant (see in…
 
#openassistant #chatgpt #ai Help us collect data for OpenAssistant, the largest and most open alternative to ChatGPT.
https://open-assistant.io
OUTLINE:
0:00 - Intro
0:30 - The Project
2:05 - Getting to Minimum Viable Prototype
5:30 - First Tasks
10:00 - Leaderboard
11:45 - Playing the Assistant
14:40 - Tricky Facts
16:25 - What if humans had wings?
17:05 - C…
 
#chatgpt #ai #openai ChatGPT, OpenAI's newest model is a GPT-3 variant that has been fine-tuned using Reinforcement Learning from Human Feedback, and it is taking the world by storm! Sponsor: Weights & Biases https://wandb.me/yannic OUTLINE: 0:00 - Intro 0:40 - Sponsor: Weights & Biases 3:20 - ChatGPT: How does it work? 5:20 - Reinforcement Learnin…
 
#ai #mlnews #gpt4 Your weekly news from the AI & Machine Learning world. OUTLINE: 0:00 - Introduction 0:25 - AI reads brain signals to predict what you're thinking 3:00 - Closed-form solution for neuron interactions 4:15 - GPT-4 rumors 6:50 - Cerebras supercomputer 7:45 - Meta releases metagenomics atlas 9:15 - AI advances in theorem proving 10:40 …
 
#ai #cicero #diplomacy A team from Meta AI has developed Cicero, an agent that can play the game Diplomacy, in which players have to communicate via chat messages to coordinate and plan into the future. Paper Title: Human-level play in the game of Diplomacy by combining language models with strategic reasoning Commented game by human expert: https:…
 
#mlnews #ai #mlinpl Your news from the world of Machine Learning! OUTLINE: 0:00 - Introduction 1:25 - Stable Diffusion Multiplayer 2:15 - Huggingface: DOI for Models & Datasets 3:10 - OpenAI asks for more funding 4:25 - The Stack: Source Code Dataset 6:30 - Google Vizier Open-Sourced 7:10 - New Models 11:50 - Helpful Things 20:30 - Prompt Databases…
 
#ai #stablediffusion #license So-called responsible AI licenses are stupid, counterproductive, and have a dangerous legal loophole in them. OpenRAIL++ License here: https://www.ykilcher.com/license OUTLINE: 0:00 - Introduction 0:40 - Responsible AI Licenses (RAIL) of BLOOM and Stable Diffusion 3:35 - Open source software's dilemma of bad usage and …
 
#ai #language #knowledge Large Language Models have the ability to store vast amounts of facts about the world. But little is known about how these models actually do this. This paper aims at discovering the mechanism and location of storage and recall of factual associations in GPT models, and then proposes a mechanism for the targeted editing of such …
 
#neuralnetworks #machinelearning #ai Alexander Mattick joins me to discuss the paper "Neural Networks are Decision Trees", which has generated a lot of hype on social media. We ask the question: Has this paper solved one of the large mysteries of deep learning and opened the black-box neural networks up to interpretability? OUTLINE: 0:00 - Introduc…
 
#alphatensor #deepmind #ai Matrix multiplication is the most used mathematical operation in all of science and engineering. Speeding this up has massive consequences. Thus, over the years, this operation has become more and more optimized. A fascinating discovery was made when it was shown that one actually needs fewer than N^3 multiplication operat…
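The classic instance of that discovery is Strassen's algorithm: two 2x2 matrices can be multiplied with 7 scalar multiplications instead of the naive 8. AlphaTensor searches for decompositions of exactly this kind automatically.

```python
# Strassen's 2x2 matrix multiplication: 7 multiplications instead of 8.
# Applied recursively to blocks, this gives an O(N^2.807) algorithm.

def strassen_2x2(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)   # the 7 products
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    # reassemble the result from the products via additions only
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 - m2 + m3 + m6]]

C = strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]])
# C == [[19, 22], [43, 50]], the same as the naive product
```

The savings matter because, when applied block-wise and recursively, each multiplication saved at the 2x2 level compounds across recursion levels; additions are comparatively cheap.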
 
#stablediffusion #aiart #mlnews Stable Diffusion has been released and is riding a wave of creativity and collaboration. But not everyone is happy about this... Sponsor: NVIDIA GPU Raffle: https://ykilcher.com/gtc OUTLINE: 0:00 - Introduction 0:30 - What is Stable Diffusion? 2:25 - Open-Source Contributions and Creations 7:55 - Textual Inversion 9:…
 
#ai #sparsity #gpu Sparsity is awesome, but only recently has it become possible to properly handle sparse models at good performance. Neural Magic does exactly this, using a plain CPU. No specialized hardware needed, just clever algorithms for pruning and forward-propagation of neural networks. Nir Shavit and I talk about how this is possible, wha…
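The basic ingredients, magnitude pruning and a sparse forward pass, can be sketched like this. This is a naive toy: the interview's actual speedups come from cache-aware sparse kernels on CPU, not from a Python loop.

```python
# Toy sketch of magnitude pruning plus a sparse matrix-vector product.

def prune(weights, sparsity=0.75):
    """Zero out the smallest-magnitude fraction of weights."""
    flat = sorted(abs(w) for row in weights for w in row)
    threshold = flat[int(sparsity * len(flat))]
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]

def to_sparse(weights):
    """Store only nonzeros as (value, column) pairs: less memory, fewer FLOPs."""
    return [[(w, j) for j, w in enumerate(row) if w != 0.0] for row in weights]

def sparse_matvec(sparse_rows, x):
    return [sum(w * x[j] for w, j in row) for row in sparse_rows]

W = [[0.1, -2.0, 0.05, 0.3],
     [1.5, 0.02, -0.2, 0.01]]
Wp = prune(W, sparsity=0.75)
y = sparse_matvec(to_sparse(Wp), [1.0, 1.0, 1.0, 1.0])
# only the two largest-magnitude weights survive: y == [-2.0, 1.5]
```

The design point the episode discusses is exactly this trade: skipping the zeros saves both compute and memory traffic, but only pays off with kernels that exploit the sparsity pattern efficiently.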
 
#ai #interview #research Jacob Steinhardt believes that future AI systems will be qualitatively different than the ones we know currently. We talk about how emergence happens when scaling up, what implications that has on AI Safety, and why thought experiments like the Paperclip Maximizer might be more useful than most people think. OUTLINE: 0:00 I…
 
#huggingface #pickle #exploit Did you know that something as simple as loading a model can execute arbitrary code on your machine? Try the model: https://huggingface.co/ykilcher/total... Get the code: https://github.com/yk/patch-torch-save Sponsor: Weights & Biases Go here: https://wandb.me/yannic OUTLINE: 0:00 - Introduction 1:10 - Sponsor: Weight…
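The underlying mechanism: unpickling calls whatever `__reduce__` specifies, so merely loading a file can execute code. The demo below runs a harmless shell echo; `torch.load` uses pickle under the hood, which is exactly the vulnerability discussed in the episode.

```python
import os
import pickle

# Unpickling executes the (callable, args) pair returned by __reduce__,
# so "just loading" this object runs a shell command.

class Payload:
    def __reduce__(self):
        # pickle calls os.system("echo ...") at *load* time;
        # a real payload could run anything here
        return (os.system, ("echo arbitrary code ran on load",))

blob = pickle.dumps(Payload())   # looks like an innocent saved object
status = pickle.loads(blob)      # loading it executes the command
# status is the shell exit code, 0 on success
```

This is why untrusted checkpoints should be treated like untrusted executables, and why safer serialization formats restrict what a loader is allowed to reconstruct.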
 
#ai #selforganization #emergence Read Sebastian's article here: https://sebastianrisi.com/self_assemb... OUTLINE: 0:00 - Introduction 2:25 - Start of Interview 4:00 - The intelligence of swarms 9:15 - The game of life & neural cellular automata 14:10 - What's missing from neural CAs? 17:20 - How does local computation compare to centralized computa…
 
#stablediffusion #ai #stabilityai An interview with Emad Mostaque, founder of Stability AI. OUTLINE: 0:00 - Intro 1:30 - What is Stability AI? 3:45 - Where does the money come from? 5:20 - Is this the CERN of AI? 6:15 - Who gets access to the resources? 8:00 - What is Stable Diffusion? 11:40 - What if your model produces bad outputs? 14:20 - Do you…
 
#mlnews #bloom #ai Today we look at all the recent giant language models in the AI world! OUTLINE: 0:00 - Intro 0:55 - BLOOM: Open-Source 176B Language Model 5:25 - YALM 100B 5:40 - Chinese Brain-Scale Supercomputer 7:25 - Meta AI Translates over 200 Languages 10:05 - Reproducibility Crisis Workshop 10:55 - AI21 Raises $64M 11:50 - Ian Goodfellow l…
 
Yann LeCun's position paper on a path towards machine intelligence combines Self-Supervised Learning, Energy-Based Models, and hierarchical predictive embedding models to arrive at a system that can teach itself to learn useful abstractions at multiple levels and use that as a world model to plan ahead in time. OUTLINE: 0:00 - Introduction 2:00 - M…
 
#openai #vpt #minecraft Minecraft is one of the harder challenges any RL agent could face. Episodes are long, and the world is procedurally generated, complex, and huge. Further, the action space is a keyboard and a mouse, which has to be operated only given the game's video input. OpenAI tackles this challenge using Video PreTraining, leveraging a…
 
#parti #ai #aiart Parti is a new autoregressive text-to-image model that shows just how much scale can achieve. This model's outputs are crisp, accurate, realistic, and can combine arbitrary styles, concepts, and fulfil even challenging requests. OUTLINE: 0:00 - Introduction 2:40 - Example Outputs 6:00 - Model Architecture 17:15 - Datasets (incl. P…
 
#lamda #google #ai Google engineer Blake Lemoine was put on leave after releasing proprietary information: An interview with the chatbot LaMDA that he believes demonstrates that this AI is, in fact, sentient. We analyze the claims and the interview in detail and trace how a statistical machine managed to convince at least one human that it is more …
 
Your updates directly from the state of the art in Machine Learning! OUTLINE: 0:00 - Intro 0:30 - DeepMind's Flamingo: Unified Vision-Language Model 8:25 - LiT: Locked Image Tuning 10:20 - Jurassic X & MRKL Systems 15:05 - Helpful Things 22:40 - This AI does not exist References: DeepMind's Flamingo: Unified Vision-Language Model https://www.deepmi…
 
#mlnews #dalle #gpt3 An inside look of what's happening in the ML world! Sponsor: Weights & Biases https://wandb.me/yannic OUTLINE: 0:00 - Intro 0:20 - Sponsor: Weights & Biases 1:40 - Meta AI releases OPT-175B 4:55 - CoCa: New CLIP-Competitor 8:15 - DALL-E Mega is training 10:05 - TorToiSe TTS is amazing! 11:50 - Investigating Vision Transformers …
 
#nft #gan #ai Today we build our own AI that can create as many bored apes as we want! Fungibility for everyone! Try the model here: https://huggingface.co/spaces/ykilcher/apes or here: https://ykilcher.com/apes Files & Models here: https://huggingface.co/ykilcher/apes/tree/main Code here: https://github.com/yk/apes-public (for the "what's your ape…
 
#saycan #robots #ai This is an interview with the authors Brian Ichter, Karol Hausman, and Fei Xia. Original Paper Review Video: https://youtu.be/Ru23eWAQ6_E Large Language Models are excellent at generating plausible plans in response to real-world problems, but without interacting with the environment, they have no abilities to estimate which of …
 
#saycan #robots #ai Large Language Models are excellent at generating plausible plans in response to real-world problems, but without interacting with the environment, they have no abilities to estimate which of these plans are feasible or appropriate. SayCan combines the semantic capabilities of language models with a bank of low-level skills, whi…
 
#ai #accel #evolution This is an interview with the authors Jack Parker-Holder and Minqi Jiang. Original Paper Review Video: https://www.youtube.com/watch?v=povBD... Automatic curriculum generation is one of the most promising avenues for Reinforcement Learning today. Multiple approaches have been proposed, each with their own set of advantages and…
 
#ai #accel #evolution Automatic curriculum generation is one of the most promising avenues for Reinforcement Learning today. Multiple approaches have been proposed, each with their own set of advantages and drawbacks. This paper presents ACCEL, which takes the next step into the direction of constructing curricula for multi-capable agents. ACCEL co…
 
#laion #clip #dalle LAION-5B is an open, free dataset consisting of over 5 billion image-text-pairs. Today's video is an interview with three of its creators. We dive into the mechanics and challenges of operating at such large scale, how to keep cost low, what new possibilities are enabled with open datasets like this, and how to best handle safet…
 