Yannic Kilcher
 
I make videos about machine learning research papers, programming, and issues of the AI community, and the broader impact of AI in society. Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar (preferred to Patreon): https://www.subscribestar.com/yannickilcher Patre ...
 
 
#llm #ai #chatgpt How does one run inference for a generative autoregressive language model that has been trained with a fixed context size? Streaming LLMs keep the efficiency of windowed attention but avoid its drop in performance by using attention sinks - an interesting phenomenon where the token at position 0 acts as an absorber of "extra"…
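The cache-eviction idea behind streaming inference can be sketched in a few lines. This is a toy illustration, not the paper's implementation: it operates on token positions instead of real key/value tensors, and the sink count and window size are made-up numbers.

```python
# Toy sketch of StreamingLLM-style cache eviction, operating on token
# positions instead of real key/value tensors; sink count and window size
# are made-up numbers.

def evict_cache(cache_positions, n_sinks=4, window=8):
    """Keep the attention-sink tokens plus a sliding window of recent tokens."""
    if len(cache_positions) <= n_sinks + window:
        return list(cache_positions)
    sinks = cache_positions[:n_sinks]    # position 0 (and a few after it) absorb "extra" attention
    recent = cache_positions[-window:]   # plain sliding window over recent context
    return sinks + recent

# Streaming generation: the cache stays bounded no matter how long we run.
kept = evict_cache(list(range(100)))
# kept == [0, 1, 2, 3] + positions 92..99
```

Keeping the first few tokens is the whole trick: with a pure sliding window, evicting the sink token is what causes the performance drop.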
 
#ai #promptengineering #evolution Promptbreeder is a self-improving self-referential system for automated prompt engineering. Give it a task description and a dataset, and it will automatically come up with appropriate prompts for the task. This is achieved by an evolutionary algorithm where not only the prompts, but also the mutation-prompts are i…
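A toy caricature of that evolutionary loop might look as follows. Everything here is an illustrative stand-in: in the real system an LLM applies a mutation-prompt to a task-prompt, fitness is accuracy on the dataset, and the mutation-prompts themselves are evolved as well.

```python
import random

# Toy caricature of a Promptbreeder-style loop. All functions are stand-ins:
# the real system uses an LLM to apply mutation-prompts, and evolves the
# mutation-prompts too.

def fitness(prompt):
    # Hypothetical scorer: pretend "step"-laden prompts do better on the task.
    return prompt.count("step")

def mutate(prompt, mutation_prompt, rng):
    # Stand-in for LLM(mutation_prompt + prompt); in this toy, the chosen
    # mutation-prompt does not actually steer the string edit.
    return prompt + " " + rng.choice(["step", "by", "carefully"])

def evolve(seed_prompt, generations=20, pop_size=8, seed=0):
    rng = random.Random(seed)
    mutation_prompts = ["Rephrase:", "Make more detailed:"]
    population = [seed_prompt] * pop_size
    for _ in range(generations):
        # binary tournament: the child replaces its parent only if it scores higher
        population = [
            max(p, mutate(p, rng.choice(mutation_prompts), rng), key=fitness)
            for p in population
        ]
    return max(population, key=fitness)

best = evolve("Solve the task.")
```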
 
#ai #retnet #transformers Retention is an alternative to Attention in Transformers that can be written both in a parallel and in a recurrent fashion. This means the architecture achieves training parallelism while maintaining low-cost inference. Experiments in the paper look very promising.
OUTLINE:
0:00 - Intro
2:40 - The impossible triangle
6:55 - Pa…
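The parallel/recurrent duality can be checked with a scalar toy model. This is a sketch of the idea under strong assumptions (head dimension 1, no normalization or gating), not RetNet itself: both forms compute the same decayed sum of key-value products.

```python
# Scalar sketch of Retention's dual form: the parallel (attention-like,
# training-friendly) and recurrent (O(1)-state, inference-friendly)
# computations produce identical outputs.

def retention_parallel(q, k, v, gamma):
    n = len(q)
    return [
        sum(gamma ** (t - m) * q[t] * k[m] * v[m] for m in range(t + 1))
        for t in range(n)
    ]

def retention_recurrent(q, k, v, gamma):
    out, state = [], 0.0
    for qt, kt, vt in zip(q, k, v):
        state = gamma * state + kt * vt   # constant-size recurrent state
        out.append(qt * state)
    return out

q, k, v = [1.0, 2.0, 0.5], [0.3, -1.0, 2.0], [1.0, 0.5, -2.0]
par = retention_parallel(q, k, v, gamma=0.9)
rec = retention_recurrent(q, k, v, gamma=0.9)
# par and rec agree up to floating point
```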
 
#ai #rlhf #llm ReST uses a bootstrap-like method to produce its own extended dataset and trains on ever higher-quality subsets of it to improve its own reward. The method allows for re-using the same generated data multiple times and thus has an efficiency advantage over online RL techniques like PPO.
Paper: https://arxiv.org/abs/2308.0899…
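A minimal caricature of the Grow/Improve loop. Assumptions throughout: the "policy" is just a Gaussian mean, "reward" is the sample value itself, and "fine-tuning" is a plain average over the kept samples; none of this is the paper's setup, it only shows the shape of the algorithm.

```python
import random

# Toy caricature of ReST's Grow/Improve loop (illustrative assumptions only).

def rest_loop(steps=3, n=1000, seed=0):
    rng = random.Random(seed)
    mean = 0.3  # stand-in for policy parameters
    for _ in range(steps):
        # Grow: sample a dataset once from the current policy (reusable offline)
        data = [min(1.0, max(0.0, rng.gauss(mean, 0.2))) for _ in range(n)]
        # Improve: keep only the top-quality subset; the 80th-percentile
        # threshold rises on each pass as the policy improves
        threshold = sorted(data)[int(0.8 * n)]
        kept = [x for x in data if x >= threshold]
        mean = sum(kept) / len(kept)  # "fine-tune" toward the filtered data
    return mean

final = rest_loop()
# the policy's average reward improves over the initial 0.3
```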
 
#mlnews #llama2 #openai Your regular irregular update on the world of Machine Learning.
References:
https://twitter.com/ylecun/status/1681336284453781505
https://ai.meta.com/llama/
https://about.fb.com/news/2023/07/llama-2-statement-of-support/
https://247wallst.com/special-report/2023/08/12/this-is-the-biggest-social-media-platform-ranking-the-worlds-l…
 
#cybercrime #chatgpt #security An interview with Sergey Shykevich, Threat Intelligence Group Manager at Check Point, about how models like ChatGPT have impacted the realm of cyber crime.
https://threatmap.checkpoint.com/
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https…
 
#llm #safety #gpt4 A prime example of intellectual dishonesty of journalists and AI critics.
Article: https://gizmodo.com/paknsave-ai-savey-recipe-bot-chlorine-gas-1850725057
My Recipe AI: https://github.com/yk/recipe-ai
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https:…
 
#ai #diffusion #stabilityai An interview with DeepFloyd members Misha Konstantinov and Daria Bakshandaeva on the release of the model IF, an open-source model following Google's implementation of Imagen.
References:
https://www.deepfloyd.ai/deepfloyd-if
https://huggingface.co/DeepFloyd
https://twitter.com/_gugutse_
https://twitter.com/_bra_ket
Links:
Home…
 
#gpt4 #mit #ai A new paper claims to use GPT-4 to solve 100% of a set of MIT university exercises. Some people are skeptical, and their investigations reveal more than one problem with this paper...
OUTLINE:
0:00 - ChatGPT gives out Windows 10 keys
0:30 - MIT exam paper
2:50 - Prompt engineering
5:30 - Automatic grading
6:45 - Response by other MIT students…
 
#stablediffusion #ai #watermark Watermarking the outputs of generative models is usually done as a post-processing step on the model outputs. Tree-Ring Watermarks are applied in the latent space at the beginning of a diffusion process, which makes them nearly undetectable, robust to strong distortions, and only recoverable by the model author. It i…
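Conceptually, the scheme relies on generation being invertible by the model author. A toy illustration with an invertible map standing in for the (DDIM) diffusion sampler; everything here is made up, including planting the key in fixed coordinates rather than in a Fourier-space ring:

```python
# Toy illustration of the Tree-Ring idea: the watermark lives in the initial
# latent, and the model author, who can invert generation, recovers it.

KEY = [1.0, -1.0, 1.0, 1.0]  # hypothetical watermark pattern

def embed(noise):
    return KEY + noise[len(KEY):]            # plant the key into the initial latent

def diffuse(latent):
    return [2.0 * x + 1.0 for x in latent]   # stand-in for the generation process

def invert(image):
    return [(x - 1.0) / 2.0 for x in image]  # only the model author can run this

def detect(image, tol=1e-9):
    recovered = invert(image)[:len(KEY)]
    return all(abs(a - b) < tol for a, b in zip(recovered, KEY))

noise = [0.2, 0.5, -0.3, 0.1, 0.7, -0.2]
image = diffuse(embed(noise))   # watermarked output
clean = diffuse(noise)          # unwatermarked output
# detect(image) is True, detect(clean) is False
```

Because the mark is planted before generation rather than stamped on afterwards, distortions of the output image degrade it far less than they would a post-hoc watermark.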
 
#gpt4 #rwkv #transformer We take a look at RWKV, a highly scalable architecture between Transformers and RNNs.
Fully Connected (June 7th in SF) Promo Link: https://www.fullyconnected.com/?promo=ynnc
OUTLINE:
0:00 - Introduction
1:50 - Fully Connected In-Person Conference in SF June 7th
3:00 - Transformers vs RNNs
8:00 - RWKV: Best of both worlds
12:30 - L…
 
#gpt4 #ai #prompt Tree-of-Thought improves prompting of large language models (LLMs) by generalizing the concept of Chain-of-Thought prompting and introduces a tree search across language model thoughts, including state evaluation and backtracking. Experiments on toy tasks show large improvements over both classic and Chain-of-Thought prompting.OUT…
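A minimal Tree-of-Thought-style search, with simple functions standing in for the LLM proposer and state evaluator of the paper. The toy task (pick digits that sum to a target) and the heuristics are illustrative only:

```python
# Minimal Tree-of-Thought-style breadth-first search.
# Toy task: choose `length` digits that sum exactly to `target`.

def propose(state):
    return [state + [d] for d in range(10)]          # candidate next thoughts

def value(state, target, length):
    remaining = target - sum(state)
    slots = length - len(state)
    return -abs(remaining - 4.5 * slots)             # prefer "on track" partial sums

def tot_search(target, length, beam=3):
    frontier = [[]]
    for _ in range(length):
        candidates = [s for st in frontier for s in propose(st)]
        candidates = [s for s in candidates if sum(s) <= target]  # prune dead ends
        candidates.sort(key=lambda s: value(s, target, length), reverse=True)
        frontier = candidates[:beam]                 # keep only promising states
    for s in frontier:
        if sum(s) == target:
            return s
    return None

sol = tot_search(20, 4)
# sol is a list of 4 digits summing to 20
```

Chain-of-Thought corresponds to `beam=1` with no pruning: one linear sequence of thoughts. The tree adds breadth (several candidate thoughts per step) and the evaluator adds the ability to drop bad branches.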
 
#ai #openai #gpt4 US Senate hearing on AI regulation.
MLST video on the hearing: https://www.youtube.com/watch?v=DeSXnESGxr4
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/i…
 
#google #openai #mlnews Updates from the world of Machine Learning and AI
Great AI memes here: https://twitter.com/untitled01ipynb
OUTLINE:
0:00 - Google I/O 2023: Generative AI in everything
0:20 - Anthropic announces 100k tokens context
0:35 - Intro
1:20 - Geoff Hinton leaves Google
7:00 - Google memo leaked: we have no moat
11:30 - OpenAI loses 540M
12:3…
 
#ai #transformer #gpt4 This paper promises to scale transformers to 1 million tokens and beyond. We take a look at the technique behind it, the Recurrent Memory Transformer, and what its strengths and weaknesses are.
OUTLINE:
0:00 - Intro
2:15 - Transformers on long sequences
4:30 - Tasks considered
8:00 - Recurrent Memory Transformer
19:40 - Experiments…
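Schematically, the recurrence over segments looks like this. A stand-in summarizer replaces the Transformer, and the memory layout is made up; the real model's write-memory token outputs become the next segment's read-memory inputs.

```python
# Schematic of the Recurrent Memory Transformer's segment loop.

def transformer(read_mem, segment):
    # Stand-in for a Transformer: here the "memory" carries a running
    # sum and count across segments, and tokens can read from memory.
    total = read_mem[0] + sum(segment)
    count = read_mem[1] + len(segment)
    outputs = [t + read_mem[0] for t in segment]   # tokens attend to memory
    write_mem = [total, count]                     # becomes next read memory
    return outputs, write_mem

def rmt_forward(long_sequence, seg_len=4):
    memory = [0, 0]                                # initial memory tokens
    outputs = []
    for i in range(0, len(long_sequence), seg_len):
        segment = long_sequence[i:i + seg_len]
        outputs_seg, memory = transformer(memory, segment)  # recurrence
        outputs.extend(outputs_seg)
    return outputs, memory

seq = list(range(10))
outs, mem = rmt_forward(seq)
# mem == [45, 10]: information from every segment flowed through the memory
```

Each segment is processed with full attention at a fixed cost, so the sequence length is limited only by how much the small recurrent memory can carry, which is also where the weaknesses show up.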
 
#openassistant #chatgpt #mlnews Try the chat: https://open-assistant.io/chat
Homepage: https://open-assistant.io
Dataset: https://huggingface.co/datasets/OpenAssistant/oasst1
Code: https://github.com/LAION-AI/Open-Assistant
Paper (temporary): https://ykilcher.com/oa-paper
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: htt…
 
#openassistant #chatgpt #gpt4
https://open-assistant.io/chat
https://huggingface.co/OpenAssistant
https://github.com/LAION-AI/Open-Assistant
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.…
 
#mlnews #gpt4 #copilot
Your weekly news all around the AI world
Check out W&B courses (free): https://wandb.courses/
OUTLINE:
0:00 - Intro
0:20 - GPT-4 announced!
4:30 - GigaGAN: The comeback of Generative Adversarial Networks
7:55 - ChoppedAI: AI Recipes
8:45 - Samsung accused of faking space zoom effect
14:00 - Weights & Biases courses are free
16:55 - Dat…
 
#gpt4 #chatgpt #openai References:
https://openai.com/product/gpt-4
https://openai.com/research/gpt-4
https://cdn.openai.com/papers/gpt-4.pdf
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www…
 
#mlnews #chatgpt #llama
ChatGPT goes around the world and is finally available via API. Stunning mind-reading performed using fMRI and Stable Diffusion. LLaMA weights leak and hilarity ensues. GTC23 is around the corner!
ERRATA: It's a 4090, not a 4090 ti 🙃
OUTLINE:
0:00 - Introduction
0:20 - GTC 23 on March 20
1:55 - ChatGPT API is out!
4:50 - OpenAI bec…
 
#ai #meta #languagemodel LLaMA is a series of large language models from 7B to 65B parameters, trained by Meta AI. They train for longer on more data and show that a model like GPT-3 can be outperformed by significantly smaller models when trained like this. Meta also releases the trained models to the research community.
OUTLINE:
0:00 - Introducti…
 
#ai #huggingface #coding Join me as I build streaming inference into the Hugging Face text generation server, going through CUDA, Python, Rust, gRPC, websockets, server-sent events, and more...
Original repo is here: https://github.com/huggingface/text-generation-inference
OpenAssistant repo is here: https://github.com/LAION-AI/Open-Assistant (see in…
 
#openassistant #chatgpt #ai Help us collect data for OpenAssistant, the largest and most open alternative to ChatGPT.
https://open-assistant.io
OUTLINE:
0:00 - Intro
0:30 - The Project
2:05 - Getting to Minimum Viable Prototype
5:30 - First Tasks
10:00 - Leaderboard
11:45 - Playing the Assistant
14:40 - Tricky Facts
16:25 - What if humans had wings?
17:05 - C…
 
#chatgpt #ai #openai ChatGPT, OpenAI's newest model is a GPT-3 variant that has been fine-tuned using Reinforcement Learning from Human Feedback, and it is taking the world by storm! Sponsor: Weights & Biases https://wandb.me/yannic OUTLINE: 0:00 - Intro 0:40 - Sponsor: Weights & Biases 3:20 - ChatGPT: How does it work? 5:20 - Reinforcement Learnin…
 
#ai #mlnews #gpt4 Your weekly news from the AI & Machine Learning world. OUTLINE: 0:00 - Introduction 0:25 - AI reads brain signals to predict what you're thinking 3:00 - Closed-form solution for neuron interactions 4:15 - GPT-4 rumors 6:50 - Cerebras supercomputer 7:45 - Meta releases metagenomics atlas 9:15 - AI advances in theorem proving 10:40 …
 
#ai #cicero #diplomacy A team from Meta AI has developed Cicero, an agent that can play the game Diplomacy, in which players have to communicate via chat messages to coordinate and plan into the future. Paper Title: Human-level play in the game of Diplomacy by combining language models with strategic reasoning Commented game by human expert: https:…
 
#mlnews #ai #mlinpl Your news from the world of Machine Learning! OUTLINE: 0:00 - Introduction 1:25 - Stable Diffusion Multiplayer 2:15 - Huggingface: DOI for Models & Datasets 3:10 - OpenAI asks for more funding 4:25 - The Stack: Source Code Dataset 6:30 - Google Vizier Open-Sourced 7:10 - New Models 11:50 - Helpful Things 20:30 - Prompt Databases…
 
#ai #stablediffusion #license So-called responsible AI licenses are stupid, counterproductive, and have a dangerous legal loophole in them. OpenRAIL++ License here: https://www.ykilcher.com/license OUTLINE: 0:00 - Introduction 0:40 - Responsible AI Licenses (RAIL) of BLOOM and Stable Diffusion 3:35 - Open source software's dilemma of bad usage and …
 
#ai #language #knowledge Large Language Models have the ability to store vast amounts of facts about the world. But little is known about how these models actually do this. This paper aims at discovering the mechanism and location of storage and recall of factual associations in GPT models, and then proposes a mechanism for the targeted editing of such …
 
#neuralnetworks #machinelearning #ai Alexander Mattick joins me to discuss the paper "Neural Networks are Decision Trees", which has generated a lot of hype on social media. We ask the question: Has this paper solved one of the large mysteries of deep learning and opened the black-box neural networks up to interpretability? OUTLINE: 0:00 - Introduc…
 
#alphatensor #deepmind #ai Matrix multiplication is the most used mathematical operation in all of science and engineering. Speeding this up has massive consequences. Thus, over the years, this operation has become more and more optimized. A fascinating discovery was made when it was shown that one actually needs fewer than N^3 multiplication operat…
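The classic instance of that discovery is Strassen's algorithm: two 2x2 matrices can be multiplied with 7 scalar multiplications instead of the naive 8. AlphaTensor searches for decompositions of exactly this kind automatically.

```python
# Strassen's 2x2 matrix multiplication: 7 multiplications instead of 8.
# Applied recursively to blocks, this gives an O(N^2.807) algorithm.

def strassen_2x2(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)   # the 7 products
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    # reassemble the result from the products via additions only
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 - m2 + m3 + m6]]

C = strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]])
# C == [[19, 22], [43, 50]], the same as the naive product
```

The savings matter because, when applied block-wise and recursively, each multiplication saved at the 2x2 level compounds across recursion levels; additions are comparatively cheap.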
 
#stablediffusion #aiart #mlnews Stable Diffusion has been released and is riding a wave of creativity and collaboration. But not everyone is happy about this... Sponsor: NVIDIA GPU Raffle: https://ykilcher.com/gtc OUTLINE: 0:00 - Introduction 0:30 - What is Stable Diffusion? 2:25 - Open-Source Contributions and Creations 7:55 - Textual Inversion 9:…
 
#ai #sparsity #gpu Sparsity is awesome, but only recently has it become possible to properly handle sparse models at good performance. Neural Magic does exactly this, using a plain CPU. No specialized hardware needed, just clever algorithms for pruning and forward-propagation of neural networks. Nir Shavit and I talk about how this is possible, wha…
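The basic ingredients, magnitude pruning and a sparse forward pass, can be sketched like this. This is a naive toy: the interview's actual speedups come from cache-aware sparse kernels on CPU, not from a Python loop.

```python
# Toy sketch of magnitude pruning plus a sparse matrix-vector product.

def prune(weights, sparsity=0.75):
    """Zero out the smallest-magnitude fraction of weights."""
    flat = sorted(abs(w) for row in weights for w in row)
    threshold = flat[int(sparsity * len(flat))]
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]

def to_sparse(weights):
    """Store only nonzeros as (value, column) pairs: less memory, fewer FLOPs."""
    return [[(w, j) for j, w in enumerate(row) if w != 0.0] for row in weights]

def sparse_matvec(sparse_rows, x):
    return [sum(w * x[j] for w, j in row) for row in sparse_rows]

W = [[0.1, -2.0, 0.05, 0.3],
     [1.5, 0.02, -0.2, 0.01]]
Wp = prune(W, sparsity=0.75)
y = sparse_matvec(to_sparse(Wp), [1.0, 1.0, 1.0, 1.0])
# only the two largest-magnitude weights survive: y == [-2.0, 1.5]
```

The design point the episode discusses is exactly this trade: skipping the zeros saves both compute and memory traffic, but only pays off with kernels that exploit the sparsity pattern efficiently.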
 
#ai #interview #research Jacob Steinhardt believes that future AI systems will be qualitatively different than the ones we know currently. We talk about how emergence happens when scaling up, what implications that has on AI Safety, and why thought experiments like the Paperclip Maximizer might be more useful than most people think. OUTLINE: 0:00 I…
 
#huggingface #pickle #exploit Did you know that something as simple as loading a model can execute arbitrary code on your machine? Try the model: https://huggingface.co/ykilcher/total... Get the code: https://github.com/yk/patch-torch-save Sponsor: Weights & Biases Go here: https://wandb.me/yannic OUTLINE: 0:00 - Introduction 1:10 - Sponsor: Weight…
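The underlying mechanism: unpickling calls whatever `__reduce__` specifies, so merely loading a file can execute code. The demo below runs a harmless shell echo; `torch.load` uses pickle under the hood, which is exactly the vulnerability discussed in the episode.

```python
import os
import pickle

# Unpickling executes the (callable, args) pair returned by __reduce__,
# so "just loading" this object runs a shell command.

class Payload:
    def __reduce__(self):
        # pickle calls os.system("echo ...") at *load* time;
        # a real payload could run anything here
        return (os.system, ("echo arbitrary code ran on load",))

blob = pickle.dumps(Payload())   # looks like an innocent saved object
status = pickle.loads(blob)      # loading it executes the command
# status is the shell exit code, 0 on success
```

This is why untrusted checkpoints should be treated like untrusted executables, and why safer serialization formats restrict what a loader is allowed to reconstruct.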
 
#ai #selforganization #emergence Read Sebastian's article here: https://sebastianrisi.com/self_assemb... OUTLINE: 0:00 - Introduction 2:25 - Start of Interview 4:00 - The intelligence of swarms 9:15 - The game of life & neural cellular automata 14:10 - What's missing from neural CAs? 17:20 - How does local computation compare to centralized computa…
 
#stablediffusion #ai #stabilityai An interview with Emad Mostaque, founder of Stability AI. OUTLINE: 0:00 - Intro 1:30 - What is Stability AI? 3:45 - Where does the money come from? 5:20 - Is this the CERN of AI? 6:15 - Who gets access to the resources? 8:00 - What is Stable Diffusion? 11:40 - What if your model produces bad outputs? 14:20 - Do you…
 
#mlnews #bloom #ai Today we look at all the recent giant language models in the AI world! OUTLINE: 0:00 - Intro 0:55 - BLOOM: Open-Source 176B Language Model 5:25 - YALM 100B 5:40 - Chinese Brain-Scale Supercomputer 7:25 - Meta AI Translates over 200 Languages 10:05 - Reproducibility Crisis Workshop 10:55 - AI21 Raises $64M 11:50 - Ian Goodfellow l…
 
Yann LeCun's position paper on a path towards machine intelligence combines Self-Supervised Learning, Energy-Based Models, and hierarchical predictive embedding models to arrive at a system that can teach itself to learn useful abstractions at multiple levels and use that as a world model to plan ahead in time. OUTLINE: 0:00 - Introduction 2:00 - M…
 
#openai #vpt #minecraft Minecraft is one of the harder challenges any RL agent could face. Episodes are long, and the world is procedurally generated, complex, and huge. Further, the action space is a keyboard and a mouse, which has to be operated only given the game's video input. OpenAI tackles this challenge using Video PreTraining, leveraging a…
 
#parti #ai #aiart Parti is a new autoregressive text-to-image model that shows just how much scale can achieve. This model's outputs are crisp, accurate, realistic, and can combine arbitrary styles, concepts, and fulfil even challenging requests. OUTLINE: 0:00 - Introduction 2:40 - Example Outputs 6:00 - Model Architecture 17:15 - Datasets (incl. P…
 
#lamda #google #ai Google engineer Blake Lemoine was put on leave after releasing proprietary information: An interview with the chatbot LaMDA that he believes demonstrates that this AI is, in fact, sentient. We analyze the claims and the interview in detail and trace how a statistical machine managed to convince at least one human that it is more …
 
Your updates directly from the state of the art in Machine Learning! OUTLINE: 0:00 - Intro 0:30 - DeepMind's Flamingo: Unified Vision-Language Model 8:25 - LiT: Locked Image Tuning 10:20 - Jurassic X & MRKL Systems 15:05 - Helpful Things 22:40 - This AI does not exist References: DeepMind's Flamingo: Unified Vision-Language Model https://www.deepmi…
 
#mlnews #dalle #gpt3 An inside look of what's happening in the ML world! Sponsor: Weights & Biases https://wandb.me/yannic OUTLINE: 0:00 - Intro 0:20 - Sponsor: Weights & Biases 1:40 - Meta AI releases OPT-175B 4:55 - CoCa: New CLIP-Competitor 8:15 - DALL-E Mega is training 10:05 - TorToiSe TTS is amazing! 11:50 - Investigating Vision Transformers …
 
#nft #gan #ai Today we build our own AI that can create as many bored apes as we want! Fungibility for everyone! Try the model here: https://huggingface.co/spaces/ykilcher/apes or here: https://ykilcher.com/apes Files & Models here: https://huggingface.co/ykilcher/apes/tree/main Code here: https://github.com/yk/apes-public (for the "what's your ape…
 
#saycan #robots #ai This is an interview with the authors Brian Ichter, Karol Hausman, and Fei Xia. Original Paper Review Video: https://youtu.be/Ru23eWAQ6_E Large Language Models are excellent at generating plausible plans in response to real-world problems, but without interacting with the environment, they have no abilities to estimate which of …
 
#saycan #robots #ai Large Language Models are excellent at generating plausible plans in response to real-world problems, but without interacting with the environment, they have no abilities to estimate which of these plans are feasible or appropriate. SayCan combines the semantic capabilities of language models with a bank of low-level skills, whi…
 
#ai #accel #evolution This is an interview with the authors Jack Parker-Holder and Minqi Jiang. Original Paper Review Video: https://www.youtube.com/watch?v=povBD... Automatic curriculum generation is one of the most promising avenues for Reinforcement Learning today. Multiple approaches have been proposed, each with their own set of advantages and…
 
#ai #accel #evolution Automatic curriculum generation is one of the most promising avenues for Reinforcement Learning today. Multiple approaches have been proposed, each with their own set of advantages and drawbacks. This paper presents ACCEL, which takes the next step into the direction of constructing curricula for multi-capable agents. ACCEL co…
 
#laion #clip #dalle LAION-5B is an open, free dataset consisting of over 5 billion image-text-pairs. Today's video is an interview with three of its creators. We dive into the mechanics and challenges of operating at such large scale, how to keep cost low, what new possibilities are enabled with open datasets like this, and how to best handle safet…
 