Best Applied Mathematics Podcasts (2024)

Diffusion Forcing to Expert Tuning, Structured Planning, Vision-Language Models, and Tabular ML Benchmarks 11:34

3h ago

Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion; Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models; Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages; InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Co…

Advancing AI's Mathematical Reasoning: WE-MATH, ROS-LLM Framework, Autoregressive Image Generation 10:36

3h ago

We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?; ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning; MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation; LiteSearch: Efficacious Tree Search for LLM; Wavelets Are All You Need for Autoregressive Image…

Success Ratio: The Mathematics of Success Hidden in the Universe 2:58

20h ago

Begin with 'Success Ratio' to enter the realm of dispelling ignorance and achieving true self-transformation according to the natural laws of the universe. Now available for purchase on Amazon.com. "Asking even a slightly wrong question about life can lead to completely different paths." The Search Ends Here. DON'T LET YOUR SUBCONSCIOUS DISMISS THIS …

Persona-Driven Data Synthesis, Enhancing Medical MLLMs, Robot Learning, Knowledge Distillation in LLMs, Text to 3D Gaussian Revolution 11:24

22h ago

Scaling Synthetic Data Creation with 1,000,000,000 Personas; HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale; LLaRA: Supercharging Robot Learning Data for Vision-Language Policy; Direct Preference Knowledge Distillation for Large Language Models; GaussianDreamerPro: Text to Manipulable 3D Gaussians with Highly Enh…

OMG-LLaVA: Unifying Vision and Language Understanding, Step-DPO for LLMs Mathematical Reasoning, MUMU's Multimodal Image Generation 12:15

2d ago

OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding; Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs; MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data; Simulating Classroom Education with LLM-Empowered Agents; SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval …

Inequitable exposure to wildfire smoke 10:40

3d ago

Inequitable wildfire smoke exposure in California. Science Sessions are brief conversations with cutting-edge researchers, National Academy members, and policymakers as they discuss topics relevant to today's scientific community. Learn the behind-the-scenes story of work published in the Proceedings of the National Academy of Sciences (PNAS), plus …

FineWeb Datasets, YouDream's 3D Animals, PDE-Solving Breakthrough, Noise-Conditioned Perception Alignment, Language Models' Continual Learning 11:02

6d ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale; YouDream: Generating Anatomically Controllable Consistent Text-to-3D Animals; DiffusionPDE: Generative PDE-Solving Under Partial Observation; Aligning Diffusion Models with Noise-Conditioned Perception; Unlocking Continual Learning Abilities in Language Models…

BigCodeBench Challenges, Cambrian-1 Leap, D-MERIT's Evaluation, Long Context Breakthrough in Vision 11:06

7d ago

DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation; BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions; Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs; Evaluating D-MERIT of Partial-annotation on Information Retrieval; Long Context Transfer from Language to Vision…

LongRAG Breakthrough, LLMs as Judges, Transformer Memory Insights, Video Library AI, Democratizing Art Styles 10:14

8d ago

LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs; Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges; Complexity of Symbolic Representation in Working Memory of Transformer Correlates with the Complexity of a Task; Towards Retrieval Augmented Generation over Large Video Libraries; Stylebreeder: Exploring …

Scaling In-Context Reinforcement Learning, ChartMimic's AI Benchmark, Multimodal Document Comprehension, Long Context Reasoning Challenges 10:36

13d ago

XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning; Make It Count: Text-to-Image Generation with an Accurate Number of Objects; ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation; Needle In A Multimodal Haystack; BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Hay…

Revolutionizing Vision and Language Models: Depth Prediction Breakthroughs, Pixel-Level Transformers, and Robotic Skill Learning 13:20

14d ago

Depth Anything V2; An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels; Transformers meet Neural Algorithmic Reasoners; Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling; OpenVLA: An Open-Source Vision-Language-Action Model; Alleviating Distortion in Image Generation via Multi-Resolut…

Gentrification and biodiversity 10:07

21d ago

Biodiversity and gentrification. Science Sessions are brief conversations with cutting-edge researchers, National Academy members, and policymakers as they discuss topics relevant to today's scientific community. Learn the behind-the-scenes story of work published in the Proceedings of the National Academy of Sciences (PNAS), plus a broad range of s…

NaRCan Revolutionizes Video Editing, Training-Free Video Generation, Recaptioning Web Images with LLaMA-3, Novel Data Synthesis Approach, Smartphone LLM Inference 11:33

18d ago

NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing; MotionClone: Training-Free Motion Cloning for Controllable Video Generation; What If We Recaption Billions of Web Images with LLaMA-3?; Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing; PowerInfer-2: Fast Large Language Model I…

Revolutionizing Image Synthesis with TiTok, Multilingual Code Benchmark, Exploring GenAI Prompting Techniques 10:53

19d ago

An Image is Worth 32 Tokens for Reconstruction and Generation; McEval: Massively Multilingual Code Evaluation; Zero-shot Image Editing with Reference Imitation; The Prompt Report: A Systematic Survey of Prompting Techniques; TextGrad: Automatic "Differentiation" via Text

LlamaGen's Image Revolution, Husky: The Multi-Step Reasoner, Vript's Video Breakthrough, VALL-E 2 Achieves Human Parity 10:46

20d ago

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation; Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning; Vript: A Video Is Worth Thousands of Words; Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis; VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text …

Mixture-of-Agents, Benchmarking LLMs, and GenAI Arena Evaluation 11:06

23d ago

Mixture-of-Agents Enhances Large Language Model Capabilities; WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild; CRAG -- Comprehensive RAG Benchmark; GenAI Arena: An Open Evaluation Platform for Generative Models; Large Language Model Confidence Estimation via Black-Box Access

Enhancing AI Video and Image Generation, BitsFusion Quantization, Step-aware Optimization, Thought-Augmented Reasoning, and Single Forward Video Generation 11:39

24d ago

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions; BitsFusion: 1.99 bits Weight Quantization of Diffusion Model; Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step; Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models; SF-V: Single Forward Video Generation Mo…

AI Papers Podcast Special Edition: Apple Intelligence & Ferret-UI 1:52

24d ago

Apple announced new Siri features and Apple Intelligence today. Interestingly, Apple had already released a paper, titled "Ferret-UI," on how it all works: a multimodal vision-language model capable of understanding widgets, icons, and text on an iOS mobile screen, and reasoning about their spatial relationships and functional meanings. https://arxiv.…

Block Transformers: Faster Inference, Mobile Device AI Agents, 3D-Image Generation, Low Latency TTS 10:41

25d ago

Block Transformer: Global-to-Local Language Modeling for Fast Inference; Parrot: Multilingual Visual Instruction Tuning; Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration; Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion; LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autore…

Seed-TTS, Decoding LLMs, Innovations in Text-to-Video, Self-Improving AI Preferences, and Refining Diffusion Models 11:10

27d ago

Seed-TTS: A Family of High-Quality Versatile Speech Generation Models; To Believe or Not to Believe Your LLM; I4VGen: Image as Stepping Stone for Text-to-Video Generation; Self-Improving Robust Preference Optimization; Guiding a Diffusion Model with a Bad Version of Itself

MMLU-Pro: Next-Level Language Understanding, Tailored LLMs, High FPS Video Generation Innovation 11:30

28d ago

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark; Learning Temporally Consistent Video Depth from Video Diffusion Priors; Show, Don't Tell: Aligning Language Models with Demonstrated Feedback; Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning; ZeroSmooth: Training-free Diffuser Adaptati…

Transformers and State-Space Models Unite, Multi-modal LLM Benchmark, Perplexity in Data Pruning, Advancing 4D Content Generation 10:23

30d ago

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality; Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis; Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models; Kaleido Diffusion: Improving Conditional Diffusion Models with Au…

DITTO-2 Speeds Up Music AI, GECO's Quick 3D Generation, PLA4D's 4D Advances, DevEval's Real-World Code Benchmark, Parrot's LLM Application Efficiency 10:47

1M ago

AI Papers Podcast for 06/04/2024: DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation; GECO: Generative Image-to-3D within a SECOnd; PLA4D: Pixel-Level Alignments for Text-to-4D Gaussian Splatting; DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories; Parrot: Efficient Serving of LLM-b…

School enrollment during the COVID-19 pandemic 10:10

1M ago

School enrollment during COVID-19. Science Sessions are brief conversations with cutting-edge researchers, National Academy members, and policymakers as they discuss topics relevant to today's scientific community. Learn the behind-the-scenes story of work published in the Proceedings of the National Academy of Sciences (PNAS), plus a broad range of…

Boosting Text Retrieval with CLIP Models, Rethinking Retrieval Augmented Generation, and Deciphering Human Behavior through MotionLLM 10:42

1M ago

AI Papers Podcast for 06/03/2024: Jina CLIP: Your CLIP Model Is Also Your Text Retriever; Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts; MotionLLM: Understanding Human Behaviors from Human Motions and Videos; Xwin-LM: Strong and Scalable Alignment Practice for LLMs; MOFA-Video: Controllable Image Animati…

Jonathan Gorard: the complete first interview 2:48:59

1M ago

I’ve heard from many of you that you’d like the whole of my conversation with Jonathan Gorard in a single podcast. So here it is, the complete first interview. These three hours are a brilliant exposition of Wolfram Physics from a figure whose contributions to the project are second to none. — Jonathan Gorard Jonathan Gorard at The Wolfram Physics …

Bilingual LLM Transparency, T2V-Turbo's Video Generation, LLMs Surpassing Human Theory of Mind Performance, Advancements in LLM Attribution 8:47

1M ago

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series; T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback; LLMs achieve adult human performance on higher-order theory of mind tasks; Nearest Neighbor Speculative Decoding for LLM Generation and Attribution; Zipper: A Multi-Tower Decoder Ar…

Phased Consistency Model, 2-Stage Backpropagation, and the Future of 4D World Reconstruction 8:09

1M ago

Phased Consistency Model; 2BP: 2-Stage Backpropagation; GFlow: Recovering 4D World from Monocular Video; Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning; LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models

Vision-Language Models, Arithmetic Transformers, Next-Gen Video Editing 10:20

1M ago

An Introduction to Vision-Language Modeling; Transformers Can Do Arithmetic with the Right Embeddings; Matryoshka Multimodal Models; I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models; Zamba: A Compact 7B SSM Hybrid Model; Looking Backward: Streaming Video-to-Video Translation with Feature Banks…

ConvLLaVA's Visual Compression, Efficient LLVM, Multilingual Aya 23, and AutoCoder's Code Mastery 11:11

1M ago

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models; Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models; Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization; Aya 23: Open Weight Releases to Further Multilingual Progress; Stacking Your Transformers: A Close…

2M ago, 10:30

Adapting to poor air quality. Science Sessions are brief conversations with cutting-edge researchers, National Academy members, and policymakers as they discuss topics relevant to today's scientific community. Learn the behind-the-scenes story of work published in the Proceedings of the National Academy of Sciences (PNAS), plus a broad range of scie…
