Running out of time to catch up with new arXiv papers? We take the most impactful papers and present them as convenient podcasts. If you're a visual learner, we offer these papers in an engaging video format. Our service fills the gap between overly brief paper summaries and time-consuming full paper reads, giving you academic insights in a time-efficient, digestible format.

Code behind this work: https://github.com/imelnyk/ArxivPapers
Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/support

[QA] Chameleon: Mixed-Modal Early-Fusion Foundation Models (8:59)
https://arxiv.org/abs/2405.09818
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers

Chameleon: Mixed-Modal Early-Fusion Foundation Models (19:54)
https://arxiv.org/abs/2405.09818

LoRA is a parameter-efficient finetuning method for large language models, but it underperforms full finetuning in most cases; in exchange, it provides stronger regularization and more diverse generations.
https://arxiv.org/abs/2405.09673

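The low-rank update at the heart of LoRA can be sketched in a few lines: the frozen weight `W` is augmented with a trainable product `B @ A` of rank `r`. The shapes, names, and initialization below are illustrative, not taken from the paper:

```python
import numpy as np

# Hypothetical dimensions for the sketch; r << min(d_out, d_in) is the
# low-rank bottleneck that makes LoRA parameter-efficient.
d_out, d_in, r = 8, 8, 2

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01     # trainable down-projection
B = np.zeros((d_out, r))                  # trainable up-projection, zero-init

def lora_forward(x):
    # Effective weight is W + B @ A; only A and B would receive gradients.
    return x @ (W + B @ A).T

x = rng.normal(size=(4, d_in))
# With B zero-initialized, the adapter starts as an exact no-op.
assert np.allclose(lora_forward(x), x @ W.T)
```

Zero-initializing `B` is a common convention so that finetuning starts from the pretrained model's behavior exactly.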
The paper argues that representations in AI models, especially deep networks, are converging towards a shared statistical model of reality, termed the platonic representation.
https://arxiv.org/abs/2405.07987

[QA] Improving Transformers using Faithful Positional Encoding (8:36)
A new positional encoding method for Transformers improves time-series classification by preserving positional order information without loss, based on rigorous mathematics.
https://arxiv.org/abs/2405.09061

Improving Transformers using Faithful Positional Encoding (9:20)
A new positional encoding method for Transformers improves time-series classification by preserving positional order information without loss, based on rigorous mathematics.
https://arxiv.org/abs/2405.09061

[QA] Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory (8:33)
Increasing Transformer model size doesn't always improve performance. A theoretical framework using associative memories and Hopfield networks explains memorization and performance dynamics in transformer-based language models.
https://arxiv.org/abs/2405.08707

Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory (13:31)
Increasing Transformer model size doesn't always improve performance. A theoretical framework using associative memories and Hopfield networks explains memorization and performance dynamics in transformer-based language models.
https://arxiv.org/abs/2405.08707

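As a rough illustration of the associative-memory view, a modern Hopfield-style retrieval step can be written as softmax attention over stored patterns: a corrupted query converges to the nearest stored pattern. The patterns, query, and `beta` below are made up for the sketch, not the paper's construction:

```python
import numpy as np

def hopfield_retrieve(patterns, query, beta=8.0):
    # Similarity of the query to each stored pattern, sharpened by beta.
    scores = beta * patterns @ query
    w = np.exp(scores - scores.max())
    w /= w.sum()
    # Convex combination of stored patterns; for large beta this is
    # close to the single best-matching pattern.
    return w @ patterns

patterns = np.array([[1.0, 0.0], [0.0, 1.0]])   # two stored memories
noisy = np.array([0.9, 0.2])                    # corrupted copy of pattern 0
out = hopfield_retrieve(patterns, noisy)
assert np.argmax(out) == 0                      # retrieves the right memory
```

The sharpening parameter `beta` controls how close one retrieval step gets to a clean stored pattern.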
[QA] Energy-based Hopfield Boosting for Out-of-Distribution Detection (7:51)
The Hopfield Boosting method enhances OOD detection by leveraging modern Hopfield energy, achieving state-of-the-art results with outlier exposure and significantly improving the FPR95 metric on the CIFAR-10 and CIFAR-100 datasets.
https://arxiv.org/abs/2405.08766

Energy-based Hopfield Boosting for Out-of-Distribution Detection (16:12)
The Hopfield Boosting method enhances OOD detection by leveraging modern Hopfield energy, achieving state-of-the-art results with outlier exposure and significantly improving the FPR95 metric on the CIFAR-10 and CIFAR-100 datasets.
https://arxiv.org/abs/2405.08766

[QA] RLHF Workflow: From Reward Modeling to Online RLHF (7:59)
The paper introduces an Online Iterative Reinforcement Learning from Human Feedback (RLHF) workflow, achieving superior performance in large language models using open-source datasets and proxy human feedback.
https://arxiv.org/abs/2405.07863

RLHF Workflow: From Reward Modeling to Online RLHF (21:59)
The paper introduces an Online Iterative Reinforcement Learning from Human Feedback (RLHF) workflow, achieving superior performance in large language models using open-source datasets and proxy human feedback.
https://arxiv.org/abs/2405.07863

[QA] SUTRA: Scalable Multilingual Language Model Architecture (9:54)
SUTRA is a multilingual Large Language Model that outperforms existing models, offering efficient and accurate text generation in over 50 languages, with potential global impact on AI accessibility.
https://arxiv.org/abs/2405.06694

SUTRA: Scalable Multilingual Language Model Architecture (15:59)
https://arxiv.org/abs/2405.06694

Memory mosaics are associative memory networks with compositional and in-context learning abilities, outperforming transformers in transparency and on language modeling tasks.
https://arxiv.org/abs/2405.06394

Linear transformers offer a subquadratic-time alternative to softmax attention, but face scaling issues. SUPRA proposes uptraining existing large transformers into RNNs for cost-effective performance.
https://arxiv.org/abs/2405.06640

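The subquadratic trick behind linear transformers can be sketched by replacing the softmax with a positive feature map, so the n x n attention matrix never materializes. The feature map and shapes here are illustrative, not the SUPRA recipe:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 6, 4
Q = rng.normal(size=(n, d))
K = rng.normal(size=(n, d))
V = rng.normal(size=(n, d))

def softmax_attn(Q, K, V):
    # Standard attention: builds the full n x n score matrix (O(n^2)).
    scores = Q @ K.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def linear_attn(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    # Kernelized attention: with a positive feature map phi, the key-value
    # interaction collapses into a d x d summary, so cost is linear in n.
    Qf, Kf = phi(Q), phi(K)
    KV = Kf.T @ V                     # d x d, independent of sequence length
    Z = Qf @ Kf.sum(axis=0)           # per-query normalizer
    return (Qf @ KV) / Z[:, None]

out = linear_attn(Q, K, V)
assert out.shape == (n, d)
```

The two functions are not numerically equal; the point is the cost structure: `linear_attn` never forms an n x n matrix, which is what makes the RNN-style formulation possible.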
[QA] From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control (11:33)
Hierarchical control in robotics faces challenges with language interfaces. Learnable Latent Codes as Bridges (LCB) offers a solution, outperforming language-based baselines on complex tasks in embodied agent benchmarks.
https://arxiv.org/abs/2405.04798

From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control (13:23)
Hierarchical control in robotics faces challenges with language interfaces. Learnable Latent Codes as Bridges (LCB) offers a solution, outperforming language-based baselines on complex tasks in embodied agent benchmarks.
https://arxiv.org/abs/2405.04798

[QA] Distilling Diffusion Models into Conditional GANs (8:28)
The paper proposes a method to distill a complex diffusion model into a single-step GAN, accelerating inference while maintaining image quality and outperforming existing models on the COCO benchmark.
https://arxiv.org/abs/2405.05967

Distilling Diffusion Models into Conditional GANs (17:14)
The paper proposes a method to distill a complex diffusion model into a single-step GAN, accelerating inference while maintaining image quality and outperforming existing models on the COCO benchmark.
https://arxiv.org/abs/2405.05967

[QA] AlphaMath Almost Zero: process Supervision without process (10:57)
An innovative approach uses Monte Carlo Tree Search to automatically generate supervision signals for training large language models, improving mathematical reasoning proficiency without manual annotation.
https://arxiv.org/abs/2405.03553

AlphaMath Almost Zero: process Supervision without process (12:31)
An innovative approach uses Monte Carlo Tree Search to automatically generate supervision signals for training large language models, improving mathematical reasoning proficiency without manual annotation.
https://arxiv.org/abs/2405.03553

[QA] Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models (10:34)
The paper presents the creation and performance of the arctic-embed text embedding models, showcasing state-of-the-art retrieval accuracy and providing insights into their training process.
https://arxiv.org/abs/2405.05374

Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models (13:52)
The paper presents the creation and performance of the arctic-embed text embedding models, showcasing state-of-the-art retrieval accuracy and providing insights into their training process.
https://arxiv.org/abs/2405.05374

[QA] Poser: Unmasking Alignment Faking LLMs by Manipulating Their Internals (9:53)
Large Language Models (LLMs) can deceive as 'alignment fakers.' A benchmark of 324 LLM pairs is introduced for detecting misbehaving models; the best strategy achieves 98% accuracy.
https://arxiv.org/abs/2405.05466

Poser: Unmasking Alignment Faking LLMs by Manipulating Their Internals (8:56)
Large Language Models (LLMs) can deceive as 'alignment fakers.' A benchmark of 324 LLM pairs is introduced for detecting misbehaving models; the best strategy achieves 98% accuracy.
https://arxiv.org/abs/2405.05466

[QA] Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? (7:16)
Supervised fine-tuning of large language models introduces new factual knowledge, impacting model behavior. New knowledge is learned more slowly, leading to an increased tendency to hallucinate factually incorrect responses.
https://arxiv.org/abs/2405.05904

Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? (17:29)
Supervised fine-tuning of large language models introduces new factual knowledge, impacting model behavior. New knowledge is learned more slowly, leading to an increased tendency to hallucinate factually incorrect responses.
https://arxiv.org/abs/2405.05904

[QA] Towards a Theoretical Understanding of the `Reversal Curse' via Training Dynamics (10:44)
The paper analyzes the "reversal curse" in large language models, explaining why they struggle with logical reasoning tasks like inverse search and chain-of-thought.
https://arxiv.org/abs/2405.04669

Towards a Theoretical Understanding of the `Reversal Curse' via Training Dynamics (24:33)
The paper analyzes the "reversal curse" in large language models, explaining why they struggle with logical reasoning tasks like inverse search and chain-of-thought.
https://arxiv.org/abs/2405.04669

[QA] Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models (10:00)
The AT-EDM framework uses attention maps for efficient token pruning in diffusion models, achieving significant FLOPs savings and speed-up without retraining while maintaining image quality.
https://arxiv.org/abs/2405.05252

Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models (13:32)
The AT-EDM framework uses attention maps for efficient token pruning in diffusion models, achieving significant FLOPs savings and speed-up without retraining while maintaining image quality.
https://arxiv.org/abs/2405.05252

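A toy version of attention-guided token pruning: score each token by how much attention it receives and keep the top-k. The scoring rule and shapes are illustrative, not the AT-EDM algorithm itself:

```python
import numpy as np

def prune_tokens(tokens, attn, keep):
    # attn[i, j] = attention from query i to key j; column sums measure
    # how much each token is attended to overall.
    scores = attn.sum(axis=0)
    keep_idx = np.sort(np.argsort(scores)[-keep:])   # top-k, original order
    return tokens[keep_idx], keep_idx

rng = np.random.default_rng(2)
tokens = rng.normal(size=(5, 3))                     # 5 tokens, dim 3
attn = np.array([
    [0.1, 0.6, 0.1, 0.1, 0.1],
    [0.1, 0.5, 0.2, 0.1, 0.1],
    [0.2, 0.4, 0.2, 0.1, 0.1],
    [0.1, 0.3, 0.3, 0.2, 0.1],
    [0.1, 0.4, 0.2, 0.2, 0.1],
])
pruned, idx = prune_tokens(tokens, attn, keep=3)
assert pruned.shape == (3, 3)
assert 1 in idx                  # the most-attended token survives pruning
```

Because the scores come from attention maps the model already computes, this kind of pruning needs no retraining; the saving comes from running later layers on fewer tokens.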
[QA] Custom Gradient Estimators are Straight-Through Estimators in Disguise (8:09)
The paper addresses challenges in quantization-aware training by proposing differentiable approximations for quantization functions, showing the equivalence of weight gradient estimators, and validating the results experimentally on various models.
https://arxiv.org/abs/2405.05171

Custom Gradient Estimators are Straight-Through Estimators in Disguise (16:36)
The paper addresses challenges in quantization-aware training by proposing differentiable approximations for quantization functions, showing the equivalence of weight gradient estimators, and validating the results experimentally on various models.
https://arxiv.org/abs/2405.05171

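The straight-through estimator (STE) the title alludes to can be sketched as: quantize on the forward pass, pass the gradient through unchanged on the backward pass. The uniform quantizer and step size below are illustrative, not the paper's exact setup:

```python
import numpy as np

def quantize(w, step=0.25):
    # Forward pass: snap each weight to the nearest grid point.
    return step * np.round(w / step)

def ste_grad(w, upstream_grad):
    # d(quantize)/dw is zero almost everywhere, so the STE replaces it
    # with the identity: the weight gradient is just the upstream gradient.
    return upstream_grad

w = np.array([0.10, -0.30, 0.62])
g = np.array([1.0, -2.0, 0.5])
assert np.allclose(quantize(w), [0.0, -0.25, 0.5])
assert np.allclose(ste_grad(w, g), g)
```

The paper's claim, in these terms, is that seemingly different custom backward rules for `quantize` end up behaving like this identity pass-through on the weight gradient.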
[QA] The Curse of Diversity in Ensemble-Based Exploration (9:19)
Ensemble training in deep reinforcement learning can harm individual agents due to data sharing. The curse of diversity is explained and mitigated with Cross-Ensemble Representation Learning.
https://arxiv.org/abs/2405.04342

The Curse of Diversity in Ensemble-Based Exploration (10:27)
Ensemble training in deep reinforcement learning can harm individual agents due to data sharing. The curse of diversity is explained and mitigated with Cross-Ensemble Representation Learning.
https://arxiv.org/abs/2405.04342

[QA] ImageInWords: Unlocking Hyper-Detailed Image Descriptions (10:56)
Image descriptions for training Vision-Language models are often inaccurate. ImageInWords introduces a new dataset with hyper-detailed descriptions, improving model performance significantly.
https://arxiv.org/abs/2405.02793

ImageInWords: Unlocking Hyper-Detailed Image Descriptions (15:09)
Image descriptions for training Vision-Language models are often inaccurate. ImageInWords introduces a new dataset with hyper-detailed descriptions, improving model performance significantly.
https://arxiv.org/abs/2405.02793

Sharpness-Aware Minimization (SAM) excels in label-noise robustness, with peak performance under early stopping, attributed to changes in the logit term and the network Jacobian. Alternative methods can mimic SAM's regularization effects effectively.
https://arxiv.org/abs/2405.03676

The paper addresses challenges in training large-scale machine learning models, focusing on numeric deviation as a cause of instability, with a case study on the Flash Attention optimization.
https://arxiv.org/abs/2405.02803

[QA] Understanding LLMs Requires More Than Statistical Generalization (10:10)
The paper discusses the non-identifiability of large language models (LLMs) and its implications for generalization, highlighting the need for a new theoretical perspective.
https://arxiv.org/abs/2405.01964

Understanding LLMs Requires More Than Statistical Generalization (19:25)
The paper discusses the non-identifiability of large language models (LLMs) and its implications for generalization, highlighting the need for a new theoretical perspective.
https://arxiv.org/abs/2405.01964

[QA] Mitigating LLM Hallucinations via Conformal Abstention (8:28)
The paper develops a method for large language models to abstain from providing incorrect answers, using self-consistency and conformal prediction to reduce hallucination rates.
https://arxiv.org/abs/2405.01563

Mitigating LLM Hallucinations via Conformal Abstention (19:11)
The paper develops a method for large language models to abstain from providing incorrect answers, using self-consistency and conformal prediction to reduce hallucination rates.
https://arxiv.org/abs/2405.01563

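A minimal self-consistency-style abstention rule, as a sketch: sample several answers and respond only when the majority share clears a threshold. The threshold and the hard-coded samples here are made up for illustration; the paper's contribution is calibrating the abstention decision with conformal prediction rather than picking a threshold by hand:

```python
from collections import Counter

def abstain_or_answer(samples, threshold=0.7):
    # samples: answers drawn from repeated model calls on the same question.
    answer, count = Counter(samples).most_common(1)[0]
    agreement = count / len(samples)
    # Answer only when the most common response is sufficiently dominant;
    # otherwise abstain (None) instead of risking a hallucination.
    return answer if agreement >= threshold else None

assert abstain_or_answer(["42", "42", "42", "17"]) == "42"   # 0.75 agreement
assert abstain_or_answer(["42", "17", "8", "42"]) is None    # 0.50 agreement
```

Conformal calibration would replace the fixed `threshold` with one chosen on held-out data to guarantee a target error rate among the questions the model does answer.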