[QA] Let’s Think Dot by Dot: Hidden Computation in Transformer Language Models
Transformers can use meaningless filler tokens to solve tasks, but learning to use them is challenging. Additional tokens can provide computational benefits independently of token choice.
https://arxiv.org/abs/2404.15758
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/support
1037 episodes