Content provided by PyTorch, Edward Yang, and Team PyTorch. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by PyTorch, Edward Yang, and Team PyTorch or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.
CUDA graph trees

20:50
 
CUDA graph trees are the internal implementation of CUDA graphs used in PT2 when you pass mode="reduce-overhead" to torch.compile. Their primary innovation is that they allow memory to be reused across multiple CUDA graphs, as long as the graphs form a tree of potential execution paths. This greatly reduces the memory usage of CUDA graphs in PT2. Using CUDA graphs has some operational implications, which are described in the podcast.
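
To make the memory claim concrete, here is a toy accounting model in plain Python. It is only a sketch and not PyTorch's allocator: GraphNode, separate_pools_bytes, and shared_tree_bytes are hypothetical names, the byte counts are made up, and it assumes that each graph's allocations stay live for the rest of its path and that only one root-to-leaf path executes at a time.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GraphNode:
    """Hypothetical stand-in for one recorded CUDA graph in the tree."""
    name: str
    mem_bytes: int  # memory this graph allocates while it runs (made-up number)
    children: List["GraphNode"] = field(default_factory=list)

    def add_child(self, child: "GraphNode") -> "GraphNode":
        self.children.append(child)
        return child


def separate_pools_bytes(node: GraphNode) -> int:
    """Footprint if every graph keeps its own private memory pool."""
    return node.mem_bytes + sum(separate_pools_bytes(c) for c in node.children)


def shared_tree_bytes(node: GraphNode) -> int:
    """Footprint if sibling branches reuse the same memory.

    Only one path is live at a time in this model, so the cost is the
    most expensive root-to-leaf path rather than the sum of all nodes.
    """
    deepest = max((shared_tree_bytes(c) for c in node.children), default=0)
    return node.mem_bytes + deepest


# A forward graph followed by either a backward graph or an eval-only tail.
root = GraphNode("forward", 100)
root.add_child(GraphNode("backward", 60))
root.add_child(GraphNode("eval_tail", 30))

print(separate_pools_bytes(root))  # 190
print(shared_tree_bytes(root))     # 160
```

In this model the saving grows with the number of alternative branches, since each extra sibling adds to the separate-pool total but raises the shared total only if it is the most expensive path.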

82 episodes

PyTorch Developer Podcast
