Content provided by Rudderstack. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Rudderstack or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.
179: Time Series Data Management and Data Modeling with Tony Wang of Stanford University

50:42
 

Highlights from this week’s conversation include:

  • Tony's background and research focus (3:35)
  • Challenges in academia and industry (6:15)
  • Ph.D. student's routine (10:47)
  • Academic paper review process (15:26)
  • Aha moments in research (20:05)
  • Academic lab structure (23:09)
  • The decision to move from hardware to data research (24:43)
  • Research focus on time series data management (27:40)
  • Data modeling in time series and OLAP systems (32:01)
  • Issues and potential solutions for parquet format (37:32)
  • Role of external indices in parquet files (42:19)
  • Tony's open source project (47:11)
  • Final thoughts and takeaways (49:30)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

375 episodes
