Content provided by Utsav Shah. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Utsav Shah or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.
Software at Scale 25 - Rajesh Venkataraman: Senior Staff Software Engineer at Google

52:16
 
Rajesh Venkataraman is a Senior Staff Engineer at Google, where he works on Privacy and Personalization at Google Pay. He has spent a large part of his career building and maintaining search systems. He worked on natural language processing at Microsoft, the cloud inference team at Google, and released parts of the search infrastructure at Dropbox.

In this episode, we discuss the nuances and technology behind search systems. We go over search infrastructure - data storage and retrieval, as well as search quality - tokenization, ranking, and more. I was especially curious about how image search and other advanced search systems work internally with constraints for low latency, high search quality, and cost-efficiency.

Highlights

08:00 - Getting started building a search system - where to begin? Some history.

13:30 - Why we should use different hardware for different parts of a high-throughput search system.

17:00 - What goes on behind the scenes in a search system when it has to incorporate a picture or a PDF? The rise of transformers, not the Optimus Prime kind.

We go on to discuss how transformers work at a very high level.

27:00 - The key idea for non-text search is being able to store, index, and search for vectors efficiently. Queries often reduce to nearest-neighbor searches. Indexing can involve techniques as simple as storing only the first few bits of each vector dimension in hashmaps.
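
The bucketing idea in the 27:00 highlight can be sketched roughly as follows. This is a minimal illustration and not the implementation discussed in the episode. It assumes every vector component is a float in [0, 1), and it reads "the first few bits" as the top bits of each component after quantization. The names `bucket_key` and `CoarseVectorIndex` are hypothetical.

```python
import math
from collections import defaultdict


def bucket_key(vector, bits=2):
    """Keep only the top `bits` bits of each component (assumed to lie in [0, 1))."""
    levels = 1 << bits
    return tuple(min(int(x * levels), levels - 1) for x in vector)


class CoarseVectorIndex:
    """Hashmap from a coarse key to the full vectors that share it."""

    def __init__(self, bits=2):
        self.bits = bits
        self.buckets = defaultdict(list)

    def add(self, doc_id, vector):
        self.buckets[bucket_key(vector, self.bits)].append((doc_id, vector))

    def search(self, query, k=1):
        # Only the query's own bucket is scanned, so the result is approximate:
        # a close neighbor just across a bucket boundary is missed.
        candidates = self.buckets.get(bucket_key(query, self.bits), [])
        ranked = sorted(candidates, key=lambda item: math.dist(query, item[1]))
        return [doc_id for doc_id, _ in ranked[:k]]


index = CoarseVectorIndex(bits=2)
index.add("cat.jpg", [0.10, 0.80])
index.add("kitten.jpg", [0.12, 0.85])
index.add("car.jpg", [0.90, 0.20])
nearest = index.search([0.11, 0.82], k=2)  # "car.jpg" sits in another bucket and is never compared
```

Using fewer bits gives larger buckets (more candidates to compare, fewer missed neighbors); using more bits gives the opposite trade-off.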

34:00 - How search systems efficiently rebuild their inverted indices based on changing data; internationalization for search systems; search user interface design and research.
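
One common way to keep an inverted index current without a full rebuild is to apply only the difference between a document's old and new terms. The sketch below illustrates that general idea and is not necessarily the approach described in the episode. The `InvertedIndex` class, its method names, and the whitespace tokenizer are all assumptions made for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Naive tokenizer (an assumption): lowercase, split on whitespace."""
    return text.lower().split()


class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.doc_terms = {}               # doc id -> terms currently indexed for it

    def _remove(self, doc_id, terms):
        for term in terms:
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]

    def upsert(self, doc_id, text):
        """Add or update a document, touching only the terms that changed."""
        new_terms = set(tokenize(text))
        old_terms = self.doc_terms.get(doc_id, set())
        self._remove(doc_id, old_terms - new_terms)
        for term in new_terms - old_terms:
            self.postings[term].add(doc_id)
        self.doc_terms[doc_id] = new_terms

    def delete(self, doc_id):
        self._remove(doc_id, self.doc_terms.pop(doc_id, set()))

    def search(self, query):
        """Return ids of documents containing every query term."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            docs = self.postings.get(term, set())
            result = set(docs) if result is None else result & docs
        return result


idx = InvertedIndex()
idx.upsert(1, "quarterly report draft")
idx.upsert(2, "travel report")
idx.upsert(1, "quarterly summary")  # document 1 edited: "report" and "draft" are dropped
```

Keeping `doc_terms` alongside the postings costs extra memory, but it lets an edit or delete find the affected postings lists directly instead of scanning the whole index.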

42:00 - How should a student interested in building a search system learn the best practices and techniques to do so?


This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit www.softwareatscale.dev