Episode 117: Vector Databases, The Embeddings Revolution, and Working in China with Frank Liu
Show Notes
- (01:41) Frank shared formative experiences of his upbringing moving from China to the US.
- (04:45) Frank described his overall academic experience at Stanford, studying Electrical Engineering with a minor in Computer Science.
- (08:41) Frank talked about his research and industry experience while at Stanford.
- (11:34) Frank shared his proudest accomplishments working at Yahoo as a research engineer in the Vision and Machine Learning group.
- (16:37) Frank went over his experience co-founding a company that developed indoor localization and navigation solutions called Orion.
- (23:06) Frank walked through his decision to leave Silicon Valley for China.
- (26:02) Frank talked about his experience living and doing business in China (check out his two-part blog series, which covers everyday life and the pandemic story in China).
- (32:44) Frank elaborated on the work culture differences between the East and the West.
- (37:58) Frank reflected on his decision to join Zilliz back in August 2021.
- (42:55) Frank unpacked the notion of vector databases for the uninitiated.
- (47:44) Frank provided a brief overview of the high-level design of Milvus, Zilliz's advanced open-source vector database solution.
- (51:38) Frank highlighted three unique use cases of Milvus: malware detection, reverse image search, and drug discovery.
- (56:51) Frank introduced Towhee, an open-source project that helps software engineers develop and deploy applications that utilize embeddings in just a few lines of code.
- (01:01:59) Frank anticipated the evolution of the embedding tooling landscape to support the increasing adoption of unstructured data.
- (01:04:21) Frank gave a primer on Zilliz Cloud, Zilliz's enterprise vector database solution.
- (01:06:30) Closing segment.
Frank's Contact Info
Zilliz's Resources
- Website | Twitter | LinkedIn | GitHub | YouTube
- Zilliz Cloud Database
- Milvus (Docs | GitHub)
- Towhee (Docs | GitHub)
Mentioned Content
Articles and Presentations
- A Gentle Introduction to Vector Databases (Dec 2021)
- My Experience Living and Working in China, Part I (Feb 2022)
- My Experience Living and Working in China, Part II (March 2022)
- Making ML More Accessible for Application Developers (April 2022)
- Understanding Neural Network Embeddings (April 2022)
- Building An Open-Source Platform for Generating Embedding Vectors (Berlin Buzzwords, 2022)
People
- Yann LeCun (Chief AI Scientist at Meta, Professor at NYU)
- Yangqing Jia (Creator of the Caffe deep learning framework)
- Soumith Chintala (Creator of the PyTorch deep learning framework)
Book
- A Short History of Nearly Everything (by Bill Bryson)
Notes
My conversation with Frank was recorded back in August 2022. The Zilliz team has had some important announcements in 2023 that I recommend looking at:
- The landing page of Zilliz Cloud
- The beta launch of Milvus 2.3
- The development of GPTCache
- The OSS Chat demo application
About the show
Datacast features long-form, in-depth conversations with practitioners and researchers in the data community to walk through their professional journeys and unpack the lessons learned along the way. I invite guests coming from a wide range of career paths — from scientists and analysts to founders and investors — to analyze the case for using data in the real world and extract their mental models (“the WHY and the HOW”) behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.
Datacast is produced and edited by James Le. For inquiries about sponsoring the podcast, email khanhle.1013@gmail.com.
Subscribe by searching for Datacast wherever you get podcasts, or click one of the links below:
If you’re new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.