
Content provided by Weaviate. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Weaviate or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.

David Garnitz on VectorFlow - Weaviate Podcast #66!

1:04:35
 

Hey everyone! Thank you so much for watching the 66th Weaviate Podcast with David Garnitz, the creator of VectorFlow! VectorFlow (open-sourced on GitHub and linked below) is a new tool for ingesting data into vector databases such as Weaviate!

There is quite an interesting end-to-end stack emerging at the ingestion layer. It starts with retrieving data from misc. sources such as Slack, Salesforce, GitHub, Google Drive, Notion, ..., then chunking the text (maybe with the use of visual document layout parsers like what Unstructured is imagining) and potentially extracting metadata (say, the "age" of an NBA player, as in the Evaporate-Code+ research). The data is then sent off to embedding model inference, which opens its own can of worms from inference acceleration to load balancing, and finally the vectors themselves are imported into Weaviate!

I learned so much from this conversation. I really hope you enjoy listening, and please check out VectorFlow below!

VectorFlow: https://github.com/dgarnitz/vectorflow

Chapters
0:00 VectorFlow on GitHub!
0:52 Welcome David Garnitz!
1:17 Vector Flow, Founding Vision
2:00 Billions of Vectors in Weaviate!
4:20 End-to-end data importing
6:30 Metadata Extraction in Vector Database Flows
10:15 Vectorizing 100s of millions of billions of chunks
15:58 Fine-Tuning Embedding Models
23:50 Zero-Shot Models in Metadata and Chunking
36:36 Vector + SQL
42:45 Self-Driving Databases
49:23 Generative Feedback Loop REST API
51:38 GPT Cache
55:55 Building VectorFlow
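
For readers who want a concrete picture of the ingestion stages mentioned in the description (chunking, metadata extraction, embedding, import), here is a minimal Python sketch. It is not VectorFlow's code and does not use the Weaviate client. Every name in it (`Record`, `chunk_text`, `extract_metadata`, `toy_embed`, `ingest`) is hypothetical, the "embedding" is a hash-based stand-in for a real model call, and the "vector database" is a plain list.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Record:
    """One chunk plus its metadata and vector (hypothetical schema)."""
    text: str
    metadata: dict
    vector: list = field(default_factory=list)


def chunk_text(text, max_words=50):
    """Split text into fixed-size word windows (the simplest chunking strategy)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def extract_metadata(chunk, source):
    """Attach trivial metadata; a real pipeline might call a model here instead."""
    return {"source": source, "word_count": len(chunk.split())}


def toy_embed(chunk, dim=8):
    """Stand-in for embedding model inference: deterministic, not semantic."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def ingest(documents, store, max_words=50):
    """Run each document through chunk -> metadata -> embed -> import."""
    for source, text in documents.items():
        for chunk in chunk_text(text, max_words):
            store.append(Record(chunk, extract_metadata(chunk, source), toy_embed(chunk)))
    return store
```

In a real deployment the embedding step would be a batched call to a model server and the final append would be a batched write to the vector database; those are the parts the episode describes as needing load balancing and inference acceleration.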


101 episodes
