
Rohit Agarwal on Portkey - Weaviate Podcast #61!


Hey everyone! Thank you so much for watching the 61st episode of the Weaviate Podcast! I am beyond excited to publish this one! I first met Rohit at the Cal Hacks event hosted by UC Berkeley, where we had a debate about the impact of Semantic Caching! Rohit taught me a ton about the topic, and I think it's going to be one of the most impactful early applications of Generative Feedback Loops! Rohit is building Portkey, a SUPER interesting LLM middleware that does things like load balancing between LLM APIs. As discussed in the podcast, there are all sorts of opportunities in this space, whether it be routing to tool-specific LLMs, balancing different cost / accuracy requirements, or orchestrating multiple models in the HuggingGPT sense. It was amazing chatting with Rohit; this was the best dive into LLMOps I have personally been a part of! As always, we are more than happy to answer any questions or discuss any ideas you have about the content in the podcast! Check out Portkey here! https://portkey.ai/blog

Chapters
0:00 Introduction
0:24 Portkey, Founding Vision
2:20 LLMOps vs. MLOps
4:00 Inference Hosting Options
7:05 3 Layers of LLM Use
8:35 LLM Load Balancers
12:45 Fine-Tuning LLMs
17:08 Retrieval-Aware Tuning
21:16 Portkey Cost Savings
23:08 HuggingGPT
26:28 Semantic Caching
32:40 Frequently Asked Questions
34:00 Embeddings vs. Generative Tasks
35:30 AI Moats, GPT Wrappers
39:56 Unlocks from Cheaper LLM Inference
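To give a flavor of the semantic caching idea we debated: instead of keying a cache on the exact prompt string, you embed the prompt and reuse a stored response whenever a sufficiently similar prompt has been answered before, skipping a paid LLM call. Below is a minimal, self-contained Python sketch of that pattern. The bag-of-words embedding, the 0.8 similarity threshold, and the call_llm placeholder are all illustrative assumptions for the sketch, not Portkey's actual implementation; a real system would use a proper embedding model and a vector database such as Weaviate.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" so the sketch runs offline.
    # A real semantic cache would call a dedicated embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold  # illustrative cutoff, tune per use case
        self.entries = []           # list of (embedding, response) pairs

    def lookup(self, prompt: str):
        # Return the cached response of the most similar past prompt,
        # but only if it clears the similarity threshold.
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vec, response in self.entries:
            score = cosine(query, vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def store(self, prompt: str, response: str):
        self.entries.append((embed(prompt), response))


def call_llm(prompt: str) -> str:
    # Placeholder standing in for a real LLM API call.
    return f"<generated answer for: {prompt}>"


cache = SemanticCache()


def answer(prompt: str) -> str:
    cached = cache.lookup(prompt)
    if cached is not None:
        return cached                 # cache hit: no LLM call needed
    response = call_llm(prompt)
    cache.store(prompt, response)     # remember for future near-duplicates
    return response


print(answer("How do I reset my password?"))  # miss -> calls the LLM
print(answer("how can I reset my password"))  # near-duplicate -> cache hit
```

The interesting design question, which comes up in the episode, is the threshold: set it too low and users get stale or subtly wrong answers for genuinely different questions; set it too high and the cache rarely hits and saves nothing.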
