Best Nicolay Gerold Podcasts (2024)

Data Processing for AI, Integrating AI into Data Pipelines, Spark | ep 16

1d ago · 46:26

This episode of "How AI Is Built" is all about data processing for AI. Abhishek Choudhary and Nicolay discuss Spark and alternatives to process data so it is AI-ready. Spark is a distributed system that speeds up data processing by keeping data in memory. Its core abstraction is the resilient distributed dataset (RDD), on which the higher-level DataFrame API is built to simplify data processing. When should you use Spar…

Building AI Agents for the Enterprise: Realistic Use Cases, Cost Controls, Seamless UX | ep 15

9d ago · 35:12

In this episode, Nicolay talks with Rahul Parundekar, founder of AI Hero, about the current state and future of AI agents. Drawing from over a decade of experience working on agent technology at companies like Toyota, Rahul emphasizes the importance of focusing on realistic, bounded use cases rather than chasing full autonomy. They dive into the ke…

Building Predictable Agents: Prompting, Compression, and Memory Strategies | ep 14

16d ago · 32:14

In this conversation, Nicolay and Richmond Alake discuss various topics related to building AI agents and using MongoDB in the AI space. They cover the use of agents and multi-agents, the challenges of controlling agent behavior, and the importance of prompt compression. When you are building agents, build them iteratively. Start with simple LLM ca…

Data Integration and Ingestion for AI & LLMs, Architecting Data Flows | changelog 3

18d ago · 14:53

In this episode, Kirk Marple, CEO and founder of Graphlit, shares his expertise on building efficient data integrations. Kirk breaks down his approach using relatable concepts: The "Two-Sided Funnel": This model streamlines data flow by converting various data sources into a standard format before distributing it. Universal Data Streams: Kirk expla…

ETL for LLMs, Integrating and Normalizing Unstructured Data | ep 13

25d ago · 36:48

In our latest episode, we sit down with Derek Tu, Founder and CEO of Carbon, a cutting-edge ETL tool designed specifically for large language models (LLMs). Carbon is streamlining AI development by providing a platform for integrating unstructured data from various sources, enabling businesses to build innovative AI applications more efficiently wh…

Serverless Data Orchestration, AI in the Data Stack, AI Pipelines | ep 12

30d ago · 28:06

In this episode, Nicolay sits down with Hugo Lu, founder and CEO of Orchestra, a modern data orchestration platform. As data pipelines and analytics workflows become increasingly complex, spanning multiple teams, tools and cloud services, the need for unified orchestration and visibility has never been greater. Orchestra is a serverless data orches…

Mastering Vector Databases: Product & Binary Quantization, Multi-Vector Search

1M ago · 40:06

Ever wondered how AI systems handle images and videos, or how they make lightning-fast recommendations? Tune in as Nicolay chats with Zain Hassan, an expert in vector databases from Weaviate. They break down complex topics like quantization, multi-vector search, and the potential of multimodal search, making them accessible for all listeners. Zain …

Building Robust AI and Data Systems, Data Architecture, Data Quality, Data Storage | ep 10

1M ago · 45:33

In this episode of "How AI is Built", data architect Anjan Banerjee provides an in-depth look at the world of data architecture and building complex AI and data systems. Anjan breaks down the basics using simple analogies, explaining how data architecture involves sorting, cleaning, and painting a picture with data, much like organizing Lego bricks…

Modern Data Infrastructure for Analytics and AI, Lakehouses, Open Source Data Stack | ep 9

2M ago · 27:53

Jorrit Sandbrink, a data engineer specializing in open table formats, discusses the advantages of decoupling storage and compute, the importance of choosing the right table format, and strategies for optimizing your data pipelines. This episode is full of practical advice for anyone looking to build a high-performance data analytics platform. Lake …

Knowledge Graphs for Better RAG, Virtual Entities, Hybrid Data Models | ep 8

2M ago · 36:40

Kirk Marple, CEO and founder of Graphlit, discusses the evolution of his company from a data cataloging tool to a platform designed for ETL (Extract, Transform, Load) and knowledge retrieval for Large Language Models (LLMs). Graphlit empowers users to build custom applications on top of its API that go beyond naive RAG. Key Points: Knowledge Graph…

Navigating the Modern Data Stack, Choosing the Right OSS Tools, From Problem to Requirements to Architecture | ep 7

2M ago · 38:12

From Problem to Requirements to Architecture. In this episode, Nicolay Gerold and Jon Erich Kemi Warghed discuss the landscape of data engineering, sharing insights on selecting the right tools, implementing effective data governance, and leveraging powerful concepts like software-defined assets. They discuss the challenges of keeping up with the e…

Data Orchestration Tools: Choosing the right one for your needs | ep 6

2M ago · 32:37

In this episode, Nicolay Gerold interviews John Wessel, the founder of Agreeable Data, about data orchestration. They discuss the evolution of data orchestration tools, the popularity of Apache Airflow, the crowded market of orchestration tools, and the key problem that orchestrators solve. They also explore the components of a data orchestrator, t…

Building Reliable LLM Applications, Production-Ready RAG, Data-Driven Evals | ep 5

2M ago · 29:40

In this episode of "How AI is Built", we learn how to build and evaluate real-world language model applications with Shahul and Jithin, creators of Ragas. Ragas is a powerful open-source library that helps developers test, evaluate, and fine-tune Retrieval Augmented Generation (RAG) applications, streamlining their path to production readiness. Mai…

Lance v2: Rethinking Columnar Storage for Faster Lookups, Nulls, and Flexible Encodings | changelog 2

3M ago · 21:33

In this episode of Changelog, Weston Pace dives into the latest updates to LanceDB, an open-source vector database and file format. Lance's new V2 file format redefines the traditional notion of columnar storage, allowing for more efficient handling of large multimodal datasets like images and embeddings. Weston discusses the goals driving LanceDB'…

Unlocking AI with Supabase: Postgres Configuration, Real-Time Processing, and Extensions | ep 4

3M ago · 31:57

Had a fantastic conversation with Christopher Williams, Solutions Architect at Supabase, about setting up Postgres the right way for AI. We dug deep into Supabase, exploring: core components and how they power real-time AI solutions; optimizing Postgres for AI workloads; the magic of pgvector and other key extensions; Supabase’s future and exciting n…

AI Inside Your Database, Real-Time AI, Declarative ML/AI | ep 3

3M ago · 36:04

If you've ever wanted a simpler way to integrate AI directly into your database, SuperDuperDB might be the answer. SuperDuperDB lets you easily apply AI processes to your data while keeping everything up-to-date with real-time calculations. It works with various databases and aims to make AI development less of a headache. In this podcast, we explo…

Supabase acquires OrioleDB, A New Database Engine for PostgreSQL | changelog 1

3M ago · 13:37

Supabase just acquired OrioleDB, a storage engine for PostgreSQL. Oriole gets creative with MVCC! It uses an UNDO log rather than keeping multiple versions of an entire data row (tuple). This means when you update data, Oriole tracks the changes needed to "undo" the update if necessary. Think of this like the "undo" function in a text editor. Inste…

AI Powered Data Transformation, Combining gen & trad AI, Semantic Validation | ep 2

3M ago · 37:09

Today’s guest is Antonio Bustamante, a serial entrepreneur who previously built Kite and Silo and is now working to fix bad data. He is building bem, the data tool to transform any data into the schema your AI and software needs. bem.ai is a data tool that focuses on transforming any data into the schema needed for AI and software. It acts as a sys…

Multimodal AI, Storing 1 Billion Vectors, Building Data Infrastructure | ep 1

3M ago · 34:04

Imagine a world where data bottlenecks, slow data loaders, or memory issues on the VM don't hold back machine learning. Machine learning and AI success depends on the speed you can iterate. LanceDB is here to enable fast experiments on top of terabytes of unstructured data. It is the database for AI. Dive with us into how LanceDB was built, what…

