WTF is he talking about? Washed-up, wannabe wizard turned family man creates his own universe while sniffing out the truth. This podcast is for the colorful, deep-thinking, truth-seeking individuals who aren't afraid to look at the cross.
A show about not just the technologies, but the people and stories behind them. In every episode, Ronak and Guang sit down with engineers, founders, and investors to chat about their paths, lessons they’ve learned and, of course, the misadventures along the way.
A podcast celebrating and reviewing the Savage Dragon comic and all things Erik Larsen
How AI is Built dives into the different building blocks necessary to develop AI applications: how they work, how you can get started, and how you can master them. Build on the breakthroughs of others. Follow along, as Nicolay learns from the best data engineers, ML engineers, solution architects, and tech founders.
The goal of this podcast is to create a place where people discuss their inside views about existential risk from AI.
Whether for entertainment or enterprise, getting into the metaverse isn't an intuitive exercise. To read, watch and listen to more metaverse stories, head to digitalnationaus.com.au
A data-driven blog
Head vs Torso Queries | Grokking Search | Micro-lesson
6:42
Your queries are on a spectrum. Head queries: high volume, general, few distinct queries. Tail queries: low volume, specific, many distinct queries. When we talk about volume, we mean the number of searches with the same query term. Tail queries still account for a large share of search volume, but only in aggregate, as a distribution. What counts into the head, torso, and tail que…
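The head/torso/tail split described above can be sketched from a raw query log. This is a minimal illustration, not the episode's method: the function name `segment_queries` and the 30% / 60% cumulative-traffic cutoffs are assumptions chosen for the example, since the description is cut off before it says where the boundaries lie.

```python
from collections import Counter

def segment_queries(query_log, head_share=0.3, torso_share=0.6):
    """Label each distinct query as head, torso, or tail.

    The cutoffs are hypothetical: a query is "head" while the more popular
    queries before it cover less than head_share of all searches, "torso"
    until torso_share is covered, and "tail" after that.
    """
    counts = Counter(q.strip().lower() for q in query_log)
    total = sum(counts.values())
    segments = {}
    covered = 0  # searches covered by queries more popular than this one
    for query, volume in counts.most_common():
        if covered < head_share * total:
            segments[query] = "head"
        elif covered < torso_share * total:
            segments[query] = "torso"
        else:
            segments[query] = "tail"
        covered += volume
    return segments

log = (["shoes"] * 5 + ["red shoes"] * 3
       + ["red running shoes size 10", "blue suede shoes"])
segments = segment_queries(log)
```

In this toy log one general query carries half the traffic, while the specific queries each appear once; on real logs the cutoffs would need tuning against the actual distribution.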
Grokking Synthetic Biology | Dmitriy Ryaboy (Twitter, Ginkgo Bioworks)
1:08:51
From building a data platform and Parquet at Twitter to using AI to make biology easier to engineer at Ginkgo Bioworks, Dmitriy joins the show to chat about the early days of big data, the conversation that made him jump into SynBio, LLMs for proteins and more. Segments: (00:03:18) Data engineering roots (00:05:40) Early influences at Lawrence Berk…
Query Understanding: Doing The Work Before The Query Hits The Database | S2 E1
53:02
Welcome back to How AI Is Built. We have got a very special episode to kick off season two. Daniel Tunkelang is a search consultant currently working with Algolia. He is a leader in the field of information retrieval, recommender systems, and AI-powered search. He has worked for Canva, Algolia, Cisco, Gartner, and Handshake, to name a few. His core focus i…
Facets vs Filters | Grokking Search | Micro-lesson
4:39
Facets let your users tune their search results. They are not filters. Filters eliminate search results; facets give users the ability to narrow them. In the frontend, facets let the user limit the results to a smaller, more specific set. They are extremely valuable in combination with head queries (reference), which return…
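The distinction can be sketched in a few lines. This is an illustration under assumptions, not code from the episode: the product records and the helper names `apply_filter` and `facet_counts` are made up for the example. The filter removes results outright, while the facet only reports which values exist in the current results so the user can choose to narrow them.

```python
from collections import Counter

# Hypothetical result set for a broad head query such as "shoes".
results = [
    {"name": "Trail Runner", "brand": "Acme", "in_stock": True},
    {"name": "City Sneaker", "brand": "Acme", "in_stock": False},
    {"name": "Court Classic", "brand": "Zoom", "in_stock": True},
]

def apply_filter(items, field, value):
    """Filter: drop every item whose field does not equal value."""
    return [item for item in items if item[field] == value]

def facet_counts(items, field):
    """Facet: count each value of a field across the current results."""
    return Counter(item[field] for item in items)

# A filter eliminates out-of-stock items before the user sees anything.
available = apply_filter(results, "in_stock", True)

# A facet summarizes what is left, so the UI can offer brand choices.
brand_facet = facet_counts(available, "brand")

# Only when the user picks a facet value is the result set narrowed.
narrowed = apply_filter(available, "brand", "Acme")
```

The same predicate does the narrowing in both cases; what differs in this sketch is who decides. The filter is applied up front, and the facet is offered with counts and applied only on the user's choice.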
Early Twitter's fail-whale wars | Dmitriy Ryaboy
1:08:46
A veteran of early Twitter's fail whale wars, Dmitriy joins the show to chat about the time when 70% of the Hadoop cluster got accidentally deleted, the financial reality of writing a book, and how to navigate acquisitions. Segments: (00:00:00) The Infamous Hadoop Outage (00:02:36) War Stories from Twitter's Early Days (00:04:47) The Fail Whale Era…
Today we are launching season 2 of How AI Is Built. Over the last few weeks, we spoke to a lot of regular listeners and past guests, collected feedback, and analyzed our episode data. We will be applying the learnings to season 2. This season will be all about search. We are trying to make it better, more actionable, and more in-depth. The goal i…
Discovering the power of story-telling in engineering | Adam Gordon Bell (CoRecursive)
1:02:28
Known for hosting the CoRecursive podcast, which dives into the stories behind the code, Adam joins the show to chat about discovering that the great engineers he had looked up to are actually great communicators, his framework for building one of the best storytelling engineering podcasts, and the journey getting into DevRel. Chapters: (00:00:00) …
Behind designing Kubernetes' APIs | Brian Grant (Google)
2:10:56
As the original architect and API design lead of Kubernetes, Brian joins the show to chat about why "APIs are forever", the keys to evangelizing impactful projects, being an Uber Tech at Google, and more. Segments: (00:03:01) Internship with Mark Ewing (00:07:10) “Mark and Brian's Excellent Environment” manual (00:11:58) Poker on VT100 terminal…
Today Matthew and I reminisce about early childhood memories like kicking Dad in the nuts, being stung by a swarm of bees, the "NBA Jam Slap", and some of our most shameful moments like the "Black and White Stick", and much more. Also Matthew mentioned a movie called "The PeanutButter Gang" when in fact the movie he's referring to is called The But…
Ditching the rules to build a team that lasts | Bryan Cantrill, Steve Tuck (Oxide)
2:06:29
From building a new kind of server to building a new kind of company, co-founders Bryan and Steve join the show to chat about their "meet cute" and the origin story of Oxide, their unconventional recruiting process, transparent and uniform salaries, and their solution to the "N+1 shithead problem". Segments: (00:03:03) Bryan and Steve's "meet cute"…
Unlocking Value from Unstructured Data, Real-World Applications of Generative AI | ep 17
36:28
In this episode of "How AI is Built," host Nicolay Gerold interviews Jonathan Yarkoni, founder of Reach Latent. Jonathan shares his expertise in extracting value from unstructured data using AI, discussing challenging projects, the impact of ChatGPT, and the future of generative AI. From weather prediction to legal tech, Jonathan provides valuable …
Data Processing for AI, Integrating AI into Data Pipelines, Spark | ep 16
46:26
This episode of "How AI Is Built" is all about data processing for AI. Abhishek Choudhary and Nicolay discuss Spark and alternatives to process data so it is AI-ready. Spark is a distributed system that allows for fast data processing by utilizing memory. Its core data abstraction, the RDD (Resilient Distributed Dataset), underlies the DataFrame API that simplifies data processing. When should you use Spar…
Growing and selling an indie business | Michael Lynch (TinyPilot)
1:40:18
Having quit Google in 2018 to bootstrap indie software businesses, Michael is known for writing very transparently about the ups and downs of his journey. After recently selling his hardware business TinyPilot for $600K, Michael returns to the show to chat about the misconceptions about running an indie business, the hardest part of selling a compa…
Savage FINcast – Episode 132: This Title Was a Missed Opportunity
2:49:17
Like the song goes, “Opportunity comes once in a lifetime, Yo.” Then, “Something… Something… Mom’s Spaghetti!” And like the song, this episode of the Savage FINcast is about missing opportunity. Craig, Raven, Jim, and Mark take a look at Savage Dragon 271 as several running plot threads seem to pop off while another turns a major corner. Do th…
Building AI Agents for the Enterprise: Realistic Use Cases, Cost Controls, Seamless UX | ep 15
35:12
In this episode, Nicolay talks with Rahul Parundekar, founder of AI Hero, about the current state and future of AI agents. Drawing from over a decade of experience working on agent technology at companies like Toyota, Rahul emphasizes the importance of focusing on realistic, bounded use cases rather than chasing full autonomy. They dive into the ke…
Breaking distributed systems for fun and profit | Kyle Kingsbury (Jepsen)
1:23:17
Well-known for his insightful and meticulous write-ups on testing distributed systems, Kyle (aka Aphyr) joins the show to chat about the origins of Jepsen, how he built a business around testing distributed systems, his writing process, favorite databases, and more. Segments: (00:03:29) From Physics to Software Engineering (00:07:47) The origins of…
Building Predictable Agents: Prompting, Compression, and Memory Strategies | ep 14
32:14
In this conversation, Nicolay and Richmond Alake discuss various topics related to building AI agents and using MongoDB in the AI space. They cover the use of agents and multi-agents, the challenges of controlling agent behavior, and the importance of prompt compression. When you are building agents, build them iteratively. Start with simple LLM ca…
The 3 traps of open source funding models | Wes McKinney (pandas, Voltron Data, Posit)
1:08:51
From creating one of Python’s most influential libraries to co-founding Voltron Data, Wes joins the show to chat about why the book cover of the pandas book doesn’t feature a panda, open source pitfalls to avoid, the pros and cons of hiring engineers at a non-profit, and more. Segments: (00:02:50) Guang’s complaint about the pandas book cover (…
Data Integration and Ingestion for AI & LLMs, Architecting Data Flows | changelog 3
14:53
In this episode, Kirk Marple, CEO and founder of Graphlit, shares his expertise on building efficient data integrations. Kirk breaks down his approach using relatable concepts: The "Two-Sided Funnel": This model streamlines data flow by converting various data sources into a standard format before distributing it. Universal Data Streams: Kirk expla…
ETL for LLMs, Integrating and Normalizing Unstructured Data | ep 13
36:48
In our latest episode, we sit down with Derek Tu, Founder and CEO of Carbon, a cutting-edge ETL tool designed specifically for large language models (LLMs). Carbon is streamlining AI development by providing a platform for integrating unstructured data from various sources, enabling businesses to build innovative AI applications more efficiently wh…
Impact Driven Development | Matt Klein (Envoy, bitdrift)
1:19:18
From creating Envoy to co-founding bitdrift to reimagine mobile observability, Matt joins the show to chat about being told to simply “write some proxy in Python” in the early days of building Envoy, early influences from building “shrink wrap” software at Microsoft, the process of spinning bitdrift out of Lyft, and much more. Segments: (00:03:10) …
Serverless Data Orchestration, AI in the Data Stack, AI Pipelines | ep 12
28:06
In this episode, Nicolay sits down with Hugo Lu, founder and CEO of Orchestra, a modern data orchestration platform. As data pipelines and analytics workflows become increasingly complex, spanning multiple teams, tools and cloud services, the need for unified orchestration and visibility has never been greater. Orchestra is a serverless data orches…
Build the scary stuff | Bryan Cantrill (Oxide)
2:19:41
From being a distinguished engineer at Sun Microsystems to co-founding Oxide Computer Company to build a new kind of server, Bryan joins the show to chat about being told that he’s on a suicide mission when starting Oxide, the moment he felt “I’m actually living HBO Silicon Valley”, and lessons from Sun. And much more. Chapters: (00:02:24) The Orig…
Ready for a three-hour episode? Because the Savage FINcast has got a three-hour episode for ya. Craig, Raven, Jim, and Mark peel back the cover of Savage Dragon 270 to witness the aftermath of last issue. With Walter on the run, Mickey lurking in the shadows, and Frank slumming it in Dimension X, this issue has got more plot threads than you can…
Mastering Vector Databases: Product & Binary Quantization, Multi-Vector Search
40:06
Ever wondered how AI systems handle images and videos, or how they make lightning-fast recommendations? Tune in as Nicolay chats with Zain Hassan, an expert in vector databases from Weaviate. They break down complex topics like quantization, multi-vector search, and the potential of multimodal search, making them accessible for all listeners. Zain …
Lessons from the early days building Kafka and Confluent | Jay Kreps
1:16:08
From writing the first lines of Kafka over a Christmas break as a LinkedIn engineer to running a public company as the CEO of Confluent, Jay joins the show to chat about how he and his co-founders convinced investors to take a chance on their vision, what many engineers get wrong about communication, and why engineers can make great CEOs - even whe…
The bro is back in town! Matthew flew into town so we could do some podcasting. Today we got together with Dad. Some of the things we talked about were O.J. Simpson, training for Body For Life, Matthew dating twins, and more. If you'd like to support my family and me as we create more content, please come visit us at Patreon. patreon.com/dogbo Support the…
Building Robust AI and Data Systems, Data Architecture, Data Quality, Data Storage | ep 10
45:33
In this episode of "How AI is Built", data architect Anjan Banerjee provides an in-depth look at the world of data architecture and building complex AI and data systems. Anjan breaks down the basics using simple analogies, explaining how data architecture involves sorting, cleaning, and painting a picture with data, much like organizing Lego bricks…
Modern Data Infrastructure for Analytics and AI, Lakehouses, Open Source Data Stack | ep 9
27:53
Jorrit Sandbrink, a data engineer specializing in open table formats, discusses the advantages of decoupling storage and compute, the importance of choosing the right table format, and strategies for optimizing your data pipelines. This episode is full of practical advice for anyone looking to build a high-performance data analytics platform. Lake …
Building 2 Iconic OSSs Back-to-Back | Maxime Beauchemin (Airflow, Preset)
58:55
If you’ve worked on data problems, you probably have heard of Airflow and Superset, two powerful tools that have cemented their place in the data ecosystem. Building successful open-source software is no easy feat, and even fewer engineers have done this back to back. In part 2 of the conversation, we talk about Max’s journey in open source. Segmen…
Knowledge Graphs for Better RAG, Virtual Entities, Hybrid Data Models | ep 8
36:40
Kirk Marple, CEO and founder of Graphlit, discusses the evolution of his company from a data cataloging tool to a platform designed for ETL (Extract, Transform, Load) and knowledge retrieval for Large Language Models (LLMs). Graphlit empowers users to build custom applications on top of its API that go beyond naive RAG. Key Points: Knowledge Graph…
[Crosspost] Adam Gleave on Vulnerabilities in GPT-4 APIs (+ extra Nathan Labenz interview)
2:16:08
This is a special crosspost episode where Adam Gleave is interviewed by Nathan Labenz from the Cognitive Revolution. At the end I also have a discussion with Nathan Labenz about his takes on AI. Adam Gleave is the founder of Far AI, and he and Nathan discuss finding vulnerabilities in GPT-4's fine-tuning and Assistants APIs, Far AI's work exposing…
Navigating the Modern Data Stack, Choosing the Right OSS Tools, From Problem to Requirements to Architecture | ep 7
38:12
From Problem to Requirements to Architecture. In this episode, Nicolay Gerold and Jon Erich Kemi Warghed discuss the landscape of data engineering, sharing insights on selecting the right tools, implementing effective data governance, and leveraging powerful concepts like software-defined assets. They discuss the challenges of keeping up with the e…
Become a LLM-ready Engineer | Maxime Beauchemin (Airflow, Preset)
41:05
If you’ve worked on data problems, you probably have heard of Airflow and Superset, two powerful tools that have cemented their place in the data ecosystem. Building successful open-source software is no easy feat, and even fewer engineers have done this back to back. In Part 1 of this conversation, we chat about how to adapt to the LLM-age as engi…
Data Orchestration Tools: Choosing the right one for your needs | ep 6
32:37
In this episode, Nicolay Gerold interviews John Wessel, the founder of Agreeable Data, about data orchestration. They discuss the evolution of data orchestration tools, the popularity of Apache Airflow, the crowded market of orchestration tools, and the key problem that orchestrators solve. They also explore the components of a data orchestrator, t…
Building Reliable LLM Applications, Production-Ready RAG, Data-Driven Evals | ep 5
29:40
In this episode of "How AI is Built", we learn how to build and evaluate real-world language model applications with Shahul and Jithin, creators of Ragas. Ragas is a powerful open-source library that helps developers test, evaluate, and fine-tune Retrieval Augmented Generation (RAG) applications, streamlining their path to production readiness. Mai…
Life as a Distinguished Engineer | Joakim Recht (Uber)
1:15:43
Out of thousands of engineers at Uber, there’s only a handful of Distinguished Engineers, and Joakim was one of them. In this conversation we chat about: why software engineering is a lot like a sausage factory; considerations for leaving big tech for a startup; “How to beat the promo committee”; how one can effectively shape engineering culture; “Men…
Lance v2: Rethinking Columnar Storage for Faster Lookups, Nulls, and Flexible Encodings | changelog 2
21:33
In this episode of Changelog, Weston Pace dives into the latest updates to LanceDB, an open-source vector database and file format. Lance's new V2 file format redefines the traditional notion of columnar storage, allowing for more efficient handling of large multimodal datasets like images and embeddings. Weston discusses the goals driving LanceDB'…
Unlocking AI with Supabase: Postgres Configuration, Real-Time Processing, and Extensions | ep 4
31:57
Had a fantastic conversation with Christopher Williams, Solutions Architect at Supabase, about setting up Postgres the right way for AI. We dug deep into Supabase, exploring: core components and how they power real-time AI solutions; optimizing Postgres for AI workloads; the magic of PG Vector and other key extensions; Supabase’s future and exciting n…
Savage FINcast – Episode 130: Rockk Grokk Cockk
2:36:18
It's a moderately tardy episode of the Savage FINcast! Hosts Jim, Craig, Raven, and Mark take a look at Savage Dragon 269, where not even a singular 69 occurs. Malcolm and the new SOS are acclimating to San Fran life. But something fishy is going on down. And possibly something Mickey as well. All this, and Erik Larsen news, listener letters, and FIN…
AI Inside Your Database, Real-Time AI, Declarative ML/AI | ep 3
36:04
If you've ever wanted a simpler way to integrate AI directly into your database, SuperDuperDB might be the answer. SuperDuperDB lets you easily apply AI processes to your data while keeping everything up-to-date with real-time calculations. It works with various databases and aims to make AI development less of a headache. In this podcast, we explo…
Supabase acquires OrioleDB, A New Database Engine for PostgreSQL | changelog 1
13:37
Supabase just acquired OrioleDB, a storage engine for PostgreSQL. Oriole gets creative with MVCC! It uses an UNDO log rather than keeping multiple versions of an entire data row (tuple). This means when you update data, Oriole tracks the changes needed to "undo" the update if necessary. Think of this like the "undo" function in a text editor. Inste…
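The undo-log idea can be illustrated with a toy in-memory table. This is a sketch under stated assumptions and not OrioleDB's implementation: the class `UndoLogTable`, its methods, and the integer version counter are all hypothetical. Rows are updated in place, each update logs only the previous values of the columns it changed, and an older snapshot is rebuilt by replaying those records backwards.

```python
_MISSING = object()  # marks a column that did not exist before an update

class UndoLogTable:
    """Toy MVCC table: current rows plus an undo log (hypothetical design)."""

    def __init__(self):
        self.rows = {}      # key -> current column values, updated in place
        self.undo_log = []  # (version, key, previous values of changed columns)
        self.version = 0

    def update(self, key, **changes):
        self.version += 1
        current = self.rows.get(key, {})
        # Log only what is needed to undo this update, not a full row copy.
        previous = {col: current.get(col, _MISSING) for col in changes}
        self.undo_log.append((self.version, key, previous))
        self.rows[key] = {**current, **changes}
        return self.version

    def read(self, key, as_of=None):
        """Return the row now, or as it looked at version as_of."""
        row = dict(self.rows.get(key, {}))
        if as_of is None:
            return row
        # Walk the log newest-first, undoing every update after the snapshot.
        for version, logged_key, previous in reversed(self.undo_log):
            if version <= as_of:
                break
            if logged_key == key:
                for col, old in previous.items():
                    if old is _MISSING:
                        row.pop(col, None)
                    else:
                        row[col] = old
        return row

table = UndoLogTable()
v1 = table.update("u1", name="Ada", city="Paris")
v2 = table.update("u1", city="Berlin")
latest = table.read("u1")
snapshot = table.read("u1", as_of=v1)
```

In this sketch the second update stores only the old `city` value, which is the contrast with keeping a second full copy of the row; a real storage engine also has to handle concurrency, durability, and log cleanup, none of which is modeled here.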
We’re super excited to have Kelsey back on the show! Our last conversation was around his incredible career journey - from working at McDonald’s after school to starting his own computer store, to hacking on Python infrastructure with the core developers, to meeting Satya Nadella for an interview. In part two of this conversation, we dive deep into…
AI Powered Data Transformation, Combining gen & trad AI, Semantic Validation | ep 2
37:09
Today’s guest is Antonio Bustamante, a serial entrepreneur who previously built Kite and Silo and is now working to fix bad data. He is building bem, the data tool to transform any data into the schema your AI and software needs. bem.ai is a data tool that focuses on transforming any data into the schema needed for AI and software. It acts as a sys…
Ethan Perez on Selecting Alignment Research Projects (ft. Mikita Balesni & Henry Sleight)
36:45
Ethan Perez is a Research Scientist at Anthropic, where he leads a team working on developing model organisms of misalignment. YouTube: https://youtu.be/XDtDljh44DM Ethan is interviewed by Mikita Balesni (Apollo Research) and Henry Sleight (Astra Fellowship) about his approach to selecting projects for AI alignment research. A transcript & wr…
Pops and I Keep Riffing: Forgiveness, Encouragement, and Perseverance.
1:28:53
Today my Dad asks me some questions about my phone and wallet being stolen recently and about my 18-year-old cat Hendrix. We also talk about encouraging strangers at the gym and persevering through tough circumstances. If you'd like to support my family and me as we create more content, please come visit us at Patreon. patreon.com/dogbo Support the Show.…
Multimodal AI, Storing 1 Billion Vectors, Building Data Infrastructure | ep 1
34:04
Imagine a world where data bottlenecks, slow data loaders, or memory issues on the VM don't hold back machine learning. Machine learning and AI success depends on how fast you can iterate. LanceDB is here to enable fast experiments on top of terabytes of unstructured data. It is the database for AI. Dive with us into how LanceDB was built, what…
Engineer's guide to startup advising | Kelsey Hightower
49:50
We’re super excited to have Kelsey back on the show! Our last conversation was around his incredible career journey - from working at McDonald’s after school to starting his own computer store, to hacking on Python infrastructure with the core developers, to meeting Satya Nadella for an interview. In part one of this conversation, we dive deep into…
Pops and I Strike Again: Faith, Grace, Will, and Emotional Processing
1:42:17
Today Pops and I talk about faith, grace, will, and how different couples process things emotionally. If you'd like to support my family and me as we create more content, please come visit us at Patreon. patreon.com/dogbo Support the Show. By Mark Castleman McClanahan
The hard power of management and the soft power of senior ICs | Josh Wills
1:18:33
As a self-described “gainfully unemployed data person”, Josh Wills is an angel investor and has worked on and led data teams at Slack, Cloudera, WeaveGrid and Google. We discuss: How to get started with angel investing without a ton of $$; Attributes that define great engineering managers; What it’s like transitioning from management back to IC; Chall…