#008 Knowledge Graphs for Better RAG, Virtual Entities, Hybrid Data Models

How AI Is Built

Content provided by Nicolay Gerold.

1+ y ago 36:39

Kirk Marple, CEO and founder of Graphlit, discusses the evolution of his company from a data cataloging tool to a platform designed for ETL (Extract, Transform, Load) and knowledge retrieval for Large Language Models (LLMs). Graphlit empowers users to build custom applications on top of its API that go beyond naive RAG.

Key Points:

Knowledge Graphs: Graphlit uses knowledge graphs as a filtering layer on top of keyword metadata and vector search, aiding in information retrieval.
Storage for KGs: A single piece of content in their data model resides across multiple systems: a document store with JSON, a graph node, and a search index. This hybrid approach creates a virtual entity with representations in different databases.
Entity Extraction: Azure Cognitive Services and other models are employed to extract entities from text for improved understanding.
Metadata-first approach: The metadata-first strategy involves extracting comprehensive metadata from various sources, ensuring it is canonicalized and filterable. This approach aids in better indexing and retrieval of data, crucial for effective RAG.
Challenges: Entity resolution and deduplication remain significant challenges in knowledge graph development.
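
The hybrid storage and graph-filtering points above can be illustrated with a minimal sketch. This is not Graphlit's API: the `HybridStore` class, its method names, and the sample data are all hypothetical, and plain in-memory dictionaries stand in for the document store, graph database, and vector index. The sketch shows one content ID tying together three representations (the "virtual entity"), and an entity lookup in the graph narrowing the candidate set before vector similarity ranks it.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class HybridStore:
    """Hypothetical stand-in for three separate systems keyed by one content ID."""

    def __init__(self):
        self.documents = {}  # document store: content_id -> JSON-like metadata
        self.graph = {}      # graph: entity name -> set of linked content_ids
        self.vectors = {}    # search index: content_id -> embedding

    def add(self, content_id, metadata, entities, embedding):
        # One logical piece of content is written to all three representations.
        self.documents[content_id] = metadata
        for entity in entities:
            self.graph.setdefault(entity, set()).add(content_id)
        self.vectors[content_id] = embedding

    def search(self, query_vec, entity=None, top_k=3):
        # Step 1: the graph acts as a filter. If an entity is given,
        # only content linked to that entity remains a candidate.
        if entity is None:
            candidates = set(self.vectors)
        else:
            candidates = self.graph.get(entity, set())
        # Step 2: vector similarity ranks whatever survived the filter.
        ranked = sorted(
            candidates,
            key=lambda cid: cosine(query_vec, self.vectors[cid]),
            reverse=True,
        )
        return ranked[:top_k]


# Toy data: made-up entities and 2-dimensional embeddings for illustration.
store = HybridStore()
store.add("doc-1", {"title": "Q3 roadmap"}, {"Acme Corp", "Jane Doe"}, [1.0, 0.0])
store.add("doc-2", {"title": "Support ticket"}, {"Acme Corp"}, [0.0, 1.0])
store.add("doc-3", {"title": "Partner brief"}, {"Globex"}, [0.9, 0.1])

query = [1.0, 0.0]
unfiltered = store.search(query, top_k=2)                    # ['doc-1', 'doc-3']
filtered = store.search(query, entity="Acme Corp", top_k=2)  # ['doc-1', 'doc-2']
```

Without the entity filter, the closest vectors win regardless of what they are about. With the filter, the unrelated but similar `doc-3` is excluded before ranking. In a real system the entity names would first need resolution and deduplication, which the episode flags as the hardest part.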

Notable Quotes:

"Knowledge graphs is a filtering [mechanism]...but then I think also the kind of spidering and pulling extra content in is the other place this comes into play."
"Knowledge graphs to me are kind of like index per se...you're providing a new type of index on top of that."
"[For RAG]...you have to find constraints to make it workable."
"Entity resolution, deduping, I think is probably the number one thing."
"I've essentially built a connector infrastructure that would be like a FiveTran or something that Airflow would have..."
"One of the reasons is because we're a platform as a service, the burstability of it is really important. We can spin up to a hundred instances without any problem, and we don't have to think about it."
"Once cost and performance become a no-brainer, we're going to start seeing LLMs be more of a compute tool. I think that would be a game-changer for how applications are built in the future."

Chapters

00:00 Graphlit’s Hybrid Approach
02:23 Use Cases and Transition to Graphlit
04:19 Knowledge Graphs as a Filtering Mechanism
13:23 Using Gremlin for Querying the Graph
32:36 XML in Prompts for Better Segmentation
35:04 The Future of LLMs and Graphlit
36:25 Getting Started with Graphlit

Graphlit, knowledge graphs, AI, document store, graph database, search index, co-pilot, entity extraction, Azure Cognitive Services, XML, event-driven architecture, serverless architecture, graph RAG, developer portal

#Tech #Nicolay Gerold #Technology #LLM #Machine Learning #Data Engineering