Artwork

Content provided by Nicolay Gerold. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Nicolay Gerold or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.
Player FM - Podcast App
Go offline with the Player FM app!

Building Taxonomies: Data Models to Remove Ambiguity from AI and Search | S2 E8

58:40
 
Share
 

Manage episode 443565022 series 3585930
Content provided by Nicolay Gerold. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Nicolay Gerold or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.

Today we have Jessica Talisman with us, who is working as an Information Architect at Adobe. She is (in my opinion) the expert on taxonomies and ontologies.

That’s what you will learn today in this episode of How AI Is Built. Taxonomies, ontologies, knowledge graphs.

Everyone is talking about them no-one knows how to build them.

But before we look into that, what are they good for in search?

Imagine a large corpus of academic papers. When a user searches for "machine learning in healthcare", the system can:

  • Recognize "machine learning" as a subcategory of "artificial intelligence"
  • Identify "healthcare" as a broad field with subfields like "diagnostics" and "patient care"
  • We can use these to expand the query or narrow it down.
  • We can return results that include papers on "neural networks for medical imaging" or "predictive analytics in patient outcomes", even if these exact phrases weren't in the search query
  • We can also filter down and remove papers not tagged with AI that might just mention it in a side not.

So we are building the plumbing, the necessary infrastructure for tagging, categorization, query expansion and relexation, filtering.

So how can we build them?

1️⃣ Start with Industry Standards • Leverage established taxonomies (e.g., Google, GS1, IAB) • Audit them for relevance to your project • Use as a foundation, not a final solution

2️⃣ Customize and Fill Gaps • Adapt industry taxonomies to your specific domain • Create a "coverage model" for your unique needs • Mine internal docs to identify domain-specific concepts

3️⃣ Follow Ontology Best Practices • Use clear, unique primary labels for each concept • Include definitions to avoid ambiguity • Provide context for each taxonomy node

Jessica Talisman:

Nicolay Gerold:

00:00 Introduction to Taxonomies and Knowledge Graphs 02:03 Building the Foundation: Metadata to Knowledge Graphs 04:35 Industry Taxonomies and Coverage Models 06:32 Clustering and Labeling Techniques 11:00 Evaluating and Maintaining Taxonomies 31:41 Exploring Taxonomy Granularity 32:18 Differentiating Taxonomies for Experts and Users 33:35 Mapping and Equivalency in Taxonomies 34:02 Best Practices and Examples of Taxonomies 40:50 Building Multilingual Taxonomies 44:33 Creative Applications of Taxonomies 48:54 Overrated and Underappreciated Technologies 53:00 The Importance of Human Involvement in AI 53:57 Connecting with the Speaker 55:05 Final Thoughts and Takeaways

  continue reading

29 episodes

Artwork
iconShare
 
Manage episode 443565022 series 3585930
Content provided by Nicolay Gerold. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Nicolay Gerold or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.

Today we have Jessica Talisman with us, who is working as an Information Architect at Adobe. She is (in my opinion) the expert on taxonomies and ontologies.

That’s what you will learn today in this episode of How AI Is Built. Taxonomies, ontologies, knowledge graphs.

Everyone is talking about them no-one knows how to build them.

But before we look into that, what are they good for in search?

Imagine a large corpus of academic papers. When a user searches for "machine learning in healthcare", the system can:

  • Recognize "machine learning" as a subcategory of "artificial intelligence"
  • Identify "healthcare" as a broad field with subfields like "diagnostics" and "patient care"
  • We can use these to expand the query or narrow it down.
  • We can return results that include papers on "neural networks for medical imaging" or "predictive analytics in patient outcomes", even if these exact phrases weren't in the search query
  • We can also filter down and remove papers not tagged with AI that might just mention it in a side not.

So we are building the plumbing, the necessary infrastructure for tagging, categorization, query expansion and relexation, filtering.

So how can we build them?

1️⃣ Start with Industry Standards • Leverage established taxonomies (e.g., Google, GS1, IAB) • Audit them for relevance to your project • Use as a foundation, not a final solution

2️⃣ Customize and Fill Gaps • Adapt industry taxonomies to your specific domain • Create a "coverage model" for your unique needs • Mine internal docs to identify domain-specific concepts

3️⃣ Follow Ontology Best Practices • Use clear, unique primary labels for each concept • Include definitions to avoid ambiguity • Provide context for each taxonomy node

Jessica Talisman:

Nicolay Gerold:

00:00 Introduction to Taxonomies and Knowledge Graphs 02:03 Building the Foundation: Metadata to Knowledge Graphs 04:35 Industry Taxonomies and Coverage Models 06:32 Clustering and Labeling Techniques 11:00 Evaluating and Maintaining Taxonomies 31:41 Exploring Taxonomy Granularity 32:18 Differentiating Taxonomies for Experts and Users 33:35 Mapping and Equivalency in Taxonomies 34:02 Best Practices and Examples of Taxonomies 40:50 Building Multilingual Taxonomies 44:33 Creative Applications of Taxonomies 48:54 Overrated and Underappreciated Technologies 53:00 The Importance of Human Involvement in AI 53:57 Connecting with the Speaker 55:05 Final Thoughts and Takeaways

  continue reading

29 episodes

All episodes

×
 
Loading …

Welcome to Player FM!

Player FM is scanning the web for high-quality podcasts for you to enjoy right now. It's the best podcast app and works on Android, iPhone, and the web. Signup to sync subscriptions across devices.

 

Quick Reference Guide