Emmanuel Ameisen, Head of AI at Insight, on Building a Semantic Search System for Images

35:19
 
On this week’s podcast, Wes Reisz talks to Emmanuel Ameisen, head of AI for Insight Data Science, about building a semantic search system for images using convolutional neural networks and word embeddings, and how you can build on the work done by companies like Google. The conversation explores where the gaps are and where you need to train your own models, and wraps up with a discussion of how to get something like this into production.

Why listen to this podcast:

- A common use case is the ability to search for similar things: "I want to find another pair of sunglasses like these", "I want a cat that looks like this picture", or even a tool like Google’s Smart Reply can all be considered broadly the domain of semantic search.
- For image classification you generally want a convolutional neural network. To generate embeddings, you typically take a model pre-trained on a public data set such as ImageNet, run it up to the penultimate layer, and store the value of the activations (sketched in the code examples below).
- From here the idea is to mix image embeddings with word embeddings. An embedding, whether for a word or an image, is just a vector that represents a thing. There are many approaches to getting vectors for words, but the one that started it all is word2vec.
- For both image embeddings and word embeddings you can typically use pre-trained models, meaning that you only need to train the final step of bringing the two models together.
- Before deploying to production it is important to validate the model against biases such as sexism, typically using outside people to carry out a thorough audit.

More on this: quick-scan our curated show notes on InfoQ https://bit.ly/2RAEUrV

You can also subscribe to the InfoQ newsletter to receive weekly updates on the hottest topics from professional software development: bit.ly/24x3IVq

Subscribe: www.youtube.com/infoq
Like InfoQ on Facebook: bit.ly/2jmlyG8
Follow on Twitter: twitter.com/InfoQ
Follow on LinkedIn: www.linkedin.com/company/infoq
Check the landing page on InfoQ: https://bit.ly/2RAEUrV
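
As a rough illustration of the penultimate-layer idea mentioned above, here is a minimal sketch assuming TensorFlow/Keras and a VGG16 model pre-trained on ImageNet. The specific model, layer name ("fc2"), and input size are assumptions for the example, not something prescribed in the episode.

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Model

# Load the full ImageNet classifier, then truncate it at the penultimate
# fully connected layer ("fc2", 4096-d) and use its activations as the
# image embedding.
base = VGG16(weights="imagenet")
embedder = Model(inputs=base.input, outputs=base.get_layer("fc2").output)

def embed_image(path):
    """Return a 4096-d embedding for the image at `path`."""
    img = image.load_img(path, target_size=(224, 224))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    return embedder.predict(x)[0]

# Compute and store one vector per catalogue image; "similar images" are
# then simply nearest neighbours in this embedding space.
```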
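For the word side, a short sketch of loading pre-trained word vectors, assuming the gensim library and its downloadable Google News word2vec model (an assumption; the episode only names word2vec as the approach that started it all).

```python
# Pre-trained word vectors via gensim's downloader (the model is ~1.6 GB).
import gensim.downloader as api

word_vectors = api.load("word2vec-google-news-300")   # KeyedVectors, 300-d

cat = word_vectors["cat"]                   # embedding for a single word
print(word_vectors.most_similar("sunglasses", topn=3))
```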
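Finally, a hedged sketch of "training only the final step": one common way to bring the two spaces together is to learn a single projection from image embeddings into the word-vector space (DeViSE-style), so that text queries and images become directly comparable via cosine similarity. The layer size, loss, and helper names here are illustrative assumptions, not the exact setup described in the episode.

```python
import numpy as np
from tensorflow.keras import layers, models

IMG_DIM, WORD_DIM = 4096, 300   # VGG16 fc2 size, word2vec size (assumed)

# A single dense layer that projects image embeddings into word-vector space.
projector = models.Sequential([
    layers.Dense(WORD_DIM, input_shape=(IMG_DIM,)),
])
projector.compile(optimizer="adam", loss="cosine_similarity")

# image_embeddings: (N, 4096) vectors from the CNN sketch above
# label_vectors:    (N, 300) word vectors for each image's label
# projector.fit(image_embeddings, label_vectors, epochs=10)

def search_by_text(query_vector, projected_images, k=5):
    """Indices of the k images whose projected embeddings are closest
    (by cosine similarity) to a query word vector."""
    imgs = projected_images / np.linalg.norm(projected_images, axis=1, keepdims=True)
    q = query_vector / np.linalg.norm(query_vector)
    return np.argsort(-(imgs @ q))[:k]
```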