Content provided by NLP Highlights and Allen Institute for Artificial Intelligence. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by NLP Highlights and Allen Institute for Artificial Intelligence or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.

111 - Typologically diverse, multi-lingual, information-seeking questions, with Jon Clark

38:29
 
For this episode, we invited Jon Clark from Google to talk about TyDi QA, a new question answering dataset. The dataset contains information-seeking questions in 11 typologically diverse languages, i.e., languages that differ from each other in key structural and functional features. The questions in TyDi QA are information-seeking, like those in Natural Questions, which we discussed in the previous episode. TyDi QA's questions were also collected in each language using independent crowdsourcing pipelines, unlike some other multilingual QA datasets such as XQuAD and MLQA, where English data is translated into other languages. The dataset and the leaderboard can be accessed at https://ai.google.com/research/tydiqa.