Patrick Lewis on Retrieval-Augmented Generation - Weaviate Podcast #76!

Weaviate Podcast


10M ago 58:58


Hey everyone, I am SUPER excited to present our 76th Weaviate Podcast featuring Patrick Lewis, an NLP Research Scientist at Cohere! Patrick has had an absolutely massive impact on Natural Language Processing with AI and Deep Learning! Especially notable for the current climate in AI and Weaviate is that Patrick is the lead author of the original "Retrieval-Augmented Generation" paper! Patrick has contributed to many other profoundly impactful papers in the space as well, such as DPR, Atlas, Task-Aware Retrieval with Instructions, and many others! This was such an illuminating conversation. Here is a quick overview of the chapters in the podcast:

1. Origin of RAG - Patrick explains the build-up that led to the RAG paper: AskJeeves, IBM Watson, and the conceptual shift to retrieve-then-read in mainstream connectionist approaches to AI.
2. Atlas - Atlas shows that a much smaller LLM, when paired with Retrieval-Augmentation, can still achieve competitive few-shot and zero-shot task performance. This is super impactful because few-shot and zero-shot capability has been a massive evangelist for AI broadly, and the fact that smaller Retrieval-Augmented models can do this is massive for economically unlocking these applications.

Teasing apart some architectural details of RAG:

3. Fusion In-Decoder - An interesting encoder-decoder transformer design in which each document is paired with the query and encoded separately; the encoded representations are then concatenated and passed to the decoder.
4. End-to-End RAG - How should we think about jointly training an embedding model and an LLM augmented with retrieval?
5. Query Routers - How do we route queries to, say, SQL or Vector DBs? (More nuance on this later with Multi-Index Retrieval.)
6. ConcurrentQA - Super interesting work on the privacy of multi-index routers. For example, if you ask "Who is the father of our new CEO?", the public query about the father may reveal private information about the new CEO.
7. Multi-Index Retrieval
8. New APIs for LLMs
9. Self-Instructed Gorillas
10. Task-Aware Retrieval with Instructions
11. Editing Text, EditEval and PEER
12. What future direction excites you the most?

Links:
Learn more about Patrick Lewis: https://www.patricklewis.io/
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks: https://arxiv.org/abs/2005.11401
Atlas: https://arxiv.org/pdf/2208.03299.pdf
Fusion In-Decoder: https://arxiv.org/pdf/2007.01282.pdf

Chapters
0:00 Welcome Patrick Lewis!
0:36 Origin of RAG
5:20 Atlas
10:43 Fusion In-Decoder
17:50 End-to-End RAG
27:05 Query Routers
32:05 ConcurrentQA
37:30 Multi-Index Retrieval
40:05 New APIs for LLMs
41:50 Self-Instructed Gorillas
44:35 Task-Aware Retrieval with Instructions
52:00 Editing Text, EditEval and PEER
55:35 What future direction excites you the most?


#Tech #Weaviate