Episode 32: Economics, Data For Good and AI Research with Sara Hooker

Datacast

Content provided by James Le.

4y ago 1:33:38


(2:20) Sara shared her childhood growing up in Africa.
(4:05) Sara talked about her undergraduate experience at Carleton College studying Economics and International Relations.
(9:07) Sara discussed her first job working as an Economics Analyst at Compass Lexecon in the Bay Area.
(12:20) Sara joined Udemy as a data analyst, then transitioned to the engineering team to work on spam detection and recommendation algorithms.
(14:58) Sara dug deep into the “hustling period” of her career and how she brute-forced her way to grow as an engineer.
(17:24) In 2014, Sara founded Delta Analytics - a local Bay Area non-profit community of data scientists, engineers, and economists that believes in using data for good.
(20:53) Sara shared Delta’s collaboration with Eneza Education to empower students in Kenya to access quizzes by mobile texting (check out her presentation at ODSC West 2016).
(25:16) Sara shared Delta’s partnership with Rainforest Connection to identify illegal deforestation using streamed audio from the rainforest (check out her presentation at MLconf Seattle 2017).
(28:22) Sara unpacked her blog post Why “data for good” lacks precision, in which she described 4 key criteria frequently used to qualify an initiative as “data for good” and discussed some open challenges associated with each.
(36:34) Sara unpacked her blog post Slow learning, in which she revealed her journey to get accepted into the AI Residency program at Google AI.
(41:03) Sara discussed her initial research interest in model interpretability for deep neural networks and her work at Google, The (Un)reliability of Saliency Methods - which argues that saliency methods are not reliable enough to explain model predictions.
(45:55) Sara pushed the research above further with A Benchmark for Interpretability Methods in Deep Neural Networks, which proposes an empirical measure of the approximate accuracy of feature importance estimates in deep neural networks called RemOve And Retrain (ROAR).
(48:46) Sara explained why model interpretability is not always required (check out her talks at PyBay 2018, REWORK Toronto 2018, and REWORK San Francisco 2019).
(52:10) Sara explained the typical measurements of model reliability and their limitations, such as localization methods and points of failure.
(59:04) Sara explained why model compression is an interesting research direction and her work The State of Sparsity in Deep Neural Networks - which highlights the need for large-scale benchmarks in the field of model compression.
(01:02:49) Sara discussed her paper Selective Brain Damage: Measuring the Disparate Impact of Model Pruning - which explores the impact of pruning techniques for neural networks trained for computer vision tasks. Check out the paper website!
(01:05:08) Sara shared her future research directions on efficient pruning, sparse network training, and local gradient updates.
(01:06:56) Sara explained the premise behind her talk Gradual Learning at the Future of Finance Summit in 2019, in which she shared the three fundamental approaches to machine learning impact.
(01:12:20) Sara described the AI community in Africa as well as the issues the community is currently facing in both the investment landscape and the infrastructure ecosystem.
(01:18:00) Sara and her brother recently started a podcast called Underrated ML, which pitches underrated ideas in machine learning.
(01:20:15) Sara reflected on how her background in economics influences her career outlook in machine learning.
(01:25:42) Sara reflected on the differences between applied ML and research ML, and shared her advice for people choosing between these career paths.
(01:29:49) Closing segment.

Her Contact Information:

Her Recommended Resources:

Deep Learning Indaba
Southeast Asia Machine Learning School
MILA - AI For Humanity
Why “data for good” lacks precision (Sara's take on "Data for Good" initiatives)
Slow learning (Sara's journey to Google AI)
fast.ai
Sanity Checks for Saliency Maps by Julius Adebayo et al.
Focal Loss for Dense Object Detection by Tsung-Yi Lin et al.
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications by Andrew Howard et al.
Underrated ML (Sara’s new podcast)
Dumitru Erhan (Research Scientist at Google AI)
Samy Bengio (Research Scientist at Google AI)
Andrea Frome (Ex-Research Engineer at Google AI)
The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, and Jerome Friedman

About the show

Datacast features long-form, in-depth conversations with practitioners and researchers in the data community to walk through their professional journeys and unpack the lessons learned along the way. I invite guests coming from a wide range of career paths — from scientists and analysts to founders and investors — to analyze the case for using data in the real world and extract their mental models (“the WHY and the HOW”) behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.

Datacast is produced and edited by James Le. For inquiries about sponsoring the podcast, email khanhle.1013@gmail.com.

Subscribe by searching for Datacast wherever you get podcasts.

If you’re new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.
