The Golden Age of Data Science


How did one boy's stuffed yellow elephant permanently intertwine itself in history? What is a data scientist? Why is right now the golden age for data science? We take a crack at all three of these questions, the latter two with the help of Gregory Piatetsky-Shapiro and Ryan Henning.

Transcript

Ginette: “Over the past few years, we’ve seen these news flashes:

“An article in Harvard Business Review in 2014, titled: Data Scientist: the Sexiest Job of the 21st Century

“Mashable’s article in 2015: So You Wanna Be a Data Scientist? A Guide to 2015’s Hottest Profession

“Business Insider, 2016: Data Science was the #1 Profession as Rated by Glassdoor

“A data science industry observer, KDnuggets, 2017: Data Scientist: Best Job in America, Again, which cites the most recent Glassdoor report outlining the very top jobs in America:

“It turns out, four of the five top US jobs deal with data. In descending order, we find data scientist, DevOps engineer, data engineer, and analytics manager.”

Curtis: “With four out of five of these top jobs orbiting data, clearly something’s going on here.”

Ginette: “I’m Ginette.”

Curtis: “And I’m Curtis.”

Ginette: “And you are listening to Data Crunch.”

Curtis: “A podcast about how data and prediction shape our world.”

Ginette: “A Vault Analytics production.”

Ginette: “Today is a culmination of everything we’ve talked about in our series on the history of data science. This is where all the contributions of Florence Nightingale, William Playfair, Ronald Fisher, Ada Lovelace, and many others come together in one place. We’ll add a couple more people to this list to answer these two questions: ‘What is a data scientist? And why is right now the golden age of data science?’”

Curtis: “According to IBM, ‘every day, we create 2.5 quintillion bytes of data.’ But what does a quintillion actually look like?

“Well, if you take one quintillion pennies, you could actually place them face up, end to end, and blanket the entire surface of the Earth 1.5 times over. Or think about one quintillion ants. That would be like taking all of the ants that exist today on planet Earth, according to some estimates, and then multiplying that number by 100. So, that ant pile in your front yard becomes 100 ant piles in your front yard. Basically, ants take over the Earth. And we make 2.5 quintillion bytes every single day!

“The next question is, how much information does that actually represent? It’s 250,000 times the amount of information that all the printed material in the Library of Congress contains. And we make that every single day.”

Ginette: “In 2013, SINTEF published this stat, quote: ‘90% of the world’s data has been created in the preceding two years.’ According to one Ph.D. technologist, this has been true for the last 30 years because every two years, we produce 10 times as much data.”

Curtis: “This exponential growth is insane. Just as an example of this type of growth rate, take a hypothetical scenario: say the world’s population starts growing as rapidly as data is growing now. It would look like this: Currently, the world’s population, 7 billion people, could fit in an area the size of Texas if they were living as densely as they do in New York City. Now, in two years’ time with this growth rate, you’d actually have to cover the entire United States and half of Canada with people living in New York City-like density. And if you extrapolate that out ten years, keeping the growth rate the same, you’d have to cover the entire planet, including all of the oceans, with New York Cities, and then you’d have to do that with 100–150 additional Earths to fit all of those people. That’s the kind of growth rate we’re talking about.”

Ginette: “With data collection on the rise, one report goes so far as to say that only the data literate will have the chops to be executives in the future, quote:
