Getting into Data Science


What does it take to become a data scientist? We speak with three people who have become data scientists in the last three years and find out what it takes, in their opinions, to land a data science job and to be prepared for a career in the field.

Curtis: We’ve talked a lot in our recent episodes about all the interesting things you can do with data science, and we’ve only talked a little bit recently about what it actually takes to get into the field, which is a topic that a lot of you have reached out to us and asked us to cover in a more thorough way. So today, we’re taking a broader approach on this topic by talking to three data scientists who have become data scientists in the last three years. You’re going to be able to hear all the details of each of their three journeys, how they got started, how they landed their jobs, and what their best advice is for getting into the field, and this will give you a broad view about how to get into data science from three people who have actually done it.

Ginette: I’m Ginette.

Curtis: And I’m Curtis.

Ginette: And you are listening to Data Crunch.

Curtis: A podcast about how applied data science, machine learning, and artificial intelligence are changing the world.

Ginette: A Vault Analytics production.

Ginette: Here at Data Crunch we’ve been hard at work developing a technology that allows executives and business leaders to gain insight from their data instantly—simply by talking to the air. We hook up your data to an Alexa device with custom skills built in to understand the questions you have about your business – and give you answers. Figure out sales forecasts, marketing performance, operational compliance, progress on KPIs, and more by just talking to Alexa.

We are officially launching the product this week and have room for three initial customers—if you’re interested, head over to datacrunchcorp.com/alexa or datacrunchpodcast.com/alexa (both work), and book some time to chat with us. We’ll assess if your company is a good fit, and if so, we look forward to working with you!

Tyler Folkman: My name’s Tyler Folkman. I’ve gotten into data science in kind of a strange route, to be honest. I did my undergrad in economics, actually originally thinking to get into computer science, but for some reason, I had this thought that computer science was going to get outsourced; I don’t know if that was a thing, but I think people back in the early 2000s were talking about computer science getting outsourced, so I thought about business, which ended up being economics, which I really liked, and then ended up doing economic consulting, which is, basically, in usually large litigation cases, lawyers hire economists to value damages. So for example, when Samsung and Apple were suing each other, I worked on the Samsung side to help value how much they might sue Apple for, for patent infringement, and a lot of that involves statistical analyses, data analytics, econometrics as economists would call it.

And I got really interested in just this idea of data being a really powerful tool for making decisions and coming to conclusions, and so I started hearing about machine learning on the Internet, kind of dabbling with Python, which at the time, I was a Windows user, and it was a huge pain to get Python installed, but I kind of got it up and running, played around with things like scikit-learn, read some blogs, and really got into machine learning and found that it was really housed more in the computer science department at that time, and just kind of decided to apply to some computer science departments and was lucky to get in at the University of Texas at Austin and do some studies there, join a machine learning lab, and got to do some work at Amazon. Really got a really good set of experiences to kind of help me learn how to be both a programmer and a machine learning person, a little bit of statistics, and jumped straight from there over here to Ancestry and was lucky to land here with, honestly, a job that surprised me with how much data it had before I started looking into it. This is the first company I’ve been at with the role of data scientist as my actual title, where before I dabbled a lot with data and worked a lot with data but more as a statistician or an econometrician.

Ginette: Tyler points out the importance of learning programming skills. Even though he had a solid economics background, becoming a better programmer and statistician—along with, of course, his machine learning skills—were important to round out his resume.

This is Juan Pablo Murillo, and his path also took a few twists and turns before he became a data scientist.

Juan: My background is in mathematics. My undergrad and my master’s are both in mathematics. During my master’s I was curious about taking classes outside of math. I had never programmed. I had never really dealt with any applications more so than knowing that some of the theory and work we were doing had applications, so it was during the master’s that I decided to take classes in a different department, both in statistics and biostatistics, and that’s when I was introduced to statistical models, hypothesis testing, SAS, and R. After finishing up my master’s, I wanted a job in industry, or I was curious about it. I applied to a few, but I did not land any interviews. At the same time, I was applying to teaching positions and schools were calling me, and so as I was running out of time, or I thought I was running out of time anyway, I decided to take a teaching position.

I taught for one year to be honest, and when I was teaching, I was teaching math. The head of the department asked if anyone wanted to teach a Python class, and I had never programmed in Python, but I identified that as a great opportunity for me to learn, and so knowing that no one wanted it, I volunteered.

Curtis: After this teaching year, Juan really wanted to make the jump to industry.

Juan: I looked at many bootcamps, and there was one that caught my attention. It was in the Silicon Valley area, and I talked to the person in charge, and he accepted me to the program, but the program eventually did not run, but I wanted it. I was ready to make the move, so I called my parents and said, “Hey, I think I need your help,” and I explained to them what I was trying to do, and they said, “It’s fine. You can move out here; you know, we’ll help you out.” So I moved to Seattle. I started applying while taking online classes. I wasn’t landing anything, and then I joined a two-month bootcamp offered by Northeastern University. I figured I can use their connections. They might help with an internship, and so I did that. That was from January through March of 2016. They did help me with an internship with a startup that doesn’t exist anymore.

Then I kept on applying. I was going through some of the phone interviews, and then I was invited on-premise for an in-person interview, but I wasn’t landing anything. And I was getting discouraged, but I knew, “I’m making progress.” I wasn’t getting phone calls before. Now I’m going through the phone calls, and I’m interviewing online, and I’m one out of, let’s say, four or five candidates, and I was always changing my strategy or adapting new things to my strategy, right? Posting some of my work on RPubs.com. That was R code. I kept on trying out or working on R Shiny. Some more Python here and there. Updating my LinkedIn. Reaching out to people on LinkedIn. Asking for an internship on LinkedIn. Or somehow, going through people who went to the same schools as I did. There aren’t that many Sonoma State or University of IR graduates here in Seattle, but still those few, I try to utilize. I would go to meetups, and my strategy with meetups was to look for individuals who are in their late 30s, 40s, perhaps 50s, because I thought they are more likely to be in management, and I met one person. I shared my resume with him. I didn’t hear back at first. Two or three months later, through somebody else, he got my resume again. He invited me, and eventually I started working for him.

Curtis: Juan also had strong skills in mathematics, and he kept working on his programming skills, improving his education with a bootcamp, and continued to be persistent until he eventually landed a job.

This is Wendy LeFevre, who also jumped into data science from teaching.

Wendy: There are a lot of things I’ve done in the past five, six years that have gotten me to this place right now, data science. It definitely wasn’t an easy journey. I graduated with a Bachelor of Science in Mathematics from the University of Texas at El Paso, and I have a minor in education, so immediately after that, I started teaching mathematics for a high school back in El Paso. While I was in the program, the bachelor’s degree, I definitely, absolutely loved statistics. Like, that was my favorite subject. I love the fact that you could predict anything, and so the thing I got out of that bachelor’s was, I want to continue this, and I want to get a master’s degree in statistics ’cuz I definitely want to study this further.

Ginette: Wendy then went on to teach at a local high school in El Paso.

Wendy: After that I was very interested in robotics as a part of the robotics team that was at the high school that I was teaching at that point, and so I just started, you know, going there to their meetings, and they had a FIRST Tech Challenge team, and, you know, the students were showing me the things that they were coding, and so that was like my first exposure to coding other than the normal computer science courses that you have to take when you are a math major.

Ginette: She eventually moved to a high school in Austin, Texas, and then was asked to head up a University of Texas summer camp, all of which were highly successful experiences.

Wendy: They had to engage fully, immerse themselves into the project that I was giving them. So I was doing that, and as I did that, you know, we had meetings with UT faculty, and at that point they were telling me, like, you should share everything that you’ve been doing. It’s super interesting, and I presented at a conference at UT that was called “How to Make Students Love Statistics.” I showed them pictures of everything, all the crazy stuff that I was doing, and the faculty, they loved this, and they were super successful, and after that they offered me to teach and build a statistics program at UT.

Being part of that program, after that summer, the director of that UTeach outreach program offered me a position to actually direct all of that UT prefreshman engineering camp during the summer, and I obviously took it, and so I started directing this camp.

Then I remember coming up to the director of the program and asking her, “Hey, I want to introduce new stuff, and I just found out about this other thing,” and she was just like, “No, that curriculum is already done; it’s perfect,” so to me that was like a stopping point, the fact that I was not able to. I thought there’s always room for improvement, and I wanted to do more and keep learning, and that was not going to happen. It was like, “Okay, this is perfect. Don’t touch it.” I thought, I don’t think this is the job for me, so even though I had a director position, you know, I was faced with a very difficult decision, and it was, do I keep this position and keep doing this, or do I take a risk and look for a job as an entry-level analyst where I can go prove myself to somebody else, that I could do all this math and stats and R.

And so I did that. I took the risk. While I was looking for a job, I noticed that all of them required SQL, so you need to know SQL, and at that point, I had never coded in SQL. The only thing I had done was taking online courses for it, which I think prepares you a little bit, but it’s not like the real thing. Once you’re in a real company, you know, those queries are not a small join and a WHERE statement. So I landed a job at Kasasa.

Curtis: So Tyler, Juan, and Wendy all landed jobs as data scientists through different routes. What are some of the takeaways from the processes that they followed?

Tyler: I think the first and most important thing is trying to get past the resume screen, right, so like every company, even companies like Ancestry that aren’t extremely huge, gets tons of resumes. And what I did a lot of when I was looking for jobs is I tried to find ways to bypass that, so, like, if I knew someone there or if I could find a contact of someone there, people were surprisingly receptive to helping people out, so cold LinkedIn messaging people type of thing. For Ancestry, I actually found a relative who used to work there who connected me with someone who did work there who talked to me a little bit and was able to introduce me to some people, which I think got me the interview, which I think helps a lot because it’s easy to get screened out. Even if you’re really good, they might screen you out because they don’t really, you know.

And once I got to that point, it was really up to me to, you know, impress them, and I do think my time at Amazon helped me understand best practices around software engineering, which I think is helpful for data scientists, and then I’ve always been really passionate about machine learning, so I spent a lot of my free time just implementing algorithms, studying algorithms, making sure I understood how they worked, and so I think I was able to show that I had a basic understanding of the algorithms, had some business sense, which I think is important these days, or how to maybe apply them to Ancestry, and had the coding skills to actually make it happen. So I think those three things are what people were looking for at the time, which is enough machine learning for an entry data science job out of a master’s, some communication skills, ability to, like, have impact on a business, and then this general know-how around coding to get things out. At the end of the day, you have to write code to get in production, and so having some coding understanding is important.

Curtis: From what I’ve seen in the field as well, Tyler is spot on here. It’s really important to have coding skills and to know how the algorithms work, but it’s also really important to have some basic business sense, understand how to communicate the algorithms and their results to business users and other people so that they can understand what’s happening and why it’s important and what they can do with it. If you can communicate those things clearly and simply to a business user and to clients, then you’ll be extremely valuable in the business setting.

Juan: If I were to go differently, I would work on different projects because I didn’t have enough experience—that was always the feedback. We’re taking someone with more experience, right? So it could be that I’m not employed in an analytics position, but I am working on analytics projects, and here’s my portfolio, right? Here are all the projects I’m currently working on. I would work on more projects; I would make sure that those projects are in use cases that are applicable to the industry or industries that I’m considering entering.

I would be more active on LinkedIn from the get-go, participate in groups, so that I get more visibility on my name. Truly you never know where opportunity is going to come from.

I would focus on mastering one tool, not R and Python, just one of the two so that I get really good at one, and then present myself more as, I wouldn’t say an expert, but someone really strong in one of the two. Similarly, I would be much better at SQL, or any querying language. I would network more. I would improve my presentation skills. I would try to make it not so much about the tools that I’m using or the data quality or the algorithms, but I would make it more about telling a story.

Ginette: Wendy commented about how being a continuous learner is so important in this field.

Wendy: When I went through the interview process, they do test you on your SQL. It was something very simple. I was able to do it, and I think what I brought to the table was, you know, I can teach your whole team statistics. taken day they can learn how to use the right methodology when you’re answering business questions, and other than that, the most valuable thing that I could offer the employer was my ability to learn anything. I made sure that I made that point, like, okay, I might have entry-level SQL right now, but I can master this, and I can learn anything, so I think that’s what got me in the door.

Ginette: Like Nic Ryan said in our episode with him, SQL, or something like it, is incredibly important today. The impressive move Wendy took was convincing her potential bosses that she knew some SQL and could master it given the opportunity—which she did. Here are some other tidbits from Tyler, Juan, and Wendy as we talked to them about their journey into the field.

Tyler: It blows my mind how much great material there is for machine learning online, like Andrew Ng’s machine learning class on Coursera. Andrew Ng’s just released a deep learning class on Coursera. fast.ai has an open source class. These are some of the best people in the world providing awesome content. OpenAI provides a lot of their algorithms for free. Facebook, Google. It’s just insane the amount of resources that are available for you if you want to get into the field.

Curtis: Juan also told us about how being a data scientist stacked up to his expectations once he landed the job.

Juan: Keep pushing, right, and keep track of your progress, and if some of your strategies are not working, adopt new strategies and give them a try.

My first role was as a data science consultant, and then I was brought in-house by the client. I’m more in a position, I would say I’m an analytics consultant within my team. I am happy, yes. This is what I wanted. The expectation and the reality are a bit different. I do spend a big chunk of my time prepping data and writing Hive queries, creating more tables, and I also spend a lot of time in meetings, right? And definitely when I was learning the tools, it was all about machine learning and predictive modeling and algorithms, and it isn’t quite like that, but overall I am happy, and I knew that it wouldn’t be exactly as these intro courses are selling it to me.

Ginette: Wendy commented to us about how data science work is solving a continuous puzzle.

Wendy LeFevre: I feel like if you’re not interested in learning new things, ongoing, every single step, then this is not the job for you. Learning everything about data science requires you to be on top of it, you know. Just when you think like, “Oh, I have this, I got this,” somebody else comes up with a new algorithm for machine learning that is better, so you need to be on top of it. You need to understand how to read every single library. You know, there’s always better ways to do stuff, and you need to be on top of that analogy to make sure that, you know, you make the best out of your job, and so definitely loving puzzles. You need to be interested in continuous learning, you have to, and I would definitely tell them, you know, don’t let fear stop you from anything, you know, but I think it’s fear that immobilizes us when we are faced with challenges, and so if you’re not that kind, then this is not for you, but if you’re interested in challenging yourself, then this is definitely a fun job, and trust me, every step of the way, from putting a data set together, analyzing it, you know, creating hypotheses, answering business questions, talking to other departments about your findings, all of it, I find it fun, so I feel like if you find that fun, then all of the data science that you’re exposed to is going to be good for you.

Ginette: Wendy also followed up with us to include this advice: If you are a student that is interested in data science, don’t be scared to take programming or statistics courses. Sure, they are challenging, but putting these skills to use in the corporate world is very rewarding, and yes, they’re tons of fun!

If you’re a parent, please expose your child to computer programming related topics as early as you can. Most students in high school that have never been exposed to any type of programming think of computer science as something “too difficult to even try”.

And finally, if you’re an analyst, and you love what you do, you love learning and applying math to make data-driven decisions, and you thrive on challenges, then you’re on the right track to become a data scientist. Take the first step by stepping out of your comfort zone and tackling the most challenging projects at your company. Go to Meetups happening in your city where you can network with other data scientists and be exposed to the tools that they’re using. Play with and manipulate code written by other data scientists on Kaggle; this will make you feel more comfortable coding. Don’t be afraid of failure. The more you fail, the more you learn, and the faster you can move on to mastering the next thing.

Curtis: My best wrap up advice. The two-step guide to getting the job you want: (1) Solve their problems. Talk to people about what problems they are trying to solve. Ask them for some sample data. Ask them what technologies they use. Then, solve their problems with their sample data, using those technologies. Or do something with public data similar to what they are doing.

This shows initiative, and it’s the skills and subject matter they care about. And they know how it is to work with you. If someone did this in an interview, and they had the requisite skills, I’d hire them. Make recommendations to them that are good.

Here is a hint at what people are trying to solve: what action do they need to take, why is getting that action right important to them, and how can you help them take that action? Answer that question and you’re on track to solve 90% of data science problems. (The answer is usually not showing them a ton of analysis work, p.s.)

(2) Know their technology—make sure you have the skills they are asking for. You don’t have to be perfect at it. Look at the skills they are asking about, do some research, play around with the technologies, go through the basic learning curve. Everything can be learned.

Ginette: Thanks for listening! As a reminder, if you’d like to be one of our first customers to use Alexa to provide you with answers to your business questions, head over to datacrunchcorp.com/alexa or datacrunchpodcast.com/alexa (both work), and book some time to chat with us. We’ll assess if your company is a good fit, and if so, we look forward to working with you!

A big thank you to Tyler, Juan, and Wendy for their time and advice. And as always, head to our website for show notes and attributions.

Attributions

Music

“Loopster” Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 3.0 License
http://creativecommons.org/licenses/by/3.0/
