Content provided by Himakara Pieris. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Himakara Pieris or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.
Navigating AI Projects With Khrystyna Sosiak from TomTom


I’m excited to share this conversation with Khrystyna Sosiak. Khrystyna is a product manager at TomTom. Before that, she was a lead AI coach at Intel and a senior data scientist at mBank. During this conversation, Khrystyna shared her approach to navigating the complex landscape of AI projects, which includes investing in research, strategically placing bets, fostering stakeholder support, and embracing transparency.

Links

Khrystyna On LinkedIn

Transcript

[00:00:13] Himakara Pieris: I'm Himakara Pieris. You're listening to Smart Products, a show where we recognize, celebrate, and learn from industry leaders who are solving real-world problems using AI.

[00:00:25] Himakara Pieris: Khrystyna, welcome to Smart Products.

[00:00:27] Khrystyna Sosiak: Thank you. I'm super excited to be here. Thank you for having me.

[00:00:30] Himakara Pieris: To start things off, could you tell us a bit about your background, what kind of environments you've worked in, and also what kind of AI projects you've been part of?

[00:00:39] Khrystyna Sosiak: Yes. So, currently I'm a product manager at TomTom. I'm working on the external developer experience and on analytics and billing topics. In the past I worked on machine learning operations platforms, and before that I was a data scientist, so I was actually working with machine learning and artificial intelligence before I moved into product.

[00:01:05] Himakara Pieris: What would be a good example of an AI project that you worked on?

[00:01:10] Khrystyna Sosiak: Probably one of the most exciting and interesting products we worked on, and one that was very powerful, was understanding customers' behavior and their patterns.

[00:01:22] Khrystyna Sosiak: And then, based on that, recommending the right products. I was working in a bank, so we would analyze all the data we could find about our customers, of course in compliance with GDPR, making sure we only used the right data, and then making sure that all the communication going to customers was the right communication, about the right products, and in the right way.

[00:01:46] Khrystyna Sosiak: So really understanding the customer's needs and the stage of the customer's life, and saying: this is what the customer needs at this point, this is how we understand that, and this is how we can communicate and sell it to them. It's not only about making money; it's about understanding how we can actually go through this journey of life with the customer and support them.

[00:02:06] Khrystyna Sosiak: And we understand that through the data they're generating and the insights we can find in that data. The data generated by your transactions and your history is really specific data that shows a lot about a person, probably things some people don't even know about themselves.

[00:02:33] Khrystyna Sosiak: And the real goal is how we can use it for the benefit of the customer and not to harm the customer, right? We really changed the way we approached marketing communication with customers. It was very interesting and transformational to see how a very old-fashioned organization would really move in the direction of AI, making sure that all the decisions and marketing strategies are powered by AI.

[00:03:06] Khrystyna Sosiak: So yeah, that was very interesting. It took us a long time, and we made a lot of mistakes along the way, but it was a super interesting learning experience.

[00:03:17] Himakara Pieris: If I take a step back: we're talking about mBank, a consumer banking operation, and reaching out to customers at the right time is very important to becoming part of the customer's daily life, or their journey.

[00:03:32] Himakara Pieris: How was that done before, and at what point did the bank decide to explore AI as a possible solution, a possible tool to improve communications with customers?

[00:03:46] Khrystyna Sosiak: I think the turning point was understanding not only the trends, but the industry's goals, right? AI really powers the financial industry, and the financial industry in general

[00:04:00] Khrystyna Sosiak: has been very innovative in trying to adopt new technology and in trying to make sure customers get the best experience. Before, it was all triggered by events. You can imagine, and this approach is still used widely, when we talk about recommendation systems and how communication is done:

[00:04:20] Khrystyna Sosiak: you open the web page or the app, you scroll through some pages, say about a credit card, and then the next day you receive an email saying, hey, here's a discount. Or today someone would call and say, hey, we saw that you are interested in a credit card, do you want to order one?

[00:04:41] Khrystyna Sosiak: We have this discount for you. Usually it was triggered by one event, or by a sequence of events. But it's very event-driven: you can only base your recommendations on what the customer actually does on the web page. You don't really go into the details

[00:05:07] Khrystyna Sosiak: of what factors about the customer could affect that, and what they actually need. So yeah, it was something that was used for years, and it worked; there were some success rates there, so I cannot say it didn't work. But we know that, moving forward, customers' expectations are higher, because we live in the era of AI, when Netflix and Facebook build their recommendations on

[00:05:30] Khrystyna Sosiak: your reactions, what you see, what you like, what you don't like. We really need to be there as well. Just saying "you clicked on something, and that's why we think it could be interesting for you" is not good enough anymore.

[00:05:45] Himakara Pieris: Sounds like the previous approach for doing this was purely driven by specific events.

[00:05:51] Himakara Pieris: You have a rule-based system: if you click on this page, then you must be interested in this product, so let's unleash all the marketing communication to sell that product to you. Whereas now the idea is that we can possibly make this better by using AI, to make sure we are making more personalized and more relevant recommendations to the customer.

[00:06:10] Himakara Pieris: And by doing that, you improve the customer's experience, and you would also improve the click-throughs or sign-ups for the product that you're positioning for the customer. So it sounds like it started with an experimental approach.

[00:06:26] Himakara Pieris: Is that right? Where you're saying, okay, this is how we do things now, and we have all these new tools coming to the market, coming to the world; let's pick them up and see whether we can move the needle with these tools, compared against the method we're using now, which is our baseline.

[00:06:42] Himakara Pieris: Is that a fair assessment?

[00:06:44] Khrystyna Sosiak: It's a fair assessment. And to be honest, it's a fair assessment not only of this project and this experience, but of almost all of the experiences I've had with big companies, or even small companies, trying to get into AI.

[00:07:03] Khrystyna Sosiak: If it's not one of the companies that actually build it, right, but one that's trying to adopt it, it's really about: we have some data, we see the trends, we see that our competitors are using it, so how can we benefit from it? And I see very often, also talking to my colleagues and my friends, that there are a lot of companies that would hire a machine learning engineer or a data scientist and say: this is the data we have,

[00:07:26] Khrystyna Sosiak: we have no idea what we can do with it, try to figure something out. And I think sometimes there are some wrong expectations about what we can and cannot do. So yeah, it all started like that, right? We have the data, here's the set of business problems we have, and then let's iterate.

[00:07:46] Khrystyna Sosiak: Let's see what's going to work and what's not going to work. A lot of things fail before something starts working. And I think that's the learning experience: you cannot get there without making mistakes and learning along the way, because then your experience and your success are much more meaningful, because you actually understand what you've done, how you've done it, and why you made informed decisions about certain steps of the machine learning process.

[00:08:18] Khrystyna Sosiak: And that was very important, also for the data scientists and for the product managers, to better understand how this industry works, how building these products is different, and why they fail.

[00:08:33] Himakara Pieris: So I imagine you're in a conference room, and there are two whiteboards, one on either side. On one whiteboard you have a whole set of business priorities, and on the other you have a catalog of all the data services available to you.

[00:08:45] Himakara Pieris: And then in the middle you have a data scientist and a machine learning engineer with a toolkit, right? So you're running through a bunch of experiments, using the toolkit you have and the data you have, to see where you can impact the business priorities that you've identified.

[00:08:59] Himakara Pieris: Is that a good way to look at it?

[00:09:01] Khrystyna Sosiak: Yeah, it was definitely like that. It was also sometimes someone from the business coming and saying: this is the problem we have, we need support, and we need help to sort it out. Or it was: hey, here's data we never used,

[00:09:16] Khrystyna Sosiak: maybe there are undiscovered opportunities in this data that could actually bring value to the company, whether for selling more or for automation. So there's really a range of ways those initiatives can start, and they usually start from very different directions, right?

[00:09:37] Khrystyna Sosiak: But one common case is definitely that you have a set of business priorities you want to achieve, and then you ask yourself: that's where my company wants to go, and the asset we have is data. How can we help the company get where it wants to be?

[00:09:54] Khrystyna Sosiak: And the goal of the company could be very far from adopting AI, right? It could be growing revenue, or reaching a certain number of customers this year. And then you try to understand what this goal actually means, how you can operationalize it, and how you can use the assets you have in the team.

[00:10:16] Khrystyna Sosiak: And that's usually the people. You need to have the right people, which is very important, but you also need the right data. And with that combination, you can deliver.

[00:10:31] Himakara Pieris: So you're starting with a problem definition, or a series of problem definitions, right? What was the next step for you in this type of project? Is it building a prototype, or where do you go from there?

[00:10:43] Khrystyna Sosiak: So, once we have the business requirements and understand the questions. I think one of the reasons machine learning projects sometimes fail is that we ask the wrong set of questions, and then we become tied to those questions, right?

[00:11:02] Khrystyna Sosiak: So you set the expectations and the business goals, or the problem you want to solve. And the next step for us was always, first of all, really deep diving into the problem, and also understanding the customer behind that problem, or the process behind that problem.

[00:11:20] Khrystyna Sosiak: So it's not just saying, okay, our acquisition process or our customer support process doesn't work. You need to understand why. Where is the breaking point that is not working, that you actually want to optimize and improve? And once you really know that, and you're passionate about the problem you're trying to solve and really understand the difference it can make, then you deep dive into the data you have. And that is such a

[00:11:50] Khrystyna Sosiak: critical point. I used to teach machine learning classes in an academy, with a lot of people from different backgrounds trying to learn machine learning and AI, and I always told them: no matter how excited you are about all the cool algorithms and the machine learning models you can build, you always need to start with data, because the data is the key to a successful product.

[00:12:17] Khrystyna Sosiak: When we talk about AI and machine learning, we really need to make sure that this part is set, and most of the time we'll spend most of our time understanding, gathering, and preparing the right data. Because that's why machine learning models fail. That's why so many of the projects I was working on failed: we didn't have the right data, or it was bad quality, right?

[00:12:47] Khrystyna Sosiak: Or it was something else, but it was the data, right? Data that would never generate good results for the problems you set. You could have the right set of questions, but if the data is not there, there's not going to be an answer.

[00:13:06] Himakara Pieris: What would be a good approach to validating that your dataset is of good quality, that it's acceptable for the kind of problem you're looking at?

[00:13:16] Khrystyna Sosiak: There are a couple of criteria you can look at. The first one is definitely that you need to understand what problem you're trying to solve and what type of data you need, right?

[00:13:28] Khrystyna Sosiak: So always start with the questions: do you actually have any data available? Do you have fresh data available? Do you have good-quality data available? Are the systems generating the data reliable? Because all of those things really matter.

[00:13:45] Khrystyna Sosiak: It doesn't even start at the data; it starts with what system is generating the data. That's where things start, and that's also sometimes where bias in data starts. It's not because of the data; it's because of the system, or the person, generating the data.

[00:14:05] Khrystyna Sosiak: And I would say, look at the system and ask: can I trust this system, and the quality of the data it is generating? And yeah, fresh data is definitely super important. I saw some people say, okay, we have this ten-year-old dataset, let's build something with it and make predictions about tomorrow. It's not going to work, because the world is changing all the time.

[00:14:34] Khrystyna Sosiak: The behavior customers had one week ago, or one month ago, may never be the same again. I had a very interesting example of a project that failed during Covid. We had a very good model making predictions for churn, and we were about to launch it for a pilot, right?

[00:14:57] Khrystyna Sosiak: It was quite costly, so we wanted to be very strict about which customer base we would launch it on. We still decided to launch, and then Covid started. We said, okay, we're still going to see; we're just going to validate the model. And our model crashed on the new data, because customer behavior changed completely.

[00:15:19] Khrystyna Sosiak: And that's the thing: you need to have recent data. You need to validate: is my data representative of the reality I live in? Is the data I generated actually data that corresponds to what customers do and how they think at this particular moment?

[00:15:38] Khrystyna Sosiak: So that's an important one as well. There's also a lot about security, and whether the data actually should be used, because we live in a world where data is one of the most valuable assets, and a lot of people are trying to get it and use it in the wrong way. And we, as the product managers building products on top of the data, are responsible for the data we use and for the security and privacy of our customers. It's really important

[00:16:12] Khrystyna Sosiak: not only to think about business metrics such as new customers and revenue, but to think about the security and the privacy of our customers first. And if there's a risk, if it feels like something may go wrong, then I would say stop it before you start it. Because with the reputational damage, or the fines, all the very real financial consequences,

[00:16:43] Khrystyna Sosiak: you could actually harm your company more by doing things like that than benefit it. And I think that's also something to remember: can I use this data? Is this the right data to use? Do I have, for example, the correctly collected consent of the customers to use this data?

[00:17:02] Khrystyna Sosiak: Is my data anonymized in the right way? There are really a lot of things; we could talk about data for a long time, but it's the key.
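The criteria Khrystyna lists (availability, freshness, quality, and a trustworthy generating system) can be turned into a simple pre-flight check before any modeling starts. A minimal sketch of that idea; the field names and thresholds are illustrative assumptions, not anything from the episode:

```python
from datetime import datetime, timedelta

def data_readiness_report(records, now, max_age_days=30, max_null_rate=0.05):
    """Screen a dataset against basic readiness criteria:
    availability, freshness, and completeness (a rough quality proxy).
    Each record is a dict with a 'ts' datetime plus feature fields."""
    report = {"available": len(records) > 0}
    if not report["available"]:
        return report

    # Freshness: the newest record must be recent enough to reflect
    # current customer behavior (the Covid churn-model lesson).
    newest = max(r["ts"] for r in records)
    report["fresh"] = (now - newest) <= timedelta(days=max_age_days)

    # Completeness: fraction of missing feature values across all records.
    fields = [k for k in records[0] if k != "ts"]
    missing = [(r.get(f) is None) for r in records for f in fields]
    report["complete"] = (sum(missing) / len(missing)) <= max_null_rate

    report["ready"] = all((report["available"], report["fresh"], report["complete"]))
    return report
```

A real check would also cover consent and anonymization, which are policy questions rather than something a snippet can verify.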

[00:17:11] Himakara Pieris: So what is the next step from there?

[00:17:15] Khrystyna Sosiak: When we talk today about why machine learning products and projects fail, one reason is unrealistic expectations and a lack of clear communication: aligning the expectations between what the business or product side expects and what can technically be delivered. And I think the next step, once we are done with the data, once the data is prepared, we have the feature engineering, and the data is clean, is the experimentation phase.

[00:17:47] Khrystyna Sosiak: And you know, sometimes it can take one week to build the model, and we say, wow, it works, we can go to the next phase. And sometimes it can take months

[00:18:09] Khrystyna Sosiak: with no result. And I think it's important to have transparency with the stakeholders, saying: hey, this is not a software project. I cannot tell you, okay, this is going to take two sprints and that is going to take one sprint and we're going to be done, because it's a very different paradigm of building something and thinking about something.

[00:18:29] Khrystyna Sosiak: Not bringing your stakeholders, your managers, the sponsors of the project into alignment, and not having them understand and be on board before you start, can really cause disagreement, but it can also simply make what you're doing fail because there's no support anymore. And I think that's one of the important things.

[00:18:49] Khrystyna Sosiak: When you do the experimentation, and it can take a lot of time, make sure you have a set of boundaries: okay, this is what we aligned with the stakeholders, this is the metric we are going to optimize for, and this is when we are going to stop. So I always say, when you think about experimentation and building the model, there need to be two things.

[00:19:07] Khrystyna Sosiak: The first one: what is the metric, and what is the metric value you are optimizing for? What is your north star where you say it's good enough, we can move on, we can try to validate it on real data, we can try to see whether we can put it in production? That's one. But there's another one, and this one is actually much more important.

[00:19:35] Khrystyna Sosiak: It's honestly sitting with your stakeholders and saying: how much time do we have to do the experimentation, such that after this time, we're not going to try anymore? We say we're not going to do it anymore because we don't see any progress. And I think having this boundary set at the beginning, before you become invested in the project, is so important. Because, and this is basic psychology,

[00:20:06] Khrystyna Sosiak: the more you invest in a project, the more difficult it is for you to say it's over, even though everyone knows it's over and nothing is going to come out of it. And that's how a lot of projects fail too, and they fail with these bad feelings that someone is killing something that is so close to your heart.

[00:20:28] Khrystyna Sosiak: But when you have this set of expectations, and when you're very clear that we have this goal, and that if we are not achieving this goal, or not close to it, within this particular timeframe, we are just going to kill it, then it's good, because you know it. You work hard to make it work, but if it doesn't work,

[00:20:50] Khrystyna Sosiak: that's something you agreed on at the beginning.
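The two boundaries described here, a target metric value and an agreed time budget, amount to a simple stopping rule. A sketch of what that agreement could look like in code; the function name and numbers are illustrative, not from the conversation:

```python
def experiment_decision(results, target_metric, deadline_weeks):
    """Decide whether to ship, keep iterating, or kill an ML experiment.
    `results` is a list of (week_number, metric_value) pairs, one per iteration."""
    if not results:
        return "continue"  # nothing tried yet; the time budget is untouched

    best_metric = max(metric for _, metric in results)
    latest_week = max(week for week, _ in results)

    if best_metric >= target_metric:
        # North-star value reached: move on to validation on real data.
        return "ship"
    if latest_week >= deadline_weeks:
        # The time budget agreed with stakeholders is spent: stop, as agreed.
        return "kill"
    return "continue"
```

The point of writing it down before the work starts is exactly the sunk-cost trap she mentions: the rule is agreed while no one is yet invested.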

[00:20:54] Himakara Pieris: You touched on this earlier. It sounds like part of that conversation is having a good way to validate the results, or the impact, of the model, and to compare that with some real-world, as-of-right-now results, so you have a very clear comparison point.

[00:21:10] Khrystyna Sosiak: Absolutely. That's the one thing I always said: you can build the best model in the world, but there's no impact in building the model if it's not going to land in production and actually be used.

[00:21:24] Khrystyna Sosiak: And it's so important to make sure that what we build lands in production, and stakeholders are the key to making sure it gets there and is used. So it matters to have expectations about the time bound, about when we call it off and say we're not going to continue.

[00:21:45] Khrystyna Sosiak: But it also matters to have realistic expectations with the stakeholders about how the machine learning works and the mistakes it can make: the error rate, and the cost of an error. We failed one project where the model was really good. It was impressive, scientifically very interesting; we literally spent months reading research papers, trying to understand how to solve one problem, and we built a model that was really good.

[00:22:17] Khrystyna Sosiak: The results were very close to results that someone had written a PhD on, and it was really good. But because, at the beginning, we hadn't really

[00:22:40] Khrystyna Sosiak: talked about and calculated what the cost of an error was for us, and whether we as the company were ready to take that risk and that cost. Saying: we know the value the model can bring, but this is also the risk we would need to take. Are we happy with that or not? And I think it's important to have this conversation with the sponsors, with the senior leadership who sponsor your project

[00:23:01] Khrystyna Sosiak: and who are also the decision makers on whether, in the end, the most important step is going to happen and the model is going to land in production or not. And sometimes, as the data scientist or the product manager working with data science, we are so invested, and we are so busy trying to sell our idea and what we do,

[00:23:22] Khrystyna Sosiak: that we are so focused on the benefits that we don't explicitly talk about the risks. And I think it's very important to talk about both.
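The "cost of an error" conversation can be made concrete with a back-of-the-envelope expected-value calculation. A hedged sketch, using made-up numbers rather than anything from the project she describes; it assumes a binary classifier where wrong positive calls carry the dominant cost:

```python
def expected_net_value(n_decisions, precision, recall, base_rate,
                       value_per_true_positive, cost_per_false_positive):
    """Rough expected net value of acting on a classifier's positive calls.
    base_rate is the fraction of cases that are actual positives."""
    positives = n_decisions * base_rate
    true_positives = positives * recall
    # From precision = TP / (TP + FP), it follows FP = TP * (1 - precision) / precision.
    false_positives = true_positives * (1 - precision) / precision
    return (true_positives * value_per_true_positive
            - false_positives * cost_per_false_positive)
```

The useful output of this exercise is the conversation itself: a scientifically impressive model can still be net-negative once the per-error cost is on the table.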

[00:23:34] Himakara Pieris: You talked about time-boxing, having a clear understanding of how much time we'd love to spend on this problem. What would be a good way to estimate what's acceptable or reasonable? Because if you say, okay, you have two weeks to solve this problem, and if it's not done in two weeks you're going to kill it, that doesn't sound quite reasonable. Maybe it is for some problems.

[00:24:00] Himakara Pieris: What is your way of figuring out what is the right amount of time, the right number of cycles to burn through, for any given problem?

[00:24:09] Khrystyna Sosiak: So there are a couple of things. I don't think there's one formula you can apply to get the right estimation, and estimation in general with AI is very difficult, right?

[00:24:21] Khrystyna Sosiak: Because it's hard to estimate when it's going to work. So there are a couple of things I would look at and use as a frame of reference. The first one: I always start with the business, because I'm the product manager. So I'll start with the business question and say: for how long can we afford this?

[00:24:42] Khrystyna Sosiak: For how long can we actually afford to try to solve this problem in this way? Because when you say yes to something, you say no to something else. And if you say yes to one opportunity and one project, it means those resources are not going to be used for something else. And that's a question of the budget, and of the risk we are ready to take, and of how long we

[00:25:09] Khrystyna Sosiak: can sustain that, how long it's okay for us to take the risk that in the end it's not going to work out. And I think that's a clear conversation where we also need to calculate the cost. I really like having the numbers; it's much easier, because you can calculate the cost of your people and the cost of the processing power you need, and you can say: this is the cost of one week of doing it,

[00:25:37] Khrystyna Sosiak: and this is the cost of one month of doing it. Let's not go with the happy scenario; let's go with the worst scenario and say we know it's going to fail. For how long, and for how much money, are we ready to take that risk? I think that's the first thing we would do, right?

[00:25:57] Khrystyna Sosiak: It's about understanding what risk we are happy to take, knowing that the reward we can get is much higher, right? Because if we say, okay, the return on investment for the year is going to be this percentage, but if we keep going for one month, or for six months,

[00:26:19] Khrystyna Sosiak: it's going to take us five years to return the investment, then I would say we probably shouldn't be doing it in the first place. That's the first one. The second is experience. Talk to the data scientists and the engineers, look at the problem you have and its complexity, and ask:

[00:26:43] Khrystyna Sosiak: how much time is a reasonable amount of time to invest to see the first results? And I would never say, hey, let's do one iteration and then decide, because I think that's not enough; we need to try different things. And then

[00:27:10] Khrystyna Sosiak: I would optimize for how we can reduce the time of trying new things, of trying new algorithms, of adding new data, so that it's not going to take us weeks but days, just to see whether we can find something that actually works, and then optimize and iterate on something that actually starts working. But yeah, the estimation is difficult most of the time. You look at the resources you have and the technical complexity of the problem, because sometimes it's such a complex problem.

[00:27:40] Khrystyna Sosiak: If someone asked you to build ChatGPT in one week, well, probably there are some people who could do that, but if you look at a normal data science team in some company, they would not do it in one week, right? Realistically, there's no way. So you need to say what is the time you're comfortable with for delivering the first results,

[00:28:00] Khrystyna Sosiak: and then go from there, and understand whether there is actually positive change across the next iterations, or whether it all stays the same. Because if you try ten times and it all fails, then probably that's the time for us to stop.
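Khrystyna's affordability test (the weekly cost of the team and compute, versus the return you expect and how long you are willing to keep betting) can be sketched as a payback calculation. The figures and the one-year threshold below are purely illustrative assumptions:

```python
def payback_years(weekly_team_cost, weekly_compute_cost,
                  experiment_weeks, expected_annual_return):
    """Years the expected annual return needs to repay the experiment's cost."""
    total_cost = (weekly_team_cost + weekly_compute_cost) * experiment_weeks
    return total_cost / expected_annual_return

def should_attempt(weekly_team_cost, weekly_compute_cost,
                   experiment_weeks, expected_annual_return,
                   max_payback_years=1.0):
    """Crude go/no-go: walk away if, even within the agreed time budget,
    the bet would take too long to pay back."""
    return payback_years(weekly_team_cost, weekly_compute_cost,
                         experiment_weeks, expected_annual_return) <= max_payback_years
```

This mirrors her point that a six-month effort with a five-year payback probably shouldn't be started at all.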

[00:28:18] Himakara Pieris: This sounds like you're placing a series of bets: on the probability of success, on the ROI, on the technical feasibility, on the team, and also on the kind of adoption you could get, right? So you have to decide, on a case-by-case basis,

[00:28:31] Himakara Pieris: how much you are willing to bet that you can deliver X times return on investment, how much you want to bet that this is technically feasible, et cetera. So are there any other things you would put into that mix of considerations, other than ROI, technical feasibility, your team's capabilities, and adoption?

[00:28:54] Khrystyna Sosiak: , let me think about it. So I. I would, again, I think that [00:29:00] it's also important to, always do the market research and understanding what is on the market. And also there's so many use cases explained, right? And I think just getting this information and understanding, so what is the reasonable amount of time to spend on something like that, right?

[00:29:17] Khrystyna Sosiak: I think that's very, that's also key. To understand and also set the right set of expectations and the time bound. So yeah, you, and also like, okay, when we, when we look usually like when you in commercial, not in the research, right? And when you have one problem, so for example, you build recommendation model for one product, right?

[00:29:40] Khrystyna Sosiak: I dunno, for dresses, then it's very easy to replicate it and build it for, you know, shoes. And if you have a similar set of problems you're trying to solve, for example, with different data or for different segments, then it becomes much easier, because with experience you understand, okay, that's probably the [00:30:00] amount of time that we would need to validate, and then we would need to build.

[00:30:04] Khrystyna Sosiak: So it's always the question whether I'm doing something I have already done before, something more or less similar. It's never going to be the same, but the problem statement and the problem space and the data space are something that we know. Do we know it? Then it's easier to have the set of expectations.

[00:30:26] Khrystyna Sosiak: But then we think about something completely new that we never touched, right? Like if you take someone who for their whole career used to build recommendation systems and, I don't know, text analysis, and you tell them now, you know, to do generative AI of, uh, videos of some popular singers, that is a very different problem space, right?

[00:30:49] Khrystyna Sosiak: And then it's very difficult to estimate. So then I would also give a bit higher buffer, right? And say, okay, we will need more time than usual, [00:31:00] right? If usually for a simple problem space we'll say, okay, it's two weeks, and if in two weeks it doesn't work, then we move on, then for something very complex and new we will say, okay, it's gonna be one month, just because we don't know it yet.

[00:31:16] Khrystyna Sosiak: We need to discover, we need to learn. We need to try.

[00:31:20] Himakara Pieris: Say you have a functional model that's performing to an acceptable level. Could you take us through the process of productionizing it, and what kind of pitfalls you would look out for? What would you flag as high-risk factors that could cause a project to fail?

[00:31:37] Khrystyna Sosiak: The first one is, like, not being able to actually put it in production. That's, I think, a bigger problem for a lot of companies than we think. But when we say, okay, we are ready for that, I think the one that is

[00:31:53] Khrystyna Sosiak: very common, maybe for smaller companies but also for the big ones: not all of them have the [00:32:00] right strategy, like a strategy for MLOps, for how you would actually deploy the machine learning model, right, and have the right infrastructure to maintain it. And I think it's very important to understand that deploying a machine learning model, with the processes and the monitoring that are required, is a bit different from the normal deployment of, you know, some API or application.

[00:32:27] Khrystyna Sosiak: And we need to be able to do that, and we need to have qualified people that know how to do it. And for me, I think the biggest problem in doing it, in my previous experience, in the teams that I had, was not having the right people in place. You know, because usually when you hire, and also, I started quite some time ago, not that long ago, but like six years ago, and

[00:32:56] Khrystyna Sosiak: at that time, you would just hire data [00:33:00] scientists. There was no profile like MLOps. There was no profile of, you know, machine learning engineer. You would just hire someone who works with data. And when you have a set of, like, five people that know how to do the data preparation and build the machine learning model itself,

[00:33:18] Khrystyna Sosiak: it's not the same skillset as is needed to deploy the model and make it work in production. And I think that's the biggest problem: not having the right people, and also having the expectation that, you know, oh yeah, someone will come and do it. Because usually it's difficult to find those people, and we need to make sure that we have them in place and they know what to do, right?

[00:33:44] Khrystyna Sosiak: Because when we just approach it like that, you know, we don't have, for example, the right monitoring in place. And I'll say that's the most important for me, and that's why some projects, a lot of them, fail: we would put something in production and then, you know, it works, and [00:34:00] actually one month, six months, one year passes, and we are still operating under the assumption of what we validated one year ago, that it worked.

[00:34:12] Khrystyna Sosiak: But we don't know if it works now. And if you have a webpage and you build it once, it's gonna work, you know, until you break something in the code. But with machine learning it's different, because what works today doesn't necessarily work tomorrow or in one year. And we need to have all the monitoring in place to make sure that your machine learning model is still helping your business and not harming your business.

[00:34:41] Khrystyna Sosiak: And, uh, I think that's one of the very important aspects of having running machine learning models and AI in production: having the right monitoring and alerting in place, and also knowing what actions I'm gonna take once I [00:35:00] receive that alert. You know, like what it means.

[00:35:05] Khrystyna Sosiak: Not only, okay, I need to retrain this model. But if it's gonna stop working, if we are gonna turn it off, what is the cost of that for one minute, one second, one hour, right? Or if we continue, if we keep it working while I'm fixing something, what is the cost of that? And I think those things are very important, because we can say we're gonna monitor, you know, we have a set of metrics we're gonna monitor, and we're gonna receive an alert.

[00:35:32] Khrystyna Sosiak: And then you receive alerts in the middle of the night. And then what? What is the next step? And I think having this plan and strategy in place, not only for the exciting part of building the model, but for the part when customers actually interact with it and something goes wrong and the world is changing and the data is changing and things are breaking.

[00:35:54] Khrystyna Sosiak: It's very important.

[00:35:57] Himakara Pieris: I know you have [00:36:00] seven key questions, or seven areas, that you look at as a way to mitigate the risk of failure in AI projects. Could you talk us through those seven key items?

[00:36:11] Khrystyna Sosiak: There are a lot of projects that we have done that failed, and that's why we learned from that. And there are some things that you can watch out for, to not make those mistakes and prevent your product from failing. So the first one would be: you ask the wrong questions. You don't understand the problem, or you don't understand the customer, or you don't understand the data, and then the question and the metric you're optimizing for and the problem you're trying to solve are actually not right. And then no matter what you do, it's not going to work, because the question is wrong.

[00:36:46] Khrystyna Sosiak: So you need to invest in that; you need to invest time in understanding that. Um, the next one is not very technical, right? You'd say, okay, machine learning products fail because [00:37:00] there's something wrong with the technology or the data? No, actually most of them fail because there is no support from stakeholders.

[00:37:08] Khrystyna Sosiak: There's no understanding, there's no, uh, sponsorship for the things that we do. There's no willingness to change the approach that was used for years. And I think bringing all of them on board and making sure that you get buy-in from the stakeholders that you need to work with is key. And I'm not only talking about the sponsors that are gonna give you money to do that, right,

[00:37:33] Khrystyna Sosiak: or, like, the green light, but also about the people that, at the end of the day, will need to use the machine learning model. Are they willing to do that? Right. Because if you have a department of 100 people and you come to them and you tell them, you know, we want you to use that, and they say no,

[00:37:49] Khrystyna Sosiak: there's not a lot you can do. Um, yeah, then the data: data is the key. So having the data quality in the right place, checking the [00:38:00] data quality, having the right dataset in place is very important, and if you don't have it, it's a very big problem that can cause the failure. Then the data science team: having the right people in place, and the right set of people. When you have a team where every single person knows only one thing, and it's all the same thing, it's not gonna work, because of the way the product works; there are different stages, and you need different sets of skills. Um, and going for something super complex that you don't necessarily understand, having super complex models: very often you'll fail and you'll not even know why you failed, because it was so complex that you don't even know what you optimized for. That's another one.

[00:38:50] Khrystyna Sosiak: Right? And also overpromising, overselling, setting the wrong expectations. That's another one, because there's [00:39:00] always the risk of failing, and you need to go and talk about that and make sure that stakeholders know about it. There's always the return on investment, and there's always the risk.

[00:39:12] Khrystyna Sosiak: And just making sure that you and the people you work with are aligned on that and fine with it. That's, uh, those are the things I would look for.

[00:39:22] Himakara Pieris: Thank you for coming on the podcast and sharing insights today. Khrystyna, is there anything else that you'd like to share with the audience?

[00:39:29] Khrystyna Sosiak: I think just making sure that what you do in life brings some impact, whether to your customers or to your business or to the world, and making sure that we use the power of technology in the right way.

[00:39:43] Khrystyna Sosiak: I think that's, that's very important and there's so much power that we have right now and opportunities, so yeah.


[00:01:05] Himakara Pieris: What would be a good example of an AI project that you worked on?

[00:01:10] Khrystyna Sosiak: Probably one of the most exciting and interesting products that we've been working on, one that was very powerful, is understanding the customers' behavior and the patterns.

[00:01:22] Khrystyna Sosiak: And then, based on that, recommending the right products. So I was working in banks, so we would analyze all the data that we can find about our customers, right, of course in line with GDPR and making sure that we only use the right data, and then making sure that all the communication that goes to the customers is the right communication, about the right products, and in the right way.

[00:01:46] Khrystyna Sosiak: So really understanding the customer needs and, uh, the stage of the customer's life, and saying: that's what the customer needs at this point, and that's how we [00:02:00] understand that and how we can communicate and sell it to the customers. So it's not only about making money; it's understanding how we can actually

[00:02:06] Khrystyna Sosiak: go through this journey of life with the customer and support them, understanding that through the data that they're generating and the insights that we can find in this data. And sometimes, you know, the data generated by your transactions and by your history is really specific data that shows a lot about a person, probably things some people don't even know about themselves.

[00:02:33] Khrystyna Sosiak: And the real goal is how we can use it for the benefit of the customer and not to harm the customer, right? And, um, we really changed the way that we approached the marketing communication with the customers. It was very interesting and transformational to see how a very old-fashioned organization would really move in the direction of [00:03:00] AI, making sure that all the decisions and the marketing strategies are powered by AI.

[00:03:06] Khrystyna Sosiak: So yeah, that was very interesting. It took us a long time. We made a lot of mistakes on the way, but it was a super interesting learning experience.

[00:03:17] Himakara Pieris: If I take a step back, we're talking about mBank, a consumer banking operation, and reaching out to the customers at the right time is something very important to becoming part of the customer's daily life, or their journey.

[00:03:32] Himakara Pieris: How was that done before, and at what point did the bank decide to explore AI as a possible solution, a possible tool to improve the communications with the customers?

[00:03:46] Khrystyna Sosiak: I think the turning point was understanding where, you know, not only the trends but the industry goes, right? And really AI powers the financial industry, and the financial industry [00:04:00] in general.

[00:04:00] Khrystyna Sosiak: It's been very innovative in trying to adopt new technology and trying to make sure that the customers get the best experience. Before, it was all triggered by events. So you can imagine, I mean, it's still used widely, right, when we talk about recommendation systems and, like, how the communication is done.

[00:04:20] Khrystyna Sosiak: You open the webpage, you open the app, and you scroll through some pages, you know, about the credit card, for example. And then the next day you would receive an email saying, hey, here's a discount. Or during the day someone would call and say, hey, we saw that you are interested in a credit card. Do you want to order the credit card?

[00:04:41] Khrystyna Sosiak: We have this discount for you. And usually it was triggered by one event, right? Or a sequence of events. But it's very event-driven, right? You can only base your recommendations on what the customer actually does on the webpage. You don't really go into the details [00:05:00] of, like, okay, what are the factors about the customers that can affect that, and what are the things that they actually need?

[00:05:07] Khrystyna Sosiak: It's, um, so yeah, it was something that was used for years and, uh, it worked. You know, there were some success rates there, so I cannot say it didn't work. But we know that, moving forward, the expectations of the customers are higher, because we live in the era of AI, when you have, you know, Netflix and Facebook with recommendations tailored to your,

[00:05:30] Khrystyna Sosiak: you know, reactions, like what you see, what you like, what you don't like. We really need to be there as well. And just saying you clicked on something and that's why we think it could be interesting for you is not good enough anymore.

[00:05:45] Himakara Pieris: Sounds like the previous approach for doing this was purely driven by specific events.

[00:05:51] Himakara Pieris: You have a rule-based system: if you click on this page, then you must be interested in this product; let's unleash all the marketing communication to sell that product [00:06:00] to you. Whereas now the idea is we can possibly make this better by using AI, to make sure that we are making more personalized and more relevant recommendations to the customer.

[00:06:10] Himakara Pieris: And by doing that, you improve the customer's experience, and you would also improve the clickthroughs or signups for the product that you're positioning for the customer. So when you start there, it sounds like it started more with an experimental approach.

[00:06:26] Himakara Pieris: Is that right? Where you're saying: okay, we have the way we are doing things now, and we have all these new tools that are coming to the market, coming to the world. Let's pick them up and see whether we can move the needle with these tools, rather than with the method that we are using now, which is our baseline.

[00:06:42] Himakara Pieris: Is that a fair assessment?

[00:06:44] Khrystyna Sosiak: It's a fair assessment. And to be honest, it's a fair assessment not only about this project and not only about this experience, but about almost all of the experiences that I had with big companies, or even small companies, trying to get into AI, [00:07:00] you know, if they're not, like, the companies that actually build it, right?

[00:07:03] Khrystyna Sosiak: They're trying to adopt it. It's really about: we have some data, we see the trends, we see that our competitors are using it, so how can we benefit from it? And I can see very often, also talking to my colleagues and to my friends, that there are a lot of companies that would hire, uh, a machine learning engineer or a data scientist and say: that's the data we have,

[00:07:26] Khrystyna Sosiak: we have no idea what we can do with it; you know, try to figure something out. And I think sometimes there are some wrong expectations about, right, what we can do and what we cannot do. So yeah, it all started like that, right? We have the data, here's the set of business problems that we have, and then let's iterate.

[00:07:46] Khrystyna Sosiak: Let's see what's gonna work and what's not gonna work. And a lot of things fail before something starts working, right? And I think that's a learning experience: you cannot get there [00:08:00] unless you make mistakes and learn on the way, because then your experience and your success is much more meaningful, because you actually understand what you've done and how you've done it, and why you made those informed decisions about the steps of the machine learning process.

[00:08:18] Khrystyna Sosiak: And that was very important also for the data scientists and for the product managers, to understand better how this industry works, how building these products is different, and why they fail.

[00:08:33] Himakara Pieris: So I imagine you're in a conference room and there are two whiteboards, one on either side. On one whiteboard you have a whole set of business priorities, and on the other side you have a catalog of all the data sources that are available to you.

[00:08:45] Himakara Pieris: And then in the middle you have a data scientist and a machine learning engineer with a toolkit, right? So you're running through a bunch of experiments, using the toolkit you have and the data you have, to see where you can impact the business priorities that you've identified.

[00:08:59] Himakara Pieris: Is that a good [00:09:00] way to look at it?

[00:09:01] Khrystyna Sosiak: Yeah, it was definitely like that. It was also, like, someone from the business coming and saying: that's the problem we have, we need support, we need to sort it out, and we need the help. Uh, right.

[00:09:16] Khrystyna Sosiak: It was also like, hey, that's data we never used; maybe there are some undiscovered opportunities in this data that can actually bring value to the, uh, company, whether it's for selling more or for automation, right? So there's really a range of how those initiatives can start, and they usually start from very different directions, right?

[00:09:37] Khrystyna Sosiak: But I think one is definitely: you have a set of business priorities that you want to achieve, and then you ask yourself, right, that's where my company wants to go. And that's us, and what we have as an asset is data. How can we help the company get where they want to be?

[00:09:54] Khrystyna Sosiak: You know? And the goal of the company could be very far from adopting AI, [00:10:00] right? It could be, you know, growing the revenue, or getting to an N number of customers this year. And then you try to understand what this goal actually means, how you can operationalize it, and how you can use the assets that you have in the team.

[00:10:16] Khrystyna Sosiak: And that's usually the people. You need to have the right people, and it's very important to have the right set of people, but also to have the right data, right? And with this combination, you can deliver.

[00:10:31] Himakara Pieris: So you're starting with a problem definition, or a series of problem definitions, right? What was the next step for you in this type of project? Is it building a prototype, or where do you go

[00:10:41] Himakara Pieris: from there?

[00:10:43] Khrystyna Sosiak: So once we have the business requirements, right, and understand the questions. And I think one of the problems, why sometimes machine learning projects fail, is because we ask the wrong set of questions, and then we stay tied to those questions, right?

[00:11:02] Khrystyna Sosiak: And, uh, yeah, so you would set the expectations and the business goals, or the problem that you want to solve. And the next step for us always was, first of all, really understanding and deep diving into the problem, and also understanding the customer behind that problem, or the process behind that problem.

[00:11:20] Khrystyna Sosiak: So it's not all like saying, okay, our acquisition process or our, I dunno, customer support process doesn't work. You need to understand why. Where is the breaking point that is not working, that you actually want to optimize for and improve? And once you really know that, and you're passionate about the problem you're trying to solve and really understand the difference it can make, then you deep dive into the data you have. And that is such a

[00:11:50] Khrystyna Sosiak: critical point. Like, uh, I used to teach machine learning classes, uh, in one academy, with a lot of different [00:12:00] people from different backgrounds trying to learn machine learning and AI, and I always said there: no matter how excited you are about all these cool algorithms, you know, and the machine learning models you can build, you always need to start with data, because the data is the key to a successful product.

[00:12:17] Khrystyna Sosiak: When we talk about AI and machine learning, we really need to make sure that this part is set, and most of the time we'll spend most of our time in understanding, gathering, and preparing the right data. Because, you know, that's why machine learning models fail. That's why so many of the projects I was working on failed: because we didn't have the right data, or it was bad quality, right?

[00:12:47] Khrystyna Sosiak: Or it was something else, but it was the data, right? Data that would never generate good results for the problems that you set. You could have the right set of questions, [00:13:00] but if the data is not there, there's not gonna be an answer.

[00:13:06] Himakara Pieris: What would be a good approach to validating that your dataset is of good quality and acceptable for the kind of problem that you're looking at?

[00:13:16] Khrystyna Sosiak: There are a couple of criteria that you can look at. Definitely the first one is: you need to understand what is the problem you're trying to solve and what type of data you need, right?

[00:13:28] Khrystyna Sosiak: So always start with the question: do you actually have any data available? And then, do you have fresh data available? Do you have good-quality data available? Are the systems that are generating the data reliable? Because all of those things really matter.

[00:13:45] Khrystyna Sosiak: It doesn't even start at the data; it starts with the system that is generating the data. And that's where things start. And that's also where, sometimes, the bias in data starts: it's not because of [00:14:00] the data, it's because of the system or the person that is generating the data.

[00:14:05] Khrystyna Sosiak: And I would say: look at the system and ask, can I trust this system and the quality of the data it is generating? And, um, yeah, definitely fresh data is super important. You know, I saw some people that would say, okay, we have this 10-year-old dataset; let's build something that's gonna do a prediction about, you know, tomorrow. It's not gonna work, because the world is changing all the time.

[00:14:34] Khrystyna Sosiak: You know, the behavior that customers had one week ago, one month ago, could never be the same anymore. I had a very interesting example of a project that failed during Covid. We had a very good model doing churn predictions, and we were about to launch it for, like, some pilot, right?

[00:14:57] Khrystyna Sosiak: It was quite costly, [00:15:00] so we wanted to be very, uh, strict on what customer base we would launch it with. We still decided to launch it, and then Covid started, and, you know, we said, okay, we're still gonna see; we're just gonna validate the model. And our model crashed on the new data, because the customer behavior changed completely.

[00:15:19] Khrystyna Sosiak: And that's the thing: you need to have recent data. You need to validate: is my data representative of the reality that I live in? You know, is the data that I generated actually data that corresponds to what customers do and how they think at this particular moment?

[00:15:38] Khrystyna Sosiak: So that's an important one as well, right? There's also a lot about security, and whether the data actually should be used. Because we live in a world where, you know, data is one of the very valuable assets that a lot of people are trying to get and trying to use, and not in the right way. And we, as the product managers [00:16:00] building products on top of the data, are responsible for the data that we use and for the, you know, security and privacy of our customers. It's really important

[00:16:12] Khrystyna Sosiak: not only to think about the business metrics, such as, you know, new customers and revenue, but to think about the quality, the security, and the privacy of our customers first. And if there's a risk, if it feels like something may go wrong, then I would say stop it before you start it. Because, um, the reputational loss or, you know, the fines, even all the very financial things:

[00:16:43] Khrystyna Sosiak: you could actually harm your company more by doing something like that than benefit it. And I think that's also something to remember: can I use this data? Is that the right data to use? Right? Do I have, for example, the consent, the rightly collected consent of the [00:17:00] customers, so that I can use this data, right?

[00:17:02] Khrystyna Sosiak: Is my data anonymized in the right way? Really, a lot of things. I think we could talk about data for a long time, but it's, uh, it's the key.
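The checks listed in this part of the conversation, availability, freshness, and quality from a trustworthy generating system, can be turned into concrete gates that run before any modelling starts. The helper below is a hypothetical sketch, not the bank's actual pipeline: the column names and the thresholds (`max_age_days`, `max_null_frac`) are illustrative assumptions, and the consent and anonymization questions she raises cannot be verified by code alone.

```python
from datetime import datetime, timedelta

import pandas as pd

def validate_dataset(df, required_cols, timestamp_col,
                     max_age_days=30, max_null_frac=0.05):
    """Run basic availability, freshness, and quality gates on a dataset.

    Returns a list of human-readable problems; an empty list means the
    dataset passed these (deliberately minimal) checks.
    """
    problems = []
    # 1. Availability: do we actually have the columns we need?
    missing = [c for c in required_cols if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    if df.empty:
        problems.append("dataset is empty")
        return problems
    # 2. Freshness: does the data still represent today's behaviour?
    newest = pd.to_datetime(df[timestamp_col]).max()
    if datetime.now() - newest.to_pydatetime() > timedelta(days=max_age_days):
        problems.append(f"data is stale: newest record is {newest}")
    # 3. Quality: null rate per required column.
    for c in required_cols:
        if c in df.columns:
            frac = df[c].isna().mean()
            if frac > max_null_frac:
                problems.append(f"{c}: {frac:.0%} nulls exceeds {max_null_frac:.0%}")
    return problems
```

Failing any gate would be the cue she describes: stop and fix the data (or the system generating it) before investing in the model.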

[00:17:11] Himakara Pieris: So what is the next step from there?

[00:17:15] Khrystyna Sosiak: When we talk today about why machine learning products and projects fail, one of the reasons is that there are unrealistic expectations and there's no clear communication, no aligning of expectations between what the business or product expects and what technically can be delivered. And I think the next step, once we are done with the data, once the data is prepared, the feature engineering is done, and the data is clean, is the experimentation phase.

[00:17:47] Khrystyna Sosiak: And you know, sometimes it can take one week to build the model, and we say, wow, it works, you know, we can go to the next phase. And sometimes it can take months, [00:18:00] with no result. And I think having this transparency with the stakeholders, saying, hey, that's not a software project, you know, I cannot tell you,

[00:18:09] Khrystyna Sosiak: like, okay, that's gonna take two sprints and that's gonna take one sprint and we are gonna be done, because it's a very different paradigm of building something and thinking about something. And I think not bringing your stakeholders, your managers, the people that are the sponsors of the project, along with that,

[00:18:29] Khrystyna Sosiak: and not having understanding and buy-in before you start, could really cause, you know, disagreement, but also just the failure of what you're doing, simply because there's no support anymore. And I think that's one of the important things. When you do the experimentation, and it can take a lot of time, right?

[00:18:49] Khrystyna Sosiak: Make sure that you have a set of boundaries saying: okay, that's what we aligned with the stakeholders, that's the metric that we are gonna optimize for, and that's when we are gonna [00:19:00] stop. So I always say, when you think about experimentation and building the model, there need to be two things.

[00:19:07] Khrystyna Sosiak: The first one: what is the metric value that you are optimizing for, and what is your north star where you say it's good enough, you know, we can move on, we can try to validate it on real data, we can try to see whether we can put it in production. That's one. But there's another one, and this one is actually much more important:

[00:19:35] Khrystyna Sosiak: honestly sitting with your stakeholders and saying: how much time do we have to do the experimentation, and to say that after this time, we're not gonna try anymore? You know,

[00:20:06] Khrystyna Sosiak: The more you invest in a project, the more difficult it is for you to say it's over, even though everyone knows it's over and nothing is gonna come out of it. And that's how a lot of projects fail, right? But they also fail with these bad feelings that someone is killing something that is so close to your heart.

[00:20:28] Khrystyna Sosiak: But when you have this set of expectations, and when you're very clear that we have this goal, and if we're not achieving this goal, or not close to it, in this particular timeframe, we're gonna just kill it, you know? And that's good, because then you know it: you worked hard to make it work, but it doesn't work.

[00:20:50] Khrystyna Sosiak: That's something you agreed on at the beginning.

[00:20:54] Himakara Pieris: You touched on this earlier. It sounds like part of that conversation is having a good way to validate the [00:21:00] results or the impact of the model, and compare that with some real-world, as-of-right-now results, so you have a very clear comparison point.

[00:21:10] Khrystyna Sosiak: Absolutely. I think that understanding is key, because that's the one thing I always said: you can build the best model in the world, but there's no impact in building the model if it's not gonna land in production and actually be used.

[00:21:24] Khrystyna Sosiak: And I think it's so important to make sure that what we build lands in production, and stakeholders are the key to making sure it's there and it's used. And I think that means having the expectations about the time bound, and about when we call it off and say we're not gonna continue.

[00:21:45] Khrystyna Sosiak: But also having real expectations with the stakeholders about how the machine learning works and the mistakes that it can make, you know, the error rate and what is the cost of the [00:22:00] error. We failed one project. The model was really good. It was impressive, like it was so scientifically interesting to build; we literally spent months reading all the research papers and trying to understand how to solve one problem, and we built a model that was really good.

[00:22:17] Khrystyna Sosiak: The results were very close to the results that someone wrote a PhD on, and it was really good. But at the beginning we hadn't really talked about and calculated what the cost of an error is for us, and whether we as a company are ready to take that risk and that cost.

[00:22:40] Khrystyna Sosiak: Saying that we know the value the model can bring, but also the risk that we need to take: are we happy with that or not? And I think having this conversation with the sponsors, right, with the senior leadership that would sponsor your project [00:23:00] and also be the decision makers

[00:23:01] Khrystyna Sosiak: on whether, at the end, the most important step is gonna happen and the model is gonna land in production or not, I think that's very important. And sometimes, as the data scientist or the product manager that works with data science, we are so invested, and we are trying so hard to sell our idea and, you know, what we do,

[00:23:22] Khrystyna Sosiak: That we are so focused on the benefits that we don't explicitly talk about the risks, and I think it's very important to talk about those two things.

[00:23:34] Himakara Pieris: You talked about having this time boxing, a clear understanding of how much time we would want to spend on this problem. What would be a good way to estimate what's acceptable or reasonable? Because if you say, okay, you have two weeks to solve this problem, right? Then if it's not done in two weeks, you're gonna kill it. That doesn't sound quite reasonable. Maybe it is for some problems.[00:24:00]

[00:24:00] Himakara Pieris: What is your way of figuring out what is the right amount of time? What is the right number of cycles to burn through for any given problem?

[00:24:09] Khrystyna Sosiak: So there are a couple of things; I don't think there's one formula you can apply to get the right estimation. And I think that estimation in general, with AI, is very difficult, right?

[00:24:21] Khrystyna Sosiak: Because it's hard to estimate when it's gonna work. So there are a couple of things that I would look at and use as a frame of reference. The first one, and I always start with the business, because I'm, you know, the product manager, so I'll start with the business question and say: for how long can we afford that, right?

[00:24:42] Khrystyna Sosiak: Like, actually, for how long can we afford trying to solve this problem in this way? Because, you know, when you say yes to something, you say no to something else. And if you say yes to one opportunity and one project, it means that those [00:25:00] resources are not gonna be used for something else. And that's also a question of the budget and the risk that we are ready to take, and for how long we

[00:25:09] Khrystyna Sosiak: can sustain that, and for how long it's okay for us to take the risk that at the end it's not gonna work out, right? And I think that's a clear conversation, and we need also to calculate the cost. And I really like to put numbers on it, because when you have numbers it's much easier: you can calculate the cost of your people, you can calculate the cost of the processing power that you need, and you can say, that's the cost of one week of doing it,

[00:25:37] Khrystyna Sosiak: and that's the cost of one month of doing it. Let's not go with the happy scenario; let's go with the worst scenario and say: if we know it's gonna fail, for how long, like what is the risk, in money, you know, that we are ready to take? And I think that's the first thing we would do, right?

[00:25:57] Khrystyna Sosiak: It's just to understand what is the [00:26:00] risk we are happy to take, knowing that the reward we can get is much higher, right? Because if we say that, okay, the return on investment for the year is gonna be this percentage, but if we keep going for one month, or for six months, right,

[00:26:19] Khrystyna Sosiak: and it's gonna take us like five years to return that investment, then I would say, you know, we probably shouldn't be doing it in the first place. That's the first one. The second is experience: talking to the data scientists and to the engineers, looking at the problem that we have and its complexity, and understanding,

[00:26:43] Khrystyna Sosiak: how much time is a reasonable amount to invest to see the first results? And I would never say, hey, let's do one iteration and then we decide, because I think it's not enough, right? We need to try different things. And then [00:27:00] I would optimize for how we can reduce the time of trying new things, right, of trying new algorithms, of adding new data.

[00:27:10] Khrystyna Sosiak: So then, you know, it's not gonna take us weeks, it's gonna take us days, right? Maybe just to see whether we can find something that actually works, and then optimize and iterate on something that actually starts working. But yeah, the estimation is difficult most of the time. I think that you look at the resources that you have and the technical complexity of the problem that you have, because, you know, sometimes you have such a complex problem.

[00:27:40] Khrystyna Sosiak: Like if someone would ask you to build ChatGPT in one week. I mean, probably there are some people that can do that, but if you look at just a normal data science team in some company, they would not do that in one week, right? And that's realistic; there's no way, right? So you need to say what is the time that you're comfortable with for delivering the first results, [00:28:00] right?

[00:28:00] Khrystyna Sosiak: And then going from there, and understanding whether there is actually a positive change in the next iterations, or it all stays the same. Because if you try 10 times and it all fails, then probably that's, you know, the time for us to stop.
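The back-of-the-envelope budget check Khrystyna describes, the weekly cost of people and compute weighed against the expected yearly return, can be sketched in a few lines. All figures, names, and thresholds below are hypothetical, not taken from the conversation:

```python
# Hypothetical time-box budget check for an ML experiment,
# following the reasoning in the conversation (all figures are invented).

def weekly_burn(team_cost_per_week: float, compute_cost_per_week: float) -> float:
    """Cost of one week of experimentation: people plus processing power."""
    return team_cost_per_week + compute_cost_per_week

def payback_years(total_invested: float, yearly_return: float) -> float:
    """Years the expected yearly return would need to repay the investment."""
    return total_invested / yearly_return

# Say the team costs 20k/week and compute 2k/week, and the model,
# if it works, is expected to return 150k/year.
burn = weekly_burn(20_000, 2_000)   # cost of one week of trying
budget_weeks = 8                    # time box agreed with stakeholders
invested = burn * budget_weeks      # worst case: full budget spent, no model

years = payback_years(invested, 150_000)
print(f"Invested after {budget_weeks} weeks: {invested:,.0f}")
print(f"Payback period if it works: {years:.1f} years")
```

The point of the sketch is the shape of the stakeholder conversation: agree on the burn rate and the time box up front, so the decision to stop is arithmetic rather than emotion.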

[00:28:18] Himakara Pieris: This sounds like you're placing a series of bets on the probability of success: in ROI, in the technical feasibility, in the team, and also in the kind of adoption you could get, right? So you have to decide on a case-by-case basis

[00:28:31] Himakara Pieris: how much you are willing to bet that you can deliver X times return on investment, how much you wanna bet this is technically feasible, et cetera. So are there any other things you would put into that mix of considerations, other than ROI, technical feasibility, your team's capabilities, and adoption?

[00:28:54] Khrystyna Sosiak: , let me think about it. So I. I would, again, I think that [00:29:00] it's also important to, always do the market research and understanding what is on the market. And also there's so many use cases explained, right? And I think just getting this information and understanding, so what is the reasonable amount of time to spend on something like that, right?

[00:29:17] Khrystyna Sosiak: I think that's also key to understanding, and to setting the right expectations and the time bound. And also, usually, when you work in a commercial setting, not in research, right, and you have one problem, so for example you build a recommendation model for one product, right,

[00:29:40] Khrystyna Sosiak: I dunno, for dresses, then it's very easy to replicate it and build it for, you know, shoes. If you have a similar set of problems you're trying to solve, for example with different data or for different segments, then it becomes much easier, because with experience you understand, okay, that's probably the [00:30:00] amount of time that we would need to validate and then to build.

[00:30:04] Khrystyna Sosiak: So it's always the question of whether I've already done something like that before, something more or less similar. It's never going to be the same, but if the problem statement and the problem space and the data space are something that we know, then it's easier to have the set of expectations.

[00:30:26] Khrystyna Sosiak: But then, when we think about something completely new that we've never touched, right? Like if you take someone who for their whole career used to build recommendation systems and, I don't know, text analysis, and you tell them now, you know, to do generative AI for videos of some popular singer, that is a very different problem space, right?

[00:30:49] Khrystyna Sosiak: And then it's very difficult to estimate. So then I would also give a bit higher buffer, right, and say, okay, we will need more time than usual, [00:31:00] right? If usually, for a simple problem space, we'll say, okay, it's two weeks, and if in two weeks it doesn't work then we move on, then for something very complex and new we will say, okay, it's gonna be one month, just because we don't know it yet.

[00:31:16] Khrystyna Sosiak: We need to discover, we need to learn, we need to try.

[00:31:20] Himakara Pieris: Say you have a functional model that's performing to an acceptable level. Could you take us through the process of productionalizing that, and what kind of pitfalls you would look out for? What would you flag as high-risk factors that could cause a project to fail?

[00:31:37] Khrystyna Sosiak: The first one is not being able to put it in production at all. That's, I think, a bigger problem than we realize; a lot of companies really have it. But once we say, okay, we are ready for that, I think the one that is

[00:31:53] Khrystyna Sosiak: very common, maybe for smaller companies but also for the big ones, is that not all of them have the [00:32:00] right strategy for MLOps: how you would actually deploy the machine learning model, right, and have the right infrastructure to maintain it. And I think it's very important to understand that deploying a machine learning model, and the processes and the monitoring that are required, is a bit different from the normal deployment of, you know, some API or some application.

[00:32:27] Khrystyna Sosiak: And we need to be able to do that, and we need to have qualified people that know how to do it. And for me, I think the biggest problem in doing it, in my previous experience with the teams that I had, was not having the right people in place. You know, because usually when you hire, and also, I started quite some time ago, not that long ago, but like six years ago,

[00:32:56] Khrystyna Sosiak: At that time, like you would just hire data [00:33:00] scientists. There was no profile of like mops. There was no profile of, you know, machine learning engineer. You would just hire someone who works with data. And when you have a set of like five people that know how to do the data preparation and build machine learning model itself.

[00:33:18] Khrystyna Sosiak: It's not the same skillset as needed to deploy the model and make it work in production. And I think that's the biggest problem is not having the right people and also having the expectation that, you know, oh yeah. We, we will, we, you know, someone will, will come and do it. Because usually it's difficult to find those people and we need to, to make sure that we have them in place and they know what to do, right?

[00:33:44] Khrystyna Sosiak: Because when we just approach it that way, you know, we don't have, for example, the right monitoring in place. And I'll say that's the most important for me, and that's why a lot of projects fail: we would put something in production and, you know, it works, and [00:34:00] then one month, six months, one year passes, and we are still operating under the assumption, from what we validated one year ago, that it works.

[00:34:12] Khrystyna Sosiak: But we don't know if it works now. And if you have a webpage and you build it once, it's gonna work, you know, until something new. So you gonna break something in the code. But with machine learning, it's different because. What works today doesn't necessarily is gonna work tomorrow or in one year. And we need to have all the monitoring in place to make sure that actually your machine learning model is still helping your business and not harming your business.

[00:34:41] Khrystyna Sosiak: And I think one of the very important aspects of having running machine learning models and AI in production is having the right monitoring and alerting in place, and also knowing what actions I'm gonna take once I [00:35:00] receive that alert. You know, like what it means.

[00:35:05] Khrystyna Sosiak: Not only, okay, I need to retrain this model, but: if it's gonna stop working, if we are gonna turn it off, what is the cost of that for one minute, one second, one hour, right? Or if it still continues working while I'm fixing something, what is the cost of that? And I think those things are very important, because we can say we have a set of metrics we're gonna monitor and we are gonna receive an alert.

[00:35:32] Khrystyna Sosiak: And then you receive alerts in the middle of the night, and then what? What is the next step? And I think having this plan and strategy in place, not only for the exciting part of building the model, but for the part when customers actually interact with it and something goes wrong, and the world is changing and the data is changing and things are breaking.

[00:35:54] Khrystyna Sosiak: It's very important.

[00:35:57] Himakara Pieris: I know you have [00:36:00] seven key questions, or seven areas that you look at, as a way to mitigate the risk of failure in AI projects. Could you talk us through those seven key items?

[00:36:11] Khrystyna Sosiak: There are a lot of projects that we have done that failed, and that's what we learned from. So there are some things you can watch out for, to not make those mistakes and prevent your product from failing. The first one would be: you ask the wrong questions. You don't understand the problem, or you don't understand the customer, or you don't understand the data, and then the question and the metric you're optimizing for, and the problem you're trying to solve, is actually not the right one; and then no matter what you do, it's not going to work, because the question is wrong.

[00:36:46] Khrystyna Sosiak: So you need to invest time in understanding that. The next one is not technical, right? You'd say, okay, machine learning products fail because [00:37:00] there's something wrong with the technology or the data? No, actually most of them fail because there is no support from stakeholders.

[00:37:08] Khrystyna Sosiak: There's no understanding, there's no sponsorship for the things that we do, there's no willingness to change the approach that was used for years. And I think bringing all of them on board, and making sure that you get buy-in from the stakeholders that you need to work with, is key. And I'm not only talking about the sponsors that are gonna give you money to do that, right, or a

[00:37:33] Khrystyna Sosiak: Or like, Green light, but also about the people that would, for example, at the end of the day, will need to use the machine learning model. Are they willing to do that? Right. Because if you have department of 100 people and you come to them and you tell them, you know, we want you to use that, and they say, no.

[00:37:49] Khrystyna Sosiak: There's not, there's not a lot you can do. Um, yeah, the data, data is the key. So having the data quality, um, in the right place, checking the [00:38:00] data quality, having the, the right data set in place, it's very important. And if you don't have it, it's a very big problem that can cause the failure. Uh, the data science team, having the right people in place and having the right set of people, when you have the team that every single person would know only one thing, and it's all the same thing, it's not gonna work because with the way the product works and also there's different stages and you need different set of skills, um, Going for something super complex that you not necessarily understand and having like so super complex models, it's also very often like you'll fail and you'll not even know why you failed because it was so complex that you don't even know what optimize for and like that's another one.

[00:38:50] Khrystyna Sosiak: Right? And also overpromising, overselling, setting the wrong expectations. That's another one, because there's [00:39:00] always the risk of failing, and you need to go and talk about that and make sure that stakeholders know about it. There's always the return on investment, and there's always the risk.

[00:39:12] Khrystyna Sosiak: And just making sure that you and the people you work with align on that and are fine with that. Those are the things I would look for.

[00:39:22] Himakara Pieris: Thank you for coming on the podcast and sharing insights today. Khrystyna, is there anything else that you'd like to share with the audience?

[00:39:29] Khrystyna Sosiak: I think just making sure that with what you do in life you bring some impact, to your customers or to your business or to the world, and making sure that we use the power of technology in the right way.

[00:39:43] Khrystyna Sosiak: I think that's very important, and there's so much power and opportunity that we have right now. So, yeah.
