Academic Forefronts

Environment Variables

Content provided by Asim Hussain and Green Software Foundation.


This week we are joined by two PhD researchers, Silke Kaiser and Chiara Fusar Bassini, from the Hertie School in Berlin. With host Chris Adams they discuss their use of data science and machine learning and how they are using them to tackle some of today’s most pressing environmental challenges. Silke shares insights into her research on predicting cycling traffic in cities to better inform urban planning and promote sustainable transport, while Chiara discusses her work on analyzing European energy data to support the renewable energy transition. Together, they explore the intersection of technology, data, and policy, highlighting the importance of data-driven decision-making in achieving sustainability goals.

Learn more about our people:

Chris Adams: LinkedIn | GitHub | Website
Silke Kaiser: LinkedIn | Website
Chiara Fusar Bassini: LinkedIn

Find out more about the GSF:

The Green Software Foundation Website
Sign up to the Green Software Foundation Newsletter

If you enjoyed this episode then please either:

Follow, rate, and review on Apple Podcasts
Follow and rate on Spotify
Watch our videos on The Green Software Foundation YouTube Channel!

Connect with us on Twitter, GitHub and LinkedIn!

TRANSCRIPT BELOW:
Silke Kaiser: I like the term "fighting fire with fire." You know, you're trying to make it better, but you're making it maybe even worse. But I think if we make some smart choices along the way, I rather like to compare it to the idea of fighting a forest fire with a controlled burn. What I'm trying to say is that there are different approaches with which we can actually also reduce the emissions caused by AI.

Chris Adams: Hello, and welcome to Environment Variables, brought to you by the Green Software Foundation. In each episode, we discuss the latest news and events surrounding green software. On our show, you can expect candid conversations with top experts in their field who have a passion for how to reduce the greenhouse gas emissions of software. I'm your host Chris Adams.

Hello and welcome to Environment Variables, where we bring you the latest updates and news in the world of sustainable software development. I'm your host Chris Adams. It's important to understand that when we talk about sustainability and technology, it's easy to mix up sustainability of software with sustainability through software. Sustainability of software development is about understanding the direct impacts of technology and doing as much as we can to reduce them without delivering a worse experience for people using the software. Here, we care about the impact of code, like making it more efficient or making sure the energy we use is cleaner and coming from the cleanest possible sources. So when we talk about green software or green IT, this is what we're talking about.

Sustainability through software development occurs through the application of software to solve a specific sustainability problem or provide us with insights that we didn't have previously to help us meet some of our sustainability goals. To make this really concrete, you can use green software and talk all day long about the sustainability of software whilst helping people drill for oil and gas. Now you can do that, but it's really not a good idea if you wanna hit any kind of societal climate goals. And if you're listening to this podcast, I think you probably don't wanna do that either. So while we usually cover the sustainability of software, it can also be helpful to look up from our keyboards sometimes to talk about the effects software can have in helping us reach our climate goals.
So in this episode, we'll be diving deep into the work of PhD candidates who are pushing the boundaries of what's possible in sustainability and software. In this episode, we're joined by two researchers from Berlin institutions, the Berlin School of Economics and the Berlin Hertie School of Governance. So what insights can we gain from their research? How are they using technology to address some of the challenges around sustainability today? Let's find out. So, first of all, can I just give you folks a bit of time to introduce yourselves? I'll hand over to you, Silke, and then hand over to you, Chiara, to introduce yourself. So, yeah, Silke, thank you very much for joining. The floor is yours.

Silke Kaiser: Thank you very much, Chris. I'm Silke Kaiser. I'm a PhD researcher at the Berlin School of Economics and at the Hertie School Berlin. I'm excited to be here today. My research focuses on the analysis of sustainable transport data, with particular emphasis on cycling data. I utilize various tools from machine learning, data science, and spatial statistics to explore this field.

Maybe more on a personal note, outside of academia, I would say I'm a bit of an outdoor enthusiast. Just earlier this week, I came back from a vacation in France, but I'm happy to be back and to join you for this episode today.

Chris Adams: Back just in time for the weather to be nice in Berlin, right?

Silke Kaiser: Exactly.

Chris Adams: Cool. Thank you for that, Silke. And, Chiara, can I do the same to give you some space to introduce yourself as well?

Chiara Fusar Bassini: Yes, thank you for that. My name is Chiara Fusar Bassini. I'm a PhD researcher at the Hertie School in Berlin. I'm very excited to be here for my very first podcast. My own research focuses on the analysis of European energy markets. I use data science and machine learning to analyze time series of the dispatch of single power plants and try to look at how power plant dispatch has changed in the context of an evolving energy system. Besides being an academic, I'm also a rather lousy actor in an amateur theater group here in Berlin.

Chris Adams: Cool, thank you. So you said you're working in a theatre. Are there any particular roles you play or anything like that? Because I think you may be the first actor who's come onto the podcast actually, Chiara.

Chiara Fusar Bassini: Well, I have a tendency to take on the roles of either mad people or policemen, or roles with gender changes. So anything in between. Usually it has been mad people. The latest role I've had was a police officer, and before that I was a mentally ill person.

Chris Adams: Wow. I did not, I was not expecting that. Okay. All right. Well, welcome onto the show. And I guess that maybe we'll see some productions in future. Folks, if you're new to the podcast, my name is Chris Adams. I am the executive director of the Green Web Foundation. That's not the same as the Green Software Foundation.

It's one of the members of the Green Software Foundation. I'm also one of the chairs of the Policy Working Group and also one of the hosts of this podcast. Okay, so before we talk in depth about your research, I just want to check for people who are also listening to this, you've been doing this research under the supervision of Professor Lynn Kaack, I believe, is that the case?

Chiara Fusar Bassini: Precisely. We've been working with her for a few years now, each of us.

Chris Adams: Brilliant. Okay, so the reason I'm sharing this for listeners is that we did an episode, episode 5, where we spoke to Lynn and another person, Will, oh his name has changed, I think it's Will Alpine now, talking all about climate change and AI, two years ago. So if you enjoyed this, I would suggest looking at that to learn a little bit more.

And that might provide some extra context for this discussion. All right, then. Are you two folks sitting comfortably? Happy to go ahead with this?

Silke Kaiser: I think it's good to go and share insights on our research.

Chris Adams: Good stuff. Thank you. For anyone who is listening as well, the thing I'll just share is that we will share show notes with links to all the projects and papers that come up to this. So if any of this is interesting to you, then yes, we, you can continue your quest for more knowledge and insight outside of this podcast. All right. Silke. Let's start with you. Your research focuses on predicting cycling traffic in cities using data from bike sharing systems. And this is something I believe you worked with Lynn and another researcher, Nadja Klein, on from this. Could you maybe just explain a little bit about how this actually helps people when they're trying to design how people move around in cities like, say, Berlin or Paris or things like that?

Silke Kaiser: Yes, I'd love to. So what we generally see when we think about transport in cities is that public space in cities is limited, whether it be in Europe, the USA, or any other place. Generally, then, when we think about how we want to redistribute the space among different mode shares in a city, we see that there often tends to be a heated debate.

And especially as we work towards promoting more sustainable modes of transport and therefore reducing the CO2 footprint of our cities, conflicts often arise. And the question then remains, how do we actually want to prioritize these different modes of transport and allocate the space and also financial resources among them?

So, for example, take Paris. I've lived in the city for several years during my studies. And what you can see in the city is that in the past few years, they made a lot of changes to prioritize cyclists, which has improved the uptake of cycling, but which has also led to quite heated debates.

The same we can see here in Berlin, the city that we're currently both located in. We had a re-election last year here in February, and a lot of the debate actually hinged in part on the choice between prioritizing cycling and individualized motorized transport. So what I try to do in my research is actually to provide more data to this debate, because what I see as the main challenge in these kinds of debates is that we don't have accurate data on how much cycling traffic we actually have in cities.

Chris Adams: Ah, okay. So maybe just to kind of dive in there. So it's basically, we don't have the data to really have a data-informed discussion, basically. That's one of the things that is the challenge here. And for context, so the three of us live in Berlin. We saw, basically, the new government and the new mayor come into power on a very kind of pro-car platform, basically.

So this is what you're referring to, right?

Silke Kaiser: Precisely. So, answering maybe the first part of your question, so for example, in Berlin, we have only 40 locations where we count cyclists. I think in Paris, it's around 53, in New York, it's 41. And then in Berlin, for example, we have around eight times more locations where we count motorized traffic. So we just have much more data and much more information on motorized traffic than we do have on cycling.

And then yes, in Berlin, actually, there was quite a bit of a heated debate pretty much between, let's say, the inner city, which was more pro-cycling, and the suburbs, which were more likely pro-car. The government switched from a green to a more conservative government, which actually decided, just this month, to suspend quite some projects, both long-distance commuter paths and bicycle parking houses in cities.

But that's just more really on the political side of the debate. And then what I really see as the main challenge is that just this data and the information is missing on where we actually would need the infrastructure most, given how much we want to prioritize cyclists.

Chris Adams: Okay, so maybe I should ask, where is the data coming from then, for this?

Silke Kaiser: So, we do have these 40 counting stations in Berlin, that's the case study that I'm looking at, and what we then figured is, well, we don't have that much precise data on cycling, but we do have an abundance of other data on cycling. So, for example, we do have, as you mentioned, the bike sharing data. We do have as well data from Strava.

That's an app popular to record yourself while doing sports. We do have data on infrastructure, we do have data on weather, we do have data on socio economic factors. And we figured, well, why don't we use all this available data to actually extrapolate from these few isolated locations to actually obtain city wide estimates.

And that's what I did in this research with Lynn Kaack and Nadja Klein. So what precisely we do is that we train various machine learning algorithms to use all this kind of data in combination with the cycling counting station data that we have to obtain citywide estimates. And what we actually found is that using only this data is a bit tricky.

It provides us estimates for completely new locations with an error of 32%, which is, you know, rather good in comparison to having no data at all, but still 32 percent is an error I take seriously. And what we then simulated, continuing with this research, was, what if we make some sample counts for new locations? So for example, if I want to estimate the traffic in front of your house, Chris, the cycling volume in front of your house, we figured that if we would maybe put you, someone else, or an automated machine there to count the cyclists, and we were to count the traffic for 10 days and then combine it with our models, we're able to get estimates with an error of only 17%, so really rather low for completely new locations. And this gives us a good estimate of how much cycling traffic we actually have in every single street of a city.
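As a rough illustration of the idea Silke describes (not the models, features, or error metric from the paper, none of which are specified in this conversation), the sketch below uses synthetic data. It assumes a single made-up proxy signal per station, such as daily bike-share trips, a deliberately simple one-parameter model, and a hypothetical 10-day counting campaign that rescales the model at a new location. Every name and number in it is an assumption, and the errors it prints are not the 32% and 17% quoted above.

```python
import random
from statistics import mean


def fit_slope(proxy, counts):
    """Least-squares slope through the origin: counts ~ slope * proxy."""
    return sum(p * c for p, c in zip(proxy, counts)) / sum(p * p for p in proxy)


def relative_error(actual, predicted):
    """Mean absolute error as a fraction of the true count (0.17 means 17%)."""
    return mean(abs(a - p) / a for a, p in zip(actual, predicted))


def make_station(rng, days=60):
    """Synthetic station: a hidden local factor scales a daily proxy signal."""
    local_factor = rng.uniform(5.0, 15.0)
    proxy = [rng.uniform(50.0, 150.0) for _ in range(days)]
    counts = [local_factor * p * rng.uniform(0.9, 1.1) for p in proxy]
    return proxy, counts


def leave_one_station_out(stations, sample_days=0):
    """Average error when each station in turn is treated as a new location."""
    errors = []
    for held_out, (proxy, counts) in enumerate(stations):
        others = [s for i, s in enumerate(stations) if i != held_out]
        slope = fit_slope(
            [p for s in others for p in s[0]],
            [c for s in others for c in s[1]],
        )
        predicted = [slope * p for p in proxy]
        if sample_days:
            # Short counting campaign: rescale the model with the observed counts.
            correction = mean(counts[:sample_days]) / mean(predicted[:sample_days])
            predicted = [correction * p for p in predicted]
        # Score only on days that were not part of the sample campaign.
        errors.append(relative_error(counts[sample_days:], predicted[sample_days:]))
    return mean(errors)


rng = random.Random(0)  # fixed seed so the synthetic data is reproducible
stations = [make_station(rng) for _ in range(40)]
error_model_only = leave_one_station_out(stations)
error_with_samples = leave_one_station_out(stations, sample_days=10)
print(f"model only: {error_model_only:.0%}, with 10 sample days: {error_with_samples:.0%}")
```

The point of the sketch is the evaluation pattern: hold out a whole location rather than random days, so the score reflects how well the model extrapolates to streets without a counter, and then check how much a short local count improves it.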

Chris Adams: Ah, I see. Okay, so you're using the machine learning model, basically, to make the extrapolations when there might not be so much data, to give you more accurate estimates, so you can say, "well, I'm more confident that this many people are trying to get around using an active, non gas burning form of transportation," for example, right?

Silke Kaiser: Precisely.

Chris Adams: Ah, okay. All right. Thank you for clearing that up. And I understand. And just, we'll come back to this a little bit later, but you said you're using ML, not the same as generative AI or something like that. There's a whole flavor of different things you might be using there, right?

Silke Kaiser: Precisely. I mean, there's many different models out there. In this paper, we used rather simple machine learning algorithms, nothing comparable to maybe what most people think of when they think about ChatGPT or whichever generative AI you might think about. Those are really rather simple models making use of the data we have.

I mean, you can target these problems with very complicated algorithms, but sometimes using rather simple algorithms might just be sufficient. And that's what we actually found in this research.

Chris Adams: Okay, alright, thanks. When I was doing a bit of research, I realized that you also wrote a piece, I think in a publication called Catalyse, where you're talking about active transport and why it's important for greenhouse gas emissions, because you've spoken about, like, mode shift, which I assume means basically moving from being in a car to maybe active transport, where people use things like scooters, bikes, stuff like that.

That's what I think you're referring to there. Could you maybe talk a little bit about how this research actually contributes to, like, the adoption of cycling as one of the potential modes? Because you spoke about it in Paris, and I went to Paris and it was pretty transformational when I was there compared to a few years back.

And I also bike around Berlin too. So I have a vested interest in learning this. Maybe you could, like, lay out why that's actually beneficial and how active transport basically helps us meet our climate goals.

Silke Kaiser: Absolutely. So, referring to other research that I've read, research that I haven't done myself, but it is out there and it's been cited a lot, we do see that cycling has numerous benefits. It benefits your individual health. If you cycle, it's good for your physical health and helps with all the sicknesses or illnesses related to insufficient physical health.

We can also see that if you cycle, it's also good for me because then generally we see a reduction in noise and air pollution in cities. So it really benefits the public health, the broader public. And then yes, absolutely. I mean, I did read the IPCC report, which is a report on climate change and it comes out, the last one came out in 2023.

And what they found is that actually 15 percent of net global greenhouse gas emissions are related to transport. This of course includes all kinds of transport, but one of them is urban transport. And there are many levers for how to tackle this, right? And in general, as with climate change, there's no one-size-fits-all solution, but switching from motorized traffic to cycling is one of those means to actually reduce those greenhouse gas emissions.

And coming a bit back to my research as well, what we find also in research and science, or you can think about, you know, talking to your friends and family and gathering some anecdotal evidence, is that one of the biggest deterrents that keeps people from cycling is that they actually are afraid because there's not enough cycling infrastructure.

They're afraid of accidents. And that's a relevant fear. We do see many accidents in cities right now, mainly between motorized traffic and cycling, but also all kinds of other accidents. And what we can do in cities, to actually promote cycling, is to build more attractive infrastructure for cyclists.

This can include bicycle lanes, a better design of roundabouts. And all this attracts people to cycle, but it actually also reduces the risk of serious and fatal accidents. So what I really try to do with my research is that again, if we have these heated debates in cities about how we want to distribute space among cyclists, cars, delivery trucks, etc.

I'm trying to provide data on how many cyclists we actually have in which street, so that when policy makers or transport planners come around, they can use my data and actually make fact-driven decisions on when and where infrastructure benefits the greatest number of people. And that's how I hope my research can contribute.

Chris Adams: Ah, I see. Okay. So that's really helpful. And I think there are some kind of comparisons I can make, which might help me, and some of the listeners, understand. So you know how a couple of years ago, in the middle of the pandemic and COVID, one way to reduce the number of COVID cases was just to reduce the number of people taking tests, right? You know, that's not necessarily the best way to solve that, and it feels like we've got a similar situation. We've got a data asymmetry problem here, and it looks like you're doing some work to address that. Also, you've spoken about, as I understand it, there are various parts of, like, our economy which are easier to decarbonize than other ones.

Like if, for example, in Germany and in America, transport's the biggest, one of the places where we've seen not so much progress on carbon emission reductions compared to things like the energy sector, which is decarbonizing relatively quickly. So this is what some of this is a reference to. Okay, so what we'll do is we'll share some links to Catalyst, I'm sorry, Catalyse, the paper there, and also some of the papers that you have. So we spoke about Paris, and we've spoken about Berlin, where we both live. Are there any other places you would point people to as examples of, okay, this is what good might look like, and this is one place which actually has quite good data to show where you've actually seen quite effective policymaking to kind of change the environment to make it easier to cycle? Because, yeah, not everyone wants to become a MAMIL, like a middle-aged man in Lycra wearing the helmet and everything like that.

Silke Kaiser: So, I mean, there are definitely some cities that, you know, are popular for cycling. For example, just earlier this spring, I did a research stay for some months in Copenhagen. And obviously Copenhagen is a bit of a dream for cyclists, right? I'm not the first one to mention this. And then there are other cities, Amsterdam, you name it, but generally I do have to admit that in my research, I haven't really come across cities that do have much better data.

I would say it's a grasping problem across different cities that data is missing. Copenhagen and Amsterdam have taken political decisions to prioritize cycling, but I do have to admit that I didn't, I haven't come so much across that they've made this as a data-driven decision, but this was more of a political decision.

Chris Adams: Ah, okay. And there's one thing I'll just ask before, Chiara, okay, I am a closet energy, well, not very closet energy nerd. I'm totally looking forward to talking about that. But Silke, I was just going to ask you, so you mentioned the use of Strava. Okay, it's useful to have these new sources of data, but there's also a question about the provenance of that data and, like, the circumstances under which it's collected.

So for example, we've seen Strava used in lots of other places and if you're using Strava, you tend to be a bit richer, a bit younger, a bit healthier than most people. Maybe you could talk a little bit about some of that, because there are various sources available to inform these policy decisions, and, like, Strava is one.

But like, where else, like, assuming you had, you were suddenly queen of the world, where would you wish you could get some of the data from to kind of inform this in future?

Silke Kaiser: So you're absolutely right, Strava definitely is quite biased. For the data for Berlin, for example, I definitely know that the users are mainly male, young, and they do tend to do very sporty biking in comparison, for example, to what I probably do to commute to work. So it is true that some of the data we use is biased and we're trying to balance this off with the other data sources that we have.

We're also taking socioeconomic factors into account because obviously we do not want to have, infrastructure is meant to be there for everyone and not for privileged or less privileged people. It's, meant to be equal for everyone. But then obviously I thought about a lot, well, how could we actually improve the data availability in cities?

And I definitely see two levers that we have. Well, first of all, we can place more cycling counting stations. That is a bit challenging because, for example, we have so many kilometers of roads and it's hard to track them all. There are cheaper options than the ones that we're currently employing. So this might be our one option.

Some of them are then using cameras, for example, that are just much cheaper to put out there. And then the other question, and that's something I'm also looking forward to answering because I'll be looking at this in my future research, is how can we actually place the sensors that we have better across a city? Because currently, look it up again for your city, you'll probably find a similar image, which is that we do tend to place these sensors at very busy and scenic roads. And the question is actually, can we maybe place them at more diverse spots within a city? And if yes, how can we choose the streets to place those sensors at to actually get a more comprehensive image of the cycling traffic and then also of all kinds of socioeconomic areas and a more equal data image.

Chris Adams: Cool. Thank you for that, Silke. All right. We'll come back to you a little bit later about some of the specific techniques that we were using, because we spoke a little bit about ML and there's a lot more we might dive into there. Chiara, if it's okay, can I ask you a little bit about your research analyzing European energy data?

Because you did hint a little bit about how this can affect the renewable energy transition, and one of the things that Germany has in particular is a target to have 80 percent of the grid running on renewables by, wow, in five years. So that's not much time, and we've also spoken about, on the grid, things like time shifting and location shifting as a kind of carbon-aware software, particularly in changing how data centers, like, fit into the grid, I suppose, or the energy they use. Can you maybe talk a little bit about some of the challenges you've found actually working with this data hands on? Because at best, most of us developers, we might use it in a really nice, pretty fashion from Electricity Maps or WattTime or in an SDK, but it sounds like you're pretty much at the front end having to figure out how the sausage gets made. So yeah, if I can ask you, maybe you could tell us a little bit more about how this data comes about and what are some of the challenges.

Chiara Fusar Bassini: I mean, if you've been using Electricity Maps, you probably have been using an application that in the back uses ENTSO-E data. So in Europe, we are rather lucky because there have been two regulations that were released in 2011 and 2013, which forced, in a way, transmission operators to publish a variety of time series of energy data from the grid and from the markets in an effort to increase transparency. And we have a number of data on a central repository, which is called the Transparency Platform. We have load data, we have generation data, we have transmission data about the grids, we have balancing data, balancing markets, but also a lot of information on individual power plants. This data is overall extremely useful, but unfortunately it's not always accurate and it's not always complete, and not all the data is always published in a timely manner.

That very much depends on the type of data and the country itself. We are still very much better off than other markets where there's no data at all. But it's still an issue of what the data quality actually is. Because you mentioned time and location shifting. To do time and location shifting, most likely you will be working with aggregated data.

For example, load data or load forecast data. One could analyze, for example, load to decide whether to shift more energy-consuming activities to the night or to moments where there are off-peak time windows. And on the other hand, one could look at aggregated renewable generation data to try to relocate some of the more energy-consuming activity to times of the day where the grid is actually greener. There's a lot of interest in academia, but also in the industry sector, in providing this information, to have an estimate of carbon intensity. There are a number of startups out there. You mentioned Electricity Maps, but also academics have come up with top-down and bottom-up approaches to compute these carbon intensity estimates at really hourly or quarter-hourly level. The trick here is that you are working with aggregated data, and the quality and timeliness of aggregated data is rather high. The situation is a little bit different when you move to a higher geographical granularity.
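To make the time-shifting idea concrete, here is a minimal sketch that picks the lowest-carbon contiguous window for a job of a given length from an hourly carbon intensity forecast. The forecast values are invented for illustration (in gCO2/kWh) and do not come from Electricity Maps, WattTime, or the Transparency Platform; the function name `greenest_window` is hypothetical as well.

```python
def greenest_window(intensity, duration_hours):
    """Return (start_hour, average_intensity) of the lowest-carbon window.

    `intensity` is a list of hourly carbon intensity values; the job is
    assumed to run for `duration_hours` consecutive hours without pausing.
    """
    if not 0 < duration_hours <= len(intensity):
        raise ValueError("duration must fit inside the forecast horizon")

    # Sliding-window sum: add the hour entering the window, drop the one leaving.
    window_sum = sum(intensity[:duration_hours])
    best_start, best_sum = 0, window_sum
    for start in range(1, len(intensity) - duration_hours + 1):
        window_sum += intensity[start + duration_hours - 1] - intensity[start - 1]
        if window_sum < best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_sum / duration_hours


# Hypothetical 24-hour forecast with a midday dip, e.g. from solar generation.
forecast = [420, 410, 400, 390, 380, 370, 350, 320, 280, 240, 200, 170,
            150, 160, 190, 230, 290, 350, 410, 450, 460, 450, 440, 430]

start, average = greenest_window(forecast, duration_hours=3)
print(f"Run the 3-hour job starting at hour {start} (avg {average:.0f} gCO2/kWh)")
```

A real scheduler would also need deadlines, forecast uncertainty, and the option to pause and resume, all of which this sketch leaves out.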

Chris Adams: Okay, so from Germany going to like Berlin or Germany going to another part like Frankfurt, for example, something like that. Yeah?

Chiara Fusar Bassini: Rather, when you're going from the aggregated generation to the generation of individual power plants, because in that sense, you might be interested to know which power plants are actually generating right now. And you might be interested to know which areas are generating more solar, for example, which areas are generating more wind. Unfortunately, we don't have data on all power plants, which would be rather impossible in terms of amount and extent of the data. We only have data for power plants that are at least 100 megawatts. It's mostly conventional power plants. So, for example, we have no individual information or very little information on wind farms, because some of them are not big enough to meet this criterion. And also this data gets published with a significant delay of four days, so that you can't really use it to do anything operational. It's also not conceived for that. We use it more for analysis than for forecasting, but nonetheless, we can use this data to understand a lot about individual power plants, why they decide to dispatch and how they are dispatched, and especially in the context of conventional power plants, how their dispatch has changed over time because of political reasons, but also because of the increase in cycling of fossil power plants, because they have to adapt to more renewable energy generation.

Chris Adams: Can I just quickly stop? I just want to check I understand some of the terms you've used for listeners who might not be familiar with load, cycling, some of these things here. So when you talk about loads, you talk about energy, like basically that's power draw, what people are trying to draw from the grid.

You mentioned that. And then you also mentioned, I think, like cycling. So that's like basically scaling down a power station in response to there being loads of wind on the grid or stuff like that. Maybe is that about right?

Chiara Fusar Bassini: Yes. Thanks for asking. So to clarify, load means the demand, and in the past, demand, especially from industry, but also from households, has been rather predictable. And the way we faced demand, or we satisfied demand, because, well, in an energy system, demand and supply need to be equalized at any time. In the past, most of the baseload, so the main bulk of consumption, has been satisfied using traditional fossil fuel technologies, so-called dispatchable, because you can decide when and how to switch them on and off.

Chris Adams: Ah, okay.

Chiara Fusar Bassini: But the thing is, as more and more renewables enter the grid, they cannot be dispatched whenever, they can't choose when to dispatch...

Chris Adams: Yeah, we can't control the sun and the wind. We can just respond. Yeah.

Chiara Fusar Bassini: They just respond to external weather factors, right? But that also implies that we still have these conventional power plants that are dispatchable, but we now have to operate them with increased flexibility. So they have to be able to ramp up and ramp down as the load is more and more satisfied by renewable energy sources when they are there.

And we always make the assumption that, say, conventional power plants are a hundred percent flexible, but that's not actually the case. For example, some power plants, when they are turned on, have to generate a minimal capacity. And if the demand for that capacity is not there, that might be an issue, or they might have some minimal times to be switched on and switched off.

So there are plenty of interesting questions that arise from the increase of renewables. Like, how will conventional power plants cope with more renewables in the grid?
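The terms load, residual load, minimum output, and cycling can be illustrated with a toy calculation. The sketch below dispatches a single hypothetical conventional plant against whatever demand is left after renewables, forcing it to run at or above a minimum output whenever it is on. All figures are invented (in MW, one value per hour), the function `dispatch` is an assumption made for illustration, and real dispatch additionally involves markets, ramp rates, and minimum up and down times.

```python
def dispatch(load, renewables, capacity, min_output):
    """Dispatch one conventional plant against residual load, hour by hour.

    Returns (hourly output, number of start-ups, energy produced above demand).
    Demand beyond `capacity` is simply left unmet in this toy model.
    """
    output, starts, surplus = [], 0, 0
    running = False
    for demand, green in zip(load, renewables):
        residual = max(demand - green, 0)  # demand left after renewables
        if residual == 0:
            # Renewables cover everything: the plant shuts down.
            running, produced = False, 0
        else:
            if not running:
                starts += 1  # each off-to-on transition is one cycle
            running = True
            # When on, the plant cannot go below its minimum stable output.
            produced = min(max(residual, min_output), capacity)
            surplus += max(produced - residual, 0)
        output.append(produced)
    return output, starts, surplus


# Hypothetical six-hour window with a midday renewable peak.
load = [500, 500, 500, 500, 500, 500]
renewables = [100, 300, 450, 600, 480, 200]

output, starts, surplus = dispatch(load, renewables, capacity=400, min_output=150)
print(output, starts, surplus)
```

In this made-up window the plant starts twice and, in two of the hours, has to produce more than the residual demand because of its minimum output, which is the kind of inflexibility described above.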

Chris Adams: Ah, I see. Okay. So one thing you're saying is, yes, it's not like a computer. You can't turn it on straight away, like in milliseconds. So that's one thing you mentioned, and the fact they need to do that more is another issue. And if I understand it, what you described was quite a physical process.

It's like, we're not using bits, we're using atoms, like burning coal, things are expanding and contracting. Like, is there a risk that, you know, a big power plant could be damaged some more, or does that introduce any wear and tear when people need to scale something back? Because I can imagine someone saying, "hey, you're making me change how I do things, and therefore you're introducing some risk into this. That's not what this was designed for in the first place."

Chiara Fusar Bassini: Yes, that's actually a very interesting question. What I mentioned, cycling, meaning that you operate conventional power plants more flexibly, has some consequences on the lifetime of power plants, especially if you keep on turning them on and off. There is indeed some wear and tear for thermal power plants. Some power plants may not even be able to do so because they have agreements with O&M managers that tell them, you know, "you can do that, but then you'll have to pay more because we'll have to do more maintenance." And also, there are a number of obstacles that arise, especially for older power plants that have not been conceived with this flexibility option in mind, but rather to satisfy baseload.

Chris Adams: Ah, okay, thanks for clarifying that. And you're essentially doing some of the research to see how you might predict some of this better to either reduce or basically accommodate some of these changes that we might have when we've got a much more dynamic grid that is influenced by the sun shining and the wind and all the things like that, right? So maybe we can talk a little bit about some of the techniques being used to track this and reduce the amount of, maybe, reserve capacity that needs to be held, or reduce the amount of wear and tear that might be imposed on the entire system full of all the different kinds of power generation. You spoke a little bit about using machine learning, and we spoke to Silke.

Silke mentioned that she's using some ML models, which are not like generative AI. It's a different kind of AI. Could you maybe talk a little bit about how that gets used, because you hinted at it, and what some of the barriers are for using some of that? Because that sounded quite enticing and interesting.

Chiara Fusar Bassini: Yes, before that, I want to add on this. So there is some parallel research being done, especially at engineering departments, with a lot of engineers trying to use machine learning to efficiently operate conventional power plants to reduce these wear and tear problems and, in general, damages from cycling, while still satisfying a change in demand.

What I'm doing is a rather different analysis of the historical usage of power plants. So to see how power plants have actually been operated so far in the markets, how flexible they actually are. Because sometimes, again, we assume that they're 100 percent flexible, but how flexible are the power plants that we already have in the grid? And also how available are they in cases of outages, for example? Like, what's the percentage of time in the year that they actually could provide electricity?

And in terms of techniques, well, it's a lot of time series data, so most methods suited to time series can be used here. It very much depends on the ultimate task, but one of the major obstacles I encounter is that this high-granularity data is by far not as good as the aggregated data. Especially, for example, on the availability of power plants, which has to be reported in a rather accurate way but is then not one-to-one translatable to a time series format because it's published as market messages, meaning that the data that we have is not in a format that makes it directly usable for a researcher. So there are a number of obstacles that are really determined by the data quality rather than by the task itself.
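To make the market-message obstacle concrete, here is a minimal sketch of the translation step: expanding outage messages into an hourly availability series and computing the share of capacity that could have delivered electricity. The message fields (`from`, `to`, `unavailable_mw`), the plant size, and the rule that overlapping messages simply add up are all assumptions for illustration. Real messages are often revised or cancelled, which is a large part of what makes this hard in practice.

```python
from datetime import datetime, timedelta


def availability_series(installed_mw, messages, start, end):
    """Expand outage messages into an hourly available-capacity series (MW)."""
    series = {}
    hour = start
    while hour < end:
        # Sum the capacity reported as unavailable by every message covering this hour.
        unavailable = sum(m["unavailable_mw"] for m in messages
                          if m["from"] <= hour < m["to"])
        series[hour] = max(installed_mw - unavailable, 0)
        hour += timedelta(hours=1)
    return series


# Two hypothetical outage messages for a hypothetical 500 MW plant.
messages = [
    {"from": datetime(2024, 1, 1, 2), "to": datetime(2024, 1, 1, 5), "unavailable_mw": 200},
    {"from": datetime(2024, 1, 1, 4), "to": datetime(2024, 1, 1, 6), "unavailable_mw": 400},
]
series = availability_series(500, messages,
                             datetime(2024, 1, 1, 0), datetime(2024, 1, 1, 8))

# Share of installed capacity that was available over the window.
availability_factor = sum(series.values()) / (500 * len(series))
print(f"availability factor: {availability_factor:.3f}")
```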

Chris Adams: Ah, okay, so it comes down to the data a lot of the time then, basically, yeah?

Chiara Fusar Bassini: Again, like as Silke said, sometimes it's really just a matter of the data that you have, like the research that you can do is going to be determined by the quality of the data that you have.

Chris Adams: Okay, we'll touch on that a little bit later, but I guess that does make me think about, particularly in Germany and countries where we've seen very rapid changes. Like, in Germany, you know, there's a massive craze of balcony solar, for example, or we've seen loads of batteries coming onto the grid, or even Pakistan. We've seen, like, a third of the new power introduced this year come from rooftop solar, and each one of those is individually less than 100 megawatts. That's an enormous chunk of power. So there's all this new stuff that we don't necessarily have access to the data for, to actually figure out, okay, how will the grid work and how can we make sensible predictions on this? That's useful to know. Brilliant.

Okay. So we're speaking a little bit about the upsides and how, where some of the potential might be. We do speak about Green Software, about reducing the environmental impact of some of this, and obviously when we're doing some of this work, I've asked a little bit about the kind of models you might be using, partly because there's a question, whenever we start using technology to help us meet climate goals, it's when some of that energy is still coming from burning fossil fuels, for example, there's trade offs to be made. Does anyone want to go first, talking about how we think about these trade offs? Because as practitioners, I imagine you're at the coalface, but you're also working with some of the people who think about this every single day. And like, if you're working with Lynn, and like, Lynn was one of the founders of Climate Change AI, I reckon she probably has some reckons and you've probably had some conversations about this, right?

Silke Kaiser: Absolutely. Actually, I think just at the last group meeting we had, we discussed precisely this topic, because obviously it is a question that keeps coming up, a question that we do want to answer. And it's in our minds, right? Because we want to do something positive for the climate, and then actually the net result might be negative because our models consume that much energy.

This definitely is a topic that we think about a lot, I would say. I see Chiara nodding. I think she's agreeing with me. And I can see that maybe to the outside world, often this can seem a bit like, I like the term of fighting fire with fire. You know, you're trying to make it better, but you're making it maybe even worse. But I think if we make some smart choices along the way, I rather like to compare it to the idea of fighting a forest fire with a controlled burn. So, for example, in the models that I was employing, I did partially check how big the energy usage was.

I was using simpler models, as I mentioned earlier, so the energy consumption wasn't that high. But I think it's good for us and for everyone out there using similar models to track your energy consumption. And there are very nice packages and libraries out there, tools, all kinds of things, open source, freely available, that are very good at measuring the energy consumption you have.

And then of course there are a whole bunch of other approaches that you can take, right? I mean, you mentioned it's an issue if it comes from fossil-fueled energy, but obviously, you know, I know that there are a lot of, like, server and data centers, for example, out there in Iceland, where you tend to have more natural cooling and where a lot of the energy being produced is renewable.

I'm not saying that's at all perfect, but what I'm trying to say is that there are different approaches with which we can actually also reduce the emissions caused by AI.

Chris Adams: Ok, so there was one thing about the actual technique, like, AI is not a monolith. There's all different approaches within this, in some ways, not particularly helpful term. Like, the use of relatively small machine learning models, which are relatively simple, that's going to have a totally different footprint to the model used to generate Sora, like a video or something like that.

And that's something that we probably would benefit from having a better kind of intuition of as practitioners, for example. And you spoke a little about the carbon intensity. So there's two things that you have there. And you mentioned some software that you have. And Chiara, if I can kind of give the floor to you, because I think you mentioned you've spoken about some of this before, about, yeah, there are some tools and I use them as well. Can I ask you a little bit about, I mean, how do you think about these trade offs? Or is it a trade off? Or is there another set of dimensions you might be thinking on rather than like forest fires and controlled burns, for example?

Chiara Fusar Bassini: Yeah, I think there are two things that need to be thought through when using AI. Number one is, how do you develop your model, and then what do you use your model for? So how do you develop your model? That's similar to what Silke said, for example, doing emission tracking while developing the code and while training the model. And at the moment, I think AI is missing some embedded indicator of the social and environmental cost of the training. So we kind of think of performance metrics such as accuracy, such as classic cross-entropy losses and so on, and we think only about precision. But sometimes we need to be a little bit more critical of whether an increase in accuracy of 0.1 percent is worth an increase in the training time of two hours or an increase in the size of the model of 25%. These are like actual numbers, and scientists have coined for that the term green AI, meaning, okay, can we, in a way, embed this measure of the size of the model within the loss that we are trying to minimize in the training of our model?

A good example is the Bloom model, which is an alternative large language model to GPT. It is similar in size, but the CO2 emissions of the model are 20 times lower than those of GPT-3. And this has been made possible by, first of all, smart usage of the training and also tracking of the carbon intensity of the grid. The model was trained mainly in France, which runs predominantly on nuclear power, so in a much more carbon-neutral system. So there are a number of things that one can consider while training their model.

But also another thing that is very important, and I think that we sometimes don't really think through, is what are we using AI for? And in that sense, there is currently no standard assessment in place. Like, is this application really worth using AI? AI is, by its nature, ethically neutral. It can be used for anything from targeted advertisement, which will probably have a negative impact on the environment, to detecting wildfires, so a very positive impact. I think policymakers in that sense can do a great deal to really make a difference and start, for example, by providing a classification of which use cases are positive for the environment and which are negative. It may sound like science fiction, but it has already been done in the European AI Act from the perspective of risks, like which applications have a high risk and hence should be more controlled and which others have lower risk. And I think a similar classification would also be very useful for environmental purposes.
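One way to picture the green AI question Chiara raises, whether a tiny accuracy gain justifies a much larger model, is a selection score that adds a size penalty to the validation loss. This is only a sketch under stated assumptions: the logarithmic penalty, the weight `lam`, and the two candidate models are invented for illustration and are not the formulation from any particular paper or from the speakers' own work.

```python
import math


def penalized_score(val_loss, n_params, lam=0.01):
    """Validation loss plus a penalty that grows with the log of the model size."""
    return val_loss + lam * math.log(n_params)


# Two hypothetical candidates: the larger one is 25% bigger and only slightly better.
small = {"name": "small", "val_loss": 0.300, "n_params": 10_000_000}
large = {"name": "large", "val_loss": 0.299, "n_params": 12_500_000}

# Pick the candidate with the lowest size-penalized score.
best = min([small, large],
           key=lambda m: penalized_score(m["val_loss"], m["n_params"]))
print(best["name"])
```

With `lam=0` the larger model wins on loss alone, so the weight is where the value judgment about the trade off sits.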

Chris Adams: I'm really glad you mentioned that because I ended up reading through the AI Act for research recently. And on the idea of risk, you're right, there doesn't seem to be that much reference to the use of AI for, let's say, you know, increasing the extraction of fossil fuels, right? There doesn't seem to be much mention of that, but there is some information about the transparency around training.

And now that we've looked at it a bit closer, so within the Green Software Foundation, there is a group called the Real Time Carbon Group. We've been looking into some of the specific implications of this, and it looks like the AI Act is probably going to suggest not just understanding the training, but also the cost of inference, like the use of the model rather than just the training of the model. If I can just quickly, you mentioned there are tools out there, and Silke also mentioned there are tools out there. If I did want to measure some of this, and if I did think there was some legislation coming for this, what tools are there available for me to measure the direct impact? So at least I know what the trade off might be.

So we understand that the carbon footprint of decarbonizing transport, like Silke mentioned, that's going to be, you know, positive, but quite, but there's ways of calculating that, but for us as practitioners, are there any software or any tools you might recommend that are kind of common in the field now? Either goes. I'm happy to, whoever's more comfortable talking about this.

Chiara Fusar Bassini: I'm thinking CodeCarbon is more probably a standard used by many scientists. I know there are more applications that might have a higher granularity, but I guess that's a

Chris Adams: That's the one that you folks have used, right? Okay, I hear CodeCarbon used a lot, and as I understand it, that's the one that's been used for the Bloom model when they wrote a paper about that. What I'm not sure about is whether Facebook have actually explained this, because when I was looking at Llama's model, so AI models have model cards, in which various responsible practitioners now say, "this model took this much carbon," or "this much energy went in to create it." For example, if you go to the existing Llama 3.1 model card on Hugging Face, and you try to follow a link to the actual methodology, it's not actually explaining how it works. There's now a bug. I filed a bug to ask, well, how did you work these figures out? Because this feels like it's quite important, especially because when you look at the numbers, it's significantly larger than Bloom, basically. So what you're referring to is CodeCarbon. That's one tool that people can use that will give you some idea, that is in use in a few places already, and that's relatively safe to start off with. Great. Okay. And we spoke a little bit about some tools. So if someone wants to take their first steps, they might look at this.

And there are various projects I'm aware of to make it a bit easier to understand the impact of one versus another. I believe there's one, Energy Star AI, or AI Energy Star, or something like that. There's one person who I've spoken to who's involved with it. Boris. I'm so sorry I can't pronounce your surname, but I do know you're the AI sustainability lead at Salesforce. Boris G is one of the people who's been writing about this. He's not the only author, but he's the person I know, and we'll share a link to that as well, because that's the first thing I've seen that's a useful, like, a kind of nice way to give you an idea of what the inference, the usage as well as the training, might actually be. If you were to look at this, we've still got this issue of data or having access to data. And Silke, I ask you, if you were like queen of everywhere for a moment, how would you change it, right? Let's say that you want to be responsible AI practitioners, like what are the things that we need to see in the coming years to make it possible to be responsible practitioners, so that when we do use AI, we're using it in the greenest possible fashion? Silke, I asked you first about Queen of Everywhere, so maybe you go first and I'll hand over to Chiara.

Silke Kaiser: Well, that's a very good question. I'd definitely say, as in general with all kinds of, you see, more technical approaches, we do need reproducibility and traceability of what we do in our research. I mean, just as you mentioned with Llama, I think it's important that other people are also able to understand what we did. What was the energy consumption of what we did? How can they check the things that we've done and see if we did it properly, if we took the right approach?

And then obviously, I think this is a bit less related to the topic that I or that Chiara was working on, but also in the long term, we do need to think about ethical concerns coming down to this. And then again, I think just, really, transparency. So I really think that transparency is a good way to address this.

What take do you have on this, Chiara?

Chiara Fusar Bassini: I think one of my major takes, also from what Silke mentioned, and I'm really glad you mentioned it when we were talking about policy making, is that very often policy making is not data-driven. One problem is that we don't have the data, and it can be addressed partly by regulators asking for those data, right? But another issue is also that we don't really do data-driven assessment of the policy that we implement then. And I came across very recently a paper that tries to systematically evaluate policies that have been implemented in the last 25 years. This very recent paper was published like a few days ago.

And I thought it was very interesting. Well, once again, the result of the paper is that there's no one size fits all, and some countries, depending on their level of development, might need different policies. And we have to keep in mind that we can't use the same policies for a developed country, whose energy consumption, for example, is no longer linearly dependent on its GDP, as for a developing country that has very different issues. But I think in general, this approach of doing data-driven policymaking and science-driven policymaking is something that we would really need in this space.

Silke Kaiser: I think that's something I can really agree on. I often feel that as scientists, we're trying to really produce clear results, objective results. And then often we feel there's maybe a bit of a gap between the two: the research that we do and how much of it is sometimes taken up by policy.

And obviously, because we do really put so much effort into this and always try to be objective, we hope that this will eventually be used more and more in the policy sector.

Chris Adams: You've touched on a really interesting point, and I can think of some examples that just occurred to me. So, we had an interview with, oh, Vlad Cor, his first name is Vlad, I'm gonna mispronounce his surname, but we spoke all about the rebound effect, and Vlad Coroamă, that's it, Vlad Coroamă had this lovely post actually on LinkedIn talking about the curse of potentialitis, which is basically talking about, we have all these kinds of really exciting projects, but whether people follow through to check whether the actual gains materialised, or the benefits materialised, there's much less effort put into that.

And we've been seeing, like you said, Chiara, from the last 25 years, we've been seeing predictions for things that would happen in 2020 or 2030. And 2020 is in the past now, we can check if these actually delivered, but a lot of the time we do not see that. And in our field, specifically as kind of cloud providers, or people who might be consuming services, I'm thinking of a really good example. Microsoft has a whole thing about pushing for AI and everything like that. And we know that, as you mentioned, AI can be used for good and can be used for bad, or used for climate-aware things, which are really helpful, and things that are not so good. And we've even seen people who are workers really pushing for this. I'll share a link to an article in Grist, written by Maddie Stone, where she talks about some of the sustainability connected community inside Microsoft speaking to some of the management there. There's a guy called Darrell Willis. He's the vice president of energy. And they spoke and said, "hi, we are pushing for," you know, "can we please have a conversation about what we're using AI for inside our company, because we're one of the largest companies in the world and we're one of the leaders in various industries," right?

And there was a commitment to say, we're going to produce, as the management said, "we're going to start releasing information about, okay, how much of our use of AI is coming from the fossil part of the industry versus the renewable part of the industry?" And this feels like a really important data point if we're going to be looking at tens of billions of dollars used on AI.

I mean, we know that it's an accelerant. If it's an accelerant of fossil fuel extraction and burning, that's a very different story to using tens of billions of dollars for renewable energy, for example. And if we've seen commitments at a management level, then it would be nice to see these. As we understand, these commitments were made, these were shared inside the team, but we don't have this, and we'll share a link to the specific terms. Actually, I'll just share the quote with you, because I think it's one thing that, if you're an employee of a cloud firm, or a customer of a cloud firm, it's the kind of thing you might want to know about. So on the call, "Darrell Willis, committed to providing employees with updates on net zero requirements as Microsoft continued to implement these energy principles. Committed to providing a breakdown of energy divisions revenue across six different sectors from oil, gas extraction, to all zero, low to zero carbon energy. So sharing this information internally." Now this feels like a thing that employees probably should be aware of or asking for. It also feels like something that, if you're an investor in Microsoft or a customer, you might want to know.

Because there's an impact inside your supply chain thinking about this. And if you're choosing one provider because they have really strong ESG credentials, this may make you view it somewhat differently. We'll share the links because it seems to be the best concrete example I can think of at significant scale that we might be talking about. And I'll get down off my soapbox, because that's just the thing that really leapt out when you spoke about that. So we're coming up to time, and we've spoken about the different uses of AI, sustainability of software, as well as some of the things you might want to use or be aware of as a practitioner. If people do want to find out about the work that the two of you are doing, where should people be looking? So Silke, if people are interested in your work, is there a LinkedIn page or is there a website that you direct people's attention to?

Silke Kaiser: I normally try to direct people to my personal webpage, which is silkekaiser.github.io. Or you can also find me on X or on LinkedIn. And I'm always happy to share news on my research as well as the articles that are out there. And I'd be happy if people were to look at those pieces of information.

Chris Adams: Cool, thank you. Alright, and Chiara, if I just hand over for you?

Chiara Fusar Bassini: I've seen Silke's website and you guys should really see it. It's a very nice animation. I don't have a website myself, but I'm very active on LinkedIn. You can find me under Chiara Fusar Bassini.

Chris Adams: Chiara F U S A R, we'll put it in the link, we'll add it in show notes. So, Chiara Fusar Bassini. Brilliant. Thank you, folks. This has been lots and lots of fun. I've learned a lot from this, and this has been a really nice chat. Hopefully, we'll cross paths sometime in Berlin, but otherwise, thanks again for coming on, and have a lovely week.

Silke Kaiser: Thank you very much for having us.

Chris Adams: Ta ra! Hey everyone, thanks for listening! Just a reminder to follow Environment Variables on Apple Podcasts, Spotify, Google Podcasts, or wherever you get your podcasts. And please, do leave a rating and review if you like what we're doing. It helps other people discover the show, and of course, we'd love to have more listeners. To find out more about the Green Software Foundation, please visit greensoftware.foundation. That's greensoftware.foundation in any browser. Thanks again and see you in the next episode.
