Journey to MongoDB [Mark Porter]

The Swyx Mixtape

Listen to more on the StackOverflow Podcast: https://stackoverflow.blog/2021/08/06/podcast-364-mark-porter-mongodb-database/

Transcript

[00:00:00] swyx: This is Mark Porter, the CTO of MongoDB, on his personal journey from relational databases to MongoDB.

[00:00:06] Mark Porter: I am a relentless tech geek. I've loved tech my whole life. In fact, my Twitter handle is MarkLovesTech. I have used databases since I was 14, with some really ancient technologies. I started out on a 4K TRS-80 Model I computer.

We had to program it in assembly language because there wasn't enough memory to use the local BASIC copy. I very quickly got into databases. I was talking to someone the other day, and he pointed out something I'd never noticed, which is that I've oscillated between using databases and building databases.

So I started out at Caltech and NASA using databases for space data and chip data. Then I built databases at Oracle, versions 5, 6, 7, and 8, for about 13 years. Then I used databases at NewsCorp for huge student data systems. Then I built databases at Amazon with Amazon RDS. Then I moved to GrabTaxi, which is the Uber of Southeast Asia, and used databases to deliver 15 million rides and meals a day, and then came back to MongoDB.

And here I am building databases again. I frankly can't get away from this thing.

[00:01:20] Ben Popper: I love that story. I wonder, does that mean, you know, at each point you had some sort of frustration or saw some sort of opportunity for innovation? You know, you would build something, then you'd be the user of it.

Then you'd realize that the next sort of turn of the wheel was coming. As you moved between those jobs, were new paradigms in databases emerging?

[00:01:38] Mark Porter: Yeah. I mean, it's been really interesting. Half of my career I've been the bow, and half my career I've been the target. And I've got to tell you that sometimes, as a customer, you're not really happy being the target of what has been produced.

Look, the reality is, relational databases have been the modus operandi since 1970, when Codd first did his paper. And then Oracle was the first company that released them, in 1979. They were actually known as relational technology back then and then changed their name later to Oracle. So the mission criticality of databases has never been in doubt.

What has changed is the amount of data, the way we process that data, and what's really, really important. It used to be that duplication of data was important, and things like that. And while that's still important, what's really important now is developer productivity, bar none. That is job one for any mission-critical software company: developer productivity and innovation.

[00:02:35] Ben Popper: Makes a lot of sense.

It does seem like data has become almost this, uh, overwhelming force for some companies. Ryan, I don't know if you have experience with this, but I've been getting a lot of pitches and talking with folks on the podcast, and, you know, it's gone from "we're using data" to "we have data lakes" and "there's a data iceberg."

And, you know, we're only sort of scratching the surface of what we might be able to do with this endless flow of unstructured data that we're collecting. And as you mentioned, yeah, a lot of times what they're looking to do is understand it in a way that allows them to enhance productivity or automate certain processes, which right now are very time- and labor-intensive.

Yeah. Yeah. At my previous job, I worked on an article about data pipelines and, you know, ETL processes, and yeah, there's becoming a separation, I think, between your production database and the database you use to gain insights, right? The production database has to be fast, but the insight database can be a little more flexible in how it produces data, right?

[00:03:34] Mark Porter: Yeah. So we think about systems of record. We think about systems of insight. And yeah, I mean, definitely different people want to do different things with the databases. And so what we do is we think about personas. Are you an analyst? Are you a developer? Are you an AI/ML engineer? Are you a PhD data scientist?

We always try to come at it from the customer and what they want to accomplish.

[00:03:56] Ben Popper: Yeah, I think that's so interesting because, as you said, obviously, databases have always been part of working in the world of software and computers, but increasingly there are these specialties that are very important, and which are producing these really interesting results, that themselves are devoted to data, as opposed to it being something that, you know, needs to be part of the larger process.

Um, so Mark, I wanted to touch on something, which is that you had a part of your career at AWS, which now, you know, has grown into quite a behemoth. Um, yeah, just wondering if you can talk to us a little bit about what you learned there and maybe how some of that applies to the role you have at MongoDB.

[00:04:26] Mark Porter: Yeah. So I joined AWS as the general manager of AWS RDS, which at that time was probably the largest fleet of databases in the world. And that fleet grew just tremendously while I was there. It was amazing, you know, just showing that it's not just databases; it was managed databases that mattered.

So RDS did not build any of its own databases. RDS vended, by the time I left, over a million, significantly more than a million, Postgres, MySQL, MariaDB, Oracle, and SQL Server databases. And so the product that we produced was managing those databases, and people love it when their database stays up, when the backups and restores work, when you can change parameters, when failover works, and all those things.

However, over time, as much as I loved running those databases, I became frustrated with how they were shackles, almost, on customer innovation and customer operability. And so we developed this system called Amazon Aurora, which changed out the storage system underneath Postgres and MySQL. Obviously we couldn't do that for the commercial databases. We made those databases so much more resilient, so much more durable, so much more available, but we kept running into the fundamental limits of a rigid architecture, of high failover times, and of a single-primary architecture, which meant that the blast radius of a system going down, or a plan changing in an Oracle database, I mean, it takes down a whole company.

And I can talk more about availability. In fact, you'll have trouble stopping me from talking about availability if you get me started.

[00:06:09] Ben Popper: Well, I mean, that's the, uh, the big thing about NoSQL, is availability, right?

The replicability, the speed of access. Yeah, for folks who don't know, let's lay out the value prop here. Like, what is sort of the difference between the two, and why would you prefer one over the other? You know, you mentioned shackles. I love that word, but yeah, you know, what are the limitations that it allows you to avoid when you move to NoSQL and I gue...
