Ep#113 Platform Engineering Teams Done RIGHT!

March 7, 2023

Episode Summary

Welcome to the Jon Myer Podcast! In this episode, we are excited to have Adrian Cockcroft, a technology visionary, and former Cloud Architect at Netflix. Adrian will be sharing his insights on the four principles of platform engineering teams done right. As more organizations adopt cloud technologies, these principles can help engineering teams build and manage scalable platforms that can support the needs of modern applications. So, let's dive in and learn from one of the best in the business!

About the Guest

Adrian Cockcroft

After a long career working at the leading edge of technology, I retired from full time work on June 3rd 2022. I joined Amazon in October 2016, as a VP in AWS Marketing focused on building relationships with customers. I keynoted 20 AWS Summits around the world, presented on technical and management topics at many events, and hired the open source community engagement team. Moving to Amazon Worldwide Sustainability in March 2021, I led sustainability marketing for AWS, invested in the Amazon Sustainability Data Initiative, helped coordinate the rapid growth in sustainability related headcount across AWS, and helped author, launch and promote the Well Architected Pillar for Sustainability.

#aws #awscloud #finops #cloudcomputing #costoptimization

Episode Show Notes & Transcript

Host: Jon

Our next guest is Adrian, a tech advisor at OrionX.net. Our topic today is platform engineering. Adrian is a tech advisor and consultant, previously a VP at Amazon, spent time at a VC firm, was a cloud architect at Netflix, and has been telling stories about that for the last decade. Also, if we have time, we're going to be talking about his microservices retrospective, what we've learned and didn't learn from Netflix. He has a dual talk track, and the other one is cloud provider sustainability, which is an interesting topic. If we have some time, we'll jump right into it. Please join me in welcoming Adrian Cockcroft to the show. Adrian, thank you so much for joining me.

Guest: Adrian

Cool, glad to be here again. Good to see you.

Host: Jon

So, Adrian, you joined us at re:Invent for one of our talk tracks in our podcast. Today you're joining my podcast and we're talking about a very important topic. Now, you posted this article on Medium talking about platform engineering teams done right. Adrian, I'm going to let you do a quick introduction about yourself, and then I want to jump into this topic.

Guest: Adrian

Yeah, it was one of those things where there has been a lot of discussion of the term recently, and I started seeing other people writing blogs. Sam Newman wrote one, and there's just more chatter about platform engineering. And so I decided that I wanted to just get out what I thought about this, based largely on the experiences I had at Netflix. But rather than talking about how to do it in a very specific way with a particular product, I wanted to talk about the principles you'd use and some of my view of how platform engineering looks. And why people are talking about it now, anyway, is the other piece.

Host: Jon

In your article you outlined principles, before we get to that, I got to ask you, what is the definition of platform engineering?

Guest: Adrian

Well, it's something that people have always done. If you are going to build something, you build it in layers: you build a base platform, and you build on top of that. So it's sort of a stable base that you can build another layer on top of. As you get deeper into the platform, you usually get things that are more stable, more mature, and you're building something new on top. So rather than, hey, I want to build a website, let's write an operating system from scratch, or let's design our own CPU, we don't do that kind of thing. We start with a CPU platform, an operating system platform. We don't write our own languages. We pick a language platform that comes with a library, then we start looking at frameworks. And after a while, you figure, okay, that collection of the technology stack that you decided to use is your platform.

Guest: Adrian

So that's the general idea of what a platform is. This is a very old concept and it's used across many industries; it's not just an IT thing. I think there are sort of three reasons why it's come up as a topic now. One is just the general chatter about it, because Kubernetes is complicated and people have started building products and tools that deal with that complexity. As there are more and more options on the cloud vendors, it's like, how do you pick out of the 200 different AWS services and whatever there is on Azure and Google and narrow that down to the things you are going to use? How are you going to put them together, what guardrails are you putting in, what's your security stance, how are you building authentication, those kinds of things.

Guest: Adrian

So that's a need to have an opinionated platform. And then on top of that, some vendors have got tools that they're trying to sell where they're trying to push platform engineering. So you're starting to see some vendor marketing budget where they're trying to get people to pick up platforms as a new buzzword. You see things like, is platform replacing DevOps, and stuff like that. And no, it's just an old topic and it's just another way of doing things. But then the final thing that I think makes it interesting right now is the Team Topologies book. Hopefully your viewers have seen that book or at least heard about it. It's got some very useful ideas on how to structure engineering teams. They have different types of teams, and one of those teams is a platform team. So they have it explicitly as one of the teams; they just picked that name because it seemed reasonable.

Guest: Adrian

But the idea is it's a team that is managing whatever the platform is, and it has an API and consumers, so it's an internal product team that you use to build something. And I think what's kicking this off is that the book sort of crystallized the definition of what a platform team and platform engineering are. That's sort of why it's just recently become a hot topic, because you can go back 10 years, even 20 years, and people were talking about platforms. So it's not new, but it's got some salience right now.

Host: Jon

I was going to ask you if this is not something new, because you were talking about architecture, using hardware infrastructure and then what you're building on top of that. We're going back to the data center, but you're talking about the generalization. I think your article says that if you were to show this graphically, you were going to use a layer cake, but you decided to use the Wardley map to kind of depict everything. And I think layers are very kind of outlined, right? Here's what you've got to have, here's the next thing, the next thing. And the platform, correct me if I'm wrong, is it like the application as a whole? Here's my app, I'm using it on this platform with this operating system, this service tool on top of it.

Guest: Adrian

So the first principle I have is that there isn't one platform and there isn't one platform team. It's layered, and each layer in the platform has different concerns and you'd need different specializations on it. So on your org chart, if you're at any scale, you'll have multiple platform teams looking after different pieces of it, and then they'll layer together in different ways for different parts of the business. Going back to the Netflix example, we were on top of AWS. AWS as a platform is very stable. They don't take things away, they just add features now and again. You can take code you wrote 10 years ago and it probably still works on AWS. The scripting and the tooling are just adding features. So it's a very stable, growing platform. On top of that, though, we said, okay, these are our policies for how we're going to create new accounts.

Guest: Adrian

This is how we're going to authenticate to them, this is how our internal sort of LDAP and name services are hooked up to the cloud. Here's how we manage it from a security point of view, all those kinds of things. So we had an opinionated platform, the Netflix cloud platform, and that was just how we decided to use AWS. So that was the first layer, and then that was used by different teams across Netflix. The team most people think of is the team that builds the front-end code. We built a whole lot of code in Java; there were libraries and all the stuff we needed to build and deploy Java applications into production to build the website, mobile apps, all that stuff. So that was one set of platforms, but there was another team building more operations stuff and they wanted a Python-based platform.

Guest: Adrian

So we ended up building a Python version of it, with some duplication, but basically to build out the things that we wanted to do there. Then there was a data science team that was again working very differently. They were mostly running Hadoop and things like that, who knows whatever they were running at the time. And they had a very different platform because they were concerned with doing data science. You don't need to be concerned with keeping the website up and the front end and downtime. You're mostly concerned with throughput and getting your batch jobs finished by tomorrow, those kinds of things. And we had another platform around encoding movies, because a big workload at Netflix was sort of crunching through all of these incoming video streams and recoding them over and over again to make them slightly better, all that kind of work. So they each had their own platform team sitting on top of the central platform, and then within them, you had other platform layers on top of that as well, depending on exactly what you were doing. So that's kind of the layering. And so the first principle is that you should look at it as layers and not try to jam all the responsibilities into one team. You'll end up with too many different things happening at once, and you want to separate it a little bit.

Host: Jon

But if you're not jamming it into one team, you have multiple different platforms or variations of the platform and no uniformity across the teams. Doesn't that make it a little complex to manage throughout an entire organization?

Guest: Adrian

Well, you want to optimize the different layers to whatever you have. You want best of breed. If you've got a data science platform, you are working with, I don't know, Splunk and Databricks and whoever else. You've got a whole bunch of different products there that you're using on top of whatever. So you've got different expertise, different vendor relationships. You might have things on different cloud providers: your email platform probably isn't on AWS, it's probably on Microsoft or Google. That's one of your cloud platforms. So you've got all these different platforms and they're different specializations, and yeah, there's some overlap, but the first thing to do is to make sure you're managing those vendor relationships well and you're building the sort of digested version of that raw vendor experience into what you want to use, with your kind of use cases in mind.

Guest: Adrian

So that's the way I think about it. And on top of that, you can build more portability if that's what you need. Some SaaS providers deploy to every cloud and on-prem because that's what they have to do. So they package something up as a bunch of whatever Kubernetes pieces and then they can deploy it anywhere. So they're working with an abstraction layer that's a little bit higher up because they need to be portable. If you're just trying to get a web service built and run efficiently, you're probably more embedded into your specific cloud provider because that's going to get you going faster and cost less. So there are different trade-offs there.

Host: Jon

So Adrian, before we get to understanding platform engineering teams, you talked about the first principle of the layers and then what's the second principle?

Guest: Adrian

What order did I put them in? So the first one was lots of layers. The second one, I think I have to look.

Host: Jon

It was dynamic and evolves over time

Guest: Adrian

And just tend to

Host: Jon

Up the stack.

Guest: Adrian

Yeah, that was the second one. So the idea is that you've got some of the platforms you're building, and this is again what we did from the Netflix point of view, while AWS was developing and adding features underneath. So for a while we ran our own Hadoop clusters. This was ages ago, about 2009 or so. And then AWS came out with EMR, which is Hadoop cluster as a service, and we said, okay, let's stop doing it ourselves, we'll start using the cloud. Some of the security models and things like that were sort of handcrafted for a while. Key management: we handcrafted a key management system, and then when AWS came out with one, we migrated to that. So you kind of look at the things you had to build because your platform doesn't provide them, and you want to be able to dynamically move up the stack, have as thin a layer as possible, and just keep moving that up as you go. If you get too much into a vendor-provided layer there, then the vendors tend to want to maximize the value of their layer so that they can charge you more for it, and they want to make a thicker and thicker layer between you and whatever is underneath. So this is why I think it's important to have an internal platform team that is always trying to thin that layer rather than make it fatter.

Host: Jon

You said thin that layer, but you also said that you guys developed your own Hadoop cluster or your key management system, and then AWS did, and then you're utilizing that. What's the difference between building your own versus buying or using their services? Because if you use their services you're thickening that layer, but then if you have your own KMS, you're thickening that layer. Help me understand the difference and what's the value of doing one versus the other.

Guest: Adrian

So it's more like there are vendors who say, hey, we'll just take care of everything for you. They are often used in enterprise accounts because enterprises kind of like that, and the enterprises end up paying quite a lot of money and ending up with a lot of legacy stuff which is relatively inflexible. All right, so that's the path you can go down: hey, we're going to make this easy and consistent for you. And that may be the right thing to do, but in the end it's going to slow you down and cost you more. You don't want to be beholden to a vendor for the platform layers that are close to your application. All right? You want vendors deep in the stack that are going to just be stable. So cloud providers, maybe data science providers, those kinds of things, you want to build on top of.

Guest: Adrian

But these sort of orchestration layers, I think that's what people are trying to sell you on: some higher-level orchestration layer. There are nice promises in there, but you're going to end up locked into an orchestration layer that's going to end up being something that gets in your way. That's kind of the opinion I have, anyway. I mean, it works for some things, but you need to be careful. This is actually going to one of the other principles, which was that with internal platform layers you know who is using your platform. You can talk to the developers, you can email them, you can say, okay, we're going to deprecate this feature so everyone needs to get off it. We're going to stop using it, we're going to change stuff, we're going to do a complete rewrite of this piece of the API for some reason, right?

Guest: Adrian

We're going to internationalize it because now we're launching in countries with foreign languages, so we have to do deep changes to the thing. That works if you're internal because you know who the customers are. You can talk to them; they're in the same building as you, or the same company as you. You've got leverage over, say, executives to say, no, you are going to move, whatever. You can allocate resources to do that. When you've got external vendors, or if you externalize what you are building as an API to the outside world, then you've got to build something stable. You can only add things, you can't take things away. If Netflix changes its external API, TV sets stop working, and they don't know who or how many people are running them. There are millions of people out there running TV sets and game consoles and set-top boxes, and there had to be a lot of stability in that interface, whereas the internal platform that was being used by developers doing personalization algorithms could change continuously.

Guest: Adrian

So you have to think about that and decide whether you are building an externally reliable API, which you should expect to have really good stability and documentation and to evolve relatively slowly. And that means that if you're depending on something that you want to change quickly, then an external vendor's usually not going to be able to develop the new thing fast enough for you. All right, so you've got to build your own version of it, and then maybe the vendor comes in later and you replace it. I mean, that would probably be an API change at that point as well.
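The add-only rule for external APIs that Adrian describes can be sketched as a small compatibility check. This is an illustrative sketch only, not anything Netflix used: `ApiSchema`, `breaking_changes`, and `can_release` are hypothetical names, and an API version is assumed to be nothing more than a map from endpoint names to sets of field names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiSchema:
    """Hypothetical API description: endpoint name -> set of field names."""
    endpoints: dict


def breaking_changes(old: ApiSchema, new: ApiSchema) -> list:
    """List every removal in `new` that would break a client written against `old`."""
    problems = []
    for endpoint, old_fields in old.endpoints.items():
        if endpoint not in new.endpoints:
            problems.append(f"endpoint removed: {endpoint}")
            continue
        # Fields present before but missing now are breaking; added fields are fine.
        for field in sorted(old_fields - new.endpoints[endpoint]):
            problems.append(f"field removed: {endpoint}.{field}")
    return problems


def can_release(old: ApiSchema, new: ApiSchema, external: bool) -> bool:
    """External APIs may only add. Internal APIs may break, on the assumption
    that the platform team knows its users and can coordinate the migration."""
    if external:
        return not breaking_changes(old, new)
    return True


v1 = ApiSchema({"titles": {"id", "name"}, "play": {"id"}})
v2 = ApiSchema({"titles": {"id", "name", "locale"}, "play": {"id"}})  # additive
v3 = ApiSchema({"titles": {"id"}})  # removes a field and an endpoint
```

Here `v2` would pass as an external release because it only adds a field, while `v3` would only be acceptable for an internal platform whose users can be told to migrate.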

Host: Jon

Okay. So you were mentioning the fourth principle, a clear distinction made between building internal platforms that are optimized to change quickly and external ones. Before I ask you about the third principle, which is that the interface of a platform should be driven by the users of the platform: everybody, I want to let you know that we're talking with Adrian Cockcroft, a tech advisor at OrionX.net, about platform engineering teams done right. I will post a link in the description below to his Medium article that defines how it could be done and his interpretation of it. Adrian, the third principle that you have, I know we skipped around a little bit, but how do you understand the users of your platform?

Guest: Adrian

Yeah, and that is kind of the difference between a platform team and a regular development team: you are doing internal product management. The interface of the platform needs to be defined by who's using it, and if you've got a very specific internal requirement, you should be building APIs that meet that need. So you end up doing product management, and some people advocate hiring specific product managers for platform teams. If you're a smaller team, it's usually the manager of the team, and quite often you need to have some project management skills and some product management skills to be a good platform team manager. Because you've got to have a roadmap, you've got to prioritize, you've got to have a backlog of requests from the teams saying, okay, we want to do all these things. What's the order that we're going to do them in? There are never enough resources to build everything instantly. So you've got to say, okay, this is going to be quick and this is going to take a while, and work with the people that want the platform to change to help them build the right things as it evolves. So it's classic product management stuff, except you have internal customers rather than external ones.

Host: Jon

Adrian, how does a company take on this platform team? Because here's what I'm seeing: we have DevOps, DevSecOps, however you want to call it. Is this just another term that's out there? How does a company handle an additional team or integrate it with other teams already out there? Is it an extension of them?

Guest: Adrian

I think most companies already have some kind of platform team. What we're talking about here is sort of codifying the way you think about it. They're just a natural piece: when you break down what you're doing, you will tend to have platform teams. What I'm trying to provide is guidance for when you're thinking about how the org chart should be laid out, if you're the CTO or CIO or something, looking at how you lay out your teams. I was trying to get people to think about it and have some principles for how they should apply platform engineering to what they're doing. So that was sort of the idea. But you may already have a whole bunch of teams and they aren't working well because you've jammed too many different things into one team, or you're trying to make one platform that does everything. By breaking it up a little bit in the right way, it may be that your platform just works well and your teams work well and your developers get unblocked and you can move faster. Generally, everybody wants to save money, reduce waste, and go faster.

Guest: Adrian

Those are the sort of core reasons why we have these platforms.

Host: Jon

Speaking of going faster and reducing waste, if we have some time, we're going to be talking about the microservices retrospective. Adrian has two talks on the same day coming up on March 29th. One is really around what we've learned and didn't learn from Netflix, and the other is cloud provider sustainability, the current status and future directions. Now, Adrian, we're talking about platform teams done right. Is it that companies already have them in place? I'm thinking of traditional IT environments, enterprises that have been out there where they have the silo effect. I have my networking, my infrastructure, and my AD. Here's my regular server admin group that deals with, I want to say, the platform, like AWS or hardware within a data center, and manages that. And then we have my business applications. I have all these different silos out there. Are we taking a member from each, putting them all together, and calling them a platform team? Or is it just kind of like everybody coming together and doing the work, not individually, but as a group?

Guest: Adrian

I think some of those teams are naturally platform teams. A data center ops team is a platform: you're providing the buildings and the networking and the various bits and pieces you need, or the things you'd get from a cloud provider if you're in the cloud. So that's a platform team, and we've seen people running internal IT building things that look more like a public cloud, and then they're also getting into the support, sustainability, and manageability of that. But if you're talking about the sort of siloed environment where you have sysadmins and network admins and storage admins and security people, and they're all in different teams and they're not working well together, those you've carved up by specialization rather than as platform teams. Those aren't platform teams. If you look at Team Topologies, they have a category for, I forget exactly what it's called, deep technical specialists who support other things.

Guest: Adrian

Security is one of those. They touch everything, so everything has to be secure. You need deep specialists, and security people sort of clump together. They have a mindset; you want to hire them into a team. Performance teams sometimes are like that too. They need to support across the organization. But again, performance specialists are a particular breed of people. So you get these different areas where there is some specialization, but those aren't platform teams. If you're trying to build an API that makes everybody productive, that's a different skill set. So there's a product mindset where you are figuring out what internal product you need to build and what external components you're going to gather together to build that. And that's kind of the integration. It's where the platform team comes together. And if you're in a regular IT environment, you might say VMware is sort of a platform like that.

Guest: Adrian

You just go to VMware, you buy it off the shelf, and then your platform is a super functional platform. It does all these cool things. The pace of innovation is whatever VMware decides to do; you go to VMworld every year and find out what they're announcing. So it's got its own pace at which it goes. And if you run purely on top of that, you are running at that pace. That might be good for your business, whatever you're trying to do, but if you're trying to move faster than that, you need to build something that sits above the cloud, or whatever infrastructure platform you have, and build teams that work together. And that was really what DevOps was about: taking these development teams and the ops teams and building interfaces so they could work well together, or organizing it to be one team that has some development and some operations skills in it and can just move faster because they're not having to file tickets to each other and have meetings to get stuff done.

Guest: Adrian

So that's sort of where DevOps came from. And I think, I don't think there's any conflict here between DevOps and platforms and all the other different areas, but those very siloed organizations are very slow-moving and relatively dysfunctional. That's one of the problems some people fight with.

Host: Jon

I agree with you on the silo effect. It's just very hard to get things done, because work passes from one group to the next and then it's a ticket that sits there for two weeks. And that's why the cloud has been efficient: I'm able to get all this stuff that I need right away. Adrian, how do I as a company know if I have an existing platform team or if I need a platform team?

Guest: Adrian

Well, I drew it out using that Wardley map. So I drew a stack saying these are the different layers. If you look at that map or try to draw out your own version of it, you say, okay, how evolved are the different layers? Do I even have that layer, or is it mixed into a whole bunch of other things? If you don't have a platform team, you've probably got different application teams rebuilding similar stuff. So they're all managing their databases, but differently; you've got 15 different ways to stand up a MySQL server or whatever. So you haven't done a platform that says there's one way to do this, and it automatically gets backups and it automatically gets security and it's automatically got whatever availability you need. So it's sort of the guardrails that you need as a business.

Guest: Adrian

Those are the things you want to platform. I know some companies grew very organically, where every individual team set up their own cloud accounts and just randomly built things that were different, and it's a huge mess. I've talked to some CIOs about trying to clear up after those messes. Some very large AWS users had just no idea what was going on across the organization. They had just sort of organically grown everywhere. And then there's this whole process of trying to say, okay, let's turn that into something more like a set of platforms that make sense, and they can aggregate for all the right reasons, get bigger discounts, and all of those kinds of things.
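The "one way to stand up a database" idea can be illustrated with a small sketch. Everything here is hypothetical: `DatabaseSpec` and `request_database` are made-up names, and the specific defaults (MySQL, two replicas, encryption at rest) are assumptions chosen for illustration rather than anything stated in the conversation. The point is only that application teams supply the minimum and the platform fills in the guardrails.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseSpec:
    """Hypothetical platform-owned description of a managed database."""
    name: str
    owner_team: str
    engine: str = "mysql"
    backups_enabled: bool = True      # guardrail: every database is backed up
    encrypted_at_rest: bool = True    # guardrail: security is on by default
    replicas: int = 2                 # guardrail: baseline availability


def request_database(name: str, owner_team: str, replicas: int = 2) -> DatabaseSpec:
    """The single supported path for teams to get a database.

    Callers can ask for more replicas, but cannot opt out of backups,
    encryption, or the minimum replica count.
    """
    if replicas < 2:
        raise ValueError("platform guardrail: at least 2 replicas are required")
    return DatabaseSpec(name=name, owner_team=owner_team, replicas=replicas)


spec = request_database("orders", owner_team="checkout")
```

With this shape, fifteen teams asking for a database get fifteen identically protected databases instead of fifteen hand-built variations.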

Host: Jon

Is the platform team helping me understand some of the shareable resources or uniformity within my other services or other platforms that are out there, and do they have visibility, or should they have visibility, into all the platforms? And this seems like a good time to jump in and talk about today's sponsor, Veeam. How would you like to own, control, and protect your data in any cloud anywhere, including AWS? Veeam Backup for AWS is a native solution to protect all of your AWS data. It's fully automated, set it and forget it, within one platform, centrally managed. Veeam Backup for AWS is a robust solution, from snapshot replication and full recovery within AWS to granular file recovery and recovery outside of AWS. Implement Veeam Backup for AWS today before you find out that your current solution isn't working. Now, how about we get you back to that podcast?

Guest: Adrian

Well, let's say you have a customer-facing product manager that says we need to develop a new feature because it's something that customers want. They should be working with an engineering team to say, okay, how quickly can we develop this and test it and see if this makes things better for customers? The classic sort of model is a better personalization algorithm, so that the things we show to the customers are more relevant to them, that kind of stuff. How quickly can you develop that? It depends on how mature the platform is, right? If the developers have to do too much undifferentiated heavy lifting, as Werner Vogels likes to say, then they're wasting time. So you look for things that are going on over and over again and you say, okay, we'll do that once. We'll have somebody build it into a platform, and we want it to be an API call or a library that you include in your application and you're just done.

Guest: Adrian

Going back to the Netflix examples, the Netflix DVD site was an English-only site. When we moved to the cloud and we moved to streaming, we internationalized it. So we had to build a whole set of internationalization libraries, and the platform team went and looked at all the different ways we could do that, picked a supplier, figured out how to do all of these different things, built the whole process for how you do a fully internationalized site, and baked that into the platform. And then all the developers had the same way: all the different teams building front ends for different devices or the website or whatever had one way to do internationalization. So that's the kind of thing where you are standardizing it to be efficient in the long term.

Host: Jon

So that's what I was getting at with the standardization of how to accomplish maybe a new integration or a new setup, like internationalization. They standardize the way that this could be done, not just for English or one language, but for all languages. How does that help in managing, monitoring, or troubleshooting the environment when things are uniform across it?

Guest: Adrian

Yeah, the other thing you can do with a platform is bake in observability, to use another trendy buzzword. But basically, yeah, you just have the standard way of doing things. Whenever you go build a new service, if you're building it on top of a platform, the platform has the instrumentation, and it automatically logs what it's doing. You just deploy a thing and then you go to the dashboard, wherever the dashboards are, and your thing will just appear there with all the normal metrics and uptime and a health check and all that kind of stuff. So that was part of what I'd expect to see in the platform. I mean, you could do it a million different ways, and pretty much every company will have lots of different ways to do logging and lots of different monitoring tools. Whatever you build into your platform is going to end up being the dominant way that you get things done. But you may have different things. If you've got some of your systems built in Java and some in Rust and some in Go, you may well have different monitoring tooling for each, which all sort of needs to come together in some way. And one of the problems with platforms is that as you get more diversity within the platform, it gets increasingly expensive to keep them all in sync. Keeping your Java platform, your Python platform, and your Go platform tooling completely in sync is more work than most people think.
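A minimal sketch of what "instrumentation comes with the platform" could look like, assuming a decorator-based platform library. The names `platform_service` and `METRICS` are hypothetical, and the in-memory dictionary stands in for whatever monitoring backend a real platform would ship data to.

```python
import functools
import time
from collections import defaultdict

# In-memory stand-in for the metrics backend the platform would report to.
METRICS = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def platform_service(name: str):
    """Hypothetical platform decorator: any handler wrapped with it is
    counted and timed without the service author writing metrics code."""
    def decorate(handler):
        @functools.wraps(handler)
        def wrapped(*args, **kwargs):
            stats = METRICS[name]
            stats["calls"] += 1
            start = time.perf_counter()
            try:
                return handler(*args, **kwargs)
            except Exception:
                stats["errors"] += 1
                raise
            finally:
                stats["total_seconds"] += time.perf_counter() - start
        return wrapped
    return decorate


@platform_service("recommendations")
def recommend(user_id: str) -> list:
    """Example business logic; it contains no observability code of its own."""
    if not user_id:
        raise ValueError("user_id required")
    return ["title-1", "title-2"]
```

The service author only writes `recommend`; call counts, error counts, and timings show up under the service name because the platform wrapper records them.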

Host: Jon

Is this more just an extension of orchestration? Because for me, I'm thinking that I'm deploying this application here. It orchestrates the standardized infrastructure for the environment. It gets all this observability, so I'll use the buzzword, monitoring capabilities. Here's the backup and DR flow. Here's the database that you get. Is it just part of the platform team's job to orchestrate these processes?

Guest: Adrian

So the platform team should build into the platform the interfaces that collect the data and deliver it to whatever the monitoring tool is. There's probably another team running whatever the observability stuff is, right? Or you send it off to your favorite vendors, Datadog, New Relic, or whatever. But you build in the hooks, the collecting agents. The security platforms are similar: you've probably got some local scanning, file scanning, and things like that, detection systems that you're building in. Those should be built into the platform. So the security team says, this is the security tooling we've got and these are the bits of it we want to put into the platform. And now and again they'll say, okay, we've come up with a new thing, or we need to update it, or something. So the platform team is taking in requests from other teams to build and bake in certain opinionated ways to configure all these other vendors and open source and whatever tooling. And that's sort of the idea of it, I think.
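[Editor's sketch: the "hooks" described here can be pictured as a small registration point that the platform owns, with the observability and security teams plugging in their own destinations. The names `register_sink` and `emit` are hypothetical and do not correspond to any vendor's API.]

```python
from typing import Callable, Dict, List

Event = Dict[str, object]

# Sinks are supplied by other teams (observability, security); the platform
# only owns the hook that fans events out to them.
_sinks: List[Callable[[Event], None]] = []

def register_sink(sink: Callable[[Event], None]) -> Callable[[Event], None]:
    """Called by another team to receive every event the platform emits."""
    _sinks.append(sink)
    return sink

def emit(event: Event) -> int:
    """Called by platform code; returns how many sinks received the event."""
    for sink in _sinks:
        sink(event)
    return len(_sinks)
```

Application teams never call a vendor directly in this arrangement, so swapping or adding a tool is a change to one registration rather than to every service.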

Host: Jon

So everybody, we're talking with Adrian Cockcroft about platform teams done right, and we're going to get to our next topic in a couple of minutes. But Adrian, my last question for you is, okay, do I feel that I have a platform team or do I need a platform team? Is there process documentation? How do I implement this and make sure that I do it? And I'm sure there's not one size fits all, but what guidance is out there to help me implement a team?

Guest: Adrian

I think the Team Topologies book would be a good place to start, to try and understand how you are laying out your teams in general. And then once you get to say, okay, these pieces of my entire tech stack look like platforms, I would probably sort of work from the vendor upwards, because you need somebody who's an expert in AWS or VMware or whatever, Splunk or something. Those are specialist skills. So you need to build pieces of your platform and just start building layers. But start with the external stable platforms you've decided to pick and then start building layers on top of those. And then as you get closer and closer to the business logic, you want to build something extremely specific to that business logic to make it super efficient: a set of libraries in the right language that just automatically do all of the usual stuff, the sort of don't-repeat-yourself principle. All the stuff that everyone has to do every time. There are several ways to tool that up. But in the end, if somebody wants to build a new feature, they should be able to just build some new business logic in a few hours. It shouldn't be a week-long development process, even for fairly complicated new things. There are certainly companies out there doing this, and there's a book called The Value Flywheel Effect by David Anderson.

Guest: Adrian

Lots of stories. That's a good book. It's a great book. It just came out recently. I wrote one of the forewords for it. Simon Wardley wrote the other one. There's more stuff in there about Wardley mapping, which we haven't got into, but they talk about taking a team that had never used the platform they'd built internally and training them on it for an afternoon. And the next day they built a completely new application, deployed it from scratch, and it was up and running in production the following day. And that kind of pace of development is unbelievable to most people. So it's possible to build super productive platforms. And that book's got some great ideas in it.

Host: Jon

Yeah, that's not the traditional model. If you go back five to 10 years, it took months, even years, to get an application spun up. If you're able to do this in less than a day, then you're super efficient in the time it takes. Adrian, is there a time that it usually takes to implement a successful team, or what does success look like during that time?

Guest: Adrian

I mean generally, most companies are on an annual planning cycle for budgets and things like that. And right now most people are in hiring freezes and things like that rather than growing. But it's more about when you are laying out, okay, where are we placing the headcount, the budgets; that's when you're doing the reorg to say, okay, this is what's important. And if you can figure out ways that you can be more efficient using the platform, sometimes when times are hard, that's a good time to rethink the way you're organizing. So it's probably good now to say, okay, what can we do that's just going to make our work more efficient, get more done with fewer people, less unproductive work, less wasted work? And one of the ways you do that is by reducing the time to ship, the time to value, and getting it super fast so you don't have work sitting out there that has been started and hasn't been finished. That's sort of work in progress. That's where most of the waste is in companies. You want to get a super quick turnaround. So all this is around speeding up the pace at which you get things done, and that will end up saving you money.

Host: Jon

Nice. All right. So, Adrian, I'm going to switch gears. I want to talk about two of your talks that are coming up on March 29th. Let's talk about microservices in retrospect. And I like the title of what we learned and didn't learn from Netflix. Do you want to share a few insights on that?

Guest: Adrian

Yeah, this is two talks. They're at QCon London. So yeah, it's an in-person conference. There will be videos that they'll put out later, and I have to record the video this coming weekend, so I'm sort of in the middle of trying to write the slides right now. So the keynote I'm giving that morning is this retrospective, looking back at what we learned from Netflix. And the first talk I gave about Netflix at QCon London was in 2012. I did one in 2011 in San Francisco. And those were some of the first times I went out and said, this is what we are doing. And got mostly baffled surprise. People weren't digesting it at all. And then by about 2013, 2014 people were getting into it. We saw Docker sort of come out in 2014, really, and pick up, and containerization and a whole bunch of things that sort of led us to where we are today.

Guest: Adrian

I'm going to look back at that flow, what happened, dig out some of the old slide decks, and see what we ended up contributing to the discussion from the sort of early discussions of Netflix and what got lost along the way. This platform story is a piece of it. There's a whole story about versioning and I have some very specific ideas about that. I need to write another blog post about versioning, but there'll be something about that. And then I've done a few talks recently, which I did fairly humorously, and I called it Adrian's Greatest Hits. You might be able to find a video of me doing that. We did one for the AWS Community Day. I'm not going to do this in London, I don't think, but I went on stage wearing six layers of t-shirts, and I started by digging out some slides from a talk I did in 2010 with my Netflix shirt on. And then I pulled off a layer and I did a DockerCon talk.

Host: Jon

Oh, that's pretty cool. Little layers.

Guest: Adrian

I was just hamming it up and having some fun. But it was like a classic rock band that had gone on tour.

Host: Jon

That goes with your layers for platform teams.

Guest: Adrian

Yeah, but no, the idea was really that if you see a classic rock band, you don't want to hear the obscure album tracks. You want to hear the hits, and there's sort of one or two hits on each album. So it was a five-minute piece out of each of these 40-minute presentations. It's a bit jumbled, but I had fun with it. And so this is going to be a sort of more refined version of that, looking back at: so what did we say then? What did we learn from it? What have we still not learned? I keep telling people to do things that seemed obvious 10 years ago; why aren't you doing it? And so hopefully we can sort of move the idea on and fill in some of the gaps. And some people say microservices don't work for them.

Guest: Adrian

And I look at what they're doing and say, well, of course it doesn't work. You've missed out on these three important things that made it work. Right? And so there's a sort of holistic view that all the different things we did supported each other in a system that worked. And if you try to cherry-pick and just take two or three pieces out of a complete system, quite often it fails, because you are saying, oh, this looks too complicated, I can do a simple version, but a simple version doesn't work because it doesn't deal with all the real-world complexities that are going to creep up on you. So some of this is a bit about sort of pushing back at these people saying microservices don't work and we should do something else. It's like, well, you're doing it wrong. Get off my lawn, or whatever. I get to be the grumpy old guy. Right.

Host: Jon

Nice, nice. So that's the talk that you're doing in the morning at QCon and in the afternoon I believe you're doing one for cloud provider sustainability, the current status, and future directions. And a few minutes ago you were talking about everybody's in their hiring freeze and the things that are going on in sustainability. You also had a LinkedIn post talking about sustainability and some of the things that AWS was doing or couldn't. Do you want to touch on this topic in general what you're going to be talking about and maybe your LinkedIn post?

Guest: Adrian

Yeah, QCon has started doing a sustainability track. And the way QCon is structured, they have a keynote in the morning, then they break into multiple tracks. And a track is a curated theme of something like four speakers, a panel, and an open space. And they run that for the rest of the day. So at QCon San Francisco last October, I led and curated the sustainability track. It was the first time they'd done it, and they're doing it again in London. I'm not curating; it's being run by some local people from London, but I'm ending up as a speaker in that track, and the talk I'm giving is at the end of the sequence of talks. I'm talking about, okay, this is what the cloud providers currently have, this is what the gaps are. And yeah, we could complain about what isn't there, but what I want to do is say, well, let's look a few years into the future: if we had good APIs to sustainability information, what could you do?

Guest: Adrian

What kinds of problems do people need to solve? What sort of tooling is going to be there? And just try to do a bit of crystal ball gazing: eventually, once everybody's got all the data and it's in a more usable form, there'll be some interesting things, and what will the world look like in, say, 2025 or something like that? And that's kind of a slightly different talk, but as I said, it's at the end of this sustainability-focused track where there's a whole load of different talks about different aspects of how to develop more effectively. I call this DevSusOps, because DevSecOps had all the fun. So it's development, sustainability, and operations, and I've been talking about this a few times. So basically I just think carbon is going to be a metric. The energy usage of a system, or the carbon emitted, is just another metric that all the tools vendors are eventually going to have in the questionnaires. What are you going to do with that metric?

Host: Jon

I think that the biggest thing is you have this data. So AWS released a carbon footprint calculator last year. What was it? April? The sustainability pillar came out around that as well. And you have these metrics, but what are you doing with these metrics? What actions are you taking off these metrics? Adrian, these things were released, and some of these were worked on during your time at AWS. What is the value? What are customers going to be using, or how are they going to be using this data?

Guest: Adrian

Yeah, you mentioned this LinkedIn post that I did. So I did a blog post about re:Invent and the sustainability talks at re:Invent. And in that, I talked about how I was disappointed that they didn't release any updates to the carbon footprint tool, which came out in March last year. And they haven't done any updates since March last year. So that's a year ago now. I was expecting to see something at re:Invent, so I sort of asked around: well, why haven't you done anything? They said, yeah, we're still working on it, but there's a hiring freeze. And we only had a small number of people anyway, and there's a bunch of other things that are more important that we're trying to do. So I just said, well, yeah, they're still working on it. It's still important. It's just that in the current environment at AWS, they have a hiring freeze and lots of things are going slower. So it's just one of the ones that got caught in that. So I did that. And then Caroline Donnelly, I think it is, from Computer Weekly, wrote a story about it and called AWS, and they said, oh, we're doing fine and we're all being sustainable. And that's why I wrote a response to that over the weekend and probably made somebody

Host: Jon

The typical PR response.

Guest: Adrian

Yeah, yeah. It was the typical PR response, and I helped write those responses when I was there <laugh>, so I know what the standard answers are. Because the sustainability standard answers are a document I was editing when I was there. So it didn't answer the question, which is, how come? I understand why they're running slowly, but right now they don't have an API and they've got very basic information. Azure has a comprehensive API in preview, lots of information; Google, again, an API, lots of information; and AWS just needs to catch up. And right now you can't build a tool that goes across the three cloud providers, because AWS has none of the data, and the data on Google and Azure is a little different for each of them. So it's not unified yet, but at least there's a first pass at an API and the data model there. So that's kind of what I'm talking about. And I was just sort of grumbling at AWS but also trying to sort of state why I think they haven't done this yet, but I probably made some poor PR person unhappy over the weekend, because they had to kind of pay attention to something instead of having the weekend off. But that's just the way it goes. And I'm sort of summarizing that into this talk I'm doing about one month from now, basically.

Host: Jon

Nice. Everybody, I want to let you know that on March 29th, Adrian has two talks happening. We've already talked about the first one, the microservices retrospective, and the second one, which we just rounded out the conversation on, is cloud provider sustainability, the current status, and future directions. Now Adrian's been doing a lot of talks on sustainability, actually had a hand in helping with some of this stuff at his previous company, and he has deep insights on it. So I think this is going to be one of those cool talks that Adrian's going to provide. So a great value. Adrian, before wrapping things up, is there anything else you'd like to share with the audience?

Guest: Adrian

No, just that I'm doing a few other talks at different events. I'm going to be at Monitorama in June in Portland, Oregon, a much more geeky talk if people are into monitoring tools and techniques. And I'm doing a Chaos Carnival talk on March 15th. It's an online talk about chaos engineering, another topic that I get into occasionally. So yeah, I'm doing advisory work, doing a bit of consulting now and again, and I seem to end up needing to spend the next few weeks writing slide decks. But anyway, that's kind of life for me at the moment. At least I can do what I feel like, because I don't have a corporation telling me what I can and can't do and say.

Host: Jon

So you don't have PR behind you telling you

Guest: Adrian

I don't have a PR department. So there's nothing other than what I decide to do. So that's the other thing: recently I've switched from Twitter to Mastodon. Yep. And I have a few accounts there too; I'm pretty easy to find there. I'm on the main mastodon.social server, and then Medium just started one, so I just started tinkering around with that. I think it's me.dm, I think that's their Mastodon server. So I've basically mothballed my Twitter account; it's still sitting there, but there are no tweets on it. So that's kind of a fun thing. Then it's like watching Twitter gradually sort of get messed up, and probably some certs are going to time out and it's going to crash sometimes, randomly, when it's inconvenient. That's the sort of stuff that happens.

Host: Jon

It's still holding strong. I think in every conversation I have with a technical person where we give our Twitter handles, it's like, yes, it's still around, and we've been saying it's still around for the last couple of months. We'll see how it goes. I will share your Mastodon handle in the description below. Adrian, will you also share with me the links to some of your talks that are coming up, whether they're recorded or live, so I can share them with everybody?

Guest: Adrian

Yeah, I'll give you some links. And I'm adrianco, so you don't have to spell Cockcroft, just adrianco, on GitHub as well, and I put everything in adrianco/slides on GitHub. I have a whole bunch of slide decks there, so that makes it a bit easier to find stuff.

Host: Jon

Nice. Adrian, thank you so much for joining me. I appreciate it.

Guest: Adrian

Oh yeah, yeah, great to see you again, and thanks for putting all this together.

Host: Jon

Yeah, not a problem. So everybody, Adrian Cockcroft, tech advisor @ OrionX.net. Today we were talking about platform engineering teams done right. Adrian, enjoy the rest of your day, and I appreciate you joining me for this podcast.

Guest: Adrian

Thank you.

Host: Jon

All right everybody. My name's Jon Myer. Thank you for watching the Jon Myer Podcast. Don't forget to hit that like, subscribe, and notify, because guess what, we're out of here.