Ep#107 The Best AWS Cost Optimization and Performance approach

January 25, 2023

Episode Summary

Welcome to the Jon Myer podcast, where we discuss the importance of managing your cloud costs on Amazon Web Services. Today, more and more businesses are moving to the cloud, but with that comes the challenge of managing costs. In this podcast, we'll be discussing strategies, tools, and best practices for optimizing your AWS costs and getting the most out of your cloud investment. Join us as we explore the world of AWS cost optimization and discover how you can save money and improve your bottom line.

About the Guest

Cristian Măgherușan-Stanciu

Cloud optimization Specialist

Helping AWS customers optimize their Cloud Computing infrastructure for lower cost, while at the same time often increasing performance and reducing their carbon footprint.

Between 2019 and 2022 worked for AWS, helping strategic AWS customers from Germany to optimize their cloud infrastructure using Spot and Graviton instances.

#aws #awscloud #finops #cloudcomputing #costoptimization

Episode Show Notes & Transcript

Host: Jon

Please join me in welcoming Cristian, cloud optimization specialist. I'm not going to give away where he's at, what he's doing, and what he's been up to. But let me introduce Cristian to the show. Cristian, thank you so much for joining me.

Guest: Cristian

Thank you, Jon, for having me. It's been a pleasure meeting you earlier.

Host: Jon

Yeah, so Cristian and I had a chance to work together on AWS Fest, the Optimize Your AWS fest that we host quarterly. But I want to jump into this podcast. Cristian, we're talking about AWS cost and performance optimization. I know it's a beaten-down topic, but the value of driving this home is key. Before we jump onto that topic, how about giving everybody a little backstory on yourself?

Guest: Cristian

Yeah. Hello everyone. I'm Cristian, based in Berlin. I used to work at AWS here in Berlin for a while as a specialist solutions architect supporting Graviton. Before AWS, I wrote a tool called AutoSpotting, which makes it easy to adopt Spot instances in Auto Scaling groups. And yeah, after September I quit AWS, and then I doubled down on working on this tool. I also have a small consultancy on cost optimization, and optimization in general, because I don't like to restrict it to cost. There are so many other things that you can optimize. Cost is a big topic nowadays, but there is also performance. There's latency. There are all sorts of things that you can leverage in your business that don't necessarily have to do with cost.

Host: Jon

Cristian, cost optimization versus cloud optimization. Here are my thoughts, and I'm going to give you my humble opinion on it. Cost optimization sits under cloud optimization and is a small fraction of it, because if you optimize your cloud environment, yes, it could cost you more in some cases for a business application, but in the long run, it's going to save you a lot of costs. What are your thoughts?

Guest: Cristian

Yeah, and it's the same. In some cases, you may pay more. For example, if you have an application that you want to optimize for better performance, you may want to add a caching layer to it. So you add ElastiCache, and that will add a bit of cost, right? It won't be free, but that cache can improve the performance of your application. And that performance increase can drive, let's say, differentiation in your business niche compared to your competition. So you can get more customers and more revenue, and even if you increase the cost a bit, you could get business results worth much more than that increase in costs.

Host: Jon

And I think that's key, Cristian. Here's what happens: everybody assumes that going to the cloud is going to save you a bunch of money. Think about a data center, and I think your caching layer is a really good example. In my data center, I didn't have this caching layer. I didn't have it available, or I would have to spin it up, and it would cost me too much. You go to the cloud, you're saving a bunch of money now that your application's in the cloud, and guess what? I can add this caching layer now. So you add it, and all of a sudden people are like, wait, why did my cost go up for this application? It's not because you're doing something wrong. It's that you're now enabled and able to do more within the cloud and put more behind the performance of your application.

Guest: Cristian

And also, once you move to the cloud, you have many more knobs that you can tweak when it comes to the sizing of your application, the scaling in and out, all sorts of options that you may not have had in the data center before you moved. So yeah, all that can help you.

Host: Jon

Yeah, in the data center, you are limited to what most people refer to as t-shirt sizes. You can only fit into these templates and that's it. But when you go to the cloud, you have thousands of different instance types available to choose from, depending on your business application needs. I just posted something about it the other day regarding instance types: compute, memory, HPC, you name it. I think there were six or seven different instance types that I listed. Cristian, let's jump into a little bit more about this cost and the visibility around your cloud spend, because we are talking about AWS in general, being that both of us are Amazonians. So congratulations on it.

Guest: Cristian

Thank you.

Host: Jon

So let's talk about your typical approach when you're talking to folks about AWS cloud optimization, performance, and cost optimization.

Guest: Cristian

Yeah. I mean, as a former Amazonian, I took to heart this working-backward-from-the-customer mantra that we have at Amazon. Basically, I start by looking at the existing spending of the customer, and then, depending on what they have, maybe I sort it by cost and take the biggest chunks of the cost, maybe the bigger services that they run, and always try to tackle the biggest and the lowest-hanging fruits. I'm a big fan of the 80-20 rule, which is the Pareto principle. It states that you often get 80% of the results by tackling 20% of the problem space. And also the opposite: you may spend 80% of the effort just tackling the last 20% of the problem. So it's always good to see where it makes sense to invest and what kind of things to tackle first.

Guest: Cristian

And then, if you keep iterating on this, at some point you get diminishing returns. So that's kind of my work style, if you can say that. It's also the way I build my tools. I started AutoSpotting by tackling compute, and in particular Spot, because when I started AutoSpotting, compute was the main cost driver where I was working. Nowadays we have public information on the biggest services that generate the spending. If you look at Vantage, they're a vendor in this space, and every quarter or so they publish a report of the top spend drivers that they see across their customers. What I can do with that information is look at the top spend drivers and see, okay, EC2 is one of the biggest spend drivers. I have something for EC2 when it comes to Spot, but I could also look into Savings Plans for compute and so on. And then I take the next cost drivers, which I did, for example, for EBS optimization. There's also RDS in there, where I'm looking into building something next. So it's this kind of thing, looking at the biggest fish in the pond and always trying to catch that. Yeah, that's kind of my mantra.
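
To make the "biggest fish in the pond" idea concrete, here is a minimal Python sketch of an 80-20 ranking over a bill. The service costs are made-up numbers and `top_spend_drivers` is a hypothetical helper, not part of AutoSpotting or any AWS tool; it only shows how sorting by cost surfaces the few services that account for most of the spend.

```python
from typing import Dict, List, Tuple


def top_spend_drivers(costs: Dict[str, float], share: float = 0.8) -> List[Tuple[str, float]]:
    """Return the fewest services, largest first, that cover `share` of total spend."""
    total = sum(costs.values())
    if total <= 0:
        return []
    picked: List[Tuple[str, float]] = []
    running = 0.0
    # Walk the services from most to least expensive.
    for service, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
        picked.append((service, cost))
        running += cost
        if running / total >= share:
            break
    return picked


# Hypothetical monthly bill in USD (illustrative numbers only).
monthly_bill = {
    "EC2": 5200.0,
    "RDS": 2100.0,
    "EBS": 900.0,
    "S3": 400.0,
    "CloudWatch": 250.0,
    "Route 53": 150.0,
}

for service, cost in top_spend_drivers(monthly_bill):
    print(f"{service}: ${cost:,.2f}")
```

With these assumed numbers, two of the six services already cover more than 80% of the bill, which is where the first round of optimization effort would go.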

Host: Jon

Cristian, you talked about the 80-20 rule: tackle the 20% of heavy hitters, those top spend services within your account, to get 80% of the savings, and, vice versa, you can spend 80% of your time to get just the last 20% down. Going back to it a little bit, and you touched on it, this is an iterative process. It's not once and done; you have to continuously drive toward it. Do you find that most companies do this continuously, or are they just doing a one-and-done?

Guest: Cristian

I mean, it's never done. You could do a one-off spring cleanup, if you will, and you can drive the cost significantly down. But then, just like your house gets a bit messier every day, it accumulates over time, and from time to time you have to do it again, or you have to do it as a continuous process. It's just like tidying up your home. You don't wait for the spring cleanup every year. You have to do it every week or so, or even daily, to continue that analogy.

Host: Jon

My house is daily. With the kids and everything going on, you're always doing daily cleanup. I'll give an analogy. The kids come home and they drop their backpacks onto the floor. No, it goes to a specific place. You go to your AWS environment, someone spins up a service, an EC2 instance, and they're not using it anymore. No, you turn it down, and you continuously monitor this over time. There's a bunch of tools, things that AWS offers that you can utilize: services, budgets, alerts, visibility. But it's all a kind of culture shift, and doing the right thing for your AWS account for the benefit of cost savings.

Guest: Cristian

Yeah, exactly. You can do it, and you should do it continuously. It's a cultural change that you have to drive throughout your engineering group, and they have to internalize all these things, all the technology that they have at their disposal to do them. And eventually, you move out and let them do it. So that's kind of my approach.

Host: Jon

All right. So your approach as an expert is around not only the cost savings but also the performance around it, because you worked on it, you have AutoSpotting, and you realize some of the key benefits of doing it. I want to touch on the performance aspect. Performance is key, because when everybody says, hey, we've got to do cost optimization, the recommendation is, okay, you're only using this much, knock it down to a t2. And you're like, okay, well, I can cut everything in half if you'd like your performance to suck for your application, or you can do the right thing and optimize for a bigger, more efficient instance. And that also saves you a lot of time and money.

Guest: Cristian

Yeah, I mean, what you can always do is reduce the size, but keep an eye on the metrics that you have. So you reduce the size, but then you notice that the latency, for example, goes up as you decrease the sizing, and at some point that increase in latency may damage your business, because customers like your application to be fast. That's the point where you have to make a judgment call: should I continue, or do I stop here? I saved enough, and if I go further, I'm going to lose my customers. So yeah, that's kind of the <laugh>. And you may also want to do it the opposite way. You may want to optimize performance with a slight increase in cost, as you said. So you go a bit bigger on your capacity, but then you get increased performance, and that translates to better serving your customers. Because in the end, it's all about the customers who keep you in business.
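
The judgment call described here, shrink until latency crosses a red line, can be sketched as a small loop. This is an illustrative Python sketch only: `smallest_acceptable_size` is a hypothetical helper, the latency figures are invented, and in practice the measurement function would read real metrics after each resize instead of a lookup table.

```python
from typing import Callable, List, Optional


def smallest_acceptable_size(
    sizes_large_to_small: List[str],
    measure_p95_latency_ms: Callable[[str], float],
    max_latency_ms: float,
) -> Optional[str]:
    """Step down through sizes and keep the last one that stays within the latency limit."""
    chosen: Optional[str] = None
    for size in sizes_large_to_small:
        if measure_p95_latency_ms(size) > max_latency_ms:
            break  # going smaller would cross the red line, so stop here
        chosen = size
    return chosen


# Assumed p95 latencies per instance size (made-up numbers for illustration).
observed_latency_ms = {"m5.2xlarge": 80.0, "m5.xlarge": 95.0, "m5.large": 140.0}

best = smallest_acceptable_size(
    ["m5.2xlarge", "m5.xlarge", "m5.large"],
    lambda size: observed_latency_ms[size],
    max_latency_ms=120.0,
)
print(best)
```

The latency limit is the business input: it is the red line agreed with the customer, and the loop only automates stopping before it is crossed.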

Host: Jon

Cristian, when you approach a customer or customer approaches you from an expert perspective and they want to talk to you about a couple of things, they're like, oh my God, I'm spending so much money. Oh my AWS bill, I need your help. Do you educate them right away and say, Hey listen, there are always cost savings things that we can do, but in some instances, your bill might run a little more or you might spend a little more on this specific application because it's more geared towards the performance. It's all about business value and driving those decisions.

Guest: Cristian

So when I first engage with them, I try to talk to them to see what's important for them. What are the red lines that they shouldn't cross when it comes to latency and this kind of thing? And also what they want to do to benefit their business. So not just the cost but also, as we discussed, performance. Based on that, I come up with a list of changes that could be done, ordered by the priorities that we discussed and the impact that you can drive. So it's all about striking this fine balance between cost and performance and seeing what makes sense for the business they have. And it's a lot about how I work with their team, with their engineers, so that they internalize these things. I don't want to be seen as, let's say, someone who goes there, saves the world, and then is out. What I try to do instead is teach them how to fish and let them fish themselves, based on some tips and tricks that I give them. They do the fishing, and they enjoy it when they catch the fish, and that makes this process more sustainable over time. So you get more buy-in from them, and they also get some of these wins that they can be happy about.

Host: Jon

Quick wins are always important for everybody and all the teams to understand that this is achievable and something that doesn't take a very long time to do. Hey, this seems like a good time to jump in and talk about today's sponsor, Veeam. How would you like to own, control, and protect your data in any cloud, anywhere, including AWS? Veeam Backup for AWS is a native solution to protect all of your AWS data. It's fully automated: set it and forget it. Within one centrally managed platform, Veeam Backup for AWS is a robust solution covering snapshots, replication, full recovery within AWS, granular file recovery, and even recovery outside of AWS. Implement Veeam Backup for AWS today, before you find out that your current solution isn't working. Now, how about we get you back to that podcast? How important do you feel it is for these engineers or folks that you're talking about to have the right training, skill sets, or even certifications?

Guest: Cristian

Yeah, I mean, it's important for them to know all these things, to have experience with the cloud. If they don't know anything about the cloud, there isn't much you can do with them. Hopefully, they know their things when it comes to knowledge. I'm a big fan of learning by doing. I'm not against certifications, don't get me wrong. I had a conversation about this topic the other week, and it's about when you do those certifications and when you start preparing for them in your life cycle, in your experience as a cloud engineer. Because I see all these people who have never touched the platform, the AWS console, but then they prepare for these certifications, study the theory, and pass the exams without having any practical experience with the platform. It's like having a surgeon who never touched a patient or never touched the scalpel or whatever it's called.

Guest: Cristian

And then, would you have surgery done like that? So it's more about first getting your hands dirty, learning by building things, and then, after you've been using it for a while, you do the certifications, and then you also have a better chance to get the right benefits from those certifications. Unfortunately, there are a lot of people who do the opposite and start just learning the theory but never get to build anything. Then they look for jobs, and when they apply, people don't look at their certifications, they look at what they have built before. And yes, it's also a bit how my path has been. When I started to build tools, AutoSpotting was actually built out of my passion and will to learn the Go programming language and play with technologies like Lambda, because Lambda and Go were just emerging at the time and I was looking for a project to learn these things. I had been using AWS for a couple of years and I just wanted to build something. We were in this cost optimization process, and I got the idea to build something for Spot on EC2, and the entire implementation was built with Lambda and Go, without Lambda even having a Go runtime.

Guest: Cristian

And there were lots of things like platform limitations that I had to work around as I built it. I still have workarounds for things that aren't available in the platform even today, things like marketplace billing for containerized serverless applications, which is not there yet. And that's kind of my philosophy: start building things, get your hands dirty, and after a while, once you've built experience, you start to get the certifications as a stamp of your knowledge, not before you've done anything.

Host: Jon

I want to jump in there on the certifications before we continue. I agree with you a little bit on it, probably about halfway, because of how I approached my certification. It was back in 2015, I think, when I first got my AWS SA Pro certification, but I was only using AWS a fraction of the time. I think it was in 2014 that I started, and I was using it a little bit, and then I went and did the book theory for the exam. But when I got to a question that I didn't understand, I went to AWS and I built it, right? I played around with it and I found out, oh, that's true. And I went to the documentation and learned it. Now, I did pass the exam on the first try, but when I went for my developer one, I actually failed the exam, and then I was building and working on a project for a company.

Host: Jon

And I learned so much throughout this whole process of building this; it was a migration over to AWS. I learned so much that when I started recognizing some of the exam questions for the developer one, I was like, oh my god, that makes sense, I know the reason now. So that's where I agree with you: my hands-on experience helped me pass certain portions of it. I think the certification does get you a foot in the door, but it does not prove your expertise. It kind of solidifies that what you're doing is the work that you're trying to achieve. I think in some cases it helps folks, and in other cases, it's okay to wait until you have hands-on experience.

Guest: Cristian

For sure. And there's always this situation where you could start the certifications before doing anything practical. Unfortunately, the way they are done, it's just a theory exam with those questions, hundreds of questions. And at some point I got the impression that the pro exams test your ability to focus and to read a full screen of text more than your actual knowledge. But yeah.

Host: Jon

You have two minutes per question to get to read this entire paragraph and you're like, wait a second, there's something here that doesn't fit. And you're just like select.

Guest: Cristian

Yeah, exactly. And honestly, I would prefer it if it was a practical exam. I used to have this Red Hat certification back in the day, like five or ten years ago, and in those exams you were in front of a computer building things. I wish AWS was doing something like that, because it gets you into this building mindset and gets you to practice these things as a builder, not just learning the theory of the technology, which you need to know anyway. But yeah, I think it's mainly because they are optimizing for training the solutions architects at AWS, who rarely need to get their hands dirty, or they need to do it from time to time, but it's not the focus of the job. As a solutions architect, it's all about knowing everything, having everything in your head, and being able to give a quick answer based on incomplete information from the customer. So I think they're optimizing more towards that use case.

Host: Jon

Cristian, I remember the hands-on exam. I did my CCNA, and you had to type out the commands, because there's only one way to do it. I think with AWS, even the SA Pro, there are multiple ways to get something done or to build something. There are so many different services. I mean, there are hundreds of services now, and there are different ways that others would do it. I think the difficulty there is that you would need a professional to grade each exam, since there's no one-size-fits-all solution.

Guest: Cristian

I mean, it could be like black-box testing, for example. Let's say, build something that returns HTTP 404 or whatever if you do some sort of request. And then the way you implement that doesn't matter, as long as you build it and it does what it's supposed to do. I mean, you could then check for things: have you been using Lambda, or have you been using EC2 or Fargate to build it? But as long as it gives the response, it's like in the real world: when you build a product, as long as it works, the customers don't care how it's built under the hood.
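
A black-box check of the kind described could look like the following Python sketch. It is not how any AWS exam is graded; `grade` and `CandidateHandler` are hypothetical names, and the local server merely stands in for whatever the candidate built (Lambda, EC2, Fargate, or anything else). The grader only looks at the HTTP status code and never at the implementation.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CandidateHandler(BaseHTTPRequestHandler):
    """Stand-in for the candidate's solution; the grader never inspects this class."""

    def do_GET(self):
        self.send_response(404)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def status_of(host: str, port: int, path: str) -> int:
    """Send a GET request and return only the status code."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def grade(host: str, port: int, path: str, expected_status: int = 404) -> bool:
    """Black-box grading: pass if the observed status matches the expected one."""
    return status_of(host, port, path) == expected_status


# Run the stand-in service on a free local port and grade it from the outside.
server = HTTPServer(("127.0.0.1", 0), CandidateHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
passed = grade("127.0.0.1", server.server_address[1], "/missing")
server.shutdown()
server.server_close()
print("pass" if passed else "fail")
```

Because the check depends only on observable behavior, the same grader would work unchanged against any implementation that exposes an HTTP endpoint.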

Host: Jon

Yeah, I would put some cost optimization into that exam because you don't care how it's built and you put it together. I would put that as one of your exam finals. Were you able to save costs on this entire exam and implementation? Let's find out.

Guest: Cristian

Yeah,

Host: Jon

That would be interesting. Cristian, let's talk a little bit more about the skills that it takes to tackle AWS. And not only the cost. I mean, even from a FinOps perspective, for cloud optimization or cost optimization, it's a culture change. You're coming in there and you're trying to talk to them about their environments and some of the things they do, yet you have others that might have been there a while, or they're used to doing things a certain way. How many times have you run into this issue where you're just trying to educate them that, hey, listen, there are always better ways to do it, here's what I recommend? Does one size fit all?

Guest: Cristian

I mean, one size never fits all, but there are a few sizes that fit pretty well.

Host: Jon

I can go with an XXX large that'll fit over me, no problem. Yeah,

Guest: Cristian

So usually, there are a few options that you can propose, and then you can discuss the pros and cons of those options and see what best fits their business. It's always coming back to their needs and working backward again. And yeah, you can build things in different ways, but then when you build it, you see, okay, this thing doesn't satisfy those requirements, and you are forced to do something else, even if you have to compromise on some things. That's always the case. It's not an easy problem in some cases. For example, you may have an application that you want to optimize for cost, but it's a GPU machine learning workload, which is really hard to do. And yeah, sometimes you just have to take the hit and pay for things, because it's inherent to the workload that you are tackling.

Host: Jon

Cristian, we've talked about skill sets, and we've talked about bringing in an expert for some of these approaches. But if I want to do this on my own, are there other sources or engagements or folks, not just internal but external, that I can reach out to? What are your suggestions? Are there communities that I should get involved in?

Guest: Cristian

Yeah, I mean, I think it's key to tap into the internet as a big community. If you look at it, there are all sorts of channels where people gather and discuss these topics. For example, there is the AWS subreddit, where people ask all sorts of interesting questions and other people just go and help them. You should never go at it alone. If you can get help from others, try to do that as well. There are also all sorts of other platforms. There are Slack rooms and channels for all sorts of technologies, where you can go and get help if you need anything in particular. I'm also a big fan of this as a solopreneur. I'm trying to leverage help whenever I can, because there's just me doing all these things and I cannot do everything by myself. So yeah, communities are very important. And so is having a big network in the social media context, like on LinkedIn or Twitter. Even if you have a question and don't ask it in a community, you may ask your social network. If you have a large enough network, somebody will have run into that problem, and you get a pretty good response from the communities that you are in.

Host: Jon

I think communities are key to doing your best. And not only within cloud optimization; there is a community for everything, from DevOps and FinOps to serverless. You can join so many of them to ask questions, and everybody's usually very helpful. And Cristian, you touched on it, on growing your network, even socially. If you've got a question, post it out there. Even if they didn't run into it themselves, people will ask others. There are some cool AWS tools too: AWS IQ, and AWS re:Post is another one that you can go and take a look at, plus the AWS documentation. You can always follow Cristian or me and just tag one of us. I don't know about you, Cristian, but I try to respond to almost everybody on social media to give them some type of feedback, whether it's in a message or whatever it is, or to direct them to the right person.

Guest: Cristian

For sure. And then people have access to somebody else's network, because people will just share it with their network. You also have a lot of people now who are being laid off from jobs, and they ask for jobs, and then people just go and comment for reach. It's an interesting idea that you can help other people just by joining their conversation, because then the network will share that with many more people, and you give them better chances of getting a job again. Yeah.

Host: Jon

Nope, I agree with you. So Cristian, before we wrap things up is there anything else you'd like to leave with the audience, bring them some more information, or even a topic that we didn't touch on?

Guest: Cristian

Yeah, I mean, speaking of solopreneurship, like I was saying, I've seen all these things nowadays that can help you. And not just as a solopreneur; in general, as a knowledge worker, it's always good to keep an open eye on what's going on in the world and what kind of technologies become available. I was very happy when I got my hands on this new AI, ChatGPT, which is something that I've been playing with pretty much daily and try to use as a helper for whatever I'm doing. And yeah, it's a trick that I've started to use more and more for things like getting ideas for posts or podcast episodes, or, if I write some content, getting it to proofread it, report mistakes, or give feedback on how the text reads, like a free editor that I have.

Guest: Cristian

Because I don't have anybody else. I mean, I would have to go and tap into the network, but that would mean I disturb people from their work, and I could just go and ask this thing to proofread the text and see what it thinks about it, and I often find a lot of interesting things that I was not thinking about. So yeah, it's an interesting time to live in. We have so many things at our disposal, and it's good to start looking at them and taking advantage of them as much as possible. Because, yeah, why not? It's a benefit of the world we live in.

Host: Jon

I have not tested ChatGPT yet. I have it on my to-do list. I've seen a lot of cool posts around it and some of the things folks are doing. I've never thought of actually asking it about podcast topics; I wonder what it would come up with. I might have to post those. Hey,

Guest: Cristian

I started the podcast just a few weeks back, and I couldn't decide what to do or what to talk about. So I just told it, give me a list of podcast episodes on this topic, and it gave me like 20 items. Then I could go through them and see what makes sense and what doesn't. And for any one of those items, you could dive deep and say, okay, this is the type of topic that I want to talk about, tell me the most important things in this area. And it gives you a list of bullets on what you can talk about. This is great for a podcast. I mean, I don't use it for writing blog content, because there I prefer to have my own style, although I see people trying to do that as well. But you could use it just for getting bullets of ideas, things to improve, or proofreading. So there are all sorts of interesting use cases that I see people using it for.

Host: Jon

Cristian, let me ask you this. When you asked for the podcast topics, where is ChatGPT getting this information from? Is it reading the internet and reading what's going on and all the information that's out there and then compiling a list?

Guest: Cristian

Yeah, I mean, as far as I know, they crawled the entire internet a few years back. I think the data is one or two years old, but they indexed all this data and trained the model on it. So it has the entire information of the internet inside somehow. And it's interesting how you can ask it all sorts of things. I live in Germany, and I even asked it to write an email in German to my doctor asking for a Covid vaccination. So you say, write me an email for my doctor saying that I want to get a vaccine for Covid. And it came back with a perfectly written email in German, with all the formal words that you have to use with your doctor. And you can ask it in a very informal way. So it has all these things.

Guest: Cristian

I was amazed when I saw all these things. There were people, like doctors in the US, who have to go and ask the insurance to cover a patient's treatment. And I've seen a guy, a doctor, who was like, okay, write me an email for the insurance: I have a patient with this disease, please justify the cost of the treatment and give me a bunch of scientific articles that I can use as a reference. And it came up with a perfectly formatted email for the insurance that the guy just copied and sent. So yeah, it's very powerful.

Host: Jon

Cristian, I think I just might have to try it out after this podcast. And speaking of podcasts, do you want to share with everybody your podcast name and how they can find you?

Guest: Cristian

Yeah, so I have this podcast on cloud optimization, which I call LeanerCloud. You can find it on any of the podcast platforms out there. I put it pretty much everywhere. So far I have just a few episodes; I started it a month ago. The first episode was an introductory episode about myself and my journey. Then I went for a deep dive into Savings Plans and Reserved Instances in the next few episodes, so I did a very deep dive into those. The third or fourth episode was about enterprise discount plans. And I'm going to continue these cost optimization topics with Spot and Graviton, which are my main areas of expertise. I want to record those this week or next. And then I'll cover all sorts of topics when it comes to optimizing, not just for cost, but, as I said, also for performance, with networking and all sorts of things. It's all about this idea that when you teach something to somebody, you also get to learn it better. I know a bunch of things about many of these topics, but when I prepare a podcast episode, I have to go and dive way deeper, and sometimes I learn a lot of new things or update my information about these topics compared to what I used to know before. So yeah, it's a good learning exercise for me as well.

Host: Jon

All right, everybody, you've got to check out Cristian's podcast as he grows the number of episodes and followers. Yes, he just kicked it off, but I anticipate he and I will do another podcast later this year, recapping how things are going for him, including LeanerCloud. So Cristian, thank you so much for joining me. I appreciate it.

Guest: Cristian

Thank you, Jon.

Host: Jon

As always, everybody, my name's Jon Myer. Thank you for watching the Jon Myer podcast. Don't forget to hit that like, subscribe, and notify, because guess what, we're out of here.