Ep#17 Daily Tech Show: Spot Ocean Demo with Zak Harabedian

August 9, 2021

Are you looking for hands-on, real-world experience with Ocean, Spot's solution for serverless container infrastructure? Are you wondering how the heck you even get started, or how it works? You might want to stick around, because we're walking through Ocean and how it works. I'm Jon Myer and this is the Daily Tech Show. Today we're talking with customer success engineer from Spot, Zak Harabedian. Have you heard about Spot Ocean? What is Ocean? Ocean is Spot's solution for serverless container infrastructure.

It takes care of an array of infrastructure management tasks so that infrastructure and application teams can focus on their applications, knowing that the infrastructure is continuously optimized. And when we're talking about optimized, we're talking about optimal performance, rightsizing, and that cool bin-packing, Tetris-like feature, creating a real serverless container experience.

Joining me today is customer success engineer from Spot, also a colleague of mine, Zak Harabedian. Zak, welcome to the show, buddy.

Guest: Zak
Hey, Jon, thanks for having me. Really excited to talk to you today.

Host: Jon
Yeah, just the same. You know what, Zak? We have some exciting news to share with everybody: a new release, or feature, for Ocean that just came out today. But before we jump into the ocean, sorry, excuse the pun, how about we give the audience a little bit of information about yourself?

Guest: Zak
Yeah, definitely. So my name is Zak Harabedian, customer success engineer here at Spot. I've been with the team almost a year now, and I'm here to help our customers leverage our products, migrate workloads, and realize some savings. I'm also going to be a dad soon, so I appreciate the dad jokes. I will be taking diligent notes, and I appreciate the mentorship, and that there will be plenty of dad jokes.

Host: Jon
So as a CSE myself, though I've only been in that role for a little over two months, you've been my onboarding buddy, my go-to. In fact, you've pretty much onboarded most of us on the team, correct?

Guest: Zak
Yeah, there’s been a bunch of people that have joined after me, our team is growing really quickly and it’s exciting to see and I’m happy to help everybody get up to speed here.

Host: Jon
I think that’s an understatement. I mean, you’re always available. I think I’ve messaged you last minute, five o’clock later at night, and you’re like, yeah, let’s jump on it. In fact, we were troubleshooting an issue yesterday, both of us working through something. So I always appreciate the mentorship and it’s definitely come a long way. Everybody sees it and realizes it.

Guest: Zak
Yeah, thank you for that. That’s part of the fun for me. Helping everyone and helping our customers really is a good time over here.

Host: Jon
All right. So, back in the beginning I gave my definition of Ocean, and no, not the swimming definition, but the Spot product called Ocean. I know the first thing almost everybody thinks about when they hear Spot is cost savings, but Spot does so much more, from Elastigroup to Ocean and even the product called Eco. Could you help everyone understand Ocean as we dive into some of the benefits and features? And I know you have some demos coming up too.

Guest: Zak
Yeah, definitely, and to help explain that, I'm going to share my screen here to show a quick architecture diagram of what Ocean is, so as I explain it might make a little more sense. One of the things that's been becoming easier over time is managing containers and Kubernetes, and that's due to managed control planes like EKS or GKE. Those providers are abstracting a lot of the overhead of a DIY control plane that you used to have to keep up with and manage.
The one thing that really hasn't become easier over time, and is actually becoming more and more complex, is managing the underlying infrastructure, or the data plane. And that's really where the overwhelming majority of your compute is going to lie within a Kubernetes cluster, and that's why we developed our Ocean product. Ocean is a serverless compute engine: our goal is to manage the data plane and the majority of that compute to give our customers that serverless compute experience, with up to 90 percent cost savings by utilizing spot instances.


It’s just one area that we help with. And then also the scaling component, which is the tricky part and something we will talk about today. As far as the architecture, we’re not here to replace your control plane. We’re here to plug it to any control plane that you might be using. So whether that’s X, we’re going to deploy our spot controller into your communities cluster and then we’re going to interpret all of the pods requests and we’ll start making the scaling decisions on your behalf.


The other pillar that we're helping with here is the pod-driven autoscaling, which might seem pretty simple, but when you think about just how different all of your pods might be, there are a lot of things to take into account for every single scaling decision. Just to give you a sense: for every pod that comes into the cluster, we're looking at its CPU reservations, memory reservations, any requests, any limits, persistent volume claims. And then we're choosing the most appropriate instance type, lifecycle (spot or on-demand), and size to make that scaling decision.
And even after that scaling decision, we're going to continuously run simulations on your cluster to make sure it's efficient, and to see whether we can continue to optimize it by rescheduling pods to a different node or terminating instances that might be underutilized. That's why we've come up with this category called continuous optimization, and that's really what Ocean does.
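(For reference, a minimal sketch in Python, with hypothetical names and values, of the pod-level signals Zak lists above; Ocean's controller reads the equivalent fields from each pending pod's spec before choosing an instance type, lifecycle, and size.)

```python
# Illustrative only: the slice of a Kubernetes pod spec that drives
# Ocean's scaling decisions, per the discussion above. Names and values
# are hypothetical; real pods are defined in YAML or via the K8s API.
pod_scheduling_signals = {
    "containers": [{
        "name": "web",
        "resources": {
            "requests": {"cpu": "500m", "memory": "512Mi"},  # reservations
            "limits":   {"cpu": "1",    "memory": "1Gi"},    # caps
        },
    }],
    # Persistent volume claims also constrain node choice (e.g., an
    # EBS-backed claim ties the pod to a single availability zone).
    "volumes": [{"persistentVolumeClaim": {"claimName": "data-pvc"}}],
}
```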

Host: Jon
The way I imagine it, it's a bit like container orchestration playing a big game of Tetris, really organizing everything in the most efficient way, not only cost-optimized but performant as well.

Guest: Zak
Yeah, exactly. So in the visual I'm showing here, you can see multiple pods on what looks to be a larger node. You also might have smaller pods that are requesting fewer resources, and we might be able to pack those onto a smaller node. That's another big, complex thing that DevOps engineers are trying to figure out: what kind of instances and node groups do I need? Should I have a big node group or a small node group?
Then they need node selectors and labels, and Ocean really abstracts all of that from the DevOps engineers and figures it out for them.

Host: Jon
So basically, Ocean handles that, and you worry about the application, building it and getting it ready for production. It takes some of the heavy lifting off of you and creates that real serverless container experience.

Guest: Zak
Yeah, that’s exactly right.

Host: Jon
I know today we're going to be talking about virtual node groups (VNGs), headroom, and how Ocean's scaling handles reserved instances, sizing, on-demand, and spot instances. Let's jump into virtual node groups a little bit. How can they help my application within a cluster?

Guest: Zak
Yeah, great question! So virtual node groups are, think of them as, a logical abstraction from the rest of your cluster. Let's take the example where you have some GPU workloads within the Kubernetes cluster, and those GPU workloads might need to run a little bit differently than the rest of your cluster. The common example of how our customers might use this is a custom allow or deny list for their instances. So for those GPU workloads, you can specify the instance types you want Ocean to scale, you can specify the spot percentage, whether you want to run everything on-demand or on spot, and really anything you want to tweak on the infrastructure layer.

Guest: Zak
We can break that down so you don't need multiple Ocean clusters to run those workloads differently. Everything's in a single pane of glass and right in front of you.
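(As a rough illustration of what Zak describes, here is a hedged Python sketch of creating a virtual node group with a custom instance-type allow list via Spot's REST API. The endpoint path and field names are assumptions based on Spot's public documentation, and the token and IDs are placeholders; verify against the current API reference before use.)

```python
import requests

SPOT_TOKEN = "..."             # placeholder: your Spot API token
OCEAN_CLUSTER_ID = "o-123abc"  # placeholder Ocean cluster ID

# Assumed payload shape for a virtual node group (Spot's API calls
# these "launch specs"): an allow list restricts this VNG's nodes to
# GPU instance types, and a node label lets pods target the VNG.
vng = {
    "launchSpec": {
        "oceanId": OCEAN_CLUSTER_ID,
        "name": "gpu-workloads",
        "instanceTypes": ["p3.2xlarge", "g4dn.xlarge"],  # allow list
        "labels": [{"key": "workload-type", "value": "gpu"}],
    },
}

resp = requests.post(
    "https://api.spotinst.io/ocean/aws/k8s/launchSpec",  # assumed endpoint
    json=vng,
    headers={"Authorization": f"Bearer {SPOT_TOKEN}"},
)
resp.raise_for_status()
```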

Host: Jon
Now, you just kind of answered a little bit of my question, but I want to jump into the difference of using a virtual node group. Does it make sense to just spin up another Ocean cluster versus adding another virtual node group? I mean, what are the benefits: cost benefits, optimization benefits? Am I keeping my application within that cluster? I really want to understand a little bit more.

Guest: Zak
Yeah, it’s a great question. So with the virtual node groups, it eliminates the need to have multiple ocean clusters because of how much you’re able to customize every virtual node group. So every detail down to the user data security groups, I am profile that can be customized on a personal note group level. Same thing with the allowance deny list of the instance types, the lifecycle of the instance, and that makes it easier for the developers, engineers to again just focus on less clusters, less management and just focus on the application requirements.

Host: Jon
Now, you mentioned the deny and allow lists. Why would someone utilize them, what are the benefits, and have you run into any gotchas or issues from adding to the deny or allow list that you'd like to share with the audience?

Guest: Zak
Yeah, so a lot of our customers have some really neat use cases. One that sticks out to me is a customer running high-performance pods that are very memory-intensive. One of the things they do is track their performance on every single instance type these pods get scheduled onto. They came to me a few months ago and said, hey, we're seeing a lot better performance on such-and-such instance type.
But we realize these are popular instance types; the spot markets might not be very good, and we might have more fallbacks to on-demand. Do you have any suggestions? In cases like that, we were able to work with them to create a virtual node group with a special allow list for their high-performance pods, and then for anything that was not a mission-critical, performance-sensitive pod, we were able to use a much larger deny list of instances, because then they're able to leverage even more spot markets and might not need to fall back to on-demand.
So it's a scenario where we were helping with performance as well as cost, all in one.

Host: Jon
Oh, nice, nice. Definitely a great use case for those deny lists.

Host: Jon
Before we jump into headroom, something I'd like to talk about is leveraging Ocean's scaling, because I feel that's a big lift when trying to manage a Kubernetes cluster yourself. Does Ocean utilize reserved instances, savings plans, spot? And what about on-demand? Can you talk about that scaling as well?

Guest: Zak
Yeah, definitely. That's something we've been doing for several years, starting with our Elastigroup product. When you think about reserved instances and the folks buying them, at your AWS organization level, for example, they're buying at large scale. They might not be thinking of the individual accounts, and what happens is they might become underutilized. Maybe that team or business unit stopped using them and didn't notify anyone.

Guest: Zak
Maybe they can’t get rid of them. Ocean’s able to detect that any underutilized our eyes. And as long as they’re in your your whitelist and we’re able to utilize them, we’ll actually launch OnDemand instances to take advantage of the underutilized r.i and same thing goes with savings plan. And the whole purpose of that is to help our customers not have to pay twice for number one, the underutilization of the Aurier savings plans. But then we won’t want Gispert instance, even though they’re much cheaper.

Guest: Zak
You’re paying for that money no matter what. So we want to make sure we can clean up that waste for our customers and make sure everything is 100 percent utilized before we even think about launching spot instances into the cluster.

Host: Jon
OK, Zak, you literally just added to my knowledge. I actually didn't know that's what we do and how we handle it. I can see now why we would launch on-demand even though spot is cheaper: you're already paying for the on-demand via reserved instances, so we might as well take advantage of the cost you're already paying and utilize that on-demand. That's actually some really interesting knowledge. Zak, I appreciate you educating me, as always.

Guest: Zak
Yeah. And from the Ocean perspective, turning back to virtual node groups for a second, one of the things a DevOps engineer might be able to do is take a select group of pods within the cluster that they want to run entirely on-demand. Maybe they're running some databases or stateful applications, and you don't want those running on spot because of interruptions. That's just a simple parameter you can change on the virtual node group to let us know you want to run that underlying infrastructure entirely on-demand.
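(The "simple parameter" Zak mentions likely maps to something like the sketch below; the spotPercentage field name is an assumption from Spot's public API, shown here with an assumed endpoint.)

```python
# Assumed shape: pinning a virtual node group ("launch spec") entirely
# to on-demand so stateful pods never land on interruptible spot nodes.
stateful_vng_update = {
    "launchSpec": {
        "strategy": {"spotPercentage": 0},  # 0 = all on-demand (assumed)
    },
}
# Applied with, e.g.:
# PUT https://api.spotinst.io/ocean/aws/k8s/launchSpec/{launchSpecId}
```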

Host: Jon
So, Zach, let’s talk about Headroom, and if I didn’t know about Khubani, I would think of headroom as the kind you’re looking for when you sit in your car to make sure your head doesn’t touch the roof.
But let’s talk about headroom and as it relates to an ocean cluster.

Guest: Zak
Yeah, sounds good. So when you think about headroom, we define it as a spare buffer of capacity within your Kubernetes clusters. The purpose of that extra layer of capacity comes in when you think about new pods coming into the cluster. Like I mentioned a little earlier when I was explaining the Ocean architecture, the controller is going to interpret the requests of those pods and scale the infrastructure to meet those needs. But if you have that spare buffer of capacity within the cluster, we're actually able to schedule those pods immediately onto nodes that are already running.
That way you're not waiting for new nodes to be spun up and bootstrapped, and you're going to save a lot of time. So any spiky cluster is a great use case for headroom, and it can be configured in a number of different ways. The first one is automatic: you can just tell Ocean that you want a 10 percent buffer of capacity within the cluster, and we'll figure all that out. We'll figure out how many vCPUs we should keep as that buffer, how many megabytes of RAM, things like that.

Guest: Zak
And you don't have to think about it. There are a few other ways, actually, and to your point, we did have an exciting announcement today.

Host: Jon
You're about to steal my thunder. Go ahead, Zak. You know what, Zak? You are the man. Let's talk about it. I'm going to let you share the news, because you're the one who's done a lot of the training on it and is super excited about it. And when we planned this, it was perfect timing that the new feature released today. And here to tell you about it is Zak Harabedian.

Guest: Zak
That’s perfect. Yeah, good timing. So up until today, when you were defending your head, you could either, like I said, define your percentage or you could manually define X amount of Vecepia is X amount of megabytes of RAM use. And you could do that either on the cluster level or even on the virtual node group level, which is pretty neat. And today we announced the ability to actually leverage both automatic and manual headroom configurations within the cluster.

Guest: Zak
So there’s a lot of neat use cases, really. The sky’s the limit for how you want to define your headroom. I think being able to define it on the virtual node group level has been really helpful for a lot of our customers. And now having that ability to introduce both to the clusters, to the cluster, I think will be really helpful for our customers with some of those trickier workloads.

Host: Jon
So, Zak, thank you for sharing the new release with everybody. Don't forget to take a look at the description below; I'll include a link to the post about the new release. One last question before we get to the demo, and I know everybody's hanging on for T-Rex. I imagine that since this is a cloud environment, we have controls to schedule things such as shutdowns, working hours, or even scaling needs.

Host: Jon
You want to provide a little insight on that?

Guest: Zak
Yeah, that’s exactly right. And in addition to utilizing spot instances, they give you that 90 percent or up to 90 percent cost savings. Another huge way you can save money is just by simply shutting down the compute resources when they’re not in use. And that’s one of the things the Ocean will handle for you. So you can simply put in shut down hours of the clusters. So, for example, if it’s a development cluster, you don’t need it.
24/7 will handle shutting all those nodes down. And then whenever you want those started back up, we will scale everything back up and it’ll be ready for you. So those can be changed very easily. And it’s another great way to to save money when not using it.
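(A hedged sketch of what such a schedule might look like via the Spot API; the scheduling.shutdownHours shape and the ddd:hh:mm-ddd:hh:mm window format are assumptions from Spot's public docs, so double-check them before relying on this.)

```python
# Assumed payload: run a dev cluster only on weekdays 08:00-20:00 by
# declaring the complementary shutdown windows (times assumed UTC).
schedule_update = {
    "cluster": {
        "scheduling": {
            "shutdownHours": {
                "isEnabled": True,
                "timeWindows": [
                    "Mon:20:00-Tue:08:00",
                    "Tue:20:00-Wed:08:00",
                    "Wed:20:00-Thu:08:00",
                    "Thu:20:00-Fri:08:00",
                    "Fri:20:00-Mon:08:00",  # weekend off
                ],
            },
        },
    },
}
# PUT https://api.spotinst.io/ocean/aws/k8s/cluster/{oceanClusterId}
```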

Guest: Zak
All right, sounds good. Let me go ahead and share my screen again.

Host: Jon
By the way, while you’re sharing your screen, I match my blue lights with Spot, and I figured it was good for Ocean.

Guest: Zak
I just want to confirm that is the correct color scheme. Exactly.

Host: Jon
I think I programmed it into the controller. I did pick the right hex code, so we'll see. OK, perfect. Now, as Zak is bringing that up, I want to let everybody know Zak's going to do a quick demo, showing you Ocean and some of the cluster configurations we talked about today. Plus, don't forget, if you like what you see, hit that like, subscribe, and the notification bell, because I always have awesome content on the way and cool people like Zak joining me to share the cool stuff that Ocean is releasing.

Host: Jon
And not only that, but some of the other Spot products too.

Guest: Zak
All right, thanks, Jon. So I'm going to go ahead and share my screen here, and what I'm showing you is a demo cluster I have that is plugged in to Ocean. What we're able to see here is some high-level cluster information: how many nodes we're managing, how much money we're saving by utilizing spot instances, how well the cluster is allocated in terms of its utilization, as well as the headroom. What you can see in this example is that we're utilizing an automatic headroom configuration of 10 percent, and Ocean is figuring out just how many cores or vCPUs and how much memory we need to keep in this cluster to maintain that buffer.

Guest: Zak
Every time there's a scaling decision, say new pods come into the cluster, Ocean is going to reevaluate the headroom. It's going to scale it up, or maybe terminate some nodes if it's no longer needed. That's the automatic piece. What I can show you here is, if I click into a virtual node group and scroll down, this is where you can introduce manual headroom at the virtual node group level.

Guest: Zak
So on every single virtual node group, think of each one as a different use case, you're able to define the amount of vCPUs, megabytes of RAM, and GPUs, and we'll honor that as well. If you want to use both automatic and manual, that's available to you today too.

Host: Jon
Interesting. So you can do it in the cluster config, but then you can also do it in the virtual node group configuration.

Guest: Zak
Exactly right. Thanks, Jon. One more thing, or a couple more things, I wanted to show in this demo cluster is the nodes tab. As I mentioned with how many decisions Ocean is making, for every scaling decision you're able to see a number of different instance types that we'll utilize within the cluster. In some cases it might be CPU-intensive, in others memory-intensive, and we'll also introduce different lifecycles of instances within the cluster.

Guest: Zak
So in this case, what we're showing is that we're utilizing spot instances, but you can also see within the cluster that we'll actually use a fair amount of on-demand instances as well, and multiple different instance types: third-generation C-series, fifth-generation, whatever the case may be, we'll utilize all different instance types. In this particular cluster we're utilizing 100 percent spot infrastructure, but that's not always the case. As I mentioned, customers might want to run a small portion of the cluster on-demand, and you'd be able to see that here.

Guest: Zak
You can actually even search by lifecycle, spot or on-demand; it can all be broken down within this list here.

Host: Jon
So it seems to me like Ocean isn't making decisions based just on the lifecycle or the cheapest spot price; you're considering instance types, availability, headroom, and reserved instances if you're utilizing and paying for them. And once those run out, you go to on-demand so you use up the commitments you've already made. It's not just simple cost optimization.

Guest: Zak
Yeah, that’s exactly right, and that’s really how our Elastica product started out, is just those those decisions of looking at different spot markets and we’re actually collecting data on all the different spot markets around the around the world, every instance type. And we’re aiming to predict that they spot interruption ahead of it actually being interrupted so we can minimize the interruption within your cluster. And that’s another thing Headroom will help our customers with. So if you have that spare capacity, pods can be quickly rescheduled to a different note in the event of an interruption.

Host: Jon
And that interruption notice, what is it right now, around two minutes, correct?

Guest: Zak
Yeah, that’s right. So any time you’re using a few instances, whether it’s eight of us as your GCP, you get a very short notice from your cloud provider when they want to reclaim it. And that’s why we try to actually replace these instances ahead of the actual interruption.

Host: Jon
Now, Zak, real quick, you want to show us where the scheduling is? Then let's jump into our T-Rex game, because I will not lose this time.

Guest: Zak
Yeah. So the last thing we'll show here is the ability to define your cluster's running hours or shutdown hours. It's just a simple dropdown: you're able to set the running hours or shutdown hours. The other thing to keep in mind is that everything I'm showing you in the console here can also be done programmatically, so whatever your infrastructure-as-code tool of choice is, it can also be defined there.
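(For instance, a hedged Python sketch of reading the same cluster settings programmatically; the GET endpoint is an assumption based on Spot's public API, and the token and cluster ID are placeholders.)

```python
import requests

SPOT_TOKEN = "..."             # placeholder Spot API token
OCEAN_CLUSTER_ID = "o-123abc"  # placeholder Ocean cluster ID

# Assumed read endpoint: fetch the Ocean cluster configuration,
# including the autoscaler, headroom, and scheduling settings that
# the console displays.
resp = requests.get(
    f"https://api.spotinst.io/ocean/aws/k8s/cluster/{OCEAN_CLUSTER_ID}",
    headers={"Authorization": f"Bearer {SPOT_TOKEN}"},
)
resp.raise_for_status()
print(resp.json())
```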

Host: Jon
Nice. Awesome. All right, Zak, thank you so much for the demo and for talking about Ocean. Let's jump into T-Rex now. As the host, I will go first. Normally I think the guest goes first, but if I fail right away, it gives you a chance. Plus, I want to show you how it goes and get you ready for that loss. Sorry, Zak, hopefully you don't mind. All right, let's jump in.

Host: Jon
Oh, by the way, I've got to show you this. This is great. Let me share my screen. I'm going to play the normal one. Share my screen. Bingo. Can you see my screen, Zak? All right. So I played yesterday; I didn't know there's a Godzilla version. Close this, what is this? Get out of here. All right, now we're onto it. OK. All right, I'm going to play Godzilla.

Host: Jon
I’ve never played. This is despicable. So if you haven’t played it, it’s spacebar. Now, another colleague of ours was using the spacebar to jump. I use the up arrows. You can do whatever you like. Are you ready?

Guest: Zak
I’m ready. Let’s see what you got.

Host: Jon
All right, three, two, one. Even little cars, oh, by the way, I make the sound. There you go.

Guest: Zak
You’re making me a little nervous. Looks like you’re jumping too early due to. Don’t get distracted too easily, John. I think you.

Host: Jon
That ringing, I know, is the delivery man with my pizza. Oh. Oh, I thought I messed up, I hit the button. Oh, it made it. Oh, it's going faster. Zak, how are you feeling? Oh, 399. All right, stop sharing. Zak, go ahead, bring it on up.

Guest: Zak
All right. Let’s see what I’ve got.

Host: Jon
I’m ready, I’m feeling good, I’m feeling good, you can go with either one, you want Godzilla or.

Guest: Zak
OK.

Host: Jon
Whatever you want. Do me one quick favor: zoom in on your screen just a little bit. I'm going to make this bigger. There you go, perfect. All right. You hit the spacebar, right? Oh, there it goes.

There we go.

Host: Jon
You’re kidding, right?

Guest: Zak
What is this, like a trick link that you sent me? It won't let me go as Godzilla.

Host: Jon
Real quick, by the way: you lost. Doesn't matter what happens afterwards. And just so you know, that was the same link I played yesterday. All right, go ahead.

Guest: Zak
I’m under suspicion that you sent me the advanced link because that’s the only explanation.

Host: Jon
So I’ll let you play this. Well, we’ll see how it goes. I don’t know if I’m going to put this into a recording, but you did lose. This is just a free play here. This is going a little faster than might see. Oh, there you go. Either way, you still would have lost one ninety seven to three ninety nine. Bam!

Guest: Zak
We’re going to look into the logs here, John. Something just doesn’t seem right and we’re going to have to debug.

Host: Jon
All right, everybody. If you loved what you saw here with the demo, hit that like and subscribe. If anything, I hope you enjoyed the T-Rex Godzilla game, because we had a lot of fun. Zak, thank you so much for showing us Ocean, all the benefits, the walkthrough, and what an awesome product it is for a real serverless container experience. Thank you so much for joining us.

Guest: Zak
Yeah. Thanks so much for having me. Enjoyed the conversation today and I’m looking forward to our rematch.