273: Phi-fi-fo-fum, I Smell the Bones of The Cloud Pod Hosts

Episode 273 September 04, 2024 01:07:21

Show Notes

Welcome to episode 273 of The Cloud Pod, where the forecast is always cloudy! Hold onto your butts – this week your hosts Justin, Ryan, Matthew and (eventually) Jonathan are bringing you two weeks' worth of cloud and AI news. We've got Karpenter, Kubernetes, and Secrets, plus news from OpenAI, MFA changes that are going to be super fun for Matthew, and Azure Phi. Get comfy – it's going to be a doozy!

Titles we almost went with this week:

A big thanks to this week’s sponsor:

We’re sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You’ve come to the right place! Send us an email or hit us up on our slack channel for more info. 

General News

01:37 Terraform AzureRM provider 4.0 adds provider-defined functions 

03:50 Justin – “Okay, so it doesn’t have anything really to do with Terraform. It has to do with Azure and enabling and disabling resource types that they can monkey with, basically, with configuration code.”

06:12 Rackspace Goes All In – Again – On OpenStack 

07:35 Ryan – "I think there should be something like OpenStack for, you know, being able to run your own hardware and, you know, still get a lot of the benefits of compute in a cloud ecosystem, but on hardware that you control and ecosystems that maybe you don't want being managed by a third party vendor. So happy to see OpenStack continue to gain support even though I haven't touched it in years."

AWS

08:39 Announcing Karpenter 1.0

09:28 Ryan – “See, this is how I know Kubernetes is too complex. I feel like every other week there’s some sort of announcement of some other project that controls like the allocation of resources or the scaling of resources or the something something of pods. And I’m just like, okay, cool.”

11:26  Add macOS to your continuous integration pipelines with AWS CodeBuild

12:47 Justin – "The key thing is that you don't wanna spin up additional macOS instances every time you wanna do this, because then you're paying for every one of those for 24 hours. So because you have a reserved fleet, you're using the same macOS that's in the fleet and you don't have to worry about auto scaling it up and down."
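
For anyone curious what wiring this up looks like, here is a minimal sketch of creating a reserved macOS fleet with boto3. It assumes a recent SDK release that exposes the CodeBuild fleet APIs and the macOS environment type; the fleet name and sizing are illustrative only, and you would still point a CodeBuild project's environment at the resulting fleet ARN.

```python
# Hypothetical sketch: create a small reserved macOS fleet for CodeBuild.
# Assumes a recent boto3 release with the fleet APIs and MAC_ARM support.
import boto3

codebuild = boto3.client("codebuild", region_name="us-east-1")

fleet = codebuild.create_fleet(
    name="ios-build-fleet",              # illustrative fleet name
    baseCapacity=1,                      # one always-reserved Mac (the 24-hour minimum applies here)
    environmentType="MAC_ARM",           # Apple silicon macOS build environment
    computeType="BUILD_GENERAL1_MEDIUM",
)

# Builds from a project configured to use this fleet reuse the same reserved
# machines instead of allocating a fresh 24-hour Mac per build.
print(fleet["fleet"]["arn"])
```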

15:00 Announcing general availability of Amazon EC2 G6e instances

15:56 Ryan – “My initial reaction was like, got to figure out like a modern workload where I care about these types of specs on these specific servers. And then I remember I provide cloud platforms to the rest of the business and I go, no, this is going to be expensive. How am I going to justify all this… pass.”
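
As a rough illustration of what spinning one of these up looks like (and why the bill climbs quickly), here is a hedged boto3 sketch; the AMI ID is a placeholder and the instance size shown is just the smallest G6e option.

```python
# Illustrative only: launch a single G6e instance. The AMI ID is a placeholder;
# in practice you would pick a Deep Learning AMI or your own image.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder image ID
    InstanceType="g6e.xlarge",         # 1x NVIDIA L40S with 48 GB of GPU memory
    MinCount=1,
    MaxCount=1,
)
print(resp["Instances"][0]["InstanceId"])
```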

16:56 Now open — AWS Asia Pacific (Malaysia) Region 

17:34 Justin – "The forecast models all die at 2038. We didn't really understand why. We just assumed that's when the jobs run out. No, no, that's a different problem."

19:52 CloudFormation simplifies resource discovery and template review in the IaC Generator

20:20 Ryan – "This is how I do all of my deployment architectures. Now I just deploy everything and then I generate the picture, screenshot that, and then document. Ta-da!"

21:19 Amazon DocumentDB (with MongoDB Compatibility) Global Clusters introduces Failover

22:25 Ryan – "I mean, anytime you can do this type of DR and failover at the data layer, I'm in love, because it's so difficult to orchestrate on your own. And so that's a huge value from using a cloud provider. I would like to just click some boxes and have it just work. Awesome."

22:46 Amazon S3 now supports conditional writes

23:28 Justin – “…either you would have to do an API call to verify if the file was there before, which you’re not paying for, and then you can do your write, or you get to do this. And if you have all your apps trying to do this all at the same time, the milliseconds of latency can kill you on this type of thing. So having the ability is very nice.”
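
In practice the feature is just an extra condition on the write. A minimal sketch in Python, assuming a recent boto3 release that exposes the IfNoneMatch parameter on put_object (bucket and key names are hypothetical):

```python
# Only create the object if no object with this key already exists.
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

try:
    s3.put_object(
        Bucket="example-bucket",         # hypothetical bucket
        Key="reports/2024-08-27.json",   # hypothetical key
        Body=b'{"status": "done"}',
        IfNoneMatch="*",                 # maps to the If-None-Match: * header
    )
    print("object created")
except ClientError as err:
    # S3 rejects the write with 412 Precondition Failed if the key already exists.
    if err.response["Error"]["Code"] == "PreconditionFailed":
        print("another writer already created this object")
    else:
        raise
```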

25:10 AWS Lambda now supports function-level configuration for recursive loop detection

25:44 Justin – "I remember when they first added this several years ago, we were like, this is amazing. Thank God they finally did this. But then I forgot about the support part – that you had to reach out to support if you did want your intentionally recursive pattern. If I was going to go down that path, I'd just assume I've done something wrong. But apparently if I think I'm actually right – which is a problem, I think I'm right all the time – I can now cost myself some money. So do be careful with this feature. It's a gun that can shoot you in the foot very quickly."
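
For reference, the new per-function setting is a single API call. A minimal sketch, assuming a boto3 release that includes the PutFunctionRecursionConfig API; the function name is hypothetical:

```python
# Opt one intentionally recursive function out of loop detection, while the
# default guardrail ("Terminate") keeps protecting every other function.
import boto3

lambda_client = boto3.client("lambda")

lambda_client.put_function_recursion_config(
    FunctionName="queue-poller-fanout",  # hypothetical function name
    RecursiveLoop="Allow",               # "Terminate" restores the default protection
)
```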

GCP

27:58 Looker opens semantic layer via new SQL Interface and connectors for Tableau & others

29:48 Ryan – "…these types of connectors and stuff offer a great amount of flexibility, because these BI tools are so complex that people sort of develop their favorite and don't want to use another one."

31:10 C4 VMs now GA: Unmatched performance and control for your enterprise workloads

32:42 Matthew – "…the specs on this are outstanding. Like the 20 gigabits of networking – they really put a lot into this and it really feels like it's going to be a good workhorse for people in the future."

33:19 Your infrastructure resources, your way, with new GKE custom compute class API

34:51 Ryan – “Kubernetes is really complicated, huh?”

38:50 Matthew – "I do want to point out what they had to say in this article – because this article has absolutely nothing to do with AI in any way, shape or form, but it includes AI workloads because for some reason it wouldn't have been known otherwise. I actually checked, because I saw it in the show notes, but I literally had to go into the article to be like, why is that commentary necessary? Did somebody miss their AI quota for the day, so they just threw it in?"

40:21 Introducing delayed destruction for Secret Manager, a new way to protect your secrets

41:13 Ryan – "I mean, this is a good feature. AWS has it by default from the rollout, where it takes seven days for a secret to actually go away and you can restore it up until then. The monitoring is the bigger one for me, like being able to configure a notification without trying to, you know, scout through all the API logs for the delete secret API method. So this is nice. I like that."
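
The delay is configured per secret. A minimal sketch with the google-cloud-secret-manager Python client, assuming it exposes the version_destroy_ttl field; the project and secret IDs are hypothetical:

```python
# Create a secret whose destroyed versions linger (recoverable) for 7 days.
from google.cloud import secretmanager

client = secretmanager.SecretManagerServiceClient()

secret = client.create_secret(
    request={
        "parent": "projects/my-project",        # hypothetical project
        "secret_id": "prod-db-password",        # hypothetical secret ID
        "secret": {
            "replication": {"automatic": {}},
            # Destroy requests schedule destruction 7 days out instead of
            # deleting immediately, and (with event notifications configured)
            # a Pub/Sub message fires when the destroy is attempted.
            "version_destroy_ttl": {"seconds": 7 * 24 * 3600},
        },
    }
)
print(secret.name)
```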

44:09 Run your AI inference applications on Cloud Run with NVIDIA GPUs

44:33 Ryan – "No, I mean, this is a great example of how to use serverless in the right way, right? It scales down, you're doing lightweight transactions on those inference jobs, and then you're not running dedicated hardware or maintaining an environment which you basically keep warm."

45:08  Cloud Functions is now Cloud Run functions — event-driven programming in one unified serverless platform

46:56 Justin – “Yeah, I started to wonder why you would just use Cloud Run. Unless you’re getting some automation with Cloud Run functions that I’m not familiar enough with. But the fact that you get all the Cloud Run benefits with Cloud Functions, and if I get some advantage using functions, I guess it’s a win.”

47:57 What’s New in Assured Workloads: Enable updates and new control packages

48:36 Justin – “Which means AI is coming to the government.” 

50:22 Try the new Managed Service for Apache Kafka and take cluster management off your todo list  

51:13 Justin – "There was no mention about multi-region support, which is really what I need out of this service, versus in-region support. But if they can make this multi-region over time, I'm sort of in on this one."

52:57 Announcing Terraform Google Provider 6.0.0: More Flexibility, Better Control

Azure

55:19 Elevate your AI deployments more efficiently with new deployment and cost management solutions for Azure OpenAI Service including self-service Provisioned

56:22 Matthew – "These, while they sound crazy, are extremely useful, because as soon as – was it 4.0? – came out, we had to go buy them, because otherwise we were worried we were locked out of the region. So even though we weren't using them yet, our account team was like, make sure you deploy them as soon as you see the announcement that may or may not be coming out in the next couple of days, and deploy the units that you're going to need for production, even though we didn't know what we needed yet."

58:32 Announcing General Availability of Attach & Detach of Virtual Machines on Virtual Machine Scale Sets

59:10 Matthew – "And this is only for Flexible, so if you're not using Flexible – which has other issues already with it – you can't use this. And you have to be within the fault domain counts, and you actually have to have more capacity than you need. So there are very specific ways that you can leverage this."

1:04:29 Announcing mandatory multi-factor authentication for Azure sign-in

1:06:18 Matthew – "Or you just run your worker nodes inside and use the, whatever they call it, service principal, which is like an IAM role, to handle the authentication for you, which definitely works great with Atlantis."

1:06:47 Boost your AI with Azure’s new Phi model, streamlined RAG, and custom generative AI models 

Closing

And that is the week in the cloud! Visit our website, the home of The Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod

Episode Transcript

[00:00:07] Speaker A: Welcome to the cloud pod, where the forecast is always cloudy. We talk weekly about all things AWs, GCP, and Azure. [00:00:14] Speaker B: We are your hosts, Justin, Jonathan, Ryan, and Matthew. [00:00:18] Speaker C: Episode 273 recorded for the week of August 27, 2024. Phi Phi Pho. Fun. I smell the bones of the cloud pod hosts, particularly Matt and Jonathan, since you didn't record last week when Ryan and I are out. [00:00:33] Speaker D: Hey, we tried. It just didn't work. [00:00:37] Speaker C: Yeah, I understand. I did all the show notes that you guys didn't do last week and this week's show notes and got it done just right before recording. So I feel your pain. [00:00:46] Speaker B: This is the first time I've ever been on this side, usually. [00:00:49] Speaker C: How's it feel, Ryan? [00:00:51] Speaker B: I feel righteous. [00:00:52] Speaker C: Yeah, like you suckers, you couldn't get it done. [00:00:57] Speaker D: Well, we were down to we had a third person that was offering to come do it with us, but both of ours work were a little bit all over the place. Plus I was out Wednesday through Friday, so we really had just Tuesday to try to do it and it just didn't work. [00:01:13] Speaker C: Yeah, it's really weird when Ryan, when Jonathan actually has to do work. He all sudden is really busy. All right, well, let's get to a lot of news because we got two weeks to get through here. I've purged as much as I can, but there's still some good stuff, so let's get into it. First up, Terraform has announced that the general availability of the Terraform Azure RM provider 4.0 is generally available. This improves the extensibility and flexibility in the provider. Since the provider's last major release in March 2022, Hashicorp has added support for some 340 resources and 120 data sources, bringing the total Azure resources to 1101 resources and 360 data sources. The provider has now topped 660 million downloads and Microsoft and hash two continue to develop new innovative integrations that further ease the cloud adoption journey for enterprise organizations. With Terraform 1.8, providers can implement custom functions that you can call from the terraform configuration, and the new provider adds two Azure specific provider functions to let users correct the casing of their resource ids or access the individual components of the id. Previously, the Azure RM provider took an all or nothing approach to Azure resource provider registration. The terraform provider would either attempt to register a fixed set of 68 providers upon initialization registration or be skipped. This didn't match Microsoft's recommendations, which are to register resource providers only as needed to enable the services you are actively using. With adding two new feature flags, resource provider registrations and resource providers to register. Users can now have more control over which providers to register automatically or whether to continue managing a subscription. Resource provider Azure RM has removed a number of deprecated items and is recommended that you look at the removed resources, data sources and the 4.0 upgrade guide before proceeding. Matt, the first question I have for you is what is this registration thing you have to do in Azure for terraform? [00:02:59] Speaker D: I learned this the hard way. Every single resource type you have to enable, which is both good and bad. If you need compute resources, you have to enable compute resources, you need networking resources, you have to enable it. 
So it gives you the ability to really turn on and off sections of it without giving people full access to everything. Especially for our development team, I might not want them to go play with insert OpenAI, even though we definitely play with it. We might have that disabled. I think OpenAI is actually part of the other ones, but maybe we don't want them to play with computer networking or something along those lines. We can just disable it in that tenant and you just can't use those features of Azure. It just errors out. [00:03:44] Speaker C: It doesn't have anything really to do with terraform, it has to do with Azure and enabling or disabling resource types that they can monkey with, basically with configuration code. [00:03:53] Speaker D: Yeah. So essentially I have a subscription level module, which is all the different resources that we have to enable for everything else going forward that I run first on my subscriptions and then from there everything else cascades down. [00:04:07] Speaker B: Got it. Okay. [00:04:08] Speaker C: Yeah, it's worded kind of weird. And I was like, is this an Azure thing or is this a terraform thing that's unique to Azure? And it is, but it's more about how Azure controls access to things. [00:04:18] Speaker D: Yeah, I haven't played enough with these. [00:04:20] Speaker B: It's similar to the enabling an API within GCP, but I'm still a little like registering the providers in terraform, still a little wrapped around the Axel Huntley. [00:04:32] Speaker D: Well, you just use terraform in order just to turn it on. So I don't want anyone having write access to my subscriptions. So the first thing you run, just the terraform that enables these things at the subscription level. So there's a couple other things that we enable at that level, which I can't remember what they are, just some of the general high level things. So in Aws, if you had to enable a region, I think I know you can do that with API calls or if you are enabling cloudtrail or something at the account level, those top. [00:05:04] Speaker C: Level things got it. That's just one weird azure thing I don't want to ever have to deal with. So appreciate you learning that one the hard way. [00:05:12] Speaker D: The other one that you don't want to deal with is the case id, because obviously Azure and Microsoft are case sensitive in this way. [00:05:19] Speaker C: Of course they are. Why would they be? [00:05:21] Speaker D: They weirdly camel case stuff and in different ways. And sometimes it definitely doesn't confuse you. So, you know, when we are importing some of our resources, we've made that assumptions what we thought it was going to be. And therefore this normalization thing actually is pretty useful. It just fixes it all for you, which is nice. [00:05:40] Speaker B: That is really handy. It was kind of neat with the custom functions. I hadn't heard of that from terraform. Something to play around with. [00:05:47] Speaker C: Yeah, I remember reading about it at 1.8, but they didn't have any like, use cases at that point that made any sense to me. And so I think even when we talked about Terraform 1.8 here on the show, we didn't mention it because it was one of those, like, until I have a use case, it's hard to talk about. Well, this is just a fun story. Rackspace is going all in again on OpenStack. Rackspace has been a very vocal supporter of OpenStack since they launched it in 2010 in a partnership with NASA. 
Over the last several years, though, they haven't really been doing a lot in the OpenStack space, although they haven't turned their backs. They say they've contributed over 5.6 million lines of code to it, and is one of the largest OpenStack cloud providers in the world. But even with that recent withdrawal, they say that they are now recommitted to OpenStack, and they're reaffirming their commitment by working with the OpenStack foundation and are launching the new OpenStack enterprise, a fully managed cloud offering aimed at critical workloads that run at scale and brings enhanced security and efficiency. And all I can think about with this is, ah, you wanted to make an alternative offer to VMware. Yeah, because everyone's mad at VMware, and if we can make something OpenStack better, then maybe they have a chance to get market share with OpenStack, which, good luck to you, HPE has been trying to do it for a while. I don't think it's worked for them either. [00:07:02] Speaker B: I mean, I've definitely used OpenStack extensively in previous lives, way back in the day, so. [00:07:09] Speaker C: And even that company's gone to AWS, now that's true. Poorly. [00:07:13] Speaker B: But they've gone there and GCP, I remember like they're just doing everything now. But yeah, you know, I, I think there should be something like OpenStack for, you know, being able to run your own hardware and, you know, still get a lot of the benefits of compute in a cloud ecosystem, but on hardware that you control and ecosystems that maybe you don't want being managed by a third party vendor. So happy to see OpenStack continue to getting support, even though I haven't touched it in years. [00:07:46] Speaker C: Yeah, I mean, I think if I were to be serious about trying to do anything with OpenStack, I would probably talk to HP and get some of their stuff because at least I have a support contract then and it's not all open source, but then you're bad. Or I just go Nutanix. [00:08:00] Speaker B: I don't know. [00:08:01] Speaker C: I mean, HP hardware is not the worst hardware you can buy a, but. [00:08:05] Speaker B: It'S not the cheapest you can buy either. [00:08:07] Speaker C: It's true. It's not IBM. All right, well, if you're in the OpenStack, congratulations, someone cares. Not the clap house, not us. Yeah. All right, moving on to AWS. Carpenter 1.0 is now generally available. This is their open sourced Kubernetes cluster auto scaling project. The projects have been adopted for mission critical use cases by industry leaders, adding key features over the years like workload, consolidation, disruption controls and much much more. More. With the 1.0 release, Amazon no longer considers it beta and the stable carpenter APIs are now part of the package and release. The custom resource definition API groups and kind name remain unchanged with the new 10 release and. But AWS has also created conversion web hooks to make migrating from beta to stable more seamless for you. The Carpenter V one adds support for disruption budgets by reason. The supported reasons are underutilized, empty and drifted and this will enable the user to have finer grain control on the disruption budgets that apply to their specific disruption reasons, which is the only major feature of the V one other than it's now stable. [00:09:10] Speaker B: See, this is how I know Kubernetes is too complex. 
I feel like every other week there's some sort of announcement of some other project that controls like the allocation of resources or the scaling of resources or something, something of pods. And I'm just like, okay, cool, like this. [00:09:32] Speaker D: There's a lot of things all doing the same thing because nothing has just completely taken it over yet. [00:09:37] Speaker C: Well, it's like everyone's decided that managing Kubernetes nodes sucks. So everyone's trying to create an automated way to do that. Then you layer on top of the networking and all the other stuff, but then you get into complexities of the auto scaler and the hyperscaler and the interaction to these things. Everyone gets their own custom thing. You have GKE autopilot, you've got carpenter, you've got something probably for azure aks, even though I don't know what it is, all out there helping you try to get this stuff done without having to think as hard, but it actually makes it more complicated. [00:10:08] Speaker D: This is where sometimes while I do like containers, and I do definitely use them, this is where sometimes I'm like sometimes just straight virtual machines on whatever cloud you're on. Sometimes it's just a little bit simpler in an auto scaling group and let go back to the basics and let the cloud function, the cloud just kind of handle it for you. Maybe I just sound old and now I'm that old person yelling at cloud and or old person yelling at kubernetes. But I'm like, sometimes you just don't overthink it, don't run SQL inside of containers. Justin, it's just not a good life choice. [00:10:44] Speaker C: I haven't done it. I only threatened to do it. But I have a story for that later. Let me tell you, I was like. [00:10:51] Speaker D: Thinking about it of the threatening or have you doing it in a fairly. [00:10:55] Speaker C: Of doing it, how I could do it even more terrible way than the way I came up with originally. So that's better. All right, well, moving on to macOS, and particularly macOS for your continuous integration pipeline is now supported with Codebuild, which initially I thought, didn't you do this already? But I guess you deployed the Mac OS box and then you deploy Jenkins on it, and that's how you do it. But now it's natively integrated with code build. So you can now build your applications on macOS, which would be your iOS apps, because no one else builds real Mac apps anymore. You can build artifacts on managed Apple M two machines that run Mac OS 14 Sonoma AWS code build is a fully managed continuous integration service that compiles, source code, run tests, and produce, rate or deploy software packages. The code build for macOS is based on a recently introduced reserve capacity fleet feature containing instances powered by EC two but maintained by Codebuild. With reserve capacity fleets. You can configure a set of dedicated instances for your build environment, and these machines remain idle, ready to process builds or tests immediately, which reduces your build duration, which is very anti cloud. Codebuild provides a standard disk image to your build. It will contain pre installed versions of Xcode, Fastlane, ruby, Python and node js, as well as codebuild manages auto scaling of your fleet. The key to this, and why this is interesting is because for the macOS you must pay for 24 hours. So the reserve fleet solves the 24 hours problem. And so contrary to your on demand fleet where you pay per minute of the build, reserve fleets are charged for the time. 
The build machines are reserved for exclusive usage even when no bills are running. The capacity reservation follows the Amazon EC two Mac 24 hours minimum allocation period as required by the software license agreement for macOS. Article three AI. [00:12:39] Speaker D: Wait, so how does this solve it? Because if you're reserving it, you're already paying for it. How is it actually? How is it actually you're not spending. [00:12:47] Speaker C: The key thing is that you don't want to spend up additional Mac OS every time you want to do this because then you're paying for every one of those for 24 hours. So because you have a reserve fleet, you're using the same Mac OS that's in the fleet and you don't have to worry about auto scaling of it up and down. It's basically, essentially what they're doing. [00:13:03] Speaker B: Sounds expensive. [00:13:04] Speaker C: It's expensive no matter what you're doing. It's expensive because it's Mac OS. [00:13:09] Speaker D: Yeah, I remember when they first released Mac, like the. Somebody did the math. It's like after one and a half months you could just buy one Mac mini and put it somewhere. And I don't think that math has changed too much. Like the 24 hours minimum really kills all cloud concepts with Mac. The only advantage is somebody doesn't have a Mac running at their desk or wherever running being their iOS builder. [00:13:35] Speaker B: I mean, it's being contained and updated and patched and you know, there's definitely some advantages. You know, I do believe that the, this is very intentional, which I think is why Amazon does goes a long way to call it out as a software license agreement for Mac OS and not something Amazon is doing. I think they're very much saying it's not us. [00:13:59] Speaker D: Yeah, I mean, there are just other third parties that you can leverage that do most of the same things that Amazon's doing here. [00:14:06] Speaker C: Yeah, Mac stadium and others. But again, the advantage of this is that if you do need, the one advantage you do get out of this is if you need different versions of Xcode because of different iOS that you're supporting or different phone models you're supporting. You can use the same reservation to basically spin up different Macs with the different configurations, do the builds, and then shut them down. But you get some advantage of that and some cost savings. But again, it's more complicated than it needs to be. If Apple just fix their licensing, we could solve this problem. [00:14:35] Speaker B: Good luck. [00:14:38] Speaker C: Well, if you love Nvidia L 40s, Tensor core gpu's, Amazon has a new instance for you. The EC two G six e instance the G six e instance can use for a wide range of ML and spatial computing use cases. The G six e instances deliver up to two and a half times better performance compared to the G five instances, and up to 20% lower inference cost than the P 4D instances customers can use. The G six e instance deploy large language models with up to 13 billion parameters, and diffusion models for generating images, video and audio. The G six e instances feature up to eight Nvidia L 40s. Tensor core gpu's with 344gb of gpu memory, which is 48gb per gpu, and the third generation AMD Epyc processors with 192 VCPU options, 400 gigs of network bandwidth, up to 1.536 terabytes of system memory, and up to 7.6 terabytes of of NVMe SSD storage. Woo, baby. 
[00:15:33] Speaker B: My initial reaction was like, I gotta figure out, like a modern workload where I care about these types of specs on these specific servers. And then I remember I provide cloud platforms to the rest of the business and I go, oh, no, this is gonna be expensive, and I'm gonna justify all this. [00:15:50] Speaker D: Pass. [00:15:51] Speaker C: Yeah, you don't want this. Unless you wanna go be a part of the mlops team and burn money for a living. You don't want to be a part of this. [00:16:00] Speaker D: I was thinking it was. Sounds like your comment from the last one. Sounds expensive. [00:16:03] Speaker B: Sounds. [00:16:04] Speaker C: Sounds expensive, for sure. [00:16:06] Speaker B: That's what I think of. Any time they announce, any cloud provider announces a new instance type where it's like Nvidia in tensor cores, or any machine learning specified workload. [00:16:18] Speaker C: Yeah, as soon as you say GPU, I'm like, oh, that's getting pricey. How many and how big are they? [00:16:23] Speaker B: And unavailable. [00:16:24] Speaker C: Awesome. [00:16:25] Speaker D: You know, where's my CFO's wallet? Can I borrow that real fast? Thanks. [00:16:31] Speaker C: Well, for all of our customers who are out there, or listeners out there who are unhappy with the very numerous options, including Hong Kong, Hyderabad, Jakarta, Melbourne, Mumbai, Osaka, Seoul, Singapore, Sydney, Tokyo, and China, we have a new region for you in Malaysia, which is the 13th in the Asia Pacific region. It has three availability zones and you can find it with the API name AP Southeast Dash five, which I will never remember and I constantly have to look up. The new eighties region will support the malaysian government Strategic Mandani economic framework. The initiative aims to improve the living standards for all Malaysians by 2023 with supporting innovation in Malaysia and across asian. The new regions will add up to 12.1 billion to Malaysia's GDP and will support more than 3500 full time jobs at external businesses throughout 2038, which I don't know how they calculate that, but. Okay, cool. [00:17:19] Speaker B: Was it 2038 when like the, that's. [00:17:23] Speaker D: When the Unix epoch time rolls over. [00:17:26] Speaker B: So they're just like, yeah, at least. [00:17:28] Speaker C: Until then the world will end then that's fine. [00:17:31] Speaker B: Yeah, exactly. [00:17:32] Speaker D: Well, all the hyperscale will crash. It's fine. [00:17:34] Speaker C: The forecast models all die at 2038. We didn't really understand why. Just assume that's when the jobs run out. No, that's a different problem. [00:17:43] Speaker D: At what point? I felt like the original concept of the cloud was like running these highly efficient data centers, you know, just strategic locations. And as we now have 13 in just the APAC region, like, I feel like the amount of overlapping, you know, like, like they have to over provision so much now in all these places to handle all these workloads, it's going to be making the cloud, the general cloud, be less and less efficient because they can't just have them in key locations. And I. It's interesting to see, and like, I understand why for government and compliance reasons, and government want to bring innovation into their countries. So I understand why they are building all these regions. But at one point I'm like, these can't be efficient anymore. [00:18:33] Speaker B: I mean, it depends what granular scale you're talking, right? 
Data center provisioning, it's like a six month lead time. So it's like if you can be efficient within a six month window or nothing, right? [00:18:42] Speaker C: So because of the cloud, now we're all about low latency applications. So low latency applications require you to be closer to the user. And then another factor you have in Asia, I think particular is telecommunication infrastructure is not as strong there. And so I think being able to have this compute closer is helpful for that. And then data sovereignty laws, because, hey, you got to bring 3000 jobs into your. Into the country, plus $12.1 billion in GDP. If I can force all of our local companies to use a local cloud, that then forces the big cloud vendors to come spend a lot of money in my country. It's interesting. Global political dynamics at play. [00:19:21] Speaker D: Yeah. [00:19:22] Speaker C: AWS cloud information now includes two enhancements to the infrastructure as code generator, which customers can use to create infrastructure as code from existing resources. Now, after the infrastructure's code generator finishes scanning the resources in an account, it presents a graphical summary of the different resource types to help customers find the resources they want to include in their template more quickly. After selecting the resources, customers can preview their template in AWS application composer, visualizing the entire application architecture with the resources and their relationships with the template review. [00:19:51] Speaker B: This is how I do all of my deployment architectures. Now I just deploy everything and then I generate the picture, screenshot that and then document ta da ta. Beautiful. [00:20:05] Speaker D: I can't decide if I'm mad at you or that's brilliant. I'll let you know later. I use terraform to generate it, and then after it's all deployed, I brought it to the architectural committee and said this is what I think we should do. Here's the picture to back it up. [00:20:22] Speaker C: What could go wrong? [00:20:23] Speaker B: What could go wrong? Exactly. [00:20:25] Speaker D: I might steal that. Still. Like, I'm okay with that. [00:20:28] Speaker C: I mean, that's what I use platformer two for. [00:20:30] Speaker D: Yeah. [00:20:31] Speaker B: Oh, for sure. Yeah. I mean, these are neat. Anytime you're generating, you know, code, getting your state codified so that you can reuse. [00:20:42] Speaker C: Awesome. [00:20:42] Speaker B: So this makes it even prettier. [00:20:45] Speaker C: It doesn't look pretty. Amazon Document DB with MongoDB compatibility global clusters now introduce failover and switch over. The document DB now supports a fully managed experience for performing a cross region failover to respond to unplanned events such as regional outages. With the global cluster failover, you can convert a secondary region into the new primary region in typically a minute and also maintain the multi region global cluster configuration. An Amazon document DB global cluster is a single cluster that can span up to six AWS regions, enabling doctor from region wide outages and low latency global reads. Combined with the global cluster switchover, you can easily promote a secondary region to primary for both planned and unplanned events. And switchover is managed failover experience meant for planned events such as regional rotations. 
[00:21:28] Speaker D: I really like that the switchover is built in day one because I feel like a lot of the times these services are like, we have this thing, it will work in a failover and you're like, will, you know. [00:21:39] Speaker C: Well, we talked about this a couple weeks ago with Azure, I believe, where they had a failover in unplanned situation. They did not have a way to go back without redoing it. [00:21:48] Speaker D: Yes. So that's where my comment comes from. [00:21:51] Speaker B: There. I mean, anytime you can do this type of like doctor and failover at the data layer I'm in love with because it's so difficult to orchestrate on your own. And so that's a huge value from using a cloud provider. I would like to just click some boxes and make and it'll just work. Awesome. [00:22:10] Speaker C: Amazon s three now supports conditional writes that check the existence of an object before creating it. This allows you to prevent applications from overriding existing objects. When uploading data, you can perform conditional writes using the put object or complete multi part upload API requests in both general purpose and directory buckets. This makes it easier for distributed applications with multiple clients concurrently updating data in parallel across shared datasets. This allows you to no longer write client side consensus mechanisms. Goodness. [00:22:38] Speaker B: Well, you just did manage this via eventual consistency and roll the dice every time like I've done in every app that I've used. [00:22:45] Speaker C: Your apps didn't require a data loss problem. [00:22:47] Speaker B: No, they did nothing. [00:22:51] Speaker C: But yeah, either you would have to do an API call to verify that the file was there before, which you're not paying for, and then you can do your write or you get to down do this. And if you have all your apps trying to do this all the same time, the milliseconds of latency can kill you on this type of thing. So having the ability is very nice. [00:23:07] Speaker B: This is super cool. I really do like it. You got to change the complexity in the code to be able to understand that object already exists, doesn't need to be updated. I would rather, I'd rather have that manage this to be or if it's. [00:23:22] Speaker C: An object, you know, multiple people uploaded an object and you need to get a different name. Right. You know, that's also good so you don't lose it. [00:23:31] Speaker B: Yeah. [00:23:32] Speaker D: S three is one of those services that every time they add a new feature, I'm like, oh, this is really good. I thought this service was 100% and done and never was going to be touched together. They just keep adding. He's like adding to it new various features. I'm like, this is great. This solves so many problems that I never really realized they had because I'm like, right. I was like a little bit. Data loss is fine. [00:23:54] Speaker C: Yeah, I mean, I kind of like it when they don't touch s three because it underpins so much of AWS. So if they break it trying to add these new features, that's a different problem. [00:24:03] Speaker D: They just have to reboot it every couple of years in us east one like, it's fine. [00:24:09] Speaker B: Yeah, I wonder if it's internal demand for these new features that's driving that, or if it's external from customers. It'd be interesting to talk to the product team about where the demands are being made. 
[00:24:26] Speaker C: AWS lambda now supports function level configurations which allow you to disable or enable recursive loop detection. Lambda recursive loop detection, enabled by default, is a preventative guardrail that automatically detects and stops recursive implications between lambda and other support services. Runaway workloads customers running intentionally recursive patterns could turn off recursive loot detection on a per account basis through support. Now customers can disable or enable recursive loop protection on a per function basis, allowing them to run their intentionally recursive workflows while protecting the remaining functions in their account from a runaway workload caused by the unintended loop. I remember when they first added this several years ago, we were like, this is amazing. Thank God they finally did this. But then I forgot about the support part, that you had to reach out to support if you did want your, your potentially recursive pattern. And I, if I was going to go down that path, I'd just say I don't, I've done something wrong. [00:25:14] Speaker B: Yeah. [00:25:15] Speaker C: But apparently if I think I'm actually right, which is a problem, I think I write all the time, I can now cost myself some money. So do be careful with this feature. It's a gun that can shoot you in the foot very quickly. [00:25:27] Speaker B: I mean, on the per function basis. Like, I really like this because, yeah, if you have that use case, which I don't know, I guess if you want something to always run, but making a loop. [00:25:38] Speaker C: But I can see like a polling system where you're doing sort of a polling where you poll and you put something into a queue to then poll again, you're waiting for something to happen. I guess I don't really have a great use case for it either, but I assume there's a use case because it's a feature. [00:25:53] Speaker D: I did a fan ad at one point where it was like, get all the instances, and then from the instances get, you know, get like all the instances, all the RDS, like all the different services that we wanted to support and then call itself with a different variable in it. So it was recursively calling itself over and over and over again and using a fan out that way until I definitely caused an infinite loop at one point, and getting like a $400 Lambda bill was just impressive. I thought. So for a 1 hour mistake, I. [00:26:27] Speaker C: Mean, at least you didn't do that. That one guy did and went to bed and then woke up with a three or $400,000 bill from GCP. And then they put him on the road to tell everyone the dangers of making mistakes. [00:26:38] Speaker B: Yeah, no, I think everyone using serverless has done this at least once. I ended up in a situation where I'm happy path. It worked just fine, but I was smart and I wanted to have a resiliency. And so it's like, oh, if you couldn't process the object, put it back in queue. Bad plan. [00:26:56] Speaker C: Oops. [00:26:58] Speaker D: You have to put another queue that only tries like one or two more types, right? Yeah. Hindsight vision. [00:27:04] Speaker C: Yes, exactly. You need three queues. Three, two, one. And then when it gets the one queue, you just dead letter it and then you're fine. All right, let's move on to GCP. Looker is opening the semantic layer via a new SQL interface and connector for tableau and others, which makes no sense until you realize how they've bastardized looker. 
Google says that data is the driving force of innovation in businesses, especially in the world of accelerating AI adoption. But data driven organizations struggle with inconsistent or unreliable metrics without a single source of truth for data definitions. Metrics can have a different logic depending on what tool or team they came from, and the teams can't trust data go back to their gut, which is a risky strategy. Google designed looker with a semantic model to let you define metrics once and use them everywhere for better governance, security and overall trust in your data. To live up to this vision, they are releasing bI connectors, including GA of their custom built connector for tableau, which will make it easier for lookers metrics layer with a broader system of SQL based tools with an integration layer for looker ML, which is how they bastardize it based on the bigquery plus connectors for popular products. If you're fronting your bigquery through Looker ML and now it becomes data, now the source of truth in Looker ML, you now have a way to get it out of the integration layer is the open SQL interface and gives looker customers more options for how they deploy governed analytics. Also releasing a general purpose JDBC driver for the connecting to the interface and partners include Thoughtspot mode, ApOs system or APO system and products with looker semantic layer. The connectors for looker now include Google Sheets, Looker Studio, Power, Bi Tableau, Thoughtspot mode, Apost systems and that custom JDBC. [00:28:43] Speaker B: I mean, couldn't agree more that data is driving force today because I spend my entire day seemingly just trying to get access to data sets or replicating data sets or doing something, providing data to the rest of the business. And so these types of connectors and stuff offer a great amount of flexibility because these bi tools are so complex that people sort of develop their favorite and don't want to use another one. I mean, I like the looker ML sort of layer on it because it's super powerful, but it's a lot of rope for sure. Like you can do some really cool stuff with data enrichment and augmentation and that kind of thing. But yeah, you can also shoot yourself in the foot pretty fast. But I don't really quite understand that layer other than it's part of the plugin system where they're actually adding the SQL interface. [00:29:40] Speaker C: Yeah, it's basically becomes a plug into looker that gives you SQL and then you can basically query the data that's underpinning the looker ML dashboard that you have. So you can basically use that data in other systems like tableau. [00:29:54] Speaker B: Looker ML was the only way to do that before. [00:29:57] Speaker C: Correct? You basically force everyone to use looker ML and then people are like, well, tableau is our business reporting system of choice. Either they had to go figure out how to recreate the data you created in looker ML in tableau, or you get different results. And now people doubt the data. And that's not good. That causes escalations. It's very painful. All right, Google is pleased to release the general availability of the C four machine series, the most performant general purpose VM for compute engine and GKE customers. This does not have a gpu, so it's not as expensive. 
C four vms are engineered from the ground up and fine tune deliver industry leading performance, up to 20% better price performance for general purpose workloads and 45% better price performance for cpu based inference versus comparable general available vms from other hyperscalers. Together with the infor machine, C four vms provide the performance flexibility you need to handle majority of workloads. All powered by Google's titanium titanium offload technology, C four provides high performance activity with up to 20 gigabits per second of networking bandwidth and scalable storage with up to 500,000 IOP's and ten gigabits of throughput on hyper disk extreme. The C four instances scale up to 192 V cpu's and 1.5 terabytes of DDR five memory and feature the latest generation performance with Intel's fifth gen Xeon processors. [00:31:10] Speaker B: I look forward to being able to use these five years when they actually roll out the inventory to all their regions. But uh, cool. [00:31:17] Speaker C: Maybe you have a better chance getting them though cause like getting intos and c two s and all those are getting a little harder to do because those intel processors are not made in volume now. [00:31:26] Speaker B: Well, I'm still working on the three series in a lot of places so. Yeah, that's true. I'll believe it when I see it. [00:31:33] Speaker C: I mean if they didn't buy intel they went with maybe a different manufacturer for processors. They'd be better off. Just saying, just saying. Can intel weather down the specs on this is outstanding. [00:31:44] Speaker D: Like the 20gb of networking, like they really put a lot into this and it really feels like it's going to be a good workhorse for people. [00:31:51] Speaker C: Yeah, the C and the N class that are on Google are basically equivalent of C and MS on AWS and whatever combination of letters on. [00:32:01] Speaker D: I don't know, I'm still working on that. [00:32:05] Speaker C: Nv five m twelve double dash for b two. Yeah, v two, why not? So this one a little confusing to me. So you might have some insight on this, Ryan, and you can help explain this one for me, but I'll try to get us through it. Google is launching a new custom compute class API in Google Kubernetes engine. Imagine that your sales platform is working great despite surging demand. Your Kubernetes infrastructure is seamlessly adapting to handle the traffic and your GKE cluster auto scaler is intelligently selecting the best resources from a range of options. You've defined no pages for being out of resources or capacity issues, all powered by the custom compute class API. Google is providing you a fine grained control of your infrastructure choices with GKE can now prioritize and utilize a variety of compute and accelerator options based on specific needs, ensuring that your apps, including AI workloads, always have the resources they need to thrive. GKE custom compute classes maximize obtainability and reliability by providing fallback compute priorities as a list of candidate nodes, characteristics or statistics statically defined node pools. This increases the chances of successful auto scaling while giving you control over the resources that get spun up. If your first priority resource is unable to scale up, GKE will automatically try the second priority node selection and they continue down the list. So for example we have an N 2D with 16 gigs of memory that is referred option is not available. It falls back to a c 2D with 16 gigs of ram. That's unavailable. 
It goes back to an n 2D with 64 gigs of ram, which why would that be available if the 16 wasn't? But Google's weird. And then we'll fall back to a node pool. Scaling events for top priority nodes may not be available without custom compute classes. So pods land on lower prior instances and require manual intervention previously, but with the active migration of workloads with the new API, all workloads eventually end up on the preferential node shape once it's available. [00:33:49] Speaker B: Kubernetes is really complicated, huh? [00:33:52] Speaker C: Yeah, well, I was looking at these. You manually have to reconfigure it. I'm like, okay, so did you sort of have something like a fleet thing, but the fleet was manual reconfiguration and then this is supposed to solve that? It's sort of a confusing, well, I. [00:34:08] Speaker B: Think previously you had to sort of reconfigure your, the hardware layer of the cluster and then relaunch your pods to get them into the new thing. So this is allowing more. [00:34:18] Speaker C: So you got the outage call that you're out of the n four capacity because you. [00:34:22] Speaker B: Exactly. [00:34:23] Speaker C: Problem. So you're launching in three s or c three s. And so then you're like okay, now I'm stuck with c three until I manually get the n four s back and then I manually drain the c three or c or n four. Yeah, that makes sense, right? That was what threw me off in this article is like, why don't wait? Was there a thing before? Sort of, but no, it sounds, I. [00:34:39] Speaker D: Get what you're saying and that's what's kind of cool is this API that sounds like we'll automatically manage that for you and auto shift stuff as it, you know, capacity becomes available. [00:34:49] Speaker C: Now, which let me play devil's advocate for you a little bit here. [00:34:53] Speaker D: Yeah, it's probably not going to work because that's just a rope. [00:34:58] Speaker C: So you have, you have 500 service pods running out there with your web application and all of a sudden 20% of your user population is experiencing 3 seconds slower response time on average. And you pin it down to basically this node that happens to be running an n two versus an in four. How do you deal with that? And hey, how did you even figure it out? Because unless your monitoring system is detecting all these different node types, all you have is just a bunch of red requests all of a sudden went slower than they used to go. So you need something like honeycomb that can actually handle that. Kind of troubleshooting of a massive amount of nodes and try and identify individual weak performers. [00:35:37] Speaker B: Yeah, I mean, it is tricky. And there's definitely been a heavy adoption of kubernetes in a lot of monitoring platforms. And so you do get the ability to dashboard those things out pretty quickly in terms of like pods on the individual instances. But that doesn't guarantee you're going to have the insight and the metrics to for like you said, you know, a little bit of latency here and there. It's really tricky to sort of compile it all together in a lot of cases like that. [00:36:06] Speaker D: You know, what I've done on AWS, I've stayed within like two families for like spot fleets for kubernetes. Even where you're like, okay, I want these two because I know that they will work and I just let it choose whatever instant size. 
And worst case one is a little bit hotter because it's 16 versus 8gb of ram for, you know, so it holds more pods and you know, it's fine because it's not that much of a big deal, but like kept it like in a fairly contained framework between them. Yeah, and I always tried to do the same generation or like two generations, so do like c four and M four and you know, c five and M five type of thing for the reason that you're saying. [00:36:46] Speaker C: Yeah, exactly. And that's why I used to do fleets and I, knowing GCP and the capacity problems to get safety, you need to get diversity of fleet type, I think, to really protect yourself. So that's where I sort of like. [00:36:59] Speaker B: And a lot of the benefit, like, you know, typically, you know, to do similar instance types with it, you're going to end up with a ton of clusters. And Kubernetes is complex and hard to upgrade. And so a lot of teams seem to prefer very large clusters and use the namespace rules for dictating which node pools and stuff goes on too. And so it becomes even more of a problem in configurations like that where you have many, many different types of disparate workloads that have different constraints and different things. And so you need very powerful rules engine and you need something that can adapt pretty quickly and so that complexity is part of that, the way that workloads are run. [00:37:44] Speaker D: I do want to point out that they had to say in this article because this article has absolutely nothing to do with AI in any way, shape or form, but it includes AI workloads because for some reason it wouldn't have been known. And I actually checked the article because I saw it in the show notes, but I literally had to go into the article to be like, why is that commentary necessary? Does somebody miss, like, their AI quota for the day? So they just threw it in a kubernetes gke ape. [00:38:15] Speaker C: Every blog post meant to mention AI three times. And yeah, I assume there's some sort. [00:38:20] Speaker B: Of publishing rule, right, that you build into, you know, your thing and so if it doesn't have AI, it won't. [00:38:25] Speaker C: It won't pass the automated check through the CMS build out AI. [00:38:30] Speaker D: They say AI use case and AI workload. I'm like, what about web workload? Does it not work for that because you didn't call it out of. Yes. What about processing workload? Like, come on. [00:38:40] Speaker C: Well, that's one of the things too that compute classes for accelerators and really attributes being ideal for AI use cases only. If you can get those particular nodes, those are even more capacity constraint. Yeah, I would test this one before you get, you know, put that pager away because I don't think it's gonna work as well. And based on what I know about Google capacity. Well, Ryan, we're gonna destroy some stuff, your favorite thing. We're gonna destroy secrets. [00:39:07] Speaker B: Ooh, even better. [00:39:08] Speaker C: Yeah, we're gonna destroy those secrets like Roswell and what happened there, and aliens destroy all of them. But we're gonna do it in a safer manner with the new delayed destruction of secret versions for secret manager, this new capability helps to ensure that secrets material cannot be erroneously deleted, either by accident or as part of an intended malicious attack. Managing secrets and secret versions was possible before it had some challenge and risks. 
Destruction of a secret version is an irreversible step previously, meaning there is no way to recover your secret once it is destroyed. Nor was there an actual learning model. So if the hacker was in there deleting all of your secrets, you wouldn't find out about it, which is not great, especially if they're your critical secrets, which reduces the chance of a timely intervention from administrator, who hopefully answer their page with a customizable delay duration. You can prevent immediate detection of secret versions, as well as fire a new pub sub event notification that alerts you when a destroy action is attempted. And using all the great Google monitoring tools, you can send a page to Ryan's phone saying someone deleted the secrets. [00:40:07] Speaker B: I mean, this is a good feature. AWS has it by default from the rollout where it takes seven days for a secret to actually go away, and you can restore it. Up until then, the monitoring is the bigger one for me being able to configure a notification without trying to like, you know, scout through all the API logs for the delete secret API method. So this is nice. I like that. [00:40:34] Speaker C: Oh, you don't like cloudtrail? Come on, man. [00:40:37] Speaker B: I mean, this isn't in cloud trail, this is an admin action log explorer. [00:40:42] Speaker C: At least, at least in cloudtrail you would have the, you had the delete secret event, so you could have triggered on that from day one. I mean, I'm sort of weird to me a that they design this without this capability because seems like an obvious attack vector number two, not to have any way to alert on it previously. It seems kind of strange, like, you know, like default principles of the cloud. Like we should audit everything, right? Really not. [00:41:04] Speaker B: Yeah. [00:41:05] Speaker D: The only annoying part about this is you make sure when you set up your IC that's like if Dev then set this to be delete. Because I'm going to delete and recreate this many times as I am figuring out my IAC. And that has never bit me in the butt on the, any of the other cloud providers. You know what bothers me? I'm like, oh no. [00:41:24] Speaker B: Is they run into it still, like every time, like I never learned, like. [00:41:28] Speaker C: That'S the thing I'm bothered most. [00:41:30] Speaker B: It's one thing to run into at. [00:41:31] Speaker D: Once, but every, every time, every time I've done it on Aws, I did it. And then when we, we stood up everything in terraform at my day job, I was like, I wrote the module for, for this, you know, and I was like, I deployed it and I just deleted it and I deployed it. I saw the error, I was like, oh, I don't even understand what's going on yet, but I know what the problem is. Like, yeah, how did I walk into this again? Yes. [00:42:00] Speaker B: S three buckets and objects in there and versioning on that and secrets, all kinds of things. [00:42:08] Speaker D: Well, s three, they at least let you do the forcefully, which purges the objects in the bucket because that was the pain for a long time. Washington. And this replication is pretty quick. Only once or twice I've had to wait like an hour or two for the bucket name to come be available. [00:42:26] Speaker B: Yeah, yeah. [00:42:27] Speaker D: Normally it's like within a minute. So by the time it deletes and I watch it long enough, even with my tf rebuild alias, which just does a destroy auto approve at a upgrade and then auto apply auto approve. Definitely not a command. 
[00:42:57] Speaker C: So, the least egregious part of the Cloud Run stories today first: you can now run your AI inference jobs on Cloud Run with Nvidia GPUs. This allows you to perform real-time inference with lightweight open models such as the Gemma 2B or 7B parameter models, or the Meta Llama 3 8B parameter option, or your own custom model you built, as long as it's lightweight. That's awesome. I love this, that I can run Cloud Run with Nvidia GPUs.
[00:43:25] Speaker B: No, I mean, this is a great example of how to use serverless in the right way, right? It scales down, you're doing lightweight transactions on those inference jobs, and then you're not running dedicated hardware or maintaining an environment which you basically keep warm. We can make fun of that as long as there's enough GPUs in the back end, but with Cloud Run it's abstracted away, so it should be fine.
[00:43:53] Speaker C: And now the egregious story. Cloud Functions is now known as Cloud Run functions, which is stupid. Yes, this goes beyond a simple name change, though; it's not just stupid. They do have an awesome part of it: they have unified the Cloud Functions infrastructure with Cloud Run, and developers of Cloud Functions second gen get immediate access to all new Cloud Run features, including those Nvidia GPUs we just talked about. In addition, Google Cloud Functions gen two customers have access to all the Cloud Run capabilities, including multi-event triggers, high-performance direct VPC egress, the ability to mount Cloud Storage volumes (hey, I can run SQL), Google-managed language runtimes, traffic splitting, OpenTelemetry, and inference functions with Nvidia GPUs. Now, I would not run SQL Server on that. I would use something lightweight; SQL Server would not run in a Cloud Run.
[00:44:43] Speaker D: I'm more terrified of why you're talking about running SQL in Lambdas and function apps.
[00:44:50] Speaker B: No, he's given up running it in containers, so now he's going to run it on the serverless framework.
[00:44:55] Speaker C: I can see it for SQLite: you just run a small little SQLite database, and to do an update, you pass the command into the Cloud Run function. You can do it if you want to. I don't think you should. I'm just saying you could.
[00:45:04] Speaker D: I think we need to ban Justin from the keyboard shortly.
[00:45:07] Speaker B: No, there's an alert every time he logs in. I get a thing.
[00:45:11] Speaker C: That's why they don't give me access to systems anymore.
[00:45:14] Speaker D: It actually just auto-changes his password so he has to reset it.
[00:45:18] Speaker C: I don't even remember my password, it changes so often; I forget it whenever I need to access something.
[00:45:25] Speaker B: I mean, I look forward to this, because Cloud Functions have been really limited, specifically for my networking use cases. And so I'm really curious about the direct VPC egress as well as the traffic splitting options here, because those might make my life a lot easier.
[00:45:42] Speaker D: So cool.
[00:45:43] Speaker C: Yeah, I'd started to wonder why you wouldn't just use Cloud Run, unless you're getting some automation with Cloud Run functions that I'm not familiar enough with. But the fact that you get all the Cloud Run benefits with Cloud Functions, and if I get some advantage from using functions, I guess it's a win.
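Whether it lands as a Cloud Run service with a GPU attached or as a Cloud Run function, the thing you ship is the same: a stateless container that listens on the PORT environment variable and scales to zero when idle. A minimal Python sketch of that contract (the inference call is a hypothetical placeholder, not a real model):

```python
# Minimal sketch of a Cloud Run-shaped workload: read PORT from the
# environment, answer HTTP requests, keep no local state.
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_inference(prompt: bytes) -> bytes:
    # Placeholder for a real model call (e.g. a Gemma or Llama 3 8B runtime
    # loaded at startup); hypothetical for this sketch.
    return b"echo: " + prompt


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        result = run_inference(body)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(result)


if __name__ == "__main__":
    # Cloud Run injects the port to listen on via the PORT env var.
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Swap whatever lightweight model you deploy into run_inference; the scaling, TLS, and (now) GPU attachment are Cloud Run's problem rather than yours.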
[00:46:00] Speaker B: I've never really been able to figure out why you'd use Cloud Functions versus Cloud Run. One has functions in it, so I prefer that one, you know, kind of thing. But they each have their quirks; scalability and basically triggering workloads is where they're very different. So I think a lot of people probably feel very similar to how I do, and part of this merge is trying to reduce that abstraction. It's one service, and different capabilities of the one service will go into different workload pools. Makes sense. Very nice.
[00:46:42] Speaker C: Well, let's talk about compliance.
[00:46:44] Speaker B: Do we have to?
[00:46:44] Speaker C: It isn't a one-time job. Google is releasing several updates to Assured Workloads, which helps your organization meet your compliance requirements. The new compliance updates feature allows you to evaluate if your current Assured Workloads folder configuration differs from the latest secure, compliant one, and can enable you to upgrade previously created Assured Workloads folders to the latest version. There are also expanded regional controls for Assured Workloads, now in over 30 regions and 20 countries, and regional controls now support over 50 of the most popular Google Cloud services, which represents 45% more than the year before. And they now have over 100 new FedRAMP High authorized services, including Vertex AI, Cloud Build, and Cloud Run. Which means AI is coming to the government.
[00:47:26] Speaker B: It does. Sure does.
[00:47:28] Speaker D: Wait, so their compliance service only supports, say, over 50 — so somewhere between 50 and 60 — of the services?
[00:47:37] Speaker C: So only 50 of them are supported with regional controls and Assured Workloads. So basically, it means that they will guarantee that only a person of that country will access those systems.
[00:47:50] Speaker D: Got it. Yeah, I've translated this in my head now.
[00:47:53] Speaker C: Yeah, they use this Assured Workloads label for a lot of things, and it can get you in trouble unless you know exactly which one you're talking about, because Assured Workloads applies to FedRAMP, it applies to regional sovereignty, and to compliance configurations of other areas as well.
[00:48:09] Speaker B: I mean, this is definitely a great addition, especially if you're running FedRAMP workloads, because that's hard. And I love the sort of checkbox-ness of this. You're creating projects in a certain structure, and you're assigning them as an Assured Workload and defining the regions and the services. And so not only can you confine them to the appropriate services, but you can also confine them to the appropriate compliance level. So like, if you're FedRAMP High you get these services, or if you're FedRAMP Moderate or whatever the rest of them are, you can gate to these specific options within the GCP APIs and the project structure. So it's a pretty handy feature. I haven't paid for it yet, though.
[00:48:56] Speaker C: So yeah, I think it's a good feature too. But yeah, it definitely, like all Google things, has sharp edges. So be careful.
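For a sense of the checkbox flow Ryan is describing, here is a rough Python sketch of creating an Assured Workloads folder pinned to a compliance regime. The google-cloud-assured-workloads client exists, but the specific field and enum names, and the org, location, and billing account values, are assumptions to verify against the current API — a sketch, not a definitive recipe:

```python
# Rough sketch: create an Assured Workloads folder for a compliance regime;
# projects created under it inherit the regional and service restrictions.
# Field names, enum values, and resource IDs below are assumptions.
from google.cloud import assuredworkloads_v1

client = assuredworkloads_v1.AssuredWorkloadsServiceClient()

workload = assuredworkloads_v1.Workload(
    display_name="fedramp-high-workloads",  # hypothetical
    compliance_regime=assuredworkloads_v1.Workload.ComplianceRegime.FEDRAMP_HIGH,
    billing_account="billingAccounts/000000-000000-000000",  # hypothetical
)

operation = client.create_workload(
    parent="organizations/123456789/locations/us-central1",  # hypothetical org
    workload=workload,
)
print(operation.result().name)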
Well, for those of you who need to get events into your system and you've decided Pub/Sub is too limited because you're potentially multi-cloud and you need Kafka or other things, or you have architects who just don't understand the difference between Kinesis and Kafka, like I had at a prior job: until now, if you wanted to manage Kafka in your data center or at your cloud provider, you had to learn how to run distributed event processing and things like ZooKeeper and the storage systems, which can push your ops team to the brink. There are tons of ways to secure, network, and autoscale your clusters, and those are all things your team has had to become expert at. Or Google is now pleased to offer you a shortcut with the new Google Cloud Managed Service for Apache Kafka. The service takes care of the high-stakes, sometimes tedious work of running infrastructure for Kafka. This is an alternative to Cloud Pub/Sub. You can have Kafka clusters in ten different VPC networks, but there was no mention of cross-region support, which is really what I need out of this service versus in-region support. But if they make this multi-region over time, I'm sort of in on this one.
[00:50:07] Speaker B: Yeah, I mean, this...
[00:50:09] Speaker D: Anyone managing Kafka for you is great, because you don't want to be managing this if you can avoid it, even the setup of it. This is one of those services where you're like, here, take my money, please make sure that I don't have to deal with this in life. You just want it to go away.
[00:50:22] Speaker B: Yep, I will happily pay Confluent Cloud, and now I can pay Google. And then Amazon, I forget their service, MSK. Yeah, and now this. Like, I will happily pay someone else to manage that complexity. It's just, even if you can do it, why?
[00:50:42] Speaker C: Because you like to play with ZooKeeper when it breaks in horrible ways.
[00:50:45] Speaker D: That sounds like bad life choices.
[00:50:47] Speaker C: That's what it is. That's a bad life choice.
[00:50:48] Speaker B: He's only saying that to troll me, because I was like, ooh, ZooKeeper.
[00:50:54] Speaker D: And then you ran it, and then you tried to run it in production, and then... do you still say "ooh, ZooKeeper"?
[00:50:59] Speaker B: I did, but only because I used it very sparingly, for maintaining cluster state for a configuration management system. But yeah, most use cases I've seen with ZooKeeper, if you really load it, it's gonna fall down and hurt you. And yeah, I mean...
[00:51:15] Speaker C: Isn't Kubernetes still heavily relying on ZooKeeper? Did they back away from that finally? Or etcd? Is it etcd?
[00:51:21] Speaker B: I think it's etcd.
[00:51:22] Speaker D: Etcd.
[00:51:23] Speaker C: Yeah, they're the same. Same knife, different manufacturer.
[00:51:28] Speaker B: I don't know, I don't hear the same hate for etcd as I do for...
[00:51:31] Speaker C: That's because everyone ran to ZooKeeper or Consul. Well, like Azure, Google is now getting a new provider: 6.0 is now generally available for the Terraform Google provider. The combined HashiCorp and Google provider team has listened closely to your feedback and only given you three notable features, which were not that notable. First up, the opt-out default label: goog-terraform-provisioned is now opt-out. It was opt-in in 5.16. And so if you were looking for default labels, which are dumb, you can turn this on and get goog-terraform-provisioned. Then they go into a long part of the article trying to explain to you how this is awesome, because now you can see in your billing data that it was goog-terraform-provisioned.
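For context on what that buys you, here is a rough BigQuery sketch of pulling cost by that label out of a standard billing export. The table name is hypothetical, and the label value of "true" is an assumption about what the provider writes:

```python
# Rough sketch: sum cost attributed to resources carrying the
# goog-terraform-provisioned default label in a BigQuery billing export.
from google.cloud import bigquery

client = bigquery.Client()

query = """
SELECT
  service.description AS service,
  SUM(cost) AS total_cost
FROM `my-project.billing.gcp_billing_export_v1_XXXXXX`,  -- hypothetical table
  UNNEST(labels) AS label
WHERE label.key = 'goog-terraform-provisioned'
  AND label.value = 'true'                                -- assumed value
GROUP BY service
ORDER BY total_cost DESC
"""

for row in client.query(query).result():
    print(f"{row.service}: {row.total_cost:.2f}")
```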
And I'm like, which Terraform file was it provisioned in? Could you give me that at least, which file it was? Anything other than just goog-terraform-provisioned, which does not help me in any way, shape or form. Again, the second item is deletion protection fields added to multiple resources, including Google Domain, Google Cloud Run v2 Job, Google Cloud Run v2 Service, Google Folder, and Google Project, which should have had delete protection a while ago. And the last one is they allowed reducing the suffix length in name prefixes: the max length for the user-defined name prefix has increased from 37 characters to 54 characters, just like the Azure version. There is an upgrade guide available, and I'm sure more will be coming out, as there are lots of tricks to upgrading your Google Terraform provider.
[00:52:58] Speaker D: The deletion protection one, like, guys, that should have been there already. I'm surprised.
[00:53:04] Speaker C: For Google Folder, Google Project, and Google Domain, those ones should have been there a while ago.
[00:53:09] Speaker B: You've had the ability to do that, just not in Terraform natively. But yeah, this makes sense, because then you can just define it as having delete protection when you're running it, versus the terrible method before, which I forget what they even call it in GCP.
[00:53:28] Speaker C: But it's not going to be "delete protection," because it's Google. It's gonna be something... oh no.
[00:53:34] Speaker B: This is like a... no, it's so obscure. I remember it, the brain just isn't working, but I'll find it. And I won't.
[00:53:45] Speaker C: It'll come to you at 3:00 a.m. You'll wake up and be like, that's what it was.
[00:53:48] Speaker B: And then you fall back to a dead sleep.
[00:53:53] Speaker C: All right, well, we're going to move right along to Azure, where first up we're getting the ability to elevate your AI deployments more efficiently with new deployment and cost management solutions for Azure OpenAI service, including "self-develop service provisioned," which, yes, is the way the headline reads. So what's new? Self-service provisioning and model-independent quota requests, allowing you to request provisioned throughput units, or PTUs, more flexibly and efficiently. This new feature empowers you to manage your Azure OpenAI service quota deployments independently, without relying on support from your account team. And by decoupling quota requests from your specific models, you can now allocate resources based on your immediate needs and adjust as requirements evolve. Next is visibility into service capacity and availability, so now you know in real time about service capacity in different regions, ensuring that you can plan and manage your deployments effectively. And all I can ask, for both Azure and GCP, is please make that available for all instance types across all regions, so we know. And then the final part of this is provisioned hourly pricing and reservations: there's now hourly, no-commit purchasing, and monthly and yearly Azure reservations for provisioned deployments of Azure OpenAI service.
[00:54:59] Speaker D: These are, while they sound crazy, extremely useful, because as soon as — what was it, 4o? — came out, we had to go deploy them, because otherwise we were worried we were going to get locked out of the region.
So even though we weren't using them yet, our account team was like, make sure you deploy them as soon as you see the announcement, that may or may not be coming out in the next couple of days, and deploy the units that you're going to need for production, even though we didn't know what we needed yet. So we just chose a high number. So having more levers on it should hopefully make it so that people aren't doing what we were told to do, which is: take everything, and you'll leave some for everyone else.
[00:55:41] Speaker B: Wow, that's crazy, because you're paying for all that provisioning.
[00:55:45] Speaker D: No, you're not.
[00:55:46] Speaker B: Wait, you're not?
[00:55:49] Speaker D: Depends. I have to check what we did versus provisioned. There's two different levers that you can pull. So we don't do provisioned; we did the other lever, which means that we at least have the ability to use it, more so than if you just got locked out because they ran out of capacity in the region to start off.
[00:56:09] Speaker C: Is that what the hourly no-commit purchasing is?
[00:56:12] Speaker D: Yeah, I think so. I got to a level of detail that was like, yeah, I'm not worried about it; they're not charging me, and we did it, so we're good and we can move on in life. I'll deal with this six months from now, and by then I'll have enough capacity that I can delete it, do it right, and move on.
[00:56:31] Speaker C: It's so interesting, because you're reserving, or signaling, that you're going to use capacity that they can't sell to someone else, and they're not charging you for it. It's sort of weird.
[00:56:39] Speaker D: Yeah. I assume they overprovision it to X percent and they've done enough mathematical modeling to say... have they, though?
[00:56:48] Speaker C: I don't know.
[00:56:51] Speaker D: They might just drop IP addresses from their virtual machines to the storage layer again. It's fine, no problem, no issue.
[00:57:01] Speaker C: Well, Azure is thrilled to announce that you can attach or detach VMs to and from a virtual machine scale set (VMSS) with no downtime, and this is generally available. I will point out that they actually used the word "thrilled," which is why I put it in the show notes. They said they're thrilled. The functionality is available for scale sets with flexible orchestration mode with a fault domain count of one. And the benefits? Let Azure do the work for you: easy to scale, no downtime, isolated troubleshooting, and easily move your VMs. I'm like, I get the idea of taking something out of an auto scaling group to troubleshoot it, that makes sense to me. But then to put it back is sort of weird. Why wouldn't you just kill it after you troubleshoot it?
[00:57:38] Speaker D: And this is only for flexible. So if you're not using flexible, which has other issues associated with it, and you already have to be at that fault domain count, so you actually have more capacity than you need — there are very specific ways you can leverage this.
[00:57:54] Speaker C: You're saying that's not most people?
[00:57:57] Speaker D: I mean, probably not.
[00:58:01] Speaker C: So what's flexible orchestration mode versus non-flexible orchestration mode?
[00:58:06] Speaker D: It's called uniform orchestration mode. Of course it is. Yeah. There was a whole document I had to read at one point when I was trying to figure out which one to use. We use uniform at my day job for most of our stuff. And it's like you can't see the machines, but you can, you know... like in the virtual machine console.
So in AWS, right, if you're in the EC2 console, you will see all the machines even though they're in an auto scaling group. Here, you don't see them there, they're all hidden, but you can still actually RDP to them, or SSH, whatever way you need — you usually have to go in a slightly different way. And then there are different capacity limits, and there are different pros and cons to each one. It's one of those things you research at the beginning, you decide which way you're going to go, and then you look back at it six months from now and have to redo all that research, because you've successfully forgotten why you went that way.
[00:59:00] Speaker C: So I'm realizing as you're talking about this, and my brain is wandering, that I recommended you for this job and I put you into this Azure situation. And I just wanted to apologize for what I've done to you, because nothing you said made any sense to me at all, and I know it made perfect sense to you. And when you're no longer working at this job sometime in the future — hopefully not anytime soon, well, maybe soon, I don't know — you can forget all this very quickly with the liquor that I will graciously provide to you to help burn all these brain cells. Seriously, man, that hurt.
[00:59:31] Speaker B: Yeah, I got about halfway through and then I was like, nope, pain tolerance exceeded.
[00:59:36] Speaker D: I just assumed that you wanted me to do this job so we had more of an Azure...
[00:59:39] Speaker C: I mean, we did. That was the ulterior motive to getting you this job; it was like, well, he'll get more Azure exposure, that'd be great for the show. There was definitely that to it. But also, like, you know, you're helping out a friend of mine as well, who's a CTO there. But I feel bad. So I'm sorry about that, Matt.
[00:59:55] Speaker D: So here's a quick Google — the quick Medium article that pulled up on uniform orchestration.
[01:00:02] Speaker C: Medium. Medium. A source of truth, for sure.
[01:00:04] Speaker D: Yeah, it's on the Internet, it must be true. Uniform orchestration, comma, standard mode, deploys and manages virtual machines that are identical in configuration — what a scale set, an auto scaling group, should be. Flexible orchestration is a more diverse virtual machine configuration setup within the same scale set. This is perfect for a more complex application that requires different virtual machine sizes. Which — I still wonder, if you have different virtual machine sizes, why they're in the same scale set. Unless you're doing some logic that says, like, hey, if I have 18 gigs of RAM, do 4,000 connections; if I have 24 gigs, use this many connections — I've done weird stuff like that. But there are 27 other issues with all this.
[01:00:52] Speaker B: Yeah, I guess I can kind of see, like, if I read between the lines and I squint, you know, the uniform scale set works really well for standard web serving, where it's all the same and you're just trying to round-robin requests, and then maybe flexible orchestration mode works really well if you're running your own Kubernetes cluster, where you need to define different node sets with different things. Yeah, I'm gonna pass.
[01:01:18] Speaker A: That's kind of nice, because if you've made reservations and you're not using them anywhere else, to be able to put them into a flexible scale set and reuse them, instead of not using them... no? No?
[01:01:28] Speaker C: You're very quiet.
[01:01:29] Speaker A: Okay.
[01:01:31] Speaker C: Welcome, welcome.
[01:01:33] Speaker B: Look who snuck in.
[01:01:34] Speaker C: I know, snuck in, right? The Azure section. Well done.
[01:01:37] Speaker A: I thought you were doing the Azure stuff first.
[01:01:41] Speaker C: We always save it for last, because it's the most exciting part of the show.
[01:01:45] Speaker A: All the other listeners have dropped out by now. I was just gonna... I'm just saying, if you've got reservations that you're not using — your old reservations that you bought for however many years — it's really nice to be able to reuse them somewhere in an auto scaling group, if that's the model, instead of...
[01:01:58] Speaker C: Arbitrage against overpaying for RIs. Perfect.
[01:02:01] Speaker A: Yeah.
[01:02:02] Speaker C: Okay.
[01:02:03] Speaker A: Or if there's a spot market, you know, maybe you can get a particular type of instance but not a different type at any moment in time. It's kind of... I like the flexibility, but yeah.
[01:02:14] Speaker D: But then, back to this whole article, back to the actual article that brought us down this path of the differences between all these: you have to be at a fault domain count of one, which means you already have to have double the capacity of what you need. So if you are in a multi-zonal setup and you've told it to launch, you already have your overprovisioning. Now, I believe — and this is where I'm not 100% sure — I think the fault domain count means that you have to have a full other set, which means you're double all of that, and then I just get confused.
[01:02:43] Speaker B: See, every time he gets about halfway through, I'm like, just make the pain stop, Matt. Just...
[01:02:47] Speaker C: Yeah, yeah, lots of booze. Lots of booze. Okay, moving on to the other stories; gonna make you drink. Cyber attacks are... yeah, well, it gets better. Cyber attacks are becoming more frequent, and so Microsoft is now forcing you to use MFA for all Azure sign-ins as part of their $20 billion investment in security. Starting in October, MFA will be required to sign into the Azure portal, the Microsoft Entra admin center, and the Intune admin center, which — good, good, I'm all good there. Then we get to the bad part. In early 2025, gradual enforcement of MFA for sign-in will occur for the Azure CLI — okay — Azure PowerShell — that's the CLI, but whatever — the Azure mobile app — okay, fine — and infrastructure as code tools. And that's where I lost it. I was like, no, that's gonna be terrible. So now all your infrastructure as code tools will also require MFA at some point in early 2025, which will cause you to drink.
[01:03:37] Speaker D: Have you crossed this bridge yet, of how we're gonna have to deal with that one?
[01:03:40] Speaker C: You don't have to deal with it yet. You can wait. You have months to go — it's six months away — so you can figure it out.
[01:03:44] Speaker A: Then presumably there are other options than OTP on the phone or Microsoft Authenticator for a second factor in this, for automation — like a key of some kind, or a passkey, or whatever.
[01:03:58] Speaker B: Yeah, I don't know.
[01:03:59] Speaker C: At the day job, they turned on this MFA thing, and it's driving me crazy.
[01:04:03] Speaker B: It is pretty bad, but yeah. Now, hopefully it's like an OAuth model, and you just sort of extend the length of the... the...
[01:04:14] Speaker C: I assume that there's, like — you can use keys, like API keys or something.
[01:04:18] Speaker B: Those would typically use API keys already, right, to authenticate. And so this would have to be another factor, and so maybe you'll have to re-authenticate every so often.
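On the automation side, the usual escape hatch is the one Matthew lands on next: don't authenticate as a user at all, and run the pipeline under a managed identity or service principal, which isn't subject to interactive user MFA prompts. A minimal Python sketch with azure-identity (the subscription ID is hypothetical):

```python
# Minimal sketch of non-interactive Azure auth for automation, so CI and
# drift-detection jobs are not hit by interactive user MFA prompts.
# DefaultAzureCredential picks up a managed identity on Azure-hosted
# runners, or service-principal environment variables elsewhere
# (AZURE_TENANT_ID, AZURE_CLIENT_ID, AZURE_CLIENT_SECRET).
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

credential = DefaultAzureCredential()

# Hypothetical subscription ID; listing resource groups as a smoke test.
client = ResourceManagementClient(
    credential, "00000000-0000-0000-0000-000000000000"
)
for rg in client.resource_groups.list():
    print(rg.name)
```

Hosted runners on Azure pick up the managed identity automatically; anywhere else, the service principal credentials come from environment variables, so nothing in the Terraform or script path ever sees an MFA challenge.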
[01:04:29] Speaker C: Or... and that's real great for a Terraform Enterprise that's supposed to be running and doing drift detection and all the other things that these things can do.
[01:04:37] Speaker B: Yeah, no, this is going to be a nightmare, I'm sure.
[01:04:39] Speaker D: Or you just run your worker nodes inside and use the — whatever they call it — service principal, which is like an IAM role, to handle the authentication for you, which definitely works great with Atlantis.
[01:04:52] Speaker C: Yeah, that's an option. Well, good. We'll look forward to complaining about this in February or so.
[01:04:57] Speaker D: Yeah. Then I might yell at you for showing me this job.
[01:05:03] Speaker B: Going to demand the booze.
[01:05:06] Speaker C: Well, Azure has a bunch of updates to the Phi model, streamlined RAG, and customized generative AI models. We'll look at these kind of quickly, because we're over AI for the day. Improvements to the Phi family of models include the new mixture-of-experts model plus 20 additional languages. The AI21 Jamba 1.5 Large and Jamba 1.5 models on Azure AI models as a service are available. There's integrated vectorization in Azure AI Search to create a streamlined retrieval-augmented generation pipeline with integrated data prep and embedding; the customized generative extraction model in Azure AI Document Intelligence, so you can now extract custom fields from unstructured documents with high accuracy; and the general availability of text-to-speech avatar, a capability of the Azure AI Speech service which brings natural-sounding voices and photorealistic avatars to life across diverse languages and voices, enhancing customer engagement and overall experience. I hope they're blue. Plus GA of the VS Code extension for Azure Machine Learning, and the general availability of the conversational PII detection service in Azure AI Language.
[01:06:09] Speaker A: Man, I really want the big Clippy back.
[01:06:13] Speaker C: Is it gonna be blue, but a photorealistic Clippy?
[01:06:16] Speaker B: Yep, that's what I want.
[01:06:18] Speaker C: With a tail, like Avatar. The Cloud...
[01:06:21] Speaker D: ...Pod Turns Blue should have been our title.
[01:06:25] Speaker C: That would be a deep cut. I don't know if people got that far.
[01:06:27] Speaker B: Goes blue, you know — even deeper cut, referencing foul language in comedy.
[01:06:34] Speaker C: All right, great. That's it for this week. Jonathan, anything you want to add on any topic we talked about?
[01:06:42] Speaker A: Sure... I could scroll through. No, I don't think so.
[01:06:45] Speaker C: Fantastic. All right, well, next week we will talk to you here in the cloud. Have a great weekend.
[01:06:51] Speaker D: See you later. Hey, everyone.
[01:06:52] Speaker B: Bye, everybody.
[01:06:56] Speaker C: And that is the week in cloud. Check out our website, the home of The Cloud Pod, where you can join our newsletter and Slack team, send feedback, or ask [email protected], or tweet us with the hashtag #thecloudpod.
