[00:00:08] Speaker B: Welcome to the Cloud Pod, where the forecast is always cloudy. We talk weekly about all things AWS, GCP, and Azure.
[00:00:14] Speaker C: We are your hosts, Justin, Jonathan, Ryan and Matthew.
[00:00:18] Speaker D: Episode 327 recorded for October 21, 2025. AWS finally admits Kubernetes is hard, but makes robots do it instead. Good evening, Matt and Ryan. How you guys doing?
[00:00:30] Speaker C: Doing good, dude.
[00:00:31] Speaker A: How are you?
[00:00:33] Speaker D: Yes, it's been an interesting week in the cloud as Amazon was on fire yesterday. Yeah, the entire Internet was down. At least it felt like the whole Internet was down, except for me because I was on GCP, so.
[00:00:44] Speaker A: And I was on Azure for once.
[00:00:46] Speaker D: Small victories. Yeah.
[00:00:50] Speaker C: Does it make the rest of the year worth it? Probably not, but at least we get.
[00:00:54] Speaker A: This one for 18 hours, 16 hours or so. It was good times.
[00:00:59] Speaker D: Yeah, we definitely will talk about that today, but let's hit some follow up first and then we'll get into the fun of Amazon's Hug Ops scenario yesterday.
First of all, last week Ryan and I talked about the Glacier deprecation, or actually a bunch of deprecations that Amazon had announced.
And then right after we recorded the show, the next morning, I got an email clarifying something about Glacier. We were a little bit surprised Glacier was in there, and we had talked a little bit about this potentially being just the standalone Glacier service. The email basically confirmed that: the standalone Amazon Glacier service will stop accepting new customers as of December 15, but the S3 Glacier storage classes, which include Instant Retrieval, Flexible Retrieval, and Deep Archive, are completely unaffected, continue normally, and are basically part of the S3 service as a whole at this point. Existing Glacier customers can use it forever. Apparently there is no forced migration required, at least not announced yet. I mean, forever is only until the next press release.
And basically, you know, Amazon's essentially consolidating around S3 as a unified storage platform, and the standalone service will be here getting no bug fixes. So yeah, enjoy. You will actually potentially get some cost savings if you move to the S3 Glacier storage classes; they're slightly cheaper than the standalone Glacier service. I was looking at some of the pricing charts, and that made me email Synology, since I use Glacier to back up my Synology at my house. And I'm like, hey, are you going to do something about this? And they're like, we'll get back to you.
[00:02:19] Speaker C: Oh, wow.
[00:02:21] Speaker D: So I'm like, well that's good, that's a good sign. But yeah. So there you go, Ryan, clarification and kind of what we assume, but I just thought we should officially say that's what they said.
[00:02:29] Speaker C: Yeah, that's cool.
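For anyone scripting a similar move, here's a minimal sketch of steering backups onto the S3 Glacier storage classes with boto3; the bucket name and prefix are hypothetical, and it assumes AWS credentials are already configured:

```python
import boto3

s3 = boto3.client("s3")

# Lifecycle rule: transition objects under backups/ to the S3 Glacier
# Deep Archive storage class after 30 days, instead of relying on the
# standalone Glacier service.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-backup-bucket",  # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-backups",
                "Status": "Enabled",
                "Filter": {"Prefix": "backups/"},
                "Transitions": [{"Days": 30, "StorageClass": "DEEP_ARCHIVE"}],
            }
        ]
    },
)
```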
[00:02:33] Speaker D: All right. General news. F5 disclosed that nation-state hackers maintained persistent access to their internal systems over the summer of 2024, stealing portions of the BIG-IP source code and vulnerability details before containment in August.
The breach compromised product development and engineering systems, but did not affect customer CRM data, financial systems, or the F5 software supply chain, according to independent security audits. F5 has released security patches for BIG-IP, F5OS, and BIG-IP Next products and is providing threat hunting guides to help customers monitor for suspicious activity.
This is the first publicly disclosed breach of F5's internal systems, notable given that F5 handles traffic for 80% of the Fortune Global 500 companies through their load balancing and security services.
The incident highlights supply chain security concerns: the attackers targeted source code and vulnerability information rather than customer data, potentially looking for ways to break into your F5 products. So a little concerning, this one, mostly as F5 is everywhere, and they also own NGINX and a couple other things. And if they're in the source code of F5, a company that has a lot of money for security, make sure you're also protecting your source code at your company as well.
[00:03:40] Speaker C: Yeah, I mean, the thing that scares me the most is the persistent access over, you know, a summer, which says to me like months.
That's kind of spooky to me. And like, why wasn't that detected? And that's a long time to mess around inside the development environment.
And you know, how are they verifying that nothing was breached or inserted into the CI/CD pipelines?
[00:04:06] Speaker A: I mean, there's also the fact that this happened in 2024. So they're announcing it over a year later, which also feels kind of scary.
[00:04:17] Speaker C: I mean, especially if they only detected it right now.
[00:04:20] Speaker D: And so in the article it does talk about how they were asked by US entities not to disclose until now because of the risk to their business. They were asked by the Department of Justice to do that.
[00:04:32] Speaker C: Okay, well, that's understandable.
[00:04:34] Speaker A: Oh, I read that wrong when I read the article. Got it.
Yeah, I read it as the Department of Justice asking them to disclose it now. I must have inverted it in my head.
[00:04:44] Speaker C: Yeah, that makes sense.
[00:04:45] Speaker D: They now are allowed to announce it. They were not allowed to announce it last year when they confirmed it. So definitely very concerning.
[00:04:52] Speaker A: For sure.
[00:04:56] Speaker D: All right. AI is how ML makes money. This week, Claude Code finally gets a web version. Yay. But it's the new sandboxing feature that really matters, says Ars Technica. Anthropic has launched web and mobile interfaces for Claude Code, the CLI-based AI coding system, with the web version supporting direct GitHub repository access and the ability to process general instructions like "add real-time inventory tracking to the dashboard." The web interface introduces multi-session support, allowing developers to run and switch between multiple coding sessions simultaneously through a left side panel, the ability to provide mid-task corrections without canceling and restarting, and a new sandbox runtime implemented to improve security and reduce friction, moving away from the previous approach where Claude Code required permission for most changes and steps during execution. The mobile version is currently limited to iOS and is in an earlier development stage compared to the web interface, indicating a phased rollout approach. This positions Claude Code as a more accessible alternative to traditional CLI-only AI coding tools. And I'm looking forward to playing with this, although I've been busy this week dealing with other things, so hopefully by next week I'll get to play with it.
[00:05:53] Speaker C: Yeah, I haven't had a chance to play with the web version, but I am interested in it just because I found the terminal interface limiting. But I also feel like a lot of the value is in that local sort of execution and not in the sandbox, because a lot of the tasks I do are internal and require access to company resources or private networks or that kind of thing, where you're not going to get that from a publicly hosted sandbox environment.
[00:06:21] Speaker A: Yeah, I was gonna say, I find it useful also. Like, everything I do, I always say, okay, build it in a Docker container, so I know it will run elsewhere too.
So once you move it somewhere else, I don't know. I haven't played with it that much.
I did open it up, coincidentally, by accident, because I didn't realize it was new. I just don't think you can do that whole containerized workflow process in it.
[00:06:47] Speaker C: I mean, it seems like it's still interfacing with your code repository, and therefore it could be part of your CI/CD pipelines that build containers.
[00:06:56] Speaker A: Just not. Yeah, but if I'm testing something locally, like if I'm building a web app or something like that, which is what I've been on a roll with recently.
[00:07:03] Speaker D: A lot of people are moving away from local development anyway, so they need on-server development or they need other things. So maybe this solves those issues. But there have also been times, because I use Claude Code quite a bit, where someone will call me and be like, hey, this is broken. Or like, you guys tell me Bolt's not doing something right, and I'm like, oh crap, I'm not anywhere near my laptop, where I know exactly what I could fix with Claude. And you guys would just write the code yourselves. But, you know, that's a whole different conversation.
But, you know, I could just hop into the web interface potentially and do a quick Bolt fix if I needed to, for our show notes bot, for example.
[00:07:36] Speaker C: Yeah, no, that's a pretty powerful option for that. Especially, you know, being able to basically bug fix in public. You think about on-call and things like that, where you're no longer working directly in the dev environment with all the special access and all the special tools; you're just doing natural language prompts to fix things.
[00:08:00] Speaker A: I mean, it also makes it more accessible to, you know, let's say a junior person, or potentially someone in your support org. Hey, there's typos. We've typo'd the word developer in 27 places and we have three tickets for it. Can a junior person, or someone that doesn't really have full access, handle it this way versus having to do local development for it?
[00:08:27] Speaker D: Microsoft's Containerization Assist automates the tedious process of creating Dockerfiles and Kubernetes manifests, eliminating the manual errors that plague developers during the containerization process. Built on the proven foundation of AKS Draft, this open source tool goes beyond basic AI coding assistance by providing a complete containerization platform rather than just code suggestions. The tool addresses a critical pain point where developers waste hours writing boilerplate container configurations and debugging deployment issues caused by manual mistakes.
I mean, can I take a quick side note here? That's because you guys all chose Kubernetes. If you'd chosen a platform as a service, you wouldn't have had to do any of that. That's what a platform as a service was supposed to do for all of you.
[00:09:04] Speaker C: No, no, no.
[00:09:04] Speaker D: Kubernetes is the future. We don't need any Lambda or anything else like that. No, no, this is on your own. This is on you. This is on all of you.
As an open source MCP server, it integrates seamlessly with existing development workflows while leveraging Microsoft's containerization expertise from Azure Kubernetes Service. "Expertise" is a stretch. The launch signals Microsoft's commitment to simplifying Kubernetes adoption by removing the steep learning curve associated with container orchestration and manifest creation. Or you could just use a PaaS.
[00:09:29] Speaker C: Yeah. Yeah. Enjoy your 17 nested layers of YAML documents.
[00:09:36] Speaker A: The piece I did like about this is that it integrates, as an optional feature, Trivy and the security scanning. So it's not just setting things up; they integrated the next steps of security code scanning. So it's not Microsoft saying, hey, this is standard, you don't get security until premium. Which, this isn't a service, it's a Git repository, but they are building security in. So hopefully, as things grow, this becomes more of a trend for Microsoft, that you don't pay extra for security. But maybe it's just a pipe dream of mine.
[00:10:13] Speaker C: I mean, Trivy is open source, so it's like they're setting it up for you, but it was already free.
[00:10:19] Speaker A: Yeah, no, I know, but I guess it's more showing that they're thinking about it at day one.
[00:10:24] Speaker C: Yeah, no, and I do think it's cool. I do think having that sort of thing built into the pipeline by default is neat. You're seeing it more and more; you see it directly on Docker Hub, for example. And there are just so many vulnerabilities.
I didn't realize this, coming from sort of an OS and server background, where vulnerability management to me is easy: just rebuild the thing. Figuring out which layer a vulnerability is in kind of came second nature. But I think if you don't come from that background, it sort of feels a little bit like black magic, and you have zero idea how to remediate whatever vulnerability it is.
[00:10:59] Speaker C: So I do think that having this in there is better, and hopefully it educates.
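Since Trivy keeps coming up: for anyone who hasn't touched it, this is roughly all the tool is wiring in for you. A hedged sketch that shells out to Trivy and lists the serious findings; the image name is just an example:

```python
import json
import subprocess

# Scan an image with Trivy (the open source scanner being wired in here)
# and print the HIGH/CRITICAL findings.
out = subprocess.run(
    ["trivy", "image", "--format", "json", "--severity", "HIGH,CRITICAL", "nginx:latest"],
    capture_output=True,
    text=True,
    check=True,
)
report = json.loads(out.stdout)
for result in report.get("Results", []):
    for vuln in result.get("Vulnerabilities") or []:
        # Each finding names the vulnerable package, which helps pin down
        # the layer it came from.
        print(vuln["VulnerabilityID"], vuln["PkgName"], vuln["Severity"])
```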
[00:11:05] Speaker A: I just tell Claude once a week to update all my versions and fix any bugs it has.
[00:11:11] Speaker D: Fix all bugs.
[00:11:13] Speaker C: Cron job perfect.
[00:11:15] Speaker D: See, what can go wrong? Yeah, I do like that, you know, looking at this repo, the other clouds are actually first-class citizens. There's actually, you know, here's how to authenticate against EKS and GKE, and it's all actually in here, where a lot of companies will say, oh, this is multi-cloud, and then you actually go look at it and, yeah, batteries not included for multi-cloud.
[00:11:36] Speaker C: Multi-cloud: do it yourself, use your one cloud. Yeah, yeah, it happens to be ours.
[00:11:40] Speaker D: You know, it's nice that they're at least trying something.
All right. Moving on to Clouds Tools, Harness had a blog post that just mostly annoyed me, which is why we're talking about it.
Basically, the title of this was "Infrastructure as Code is great, but have you heard of Infrastructure as Code Management?"
Wow.
Infrastructure as Code Management, or IaCM for short, supposedly extends traditional infrastructure as code by adding lifecycle management capabilities, including state management, policy enforcement, and drift detection, to handle the complexity of infrastructure at scale. Key features apparently include centralized state file management with version control, module and provider registries for reusable components, and automated policy enforcement to ensure compliance without slowing down teams. The platform integrates directly with your CI/CD workflow, with visual PR insights showing cost estimates and infrastructure changes. And IaCM addresses critical pain points like config drift, secret exposure in state files, and resource conflicts when multiple teams work on the same infrastructure simultaneously. It also supports OpenTofu and Terraform, with features like variable sets, workspace templates, and default pipelines. So let me boil this down for you.
We created our own Terraform Enterprise or Terraform Cloud, but we can't use that name because it's trademarked. So we're going to create a new thing, pretend we invented it, and then try to sell it to you as our new Terraform or OpenTofu replacement for your management tier. Thanks. Thanks for that, Harness. I appreciate it.
[00:13:08] Speaker C: It was funny, because I read this article and I got just as frustrated, and I was really worried that this was something that was put in the show because it was supposedly this really cool new thing. And I'm like, wait, no, this is just how you're supposed to do it. This is what infrastructure as code has been forever. You just didn't manage it correctly, so now you're trying to invent a new thing to sell? Like, no.
[00:13:28] Speaker D: Thank you.
[00:13:29] Speaker A: Welcome to the sales and marketing team.
[00:13:31] Speaker D: I mean, I'm sort of offended that you would have assumed I thought this was a new thing. Like, I don't know.
[00:13:38] Speaker C: I wasn't calling out names. You know, it's mostly blaming the AI bot now.
[00:13:41] Speaker A: And I thought we were blaming Jonathan.
[00:13:44] Speaker D: I did submit this article, so Jonathan's not here. You can blame him.
Yeah, no, I mostly submitted this because I was super annoyed about it.
[00:13:53] Speaker C: It is. It's just super frustrating.
[00:13:55] Speaker D: These are all things that have existed.
[00:13:56] Speaker C: And there have been tools and ways to do this before, dedicated platforms. There are definitely ways to integrate all of these things into your CI/CD pipelines. And yeah, these are definitely things you need to address. You do need drift detection, you do need state management in a way where multiple teams can interact with it separately, and you need a place for reusable components and standards. But none of this is new; some of it is decades old. Thanks, Harness.
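And to that point, the "drift detection" being rebranded here has been a few lines of scripting against plain Terraform for years. A minimal sketch, assuming terraform is on the PATH and the working directory is already initialized:

```python
import subprocess

# terraform plan -detailed-exitcode exits 0 (no drift), 1 (error), 2 (drift).
result = subprocess.run(
    ["terraform", "plan", "-detailed-exitcode", "-no-color"],
    capture_output=True,
    text=True,
)
if result.returncode == 2:
    print("Drift detected:\n" + result.stdout)
elif result.returncode == 1:
    print("Plan failed:\n" + result.stderr)
else:
    print("No drift.")
```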
[00:14:26] Speaker D: Yeah, well, Matt said that we had to create a new section today called HugOps Corner to Hug our friendly ops fellows at Amazon who had a really bad day on Monday here this week, October 20th.
So basically, for those of you who were not trying to use the Internet on Monday, AWS US East 1 experienced a major outage, which is what it's known for, starting around midnight Pacific on Monday, caused by DNS resolution failures for DynamoDB that prevented proper address lookups for the database service, impacting thousands of applications.
Facebook, Snapchat, Coinbase, ChatGPT and Amazon's own services.
I mean, it's always DNS, number one. And then, you know, this isn't official; we have not seen an official RCA, so this is speculative. They basically said it was DNS in their status updates, but they haven't given us the root cause yet. So everything we're talking about today is speculative.
But you know, first of all, the amount of press that this outage has gotten is crazy to me.
[00:15:23] Speaker A: Insane.
[00:15:24] Speaker D: It's like everyone forgot about when US East 1 used to fall over like every other month. I mean, I realize it's had a really long streak of not having those problems, which is appreciated. But this isn't a new thing. US East 1 was called US Tire Fire 1 for a reason for a long time by a lot of us, and we always joked that friends don't let friends start an Amazon project in US East 1 for this exact reason. So it's sort of interesting. Now, there are a couple of things I would point out, and we link to a bunch of articles; we won't go through all of them. But there are some things I noticed with this particular one. Number one, it took a long time for them to get this thing moving in a positive direction. I happened to be awake because I couldn't sleep, and so I saw people starting to say, hey, Amazon's having a problem. And I kind of ignored it, went to bed, and nothing. My pager didn't go off. So it was a win-win.
But, you know, it took a long time for them to resolve it; it was hours before they had DNS restored. Then they had all kinds of thundering herd problems and all the traditional issues you see with Amazon when they're trying to recover thousands of people's sites. And the one thing that kind of struck me was, well, is it because they're out of practice? US East 1 hasn't crapped out in a while, so did they lose some muscle memory? Or is this really a byproduct of the fact that Amazon's talent retention has been terrible as of late, between the layoffs they've been doing, forcing RTO, and the number of people who have left Amazon in the last few years? Is this a reflection of the reality that when you let smart people go, sometimes things don't end up as well as you'd like them to? And is this going to be the new norm, where we're just not seeing the same caliber from AWS that we were used to in the past? And is this a wake-up call in some ways to Amazon that maybe what you're doing isn't the right thing?
[00:17:19] Speaker C: Yeah, I mean, I'm not one for heroes, and so I hope the earlier resolutions weren't just because people were throwing themselves on the tracks getting these things resolved. But if it's a DNS resolution issue that's causing sort of a global outage, that's not exactly straightforward. It's not just a bug, a function returning the wrong value. You're looking at global propagation, you're looking at clients in different places resolving different things, some of the base parts of the Internet's functionality. And so it does take a pretty experienced engineer to have all that in their head conceptually in order to troubleshoot it.
So, you know, I wonder if that's really the cause of why they weren't able to recover as fast. But I also feel like cloud computing has come a long way, and the impact was very widely felt because a lot more people are using AWS as their hosting provider than in the past.
A little bit of everything, I think.
[00:18:30] Speaker A: Yeah, I mean, I definitely think it's a little bit of everything. I think it was interesting how, in the initial status updates, they kind of made it sound like they thought they were good at one point, and then it kind of went south again on them. They were like, we're seeing recovery, we're seeing recovery, we're... hold on, we're not. So it almost feels like there were multiple internal issues that happened; they fixed one, and thundering herd and other issues kind of caused other services, other internally managed platforms, to fail over. And I don't think, unless you're under an NDA, in which case we probably won't talk about it, we'll ever get that level of detail publicly.
But it kind of felt like multiple issues that together caused the global problem.
I think the larger scale issue is that they still run a lot of these core services out of US East 1, and so many of them are based there. IAM is still based there, CloudFront is based there.
Can they start to figure out how to move those to other regions, so that if a single region, aka US East 1, goes down, 37 other things that are global services don't go down because of it? And I know some of them they can't. But that's, I think, kind of the bigger issue. And they've tried to mitigate it some. Was it over the summer, or last year? Time is skewed in life right now. They recommended, and adjusted the default, for DNS to resolve IAM, or was it STS, into each region, though they still left the default one pointing there. So they're trying to slowly do that, but they're literally flying 17 jumbo jets at once, trying to change the engines without landing.
You know, it's not like they can have a maintenance window for US East 1, like, we're going to take it down for two hours. The world would revolt. So I think there are probably multiple issues under the hood, and I think they are tracking in the right direction, but this will probably help expedite some of the fixes. Just like, was it 2017 or 2016, when S3 went down and took everything down with it? Core services have big impacts.
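For what it's worth, the STS default Matt is describing is a one-line fix in the SDKs today; a sketch with boto3, where the region choice is yours:

```python
import os

import boto3

# Ask the SDK for regional STS endpoints instead of the legacy global one,
# which lives in us-east-1 and goes down with it.
os.environ["AWS_STS_REGIONAL_ENDPOINTS"] = "regional"

sts = boto3.client("sts", region_name="us-west-2")
print(sts.get_caller_identity()["Account"])
```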
[00:20:50] Speaker D: Yep. I mean, it'll definitely be interesting to see if this is the beginning of a trend of instability in the Amazon ecosystem. Again, if you followed best cloud practices, you were multi-region, you designed for failure, you were not really that impacted by this. There definitely were things; logging into the console might have been problematic for you. But if you had servers running in other regions, traffic was routed there as it should be, and things worked normally. And there are still a lot of companies who don't invest in DR, or in getting outside of US East 1 in a serious way, and then they're beholden to these issues. I'd also say, welcome to October. We're a month and change away from re:Invent. Things start getting rolled out at this time of year in the background, and so there are a lot more things getting pushed out that are new or potentially have issues or bugs that could take down the control plane at Amazon. So it'll be interesting to see whether we see a trend in a negative direction, or whether this is just a fluke; big things happen sometimes, and you deal with it, you recover, they take the lessons learned, and then everything goes back to normal. Or is this really the beginning of a trend that won't be great? My feeling is Amazon still has a lot of smart people, even though they've let a lot go or a lot of them have left, and they have a lot of systems that help control and keep their systems working. I imagine their RCA will hopefully be enlightening and will tell us what happened and what they're doing to fix it, and we'll feel better about that. But we'll keep you posted as we wait to hear from them what's going on.
But it's just so funny how the media jumped all over this one. They really did.
[00:22:26] Speaker A: Well, it was just such a widespread outage, you know, like, I mean, every.
[00:22:30] Speaker D: US East 1 outage has been widespread. It's just that we haven't had one in a while.
[00:22:34] Speaker A: Right.
[00:22:35] Speaker D: And it was a slow news day, I guess, or they needed a distraction from, you know, Trump's AI videos. I don't know what was going on, but every news outlet jumped on it, because I had people pinging me like, hey, are you impacted by the Amazon outage today? I'm like, how'd you hear about it? Oh, it's at the top of the Wall Street Journal. Okay. Yeah, nope, I wasn't, thank goodness. In prior lives I would have been. But luckily my Amazon ecosystem, The Cloud Pod, was not impacted, because we run in US West 2. So, you know, you're welcome.
[00:23:03] Speaker A: Oh, come on. You don't want to try US West 1 and just pay the extra 10%?
[00:23:06] Speaker D: No, I don't. I don't.
It also doesn't have any of the services that we actually use to run a website, so. Yeah, minor problems, US West 1.
[00:23:15] Speaker A: Details.
[00:23:16] Speaker D: Yeah, details. All right, well, exiting Hugops for AWS right into AWS. Let's talk about the new things that they launched that might have broken it. We'll find out.
First up, apparently they did not use this tool to help prevent the outage. Or maybe they did, I don't know.
EC2 Capacity Manager is a new service that provides a single dashboard to monitor and manage EC2 capacity across all accounts and regions, eliminating the need to collect data from multiple AWS sources like Cost and Usage Reports, CloudWatch, and EC2 APIs, and it's available to you at no additional cost in all commercial regions. The service aggregates capacity data with hourly refresh rates for on-demand instances, spot instances, and capacity reservations, displaying utilization metrics by vCPU, instance count, or estimated cost based on published on-demand rates.
Key features include automated identification of underutilized capacity reservations with specific utilization percentages by instance type and availability zone, plus direct modification capabilities for ODCRs within the same account.
Data exports to S3 extend analytics beyond the 90-day console retention period, enabling long-term capacity trend analysis and integration with existing BI tools or custom reporting systems. And organizations can enable cross-account visibility through, of course, AWS Organizations, helping identify optimization opportunities, like redistributing reservations between development accounts showing 30% utilization and production accounts exceeding 95% utilization, for example.
[00:24:35] Speaker C: Yeah, I mean, hooray. These are the types of features where they're trying to catch up with the separation between accounts, and how companies have adopted a multi-account strategy for a number of reasons, which made stuff like capacity management really difficult. It's kind of these little islands of data, and it's been a challenge to get it all parsed in one place where you can derive actual metrics out of it. So it's kind of nice to have it built in and just have it be plug and play; you just turn it on. Especially when it's at no cost. I love it.
[00:25:17] Speaker A: No cost, with exports, which is pretty nice too.
And day one support for Organizations is, like you said, the big one, because I know I've written several scripts for myself that go cross-account, or in Azure cross-tenant, cross-subscription, to pull that same data.
[00:25:37] Speaker C: Yep.
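For reference, the kind of per-account, per-region script Matt is describing, which Capacity Manager now replaces, looks something like this with boto3 (single account, single region):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# The old way: poll each account/region yourself for ODCR utilization.
for cr in ec2.describe_capacity_reservations()["CapacityReservations"]:
    total = cr["TotalInstanceCount"]
    used = total - cr["AvailableInstanceCount"]
    pct = 100 * used / total if total else 0.0
    print(f"{cr['CapacityReservationId']} {cr['InstanceType']}: {pct:.0f}% utilized")
```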
[00:25:39] Speaker D: Next up, EKS Auto Mode now supports EC2 on-demand capacity reservations and capacity blocks for ML, allowing customers to target pre-purchased capacity for AI and ML workloads requiring guaranteed access to specialized instances like the P5s. This addresses the challenge of GPU availability for training jobs without over-provisioning your infrastructure. New networking capabilities include separate pod subnets for isolating infrastructure and application traffic, explicit public IP control for enterprise security compliance, and forward proxy support with custom certificate bundles. These features enable integration with existing enterprise network architectures without complex CNI customizations. Thank you. Complete AWS KMS encryption now covers both ephemeral storage and root volumes using customer managed keys, addressing security audit findings that previously flagged unencrypted storage. Performance improvements include multi-threaded node filtering and intelligent capacity management that can automatically relax instance diversity constraints during capacity shortages. EKS Auto Mode is available for new clusters, or can be enabled on existing clusters running Kubernetes 1.29 plus, with migration guides available for teams moving from managed node groups, Karpenter, or Fargate. Pricing follows standard EKS pricing at $0.10 per cluster per hour plus the EC2 instance cost.
So nice to see those new features coming to EKS Auto mode this week.
[00:26:51] Speaker C: Yeah, I mean, this just highlights how terrible it was before, managing your own compute layer, custom certificate bundles. It's been a while since I've had to do plumbing at that layer, especially for EKS, which is supposed to be a managed Kubernetes platform. That was rough. So I'm glad they fixed this. I'm sure customers were clamoring for it.
Torches and pitchforks, I imagine.
[00:27:17] Speaker A: There are definitely times where I'm like, are people going to know how all these things work under the hood when you have to go debug something that's gone wrong? So I always kind of wonder, as we build these higher and higher level services, do people forget that? But again, it cleans up the toil and lets you solve the larger scale problems too.
[00:27:40] Speaker C: I mean, if I never had to learn how to create a custom certificate bundle and distribute it to a wide fleet of compute nodes, I'd be okay with that. Yeah, really, no problems whatsoever.
[00:27:50] Speaker A: I mean, it also goes back to, I was talking to somebody earlier today, because I was generating an SSL cert, since Azure doesn't have a managed SSL option for App Gateways. And I was like, oh yeah, remember when you had to upload your SSL cert to IAM and then attach it? And now you just have ACM. So, you know, I guess you're right. It doesn't really matter if you don't ever have to deal with it.
[00:28:20] Speaker C: There's always going to be new problems. You know, that's the way I look at it. So I don't want to keep fixing the old ones.
[00:28:26] Speaker D: I mean, I'd rather have those new problems. Or you know what you could do, Ryan?
You could use Platform as a service. Yes, yes, I could.
[00:28:35] Speaker C: This is platform as a service, which is the rub.
[00:28:37] Speaker D: I mean, sort of. It's still Kubernetes. You're still defining pods.
[00:28:40] Speaker C: Still Kubernetes.
[00:28:41] Speaker D: Yeah, you're still defining pods and services. So yeah. Let's move on to Amazon. EC2 now supports optimizing your CPU for license included instances.
This allows customers to reduce vCPU counts and disable hyperthreading on Windows Server and SQL Server license-included instances, enabling up to 50% savings on vCPU-based license costs while maintaining full memory and IOPS performance. This feature targets database workloads that need high memory and IOPS but fewer vCPUs. For example, an r7i.8xlarge instance can be reduced from 32 to 16 vCPUs while keeping its 256 gigs of memory and 40,000 IOPS. The CPU optimization extends EC2's existing Optimize CPUs feature to license-included instances, addressing a common pain point where customers overpay for Microsoft licensing due to fixed vCPU counts.
Available to you now in all commercial regions and GovCloud regions, at no additional charge.
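Concretely, this builds on the existing Optimize CPUs knob at launch time. The announcement's example, 32 vCPUs down to 16 with hyperthreading off, looks roughly like this; the AMI ID is hypothetical:

```python
import boto3

ec2 = boto3.client("ec2")

# r7i.8xlarge is normally 16 cores x 2 threads = 32 vCPUs. Dropping to one
# thread per core keeps the full 256 GiB of memory but halves the vCPUs
# that SQL Server is licensed against.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical license-included AMI
    InstanceType="r7i.8xlarge",
    MinCount=1,
    MaxCount=1,
    CpuOptions={"CoreCount": 16, "ThreadsPerCore": 1},
)
```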
So this is a little weird to me because I thought this already existed.
[00:29:36] Speaker C: It's because they're announcing it wrong. This is custom shapes; they're just not calling it custom shapes.
[00:29:43] Speaker D: Right.
[00:29:44] Speaker C: So you used to have to, you know, basically... well, I don't know if you're paying for the entire r7i.8xlarge and all the compute that goes along with it, or if you actually get some sort of price change when you reduce the vCPUs running on it. I don't know, but it's been a challenge, because you could always do it at the Windows OS layer, but I don't know if you could do it at, like, the sort of...
[00:30:07] Speaker A: Of console. You can turn off hyperthreading at the Windows OS layer? I thought that was like a BIOS-level thing.
[00:30:12] Speaker C: No, but you can. In Microsoft SQL Server, for instance, you can configure the CPUs that it runs on so that you're not paying for all of them. But they'd still be accessible by the host OS.
[00:30:25] Speaker A: And then, just to go back to Justin's comment, just use a managed service. Go use RDS. I understand there are many issues with RDS and Microsoft SQL, but just use RDS.
I think we have a new theme of the show; we probably should have made a title around "select your managed service."
[00:30:44] Speaker C: Yeah, I mean, it won't be any surprise to our listeners that we're big fans of managed services and not doing any real work.
[00:30:55] Speaker D: Speaking of managed services, let's talk about patching and managed services too.
[00:31:00] Speaker C: Too soon for me. This is my day job right now.
[00:31:03] Speaker A: I know.
[00:31:04] Speaker D: AWS Systems Manager Patch Manager now includes an "available security update" state that identifies Windows security patches that are available but not yet approved by patch baseline rules, helping prevent accidental security exposure from delayed patch approvals. This feature addresses a specific operational risk where administrators using approval delays with extended time frames could unknowingly leave systems vulnerable, with instances marked as non-compliant by default when security updates are pending. It's available across all AWS Systems Manager regions with no additional charges beyond standard pricing, and the feature integrates directly into existing patch baseline configurations.
So yeah, this makes sense, because if you are doing delayed approvals, you still maybe want your security patches sooner, and you don't want them to be out of compliance for security reasons. So that's a nice quality of life improvement.
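If you script against Patch Manager, a rough sketch of checking for the new state with boto3 follows; the exact State filter value is an assumption based on the announcement's wording, so treat it as illustrative:

```python
import boto3

ssm = boto3.client("ssm")

# List patches sitting in the new "available security update" state, i.e.
# security-relevant but not yet approved due to the baseline's approval delay.
resp = ssm.describe_instance_patches(
    InstanceId="i-0123456789abcdef0",  # hypothetical instance ID
    Filters=[{"Key": "State", "Values": ["AvailableSecurityUpdate"]}],  # assumed value
)
for patch in resp["Patches"]:
    print(patch["Title"], patch["Classification"], patch["State"])
```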
[00:31:47] Speaker C: Well, I mean, it sounds like just a quality of life improvement, but it's something that should be so basic and wasn't there, right? Windows patch management is cobbled together and not really managed well. You could have a patch available, but the only way to find out it was available, previous to this, was to actually go ahead and patch and then see what it did.
And so now you at least have a signal, and you can apply your patches in a way that's not going to take down your entire service if a patch goes wrong. So this is very nice. I think people using the Systems Manager patch management are going to be very happy with this.
Still really angry at Windows patching in general because that's just the state of life.
[00:32:33] Speaker A: I do like that Windows patching, if you can get it wrangled and you spend the time to build a good process around it, becomes easier. But it's never easy, because you're perpetually having to deal with new changes between Patch Tuesday, the one-off updates that they do out of band, everything else. It's just a pain in the butt, and you have to manage.
[00:32:58] Speaker C: So much. Like, you have to manage infrastructure in order to build a good patch management system with Windows. You can't just do it all with public sources. There are patches that I know are applicable to the OS that are just not available to me unless I stand up a Windows Server Update Services instance and manage that until the end of time.
[00:33:18] Speaker A: I thought they deprecated WSUS.
[00:33:20] Speaker C: I thought they did too. I think they keep saying that but yet it's the only thing that works still.
[00:33:28] Speaker A: Yeah, it is. Here's an article. Yeah.
[00:33:30] Speaker C: And it doesn't have API support and it never will because they've deprecated it.
[00:33:34] Speaker A: WSUS will no longer be developed starting in Windows Server 2025; Microsoft encourages businesses to adopt cloud-based solutions for client and server updates, such as Windows Autopatch, Intune, and Azure Update Manager. Which, don't get me started on Azure Update Manager.
[00:33:50] Speaker C: Yeah, yeah. It's just incredibly frustrating. And the patches I'm referring to specifically are not random; it's the Edge browser.
It's not crazy. And I just could not patch the Edge browser using public sources; I had to point at a WSUS server. It makes no sense to me.
[00:34:11] Speaker D: Yeah, I would have thought this would have gotten better. It just hasn't, and I'm kind of shocked about it still.
[00:34:19] Speaker C: I know. I remember asking you how it could possibly work like this, like a decade ago, and it still works like this.
[00:34:27] Speaker D: I mean, you've heard my rants about AD.
So anyways.
[00:34:34] Speaker B: There are a lot of cloud cost management tools out there, but only Archera provides cloud commitment insurance. It sounds fancy, but it's really simple. Archera gives you the cost savings of a one or three year AWS savings plan with a commitment as short as 30 days.
If you don't use all the cloud resources you've committed to, they will literally put the money back in your bank account to cover the difference. Other cost management tools may say they offer commitment insurance, but remember to ask: will you actually give me my money back? Archera will. Click the link in the show notes to check them out on the AWS Marketplace.
[00:35:14] Speaker D: Well, Amazon's giving us a new Q coordinator. Amazon is introducing CLI Agent Orchestrator, or CAO for short, an open source framework that enables multiple AI-powered CLI tools, like Amazon Q CLI or Claude Code, to act as specialized agents under a supervisor agent, addressing the limitations of single-agent approaches for complex enterprise development projects. CAO uses hierarchical orchestration with tmux session isolation and Model Context Protocol servers to coordinate specialized agents, for example orchestrating architecture, security, performance, and test agents simultaneously during mainframe modernization projects. The framework supports three orchestration patterns, handoff for synchronous transfers, assign for parallel execution, and send message for direct communication, plus cron-like scheduled runs, with all processing occurring locally for security and privacy. It currently supports Amazon Q Developer CLI and Claude Code, with planned expansion to OpenAI Codex, Gemini, Qwen, and Aider. No pricing is mentioned, as it's open source, available on GitHub under AWS Labs as cli-agent-orchestrator. Which, I mean, this is great, except it probably mostly works well with Q, and the other ones probably not so much. But I do appreciate that there's one more tool in the agent orchestration world.
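To make the "assign" pattern concrete: this is not CAO's actual API, just a hedged sketch of the underlying idea, using Claude Code's real non-interactive print mode (claude -p) to fan one task out to two specialized agents in parallel:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_agent(role: str, task: str) -> str:
    # "claude -p" runs Claude Code non-interactively and prints the response.
    out = subprocess.run(
        ["claude", "-p", f"Act as the {role} agent. {task}"],
        capture_output=True,
        text=True,
    )
    return out.stdout


# "Assign" pattern: specialized agents work the same repo simultaneously.
with ThreadPoolExecutor() as pool:
    security, perf = pool.map(
        run_agent,
        ["security", "performance"],
        ["Audit ./src for injection risks.", "Find hot paths in ./src."],
    )
print(security, perf, sep="\n---\n")
```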
[00:36:28] Speaker C: I mean, I wonder, because if you're interfacing with the CLIs, they're all kind of the same thing at that level; there's a common element there. But I also wonder, this seems like a lot of configuration, and a lot of management of that configuration, to make it work. And I'm not sure it would be worth it just to be able to use different tools, as much as I like being able to select different models for different tasks. It seems a little heavy.
[00:36:54] Speaker D: I mean this is how you get those, you know, CRON job security patches or you know, coding fixes. Fix all my bugs.
Use an agent orchestrator like this.
[00:37:02] Speaker C: If I wanted to run it off of my desktop, sure, why not?
But I'm not going to do that.
[00:37:07] Speaker A: Come on, what fun are you?
[00:37:10] Speaker C: I'm not a guy who's going to have critical production processes running on my desktop, or on a computer that's sitting at my desk.
[00:37:18] Speaker D: I mean, you're writing code on your desktop that you push to production. So technically you have production things on your thing.
[00:37:22] Speaker C: Well, I have production code, but the functionality is pushed to the cloud.
[00:37:29] Speaker D: Amazon ECS is now publishing AWS CloudTrail data events for insight into agent API activities.
Basically, this allows you to see ECS agent API activity, enabling detailed monitoring of container instance operations, including polling, telemetry sessions, and managed instance logging. Security operations teams gain comprehensive audit trails to detect unusual access patterns, troubleshoot agent communication issues, and understand how container instance roles are utilized for compliance requirements. The feature uses the new data event resource type AWS::ECS::ContainerInstance and is available for ECS on EC2 in all AWS regions, with ECS Managed Instances support in selected regions.
Standard CloudTrail data event charges apply, and this closes a previous visibility gap in ECS operations, as teams can now track agent-level activities. I mean, I get that this wasn't tracked before, and that's bad, and security wants to know all the things that are happening, but what's the use case, Ryan, as our security guy?
[00:38:26] Speaker C: I know when I was managing ECs like at the hardware layer like running on you know, dedicated compute, this is definitely something that you know like captured in the ECS logs and pushed upstream so that they could make so larger orchestration engines who are more aware of multiple clusters could make decisions based off of it. And so I imagine as things have moved more into like the Fargate model where you don't have that option, the need for this is a little bit more. But it also like, you know, a lot of the reasons why I needed to make those larger orchestration levels is because I was managing things at a compute level and needed to make decisions on scaling and capacity management and things that are sort of part of the managed services when you're using, you know, Fargate, although it can be more expensive. So I mean and this, you know, is definitely something I would use sparingly because the, the ECS API is Agent API is chatty. So this seems like it would be very expensive very fast.
[00:39:25] Speaker A: Yeah, that was mainly my concern when I was reading about this: the put-system-log-event type activity.
That feels like it's going to fire very often, and you'll see it millions of times. The other thing is it's just going to clog up your CloudTrail.
So if you are looking to debug something, how many times is that going to appear in there for something you might not care about?
[00:39:48] Speaker C: Oh, this is specific to the new managed instances. That's the thing that's in between Fargate and running your own compute. So that's why this makes sense.
Yeah. So this is basically the same use case I had, except when you're running managed instances, you don't have the ability to pull from the ECS agent logs on the host itself.
But you do still have to manage capacity. You still have to manage things like, this is the one that has the GPUs. And so that's probably what this unlocks for a lot of people.
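Mechanically, turning this on is the usual CloudTrail advanced event selector configuration. A sketch; the trail name is hypothetical, and the resource type string is written the way CloudTrail conventionally formats the type named in the announcement:

```python
import boto3

cloudtrail = boto3.client("cloudtrail")

# Start logging data events for the new ECS container instance resource type.
cloudtrail.put_event_selectors(
    TrailName="my-org-trail",  # hypothetical trail
    AdvancedEventSelectors=[
        {
            "Name": "ECS agent data events",
            "FieldSelectors": [
                {"Field": "eventCategory", "Equals": ["Data"]},
                {"Field": "resources.type", "Equals": ["AWS::ECS::ContainerInstance"]},
            ],
        }
    ],
)
```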
[00:40:19] Speaker D: Makes sense. I mean, I get it. It's just, it's definitely...
[00:40:24] Speaker A: Oh no.
[00:40:25] Speaker C: I've worked very hard to avoid needing this and so yeah, I know there are people that need this and I'm sorry.
[00:40:30] Speaker D: Yeah, maybe you should find a new job if you need this.
All right, let's move on to Google Cloud. This week they've got several things for us.
G4 VMs powered by NVIDIA RTX PRO 6000 Blackwell GPUs are now generally available.
They offer up to 9x throughput improvement over the G2 instances and support workloads from AI inference to digital twin simulations, with configurations of 1, 2, 4, or 8 GPUs. The G4 VMs feature enhanced PCIe-based peer-to-peer data paths that deliver up to 168% throughput gains and 41% lower latency for multi-GPU workloads, addressing the bottleneck issues common in serving large AI models that exceed single-GPU memory limits. Each GPU provides 96 gigs of GDDR7 memory, up to 768 gigabytes total, native FP4 precision support, and Multi-Instance GPU capabilities that allow partitioning into four isolated instances, enabling efficient serving of models from under 30 billion to over 100 billion parameters. NVIDIA Omniverse and Isaac Sim are now available on the Google Cloud Marketplace as turnkey solutions for G4 VMs, enabling immediate deployment of industrial digital twin and robotics simulation applications, with full integration across GKE, Vertex AI, Dataproc, and Cloud Run. These are available immediately, with broader regional availability than previous GPU offerings, though specific pricing details were not provided in this particular announcement. I can tell you the answer is: expensive.
[00:41:54] Speaker C: Definitely expensive.
[00:41:56] Speaker A: All I heard was GPUs and I'm like, yep, that's expensive.
[00:42:01] Speaker C: Yeah, I mean, I'm still very far away from understanding these larger workloads, because I don't do the type of development where I need them. I'm more on the other side, where I'm trying to figure out how to finally get AI onto a very tiny little computer, very much on the edge, so that I don't have to write complex mapping rules.
[00:42:21] Speaker D: Indeed. For those of you who use Dataproc, Google's giving you Dataproc 2.3, which introduces a lightweight, FedRAMP High compliant image that contains only essential Spark and Hadoop components, reducing CVE exposure and meeting strict security requirements for organizations handling sensitive data.
Optional components like Flink, Hive WebHCat, and Ranger are now deployed on demand during cluster creation rather than prepackaged, keeping clusters lean by default while maintaining full functionality when needed. Custom images allow pre-installation of required components to reduce cluster provisioning time while keeping the security benefits of the lightweight base image. The image supports multiple operating systems, including Debian 12, Ubuntu 22, and Rocky Linux 9, and deployment is as simple as specifying version 2.3 when creating clusters via the gcloud CLI. Google employs automated CVE scanning and patching, combined with manual intervention for complex vulnerabilities, to maintain compliance standards and your security posture. So it's Alpine for Dataproc: the tools you actually need won't be installed, like vim, making you install them, and then your container that was lean is no longer lean.
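And deployment really is just the image version plus whichever optional components you still want. A sketch with the Python client; project, region, and cluster name are hypothetical:

```python
from google.cloud import dataproc_v1

# Regional endpoint is required when creating clusters in a region.
client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": "us-central1-dataproc.googleapis.com:443"}
)

# Lean 2.3 image by default; optional components are pulled in only on request.
cluster = {
    "cluster_name": "lean-cluster",
    "config": {
        "software_config": {
            "image_version": "2.3-debian12",
            "optional_components": [dataproc_v1.Component.FLINK],
        },
    },
}
client.create_cluster(project_id="my-project", region="us-central1", cluster=cluster)
```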
[00:43:27] Speaker C: So nice. And to the point, FedRAMP has such tight SLAs for vulnerability management that this means you don't have to carry that risk, or request an exception, because of Google not patching Flink as fast as you would like them to.
Yep. At least it puts the control with the end user, where they can say, well, I'm just not going to use that.
[00:43:49] Speaker D: BigQuery Studio is getting a new interface, introducing an expanded Explorer view that allows users to filter resources by project and type, with a dedicated search function that spans all BigQuery resources within an organization, addressing the common pain point of navigating through large-scale data projects. The reference panel provides contextual information about tables and schemas directly within the code editor, eliminating the need to switch between tabs or run exploratory queries just to check column names or data types, which is particularly useful for data analysts writing complex SQL queries. Google has streamlined the workspace by moving job history to a dedicated tab accessible from the Explorer pane and removing the bottom panel clutter, while also allowing users to control tab behavior with double-click functionality to prevent unwanted tab replacements. The update includes code generation capabilities, where clicking on table elements in the reference panel automatically inserts query snippets or field names into the editor, reducing manual typing errors and speeding up query development workflows. And driving all of us crazy when we have to troubleshoot, and we go, what did you do? And they go, I clicked the button. Yeah, and it didn't work.
[00:44:47] Speaker C: SQL-like syntax. And I don't like either one, the "like" or the SQL.
I mean, some of these things, the reducing clutter and stuff, are definitely pain points I've run into. Like having a bunch of job executions and not being able to see your query and the job executions because the screen's too small. It sucks. Although I'm a little nervous about having all the BigQuery resources across an organization available in a single console, just because it sounds like a permissions nightmare. And GCP permissions, since it's evaluating things like tables and datasets on demand, evaluating a policy on demand, it typically shows you everything and then just gives you an error.
So it's a little rough. I'm a little nervous about this.
[00:45:38] Speaker D: I mean, we'll see.
I suspect that our data analyst friends will love it.
I have tried to use this interface a couple times and I find it overwhelming initially.
And so if this is a step towards making it more approachable for people who are not as experienced yet, that would be helpful too. So we'll see. But I also worry, if it's just AI all the way down, then what are you going to do?
[00:46:02] Speaker C: I mean, this definitely makes a lot of the Gemini query-building things a lot easier, and then also easy to share across GCP projects.
[00:46:11] Speaker D: Well, finally, you can manage all of your prompts in the Vertex AI SDK, enabling developers to create, version, and manage prompts programmatically through Python code rather than tracking them in spreadsheets or text files. The feature provides seamless integration between Vertex AI Studio, the visual interface for prompt design, and the SDK for programmatic management, with prompts stored as centralized resources within Google Cloud projects for team collaboration.
Enterprise security features include CMEK, or customer-managed encryption keys, and VPC Service Controls, addressing compliance requirements for organizations handling sensitive data in their AI applications. I mean, if your prompt has sensitive data in it, I have questions already.
[00:46:49] Speaker A: Who doesn't like PII in your prompt?
[00:46:52] Speaker C: Come on.
[00:46:52] Speaker D: Yeah, it doesn't matter. Like, hey, can you see if this Social Security number exists in any of my data, please?
[00:46:57] Speaker C: Like, yeah, beautiful.
[00:47:00] Speaker D: All right, well, that's nice, I guess.
[00:47:03] Speaker C: I mean, it just shows that everyone's having the same sort of problem. I know this has recently become my life: in different AI workloads, I'm copying and pasting prompts between things rather than typing them fresh every time. Patterns are starting to become a little ingrained, and people are settling into their workflows.
And if there's not a tool like this, you end up figuring a way out, and that can be a document or, you know, a spreadsheet or whatever. So this is kind of nice, bringing some structure to that. And then, you know, the fact that it's an SDK with Python management makes me real happy, because I don't have to click through some UI somewhere to figure out where my prompt is.
[00:47:43] Speaker D: It's not in Go for you, so that's nice.
[00:47:46] Speaker C: Yeah, I mean. Yeah, I don't mind if it's in Go, that's fine.
[00:47:49] Speaker D: But you like it better that it's Python, so it's easier to...
[00:47:52] Speaker C: Easier to read and understand. Sure.
[00:47:54] Speaker D: Only because you know it.
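For the curious, the new prompt SDK surface is roughly this shape. A sketch based on the preview API at launch, so treat the exact names as close-but-unofficial; the project ID is hypothetical:

```python
import vertexai
from vertexai.preview import prompts
from vertexai.preview.prompts import Prompt

vertexai.init(project="my-project", location="us-central1")  # hypothetical project

# Define a templated prompt, then save it as a versioned resource in the
# project instead of a spreadsheet cell.
prompt = Prompt(
    prompt_name="ticket-summarizer",
    prompt_data="Summarize this ticket in two sentences: {ticket}",
    variables=[{"ticket": "Customer cannot log in after password reset."}],
    model_name="gemini-1.5-flash-002",
)
saved = prompts.create_version(prompt=prompt)
print(saved.prompt_id)
```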
[00:47:58] Speaker A: Well.
[00:47:59] Speaker D: If you were excited about Claude Code on the web earlier and being able to do things remotely, or if you're excited about GitHub Copilot: GitHub Enterprise is now supporting Gemini Code Assist as well, bringing AI-powered code reviews to enterprise customers using GitHub Enterprise Cloud and on-premises GitHub Enterprise Server. This addresses the bottleneck where 60.2% of organizations take over a day for code changes to reach production due to manual review processes. The service provides organization-level controls, including centralized custom style guides and org-wide configuration settings, allowing platform teams to enforce coding standards automatically across all repositories. Individual teams can still customize repo-level settings while maintaining organizational baselines. This is all covered under the Google Cloud terms of service. The enterprise version ensures code, prompts, and model responses are stateless and not stored, with Google committing not to use customer data for model training without your permission, which addresses enterprise security and compliance requirements for AI-assisted development. It's currently in public preview, with access through the Google Cloud console. The service includes a higher pull request quota than the individual developer tier, and Google is developing additional features, including agentic loop capabilities for automated issue resolution and bug fixing.
[00:49:04] Speaker C: I'm sort of fascinated by how many hooks this has into GitHub, because a lot of these features are directly competitive with Copilot.
[00:49:14] Speaker A: But if you're a Google shop and you already have Gemini as your approved tool...
[00:49:19] Speaker C: Oh, I mean, I get it.
[00:49:20] Speaker A: I mean, that's what they're targeting, hoping it's going to let people that are a Gemini shop go all in while still leveraging GitHub. Because does Google have an alternative?
[00:49:36] Speaker C: They do have their own source repository service.
[00:49:39] Speaker A: Was it as good as CodeCommit?
[00:49:40] Speaker C: I don't know. I wouldn't dare use it.
[00:49:42] Speaker A: I'm kidding. Is it as good as Azure DevOps?
[00:49:47] Speaker C: I mean, you know, like Amazon, they have this sort of similar setup where.
[00:49:52] Speaker D: Well, they have Cloud Build.
[00:49:53] Speaker C: Right. So they don't have a CI pipeline built in. They have a separate repository, and they integrate together if you want to go plug it all in.
Yeah, I mean, the ability to do organization-wide things is super powerful for these tools, and I'm just sort of surprised that, you know, GitHub allows that.
It seems like they would have to develop API hooks and externalize that.
[00:50:16] Speaker D: They do these all through the marketplace, right? So I mean, they have the webhooks, they have the APIs already. They exist, and so you're just plugging into that ecosystem they've already built for other tools to leverage. Now, do they eventually get hostile towards AIs other than GitHub Copilot? They could, but I think it's kind of anti what Microsoft has been trying to preach for the last several years. So right now they're good citizens. But that may change in the future. We'll see.
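For context, the ecosystem being described here is GitHub's standard REST API and webhooks; a marketplace app wires this up for you at install time. A rough sketch of registering for pull request events, with hypothetical org, repo, token, and receiver URL:

```python
import requests

# Hypothetical values -- a marketplace app performs this handshake on install.
OWNER, REPO = "my-org", "my-repo"
TOKEN = "ghp_example"  # a token with admin:repo_hook scope

resp = requests.post(
    f"https://api.github.com/repos/{OWNER}/{REPO}/hooks",
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Accept": "application/vnd.github+json",
    },
    json={
        "name": "web",
        "events": ["pull_request", "pull_request_review"],
        "config": {
            "url": "https://reviews.example.com/webhook",  # your receiver
            "content_type": "json",
        },
    },
    timeout=10,
)
resp.raise_for_status()
print("hook id:", resp.json()["id"])
```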
[00:50:40] Speaker C: I forgot about the marketplace interaction. You're right, that does make it a little bit more approachable. It doesn't require custom development to enable, which is what I was thinking.
[00:50:49] Speaker A: Yeah, GitHub has a marketplace? I don't remember that.
[00:50:52] Speaker D: Yes.
[00:50:53] Speaker A: I don't used to.
I'll have to look into that more.
[00:50:56] Speaker D: It's the preferred way you do integrations with third-party tools. Like, if you want to integrate Snyk into GitHub, you typically do it through a marketplace plugin, for example, versus doing API keys for individual.
[00:51:07] Speaker C: Users, things like that.
[00:51:10] Speaker A: Got it. We're more of a BitBucket shop so I've had to do that.
[00:51:13] Speaker D: Well, that's unfortunate for you. I assume that causes problems. I assume BitBucket will eventually get AI. I mean, JIRA and Confluence have it now. Why can't BitBucket?
[00:51:22] Speaker A: Yeah, I think they definitely have something. I've gotten some notifications for it.
They are behind the eight ball, in my opinion.
[00:51:29] Speaker C: Well, and it's Atlassian. They'll charge you through the nose.
[00:51:31] Speaker D: Yeah, they're going to charge you through the nose for it. Number two, you, you probably have to upgrade your BitBucket, which you probably haven't upgraded since you installed it.
Do they have a SaaS version?
[00:51:41] Speaker A: No, no, it was the SaaS version.
[00:51:42] Speaker D: They do have a SaaS version.
[00:51:43] Speaker C: Yeah. Okay.
[00:51:44] Speaker A: I mean, I still think BitBucket is their redheaded stepchild that they just kind of drag along. There's, like, you know, a keep-the-lights-on team and that's about it. But they have released some small features, though.
[00:51:56] Speaker D: They sell it to the people who, you know, refuse to use GitHub for some reason because they hate Microsoft. That's a brilliant strategy.
[00:52:04] Speaker A: It's significantly cheaper also.
[00:52:06] Speaker C: Yeah. Yes.
[00:52:09] Speaker D: All right. Vertex AI context caching is reducing your cost by 90% for repeated content in Gemini models by storing precomputed tokens. Implicit caching happens automatically, while explicit caching gives developers control over what content to cache for predictable savings. The feature supports caching from a 2,048-token minimum up to Gemini 2.5 Pro's 1 million token context window, across all modalities: text, PDF, image, audio and video.
There's global and regional endpoint support. Use cases include document processing for financial analysis, customer support chatbots with detailed system instructions, codebase Q&A for development teams, and enterprise knowledge base queries. Implicit caching is enabled by default with no code changes required and clears within 24 hours, while explicit caching charges standard input token rates for the initial caching, then a 90% discount on reuse, plus hourly storage fees based on the TTL itself. Integration with Provisioned Throughput ensures production workloads benefit from caching, and explicit caches support customer-managed encryption keys for additional security and compliance.
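A minimal sketch of the explicit-caching flow with the google-genai SDK, assuming Vertex AI access and placeholder project, model, and TTL values:

```python
from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

# Cache the big, repeated context once; you pay standard input rates here,
# then the discounted rate each time the cache is reused.
cache = client.caches.create(
    model="gemini-2.5-pro",
    config=types.CreateCachedContentConfig(
        system_instruction="You are a financial analyst. Answer from the filing.",
        contents=["<full text of a very long 10-K filing goes here>"],
        ttl="3600s",  # storage is billed hourly against this TTL
    ),
)

# Subsequent requests reference the cache instead of resending the document.
resp = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="What were the top three risk factors?",
    config=types.GenerateContentConfig(cached_content=cache.name),
)
print(resp.text)
```

Implicit caching needs none of this; the point of the explicit API is that you choose what gets cached and for how long.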
[00:53:05] Speaker C: I mean, this is awesome. If you have a workload where you're going to have very similar queries and/or prompts and have it return similar data, this is definitely nicer than having to regenerate that every time.
They've been moving more and more towards this, and I like seeing it at more of a platform level now. Whereas before you could sort of implement this in a weird way directly in the model, like in a notebook or something, this is more of a turn-it-on-and-it-works thing.
[00:53:42] Speaker A: I still wonder, though. I mean, it sounds like it's inline, which is always good, so it kind of handles it like you said. But if you are leveraging AI for any PII or anything else like that, you have to be careful with some of these tools, because I'm just thinking it's like Redis or anything else with SQL back in the day.
[00:54:04] Speaker C: Yeah, I mean, it is. It's exactly the same problem that you're going to have with like Redis, you know, where it's like, oh, we've got a bad thing in the cache, we got to clear the whole thing. Right? Or it could be just a prompt that's returning bad information.
You know, it doesn't necessarily have to be PII, it's just less than desired, and you have to go clear it all out.
[00:54:21] Speaker A: I'm in PII hell at work, so, you know, that's where my brain is.
[00:54:25] Speaker C: Gotcha then.
[00:54:27] Speaker D: Our final Google announcement for today: Cloud Armor is introducing several new features for you, and it was named a Strong Performer in the Forrester Wave.
The new features launched include hierarchical security policies, now in general availability, that enable WAF and DDoS protection at the organization, folder, and project level, allowing centralized security management across large GCP deployments with consistent policy enforcement. There are new enhanced WAF inspection capabilities in preview, expanding request body inspection from 8 kilobytes to 64 kilobytes for all preconfigured rules, improving detection of malicious content hidden in larger payloads while maintaining performance. JA4 network fingerprinting support is now generally available, providing advanced SSL/TLS client identification beyond JA3 and offering deeper behavioral insights for threat hunting and for distinguishing legitimate traffic from malicious actors. Organization-scoped address groups are now generally available, enabling IP range list management across multiple security policies and products like Cloud Next-Gen Firewall, reducing configuration complexity and duplicative rules. And Cloud Armor now protects Media CDN with network threat intelligence and ASN blocking capabilities in general availability, defending media assets at the network edge against known malicious IPs and traffic patterns.
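For a taste of what a Cloud Armor policy looks like in code, here's a minimal project-scoped sketch using the google-cloud-compute client; the new hierarchical policies attach at org or folder scope through the organization security policy APIs instead, and the project name and IP range below are placeholders:

```python
from google.cloud import compute_v1

client = compute_v1.SecurityPoliciesClient()

# A simple deny rule for a known-bad range; real policies layer many of
# these plus preconfigured WAF rules at different priorities.
policy = compute_v1.SecurityPolicy(
    name="block-known-bad",
    rules=[
        compute_v1.SecurityPolicyRule(
            priority=1000,
            action="deny(403)",
            match=compute_v1.SecurityPolicyRuleMatcher(
                versioned_expr="SRC_IPS_V1",
                config=compute_v1.SecurityPolicyRuleMatcherConfig(
                    src_ip_ranges=["198.51.100.0/24"],  # placeholder range
                ),
            ),
        ),
    ],
)

op = client.insert(project="my-project", security_policy_resource=policy)
op.result()  # wait for the operation to finish
```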
[00:55:35] Speaker C: These are some pretty advanced features for, you know, a cloud-platform-provided WAF. It's pretty cool. I find myself using this more and more at the day job, just because it's easy to implement and pretty configurable once you have it enabled, through, you know, definition of policies and rules.
I haven't run into the, you know, the eight-kilobyte inspection limit yet, but that's because the rules are kind of expensive and I am turning them on very cautiously. So you've got to be careful there. But yeah, I look forward to using this and getting better information when looking at the ginormous amount of logs that any WAF engine generates.
[00:56:16] Speaker A: I mean, a lot of the other features from an organizational perspective sound really nice too. That feels like something Ryan should be implementing at his day job, if he's not. But, like, the IP address groups and stuff like that sound phenomenal for a security team that is managing a multi-project strategy.
[00:56:34] Speaker C: Oh, definitely. I mean, there have been ways to do this, but not good ways. You know, it's like having a giant, basically shared WAF layer that some poor platform team is responsible for; now they have to offer WAF to the rest of the company. And so this is nice, because you can now delegate some of that while still maintaining that, you know, traffic from an embargoed country just never gets in, without having to work with 27 different teams to get that enabled.
[00:57:04] Speaker D: Agreed. Moving on to Azure for this week.
In general availability: the observed capacity metric in Azure Firewall. Azure Firewall's new observed capacity metric provides real-time visibility into capacity unit utilization, helping administrators track actual scaling behavior versus provisioned capacity for better resource optimization and cost management.
And you might want to combine that with the next announcement, which is that they're now allowing firewall pre-scaling, letting administrators reserve capacity units in advance for predictable traffic spikes like holiday shopping seasons or product launches, eliminating the lag time typically associated with auto-scaling your firewall resources. So now you can see the observed capacity and use it to determine what pre-scaling is actually required. Combined, these two announcements this week are actually pretty good... if it were 10 years ago.
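A small sketch of pulling that utilization history with the azure-monitor-query SDK, to size a pre-scale reservation from observed peaks; the metric name and resource ID are assumptions, so check what your firewall actually emits:

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricsQueryClient

# Placeholder resource ID; the metric name below is an assumption.
FW_ID = (
    "/subscriptions/<sub-id>/resourceGroups/<rg>"
    "/providers/Microsoft.Network/azureFirewalls/<fw-name>"
)

client = MetricsQueryClient(DefaultAzureCredential())
result = client.query_resource(
    FW_ID,
    metric_names=["ObservedCapacity"],  # assumed metric name
    timespan=timedelta(days=7),
    granularity=timedelta(hours=1),
    aggregations=["Maximum"],
)

# Peak hourly capacity units over the week suggest what to pre-reserve.
for metric in result.metrics:
    for series in metric.timeseries:
        peak = max(p.maximum for p in series.data if p.maximum is not None)
        print(f"{metric.name}: peak {peak} capacity units")
```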
[00:57:51] Speaker A: I mean, that's just a core problem with Azure, though: there's still the concept of servers under the hood. You tell them how many servers you want of everything, and it blows my mind that that's still the world you're in, versus saying I want a load balancer, I want a firewall, I want a NAT gateway. I think NAT gateways actually might be managed for you, I'd have to double check, but you still say I want X capacity, where, you know, with AWS, and the little bit I've done with GCP, more of that's obfuscated from you.
I guess with AWS you have to, like, pre-warm the load balancer. But I think a lot of that's...
[00:58:31] Speaker C: Which is probably still in there. I just haven't touched it in I don't know how long, because it's just not needed anymore. Because I'm not at that size, man.
[00:58:39] Speaker A: Yeah, I think the last time I read about it, they don't recommend it, and at that point you probably have a TAM, you know, they know you're launching a big thing and they would, I'm sure, reach out. So you're at that massive level either way. But everything with Azure feels like you are literally telling them the metrics, the size, and in my head it's an anti-cloud pattern, because I don't want to manage capacity. This is why I'm paying a premium for you to do it for me.
[00:59:10] Speaker C: Yeah, I guess it's part of the management, right? You're not maintaining the underlying infrastructure in terms of patching and updating, but having to do capacity management is just that last little bit where you're like, just take it away, just bill me.
[00:59:31] Speaker A: It's like on SOC 2 or ISO. I don't remember, I was doing one of those things, and they're like, tell me how you capacity plan. I was like, I have my quotas, I have scale sets that scale up and down for me, and I have alerts set up for when things fail. Like, I leverage automation to do it for me, and explaining that to an auditor is always a fun time.
[00:59:53] Speaker D: What can go wrong?
[00:59:54] Speaker A: Yeah.
[00:59:55] Speaker D: In public preview this week, environmental sustainability features are now available to you in the Azure API Management platform.
Azure API Management introduces carbon-aware capabilities that allow organizations to route API traffic and adjust policy behavior based on carbon intensity data, helping reduce the environmental impact of API infrastructure operations. The feature enables developers to implement sustainability-focused policies, such as throttling non-critical API calls during high carbon intensity periods or routing traffic to regions with cleaner energy grids. This aligns with Microsoft's broader commitment to be carbon negative by 2030 and provides enterprises with tools to measure and reduce the carbon footprint of their digital services at the API layer. Target customers include organizations with ESG commitments and sustainability reporting requirements who need granular control over their cloud infrastructure's environmental impact.
And in general I think this is weird.
Like, I'm going to use some type of API thing to determine that if I use this compute in this region that's on coal power right now, it would be less energy efficient than if I use this other one that's using, you know, maybe hydropower. Is that kind of the gist of this?
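That's the gist. Stripped of the APIM policy syntax, the routing decision amounts to something like the sketch below, where the carbon-intensity feed, regions, and backend URLs are all hypothetical:

```python
import requests

# Entirely hypothetical carbon-intensity feed, keyed by region (gCO2/kWh).
INTENSITY_URL = "https://example.com/carbon-intensity"

# Hypothetical backends for the same API deployed in paired regions.
BACKENDS = {
    "uksouth": "https://api-uksouth.example.com",
    "ukwest": "https://api-ukwest.example.com",
}


def pick_backend() -> str:
    """Route to whichever region's grid is cleanest right now."""
    intensity = requests.get(INTENSITY_URL, timeout=5).json()
    cleanest = min(BACKENDS, key=lambda region: intensity.get(region, float("inf")))
    return BACKENDS[cleanest]


print(pick_backend())
```

The APIM feature moves this decision into the gateway's policy layer instead of your application code.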
[01:01:02] Speaker A: So, APIMs are, one, stupidly expensive.
You have to be on the Premium tier, it's like $2,700 a month, and then if you want HA, you have to have two of them.
So, like, whatever they're doing under the hood is stupidly expensive. If you've ever had to deal with SharePoint, they definitely use them, because I've hit the same error codes that we provide to customers.
On the second side, when you do scale them, you can scale them to be multi-region APIMs using the paired-region concept.
So in theory, what you can do based on this is route in the UK: if UK North, or was it Central versus South, has a cheaper or more environmentally efficient grid, you could route to your paired region and have the traffic come in that way.
The reason I think this is interesting is, one, I keep stating that green is going to be at the forefront of one of these keynotes at some point, and I'm just going to keep putting that out there until I give up on our prediction show.
But on the flip side, I think it's interesting that, especially in Europe, this is such a big deal that the cloud vendors, AKA Azure, are targeting building stuff around these ESG things. I'm seeing it more in my day job when customers buy our SaaS product: they'll ask what the green capacity or green metric, whatever it is, is for our product, and how many tons of CO2 we're releasing in Scope 1, 2 and 3. So I think it's interesting that Azure is trying to take it on and put the power in the hands of their customers, to maybe say this isn't the best thing to do, versus Azure just taking care of it for you, and letting the customer say it's okay to slow this down. We don't care about our customers' response time right now, because our green capacity is more important.
[01:03:00] Speaker C: Yeah, I mean, it's all about balancing the books with carbon offsets, right? Both from a Microsoft perspective and also from an Azure customer perspective. They're trying to have a policy where they can say, you know, this is how we're managing our carbon footprint and here are the things we're doing to reduce it, and trying to balance that against the things that are generating or increasing the carbon footprint. So it seems like a lot right now, I think, because we're sort of early, weirdly. But I imagine this becomes one of those things, just like routing for latency, where it becomes really standard for all of our tools to take the carbon impact into account.
[01:03:50] Speaker A: I mean, I definitely think that it's a very specific company they're targeting for this.
Most people don't care.
[01:03:56] Speaker C: It's definitely a company of scale, right? Yeah, companies of scale at this point.
[01:04:00] Speaker A: Yeah, yeah. And you're also talking about, like, you're already using APIM, which is a ridiculously expensive service, so you know you're clearly burning tons of capital on Azure too.
[01:04:14] Speaker D: All right, next up: Azure Storage Discovery is now generally available as a fully managed service that provides enterprise-wide visibility into data estates across Azure Blob Storage and Data Lake Storage, helping organizations optimize costs, ensure security compliance, and improve operational efficiency across multiple subscriptions and regions. The service integrates Microsoft Copilot in Azure to enable natural language queries for storage insights, allowing non-technical users to ask questions like "show me storage accounts with default access tier hot above 1 TB with these transactions" (really, a non-technical user asking that question?) and receive actual visualizations without coding skills. Key capabilities include 18-month data retention for trend analysis, insights across capacity, activity, security configurations and errors, and deployment taking less than 24 hours, with initial insights from 15 days of historical data. Pricing includes a free tier with basic capacity and configuration insights retained for 15 days, while the standard plan adds advanced activity, error, and security insights with 18-month retention. Specific pricing varies by region.
Target use cases include identifying cost optimization opportunities through access tier analysis, ensuring security best practices, and managing data redundancy requirements across global storage estates.
[01:05:27] Speaker C: We talked a little bit about this when it was announced in preview, but, like, I don't have a lot of Azure experience, and just when FOCUS came out with the FinOps report structure, generating a new report meant I had to set up the storage infrastructure. And once I set it up and got the report running and updating, I could never find that storage bucket ever again. It wasn't gone, because it was still being accessed, but I could never find it in the console. So when I was asked to change it, you know, I would go through and try to change my permission scopes and do all the things to try to find it, but it was just invisible to me.
And so I think this is one of those things where hopefully it doesn't have the same limitations, and you'll be able to actually see your entire storage footprint versus whatever permissions you have in the specific subscription based off your billing scope at the time, which, for a non-everyday Azure user, is too much to keep track of.
[01:06:26] Speaker A: Well that's why you naturally language query it because obviously you're caring about that. As a FinOps person, you can definitely care about what your redundancy level is for each of your storage and the finops person should totally be making that decision.
[01:06:40] Speaker C: Well, I'll tell you, when I was looking for this report I had a lot of natural language and I was shouting it at my computer.
[01:06:46] Speaker A: So I've actually leveraged it. It was free for a couple months. We set it up at my day job. We were kind of poking around it.
It's good if you're a large company with multiple different teams kind of setting things up, where your centralized team doesn't know all the different storage accounts. It definitely shed a few interesting insights on us, things my team probably should have known to start off with, but we learned a lot along the way. But it's a simple report. There have been multiple press releases about it, and it just shows you what's there. It's no different than the network security report they released a few weeks ago: an aggregate review for your organization of what you have running, if you don't know.
So they're really targeting the larger-scale enterprises. If you're a small enterprise and you already know these things, it's no big deal.
[01:07:44] Speaker D: No big deal, he says. All right, well, there's two new models available for you today in Azure AI Foundry. First up is Sora 2, which is populating the entire Internet with AI crap.
Just terrible videos, all AI generated, and it even has its own Sora app, so you can go to the Sora social media feed to see all of your friends with their dumb videos made on Sora 2. Not a big fan of this one, for lots of reasons. Grok 4 is also available to you in Azure AI Foundry, featuring a 128K token context window, native tool use, and integrated web search capabilities in the Grok product. Pricing for that one starts at $2 per million input tokens and $10 per million output tokens for Grok 4, with faster variants available at lower cost in the future.
The Sora pricing I did not find when I wrote this out, but it's available to you in Azure AI Foundry, so I'm sure there's a billing model. I just don't know what it is.
[01:08:39] Speaker C: I mean, apparently it's not expensive enough, considering how much crap there is. So, like, maybe they should raise the.
[01:08:44] Speaker D: Prices, I think, like the Sora stuff is all free right now if you're going through the app, which is part of the problem, because no one, no one has to do anything special.
All right, and our final Azure story.
Azure is releasing PowerShell scripts to help customers migrate from Application Gateway v1 to v2 before the April 2026 retirement deadline. I'm shocked the deadline wasn't, you know, September 25th, 2025.
Addressing a critical infrastructure transition, the enhanced cloning script preserves configurations during migration, while the public IP retention script ensures customers can maintain their existing IP addresses, minimizing disruption to your production workloads. The migration tooling targets enterprises running legacy Application Gateway Standard or WAF SKUs who need to upgrade to Standard v2 or WAF v2 for continued support and access to newer features. The scripts automate what would otherwise be a complex manual migration process, reducing the risk of configuration errors and downtime during your transition. Customers should begin planning migrations now as the 2026 deadline approaches, with these scripts providing a standardized path forward for maintaining application delivery infrastructure. Or, you know what, instead of writing all these scripts, they could just do it for you.
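Before running Microsoft's scripts, step one is just finding what's still on v1. A quick inventory sketch with the azure-mgmt-network SDK, with a placeholder subscription ID:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

# v1 SKU tiers are "Standard" and "WAF"; v2 are "Standard_v2" and "WAF_v2".
for gw in client.application_gateways.list_all():
    if gw.sku and gw.sku.tier in ("Standard", "WAF"):
        print(f"{gw.name}: {gw.sku.tier} -- needs migration before April 2026")
```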
[01:09:48] Speaker A: There's so many issues with this.
[01:09:49] Speaker C: Yeah, this removes what otherwise would be a complex manual process. I'm like, you're giving people PowerShell. It's a complex manual process.
[01:09:58] Speaker A: I mean, I've noticed this a lot when they go from v1 to v2 stuff. They first just tell you it's coming; the deprecation notice for v1 to v2 was April 28, 2023, when they announced this.
So they gave people five years, you know, to, sorry, three years to move over.
[01:10:22] Speaker C: But migration through attrition at that point. Right, right.
[01:10:25] Speaker A: And now they're like, okay, these are the customers that are left that are using it. But honestly, as somebody that leverages the App Gateway at my day job, v2 is exponentially more powerful. There's definitely less on it in the world of documentation. But I just love hating on App Gateways also, which is why it's in here.
At least I'm honest about what I hate. But also, the other part of this that I wanted to bring up was around the public IPs. Like, I totally understand IP addresses and people whitelisting or putting in allow lists of IP addresses.
I just still don't believe, as a core principle, that that is something you really need to do.
Plus, if you were on a public IP address that Azure provided back then and you have to move over here, you probably also have to move from a single-zone to a multi-zone IP address, which means you need a new IP address to start off with anyway.
So I almost wish they didn't offer this, because I want people to move to more modern stuff, and they're just leaving themselves more tech debt, you know, by doing it this way. So just take it on and fix the problem. Though, I say this at my day job too.
[01:11:39] Speaker D: I mean, this feels right on brand for Microsoft. Like, you're saying all these things and I'm just like, this is Microsoft's MO.
[01:11:45] Speaker A: I know.
[01:11:46] Speaker C: And if you have fintech customers it is like pulling teeth.
[01:11:50] Speaker A: Pulling teeth to get them to get.
[01:11:51] Speaker C: Rid of their IP restrictions and get.
[01:11:54] Speaker D: Rid of your URL.
[01:11:55] Speaker A: Wait till you tell your fintech customer that you're getting a CDN now and see how well that goes over. Because they can't whitelist the IP address anymore, which generates loads of FUD also.
Or you just tell them to put all of Azure Front Door in the allow list, and then they look at you like you're crazy. I'm like, well, this is what you asked for.
[01:12:18] Speaker C: I do remember pointing a customer ages ago at the giant AWS JSON. I'm like, just run a thing that loads this every once in a while, and then you'll have just what you need. And then just looking at their face as they were mentally stabbing me through the eyes. You could see it.
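That giant JSON is AWS's published ip-ranges document, and the "thing that loads this every once in a while" really is about this small; filtering on CloudFront here is just an example:

```python
import requests

# AWS publishes its full IP space as a JSON document at a stable URL.
doc = requests.get("https://ip-ranges.amazonaws.com/ip-ranges.json", timeout=10).json()

# Pull just the CloudFront prefixes, e.g. to refresh a firewall allow list.
cloudfront = sorted(
    p["ip_prefix"] for p in doc["prefixes"] if p["service"] == "CLOUDFRONT"
)
print(f"{len(cloudfront)} CloudFront prefixes as of {doc['createDate']}")
```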
[01:12:35] Speaker A: Yep.
About two years ago I had this conversation.
[01:12:40] Speaker D: Just two years ago. Doesn't seem that long ago at all.
[01:12:42] Speaker C: Yeah.
[01:12:43] Speaker D: All right, and our final story for this week is Oracle. Oracle's AI Agent Studio is expanding with new marketplace elements and partner integrations for Fusion apps, allowing customers to build AI agents using models from Anthropic, Cohere, Meta and others alongside Oracle's own models. I mean, do you even know what Oracle's models are called?
[01:13:01] Speaker A: You had to ask. I forgot they had models.
[01:13:04] Speaker D: Neither do I. The platform enables creation of AI agents that can automate tasks across Oracle Fusion Cloud applications, including ERP, HCM and CX, with pre-built templates and low-code development tools for business users. Oracle is partnering with major consulting firms like Accenture, Deloitte and Infosys to help customers implement AI agents, which likely means significant professional services costs for most deployments.
[01:13:28] Speaker C: That's exactly what that sounds like to me. Like oh yeah, they're partnering with these giant firms that will come in with armies of engineers and build you a thing and then hopefully document it before running away.
[01:13:39] Speaker A: You have never had to deal with those people, and/or deal with the mess that they leave after.
Never.
[01:13:45] Speaker C: Never had to come in and clean that up. Nope, nope.
[01:13:49] Speaker A: Definitely never cleaned that up. As a consultant coming into a company: hey, this other person hired this firm and they did this.
And then we were brought in to clean up that mess. That was always fun. Yeah.
[01:14:01] Speaker D: It's a never ending cycle.
Never ending. All right, gentlemen, another fantastic week here in the cloud has passed us, so we'll keep an eye on Amazon. Hopefully we don't have any outages before the next recording; we'll see if we get the RCA. And we'll see you all next week here in the cloud.
[01:14:17] Speaker C: All right, bye, everybody.
[01:14:18] Speaker A: Bye, everyone.
[01:14:22] Speaker B: And that's all for this week in Cloud. We'd like to thank our sponsor, Archera. Be sure to click the link in our show notes to learn more about their services.
While you're at it, head over to our website, where you can subscribe to our newsletter, join our Slack community, send us your feedback, and ask any questions you might have.
Thanks for listening and we'll catch you on the next episode.