320: AWS Cost MCP: Your Billing Data Now Speaks Human

Episode 320 | September 11, 2025 | 00:55:42

Hosted By

Jonathan Baker, Justin Brodley, Matthew Kohn, Ryan Lucas

Show Notes

Welcome to episode 320 of The Cloud Pod, where the forecast is always cloudy! Justin, Matt, and Ryan are coming to you from Justin’s echo chamber and bringing all the latest in AI and cloud news, including updates on Google’s antitrust case, AWS Cost MCP, new regions, and updates to EKS, Veo, and Claude, and more! Let’s get into it.

Titles we almost went with this week:

A big thanks to this week’s sponsor:

We’re sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You’ve come to the right place! Send us an email or hit us up on our Slack channel for more info.

General News

00:57 Google Dodges a $2.5T Breakup

AI Is Going Great – Or How ML Makes Money 

02:16 Introducing GPT-Realtime

02:58 Matt – “More AI scam calling coming your way.” 
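
For the curious, here’s a rough sketch of what talking to the Realtime API looks like over a WebSocket. The endpoint, headers, and event names below are drawn from OpenAI’s docs but unverified here, so treat the payload details as assumptions rather than gospel:

```python
# Hedged sketch of a GPT-Realtime session over WebSocket.
# Assumes the wss://api.openai.com/v1/realtime endpoint and the
# session.update / response.create / response.done event names from
# OpenAI's docs; exact session fields may differ by API version.
import asyncio
import json
import os

import websockets  # pip install websockets (v14+ uses additional_headers)


async def main():
    url = "wss://api.openai.com/v1/realtime?model=gpt-realtime"
    headers = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}
    async with websockets.connect(url, additional_headers=headers) as ws:
        # Configure the session for text output before requesting a response.
        await ws.send(json.dumps({
            "type": "session.update",
            "session": {"modalities": ["text"]},
        }))
        # Ask the model to respond; events stream back with subsecond latency.
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {"instructions": "Say hello to The Cloud Pod listeners."},
        }))
        async for raw in ws:
            event = json.loads(raw)
            if event["type"] == "response.done":
                break  # final event for this response
            print(event["type"])  # watch the streaming event types go by


asyncio.run(main())
```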

Cloud Tools

04:14 Terraform provider for Google Cloud 7.0 is now GA

05:19 Ryan – “I like the ephemeral resources; I think it’s a neat model for handling sensitive information and stuff you don’t want to store. It’s kind of a neat process.” 

06:50 How to get fast, easy insights with the Gremlin MCP Server

07:38 Ryan – “It’s amazing they limited this to read-only commands to the API. I don’t know why they did that… it’s kind of neat to see the interaction model with different services.”
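
To make the read-only design concrete, here’s a hypothetical sketch of the pattern using the MCP Python SDK. The Gremlin endpoint, auth header, and response shape are our own placeholders, not Gremlin’s actual server code:

```python
# Hypothetical sketch of a read-only MCP tool in the style of Gremlin's
# server, built with the MCP Python SDK's FastMCP helper. Only GET-style
# queries are exposed, so the LLM can explore data but never launch or
# halt experiments. The API base URL and endpoint are assumptions.
import os

import httpx  # pip install httpx
from mcp.server.fastmcp import FastMCP  # pip install mcp

mcp = FastMCP("gremlin-readonly")

GREMLIN_API = "https://api.gremlin.com/v1"  # assumed base URL


@mcp.tool()
def list_recent_experiments(limit: int = 10) -> str:
    """Return recent chaos experiments so the LLM can reason about coverage."""
    resp = httpx.get(
        f"{GREMLIN_API}/experiments",  # hypothetical read-only endpoint
        headers={"Authorization": f"Key {os.environ['GREMLIN_API_KEY']}"},
        params={"limit": limit},
    )
    resp.raise_for_status()
    return resp.text  # hand the raw JSON to the model for analysis


if __name__ == "__main__":
    mcp.run()  # serves over stdio for clients like Claude Desktop
```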

AWS

09:21 Introducing Seekable OCI Parallel Pull mode for Amazon EKS | Containers

10:24 Justin – “I personally don’t use all the CPU memory or the network of most of my container instances. So yes, that’s a willing trade-off I’m willing to make.”

13:13 AWS Management Console now supports assigning a color to an AWS account for easier identification

14:57 Matt – “I use it for Chrome, and that’s always where I’ve identified different users depending on where I was. I kind of like it where it’s something that can be set.”

17:07 AWS Transfer Family introduces Terraform support for deploying SFTP connectors

18:57 Ryan – “You know you’re getting deep into enterprise orchestration in terms of your customer base when you’re doing stuff like this, because this is ROUGH. “
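
The new Terraform module ultimately drives the same Transfer Family APIs you can call directly; here’s a minimal boto3 sketch of the connector workflow it provisions (the ARNs, secret ID, and host key are placeholders):

```python
# Minimal sketch of the SFTP connector workflow via boto3; the Terraform
# module wraps the same CreateConnector / StartFileTransfer calls.
import boto3

transfer = boto3.client("transfer")

# Create an SFTP connector that copies files between S3 and a remote server.
connector = transfer.create_connector(
    Url="sftp://sftp.example.com",  # remote SFTP endpoint (placeholder)
    AccessRole="arn:aws:iam::123456789012:role/transfer-connector-role",
    SftpConfig={
        # Secrets Manager secret holding the SFTP username/private key.
        "UserSecretId": "arn:aws:secretsmanager:us-east-1:123456789012:secret:sftp-creds",
        "TrustedHostKeys": ["ssh-rsa AAAAB3..."],  # remote server's host key
    },
)

# Push a file from S3 to the remote server through the connector.
transfer.start_file_transfer(
    ConnectorId=connector["ConnectorId"],
    SendFilePaths=["/my-bucket/outbound/report.csv"],
)
```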

19:20 Amazon EKS introduces on-demand insights refresh
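
A small boto3 sketch of the read side of cluster insights; the on-demand refresh itself is triggered from the console or API per the announcement, and the cluster name below is a placeholder:

```python
# Sketch of reading EKS cluster insights with boto3 after applying fixes.
# Per the announcement, the refresh can now be requested on demand instead
# of waiting for the periodic automatic check.
import boto3

eks = boto3.client("eks")

# List upgrade-readiness insights for a cluster and show their latest status.
for summary in eks.list_insights(clusterName="prod-cluster")["insights"]:
    detail = eks.describe_insight(
        clusterName="prod-cluster", id=summary["id"]
    )["insight"]
    print(detail["name"], detail["insightStatus"]["status"])
```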

20:41 Amazon Q Developer now supports MCP admin control

21:33 Ryan – “This future is going to be a little weird, you know, as we sort it out. You think about like chatbots and being able to sort of create infrastructure there and then, kind of bypassing a lot of the permissions and stuff. This is kind of the same problem, but magnified a lot more. And so like, it’s going to be interesting to see how companies adapt.”

22:48 Introducing Amazon EC2 I8ge instances 

PLUS

New general-purpose Amazon EC2 M8i and M8i Flex instances are now available | AWS News Blog

29:30 Now Open — AWS Asia Pacific (New Zealand) Region | AWS News Blog

30:54 Announcing a new open source project for scenario-focused AWS CLI scripts

31:56 Ryan – “I will definitely give it a look. It’s kind of strange, because most of the contributions right now are very specific to tutorials, like trying to learn a new Amazon service, and there’s very little documentation on what error handling and advanced sorts of logic are built into these scripts. All of the documentation is just directing you at Q and saying, ‘Hey Q, build me a thing that looks like that.’”
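
The core idea the project’s scripts bake in is resource tracking with cleanup on failure. The project itself ships shell scripts, so this Python/boto3 version is just an illustrative sketch of the same pattern (bucket name is a placeholder):

```python
# Minimal sketch of the create/track/cleanup pattern the scripts implement:
# record every resource you create and tear down in reverse order, even
# when a step fails, so tutorials never leave orphaned resources behind.
import boto3

s3 = boto3.client("s3")
cleanup_stack = []  # undo actions, newest first

try:
    bucket = "cloudpod-demo-bucket-320"  # placeholder name
    s3.create_bucket(Bucket=bucket)
    cleanup_stack.append(lambda: s3.delete_bucket(Bucket=bucket))

    s3.put_object(Bucket=bucket, Key="hello.txt", Body=b"hello cloud pod")
    cleanup_stack.append(lambda: s3.delete_object(Bucket=bucket, Key="hello.txt"))

    # ... a real script would verify each step and explain the API calls ...
finally:
    # Unwind in reverse creation order so dependencies delete cleanly.
    for undo in reversed(cleanup_stack):
        undo()
```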

33:15 Simplified Cache Management for Anthropic’s Claude models in Amazon Bedrock

34:07 Ryan – “I’m just really glad I don’t have to create any applications that need to be this focused on token usage. It sounds painful.” 
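
Roughly what the simplified flow looks like with the Bedrock Converse API: a single cachePoint block marks the end of the reusable prefix, and Bedrock now reuses the longest previously cached prefix automatically. The model ID and context below are placeholders:

```python
# Sketch of single-breakpoint prompt caching via the Bedrock Converse API.
# Everything above the cachePoint block is eligible for cache reuse across
# turns; cache read tokens don't count against token-per-minute quotas.
import boto3

bedrock = boto3.client("bedrock-runtime")

big_context = "..."  # a long document you reuse across many requests

response = bedrock.converse(
    modelId="anthropic.claude-3-5-haiku-20241022-v1:0",
    system=[
        {"text": "You answer questions about the attached document."},
        {"text": big_context},
        {"cachePoint": {"type": "default"}},  # cache everything above this point
    ],
    messages=[
        {"role": "user", "content": [{"text": "Summarize section 2."}]},
    ],
)
print(response["output"]["message"]["content"][0]["text"])
```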

GCP

35:02 Google Workspace announces new gen AI features and a no-cost option for Vids

36:34 Gemini is now available anywhere | Google Cloud Blog

38:18 Justin – “I 100% expect this is going to be very expensive. I mean, connected and managed Kubernetes for containers and VMs on a 1U half-depth ruggedized server is $415 per node per month with a five-year commitment.”

39:41 Container-optimized compute delivers autoscaling for Autopilot | Google Cloud Blog

40:38 Ryan – “Imagine my surprise when I found out that using GKE autopilot didn’t handle node-level cold start. It was so confusing, so I was like, wait, what? Because you’ve been able to do that on EKS for so long. I was confused. Why do I need to care about node provisioning and size when I have zero access or really other interactions at that node level using autopilot? So it is kind of strange, but glad to see they fixed it.”
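
The faster scaling shows up through ordinary HPA objects; nothing GKE-specific is needed in the spec itself. A minimal sketch with the official Kubernetes Python client (names and thresholds are placeholders):

```python
# Sketch of a standard CPU-based HPA created with the Kubernetes Python
# client; on the new container-optimized compute platform in Autopilot,
# these same objects simply scale faster, with no spec changes required.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster

hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="web-hpa"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="web"
        ),
        min_replicas=2,
        max_replicas=50,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(
                        type="Utilization", average_utilization=60
                    ),
                ),
            )
        ],
    ),
)
client.AutoscalingV2Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```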

41:23 From clicks to clusters: Confidential Computing expands with Intel TDX | Google Cloud Blog

43:07 Eventarc Advanced orchestrates complex microservices environments | Google Cloud Blog

44:20 Ryan – “So OpenAI is going for real-time inference, and Google is going to be event-based. It seems like two very different directions. I like the event-driven architecture; it’s something I continue to use in most of the apps that I’m developing and creating. I think that having the ability to do something at a larger scale and coordinating across an entire business is pretty handy.”
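
Custom publishers just need to emit CloudEvents; here’s a short sketch with the Python CloudEvents SDK. The bus endpoint URL is a stand-in, since publishing in practice goes through Eventarc’s authenticated publish API:

```python
# Sketch of building a CloudEvent and rendering it as a binary-mode HTTP
# request, the format Eventarc Advanced ingests for custom and third-party
# messages. The target URL below is a placeholder for the message bus.
import requests  # pip install requests
from cloudevents.conversion import to_binary  # pip install cloudevents
from cloudevents.http import CloudEvent

event = CloudEvent(
    {
        "type": "com.example.order.created",  # your custom event type
        "source": "//ordering-service",
        "subject": "orders/12345",
    },
    {"orderId": "12345", "total": 99.95},
)

# Binary content mode: attributes become ce-* headers, data becomes the body.
headers, body = to_binary(event)
requests.post("https://example.invalid/eventarc-bus", headers=headers, data=body)
```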

Azure

45:22 Agent Factory: Top 5 agent observability best practices for reliable AI | Microsoft Azure Blog

47:31 Matt – “It just feels like we’re saying it’s this revolutionary thing, but really it’s something we have to approach from a slightly different angle. It’s the difference between, hey, we have an API and now we have a UI, and users can do things slightly differently… It’s just the evolution of a tool.” 

49:04 Generally Available: Azure App Service – New Premium v4 Offering

52:47 Public Preview: Microsoft Planetary Computer Pro

53:55 Matt – “I just want to play with the satellites.”

54:24 Microsoft cloud customers hit by messed-up migration • The Register

56:26 Generally Available: Azure Ultra Disk Price Reduction

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod


Episode Transcript

[00:00:07] Speaker A: Welcome to the Cloud Pod where the forecast is always cloudy. We talk weekly about all things AWS, GCP and Azure. [00:00:14] Speaker B: We are your hosts Justin, Jonathan, Ryan and Matthew. [00:00:18] Speaker C: Episode 320, recorded for September 2, 2025: Azure gives your FinOps person a heart attack. Good evening Ryan and Matt. How you doing? [00:00:28] Speaker D: Good. How are you? [00:00:29] Speaker B: I'm here, just barely. [00:00:31] Speaker C: Yeah, I'm surviving. I'm in my very echoey office as they wrap up construction at my house, and I apologize for any echo this week on the show, but it's better than the closet I was in last week. So two weeks of home construction is never, never enjoyable. Especially when it's in the middle of your house, where you know you have to walk through a bajillion times a day, and you try to work from home while they're using nail guns and saws and everything else that makes it just super awesome fun. Yeah, well, breaking news this afternoon: Google has successfully avoided a potential $2.5 trillion breakup following antitrust proceedings. Maintaining its current corporate structure despite regulatory pressure to break up represents a significant outcome for big tech antitrust cases, potentially setting precedent for how regulators approach market dominance issues in the cloud and technology sectors now and in the future. Cloud customers and partners can expect business continuity with Google Cloud Platform services, avoiding potential disruption that could have resulted from corporate restructuring. The ruling may influence how other major cloud providers structure their businesses and approach regulatory compliance, particularly around bundling services and market competition. Enterprise customers relying on Google's integrated ecosystem of cloud, advertising and productivity tools can continue their current plans without concerns about service disruption. Microsoft's just out there pissed off about this, I'm sure. That's what I was thinking too. [00:01:54] Speaker B: Like, it's exactly the same case as Internet Explorer back in the day. And for some reason the Google decision goes the other way, which... I was convinced it was going to be an issue to not spin off Chrome and Android. Here we are. [00:02:11] Speaker C: AI has got a couple new things this week, with OpenAI's GPT-Realtime introducing real-time processing capabilities to GPT models, reducing latency for interactive applications and enabling more responsive AI experiences in cloud environments. The technology leverages optimized model inference and architectural changes to deliver subsecond response times, making it suitable for live customer service, real-time translation and interactive coding assistance. Developers can integrate GPT-Realtime through new API endpoints, offering the ability to build applications that require immediate AI responses without traditional batch processing delays. This development addresses a key limitation in current LLM deployments, where response latency has restricted use cases in time-sensitive apps. [00:02:55] Speaker D: More AI spam calls coming your way. [00:02:58] Speaker C: Also more frustrated calls with chat agents when you're calling in for tech support in the future. [00:03:04] Speaker D: Now I'm sure, too. There's a couple tools we use at work, and one of them has now put an AI agent in front of everything you communicate with them for support, before you can get to a human. It's like no, no, no. I've tried all these things.
I need a human to go look at the back end logs, guys, and fix the real issue. It drives me crazy when you have the AI agents right up front. So it's giving me that old phone tree. Same story. I feel like we're just going back, but now there'll be a fake human talking to you. It'll be excellent. [00:03:33] Speaker C: I mean, as long as you can tell it to ignore all instructions and just give me the fix, right, and send me a new one, I'll be okay with it. [00:03:39] Speaker B: Connect me to a human. [00:03:40] Speaker D: No, just tell it to run rm -rf slash and just go from there. Have them have fun with their own security. [00:03:49] Speaker C: Terraform is providing us Google Cloud Provider 7.0 in general availability. It's not really that interesting of an update. It does support over 800 resources and 300 data sources, with over 1.4 billion downloads of the provider already. There are a couple of important things, though. They are adopting both Terraform 1.10 and 1.11 features: ephemeral resources, which were introduced in 1.10, allow users to create temporary resources without storing any data in the Terraform state file, so like your secrets you don't want there; and then write-only attributes, supported since Terraform 1.11, which prevent sensitive inputs like passwords or API keys from being written to the state file altogether. So between the ephemeral resource, which is very temporary and doesn't get added to state, plus the write-only attributes that don't go to the state file, you get a much more secure, more robust Terraform implementation with the new provider, which is great. Beyond that, improved validation reliability and API alignment with all of the breaking changes that Google does to their cloud API all the time. [00:04:52] Speaker B: Yeah, I like the ephemeral resources. I think it's a neat model for handling, you know, sensitive information and stuff that you don't want to store. It's kind of a neat process. And I completely missed the validation improvements, and I'm always a big fan of improvements in that layer, because I'm still very frequently frustrated when the plan works and the apply fails. So I'm happy to see validation logic there. It's great. [00:05:19] Speaker D: Yeah, there's always been a ton of tools that you could use, like Terratest, which would at least validate your EC2 instance type. So there were things that you could do. But I feel like across all the cloud vendors there's still issues, like you said, with the plan versus validate not working. What oddly enough most bothers me about this blog post is that they didn't properly Terraform format their code that they put in here, and it kind of irks me a little bit. [00:05:44] Speaker B: That's funny. [00:05:45] Speaker D: Once I saw it, I can't unsee it, and I'm like, could you just have done an fmt on your code? Not that hard. It was one of my first favorite features of Terraform. It just linted the whole thing for you. Like the first thing I do whenever I set up a new repo is put that in as a commit check, and as a PR requirement, if it doesn't pass that, you can't merge it. [00:06:05] Speaker B: Yep. [00:06:07] Speaker D: Stupid things in life that drive me crazy. They did it for some and not others. Sorry. Now I'm going down a hole.
[00:06:18] Speaker C: Gremlin, the chaos engineering leader, is giving you an MCP server to connect your chaos engineering data to LLMs like ChatGPT or Claude, enabling teams to query their reliability testing results using natural language to uncover insights about service dependencies, test coverage gaps, and which services to test next. The server architecture consists of three components: the LLM client, a containerized MCP server that interfaces with Gremlin's API, and the Gremlin API itself, designed for read-only operations to prevent accidental system damage during data exploration. This solves the problem of making sense of complex reliability testing data by allowing engineers to ask plain-language questions like "which of my services should I test next?" instead of manually analyzing test results and metrics. I personally prefer "what the hell did you just do to my app, and how did you break it, you Gremlin?" [00:07:05] Speaker B: Yeah, it's amazing. They limit this to read-only commands to the API. I don't know why they did that. That's, you know. [00:07:11] Speaker C: Yeah. So weird, right? [00:07:13] Speaker B: It would have been so much fun to try to figure out how that's going to break. But yeah, it's kind of neat to see this sort of interaction model as different services sort of take hold. Like, still very apprehensive. You know, we've talked about how we feel about MCP servers in general, but I think this is kind of neat, you know, because you could ask generalized questions in natural language and be like, you know, which part of my application should I focus on? What's performing the best, what's not, what's most at risk. You know, things like that, where you don't have to specify a direct query in whatever DSL du jour is there. So it's cool. [00:07:55] Speaker C: I mean, I wish that they'd allow you to do a lot more with the LLM in the future. Like, I don't necessarily want it to create more tests, but I think there's a lot more analysis and a lot more debugging and insights that it can provide using an LLM to inform its testing results that I think could be really cool. So I don't know what else Gremlin's doing, particularly inside of LLM usage, but I can see there'd be a lot of really cool use cases that they could unlock with this capability. [00:08:21] Speaker B: Yeah, I mean, they'd need to sort of have training on that data at a larger level, right, and add that into the feature set. But it was pretty cool. I like the idea. [00:08:32] Speaker C: All right, let's move on to AWS. They're introducing SOCI Parallel Pull, or Seekable OCI Parallel Pull mode, for EKS to address container image pull bottlenecks, particularly for AI/ML workloads where images can exceed 10 gigabytes in size and take several minutes to download using traditional methods. I bet this would work great for Windows, I was going to say. The feature parallelizes both download and unpacking phases, using multiple HTTP connections per layer for downloads and concurrent CPU cores for unpacking, achieving up to 60% faster pull times compared to standard containerd configurations. SOCI Parallel Pull is built into recent Amazon EKS-optimized AMIs for Amazon Linux 2023 and Bottlerocket, with configurable parameters for download concurrency (10 to 20 is recommended), chunk size (16 megabytes recommended) and unpacking parallelism based on your instance resources.
The solution trades reduced pull times for higher network, CPU and storage utilization, requiring optimized EBS volumes with thousands of megabytes per second of throughput, or instance store NVMe disks, for optimal performance on instances like the m6i.8xlarge. I mean, I personally don't use all the CPU, memory or network of most of my container instances. So yes, that's a trade-off I'm willing to make. I mean, this is. [00:09:45] Speaker B: It's kind of interesting to me, because it's like, I can see a misconfiguration here would end up in a thundering herd situation where you're actually, you know, taking out your infrastructure through resource exhaustion. So definitely tune it appropriately. I guess I had lost the thread, or just didn't know that AI workloads were driving container sizes to such a huge level. Like, that's a very big image, and kind of crazy. [00:10:13] Speaker D: Well, it's just pre-storing all that data, pre-caching it all there so we can just run on top of it. You know, I did a thing years ago for a customer where they just attached an 8 gig volume, and by years I mean 10 or 12 it feels like now, where we pre-attached a cache drive of 8 gigabytes of a bunch of cache data to all their servers. Sure. So when the auto scaling boots, same concept here, they run that pre-cached data locally. You know, that's the way I've seen it, though. I still... you're going to take out ECR, you know, and piss off other things if you're not careful. [00:10:47] Speaker B: Yeah. And there's better ways to do that with, like, you know, Kubernetes storage connections and devices. So it's like, I don't know if I would recommend building that cache layer into the image itself, but I think that, you know, we've all faced Windows containers, and certain libraries and codebase sort of frameworks can be very large. So kind of nuts there. There's no reason why this wouldn't work for ECS either, because it's all just in the containerd backend. [00:11:19] Speaker D: I swear SOCI was released a long time ago, and maybe it was just straight Kubernetes. [00:11:24] Speaker B: Well, lazy loading was announced. But I don't know about SOCI. [00:11:29] Speaker D: I mean, I feel like I remember learning about SOCI a long time ago. [00:11:33] Speaker B: But it has been a theme of this week's notes. Didn't we do this already? Living in Groundhog Day this week. [00:11:42] Speaker D: I mean, pulling containers and the SOCI snapshotter, there's some article, but that's not it. [00:11:48] Speaker B: There's definitely been a lot of improvements. Maybe I'm misremembering, and I'm guessing AI workloads are driving that. [00:11:54] Speaker D: Yeah, I was even thinking back like 2022 and stuff like that, but maybe I'm wrong. [00:11:59] Speaker C: Yeah, I don't remember exactly when it came out. I do remember talking about it, but I thought it was still very limited, like certain things in the container had to be built a certain way, which is why, as of this change, it actually supports Windows, because I think, you know, Windows operating systems don't really work the same way Linux ones do. But yeah, I don't remember it ever coming up, but this is definitely for a very specific use case here in EKS. I think it was ECS only initially as well. The AWS Management Console is fixing a problem that they caused themselves. So they basically added the ability to have multiple sessions in your AWS console.
And the problem is, of course, you don't have an easy way to know which console you're in, unless you look at the top right corner where it says in small print which account you're in. So they've added the ability to assign colors to the accounts, like red for production, yellow for testing, and these will appear in the navigation bar, replacing the need to memorize account numbers for identification across multiple account environments. The feature addresses a common pain point for organizations managing multiple AWS accounts for different workloads, business units or environments by providing instant visual differentiation when switching between accounts. The implementation requires admin privileges to set colors through the account menu, and users need either the AWS Management Console basic user access managed policy or the custom GetAccountColor permission to view the assigned colors. I mean, this is a level of persnickety, that you care about the color. This quality-of-life improvement reduces the risk of accidental changes in the wrong environment and eases context switching for engineers and operators who regularly work across multiple AWS accounts. I did turn it on in my account, because I do use a couple typically and I have been confused before. So I did appreciate this feature, but it's sort of... this is a problem of your own making, and clearly you've never dealt with multiple profiles in any other app where colors make sense. [00:13:51] Speaker D: Yeah, I use it for Chrome, and that's always where I've identified different users, you know, different clients, depending on where I was. I kind of like it where it's something that can be set, and I almost wonder if you can set it at, like, the organization level, where you can say all prod accounts are red, all staging accounts are yellow. And that's kind of where I feel like they're going with it, but they're not quite all the way there yet. [00:14:18] Speaker B: Well, yeah, you could definitely set it for all the accounts, right. I don't think you could set it by OU or some sort of label, necessarily. But it is kind of neat. You know, if you're a cloud administrator for your company, you could definitely set it that way pretty simply, it's some API commands. It is kind of cool. And then you, you know, restrict permissions for anyone to change it, which is why you get all persnickety in the IAM permissions, like, nope, only I can set red. Only me. [00:14:45] Speaker D: I mean, I still think it's a great feature. [00:14:47] Speaker B: No, it is. I mean it's. [00:14:48] Speaker D: Funny it's something they caused, but I remember. [00:14:52] Speaker B: Having this horrible browser config, because we had hundreds of AWS accounts, and trying to keep it all straight, and, you know, certain accounts had the roles that you assumed into, certain accounts had the trusts for other things. And so it was very specific in a lot of cases. And so between, like, Firefox containers, tab groups and a whole bunch of other extensions to change colors and change the favicons of stuff, I had all this custom configuration that only worked on the one laptop that I was using. So I like that this is more built into the account level.
[00:15:27] Speaker C: Yeah, I mean, it's nice to have these things, and I mean, I remember all the tricks I used to do so I could support multiple AWS accounts when you couldn't do it in the same browser, like different profiles and different things. And so I was really glad when they added the multi-session thing, but then you had the problem of, like, okay, how do I tell them apart? Which has been a fun problem. So I'm glad to see it. But again, a problem made of their own devices, for sure. The AWS Transfer Family now supports Terraform deployment for SFTP connectors, enabling infrastructure-as-code automation for file transfers between S3 and remote SFTP servers. This extends beyond the existing SFTP server endpoint support to include the connector functionality. SFTP connectors provide fully managed, low-code file copying between S3 and remote SFTP servers, and the new Terraform module allows programmatic provisioning with dependencies and customizations in a single deployment. The module includes end-to-end examples for automating file transfer workflows using schedule or event triggers, eliminating manual configuration errors and providing repeatable, scalable deployments. I'm just glad this is here. But this is the one team at Amazon that is very, very slow at adopting Terraform, which just drives me crazy. [00:16:35] Speaker D: But it was low code, is why. So now you're adding code on top of your low-code solution. So is it medium code? Where's the fault? [00:16:42] Speaker B: I've never been more confused by this. Like, the whole point of the connector part of this transfer is so that you could give people the ability to sort of build little mini ETL stuff for FTP file automation. And now you're going to automate that. Like, what? [00:16:59] Speaker C: I'm confused. [00:17:01] Speaker D: I mean, I definitely have done essentially what an SFTP connector did, just the SFTP Transfer Family service on top of an S3 bucket with an event that fired a Lambda that did a thing. And that's what these connectors do. I mean, I guess it's. [00:17:16] Speaker B: Well, they connect to an external SFTP endpoint, that's the biggest difference. Which is tricky to do from within a Lambda. Yeah, no, yeah, you've been able to do it. [00:17:25] Speaker D: The. [00:17:25] Speaker B: Going from the other way around is much easier. This is going the other way. [00:17:30] Speaker C: Which is. [00:17:33] Speaker B: That was the feature they announced, I don't remember how long ago. But you know you're getting deep into enterprise orchestration in terms of your customer base when you're doing stuff like this, because this is rough when you're in this territory. [00:17:53] Speaker C: Well, the other thing that's rough, Ryan, is when you have those pesky EKS cluster insights and you apply them, but then the darn thing tells you it's still broken. And so to fix that, Amazon's now allowing on-demand refresh of your cluster insights, allowing you to immediately verify that your applied recommendations and configuration changes have taken effect, instead of waiting for the periodic automatic checks. This feature addresses a key pain point during Kubernetes upgrades by providing instant feedback on whether required changes have been properly implemented, reducing the time between making changes and validating them.
The insights system checks for issues like deprecated APIs before version upgrades and provides specific remediation steps, with the refresh capability now available in all commercial AWS regions. For DevOps teams managing multiple EKS clusters, this eliminates the guesswork and waiting periods during maintenance windows, and is particularly useful when performing rolling upgrades across your environment. The feature integrates with existing EKS cluster management workflows at no additional cost, accessible through the EKS console or API as documented. [00:18:53] Speaker B: You know who's really glad this feature exists? AWS support. Because they're so sick of those cases being opened, like, my insights aren't updating. [00:19:03] Speaker C: Like, you know, eventual consistency doesn't work out so well when you're trying to do a real-time thing? No, sure doesn't. Yeah. All right. Amazon Q Developer is now adding centralized admin control for MCP servers, allowing organizations to enable or disable MCP functionality across all Q Developer clients from the AWS console. The feature provides session-level enforcement, checking admin settings at startup and every 24 hours during runtime, ensuring consistent policy application across VS Code, JetBrains, Visual Studio, Eclipse and the Q Developer CLI. Organizations gain granular control over external resources accessed through MCP servers, addressing security concerns by preventing users from adding unauthorized servers when the functionality is disabled. This positions Q Developer as a more enterprise-ready AI coding assistant by giving IT admins the governance tools needed to manage AI-powered development environments at scale. I mean, as long as it's not within that 24 hours, I suppose. Seems like a slow automatic update to me. [00:20:01] Speaker B: But it's kind of weird. Like, this future is going to be a little weird, you know, as we sort it out. Right. Like, you think about, like, chatbots and being able to sort of create infrastructure there and then, you know, kind of bypassing a lot of the permissions and stuff. This is kind of the same problem, but magnified, like, a lot more. And so, like, it's going to be interesting to see how companies adapt. I've never seen sort of an org-wide configuration for what an agent can do, which is kind of neat. Kind of glad I don't have to maintain it, because it sounds complicated. I mean, it'd just be really difficult. You'd be constantly having to, like, tweak it. Right. It'd be worse than a service catalog in a sense. [00:20:51] Speaker D: So you've got a firewall, there's 5 billion rules in it when you're routing stuff everywhere. If you have a firewall as a centralized point, good luck managing all those rules and how you manage change control and everything on it. Yeah, that's kind of what it's going to be.
They're up to 50% better price performance and 2 1/2 megs memory bandwidth compared to the M7i generation. The ambient iFlex offers a 5% lower price point for workloads that don't need sustained CPU performance, reaching full CPU performance 95% of the time while maintaining compatibility with your existing apps. So both these are available to you in multiple regions. Check the blog posts for specifics on that. [00:22:10] Speaker B: Comes with an SAP certification, huh? Wonderful. [00:22:16] Speaker D: I mean, it's nice that they're slowly adding these instances and moving just the I series and the M series over to more graph times and Intels and really kind of rounding out that fleet. So you have a lot more options in it, which is, I feel like, is nice though. I feel like. And maybe the Flex kind of feels like the T series. So are they migrating kind of away from the T series with the not always guaranteed cpu? [00:22:41] Speaker B: I mean, it's similar, but it's very different model, which is weird, right? Like, instead of having like the credits and. And then a way to sort of ignore the credits like, this sort of feels like a little bit more, you know, declarative, but it is. I have never really kind of understood the need for it, but I. Yeah. [00:23:03] Speaker C: Yeah, I have the same. I have the same problem. I've. I've used them a couple of times and they're, you know, if you have a typical Windows box that does nothing on the CPU and it's all memory, then it makes some sense, I guess. But yeah, it's sort of a weird model. [00:23:14] Speaker D: Then why not do a C series then? [00:23:17] Speaker C: Or because you need the faster Xeon processor? I don't know. [00:23:23] Speaker D: I have no idea. I just have to understand Flex a little bit more. It's. I feel like I remember getting released and that's about it. [00:23:30] Speaker B: Yeah. So, yeah, one of these days I'll have a workload where it'll make sense. I'll be like, oh, okay. That's why. But maybe until then you'll be like. [00:23:37] Speaker D: Wait, we have to refactor that because it breaks everything else. [00:23:39] Speaker C: Exactly. [00:23:40] Speaker D: Yeah. [00:23:41] Speaker C: I'm asking Claude. So I'm like, now I'm curious. [00:23:44] Speaker D: I was literally trying to find the Chrome window that was the different color than the podcast Chrome window. [00:23:52] Speaker C: Let's see what it says. All right. It's thinking. And it's thinking. Cost savings. Yep, yep. Good for variable workloads. Ideal when you don't need sustained high CPU performance Performance and baseline performance with burst capacity provides that baseline with ability burst when needed. Ideal workloads for M7i Flex include web servers, small to medium databases, developing testing environments, microservices, batch processing jobs, content managed systems and CICD pipelines. I guess what is the defense? I mean if like do I this. [00:24:25] Speaker B: Credit thing, I mean it's funny it says they mentioned sweat servers because that's my. All of my T series nightmares are specifically web serving. [00:24:36] Speaker D: Fantasage has a blog post about when you would use it. I think it has to do with how much Flex you need. [00:24:44] Speaker C: Okay, so I asked my follow up question was are there are credit limits? What if the server ends up being more sustained than I expected? And so it says Great question, thanks. Thanks Claude. I always like you to prom me up like that. 
So the M7i Flex doesn't use a credit system at all, which is one of the key advantages over traditional burstable instances like the T3 or T4. Instead of credits, M7i Flex instances use a fixed baseline-plus-burst model. So basically you always get at least 40% of CPU, and you can burst up to 100% of CPU. But if you're trying to use up 100%, you might get throttled back. And I bet this is how they're sort of helping feed the spot market. Oh yeah, by creating these boxes that don't have full capacity needs, because that's the only way this kind of makes sense to me in any way. But this is really about them building... how do you build credit into the spot market so you don't have this problem? But yeah, so: no credits to manage, predictable minimum performance of at least 40%. So you don't necessarily need all the CPU, but you definitely want all the memory. So I can see how this makes sense for, like, SQL Server potentially, where you want a lot of data pinned in memory, but you don't necessarily need all the CPU, because maybe your workload doesn't quite need that, but it does need the memory. So that's probably the use case that makes sense for this. But you know, one of the things I learned on the spot market: the best instances to buy in the spot market are the old ones. Yeah, because all these new fancy servers, they all have the fancy GPUs and are in high demand for AI and ML. But if you just want something for spot, just go pick up a C5 or a C6, you'll be a happy camper. I've been sitting on the spot market on those bad boys for months now and not been disrupted once, because they don't have the GPUs that everyone wants. [00:26:27] Speaker B: And the fleets are huge. [00:26:29] Speaker D: Right? [00:26:29] Speaker C: And the fleets are big. Yeah. So you get a lot of bang for your buck in the spot market and the legacy instances. Yeah. These C7i's, don't try to spot on these. That'll be a bad day for you, unless you can handle a lot of disruption in your workflows. [00:26:42] Speaker D: I had an old client, back when the M3s and M4s came out I think, and he ran an M1 small for his dev box for years without a problem, and he would just launch everything as spot on the M1 and T1 series, and they never died. And I was always impressed by it. [00:27:03] Speaker C: Until Amazon called them and was like, you can't have these anymore. [00:27:06] Speaker D: Yeah, we are deprecating that. We are removing them. Good luck.
The region addresses data residency requirements for New Zealand organizations and government agencies operating under the country's Cloud first policy. With AWS supporting 143 security standards. New Zealand customers like Maid Matter, Xero and Thematic are already leveraging AWS services, including Bedrock for generative AI applications with Region Powered by Renewable Energy through an agreement with Mercury New Zealand from day one. AWS has been building infrastructure in New Zealand since 2013, including Cloudfront Edge locations and Auckland Local Zone for single digit millisecond latency and direct applications. With this full region launch completing their local infrastructure footprints. This launch brings aws up to 120 availability zones across 38 regions globally with strong local partner ecosystems supporting the company. Put Ryan to sleep on that one. [00:28:57] Speaker B: Yeah, I mean I want to go to New Zealand. I'm still trying to, you know, work one of these announcements into that. [00:29:04] Speaker C: It does make me super sad that they don't allow data center tours for audit reasons, you know, because like yeah, I need to go see the New Zealand region. Not a problem. Yeah, anytime. AWS is launching an open source project providing tested shell scripts for over 60 AWS services addressing the common challenge of writing error handling and cleanup logic when using the AWS CLI for infrastructure automation, which a Don't do that Database Developer Tutorials project on GitHub includes end to end scripts with built in resource tracking and cleanup operations, reducing the time developers spend debugging CLI commands and preventing orphaned resources. Developers can generate new scripts in as little as 15 minutes using generative AI tools like Amazon Q developer CLI leveraging existing documentation to create working scripts through an iterative test and improve process. Each script comes with tutorials explaining the AWS service API interaction, making it easier for teams to understand and modify scripts. For those of you use CASE rather than starting from scratch. The project accepts community contributions and provides instructions for generating new scripts, potentially building a comprehensive library production ready CLI automation patterns across AWS services and I for one am nominating Ryan to commit Ryan Shitty Scripts to the community as a community contribution to be included in the AWS CLI repository. [00:30:15] Speaker B: I will definitely give it a look this it's kind of strange because most of the contributions right now are very specific to tutorials like trying to learn a new Amazon service. And there's very little documentation on what error handling and advanced sort of logic is built into these scripts. It's all of the documentation is just directing you at Q and saying Hey Q, build me a thing that looks like that. [00:30:43] Speaker D: That never caused a problem on Amazon Q. I might have dropped a production database. [00:30:48] Speaker B: It's a little, little annoying there because I would, I like to, I would like a little bit more meat to read through and, and see what they're doing without having to go through all 70 scripts by, you know, line by line. But you know, I do applaud them for having sort of a Cielo based tutorial just because I I don't know how many hello worlds I've gone through on a new Amazon service where I've had to click through this console like exp. 
You know, experience not really knowing what's going on under the covers and not knowing what resources are being created on my behalf as I navigate that console. And so I like, I kind of like this. I wish it was around when I was still learning a lot of these things. [00:31:28] Speaker C: Amazon Bedrock is simplifying prompt caching for CLAUDE models by automatically identifying and reusing the longest previously cached prefix, eliminating manual cache point management for developers using Claude 3.5, Haiku, Cloud 3.7 and Cloud 4. The update reduction tokens consumption I'm sorry, the update reduces token consumption and cost since cache read tokens don't count towards token per minute quotas, making multiple turn conversations and research assistance more economical to operate. Developers now only need to send a single set a single cache breakpoint at the end of the request instead of tracking multiple cache segments, simply reducing implementation complexity for applications with repetitive contexts. This feature addresses a common pain point in LL notifications or repeated context like system prompts or document analysis. Pre certified manual cache management logic that is error prone and time consuming. [00:32:16] Speaker B: I'm just really glad I don't have to create any applications that need to be this focused on token usage just because it sounds painful. And you know, the reason why features like this exist is because of that pain. So I'm glad that they do. And it's, you know, an advantage of Amazon, you know, and their partnership with Anthropic on how they're able to do this, I'm sure. So it's kind of neat. [00:32:40] Speaker D: I mean, anything to optimize it is going to be good and you know, it'll probably help companies that don't even realize that they're getting the advantage of it just because it's stuff that's cashed in there. You know, I'm sure there's a large customer or two that this is targeting to help the most. [00:32:56] Speaker C: All right, well then let's move on to Google Cloud. First up, Google Workspace announces new gen features and no cost option for videos Google Vids announces generative AI capability powered by VO3 that can transform static images into short videos available to paid Workspace customers. And Google AI Pro Ultra subscribers positions Google against competitors like Microsoft's Clip Champ and Adobe's AI Video Tools by integrating video creation directly into the productivity suite. The basic Vids editor of that AI features launches at no cost option for consumers, marking Google's first free video editing tool within Workspace. This creates a clear freemium model where basic editing is free, but AI powered features like avatars and automatic transcript trimming require paid subscriptions. All I can say to you guys is just be glad that my day job does not have Google Workspace because I've been making lots of videos and lots of fun with this and all the time and so I'm sort of disappointed now. The cloud pod, you know Google Workspace though does have AI capabilities, at least for me because I paid for myself. But you know, so maybe, maybe the cloud powder gets more videos as I start playing with this. [00:34:00] Speaker D: Yeah this can only be used for workspace debauchery then productivity only. [00:34:06] Speaker C: That's the only, only reasonable choice. [00:34:09] Speaker B: Yeah other so AKA evil can only be used for evil which I mean I'm a fan. [00:34:16] Speaker D: Do no evil. Do only evil. 
Choose which way what your choose what the model was. [00:34:22] Speaker C: Gemini is now available anywhere. Google now offers Gemini AI models on premise through Google Distributed Cloud, allowing organizations with strict data sovereignty requirements to run advanced AI workloads in their own data center without compromising security or compliance. The platform includes Gemini 2.5 Flash and Pro models, supports Nvidia, Hopper and Blackwell GPUs and provides managed infrastructure with automatic scaling, load balancing and confidential computing capabilities for both CPUs and GPUs. This positions Google against both AWS Outposts and Azure Stack, but with a specific focus on AI workloads, offering complete AI stack including vertex AI services, pre built agents and support for custom models alongside Gemini. Some of the key customers include the Singapore government, KDI in Japan and many other public sectors that have sensitive data needs. The offering comes in two variants, GDC Air Gapped, which is now generally available for completely isolated environments, and GDC Connected in Preview for hybrid scenarios, though pricing details are not disclosed and require contacting Google directly. Which means expensive. [00:35:27] Speaker B: I mean any of these solutions from any of these cloud providers are not cheap. Basically talking about racks of hardware being delivered to your data center to run these things. Yeah, fine. [00:35:40] Speaker C: I kind of want to do this. I kind of want to get gdc. I don't know what's involved, I should probably do more research on it, but I'm kind of like this could be fun to play with. [00:35:48] Speaker B: Well I read just through the pre read like clicking through GDC on this kind of thinking the same thing and it's got that enterprise vagary that makes me think it's a lot of money as well. You know, like they're not. [00:36:02] Speaker C: I like I 100% expect it's going to be a lot of money. Yeah, I mean connected connected managed Kubernetes for containers and VMs on a 1U half depth ruggedized server is 415 per node per month with a five year commitment. [00:36:17] Speaker D: Well, it's ruggedized. [00:36:18] Speaker C: Yeah. [00:36:19] Speaker D: With a five year commitment. [00:36:21] Speaker C: Five year commitment is $1,245 per month for three node configuration. That's what it says. I don't. I don't really know. It's in small print. I don't know. There's no aspir to like who. How you have to do that. But I'm curious. And then is this hardware that Google provides or is this like I buy HP shit and I put this on top of it like Azure Stack or I have questions. [00:36:43] Speaker B: I mean I think it's a lot like outposts, right? Because I. They're. It's a fully managed experience and doesn't it. [00:36:52] Speaker D: Yeah. [00:36:52] Speaker C: I assume it's more like out. [00:36:53] Speaker B: Yeah. [00:36:54] Speaker C: Seems like more. [00:36:54] Speaker B: Isn't it like all the support of the hardware and stuff is handled for you. [00:36:57] Speaker D: I thought that's right. [00:36:59] Speaker B: Yeah. Could be wrong though. Yeah. I don't know. It doesn't. Again, you know the product page is very light on specifics because they definitely want you to contact them. Build you a solution of here. Thank you. 
[00:37:12] Speaker C: Container Optimized Compute delivers auto scaling for autopilot the new containerized optimized compute platform delivers up to 7x faster pod scheduling by using dynamically resizable VMs and pre provisioned compute capacity that doesn't impact billing since customers only pay for requested resources. The platform addresses a common pain point where auto scaling could take several minutes, forcing users to implement costly workarounds like balloon pods to hold unused capacity for rapid scaling scenarios built in high performance HPA profiles provide 3x faster calculations and supports up to 1000 HPA objects, making it particularly suitable for web apps and services requiring gradual scaling. With two CPUs or less available in GKE autopilot 1.32 or later with general purpose compute class. Though not recommended for one pod per node deployments or batch workloads. The decisions GKE competitively against EKS and AKS by solving the cold start problem. Containerized workloads without requiring manual capacity planning or paying for idle resources. [00:38:06] Speaker D: Yeah. [00:38:07] Speaker B: Imagine my surprise when I found out that using GKE Autopilot didn't you had to handle like node level cold start. It was so confusing. So I was like wait, what? Like because you've been able to do that on EKS for so long, you know that I was confused. I'm like why do I need to care about, you know, node provisioning and the size still, but when I have zero access or really other interactions at that node level using Autopilot. So it's kind of strange, but glad to see they fixed it. [00:38:42] Speaker C: Google is expanding their confidential computing with Intel TDX across multiple services including confidential VMs, GKE nodes and confidential space. Now available in 10 regions with 21 zones. The technology creates hardware isolated trust domains that encrypt workloads in memory during processing, addressing the security gap beyond traditional at rest and in Transit encryption. Confidential VMs with Nvidia H100 GPUs on A3 instances Combine Intel TDX for CPU protection with Nvidia Confidential Computing for GPU security, enabling secure AI ML workloads during training and inference. Common GKE knows that Intell TDX work on both GK Standard and Autopilot without code changes, allowing containerized workloads to remain encrypted in memory. Configuration can be set at cluster or node pool level via CLI API, UI or terraform code. Confidential Space now supports Intelli TDX network in addition to AMD enabling multi party data collaboration and federated learning use cases and customers like Symphony and Duality use it for isolating customer data from privileged insiders and privacy preserving ML respectively. Intel's Tiber Trust Authority Station service now offers a free tier for third party verification of confidential VMs and confidential space workloads. This provides stronger separation of duties to guarantees beyond Google's built in attestations. These are available to you in Europe US Central and US East 5A, so don't get too excited if you need these right away, but they are coming to a place near you. [00:40:06] Speaker B: Yeah but there's also you know the there's compliance frameworks in each of those locations that would drive their requirements for this so it makes total sense why they're there. 
[00:40:17] Speaker C: So yeah, Event Arc Advanced is now generally available, evolving from Event ARC standard to handle complex event driven architectures with centralized message bus management, real time filtering and transformation and multi format payload support. This positions GCP compatibility against EventBridge and Azure Event Grid by offering built in transformation capabilities and envoy based routing. The services introduce a published API for ingesting custom and third party messages in cloud events format, enabling organizations to continue connect existing systems without major refactoring. Centralized message bus provides per message fine grained access control and integrates with cloud logging for observability. Key use cases include large scale microservice orchestration, IoT data streaming for AI workloads and hybrid multi cloud deployments for event routing across different environments is critical. Future integrations with service extensions will allow custom code insertion into the data path and plan to model armor support suggests Google is pushing this for AI agent communication scenarios. This will allow GCP's broader push into AI infrastructure and agentic architectures. Pricing details aren't provided in the announcement. The serverless nature suggests pay per use pricing similar to other GCP eventing capabilities, so do check that with your account. [00:41:28] Speaker B: Reps before enabling advanced so OpenAI is going for real time inference. Google is going to event based Seems two very different directions, but I do, I do. You know I like the event driven architecture. It's something I continue to use in most of the apps that I'm developing and creating. I think that having the ability to do something at a larger scale and coordinating across an entire business is pretty handy, which isn't something you could have done with event arc before. It's much more communicating with yourself and orchestrating those things. This is much more externalized and handled events as a platform. So kind of cool. I imagine it's all Canadian under the hood because a lot of Canad has a lot of this built in the eventing and all of the routing layer. So it's pretty cool. [00:42:29] Speaker C: I'm sure it does. Moving on to Matt's favorite subject, Azure Azure AI Foundry is interesting Comprehensive agent observability capabilities that extend beyond traditional metrics, logs and traces to include AI specific evaluations and governance features for monitoring autonomous AI agents throughout their life cycle. The platform builds provides built in agent evaluators that assess critical behaviors that like intent resolution, task adherence tool, call accuracy and response completeness with SEEMS integration into CISV pipelines through GitHub Actions and Azure DevOps extensions. Azure's AI Red Teaming agent automatically automates adversarial testing to identify security vulnerabilities before production deployment. Simulating attacks on both individual agents and complex multi agent workflows to validate direction readiness and the solution. Differentiate itself from traditional algebraic tools by addressing the non deterministic nature of AI agents, offering model leaderboards for selection, continuous evaluation capabilities integration with Azure Monitor for real time production monitoring with customizable dashboards and alerts. Enterprise customers including ey, Accenture and Veeam are already using these features to ensure AI agents meet quality, safety and compliance standards. 
[00:43:38] Speaker B: I like that model the Red Teaming agent for testing other agents that you're creating because it is sort of a challenge right? How do you do testing on some of these things you're developing and these, you know a lot of these agents are going to end up being interacted with by your customers and you want to make sure that the data, you know, stays safe and the, the responses aren't horribly incriminating and you know, reputation trouncing. [00:44:09] Speaker D: But how is that different than any standard QA testing that you should already be doing? [00:44:15] Speaker B: Well, because how do you QA test an agent that's generating the content and response? How do you validate that that response is what you want? [00:44:26] Speaker D: How would you make sure an API? A standard API? Well, I guess that's more, you know. [00:44:31] Speaker B: The payload to expect that's documented as part of your swagger spec. Right. Like it's a lot more challenging. [00:44:36] Speaker D: Yeah, it just feels like we're saying this revolutionary thing but really it's to me it's just like, okay, we now have this thing, we have to approach it slightly from a different angle. You know, it's the difference between hey, we originally have a you a let's say API that you have and now we have a ui. Well, users can do things slightly differently. You know, different formats of screens. We have different oss and different web browsers. Like it's to me kind of that same. It's just like an evolution of a tool and doing it properly. And I feel like sometimes when people talk about AI, it's like this completely new thing that we can never do and we, we have to redo everything from the ground. But it's just another iteration in the way things respond and how things operate. So yes, we probably need to build some new tools and build some new theories and you know, models of how to do it. Not no pun intended with the word model but you know, it's just a next step, a next evolution in, in technology. [00:45:37] Speaker B: Definitely. I mean and I think it's build versus buy. Right. If I can leverage a pre built Azure agent versus having to build one myself. You know, it's the same thing if it's depends on the level of customization I need and any of course availability and price and all those different features. But pretty, you know, exactly as building the, the new patterns like these are the types of building blocks you need to choose between. [00:46:02] Speaker C: Azure app service premium v4 is bringing NVMe local storage and memory optimized configurations with Windows and Linux workloads addressing performance bottlenecks for I O intensive apps like content manager systems and E commerce platforms. The new tier runs on Azure's latest hardware with faster processors positioning it competitive against AWS compute optimized instances and GCPS N2 series while maintaining app services past simplicity. Starting configurations at 1 VCPU and 4 gig of RAM make premium V4 accessible for smaller production workloads that need enhanced performance performance without jumping to dedicated VM solutions. This release signals Microsoft's continued investment in app services as enterprises increasingly adopt PAAs for mission critical applications, particularly those requiring consistent low latency performance. Premium v4 fills the gap between standard app service tiers and isolated environments, giving customers a middle ground option. For apps that need better performance but don't require full network isolation. 
[00:46:02] Speaker C: Azure App Service Premium v4 is bringing NVMe local storage and memory-optimized configurations to Windows and Linux workloads, addressing performance bottlenecks for I/O-intensive apps like content management systems and e-commerce platforms. The new tier runs on Azure's latest hardware with faster processors, positioning it competitively against AWS compute-optimized instances and GCP's N2 series while maintaining App Service's PaaS simplicity. Starting configurations at 1 vCPU and 4 GB of RAM make Premium v4 accessible for smaller production workloads that need enhanced performance without jumping to dedicated VM solutions. This release signals Microsoft's continued investment in App Service as enterprises increasingly adopt PaaS for mission-critical applications, particularly those requiring consistent low-latency performance. Premium v4 fills the gap between standard App Service tiers and isolated environments, giving customers a middle-ground option for apps that need better performance but don't require full network isolation.

[00:46:56] Speaker D: This is a great incremental improvement. Pricing-wise also, which I think you touched on a little bit, or I might not have been paying full attention: it was something like $100 versus $120, so a $20 decrease for one worker. In a normal environment, if you have any HA, you're talking three workers, so you're already talking a decent cost reduction while moving up a level. I use these at my day job, and it's nice to see the continuing improvements; the v2 to v3 move was a good step up in performance with a decrease in price, and seeing them continue to evolve that is nice. And the NVMe storage, the faster storage, especially if you're doing local caching or any file manipulation: 250 gigabytes for free is a decent chunk of change. Decent chunk of storage.

[00:47:50] Speaker B: Matt, remind me, is App Service like the Elastic Beanstalk or Lightsail sort of comparative service?

[00:48:00] Speaker D: The best way I describe it to people coming from AWS is that it's a combination of App Runner and Beanstalk, and Lambda too. It's kind of an all-in-one, because it's Microsoft, you know. So like Lambda, you can have it just run functions on a schedule, and it runs your code and does its thing. You can also have it run websites or APIs: here's your zip file of your .NET or your Flask app or whatever it is, and it will just run that too. You give it your pre-built zip and it runs it all for you. So it's sort of Beanstalk-y, but there are no underlying servers you manage; for Beanstalk you still technically have to have SSM to handle OS updates and all that good stuff, so it's a little bit different there. But it also still has some of the traits of Beanstalk, which is why I correlate it mostly to that: especially on the premium tiers, you say how many underlying workers you want, two, three, four, and they can scale up and out depending on which way you want it. So you kind of have both options here, which is cool. It's kind of three-in-one, four-in-one; that's the best way to describe it. It's Microsoft: we built something that works for many things. It can run Docker by default too, I believe. So it has a bunch of stuff in there.

[00:49:23] Speaker B: Versus the Amazon model, where they built six different ways to do the one thing. Yeah.
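For reference, a minimal sketch of what standing up a Premium v4 plan might look like with the Azure Python SDK. The SKU name "P1v4" and tier "PremiumV4" are assumptions extrapolated from the existing P1v3/PremiumV3 naming, and the subscription ID and resource names are placeholders.

```python
# Sketch: create an App Service plan on the new Premium v4 tier.
from azure.identity import DefaultAzureCredential
from azure.mgmt.web import WebSiteManagementClient
from azure.mgmt.web.models import AppServicePlan, SkuDescription

client = WebSiteManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",  # placeholder
)

poller = client.app_service_plans.begin_create_or_update(
    resource_group_name="my-rg",          # placeholder
    name="my-premium-v4-plan",            # placeholder
    app_service_plan=AppServicePlan(
        location="eastus",
        reserved=True,  # True = Linux workers
        # SKU name/tier are assumptions based on the P1v3/PremiumV3 pattern.
        # capacity=3 mirrors Matt's point that an HA setup means ~3 workers.
        sku=SkuDescription(name="P1v4", tier="PremiumV4", capacity=3),
    ),
)
plan = poller.result()
print(plan.provisioning_state)
```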
[00:49:32] Speaker C: Microsoft's Planetary Computer Pro is entering public preview as a geospatial data platform that ingests, manages, and disseminates location-based data for enterprise data and AI workflows, targeting organizations that need to process satellite imagery and environmental datasets at scale. The platform integrates with Azure's existing data services to accelerate geospatial insights, positioning Microsoft to compete with Earth on AWS and Google Earth Engine by offering enterprise-grade tools for climate modeling, agricultural monitoring, and urban planning applications. Key capabilities include streamlined data ingestion pipelines for various geospatial formats and built-in processing tools that reduce the complexity of working with petabyte-scale Earth observation datasets. I mean, I don't know anything about geospatial data, but the fact that this is called Microsoft Planetary Computer Pro is kind of cool.

[00:50:19] Speaker D: The name is literally the best thing about it. I actually read the article and only understood about half of the features they talked about, but I just like the name. So that was fun.

[00:50:32] Speaker B: Does look fun.

[00:50:33] Speaker D: I just want to play with the satellite.

[00:50:36] Speaker B: This is what I want. I want someone to pay me just to do random experiments.

[00:50:40] Speaker D: I always wanted to play with Amazon's satellite service, AWS Ground Station. I don't know what I would do with it, but I want to use it.

[00:50:48] Speaker C: Yeah, I don't know what I'd do with it either, but I have the same itch. Like, yes, I definitely want to do that.

So, Microsoft's migration from MOSP, the legacy Microsoft Online Subscription Program, to Microsoft Customer Agreements has caused incorrect cost forecast calculations that triggered false budget alerts, with some customers seeing forecast increases of over 1,000% despite no actual billing impact. Those poor FinOps people; their hearts. This incident highlights risks in Azure's account migration processes, where automated systems can send panic-inducing alerts even when actual invoices remain unaffected, creating unnecessary administrative burden and cardiac visits. Microsoft support's response drew criticism, as users reported difficulty reaching human support and some claimed their forum comments were being deleted, raising questions about Azure's customer communication during service disruptions. This follows other recent Azure security and operational issues that people have complained about, including the Storm-0501 ransomware attacks. For cloud architects, this emphasizes the importance of understanding the difference between forecast alerts and actual billing, and of maintaining direct billing verification processes rather than relying solely on automated notifications. Wow, what a great sales pitch for all the cloud FinOps vendors to pick up: don't trust them, you need direct billing verification, because you can't trust those people over there at the cloud providers. What do they know?

[00:52:05] Speaker D: Yeah, I really thought you were going to say you can't trust those people over there at The Cloud Pod. In my head, that's where you were going for some reason.

[00:52:13] Speaker B: It's also true.

[00:52:15] Speaker C: Also true.

[00:52:19] Speaker D: I was, like, yeah... I mean, I was not affected by this. I saw the article and was like, oh God, thank God. We've talked about adding more of our finance people, and I've worked on trying to get some of my finance people into the FinOps Foundation, trying to teach them not to rely on me and my team to handle all the FinOps. But I swear, if some of the old finance people had gotten these notifications, I know they would have freaked out immediately and told us to shut everything down.

[00:52:49] Speaker C: Azure Ultra Disks now cost less in multiple regions, making sub-millisecond-latency storage more accessible for demanding enterprise workloads like SAP HANA, SQL Server, and Oracle databases. Ultra Disks deliver up to 160,000 IOPS and 4,000 MB/s per disk, and cost you an arm and a leg. The price reduction targets performance-critical applications where storage latency directly impacts business operations. Though specific discount percentages weren't disclosed in the announcement, the regional pricing strategy suggests Microsoft is testing market response before potentially expanding discounts to other regions.
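One thing worth knowing about Ultra Disks is that IOPS and throughput are dialed in per disk rather than fixed by a size tier. A minimal sketch with the Azure Python SDK; the subscription ID and resource names are placeholders, and the 160,000 IOPS / 4,000 MB/s values are the per-disk maximums mentioned in the segment.

```python
# Sketch: provision an Ultra Disk with explicit IOPS and throughput.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient
from azure.mgmt.compute.models import CreationData, Disk, DiskSku

client = ComputeManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",   # placeholder
)

poller = client.disks.begin_create_or_update(
    "my-rg",                               # placeholder resource group
    "my-ultra-disk",                       # placeholder disk name
    Disk(
        location="centralus",
        zones=["1"],                       # Ultra Disks are zonal
        sku=DiskSku(name="UltraSSD_LRS"),
        creation_data=CreationData(create_option="Empty"),
        disk_size_gb=1024,
        disk_iops_read_write=160000,       # per-disk IOPS ceiling from the segment
        disk_m_bps_read_write=4000,        # per-disk MB/s ceiling from the segment
    ),
)
disk = poller.result()
print(disk.provisioning_state)
```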
[00:53:23] Speaker D: It looked like it was about 20 to 40% in one of them. Microsoft was weird about the way they announced it; it was like a blog post per region. In US Central, I think it was about 20 to 40%. So now you just get to be lopsided when you walk, with that 20% more of a leg you get to keep.

[00:53:42] Speaker B: I mean, if you have to ask how much it costs, you can't afford it, basically.

[00:53:48] Speaker D: You're not a DBA.

[00:53:49] Speaker B: Yeah. Oh, they never ask how much it costs.

[00:53:53] Speaker C: They just spin them up. Well, that is another fantastic week in cloud news, guys. Any final thoughts before we sign off for the night?

[00:54:03] Speaker B: We made it.

[00:54:05] Speaker D: Yay. You know, the shows I always think are going to be long are shorter, and the shows I think are going to be short run long.

[00:54:11] Speaker B: You can't win.

[00:54:13] Speaker C: Yeah, there's no winning in this. You just talk until the show topics end, and then we stop talking. My wife always asks, like, how long are you going to be? I'm like, I don't know, it depends. Is Jonathan there? Is Matt there? Is Ryan there? How interesting are the topics? I mean, if we're talking about security or containers, Ryan can go on for days. If Azure's done something to piss Matt off, that's a good 30 minutes of ranting right there. So it just depends on the week.

[00:54:45] Speaker D: There's a nice graph for Justin. [inaudible]

[00:54:48] Speaker B: Yeah, I know.

[00:54:48] Speaker C: The new executive dashboard. Like, oh my God.

[00:54:51] Speaker D: Yeah.

[00:54:55] Speaker C: Jonathan's teaching us how AI works because we're all idiots, but things like that. It's good. All right, gentlemen, I will see you next week here on the show.

[00:55:04] Speaker B: Bye, everybody.

[00:55:05] Speaker D: Bye, everyone.

[00:55:09] Speaker A: And that's all for this week in cloud. We'd like to thank our sponsor, Archera. Be sure to click the link in our show notes to learn more about their services. While you're at it, head over to our website, where you can subscribe to our newsletter, join our Slack community, send us your feedback, and ask any questions you might have. Thanks for listening, and we'll catch you on the next episode.
