319: AWS Cost MCP: Your Billing Data Now Speaks Human

Episode 319 | September 03, 2025 | 01:36:14

Hosted By

Jonathan Baker, Justin Brodley, Matthew Kohn, Ryan Lucas

Show Notes

Welcome to episode 319 of The Cloud Pod, where the forecast is always cloudy! Justin, Matt, and Ryan are in the studio to bring you all the latest in cloud and AI news. AWS Cost MCP makes exploring your billing data easier, we’ve got a sunnier view for junior devs, a Microsoft open source development, tokens, and it’s even GKE’s birthday – let’s get into it! 

Titles we almost went with this week:

AI Is Going Great – or How ML Makes Money 

00:46 Musk’s xAI sues Apple, OpenAI alleging scheme that harmed X, Grok

01:55 Justin – “There’s always a potential for conflict of interest when you have a partnership like this, but also the app store – there’s a ton of companies that track downloads and track usage of these things, and I don’t know that they have hard evidence here, other than this is just a way to keep Apple distracted while they make Grok better.” 

04:14 AWS CEO says AI replacing junior staff is ‘dumbest idea’ • The Register

05:25 Ryan – “I do really think the industry is using AI wrong, and I think that the layoffs are a sign of that. And it’s really easy to say ‘oh, well our mid to senior developer staff can now do all these junior tasks, so let’s replace them,’ but I don’t think that’s a sustainable model.” 

AWS

11:14 Count Tokens API is now supported for Anthropic’s Claude models in Amazon Bedrock

12:10 Justin – “Now, I appreciate the idea of allowing better budget forecasting, but budget forecasting does not move with the scale of AI, so there is no way that you’re getting an accurate forecast unless you have very specific prompts that you’re going to reuse a LOT of times.”    
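If you want a concrete sense of the preflight pattern this enables, here is a minimal sketch using boto3. The count_tokens operation shipped with this announcement, but treat the exact request and response shape below as an assumption based on the announcement rather than a verified signature, and the model ID and token budget are placeholders:

```python
import json

import boto3

# bedrock-runtime is the real client; the CountTokens operation is new with
# this announcement -- verify the request/response shape against the current
# Bedrock Runtime API reference before relying on it.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Explain my EC2 Other charges."}],
})

# Preflight: count tokens without running (or paying for) inference.
resp = bedrock.count_tokens(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",  # placeholder model ID
    input={"invokeModel": {"body": body}},
)
tokens = resp["inputTokens"]  # response field name assumed from the announcement

# Gate the real call on the count to avoid 429s and context-window rejections.
if tokens < 150_000:
    result = bedrock.invoke_model(
        modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
        body=body,
    )
else:
    print(f"Prompt too large ({tokens} tokens); trim before sending.")
```

The point is simply that the count happens before any billable inference call, so an over-budget prompt can be trimmed instead of rejected after the fact.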

13:39 Announcing the AWS Billing and Cost Management MCP server

14:33 Justin – “All I want to know is, can I ask the MCP to tell me what the hell EC2 Other is?” 
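Because the server speaks standard MCP, wiring it into an existing assistant is mostly configuration. Here is a hedged sketch of what a Claude Desktop entry might look like, written as Python that emits the JSON; the uvx package name is an assumption modeled on other AWS Labs MCP servers, so take the real command from the AWS Labs GitHub repository:

```python
import json

# Hypothetical MCP client entry for the AWS Billing and Cost Management server.
# The package name below is an assumption -- confirm it in the awslabs repo.
config = {
    "mcpServers": {
        "aws-billing": {
            "command": "uvx",
            "args": ["awslabs.billing-cost-management-mcp-server@latest"],
            # Standard AWS credentials, per the announcement; a read-only
            # billing profile is the sensible choice here.
            "env": {"AWS_PROFILE": "billing-readonly", "AWS_REGION": "us-east-1"},
        }
    }
}

# Merge this into your MCP client's config file (for Claude Desktop, the
# claude_desktop_config.json file), then ask away about EC2 Other.
print(json.dumps(config, indent=2))
```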

16:07 Amazon RDS for Db2 now supports read replicas

11:26 Amazon RDS for PostgreSQL now supports delayed read replicas

18:39 Justin – “The chances of me being able to realize that I screwed up that badly within 15 minutes before this replicated is probably pretty slim.” 
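The recovery scenario discussed on the show maps to just a couple of RDS API calls. A rough boto3 sketch of the lifecycle — the instance names are made up, and the replay delay itself is configured with a SQL function on the replica per the RDS for PostgreSQL docs, which isn't reproduced here:

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# 1. Create the read replica as usual. The apply delay is set on the replica
#    afterwards via a SQL function (see the RDS for PostgreSQL docs); it is
#    not a parameter on this API call.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="orders-delayed-replica",   # hypothetical names
    SourceDBInstanceIdentifier="orders-primary",
    DBInstanceClass="db.r6g.large",
)

# 2. Disaster flow: someone drops a table on the primary. Because the replica
#    applies WAL on a delay, you stop replay before the bad change lands,
#    roll forward to just before it, then promote the replica to primary.
rds.promote_read_replica(DBInstanceIdentifier="orders-delayed-replica")
```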

23:07 AWS services scale to new heights for Prime Day 2025: key metrics and milestones | AWS News Blog

28:22 Justin – “What I don’t want our listeners to take away from this is ‘Hey, I should install FIS and use it on Black Friday!’ If you haven’t had a culture of that chaos testing and the resiliency and redundancy built into your engineering culture for more than a year…do not do that.”

GCP

36:25 Choose the right Google AI developer tool for your workflow | Google Cloud Blog

37:40 Ryan – “The Gemini App – a lot of the documentation that is accompanying the app – is very likely to lead you astray, in terms of whether this is something that can handle a production deployment referencing that API endpoint.” 

40:13 Gemini 2.5 Flash Image on Vertex AI | Google Cloud Blog

41:49 Justin – “I had complained about how expensive Veo was; now you can make three videos a day with Veo in Gemini Pro.” 
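For anyone who would rather reach the image model from code than from the Gemini app, a minimal sketch with the google-genai SDK in Vertex AI mode looks roughly like this; treat the project, location, and preview model ID as assumptions and check the current Vertex AI docs for the real identifier:

```python
# pip install google-genai
from google import genai

# Vertex AI mode; project and location are placeholders.
client = genai.Client(vertexai=True, project="my-project", location="us-central1")

response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",  # assumed preview model ID
    contents="A watercolor robot recording a podcast under a rain cloud",
)

# Generated images come back as inline-data parts alongside any text parts.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        with open("robot.png", "wb") as f:
            f.write(part.inline_data.data)
    elif part.text:
        print(part.text)
```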

43:07 Gemini Cloud Assist investigations performs root-cause analysis | Google Cloud Blog

46:23 Automate SQL translation: Databricks to BigQuery with Gemini | Google Cloud Blog

47:13 Justin – “I find it interesting that they call out that their product is not as good as Databricks by saying ‘we’ll help you build all the things that you need for equivalents!’ And like, that’s helpful. Thanks, Google.” 
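One concrete piece of the pipeline is easy to try yourself: the validation layer uses BigQuery’s dry run mode, which parses and plans a Gemini-translated query without executing it (or charging for it), surfacing syntax errors before anything runs. A minimal sketch, with a made-up table name:

```python
# pip install google-cloud-bigquery
from google.api_core.exceptions import BadRequest
from google.cloud import bigquery

client = bigquery.Client()

def validate_translation(sql: str) -> bool:
    """Dry-run a translated query: BigQuery validates it and estimates the
    bytes scanned without executing anything."""
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    try:
        job = client.query(sql, job_config=config)
    except BadRequest as err:  # carries the syntax error detail
        print(f"Translation failed validation: {err}")
        return False
    print(f"OK: would scan {job.total_bytes_processed} bytes")
    return True

# e.g. checking a translated FIRST_VALUE window expression
validate_translation(
    "SELECT FIRST_VALUE(price) OVER (ORDER BY ts) FROM mydataset.trades"
)
```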

48:28 Measuring the environmental impact of AI inference | Google Cloud Blog

50:09 Justin – “I do appreciate that they’re trying something here.” 

52:44 From silos to synergy: New Compliance Manager, now in preview | Google Cloud Blog

54:01 Ryan – “The automated evidence gathering is spectacular on these tools. And it’s really what’s needed – even from a security engineer standpoint – being able to view those frameworks to see the compliance metrics, and how you’re actually performing across those things, and what’s actually impactful is super important too.” 

59:50 GKE Auto-IPAM simplifies IP address management | Google Cloud Blog

1:00:58 Ryan – “I think it was just last week that Google announced that you could add IP space to existing clusters.” 

1:02:47 Firestore with MongoDB compatibility is now GA | Google Cloud Blog

1:04:42 GKE gets new pricing and capabilities on 10th birthday | Google Cloud Blog

Azure

1:09:17 From 20,000 lines of Linux code to global scale: Microsoft’s open-source journey | Microsoft Azure Blog

1:11:15 Matt – “I’m confused by that fourth thing, because they fully backed Redis when they changed the licensing and were the only cloud that did, but we focus on open source first…” 

1:15:02 DocumentDB joins the Linux Foundation – Microsoft Open Source Blog

56:01 Justin – “Now the question is, can I take these DocumentDB extensions and put them on Cloud SQL from Google without having to use Firestore? That’s the real question.” 

1:17:31 Public Preview: Azure Bastion now supports connectivity to private AKS clusters via tunneling

1:18:36 Matt – “Azure Bastion is actually pretty good. We use it at my day job, and it’s really not bad.”

1:23:37 Generally Available: Application Gateway adds MaxSurge support for zero-capacity-impact upgrades

1:24:53 Matt – “It’s amazing this wasn’t there and native, and why is this something you have to think about? It’s supposed to be a managed service. I have to tell it the number of nodes, tell it to do these things…it just feels like a very clunky managed service. And you still have to bring your own certificate.” 

1:26:00 Generally Available: Azure Migrate now supports migration to disks with Zone-Redundant Storage (ZRS) redundancy 

1:28:40 Matt – “This is more for backup. So if you’re running a file server in one region, in one zone, and that zone goes down, your data is still in the other zone – so you spin up a server and attach it.”  

Other Clouds 

1:31:45 DigitalOcean MCP Server is now available | DigitalOcean

Cloud Journey

1:00:42 A guide to platform engineering | Google Cloud Blog

We had homework to watch the full video — we tried, but it was so boring.

The blog post is good. Video is a recording of a conference talk…but man. We promise to find more interesting topics for the next Cloud Journey installment. 

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod


Episode Transcript

[00:00:00] Speaker A: Foreign. [00:00:06] Speaker B: Welcome to the Cloud Pod, where the forecast is always cloudy. We talk weekly about all things AWS, GCP and Azure. [00:00:14] Speaker C: We are your hosts, Justin, Jonathan, Ryan and Matthew. [00:00:18] Speaker A: Episode 319, recorded for August 26, 2025: AWS Cost MCP, Your Billing Data Now Speaks Human. Good evening, Matt and Ryan. How you doing? [00:00:29] Speaker C: Doing good, doing good. [00:00:31] Speaker A: I mean, you look fresh and fresh and awake. We might have woken him up. [00:00:36] Speaker D: It's fine. [00:00:36] Speaker A: We won't give him a nap. Yeah, you're refreshed, ready to go. I didn't, I missed all the prep, so. It's gonna be fun. [00:00:44] Speaker D: Yeah, but how much prep do we actually do here? [00:00:46] Speaker C: It'll probably be better this way. [00:00:48] Speaker A: Yeah, I mean, it could go either way. Yeah, we, we typically, you know, try to watch the news and stuff. Well, let's jump right into it then and see how Ryan does here. Musk's xAI is suing Apple and OpenAI, alleging a scheme that has harmed them. Basically, they're alleging anticompetitive practices in AI chatbot distribution, claiming Apple deprioritizes competing AI apps like Grok in the App Store while favoring ChatGPT through direct integration into iOS devices. The lawsuit highlights tensions in AI platform distribution models, where cloud based AI services depend on mobile app stores for user access, potentially creating gatekeeping concerns for competing generative AI providers. Apple's partnership with OpenAI to integrate ChatGPT into iPhone, iPad and Mac products marks a shift towards native AI integration rather than app based access, which could impact how cloud AI services reach end users. The dispute underscores growing competition in the generative AI market, where multiple players, including xAI's Grok, OpenAI's ChatGPT, DeepSeek and Perplexity, are vying for your eyeballs. Yeah, so in general, I don't know that I. [00:01:54] Speaker C: Agree. [00:01:54] Speaker A: You know, like, did they ask xAI, did they ask Grok? Like, hey, is Apple purposely keeping your numbers down? And of course it probably hallucinated. Yes, it is, of course. And so that seems reasonable to them. But you know, again, it's one of these weird, like, there's always a potential conflict of interest when you have a partnership like that. But also the App Store, there's a ton of companies that track downloads and track usage of these things. I don't know that they have hard evidence here other than this is just a way to, you know, keep Apple distracted while they make Grok better. [00:02:27] Speaker D: I mean, I feel like this is no different than Chrome integrating, or Apple having Google integrated as the native search engine. Like, I feel like it's kind of the same thing. They chose a partner, they integrate with it. That's the default. There's other options out there. You could go find them if you really want Grok. Sure, you could search the App Store for Grok and it'll pull it up. Like, I feel like this is more of a nuisance lawsuit. It's, it's possible I'm wrong. [00:02:55] Speaker C: I'm trying to remember, like, I kind of feel like in the early 2000s, Google sued Apple for something like this, but instead of AI, it was about mobile access and apps versus search. And like, I think it's just new technology and figuring that out. And I don't, you know, I don't think there's much merit to this lawsuit either. Right. 
[00:03:18] Speaker A: It's. [00:03:19] Speaker C: Apple's allowed to partner with OpenAI to build it into their product. Like that's, that's a thing. And then if Grok wants to compete with that, fantastic. But I don't think they're gonna get anywhere with this. [00:03:36] Speaker A: Yeah, I think it was, I think it was Microsoft who sued Apple. [00:03:39] Speaker C: The Microsoft. [00:03:40] Speaker A: Yeah, that was for Bing, because I think, because I think Google and Apple had the partnership, which, that one's a little different. That's actually why Google got into some antitrust problems, now, where they were paying for placement and they could outbid everybody because of their monopoly. And that's a different problem. [00:03:56] Speaker C: Yahoo stayed out of it because we're like, ah, apps are, no one's going to do apps. That's not a thing. We'll just do mobile websites for everything. That wasn't, wasn't a good plan. [00:04:07] Speaker A: That didn't work out. [00:04:08] Speaker D: No, they're still doing it. It's fine. Yeah, they, they found their path and they're going with it no matter what. [00:04:19] Speaker A: On to other CEOs fighting each other. AWS CEO says AI replacing junior staff is the dumbest idea ever. AWS CEO Matt Garman argues that using AI to replace junior developers is counterproductive, since they're the least expensive employees and most engaged with AI tools, warning that eliminating entry level positions creates a pipeline problem for future senior talent. Garman criticizes the common metric of measuring AI value by percentage of code written, noting that more lines of code don't equal better code, and that over 80% of AWS developers already use AI tools for various tasks, including unit tests, documentation and code writing. The CEO emphasizes that future tech workers need to learn critical thinking and problem solving skills rather than narrow technical skills, as rapid technological change means specific skills may not sustain a 30 year career. The direction aligns with AWS's push for their Kiro AI coding assistant, while acknowledging that AI should augment rather than replace human developers, particularly as organizations need experienced developers to properly evaluate and implement AI generated code. Garman's comments come amid industry concerns about AI's impact on employment, and follow recent issues with the Amazon Q developer tool that had security vulnerabilities, highlighting the ongoing need for human oversight in all AI development. So yeah, in general, don't get more expensive than the AI and you'll be safe. [00:05:31] Speaker C: I mean, this, I do really think that the industry is using AI wrong, and I think the layoffs are a sign of that. And I think that it is really easy to say, like, oh well, our mid to senior developer staff can now do all of these junior tasks, so let's replace them. But I agree, I don't think that's a sustainable model, and I think it'll course correct over time to be something like what Garman's saying here. Because I do, I think that it's going to be a tool that we all use, and whether you're junior or senior, it'll be part of your workflow and it will change the way you work. I know it's changed the way I. [00:06:09] Speaker A: Work, but yeah, yeah, I think we're in the silent recession. If I were to name it something, and I'm going to trademark that right now. [00:06:18] Speaker C: Nice. 
[00:06:19] Speaker A: Uh, but I, I think where, you know, companies are adjusting post pandemic, they're right sizing their organizations. A lot of them got overly bloated in management and different areas. And so AI has become an easy excuse right now, versus blaming tariffs and other economic, you know, headwinds that they're facing and macro climate issues, because those are politically sensitive with, you know, the White House and what's going on over there. And so if we can blame something like AI, we can basically reduce our staff with this answer, reduce our salaries and be better aligned to this silent recession we're all living through right now, where, yeah, things are okay, but it's all on a teetering brink of despair, it feels like, at any moment. [00:07:06] Speaker C: Yeah, I think. And we'll see, like, over the course of, you know, the next set of decades, companies who wield AI, you know, more strategically and not just sort of use it as a crutch or an excuse, I think that you'll see them be very productive in only a few years. [00:07:25] Speaker A: Yeah, I think so too. [00:07:27] Speaker D: The piece of this that kind of stuck with me a little bit more was, you know, around, you can't just learn a language. And that's what people have done for so many years, learn a language or two. And AI is kind of saying you don't need to learn the language, you need to learn how to think about it. And that's one of the things that, you know, my school taught us. We learned, I might call it a made up programming language. It was called DrScheme when I learned it, which was a Lisp fork. And it was really just teaching people the concepts of programming, in something you couldn't just Google and whatnot. It was that critical thinking that it really taught you. And I was lucky enough, I went to a high school that had software development. So for me it was a different way to think about it, and it really reinforced that it's not a language. These are general concepts you need to know. And that's kind of what I feel like AI is going to really force everyone into, down that path of: this is the general, what you need. You need recursion, you need loops, you need these general things. Go put these things into place over here. And AI can do the syntax of, you know, is it 'for', is it 'foreach', is there a parenthesis, is it whitespace sensitive? What's the syntax of the language? It can handle all that for you, as long as you kind of more specifically guide it to what you want. [00:08:42] Speaker C: Teaching a made up coding language is probably the best idea I've heard in a long time. [00:08:48] Speaker D: Like, I mean, it's a real language. If you Google it, it would, you know, let me see. [00:08:53] Speaker C: I get it. I mean, no, but it's like, I. [00:08:55] Speaker D: Get it. It's got a whole Wikipedia page about it. [00:08:58] Speaker C: That's cool. [00:08:59] Speaker A: I mean, I learned, I learned Scratch, you know, which is kind of a similar idea. That's how I started, you know, basically, like, here's this cartoon little character guy, you make him do things and use coding concepts to figure it out. And like, you, you outgrow it very quickly, where you're like, okay, now I'm frustrated. But, you know, it's, you know, to get the basic concepts, to be able to put it into something that makes sense, or, or use a language that isn't, you know, isn't as easy maybe, or doesn't have all the documentation that you can go copy paste from Google, like a Lisp variant.
I can only imagine it's probably good, and it's definitely something to think about. [00:09:34] Speaker D: Yeah, it now looks like it's, it's in the Racket family of programming languages. [00:09:39] Speaker A: Interesting. So, yeah, I do vaguely remember a little bit of Racket around the same time I was doing Scratch stuff. So I, I don't remember it exactly, but I do sort of recall some Lispy type stuff that was like, oh, this sucks. [00:09:54] Speaker D: I mean, the biggest thing for me about those was there's no variables. You have to, like, learn to pass stuff across and really flow, you know, through there, as well as, like, loops and recursion are key concepts. Though trying to watch a TA teach recursion when I had learned it in the past was hilarious. And I ended up, like, I remember, like, sitting in my freshman year dorm, like, teaching, like, five people the concepts of recursion because people couldn't figure out the way the TA was teaching it. It was one of those, like, vivid memories that you will never forget in your life for no apparent reason. [00:10:28] Speaker A: I mean, at least, at least you, you know, like, when I was in my first or second job was when object oriented became big, and I had to learn how to rewrite programs in object oriented style. And I was like, I hate everyone. This was a terrible choice. Why did I do that? It took me a while. Then one day you wake up and you're like, oh, I understand object oriented programming and now I can do this. But like, I think everyone kind of had that moment. I, I was sort of jealous of the new college grads who'd learned nothing but object oriented. Because the old way was fine, it worked fine, but it's so much better to think in an object oriented way. But it took me longer to get there. There are some basic concepts of it I remember, but nothing, you know, nothing like what came later. So, all right, moving on to AWS. We've got tokens, and you've got to count those tokens. And so Amazon is giving you the ability to now count your tokens with a Count Tokens API for Claude models, allowing developers to calculate token usage before inference calls, which helps predict costs and avoid hitting rate limits unexpectedly. The API addresses a common pain point where developers would submit prompts that exceeded context windows or triggered throttling, only discovering the issue after the fact and potentially incurring unnecessary costs or the nasty rejection letter. The feature enables more efficient prompt engineering by letting teams test different prompt variations and measure their token consumption without actually running inference, clearly useful for optimizing system prompts and templates. Currently limited to Claude models only, this suggests Amazon is prioritizing Anthropic's integration while potentially planning similar support for other Bedrock models like Titan or third party options. For cost conscious organizations, this preflight check capability allows better budget forecasting and helps implement guardrails before expensive model calls, especially important as enterprises scale their AI workloads. Now, I appreciate the idea of allowing better budget forecasting, but budget forecasting does not move at the scale of AI. So there's no way that you're getting an accurate forecast unless you have very specific prompts that you're going to reuse a lot of times. Yeah, I was. [00:12:25] Speaker D: Gonna say, I feel like this is much more useful for, you know, hitting the 429 errors of rate limit exceeded than anything.
You know, if you're gonna send in data, you know, and making sure you're not gonna use too many tokens. Is it 429 for, for too many tokens? [00:12:41] Speaker C: Yeah, I think that's right. I think that's right. [00:12:42] Speaker D: Yeah, you know, but knowing that in advance versus getting the error and having to handle it. You can do that preflight check and figure it out, then send it back to your user, versus error based programming. We're really talking about programming philosophies today. [00:12:58] Speaker C: Yeah, I mean, I think everyone's trying to figure out how to visualize the resource usage of AI, because we haven't figured it out yet. Like, you know, I still don't understand how this works. I'm getting better at understanding some of these concepts, but it's still vague. And, and you know, like, if someone asked me to calculate, yeah, a model or future usage, like, I, I wouldn't be able to do it. And so, like, I think that this, I think we'll, we'll probably flail around trying to figure this out for a while before we get it. [00:13:29] Speaker A: AWS is releasing an open source Model Context Protocol, or MCP, server for billing and cost management that enables AI assistants like Claude Desktop, VS Code Copilot and Q Developer CLI to analyze AWS spending patterns and identify cost optimization opportunities. The MCP server includes a dedicated SQL based calculation engine that handles large volumes of cost data and performs reproducible calculations for period over period changes and unit cost metrics, going beyond simple API access. The integration allows customers to use their preferred AI system for FinOps tasks, including historical spend analysis, cost anomaly detection, workload cost estimation and AWS service pricing queries, without switching to the AWS console. The server connects securely using standard AWS credentials, with minimal configuration required, and is available now in the AWS Labs GitHub repository as an open source project. By supporting the MCP standard, AWS enables customers to maintain their existing AI toolchain workflows while gaining access to comprehensive billing and cost management capabilities. And all I want to know is, can I ask the MCP to tell me what the hell EC2 Other is?
And it just, it falls apart so quickly once you get into EC2 other. And it has so many different variables and how it builds, that gets complicated. Just proving that DB2 is still the best database Amazon is now providing you read only replicas, up to three of them in fact, per database instance, enabling customers to offload read only workloads from primary database and improve application performance through asynchronous replication. Read replicas can be deployed within the same region or cross region, providing both performance scaling for read heavy applications and Dr. Capabilities through replica promotion to handle read write operations. The Future requires IBM DB2 licenses for all VCPUs on a replica instance, which customers can obtain through AWS Marketplace on demand licensing or bring their own licenses, which you're definitely going to want to do because I'm pretty sure the on demand pricing is going to be high. This edition brings RDS for DB2 to future parity with other RDS engines like MySQL and PostgreSQL that have long supported read replicas, making it more viable for enterprise workloads. Key use cases include analytical workloads that need consistent read performance, geographic deterrent of read traffic, and maintaining standby instances for disaster recovery without the complexity of managing replication manually. So yeah, if you're into the DV2 side because you run a mainframe, you now have an option in AWS. Best database ever. [00:16:54] Speaker D: What year was DB2 released? In fun fact. [00:16:59] Speaker A: I don't know. [00:17:00] Speaker D: Without googling it, Justin, As I said, I don't know. Yeah, 1983. [00:17:07] Speaker A: What? [00:17:08] Speaker C: Wow, that's crazy. [00:17:11] Speaker A: Well, if you want to be more shocked, IMS, which is the pre, you know, IBM's pre was launched in 1966. So if you love your flat file databases like Parquet and Iceberg and all those things, you can go, you can go thank your grandfather IMs for that. Well, if you DB2 is a little too long in the tooth for you and you prefer to have your postgres delayed, Amazon has you with that this week. With Amazon RDS for postgres now supporting delayed read replicas, allowing you to configure a time lag between source and replica databases to protect against accidental databases. Deletions or modification feature enables faster disaster recovery by letting you pause replication before the problematic change propagate, then resume up to a specific log position and promote the replica as primary significantly faster than traditional point in time restores that can take hours. For large databases available in all AWS regions where RDS postgres operates at no additional cost beyond standard RDS pricing, making it an accessible safety net for production databases. Just as a common enterprise need for protection against human error while maintaining the performance maintenance benefits of read replicas for scaling read workloads. I believe that the chances of me being able to realize that I screwed up that badly within 15 minutes before this replicated is probably pretty slim. [00:18:26] Speaker D: It does let you do up to a day. But I really just feel like this is like only really useful maybe for like an upgrade or something like that that you're doing like, okay, we're going to do the upgrade. If it fails, we can leverage this as rollback. [00:18:43] Speaker A: You could do that? [00:18:43] Speaker D: Sure. Like that. 
[00:18:45] Speaker A: I think DR is also an opportunity for you. Again, like, if you, you get a ransomware attack at 3 o'clock and you have this delayed 12 hours, right, you can basically fast forward to 2:59 and have all your data from before the ransomware attack happened. So there's that use case. You've got, you know, I deleted your table, which I've been known to do in production before. It's happened. Don't lie. It's happened to both of you too. So there's those opportunities. Again, normally the delay I've seen in other products is like 15 to 30 minutes. Unless I realize I was on the production database and not the dev database when I dropped that table. I don't always realize that within 15 to 30 minutes, until someone starts saying, hey, nothing's working in prod. And I'm like, ooh, I should double check this database real quick. And that's when you learn that you're on the wrong database. [00:19:35] Speaker D: I just. [00:19:35] Speaker C: I'm confused. But, like, this is a read replica that you could also query, right? Like, that in theory you could shard traffic to, and you could compare. [00:19:45] Speaker A: I suppose then in that scenario you could compare what's in the current database to what's in the database 20 minutes ago, to see if your batch process worked. [00:19:53] Speaker C: Yeah, I mean, it just seems like the likelihood of me serving the wrong data would be high. [00:20:01] Speaker D: Well, I don't. [00:20:01] Speaker A: I assume that in your. You wouldn't put this into. Well, I guess I don't. [00:20:06] Speaker C: What's the difference between this and a backup then? You know, like, we have snapshots and, you know, point in time restores, and I get the performance that they're talking about. But it is. That is. It's just kind of interesting to call it a read replica, because I'm sure it is, but it just also breaks my head. [00:20:23] Speaker D: I mean, it's a read replica. Wouldn't you only have it for this purpose, like, or you're architecting something? [00:20:30] Speaker A: It's a delayed replica. I don't know. [00:20:33] Speaker C: It is weird, right? [00:20:35] Speaker A: I think about that use case, but you're right. That's okay. Well, I assume this will come to MySQL and to the other RDS database families, and eventually DB2 probably as well will also get delayed read replicas, apparently, because now they've built the pattern everyone else can adopt. [00:20:54] Speaker D: Is ransomware ever, like, encrypting tables? You've only ever seen it, like, encrypting full. [00:21:00] Speaker A: If they, if. If they get into your system to the point where they're encrypting files, there's probably a high likelihood they also got access to other passwords or things that got them into databases. And if they can get to it, they will encrypt it. [00:21:12] Speaker C: Yeah, yeah. [00:21:13] Speaker D: I've never seen anybody encrypt at the table or database level. I've only ever seen it at the server. [00:21:17] Speaker A: Or, like, most, most people keep those credentials separate. [00:21:21] Speaker D: Yeah.
[00:21:41] Speaker A: Like, yeah, yeah, they're definitely not using an RDS lay and read replica to help them. Right. And that's a lot of the ransomware compliance is just checkbox anyways. Like, oh, we have a way to do this and in the event we got ransomware, we have a, we have, you know, protected backups and we have all these things. And yeah. You know, the reality is that at a larger enterprise company, so much security has failed to get ransomware at that level that, you know, you have other problems. But government gets hit all the time. So. Yeah, I don't know. [00:22:10] Speaker C: I mean it's, it's a really good thing that you can throw nuclear hammers at for, for being able to restore stuff. It's expensive. You pay that for that premium. But it's also if you're protecting yourself against ransomware, like, you're right, there's several other layers that have failed. But also having just a completely off site copy of this or you know, read once kind of, you know, for write once is a. An easy blanket you can throw over everything to make yourself feel better. [00:22:41] Speaker A: Well, a month ago, Amazon Prime Day happened and I didn't buy anything, at least that I recall. But I did, you know, was anxiously waiting to get this blog post which is the annual how did AWS scale to new heights for Prime Day 2025 with key metrics and milestones. My favorite article of the year, typically, other than PI Day, which is my other favorite article of the year, I. [00:23:06] Speaker D: Thought it was playing the sound when we have to do quarterly. [00:23:11] Speaker A: Well, that's just. That's just me trolling you. Okay. Yeah, that's really not about. It's not my favorite Amazon announcement like that. Just, you know, just, you know, see if I can catch you off guard with the sound. That's also great joy for me or any other kind of brand. Sound effects, which I do occasionally drop in a bunch for the show. Never done in a while. But my favorite still is, is the. The court that I did with Peter though. That was probably my favorite one. [00:23:34] Speaker D: I remember that one. [00:23:36] Speaker A: Yeah, it was cloud court because he had contended something. I brought evidence. I remember that episode now. Yeah, that was good. All right, so basically this article has a couple things that are interesting. First of all, this is the first one. I think I've been run by Jeff Barr because he went and retired from the blog. But they said this year marks a significant transformation in the Prime Day experience through advancements in generative AI offerings from Amazon and AWS customers using Alexa plus the Amazon Next Generation Personal Assistant now available in early access to millions of customers, along with AI powered shopping assistant Rufus and AI shopping guides. And those features are all built on 15 years of cloud innovation and machine learning expertise from AWS combined with deep retail and consumer experience from Amazon, helping customers quickly discover deals and get product information, complementing the fast pre delivery that prime members enjoy year round. So basically they go, they have record breaking sales. It is all part of their earnings, yada yada yada, no one cares about that. But Prime Day, where we care is all the numbers. And so during the weeks leading up to the big shopping event like Prime Day, Amazon fulfillment centers and delivery stations will work to get ready and ensure operations are efficiently and safely. 
And for example, the Amazon Automated Storage and Retrieval System ASRs operates a global fleet of industrial mobile robots that move goods around Amazon fulfillment centers. AWS Outposts, which are a fully managed service extends AWS experience on premise, are running in those facilities as well as and at one of the largest Amazon facilitators sent more than their outpost sent more than 524 million commands to over 7,000 robots, reaching peak volumes of 8 million commands per hour, 160% increase compared to Prime Day 2024. That's a lot of robots. Wow. Getting into easy two things during Prime Day Amazon Graviton, a family of processors designed to deliver the best price performance okay, Amazon, get off the powered more than 40% of the Amazon EC2 compute used by Amazon.com Amazon deployed over 87,000 AWS Inferentia and AWS Trainium chips which are custom silicon for deep learning and generative AI training to power Amazon Rufus for Prime Day. SageMaker AI, a fully managed service where AI provided low cost machine learning, processed more than 626 billion inference requests during Prime Day. ECS and Fargate for containers launched an average of 18.4 million tasks per day on AWS Fargate representing a 77% increase from previous year's Prime Day average charge. [00:26:01] Speaker D: Back to the Graviton. Why they only say use 40%. Like if they're pushing Graviton so much, shouldn't they use 80, 90% more? [00:26:11] Speaker A: Like, I mean, not all workloads are, you know, not all workloads will work on. Work on Graviton. [00:26:15] Speaker D: I know, but you would think by now since for they've been pushing away, we're in Graviton 3 or fours. Like, I mean, I mean, maybe like. [00:26:22] Speaker A: I imagine there's also the challenge of getting enough GPUs. They don't have enough GPUs and all the, all the boxes. I mean, if they didn't have the GPU and the 8 AI need, they probably would have been much higher. I think last year they were higher on Graviton, but I imagine availability of GPUs is the bigger driver. The fault injection Service ran over 6,800 false injection experiments over eight times more than they did in 2024, to test resilience and ensure Amazon.com remained highly available during Prime Day. That's just risk that you're incurring for no good reason other than to say you did this. Mm. AWS Lambda, the serverless compute service that lets you run code without managing architecture, handled over 1.7 trillion invocations. The API gateway processed over 1 trillion internal service requests, a 30% increase in requests on average per day compared to Prime Day 2024. Cloud Front delivered 3 trillion HTTP requests during the global week of Prime Day and 43% increase in requests compared to last year. And EBS storage peaked at 20.3 trillion IO operations, moving up to an exabyte of data daily. That is a ton of data. [00:27:29] Speaker C: Wow. So cool. So cool. Yeah, I mean, I don't know if I agree with you about the fault injection. Like, I get, I get why you're saying like, increasing it 8x times. [00:27:41] Speaker A: I mean, the thing is, it's part of your culture and it's part of how you design your systems. Chaos engineering during big events is exactly what you want to do because you're actually testing at scale and you're doing everything. Just if you're. What I don't want people, our listeners, to take away from, this is like, hey, I should install Fizz and use it on Black Friday. [00:27:59] Speaker C: Yeah, okay. 
[00:28:00] Speaker A: If you haven't had a culture of that chaos testing and the resiliency and the redundancy built into your engineering culture for more than a year, do not do that. That's all. [00:28:10] Speaker D: I read this. If they ran them in advance, they didn't run during the event. [00:28:15] Speaker C: That's also what I read, is that they're up leading up until the. [00:28:18] Speaker D: Yeah, because we read yeah in 2024. Well, that doesn't mean they ran any. They didn't run any during the event itself. That's the least way I read it. [00:28:32] Speaker A: I don't know, it's. Yeah, it's. It's not worded in a way I could tell either way it's. But you're probably right. Maybe not before or during, but before. [00:28:39] Speaker D: I mean, they're ballsy. I doubt they're that ballsy. Why. Why did Amazon.com go down? Well, we decided to run some chaos. Engineering in the middle of. [00:28:48] Speaker A: Decided to break DNS. [00:28:49] Speaker C: Who's so fired? Yeah, someone's so fired. That's not a good point. [00:28:53] Speaker D: I feel like they didn't do it during that day. But maybe I'm wrong. [00:28:56] Speaker C: I mean, maybe it's just built into the platform and, you know, because I know that they're, you know, all the white papers I've read about how they do deployments and stuff like that. It could just be built. That's how they deploy software. That's, you know, maybe that's how they deploy scaling. Could be. [00:29:10] Speaker A: Well, I got more big numbers for you guys if you want to keep going. [00:29:12] Speaker D: Oh yeah, I like to interrupt you. It's more fun. [00:29:15] Speaker A: No, it's fine. It was a good spot. We talked about a lot of cool, big things. Let's get to databases. Ryan's favorite topic. [00:29:22] Speaker C: Yes. [00:29:23] Speaker A: You know, they processed 500 billion transactions and stored 4,071 terabytes of data and transferred 999 terabytes of data to Postgres, MySQL and DC SQL. DynamoDB, their serverless, fully managed, distributed NoSQL database maintained high availability while delivering single digit millisecond responses and peaking at 151 million requests per second. That's a lot of requests. Elasticache peaked at serving over 1.5 quadrillion daily requests and over 1.4 trillion requests in a minute. And Kinesis Data Stream, their data stream service processed 807 million records per second during Prime Day. And SQS had a new peak traffic record of 166 million messages per second. [00:30:07] Speaker C: Holy crap. [00:30:11] Speaker A: So that's databases and eventing, so that was exciting. And then we do have a little bit of security stuff too for you, Ryan. [00:30:17] Speaker C: Yeah. [00:30:20] Speaker D: I mean, I'm actually curious to know what caches they use. I assume they use all of them, but it'd be interesting to see how much they've taken on Valkyrie. [00:30:28] Speaker A: Valkyrie versus. [00:30:29] Speaker D: Yeah, versus, you know, still using any of the other ones. Redis. And the other ones, I assume, like Redis. Anything Redis, they just swapped over. [00:30:36] Speaker A: But yeah, I'm pretty sure they're off of Redis completely. I imagine they might be using some memcache D in front of some databases. [00:30:42] Speaker D: Yeah. [00:30:43] Speaker A: Just because. [00:30:44] Speaker D: More native in some places. [00:30:46] Speaker A: Yeah. But I imagine primarily for the web tier, they're very heavy. VAL keyshop. [00:30:50] Speaker D: Yeah, that would be my assumption too. [00:30:53] Speaker A: Yeah. 
I mean, they built memcache. The only reason why you would have built memcached when they did was because you were using it internally because, you know, it was already kind of on the way. I mean, you can use Redis for that purpose as well, but it's not as good as memcache for that. [00:31:04] Speaker D: Yeah, I was gonna say those services. Elasticache has been around forever, I feel like. Because, yeah, I remember like the original DevOps exam, they were talking about those. And you know, which ones you can scale and restore, which ones you can't. One of those fun, dumb questions that they. Which is why they redid the exams. [00:31:27] Speaker A: On security side, GuardDuty, their intelligent threat detection service, monitored an average of 8.9 trillion log events per hour, which was a 48.9% increase from last year's Prime Day. And CloudTrail, their activity and API usage system, processed over 2.5 trillion events during primary 25 compared to 976 billion events in 2024. That's a lot of data for GuardDuty. [00:31:52] Speaker D: Is it that they've added new features or is it. [00:31:56] Speaker A: I mean, they have definitely added a lot of new things to GuardDuty this year. So I'm sure that a contributing factor. [00:32:01] Speaker C: Load, is also going to drive logging transactions and different things. And so, yeah, pretty cool. I wish they said results of that, like, because it's like, oh, we spent $6 billion on guard duty and it didn't find anything. [00:32:23] Speaker A: How many false positives of guard duty? Fine. During the event. That's what I like to know. [00:32:26] Speaker C: Like that would be, you know, because it's like, I'm not suggesting that you not have it because I, you know, like, you have to have insight and there's. This is the only way to really, you know, get at some of this data in AWS is. Is via GuardDuty. But yeah, I always, I, I always want, you know, as a security engineer, you always have to question the value and so you can't just increase costs exponentially for these things. You have to apply it. [00:32:52] Speaker D: You've not met a lot of security departments then? No, no, no. [00:32:55] Speaker A: I think you're a security person. That sounds like. [00:32:57] Speaker C: This is. So this is the thing. [00:32:59] Speaker A: I'm. [00:32:59] Speaker C: I'm. I'm a pretend security person coming from the infrastructure. We all know this. [00:33:04] Speaker A: Yeah, you're. You're a newly converted person to security. When did, when do you get to the point where you don't care about those things anymore? Does that come later in your transition or sudden? [00:33:13] Speaker C: How far to the dark side I fall? I think. I think there's a chance maybe, maybe I'm the chosen one and I will rewrite how security teams. [00:33:20] Speaker D: It's when we kick him off the podcast. [00:33:24] Speaker A: How many CSOs have you met that are, you know, open to new ideas or change? [00:33:29] Speaker D: Maybe I'll be the first. [00:33:31] Speaker C: No, because then I'd have to be cso. [00:33:37] Speaker A: All right, let's move on to gcp. But congratulations Amazon, on Prime Day Rest. [00:33:42] Speaker D: In that article they talked about AWS Countdown. Do you guys. [00:33:45] Speaker A: Oh yeah, that's their. In their event thing. I had a different name before ime. IEM Infrastructure Event Management. Which they always were always pitching us like, hey, you're going to do a launch of a product, you should do an im. And I'm like, we're, we don't do that much data. 
Like even when we launch a new product, like it's not like everybody gets access to that thing and everyone uses a day one. It takes like months for them to do things that never made sense for us. But yeah, if you have a big Black Friday type event or you're launching some massive new B2C thing, the Countdown service is a way to get kind of like the Bat phone to a bunch of people at Amazon who will help make sure your launch goes smooth. They'll evaluate your things in advance, they'll look at your well architected framework and make sure that's all good and that you know, if something does have a big or wrong during your massive launch or sport event or whatever, they will be there to help you fix it. Which is it's a good service. I think you pay a little bit for it as well now for it used to be free. [00:34:36] Speaker D: It does say Countdown engagements each year at no cost. So it's maybe like once a year. [00:34:41] Speaker A: You get, I mean, I think if you're an enterprise support, you get a couple a year in your enterprise support. I think if you don't have enterprise support, you pay for them. [00:34:48] Speaker D: They say, I mean it seems like an Interesting thing. But also at that point, I mean they say we only recommend you do at least two to three weeks ahead of the critical event. If you have a, an architectural issue at that point, it's probably too late to fix it at two to three weeks beforehand. So feels like something you would. [00:35:07] Speaker C: No, no, no. You're forgetting all solutions are just throw more capacity at it. [00:35:12] Speaker D: Sorry, I forgot the pull the lever off, add more servers and scale up, scale up and out to see what. [00:35:18] Speaker A: Happens on your event when it all, when all eyes are on you. You don't give a crap about the cost. [00:35:24] Speaker D: No. [00:35:24] Speaker A: You want to smooth. [00:35:25] Speaker C: No. And two weeks before, it's the same thing. Same thing you don't want. If you know about it in advance, you want to try to avoid that. [00:35:33] Speaker D: Train wreck, well then you can at least tell your CFO that he's going to hate you in, in four to six weeks and then you get your bill. [00:35:41] Speaker C: Set the expectations really high so it comes in a little lower and maybe you're not fired, you're just in trouble. [00:35:45] Speaker A: Yeah, the event's only going to cost us, you know, three and a half million and then I'm in a million dollars. I'll be very happy with you. All right, Google. So Google has diversified its AI developer tooling into six distinct offerings which we talk about. And so they decided they should write you a blog post to try to figure out which one to use, which I find hilarious. Jules for GitHub automation, Gemini CLI for flexible code interactions, Gemini Code Assist for IDE integration, Firebase Studio for browser based development and Google AI Studio for prompt experimentation and the Gemini app for prototyping. Tools are categorized by interaction model, either delegated, gentic, supervised and collaborative, each targeting different develop workflows and skill levels. Juul stands out as a GitHub agent that can autonomously handle tasks like documentation, test coverage and code modernization through pull requests. 
With a free tier and paid pro and ultra options, Firebase Studio enables non professional developers to build production grade applications in a Google managed browser environment with built in templates and geni power code generation during its free preview period. And those apps will not scale, so be prepared for that. Most tools offer generous free tiers with Gemini model access, paid options providing higher rate limits and enterprise features through vertex AI. But if you are confused about all of these different tools, and even here at the Cloud pod, I've been confused a few times like hey, they already have that. And this is a nice little way to kind of see how Google is thinking about the segmentation of these products and tools and worth the cheat sheet for you. [00:37:07] Speaker C: Well, and the Gemini app, like a lot of the documentation that is accompanying the app, is very likely to lead you astray in terms of whether this is something that can handle a production deployment. Referencing that API endpoint, because it says things like take an idea from your brain into production in minutes and it's not built for that. And so when you start using a scale, Google actually gets mad at you. They're like, wait, what are you doing? So, and I've had to talk people out of using it because it isn't really the same thing as the managed service like vertex AI where they've built, you know, capacity and scaling into it and charging more. But yeah, so I think it's good that they're, they're calling this out. Like, I think it is confusing for some people. I think people don't understand what codes is versus joules is, for instance. Like, I, I think it is difficult for people to sort of understand there. [00:38:08] Speaker A: Yeah, well, and I, I think getting into more, more and more agentic use cases like Jules is cool for those type of use cases because you can give it a very. But you're giving a very specific set of tasks that you wanted to do inside of a pull request versus where, you know, doing some of the word of the IDE like Gemini CLI or ID integration tools. Like that's where we're interactive, like we're doing it together. And so it's good to have that context. And I think it is helpful. But it's also like, you know, can't you just make your tool have different modes and we don't have to have 12 different things installed but, you know, it's probably too early for that era. [00:38:45] Speaker C: Well, and they are, you know, they are. A lot of these things do have different modes, like, you know, just the moving from like the ask model for generative AI into the agent model and then now you can have sub agents and. But I think a lot of these distinctions are access and where they run. And you know, I think that that's where the separation really is. So like Jules for GitHub for instance, you know, versus, you know, Gemini CLI, they're going to have access to very different things. Right. One is directly interacting with the GitHub API and managing the API actions within your GitHub ecosystem. The other one's on your laptop, you know, so it's. Do you want to call them all the same thing? I don't know. More stickers? I like more stickers. [00:39:29] Speaker A: It's like stickers. [00:39:34] Speaker B: There are a lot of cloud cost management tools out there, but only Archera provides cloud commitment insurance. It sounds fancy, but it's really simple. Archera gives you the cost savings of a one or three year AWS savings plan with a commitment to shortest 30 days. 
If you don't use all the cloud resources you've committed to, they will literally put the money back in your bank account to cover the difference. Other cost management tools may say they offer commitment insurance, but remember to ask will you actually give me my money back? Our chair will click the link in the Show Notes to check them out on the AWS Marketplace. [00:40:12] Speaker A: Google launched his Gemini 2.5 flash image on Vertex AI in preview, adding native image generation and editing capabilities with state of the art performance for both functions. Feature built in Synth ID watermarking for responsible use Model uses three capabilities, Multi image Fusion to combine multiple reference images into one unified visual character and style consistency across generations without fine tuning and conversational editing using natural language instruction. So when it misspells your words that you asked it to put into the image, you can now tell it what to fix and it'll actually do it. Early adopters include Adobe integrating into Firefly and Express, WPP testing it for retail and CPG applications, and Figma adding it to their AI image tools, indicating broad enterprise interests across creative workflows. The conversational editing feature allows iterative refinement through simple text prompts, maintaining a direct consistency while enabling significant adjustments, a capability that Leonardo AI's CEO described as enabling entirely new creative workflows. Available now in Preview on Vertex AI with documentation for developers. This positions Google to compete directly with other cloud providers and the generation services while leveraging their existing Vertex AI AI imager. Really there's only one other one which is, you know, chat GPT and then there's all the specialized, you know, models that just do images. But I don't know how competitive is other cloud providers because Titan doesn't do images. Claude doesn't really do images either. It'll describe beautifully an image that you can go give to, but yeah, to actually have a generated image, it's something Claude does not do. [00:41:37] Speaker C: Yeah, now you too can make your own Michael Jackson video where you're morphing every type of person into the other type of person. [00:41:47] Speaker A: I don't know if you guys remember I complained about how expensive VEO was. Yeah. And so now you can make three videos a day with VEO in Gemini Pro. Oh. So I like, you know, they're not super long videos, but you can now do three videos. So if you want to play with that, you can now do it in Gemini Pro. [00:42:09] Speaker C: That's, that's pretty rad actually because that's I, I've stayed away from it. Cause I'm like oh no, I will get myself into trouble. [00:42:18] Speaker D: Yeah. Accidentally run something and go back to the AWS conversation of you get the bill at the end. [00:42:24] Speaker C: Well and then once I get started, right And I start nitpicking at it like I can't stop. Like I want it to be. It's so close. Which is one more thing. And then, and then I have to like, you know, sell a kidney. [00:42:34] Speaker D: You know, I need one rope. You're fine. [00:42:36] Speaker A: You're too much of a perfectionist for that kind of three limit. Yeah, it's like, yeah. But again if you use other things to help create the instructions for the prompt and then give it the prompt. [00:42:47] Speaker C: You have a much better time work out. [00:42:50] Speaker D: So AI all the way down. [00:42:54] Speaker A: All right. 
Gemini Cloud Assist is now your new master investigator. It's a new AI powered root cause analysis tool that automatically analyzes logs, configurations, metrics and error patterns across GCP environments to diagnose infrastructure and application issues, reducing troubleshooting time from hours to minutes, according to early users. The service provides multiple access points, including API integration for Slack and incident manager tools, direct triggering from Log Explorer or monitoring alerts, and seamless handoff to Google Cloud Support with full investigation context preserved. Unlike traditional monitoring tools, this leverages Google's internal SRE runbooks and support knowledge bases, combined with Gemini AI, to generate ranked observations, probable root causes and specific remediation steps, rather than just surfacing raw data. A key differentiator is comprehensive signal analysis across cloud logs, asset inventory, App Hub and log themes in parallel, automatically building resource topology and correlating changes to surface issues that would be difficult to spot manually in a distributed system. Currently in preview with no pricing announced, this positions Google against rivals like DevOps Guru and Azure Monitor, which have similar AI driven troubleshooting capabilities. [00:43:57] Speaker C: I mean, this is fantastic. I, I love these, these use cases that are coming out. Like, this is the first thing I thought of when Google announced sort of the, the integrated console access for, for AI. I was like, this is, that's what I want. I want a little pop up that says one of your projects is on fire over here, and being able to quickly get to that. Because that's, you know, it's the, it's the dream, right? When you're, you're setting up your, your monitoring and alerting dashboards, you want, you want to very quickly be able to pinpoint where the issue is and what's going on. So it's. I don't know if we ever really got to that point with metrics and logs and dashboards. So this is, you can, you know, I think that hopefully this, this kind of finishes that off. [00:44:46] Speaker D: Yeah, I mean, it was always, the human had to look at your dashboard to say, okay, here's the problem, and we see this red line spiking over here, that's bad. Versus, you know, in theory, as these things evolve over time, hopefully they'll be able to say, hey, your latency on your database went from, you know, 3 milliseconds to 50 milliseconds, and CPU is at 98%. Maybe go look over here, there's a problem somewhere here. Now, automatically fixing it and root causing it, you know, I feel like then you're getting, it's like, hey, your application's causing the issue, you know, like, getting to that part. But, like, fixing the problem quickly to, like, get your infrastructure and get your product back up, that's the, that's the dream, at least for me, to get that piece down. As long as you can define rules for the. [00:45:29] Speaker C: Next part. Like, I, I'm okay if AI is going to roll back a version because it's detected higher error rates or something like that. But yeah, if it, if it just continues to throw resources at things, I don't want it. [00:45:42] Speaker D: Well, that's what I think somebody was saying that the Microsoft one did at the beginning. Hey, your database CPU is high, let's just scale up. The automatic SRE one. I was like, ooh, that's not good. You come in and you're at, like, an 80 core CPU and you're normally at 2. That's when your CFO really hates you.
[00:46:00] Speaker A: True, but all your customers are loving you, like, it's never been faster! The performance! All I needed was more memory! It's amazing.

Google is introducing automated SQL translation from Databricks Spark SQL to BigQuery using Gemini AI, addressing the growing need for cross-platform data migration as businesses diversify their cloud ecosystems. The solution combines Gemini with Vertex AI's RAG Engine to handle complex syntax differences, function mappings and geospatial operations like H3 functions. The architecture leverages Google Cloud Storage for source files, a curated function-mapping guide, and few-shot examples to ground Gemini's responses, resulting in more accurate translations. The system includes a validation layer using BigQuery's dry run mode to catch syntax errors before execution. The key technical challenges include handling differences in window functions like FIRST_VALUE, syntax variations, data type mappings, and Databricks-specific functions that need BigQuery equivalents. The RAG-enhanced approach significantly improves translation accuracy compared to using Gemini alone.

I mean, I find it interesting that they call out that their product is not as good as Databricks by saying, you know, we'll help you build all the things that you need for equivalence. Like, that's helpful. Thanks, Google. Appreciate that.

[00:47:11] Speaker D: Well, that's the you-build-it-or-you-buy-it method, I feel like. So here they're saying, we'll build it, and we can give you AI to build it for you, versus buying it.

[00:47:20] Speaker A: It shows you that Databricks isn't that differentiated, if they say their AI can basically fill in the blanks for what you're buying. Yeah, I mean, it's probably partially for people thinking about moving off of Databricks too.

[00:47:32] Speaker C: Yeah, I mean, I think it's the same sort of argument we had when AI was redoing, you know, stored procedures, or Databricks adding stored procedures, that kind of thing. I guess meeting customers where they are is a good thing, and I think solving data stickiness is a good thing, because I would like these things to be more competitive. I have definitely looked at this and thought, I would love to get this data somewhere else, because of cost, or performance, or operability, but getting that data somewhere else is just a huge undertaking.
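That dry-run validation layer is the easy part to picture. A minimal sketch using the google-cloud-bigquery client; the translated query and table name are hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Dry run validates the SQL and estimates cost without executing anything.
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)

translated_sql = """
SELECT sku,
       FIRST_VALUE(price) OVER (PARTITION BY sku ORDER BY ts) AS first_price
FROM `my-project.sales.orders`
"""

try:
    job = client.query(translated_sql, job_config=job_config)
    print(f"Valid; would process {job.total_bytes_processed} bytes.")
except Exception as err:  # syntax and reference errors surface here
    print(f"Translation failed validation: {err}")
```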
[00:48:10] Speaker A: Google is releasing a technical paper detailing their methodology for measuring AI inference environmental impacts, revealing that the median Gemini Apps text prompt uses only 0.24 watt-hours of energy, 0.03 grams of CO2e emissions, and 0.26 milliliters of water: substantially lower than many public estimates, and equivalent to watching TV for less than nine seconds. The comprehensive measurement approach accounts for full-system dynamic power, idle machines, CPU and RAM usage, data center overhead (PUE), and water consumption, factors often overlooked in industry calculations that only consider active GPU/TPU consumption, making this one of the most complete assessments of AI's operational footprint.

Google achieved a 33x reduction in energy consumption and a 44x reduction in carbon footprint for Gemini text prompts over 12 months through full-stack optimizations, including mixture-of-experts architectures, quantization techniques, speculative decoding, and their custom Ironwood TPUs, which are 30x more energy efficient than their first-generation TPUs. The methodology provides a framework for consistent, industry-wide measurement of AI resource consumption, addressing growing concerns about AI environmental impact as inference workloads scale, particularly important as enterprises increasingly deploy generative AI applications. Google's data centers operate at an average PUE of 1.09, and the company is pursuing 24/7 carbon-free energy while targeting 120% freshwater replenishment, demonstrating how infrastructure efficiency directly impacts AI workload sustainability. I mean, I appreciate that they gave us a methodology, and I assume that Amazon and Azure will follow suit with their own methodologies that contradict each other, and then they'll all argue as a committee and actually come out with a standard in about four to seven...

[00:49:47] Speaker D: Years. When there's a whole organization that comes out to do it for them.

[00:49:51] Speaker A: Exactly. But I do appreciate that they are trying something here, which is good. And, reading into more of the detail of the white paper, you can definitely tell they put a lot of thought into how to think about this issue. A lot of the things I was wondering about, like, oh, what about all the idle capacity, they covered, which is great. So it's not just consumed capacity, it's also what they have sitting around waiting for consumption. It's well thought out, at least from what I read of the technical paper. I didn't read the whole thing, but I got through about half of it before I was like, I can't read this anymore.

[00:50:26] Speaker C: That's about right.

[00:50:27] Speaker A: Yeah, these are hard reads.

[00:50:29] Speaker C: But I mean, that is really cool, because I do like this sort of analytical way of approaching the impact question. I feel like we've been in this doom-and-gloom, dire view that AI is just going to ruin everything and burn the earth for our children. Even when I was in San Francisco at a museum, they had an AI exhibit, and one of the exhibits showed the water usage for every query. It was this massive amount of water, and I'm just like, but it's reusing it, just like the AI in the exhibit. It's actually not wasting it.

[00:51:06] Speaker A: Well, last week we talked about the UK asking you to delete your email to save water and energy. And it was sort of like, yeah, I don't think you understand how this works.

[00:51:17] Speaker C: Right, yeah. And so this is the opposite of that. Instead of leaping your way to a conclusion that isn't actually based on facts, these are the facts. You can argue about how they got there, but at least they're telling you. I really do look forward to some environmentalist group, or someone with a very skeptical outlook, going through this methodology in a review process, because I would like to hear that feedback too, just so you can hear both sides. But I do think that this is well thought out, and they're showing the math.
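And the math does check out on the back of an envelope. A quick sanity check of the nine-seconds-of-TV claim, assuming a roughly 100-watt television, which is presumably the comparison point the paper uses:

```python
energy_wh = 0.24  # median Gemini Apps text prompt, per the paper
tv_watts = 100    # assumed TV power draw for the comparison

seconds_of_tv = energy_wh / tv_watts * 3600
print(f"{seconds_of_tv:.1f} seconds of TV")  # ~8.6 s, i.e. "less than nine seconds"
```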
[00:51:55] Speaker A: Yeah, I definitely would like to see someone come back and point at it. Hopefully someone does. I'll keep an eye out for someone coming out and saying this is bunk, or here's why I don't agree, and see what they have to say. But definitely intriguing, for sure.

[00:52:08] Speaker D: It's a starting point more than anything, I feel like.

[00:52:10] Speaker A: Yeah, it's the beginning of a conversation, that's how I would describe it. I think our understanding of these things will change over time as we learn more. All right. Google Cloud Compliance Manager is entering preview as an integrated Security Command Center feature that unifies security and compliance management across infrastructure, workloads and data. It addresses the growing challenge of managing multiple regulatory frameworks by providing a single platform for configuring, monitoring and auditing compliance requirements. The platform introduces two core constructs: Frameworks, which are collections of technical controls mapped to regulations like CIS, SOC 2, ISO 27001, et cetera; and Cloud Controls, platform-agnostic building blocks for preventative, detective and audit modes. Organizations can use pre-built frameworks or create custom ones with AI-powered control authoring to accelerate deployment. This positions Google Cloud competitively against Amazon Security Hub and Azure's Policy and Compliance Manager offerings by offering bidirectional translation between regulatory controls and technical configurations. The key differentiator is the automated evidence generation for audits, validated through Google's FedRAMP 20x partnership, which significantly reduces manual compliance work for regulated industries like healthcare, finance and government. The platform supports deployment at organization, folder and project levels for granular control. It's available now in preview through the Google Cloud console under Security Compliance navigation; while pricing details aren't provided yet, interested organizations can contact their Google Cloud account team or email the Compliance Manager preview team at Google for access and feedback opportunities.

[00:53:38] Speaker C: I mean, the automated evidence generation, or gathering, is spectacular in these tools, and it's really what's needed. Even from a security engineer standpoint, being able to view those frameworks and see the compliance metrics, how you're performing across those things, and what's actually impactful to them, is super important too. I've been using tools like this for the last few months, and they're really helpful for setting up a new heavily regulated environment, or for preparing for an external audit. It's just so much easier when you have the evidence gathered already and labeled as evidence, and it's already kind of creating your audit narrative for an outside party. So it's going to be viewed a little less skeptically than, you know, me in the interview going, no, it's fine, everything's fine, don't look over there.

[00:54:39] Speaker D: I mean, it's taking on tools like Drata and Vanta, and there are a bunch of other ones out there; it's taking on all those tools, because they're those third parties. You can tell I've dug into this a little bit recently.
They're building their own controls that map to these frameworks, and from there they're making API calls, querying it, and it's just native in the platform. So you hope that, in this case, Google knows how to make their own environment CIS, SOC, ISO and FedRAMP compliant better than any of the third parties you'd otherwise hand over the data those tools generate. So it's a good starting point, especially if you're in one cloud. And if they grow and take on the full SOC and ISO lifecycle, that would be nice too. But at least for your cloud provider, it's: here is all the stuff, it is automated, it is automatically generated, I had nothing to do with it, and it runs daily. This isn't me telling you at a point in time; this is literally proof, right here, that it does it every day. It's the middle finger to the auditor saying: here you go, get out of my room, I don't want to talk to you anymore. This is the way it was done, and here's the proof for it.

[00:55:56] Speaker C: And I don't know if Compliance Manager supports this directly, but a lot of Security Command Center Enterprise is not limited to just Google Cloud, right? You can configure the tool's access into your Azure and AWS workloads, and it will measure and monitor the security of all those environments. So I'm hoping this is something you could run across all those clouds collectively, so you could provide a comprehensive view of your security layout.

[00:56:30] Speaker D: I mean, the ones that I've looked at, when we were playing this game at work, it isn't just your cloud provider, it's also your identity management, so if you're using Okta or any of those as your primary identity management. But they also start to integrate into all your source code, into all your third-party tools, your Teams (not Teams, Zoom, any of those). So you can really get that full life cycle of all the tools in your environment, if you're using ones they have integrations with. Also, it's a great way to save money, because you find all these random small things out there: oh yeah, we didn't disable it in this one tool that's not SSO'd. We found that, you know, Matthew Kohn still has access to this tool even though he left last week. So it ends up being a little bit of a cost-savings game in the first year, at least.

[00:57:21] Speaker A: Oh yeah.

[00:57:21] Speaker C: There's no better cost-savings exercise than a security audit that goes through and asks: what is this thing that has an IP, or did, for five minutes, two weeks ago? What is that? Is it still needed? Because it's way better to just remove that than it is to secure whatever vulnerability got triggered.

[00:57:43] Speaker A: Just... boop.

[00:57:43] Speaker C: No, it's gone.

[00:57:45] Speaker A: I don't know what you're talking about. It ran for that one minute and then went away.

[00:57:53] Speaker D: Ryan likes the scream test method. Just turn stuff off and see who yells.

[00:57:57] Speaker C: I absolutely do. I'm a big fan of that.

[00:58:00] Speaker A: I just did that with service accounts recently. So far, knock on wood...

[00:58:05] Speaker C: No screaming.

[00:58:06] Speaker D: At some point, Justin's going to be like, hold on guys, you've got to finish this off, I've got a phone call coming in.
[00:58:13] Speaker A: I mean, it was one of those things: Windows service accounts have interactive login or non-interactive login, but no one who doesn't pay attention to these things knows what that means. They just check the box, and then that becomes a problem later. Using logic, you can determine if it's going to be a problem or not, which is what I did. I was like, based on what our app is, I don't think that's going to be a problem. And guess what? It wasn't. But people didn't know, and they just checked the box because it was the default, or they didn't know and said, well, I don't want to fight this incident, so I'll just check the box, not understanding what it actually means. It happens all the time.

[00:58:49] Speaker D: Yeah, it's amazing how often people don't understand service accounts and interactive logins, and, to make Ryan happy, how much more secure the world becomes when you don't have interactive logins. You dramatically fix a lot of stuff.

[00:59:04] Speaker A: All right, moving on to our next story here. GKE Auto IPAM dynamically allocates and deallocates IP address ranges for nodes and pods as clusters scale, eliminating the need for large upfront IP reservations and manual intervention during scaling operations. This addresses a critical pain point in Kubernetes networking, where poor IP management leads to IP-space-exhaustion errors that halt cluster scaling and deployments, particularly problematic given IPv4 address scarcity. The feature works with both new and existing clusters running GKE 1.33. Unlike traditional static IP allocation approaches used by other cloud providers, GKE Auto IPAM proactively manages addresses on demand, reducing administrative overhead while optimizing IPv4 utilization. Key beneficiaries include organizations running resource-intensive workloads requiring rapid scaling, as the feature ensures sufficient IP capacity is dynamically available without manual planning. Now, I was sort of confused about why all the cloud providers were doing this, and then I connected the dots: oh, it became part of Kubernetes 1.33, and that's why they're all doing IP management; as it was added to the core platform, they all added it to their managed services. Am I getting that right? Because this is literally the third cloud provider to announce something about IP address usage.

[01:00:12] Speaker C: I think it was Google last week that announced you could add IP space to existing clusters. That was just last week, and that's probably related to this as well, I'm guessing.

[01:00:24] Speaker A: Yep.

[01:00:25] Speaker C: I'm a little confused about what automatic IPAM does, right? Because usually it's used as a tool to sort of catalog and look at capacity, sure. But what's it going to do? Provision a new subnet? Take an unallocated subnet from someone else?

[01:00:43] Speaker A: I think the use of the word IPAM is a stretch.

[01:00:48] Speaker C: Okay.

[01:00:49] Speaker A: I think it's marketing. They were like, what's the thing that does IP management? Oh, an IPAM. Perfect, let's use that in the press release. I don't think it has any actual IPAM capabilities.

[01:00:59] Speaker C: Okay. I still don't know what it does, then.

[01:01:02] Speaker A: It's a spreadsheet to know where your IPs are being used, is basically what I would assume.
And the fact that there are no screenshots of this thing in the press release is annoying to me.

[01:01:12] Speaker C: Yeah.

[01:01:13] Speaker A: But I don't think it's anything super fancy like a real IPAM.

[01:01:17] Speaker C: I'm sure it's just a dashboard that shows you, you know, percentage full.

[01:01:20] Speaker A: It's going to be like: these are your IPs that are used for these pods, and this pod is using more IPs than that pod. And that's helpful information, I'm sure. It's not bad; it's just nothing to get excited about like BlueCat or any of the other types of solutions. "Scale with confidence," I'm just saying, with "automated allocation and configuration" freeing up valuable time for your team. I mean, it has some basic automation to it, so it's partial automation, but very contained to your GKE. Because, to be honest, IPAM shouldn't be in GKE; it should be outside of it, in the VPC service, and then plugged into GKE.

[01:01:55] Speaker C: It should be a source of truth they can use. Sure.

[01:01:59] Speaker A: Well, Google saw that MongoDB compatibility thing that Amazon did and said, we want that too. And so they're giving you Firestore with MongoDB compatibility, now in general availability. It supports MongoDB-compatible APIs, allowing developers to use existing MongoDB code, drivers and tools with Firestore's serverless infrastructure, which offers up to five nines of availability in a multi-region configuration with strong consistency. The service includes over 200 MongoDB query language capabilities, unique indexes, and new aggregation stages like $lookup for joining data across collections, addressing enterprise needs for complex queries and data relationships. Enterprise features include point-in-time recovery with a seven-day rollback capability (is that delayed? can I get that delayed?), database cloning for staging environments, managed export and import to Cloud Storage, and change data capture triggers for replicating data to services like BigQuery. It's available through both the Firebase and Google Cloud consoles as part of Firestore Enterprise Edition, with pay-as-you-go pricing and a free tier, targeting industries like financial services, healthcare and retail that are seeking MongoDB compatibility without the operational overhead. This positions Google against DocumentDB and Cosmos DB, which both also offer MongoDB APIs...

[01:03:09] Speaker D: But...

[01:03:09] Speaker A: ...only Amazon and Google offer it "with MongoDB compatibility" in the name, which is important.

[01:03:16] Speaker C: Yeah.

[01:03:17] Speaker A: Azure gets it too, but they don't advertise it that way. So clearly they don't do it, because...

[01:03:22] Speaker C: They don't understand how fun it is to say.

[01:03:23] Speaker D: They're still working on non-relational databases. Be nice.

[01:03:29] Speaker A: I mean, Cosmos seems pretty good, at least from what I've seen of it.

[01:03:31] Speaker D: No, it is. It's pretty good. I mean, I haven't stress-tested it, but in the use cases I've built as a platform engineering tool, it's great, and the auto scaling works as you need it to. I've never dealt with it at massive scale, though.

[01:03:49] Speaker A: Well, GKE is celebrating its 10th birthday. It's about to become a teenager. Super annoying.
But it's moving to a single paid tier in September 2025 that includes multi-cluster management features; Fleets, Teams, config management and Policy Controller are all at no additional cost, with optional a la carte features available as needed. Autopilot mode, which provides fully managed Kubernetes nodes without requiring deep expertise, will soon be available for all clusters, including existing GKE Standard clusters, on a per-workload basis with the ability to toggle it on and off. Thank goodness.

[01:04:20] Speaker C: That's cool.

[01:04:20] Speaker A: GKE now supports larger clusters to handle AI workloads at scale, with customers like Anthropic, Moloco and Signify using the platform for training and serving AI models on TPUs and running global services. And the new container-optimized compute platform in Autopilot delivers improved efficiency and performance, allowing workloads to serve more traffic with the same capacity, or maintain existing traffic with fewer resources. After 10 years since launch, and 11 years since Kubernetes was open sourced from Google's Borg system, GKE continues to incorporate learnings from running Google's own services, like Vertex AI, into the managed platform. I mean, it doesn't run the rest of Google's world, but sure, okay, it runs Vertex AI.

[01:04:54] Speaker C: Thanks.

[01:04:55] Speaker A: So cool. Happy birthday.

[01:04:57] Speaker C: Does this mean they're finally doing away with the GKE Enterprise and Anthos confusion, so now it's just GKE and then I buy other stuff? Is that what I think it is?

[01:05:07] Speaker A: They're basically taking the parts of GKE Enterprise that everyone was mad were paywalled and moving them over to this, and then they'll offer add-ons for the other parts that didn't make sense for the package. But they didn't specify which ones would be add-ons or how much those would cost, which is kind of the annoying part of this article. It seems to be a situation where they're recognizing the market confusion, though why they waited for the 10-year anniversary is kind of silly.

[01:05:34] Speaker C: But here we are. I mean, there's a lot that I don't see included in here, so I'm glad, but I'm also confused, because some of this stuff is available in the non-Enterprise version of GKE. So I don't know...

[01:05:51] Speaker A: They didn't do a press release on it before, Ryan, so that's the trick.

[01:05:54] Speaker C: Of course. Yeah.

[01:05:56] Speaker A: And there were some things, like the Policy Controller, that I vaguely remember being available for all of GKE. So yeah, it will be interesting to see what they move between the two tiers. But I think it's the right call. If I were running GKE without Autopilot at this point... that'd be silly.

[01:06:18] Speaker C: Oh, I don't know. With a lot of Autopilot, you have no visibility into the nodes, you only have the workloads, and I think a lot of people are not running on Autopilot. Hey, it's an increased cost, and people like to control their hardware and their node pools and their network access.

[01:06:41] Speaker A: You do like to control your network access. And I do like my network access.

[01:06:43] Speaker C: You can't do the same thing in GCP like you can in AWS and just deploy a virtual NIC for them all to use. So that's the big limitation that I see.
But I imagine that's probably what they're doing, which is that the containers themselves won't have access to your network, because you have this new ephemeral network you can just attach to a cluster, probably an isolated one.

[01:07:09] Speaker A: Oh, nice. They haven't updated the edition comparison page between GKE editions, but I'm just glancing here at the list of Enterprise features: lifecycle management, including upgrades and backups, that's Autopilot stuff. Fully supported Kubernetes distributions, cloud cluster auto scaling: that's in both. That's in both. Okay, so am I confusing Autopilot for...

[01:07:32] Speaker C: I thought Autopilot was more of their... I forget what it's called in AWS.

[01:07:37] Speaker A: I mean, Autopilot is just a node type that you set; they basically manage the node for you, and then it plugs into either Enterprise or Standard. I believe it does have a higher cost. So in the Enterprise edition, some of the things that aren't being moved over: looks like service mesh...

[01:07:51] Speaker C: Yeah, that was the first thing I was looking for.

[01:07:55] Speaker A: Multi-cluster ingress, Binary Authorization, advanced vulnerability insights, the Connect gateway, best-practice object metrics, and long-term support for older versions of Kubernetes clusters isn't in there.

[01:08:08] Speaker C: I didn't think you could do multi-cluster ingress. That's awesome. I'd pay for that.

[01:08:13] Speaker A: Yeah, that's part of Enterprise. All right, well, I'll keep an eye on that, because I hope they clarify it a little more than what they've done in this blog post. It's not very good. Moving on to Microsoft Azure this week. We did not get an Azure Weekly this week to lean on, so we're dealing with Matt, who's on vacation, trying to find us Azure stories.

[01:08:36] Speaker C: He did a really good job, by the way. I was impressed.

[01:08:40] Speaker A: There was no shade in that comment, but I was impressed. Coming off a vacation, he basically popped out four articles that I didn't find.

[01:08:51] Speaker D: It's because I have to find these things every day to see what's going on in my world, the same way you guys feel about Google. We're always Google-heavy here because that's the world you guys live in every day, but it's three versus one. So, you know.

[01:09:03] Speaker A: Yeah, I mean, we're still pretty Amazon-heavy too.

[01:09:05] Speaker C: That's true, because we all share the Amazon background, right?

[01:09:09] Speaker D: Because we all like Amazon better than...

[01:09:13] Speaker A: Amazon has the best web presence of all three of them. Azure, all of their news is fractured across, like, a thousand different microsites that you have to track. And then Google doesn't really do mini announcements in a way that makes sense for us. They have a feed, but it's like, "we updated these four features," and you're like, what does that bullet mean? I don't know; there's no link. To figure out what it is, you have to ask somebody. But I think Google does better blog posts most of the time, and then Azure's kind of all over the place anyways.
First up, Microsoft has evolved from being the closed-source punks of the Steve Ballmer and Bill Gates eras, to contributing over 20,000 lines of Linux code in 2009, to becoming the largest public cloud contributor to the CNCF over the last three years, with 66% of Azure customer cores now running Linux workloads. Azure Kubernetes Service powers some of the world's largest deployments, including Microsoft 365's Cosmic platform running millions of cores, and OpenAI's ChatGPT serving 700 million weekly users with just 12 engineers managing the infrastructure. That's because they have ChatGPT to do it. Microsoft has open sourced multiple enterprise-grade tools, including Dapr for distributed applications, KAITO for AI workload automation on Kubernetes, and Phi-3 Mini, a 3.8-billion-parameter AI model optimized for edge computing. The company's open source strategy focuses on upstream-first contributions, then downstream product integration, contrasting with AWS and GCP's tendency to fork projects or build proprietary alternatives. Azure's managed services like AKS and PostgreSQL abstract operational complexity while maintaining open source flexibility, enabling rapid scaling without large operations teams, as demonstrated by ChatGPT handling over 1 billion queries daily.

[01:10:55] Speaker D: I'm confused by that fourth thing, because they fully backed Redis when it changed its licensing, the only cloud that did, but "we are focused on open source first"? You've backed the one that fully moved over to a different license. I guess it's technically an open license, but not for commercial use.

[01:11:19] Speaker C: Oh, they're just ignoring that one dark corner of their history, I think. And, you know, Microsoft is too easy to make fun of, because I really like this push into Linux and more of this world. But I also sort of read this as: we had 20,000 lines of code to bend Linux to our will, to make it work, at least remotely, with Microsoft products and services.

[01:11:42] Speaker A: I think this is the reason why we don't hate on Microsoft as much as we used to 10 years ago, or even 15 years ago now: they did start embracing open source. It wasn't Windows-only and .NET-only, and they acknowledged that they're not the best at everything. When you eat a little humble pie, suddenly it's like, okay, well, if you're willing to admit that and willing to have a conversation... I'm still not willing to buy Azure, but that's just me personally. I mean, Matt's already committed. But I appreciate that the change is there, and if I were forced to support Azure at some massive scale, I would probably complain a lot about it, but also not be that upset, until all the issues that happen to Matt every week happened to me, and then I'd get really mad.

[01:12:30] Speaker D: So, to be fair, you have a lot of issues with GCP that I don't have.

[01:12:35] Speaker A: True.

[01:12:35] Speaker C: And when we were on AWS, we complained about that too. We're going to complain about wherever we are.

[01:12:41] Speaker A: About everything.

[01:12:41] Speaker C: Yeah.

[01:12:42] Speaker D: I mean, no matter what, all of us are complainers, and we're all, you know, cynics and everything else, so you've got to take what we say for what it is.

[01:12:48] Speaker C: Yeah.

[01:12:49] Speaker A: I mean, we were just having a conversation in one of the Slack channels, I think the Last Week in AWS one, about how the us-east-1 tire fire is kind of no longer a tire fire.
And then it kind of crashed the next day with Cloudflare, which was hilarious. We spoke it into the world that it was bad, but it wasn't a major outage. The thing is, it's gotten so compartmentalized now, because, A, it's so big, but number two, they've done a lot of segmentation there. But I remember complaining about us-east-1 all the...

[01:13:14] Speaker D: ...time, on a daily basis. It was dying.

[01:13:18] Speaker A: So yeah, every cloud has its own issues and problems. And again, I think there are workloads that make more sense on Azure, I can admit that. If you're a heavy Windows shop with a lot of SQL Server, Azure is probably your best bet. If you're really into open source and being able to build quickly and break things, I think Amazon is beneficial for you. If you really like Kubernetes and big data things, I think Google's your place. I think they all have their niche.

[01:13:46] Speaker D: I mean, that's why I always say: use the right tool at the right time in the right place. Depending on what your business is, that will kind of dictate where you go with everything.

[01:14:01] Speaker A: Well, speaking of contributions to open source, Microsoft is giving DocumentDB, an open source MongoDB-compatible database built on PostgreSQL, to the Linux Foundation, to ensure vendor-neutral governance and broader community collaboration. I don't know how Amazon missed this, by the way, in the lawsuit territory of, hey, you're stepping on our name. I'm shocked that this has happened. The project provides a NoSQL document database experience while leveraging PostgreSQL's reliability and ecosystem. The move positions DocumentDB as a potential industry standard for NoSQL databases, similar to ANSI SQL for relational databases, with companies like Yugabyte and SingleStore already joining the technical steering committee. AWS's DocumentDB, though, has remained proprietary since they launched it, even though it does basically the same thing. DocumentDB offers developers MongoDB wire protocol compatibility without vendor lock-in, using standard PostgreSQL extensions under the MIT license. Rather than requiring a forked database engine, this approach allows existing PostgreSQL deployments to add document database capabilities without migrating to a separate system. The project targets organizations wanting MongoDB-style document databases but preferring PostgreSQL's operational model, backup tools and existing infrastructure investments. Unlike Azure Cosmos DB's multi-model approach, DocumentDB focuses specifically on document workloads with PostgreSQL's proven scalability. Under Linux Foundation governance, DocumentDB provides an open alternative to proprietary document databases from cloud vendors, potentially reducing costs for self-managed deployments while maintaining compatibility with MongoDB applications and tools. Now the question is: can I take these DocumentDB extensions and go put them onto Cloud SQL from Google, so I get the same thing in the Google world without having to use Firestore? That's the real question.

[01:15:41] Speaker D: I don't really have much to say on this one. It's good that they're sharing it, and it goes back to their prior blog post that we talked about. It's a nice-to-have. We probably should have merged the two articles into one conversation.

[01:15:54] Speaker A: Probably could have.
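The wire protocol compatibility is the whole pitch here, and it applies to the Firestore story above just as much as to DocumentDB: your existing MongoDB driver shouldn't care what's actually behind the endpoint. A minimal sketch with pymongo, using a hypothetical connection string:

```python
from pymongo import MongoClient

# Hypothetical endpoint: any MongoDB-wire-compatible backend (DocumentDB,
# Firestore with MongoDB compatibility, or MongoDB itself) looks the same here.
client = MongoClient("mongodb://app-user:secret@db.example.com:27017/?tls=true")

db = client["inventory"]
db.products.insert_one({"sku": "tcp-319", "qty": 42})
print(db.products.find_one({"sku": "tcp-319"}))
```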
All right, did we lose you guys? Well, if that didn't excite you, man, do I have more excitement for you, with Azure Bastion now supporting connectivity to private AKS clusters via tunneling. Which, if you were listening to any of our prior episodes, you know you could have just done with Tailscale a long time ago, but now it's built into AKS. Azure Bastion now enables secure tunneling from local machines to private AKS cluster API servers, eliminating the need for VPN connections or exposing clusters to the public Internet, while maintaining standard kubectl workflows. This feature addresses a common security challenge where organizations want private AKS clusters but struggle with developer access, competing with AWS Systems Manager Session Manager and GCP Identity-Aware Proxy for Kubernetes access. The tunneling capability works with existing Kubernetes tooling and supports both private clusters and public clusters with API-server-authorized IP ranges, reducing operational complexity for teams managing multiple cluster types. This is ideally meant for enterprises with strict security requirements and regulated industries that need private clusters but want to avoid managing complex VPN infrastructure or jump boxes for developer access. I mean, if you're a regulated industry that needs private clusters, I'm pretty sure cost is not your problem.

[01:17:02] Speaker D: Yeah, I don't...

[01:17:03] Speaker C: Well, it might be if you use this. The pricing model seems outrageous.

[01:17:07] Speaker D: No, Azure Bastion is actually normally pretty good. We use it at the day job; it's really not bad, and it lets you go to multiple places.

[01:17:16] Speaker A: It's really like a zero-trust network solution, right?

[01:17:19] Speaker D: Yeah. You pretty much can log in from the console or from your command line. Think AWS SSM, where you use SSM to get directed to a box: it's a service, so it can jump you right to Windows boxes, to Linux boxes, through the bastion.

[01:17:35] Speaker C: But if it's 10 cents an hour, or almost 10 cents an hour, before data transfer costs, like, how is...

[01:17:42] Speaker D: How much data transfer are you doing? Unless you're uploading and downloading files over your bastion host... it's what, $72 a month to get this set up in the right spot?

[01:17:54] Speaker C: I mean, we've long slammed, you know, NAT gateways for their hourly charges and the whole thing, and so I'm just a little confused. Maybe I don't understand how this is deployed. So maybe, while they're listing it out at almost 10 cents an hour, it would be something that wouldn't really be deployed for the full hour.

[01:18:13] Speaker D: No, it's deployed for the full hour. It's there, it's running 24/7. You can scale it up and down, and you only need one to connect to multiple things. So unless you have multiple completely isolated networks, it's not like you have one and I have one; you don't need 50 of them running there. I think there's a limit of something like 10 concurrent connections per bastion host, so there is a limit per bastion where you would need to scale, but most of the time you don't need to scale it.

[01:18:41] Speaker A: Well, and unlike a NAT gateway, you're not paying for all the data flowing; you're only paying for the data that goes to this bastion host, which is going to be significantly less than what a NAT gateway processes.
Because on a NAT gateway, you're paying basically four and a half cents per hour for the gateway itself, then you're paying four and a half cents for every gigabyte processed through the gateway, plus you're paying a data transfer charge on top for EC2 data transfer to the gateway. So NAT gateway costs get pretty large frequently, because all that traffic is constantly flowing in a typical NAT gateway scenario, where this is only when you're accessing one of those hosts. So I imagine that, yes, the 10 cents per hour is a little steep, but you're not paying for the same level of data throughput as a NAT gateway does. I don't disagree, I think it is a little expensive for what it is, but it depends.

[01:19:28] Speaker D: The premium tier actually is pretty cool, where it will record everything that happens, so you have that for compliance purposes.

[01:19:36] Speaker C: Yeah, privileged access management. Yeah.

[01:19:38] Speaker D: So that part of it's pretty nice. We use the standard tier; we've looked at the premium tier a few times, and I think we'll eventually get to it. But it's a pretty nice service, and now that it lets you directly connect to the private endpoint, it's one less networking headache you have to deal with. So if you think about it, what's the cost of running this at $75 per month versus a VPN and the overhead of...

[01:20:01] Speaker C: Running that. Or a dedicated bastion node, which is also going to be expensive.

[01:20:05] Speaker A: Yeah, I mean, you're running a compute instance at that point, one you have to patch, maintain, et cetera. So do the ROI calculation before jumping on this plan, but it might work out just fine.

[01:20:15] Speaker D: From what I've seen, it's worth the price, and it saves a lot of headache.

[01:20:23] Speaker A: And Matt's super cheap, so if Matt says it's worth the price, that's true.

[01:20:26] Speaker C: That's true.

[01:20:28] Speaker A: Yeah.

[01:20:28] Speaker C: I mean, I do like the security model, and we're seeing more and more of this be sort of productized and exposed directly, which is great, because the days of routing everything through a single tunnel in your VPN, and dealing with that capacity and those limitations, are over. So this is pretty great. It's a lot like the GCP Identity-Aware Proxy, which I am a heavy user of, and I'm using it more and more. So I like this.

[01:20:59] Speaker D: I mean, from a security perspective, you can immediately say (pending you haven't done anything else stupid) that there are no public IPs, and you use your Azure credentials to get in, which they now forcibly protect with MFA. So essentially you can say to your customers: any access to a box requires two-factor authentication. That makes a lot of customer questions immediately go away, because you're not saying, oh, you have to have our Palo Alto or our Zscaler IP address or anything like that. And from there you can put extra controls on it, allow certain IPs and...

[01:21:34] Speaker C: Whatnot, if you really want.

[01:21:36] Speaker D: So there are extra controls that we've implemented at my day job, but from day zero you can just say there's nothing public there, and it works really well. I have heard very few complaints from my development team, from my ops team, from everyone about the tool, and for $75 a month, it beats a VPN any day of the week, for sure.
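For the FinOps-curious, here is the back of the envelope from that exchange: a rough sketch using the roughly 10-cents-an-hour Bastion price quoted above and AWS's published us-east-1 NAT gateway rates for contrast. The 2 TB of monthly NAT traffic is an assumption, and traffic volume is the variable that actually decides it:

```python
hours_per_month = 730

# Azure Bastion: flat hourly rate discussed above (~$0.095/hr); data transfer
# is extra, but interactive admin sessions are comparatively light.
bastion = 0.095 * hours_per_month

# AWS NAT gateway (us-east-1): $0.045/hr plus $0.045 per GB processed.
nat_gateway = 0.045 * hours_per_month + 0.045 * 2048  # assuming ~2 TB/month

print(f"Bastion ~${bastion:.0f}/mo vs NAT gateway at 2 TB ~${nat_gateway:.0f}/mo")
# Bastion ~$69/mo vs NAT gateway at 2 TB ~$125/mo
```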
[01:21:59] Speaker A: Azure Application Gateway now provisions new instances during rolling upgrades before taking old ones offline, through max surge support, eliminating the capacity drops that previously occurred during version transitions. How long has this been a problem, Matt?

[01:22:14] Speaker D: It's one of those things... I think we talked about this a few weeks ago. I have a very love-hate relationship with the App Gateway. For such a critical service as a load balancer, and it's really your only native load balancer service... I mean, there's NGINX that you can get as, I don't know what they call it, but you can buy NGINX through them, which will run your load balancer as a native service. It's missing so many core features, and this is one of them: when they would upgrade your load balancer, you could monitor it with Resource Health, and you would get alerts saying, hey, the resource is down because a node is offline.

[01:22:52] Speaker A: Because all they're running is scale sets.

[01:22:53] Speaker D: Underneath the hood, would be my assumption. And instead of adding a node and then subtracting a node, they would just delete a node and replace it. And AWS has had this on EC2 instances for, I don't want to guess the number of years, because I remember the press release and I'm trying to gauge it... well before COVID.

[01:23:12] Speaker C: Let's start with that.

[01:23:14] Speaker D: So it's just amazing that this wasn't there natively, and that it's something you even have to think about. It's supposed to be a fully managed service. In fact, I have to tell it, or I can tell it, you know... in my case, my day job uses a lot of WebSockets, so I have to tell it that, because otherwise it doesn't quite handle WebSockets versus other things, along with the auto scaling settings: tell it the number of nodes, tell it to do these things. It still feels like a very clunky managed service, and you still have to bring your own certificate.

[01:23:51] Speaker A: Sweet. Who doesn't love that? Rant over.

[01:23:56] Speaker D: But it's a great feature that they're finally bringing, and it's fewer things to deal with, so you get fewer drops and whatnot, because instead of all of a sudden being down, you will have extra capacity. The question is, will they charge you for that extra capacity as they do it? Which I'm assuming the answer is: yes, of course.

[01:24:11] Speaker A: Well, our final Azure story is that Azure Migrate now enables direct migration to zone-redundant storage disks, which automatically replicate data synchronously across three availability zones in a region, for enhanced durability and availability compared to locally redundant storage. This feature addresses a key gap for organizations requiring high availability during cloud migrations, as they can now attain zone redundancy from the start rather than converting disks post-migration, reducing operational overhead and potential downtime. ZRS disks provide 12 nines of durability over a given year and protect against data-center-level failures, making this particularly valuable for mission-critical workloads that need continuous availability during zone outages. The feature targets enterprises with strict compliance requirements and those running stable applications where data loss or extended downtime during zone failures would have significant business impact.
Though ZRS disks typically cost 50% more than standard locally redundant storage. So, I think I understand this: you took an EBS volume on AWS, and you basically said, we're going to make that EBS volume multi-AZ.

[01:25:11] Speaker D: Yes.

[01:25:12] Speaker A: And that's what this feature is. And to do that, you used to have to migrate the data through RoboCopy or XCOPY or some other terrible method from your old single-zone disk to this zone-redundant thing. But now you just click in the console, say make it zone redundant, and it just does it. Is that my interpretation? Is that correct?

[01:25:30] Speaker D: It's a good way to spend some money very quickly.

[01:25:33] Speaker A: Yeah. How much does this thing cost over a normal volume?

[01:25:38] Speaker D: I went through a whole exercise recently when we were doing some FinOps work and playing with our bill, and I learned that the way pricing works on Azure bills for managed disks is its own level of complexity. Because you can't just pick any size: you can say I want a 32-gigabyte disk, but if you want a 50-gigabyte disk, you can't do that. There are certain sizes you can choose, which is its own level of fun. But it looks like... no, that's for snapshots. I'm trying to find it.

[01:26:16] Speaker A: I mean, so does that have trade-offs? Like, I know in the Amazon world, EBS Multi-Attach is great as long as you want to handle your own coordination between the nodes, so they don't overwrite each other or cause all kinds of blocking issues on the block storage. GCP has a much better regional persistent disk option, much better than Amazon's. So is this more like AWS's or more like GCP's?

[01:26:42] Speaker D: It's more like AWS's, as far as I understand, but this is more for backup. If you're running a file server in one region in one zone and that zone goes down, your data is still in the other zone, so you can spin up a server and attach it, versus having to snapshot and restore from snapshot and all that. I think that's more the use case this article is attacking than the multi-attach feature. So it's more for resilience than "attach it from two places at once," which you can already do. And I think it's still tied to a zone; you can't do that cross-zone.

[01:27:25] Speaker C: I don't think EBS Multi-Attach allows you to do cross-zone. It doesn't; you can only connect stuff within a zone.

[01:27:32] Speaker A: So, I mean, EBS Multi-Attach is a joke. You had to basically put some Symantec or Veritas product of the era on top of it; there used to be a bunch of custom schedulers for disk operations that were used for clustering on Linux back in the day, and that's basically what Amazon provided with EBS Multi-Attach. So it's not a great feature, and the fact that this is more like that, versus GCP's, is kind of a bummer.

[01:28:00] Speaker D: Well, this isn't even for that. They already have that multi-attach feature. This is just for data resiliency: if a zone goes down, you don't have to restore from backup, it's already there. So if you have an auto-healing node, or what I always call a scale set of 1-0-1, or an ASG of 1-0-1, that disk is already in that other zone for you. You don't have to wait for a restore: the server boots back up, configures itself, attaches the volume, and you're done, because the data is already in both places.
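And if you'd rather just build it zone-redundant on day one instead of migrating into it, creating a ZRS managed disk directly looks something like this: a sketch assuming the azure-mgmt-compute Python SDK, with placeholder subscription and resource names:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

# Placeholder subscription id and resource names.
client = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")

poller = client.disks.begin_create_or_update(
    "my-rg",
    "data-disk-zrs",
    {
        "location": "westus2",           # ZRS disks are limited to certain regions
        "sku": {"name": "Premium_ZRS"},  # zone-redundant premium SSD
        "creation_data": {"create_option": "Empty"},
        "disk_size_gb": 128,
    },
)
print(poller.result().provisioning_state)
```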
[01:28:33] Speaker A: I don't know. This one feels weird.

[01:28:36] Speaker C: I mean, I totally get this, having had several workloads where the DR strategy was: restore from backup, copy disk, make disk, launch compute node, attach to compute node.

[01:28:49] Speaker D: Pretty sure I wrote that exact script for you like four times.

[01:28:52] Speaker A: Yeah, exactly. At least now you can restore from snapshot and get the disk performance that's expected. Back when you guys wrote that process, you would restore from a snapshot and then performance would be dog slow while it basically replicated across a bunch of nodes to get to scalability.

[01:29:11] Speaker D: But I think that's only if you have fast snapshot restore enabled and stuff like that; there are things you have to pay for.

[01:29:16] Speaker A: Yeah. I mean, again, this was what, five years ago? Seven years ago? Things have come a long way even since then.

[01:29:23] Speaker D: Yeah, there was an article I was reading in Corey Quinn's Last Week in AWS. It was "things that you used to have to do on AWS that you no longer have to do," and dd was one of them. I was reading it and I was just like, okay, I've officially been on AWS too long, because I remember all these things.

[01:29:41] Speaker C: Yep, I have to go read that. That sounds fascinating. Corey's second shout-out this episode.

[01:29:49] Speaker A: All right, and then our final cloud story for this week: DigitalOcean's MCP server is now available. The server enables developers to manage cloud resources using natural language commands through AI tools like Claude and Cursor. The server runs locally and currently supports nine services, including App Platform, databases, Kubernetes and Droplets. The implementation allows developers to use plain-English commands like "deploy a Ruby on Rails app from my GitHub repo" or "create a new PostgreSQL database," instead of writing scripts or navigating multiple dashboards. Security is managed through service scoping, where developers can restrict the AI assistant's access to only specific services using flags; this prevents context bloat and limits access to only necessary resources, while maintaining audit trails and error handling. The service is currently free and in public preview, with hundreds of developers already using it daily for provisioning infrastructure, monitoring usage and automating cloud tasks. And it works with Claude, Cursor, VS Code, Windsurf and other MCP-compatible clients.

[01:30:38] Speaker C: Of course it's free for developers. "Add more memory, add more memory, add more memory." And now they can do it using natural language instead of having to figure out where the config file is. I mean, I do like this, you know. I like to make fun of it just because it is sort of dangerous, right? For a long time, before we had infrastructure as code, we had ClickOps, and it was very difficult to make a repeatable environment using ClickOps. If you're generating environments through natural language, you're going to have a really hard time getting the same thing twice.
But for experimentation, and for prototyping, and for just quick-and-dirty things? That is pretty cool.

[01:31:23] Speaker D: I'm terrified of developers going, "I told my MCP to generate my Postgres database," and then having to go make that work in prod.

[01:31:31] Speaker C: I asked it to write me a thing that would never fail, ever.

[01:31:37] Speaker D: Then I got the bill, and my CFO told me I was fired. CFOs really should not listen to our podcast. We make fun of them.

[01:31:45] Speaker C: Oh, they don't. Don't worry, they don't.

[01:31:49] Speaker A: And it wasn't just the day job either, so... yeah, exactly. So, we did take homework last week, both Ryan and I, which we have completely failed, because we both tried to watch the video, and it was so boring that we got about 10 minutes into it and went, yeah, we shouldn't have made that promise. The guide to platform engineering is a good blog post. The video is a recording of a conference talk, and it's exactly what it sounds like: very similar to what's in the article, and it didn't get into a lot of detail. I did fast-forward through it, and then I used the power of YouTube to ask Gemini a bunch of questions about the video to help me understand it, so I didn't have to watch the whole thing. And I will look for better content for you guys for a future Cloud Journey episode on this topic. How's that?

[01:32:36] Speaker C: I wasn't even smart enough to use the AI summarization. I just turned it off.

[01:32:41] Speaker D: I got about five minutes in.

[01:32:43] Speaker A: Well, the thing about Gemini summarization of YouTube videos is that it's awesome. I only learned about it because, when we were doing all of our prediction shows and trying to figure out how many times they said "AI" on stage, I would take the video, put it into Otter.ai, get the transcript, then do a grep on it for how many times the thing was said, and it was a really painful process. And then I saw the little Gemini star icon that they now put on everything, and I was like, oh, it has Gemini. I clicked it and said, tell me how many times they said AI, and it popped up an answer in like 30 seconds. I was like, oh my God. So now I use it all the time on YouTube videos. There's a lot of boring content, and I'm like, hey, tell me where in the video this happens, because that's what I'm actually here for. And it'll tell you, oh, it's at this timestamp, blah, blah. And so I can skip through all the noise, the "tap here to like this video and subscribe" BS nonsense that is in every YouTube influencer video you've ever seen, and get to what I cared about. So yeah, it's good.

[01:33:41] Speaker C: And you can see it being built in natively to... I'm going to go off on a rant, or a tangent: I've long hated searching for how to do something and finding the answer is a video. And now, in the search results, it tells you where in the video that happens. It's fantastic.

[01:33:57] Speaker A: Or, you know, you can literally say, hey, this video doesn't have steps, can you break it down into steps for me? And it'll actually create the steps, so that's good too. Like, there was a cooking recipe video I was watching, and...

[01:34:07] Speaker C: Great idea.
[01:34:08] Speaker A: I was like, oh, could you turn this into a recipe that I could use to make this at home, in the proper order of preparation? And it took what they did and turned it into something I could actually follow, which was great.

[01:34:19] Speaker C: Oh, that's genius. I'm totally going to do that, because I'm constantly doing that with, like, electronics or car stuff.

[01:34:25] Speaker D: I was doing plumbing earlier today, and I was watching YouTube videos going, just tell me where this bolt is, or how I detach the...

[01:34:32] Speaker A: Yeah. Well, another thing I'm using AI a lot for: I'll see a symbol, like on my dash, or I'll see something somewhere, and I'm like, what is this? You take a photo of it, like with Claude, and say, hey, what is this? And it'll literally tell me exactly what it is, how I would use it, and why it's there. It's kind of nice. And then one of our neighbors got some birds that were very loud and noisy, and I did not quite recognize what the sound was. So I literally recorded the sound and put it into one of the models, and it told me exactly which bird it was, et cetera, et cetera. Which was just fascinating.

[01:35:12] Speaker C: So, what foods are dangerous for it?

[01:35:15] Speaker A: Yeah, what things not to feed it, what regulations it's violating in the city ordinance that I can complain about, things like that. Minor quibbles, but that problem got fixed, whatever method you want to go with. Exactly. Well, it's been another fantastic week here at The Cloud Pod. We will see you next week.

[01:35:36] Speaker C: Bye, everybody.

[01:35:37] Speaker D: Bye, everyone.

[01:35:41] Speaker B: And that's all for this week in cloud. We'd like to thank our sponsor, Archera. Be sure to click the link in our show notes to learn more about their services. While you're at it, head over to our website at thecloudpod.net, where you can subscribe to our newsletter, join our Slack community, send us your feedback, and ask any questions you might have. Thanks for listening, and we'll catch you on the next episode.
