[00:00:08] Speaker B: Where the forecast is always cloudy. We talk weekly about all things AWS, GCP, and Azure.
[00:00:14] Speaker C: We are your hosts, Justin, Jonathan, Ryan, and Matthew.
[00:00:18] Speaker A: Episode 312 recorded for July 8, 2025. Azure Firewall finally learns to spell. FQDN Edition.
Good evening, Ryan, Matt, how you doing?
[00:00:29] Speaker D: Learning to spell.
[00:00:31] Speaker A: Good, good. Thank you for hanging in and doing the show notes and the show last week. You guys did a great job. I listened to the episode at 2x speed, because I can't listen to you guys at normal speed, at least in an audio recording format. And so, yeah, I listened to the whole thing. You guys did a great job. Thank you for filling in for me in a pinch. You're going to do it again in a couple weeks while I'm in India once again, but I always appreciate you guys stepping in and doing my job.
[00:00:55] Speaker D: And we ran the asylum, hopefully didn't burn it to the ground, and didn't lose too many listeners. We'll find out soon.
[00:01:01] Speaker A: Find out soon. I'll be posting on that.
All right, well, we don't have a ton of news this week. Shocker. Which is good, because it's been some marathon episodes as of late.
I was even impressed you guys made it as long as you did on the last episode, because, wow, that was a long one for just the two of you, which is pretty good.
[00:01:18] Speaker C: We got about halfway through before we were looking at each other with the look. We knew we should have culled a lot more.
[00:01:26] Speaker D: They were good stories. We had some fun. Just making fun of them was the problem.
[00:01:30] Speaker A: Yeah, they were all good stories. You guys are good commentary too, so nice job.
[00:01:33] Speaker D: The question is, was the sass level too high or not?
[00:01:36] Speaker A: No, I thought the sass level was perfect. I'd like to be sassy.
[00:01:39] Speaker D: All right, cool.
[00:01:41] Speaker A: I'm always about that. But I did enjoy that you stumbled over my Azure bug in the show note bot. I have it basically trying to tell me, hey, what makes this feature different than AWS and GCP? And maybe three weeks ago we were recording, and I said that line, and then Ryan goes, that's not true. And I was like, oh, wait, it's hallucinating there. And then you were like, wow, Azure's really going after AWS and GCP. I'm like, oh yeah, I should probably fix the bot so it's not doing that. I've just been removing that line, because I know which bullet point it is every time, so I just remove it when I do the show notes.
But, yeah, you guys stumbled right across that bomb. So sorry about that one. That one's on me. I'll take that one.
[00:02:23] Speaker C: That free fight.
[00:02:25] Speaker A: All right, well, let's get into "AI Is Going Great, or How ML Makes Money." And Ryan, you went and did a thing we've had in the show notes for a few weeks for you to collect your thoughts and feelings on, and I'm going to turn it over to you to tell us all about it.
[00:02:38] Speaker C: All right. So I was invited, as part of a small group of security professionals, to the Google offices in San Francisco, and went to a little hands-on, sort of product pitch on how to secure AI workloads.
And I was really fascinated with how they did that. I mean, it's a big hot topic. Everyone's talking about AI, you've got to have it in everything, and how to secure it, just like every other new technology wave, is a case of coming from behind and trying to figure out how to actually get it done. The sort of normal ways of exploiting AI have, I guess, been handled in a sense: trying to get more information through prompts, trying to reset the context of the bot so it's no longer in its helpful agent mode, or just trying to get different models to give you nefarious recipes or bad advice, things that can be reputationally damaging for public chatbots. Those are the ones that are more known, more pattern based. So I was really interested to see how they were going to do a hands-on lab for those things, because it's a little difficult. If you think about Model Armor and how it works, it's a service that you run inline with your own service. Like the DLP service, it's an API where you send your content and it sends results back. So how do you hands-on lab that?
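To make that inline-API point concrete, here is a rough sketch of screening a user prompt with a Model Armor-style service before it ever reaches the model. The endpoint path, template name, and request/response field names are assumptions for illustration, not the documented API:

```python
# Minimal sketch of an inline prompt-screening call, Model Armor style.
# NOTE: the endpoint, resource names, and request/response fields below are
# illustrative assumptions -- check the actual Model Armor docs before use.
import requests

PROJECT = "my-project"      # hypothetical project
LOCATION = "us-central1"    # hypothetical region
TEMPLATE = "my-guardrails"  # a pre-created screening template (hypothetical name)

def screen_prompt(user_prompt: str, token: str) -> bool:
    """Send the user's prompt to the screening API before the model sees it."""
    url = (
        f"https://modelarmor.{LOCATION}.rep.googleapis.com/v1/"
        f"projects/{PROJECT}/locations/{LOCATION}/templates/{TEMPLATE}:sanitizeUserPrompt"
    )
    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {token}"},
        json={"userPromptData": {"text": user_prompt}},
        timeout=10,
    )
    resp.raise_for_status()
    result = resp.json()
    # The calling application, not the service, decides what to do with a
    # match: block the request, redact, or log and continue.
    return result.get("sanitizationResult", {}).get("filterMatchState") != "MATCH_FOUND"
```

The key design point from the lab: the service is purely request/response, so every protection demo has to be wired into an application that calls it, which is exactly why Google built the dummy apps described next.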
And I was really impressed with the Google team on how they did it. They basically built, probably via AI, a whole bunch of little dummy applications that allowed you to select models and select different levels of protection, and then prompt at your own desire to see how these protections come into play.
And as part of that, the app showed you a lot of the backend logic being executed at the application layer, which you wouldn't normally see, just so you could see how it works, see how the sausage is made a little bit. And I was impressed, because how we're thinking about AI and how we're protecting it is still evolving and going to be changing rapidly, and having real-world examples really helped flesh out how their AI services are integrated into a security ecosystem.
It was pretty impressive and it's something that's near and dear.
I've been working on trying to roll out Google Agentspace and different AI workloads, trying to get involved and make sure that we're getting visibility into all the different ones, and it was really helpful to think about it in those contexts.
So that was the thing.
[00:05:48] Speaker A: You reminded me, talking about labs and Google, that I'm still bitter years later about Google buying Qwiklabs.
I mean, it's nice I get to use Qwiklabs still, because they're still the preferred lab solution for Google Cloud. But I've tried to use AWS builder labs and they are not as good as Qwiklabs. And so you just reminded me that the acquisition happened and made me sad all over again. So thanks.
[00:06:13] Speaker C: Yeah, you're welcome. But I mean, the good news is...
[00:06:15] Speaker A: That was 2016, and it still stings.
[00:06:16] Speaker C: Well, if you're on GCP, it's great, right? And so it was a pleasant surprise, since I still own the sort of Cloud Skills Boost relationship, which is how they rebranded Qwiklabs. It's still very much Qwiklabs; the invite email still comes from qwiklabs.com. And it's just really well done. All of their labs, the content, is well done. The virtual environments are really where they shine, and yeah, I like it a whole lot. I think there's a fair amount that's free to the community, and even if you don't have an enterprise license, I think you can purchase tokens and run some of the more expensive, more in-depth courses in a piece-by-piece fashion. It's really neat. I like it.
[00:07:10] Speaker A: In your experience with Google's tools in this space, do they feel as MVP-ish as, say, AWS's new services, or do they feel more fully baked?
Typically, because you're using MVP products in most cases for new services, you find the rough edges quickly, or where all of a sudden the Lambda spackle comes into play.
In your experience, this security offering being somewhat newer, does it feel as MVP-ish, or does it feel more fully baked?
[00:07:41] Speaker C: It feels more fully baked. I hesitate to say fully baked just in general, but the integration between different services, like GCS storage and that, is just a little bit nicer in GCP. Comparing sort of the Vertex Model Garden to Bedrock, it's similar features, but I find GCP a little bit more intuitive, depending on what you're trying to do. Right? And so one of the things I wanted to do playing around with Bedrock was, I'm just going to fire up a quick chatbot and point it at different models and see what happens. And so much other stuff had to be built in order to make that happen.
[00:08:22] Speaker A: Right.
[00:08:22] Speaker C: Versus Vertex AI, where you click a tab and you're presented with a configuration screen with knobs and a chat dialogue, which is great. And so, yeah, I find GCP just a little bit more polished, and it's a little bit easier to get started from zero on a lot of their services, which is nice, because I imagine, just like everyone, we're all just trying to learn as fast as we can as things change.
[00:08:53] Speaker A: Well, Oracle got a big contract, big, big contract from an AI vendor.
You guys know who it is, but maybe the biggest one, the one that has a pretty tight relationship with Microsoft in Redmond.
[00:09:08] Speaker C: And an exclusive hosting announcement, I believe.
[00:09:12] Speaker A: Yeah, I do believe we talked about that being changed recently when they announced Stargate, but apparently they've now made Stargate real by signing their $30 billion annual cloud computing agreement with Oracle for four and a half gigawatts of capacity, making it one of the largest AI cloud deals to date and nearly tripling Oracle's current $10.3 billion annual data center infrastructure revenue. That's one big customer you don't want to lose, because that's going to have an impact on your stock if you ever lose them.
This deal represents a major expansion of the Stargate data center initiative, a $500 billion joint venture between OpenAI, SoftBank, Oracle and Abu Dhabi's MGX fund, aimed at building AI infrastructure across multiple US states, including Texas, Michigan and Ohio.
Oracle plans to purchase 400,000 Nvidia GB200 chips for approximately $40 billion to power the Abilene, Texas facility, positioning itself to compete directly with AWS and Microsoft in the AI cloud infrastructure market. The four-and-a-half-gigawatt capacity represents about 25% of current US operational data center capacity, highlighting the substantial infrastructure requirements for training and running advanced AI models at scale. This partnership signals a shift in the cloud landscape, where traditional database companies like Oracle are becoming critical infrastructure providers for AI workloads, potentially disrupting the current cloud provider hierarchy. I mean, 400,000 GB200 chips. How long do you think it's going to take Nvidia to deliver those?
Let me know.
[00:10:31] Speaker D: Long enough that they will probably no longer be the prominent chip and there'll be a new version.
[00:10:37] Speaker C: Yeah, I mean, it's hard to imagine numbers at that scale, thinking about how hard it's been to get chips and get access to chips. And I guess maybe this is it. Maybe it just takes so many more of them than I would expect.
Craziness.
[00:10:54] Speaker D: Yeah, I mean, the 25% is crazy. They're literally saying that they will be using 25% of US data centers' current capacity, let alone the environmental effects and everything else. And then the next question is, is Oracle kind of trying to move into a niche play of saying, hey, we have all these chips, we are the AI vendor. If you want to do your training models or anything, do it here, because we have the capacity, we have the power, we have all that. Kind of trying to become, I don't want to say a niche player, but a more specialty-focused player.
[00:11:38] Speaker A: So my very quick internet research says that in October of 2024, they felt that in Q4 they could deliver 150,000 to 200,000 units of the GB200 AI server.
So at that rate it takes half a year for them to deliver 400,000 of them, and Oracle is just one vendor buying these things. So you have to imagine it takes multiple years to get all of the GB200s that this order requires. But I mean, it sounds like they're...
[00:12:12] Speaker C: They're going to have to, unless the data centers are already built. It sounds like you're talking building permits and construction.
[00:12:19] Speaker A: Well, I think they're in the process of being specced out and built, et cetera. Part of that Stargate initiative was the announcement they were doing it.
[00:12:25] Speaker C: Sure.
[00:12:25] Speaker A: But that was what, construction permits? Yeah, I mean, they're probably just getting through environmental impact studies and that kind of stuff. But yeah, 25% of current US operational data center capacity. That's all US data center capacity, which is crazy.
I don't know.
The green sustainability impact of AI is, I think, really unmeasured at this moment, because there are also byproducts of creating 400,000 chips.
I don't know what all the environmental impact of creating them is, but I'm sure the carbon footprint of AI has got to be massive.
[00:12:59] Speaker C: Yeah. And until we find ways to power the infrastructure of all these things completely with sustainable energy, it's going to be a nightmare.
[00:13:11] Speaker D: You know, I have an old friend who is an architect of data centers, like the physical buildings and everything. So I weirdly know probably more than a lot of people do about building a data center, just from talking to her over the years.
And I'm sitting here going, okay, I know AWS, Google, and Microsoft all have dedicated teams of specialty people for plumbing, for electrical, for project managers, all focused on building their data centers and building out all their infrastructure. So, okay, Oracle somehow needs to build enough data centers and do all that. It's not just the environmental permits we were talking about; they're also going to have to get the people and costs set up to operate in that manner and build out the internal team. I'm sure they have some. We joke that their data centers are just trucks driving around in circles, but to do something at this scale they're going to have to upgrade from trucks. Maybe they move to semis. But they still need to build all that out, figure out the electrical, the plumbing, all the internals, and operating at that massive a scale is a different level than they're at right now, at least I feel.
[00:14:26] Speaker A: I don't disagree with you at all.
Well, let's keep an eye on that one, because when they start recognizing that revenue, Oracle suddenly becomes maybe a solid number three against Google in the hyperscaler space.
[00:14:41] Speaker C: I'm just shocked, because when they announced it, I for sure thought these data centers would be hosted by Azure, naturally. So, nope.
[00:14:51] Speaker A: I mean, they said even then, when they announced Stargate, that they changed the exclusivity contract, and now Azure had a first right of refusal on whether or not to build out infrastructure capacity for them. And I think that was why they went with Oracle for this particular one. But yeah, this is also Azure giving up $30 billion in revenue.
[00:15:12] Speaker C: That's the part I'm confused by. Yeah, like, wow. But I guess, I mean, I don't know; I'm sure the realities of the project came into play.
[00:15:23] Speaker A: Well, and it makes you wonder, how much is Anthropic paying per month, and all these others. What does Gemini cost Google per month to run? Google's revenue comes in more directly, but Anthropic has a direct cost and a direct revenue perspective. So yeah, a lot of economic questions about how AI works itself out long term.
[00:15:43] Speaker C: We're in that honeymoon period where it doesn't quite have to make enough money. Right.
[00:15:47] Speaker A: We're in the hyper-growth phase where it doesn't have to make money, but when it does have to make money, it's going to get ugly. Interesting. Yeah.
Google is announcing new AI tools for mental health research and treatment which I may need when the AI market crashes.
Google's building AI tools specifically for mental health research and treatment, though the article appears via a survey page rather than containing actual content.
Going on to the article, though: cloud infrastructure would essentially be required for these tools to handle sensitive health data processing for HIPAA compliance and to scale support for healthcare providers and researchers, and mental health AI tools often integrate with existing cloud-based electronic health record systems and require robust security measures for patient data protection.
The development signals Google's continued expansion into healthcare AI applications. And really, at the end of the day, what I want to say is: I don't know how I feel talking to Gemini about my problems. It'd use them against me in the Skynet wars.
[00:16:37] Speaker D: Wait till that information then gets passed back in all your private conversations. Because at least with the therapist, you're normally talking one on one and, yeah, they tell you...
[00:16:46] Speaker A: It's not being used to train the model.
[00:16:48] Speaker D: Train the model to be used against you, or passed back to your healthcare provider, who then raises your premiums due to it.
[00:16:57] Speaker C: And I just don't know that we're ready. Right? There was an article recently in the Washington Post, which I just dug up so I didn't sound like I was spouting nonsense in our pre-read, where the advice is still very subject to hallucinations. It's not an easy subject to give to AI, which is kind of nuts. In this example, they were talking about a therapist bot recommending to an addict that they take a little bit of methamphetamine to stay alert at work. And it's just like, I don't think that's a good idea.
[00:17:34] Speaker A: That's a bad choice. Yeah, I think Stanford actually did a study on this as well, and showed that nearly 50% of individuals who could benefit from therapeutic services aren't able to reach them, and so AI could be a huge boon to those people. But they also said that the LLM-based systems are being used as companions, confidants and therapists, and while some people see real benefits, they find significant risks. And I think it's important to lay out the more safety-critical aspects of therapy. They talked about some of these fundamental differences, and that these things are basically hallucinating, dangerous bots that tell you to do bad things like take meth. So be careful.
[00:18:07] Speaker C: I mean the good news in this case is it was a researcher in a study and so the person being recommended meth wasn't real.
So that's at least good. But yeah, it was very clear that we're not ready.
[00:18:23] Speaker A: Moving on to AWS. Amazon Nova Canvas has been updated with virtual try-on capabilities, allowing you to combine two images, like placing clothing on your body or furniture into a room, using AI-powered image generation with three masking modes: garment, prompt, or custom image masks. Eight new pre-trained style options simplify consistent image generation across different artistic styles, including 3D animated family film, photorealism, graphic novel and mid-century retro, eliminating complex prompt engineering. The feature targets e-commerce retailers, Amazon being the one you might think of, who can integrate virtual try-on to help customers visualize products before purchase, potentially reducing returns and improving conversion rates. It's available immediately in the US East, Asia Pacific (Tokyo) and Europe (Ireland) regions with standard Amazon Bedrock pricing, covering images under 4.1 million pixels, or 2048 by 2048 max. The integration requires minimal code changes, using the existing Bedrock runtime to invoke the API with the new task type parameters, making it accessible for developers already using Nova Canvas without model migration.
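Since this rides on the existing Bedrock runtime, a try-on call looks roughly like the sketch below. boto3's invoke_model is the real API; the body schema (the taskType value and the virtualTryOnParams field names) is our reading of the announcement and should be checked against the Nova Canvas documentation:

```python
# Rough sketch of a Nova Canvas virtual try-on call via the Bedrock runtime.
# invoke_model is real boto3; the body fields below are assumptions from the
# announcement (taskType, virtualTryOnParams, maskType) -- verify in the docs.
import base64
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def b64(path: str) -> str:
    """Read an image file and return it base64-encoded, as the API expects."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

body = {
    "taskType": "VIRTUAL_TRY_ON",            # the new task type
    "virtualTryOnParams": {
        "sourceImage": b64("person.png"),     # the photo to place the product on
        "referenceImage": b64("jacket.png"),  # the product image
        "maskType": "GARMENT",                # or "PROMPT" / "IMAGE", per the 3 masking modes
    },
}

resp = bedrock.invoke_model(
    modelId="amazon.nova-canvas-v1:0",        # existing Nova Canvas model ID
    body=json.dumps(body),
)
# Nova Canvas returns base64-encoded images; assuming the same shape here.
images = json.loads(resp["body"].read())["images"]
with open("try_on.png", "wb") as out:
    out.write(base64.b64decode(images[0]))
```

The point about "minimal code changes" is visible here: it's the same invoke_model call existing Nova Canvas users already make, just with a different task type in the body.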
[00:19:25] Speaker D: Amazon's going to have a field day with this. They've been doing it for a long time in the app, so it doesn't surprise me. They're just figuring out how to monetize it, I think.
[00:19:34] Speaker C: Didn't they make a big push when the Echo devices were relatively new, or when they adopted the screen-and-camera version? So you could do some sort of try-on ability or something. I remember some sort of outfit rater or something like that that was going on. I do maybe remember that. Yeah.
[00:19:53] Speaker D: Yeah.
[00:19:54] Speaker C: So this is a continuation of that, I think. And it's interesting to see how AI factors into it. I'm sure that some people take advantage, and virtual try-on is nice, but can it really adjust to my size?
[00:20:12] Speaker D: I feel like the virtual try-on is probably not targeted at the three of us. Three tech guys are probably not going to be their target audience.
[00:20:20] Speaker C: As much as... I know how that conference T-shirt is going to fit. I do.
[00:20:27] Speaker D: I've had the same one for the last seven years.
[00:20:33] Speaker A: Oracle Database at AWS enables direct migration of Oracle Exadata and RAC workloads to AWS with minimal changes, providing a third option beyond self-managed EC2 or RDS for Oracle. This addresses a significant gap for enterprises locked into Oracle's high-end database features. The service runs Oracle infrastructure within AWS data centers, integrating with native AWS services like VPCs, IAM, CloudWatch and S3 for backups, while retaining Oracle's management plane. Customers get unified billing through AWS Marketplace that counts towards AWS commitments, the zero-ETL integrations with Amazon Redshift eliminate cross-network data transfer costs for analytics workloads, and S3 backups provide 11 nines of durability. The service supports both traditional Exadata VM clusters and fully managed Autonomous Database options, and is currently available in US East and US West regions, with expansion planned to 20 additional AWS regions globally. Pricing is set by Oracle through AWS Marketplace private offers, so prepare to spend all your monies, and it does require coordination between AWS and Oracle sales teams for activation, which is not very cloudy.
VM cluster creation can take up to six hours and requires navigating between the AWS and OCI consoles for full database management. The service maintains compliance with major standards including SOC, HIPAA and PCI DSS. So if you are excited to get Oracle Database at AWS, be aware that it does not autoscale.
[00:21:49] Speaker C: I mean, there's a ton of advantages when you think about the integration; the zero-ETL with Redshift is a pretty prominent example.
If you're in the Amazon ecosystem and you're utilizing those services, this is going to be great, right? But somehow you're limited to the Oracle database products.
It's such a hard place to be, between those two things. And so I like this for the customers it will fit, but it does seem a little clunky.
[00:22:20] Speaker A: Yeah, I mean, again, it's: I really want to move my workload to Amazon, and I use Oracle, and I don't want to have it in OCI because of latency reasons or whatever else, but I want the benefits of it. Or I was audited by Oracle and I owe them money through a contract legal process, and this is an option for me. So there's lots of advantages for...
[00:22:39] Speaker C: Customers. And the offering in the private marketplace will count against any kind of spend commit you have, if you have that agreement with AWS.
[00:22:45] Speaker A: Yes it does.
[00:22:46] Speaker D: Yeah. But the six-hour setup time and flipping between consoles just hurts me. Hearing that, forget the fact that it's Oracle; that hurts my soul a little bit.
But flipping between consoles and different UIs, then six-plus hours to set up, and having to deal with two sales teams at the same time? None of this sounds like fun.
[00:23:12] Speaker C: No, it sounds like a nightmare to stand up.
[00:23:16] Speaker D: I think it sounds like the first time I tried to set up Direct Connect. It took many days and a lot of fun-filled efforts.
[00:23:23] Speaker C: Especially in the early days, when you had to provision the circuit in the data center.
[00:23:26] Speaker A: And then also...
[00:23:28] Speaker C: Yeah, it's very much like that. In fact, that might be what they're doing.
[00:23:35] Speaker D: They provision you a Direct Connect into the...
[00:23:39] Speaker C: The management plane is in an OCI data center, so they configure Direct Connect. Yeah, I don't know.
[00:23:45] Speaker D: I wonder. I mean, they could be doing an MPLS and splitting off like a one-megabit segment for you. We do a lot of weird things.
[00:23:56] Speaker A: All right, let's move to GCP.
GCP has a new luster this week. It's nice and shiny, because they now have the general availability of Google Cloud Managed Lustre, with four performance tiers ranging from 125 MB/s to 1,000 MB/s per TiB, scaling up to 8 petabytes of storage capacity, powered by DDN's EXAScaler technology for high-performance parallel file system needs and AI/ML workloads.
[00:24:19] Speaker C: Of course, you've got to have that.
[00:24:20] Speaker A: The service addresses critical AI infrastructure bottlenecks by providing POSIX-compliant storage with sub-millisecond read latency, enabling efficient GPU/TPU utilization for model training, checkpointing and high-throughput inference tasks that require rapid access to petabyte-scale datasets. Pricing starts at 14 cents per terabyte per hour for the 125 MB/s tier, up to 70 cents per terabyte per hour for the 1,000 MB/s tier.
It's positioned competitively against AWS FSx for Lustre, offering native integration with GKE and TPUs across multiple Google Cloud regions. The partnership with DDN brings enterprise-grade Lustre expertise to Google Cloud's managed services portfolio, filling a gap for customers who need proven HPC storage solutions without the operational overhead of self-managing Lustre clusters. Key use cases extend beyond AI to traditional HPC workloads like genomic sequencing and climate modeling, with Nvidia endorsing it as part of their AI platform on Google Cloud for organizations requiring high-performance storage at scale.
[00:25:13] Speaker D: I still am always impressed by how cheap storage is on these services. 14 cents per terabyte. Granted it's not a lot of bandwidth, but for, for, you know, average things, still really cheap.
[00:25:28] Speaker A: I mean, I wish I knew more about Lustre.
I was thinking about it; ZFS is probably the last true file system I really learned a lot about, and I dabbled a little bit in stuff, and Ryan's told me his horror stories.
But Lustre I never got into, because it was never in my workload. So what's so special about Lustre?
[00:25:48] Speaker C: Yeah, I'm in the same boat. ZFS, and all the Btrfs and RHEL stuff that was going on. I've never had experience with it either, but I imagine it's a similar thing. You have to be a true nerd to get into a file system, and we are. We should look into it. It's got to be something with the larger workloads, because that's the only time I hear it mentioned.
[00:26:15] Speaker D: I've only ever dealt with it with HPC workloads, and it has to do with large files and massive concurrency that it's designed to handle better. But I also feel like...
[00:26:28] Speaker A: See, I work on web services, and it's all just small files.
[00:26:31] Speaker D: Right?
[00:26:33] Speaker A: Yeah, which is a different problem.
[00:26:35] Speaker D: So that's why I think we never really dealt with it, because it's these massive files for genomic sequencing, as you mentioned, and other massive modeling, where we're all like, yeah, that's not cloud native. Let's deal with 400 billion small files per month and see if I can piss off blob storage. What could possibly go wrong?
[00:26:54] Speaker C: I mean, it's interesting, because as you say that, I'm like, well, I guess HDFS and Hadoop from my way-back days really were just sort of an object store you were running on your own compute, a little bit more low level in terms of the storage access, but kind of the same thing. It is sort of distributing storage workloads across many nodes and then having it all come together, so you're dealing with sharding and replication. And I wonder if Lustre is sort of that orchestration layer that handles all that fault tolerance and the access, maybe even permissions.
[00:27:35] Speaker A: Maybe someone who supports Lustre can write in to us and tell us why they think it's awesome. I'll look around on Twitter and Bluesky and Mastodon, all those places, too.
All right. Vertex AI Memory Bank is now in public preview. Vertex AI Memory Bank enables agents to maintain persistent memory across conversations, storing user preferences and context beyond single sessions, addressing the common limitation where agents treat every interaction as new and ask repetitive questions.
The service uses Gemini models to automatically extract, consolidate and update memories from conversation history, handling contradictions intelligently while providing similarity search for relevant context retrieval, based on Google Research's ACL 2025-accepted method for topic-based agent memory. Memory Bank integrates with the Agent Development Kit and Agent Engine sessions, with support for third-party frameworks like LangGraph and CrewAI, and developers can start with a Gmail account and API key through Express Mode registration before upgrading to a full GCP project.
Oh, that's nice, because the full GCP project setup has burned me a couple of times; that's a lot of heavy lifting. This positions Google competitively against AWS Bedrock's conversation memory and Azure's similar offering, though Google's implementation emphasizes automatic memory extraction and intelligent consolidation rather than simple conversation storage. Key use cases include personalized retail assistants, customer service agents that remember past issues, and any application requiring multi-session context, with the service available in public preview at standard Vertex AI pricing tiers.
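As a sketch of what the ADK integration might look like, here's a minimal wiring of a memory service into an agent runner; the class and parameter names are best guesses from the announcement, so treat them as assumptions and check the Agent Development Kit docs:

```python
# Sketch: wiring persistent memory into an ADK agent (names are assumptions).
# Sessions hold per-conversation state; the memory service persists
# consolidated memories across conversations.
from google.adk.agents import Agent
from google.adk.memory import VertexAiMemoryBankService
from google.adk.runners import Runner
from google.adk.sessions import VertexAiSessionService

# Hypothetical Agent Engine resource that backs the memory store.
AGENT_ENGINE_ID = "projects/my-proj/locations/us-central1/reasoningEngines/123"

agent = Agent(
    model="gemini-2.0-flash",
    name="support_agent",
    instruction="Use stored user preferences instead of re-asking questions.",
)

runner = Runner(
    agent=agent,
    app_name="support_app",
    # Per-session conversation state.
    session_service=VertexAiSessionService(
        project="my-proj", location="us-central1"
    ),
    # Cross-session memory: the service extracts and consolidates memories
    # from session history automatically, per the announcement.
    memory_service=VertexAiMemoryBankService(
        project="my-proj",
        location="us-central1",
        agent_engine_id=AGENT_ENGINE_ID,
    ),
)
```

The design difference from the copy-your-chat-history workarounds discussed below is that extraction and consolidation happen service-side, rather than the app re-injecting a growing blob of context on every prompt.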
I love this idea that it's automatic, because that's one of mine: I'll realize halfway through a conversation that I've answered the same question like four times, or I'll see the AI going and answering the same question that I already asked earlier, and I'm like, oh, I'm paying for tokens for that, and I shouldn't do that.
So I like the thought that it's automatic, because it would be nice if you could just tell Claude, hey, look at our chat history for the last day: what have we done multiple times, and what can we turn into memory? And have it generate that stuff so I don't have to remember to do it. So maybe other LLMs will come up with the same thing. But cool. Nice job.
[00:29:28] Speaker C: Well, I think you can do that with Claude. It's just that you would have to do it almost every time you start a new session. And what I like about this is that it seems much more automatic than some of the workarounds I've seen in the chat applications, and even some of the IDEs. Like, you can write Copilot instructions as part of your VS Code project; that's sort of a workaround for this, where you're giving it data that it will read and ingest on every prompt. It'll take all that data and just re-ingest it every time, so you are spending tokens on it. So I wonder how that works with the ADK option here. It'll be fun to see.
[00:30:15] Speaker A: Yeah, very, very cool.
Well, for those of you who like to burn lots of money, Google is expanding its Z3 storage-optimized VM family with nine new instances offering 3 to 18 terabytes of local SSD capacity, plus a bare metal option with 72 terabytes, targeting I/O-intensive workloads like databases and analytics. The new Titanium SSDs deliver up to 36 GB/s throughput and 9 million IOPS, with 35% lower latency than the previous generation. The local SSD Z3 introduces two VM types: standard LSSD, with 200 gigabytes of local SSD per vCPU for OLAP and single databases, and high LSSD, with 400 gigabytes per vCPU for distributed databases and streaming applications. The bare metal instance provides direct CPU access for specialized workloads requiring custom hypervisors or specific licensing needs. Enhanced maintenance features include advance notice of planned maintenance, live migration support for VMs with 18 terabytes or less, and in-place upgrades that preserve data for larger instances, addressing a common pain point for stateful workloads requiring local storage. Z3 integrates with Google's Hyperdisk for network-attached storage, supporting up to 350,000 IOPS per VM and 500,000 IOPS for the bare metal instance. AlloyDB will leverage Z3 as its foundation, using local SSDs as cache to hold datasets 25 times larger than memory with near-memory performance. Early adopters saw significant performance gains: OP Labs saw a 30% reduction in P99 latencies for blockchain nodes, Tenderly achieved a 40% read latency improvement, and Shopify selected Z3 as their platform for performance-sensitive storage systems.
[00:31:45] Speaker C: I think this and the Lustre announcement are tied together like this is the hardware they developed.
[00:31:52] Speaker D: Yeah, this is how we made this work.
[00:31:54] Speaker C: They've put so much development into Google Hyperdisk and making that a service for network-attached storage, but everything that's over a network is going to have higher latency than a local SSD. And so it's kind of funny to see these ginormous boxes to work around that. That's a lot of data to have just in live cache.
[00:32:16] Speaker D: I mean, I've also used it for a lot of real-time processing, too, when you need to take something, dump it locally, and do processing.
So that's the other use case where I've seen it, versus just keeping it all cached that way.
I mean, again, this is also where, in theory, you'll see people leverage these not just for that. If you have a MongoDB or NoSQL cluster that you want set up across multiple nodes, you can leverage this, because it ends up being free storage versus doing Hyperdisk or anything else, or using the managed service. So there definitely are use cases for it.
But you've got to be careful with some of the ephemeral storage stuff, too.
[00:33:00] Speaker C: Yeah, it always bites someone in the ephemeral disk. Yeah.
[00:33:04] Speaker A: Let's move on to our best friend, Azure. Matt, I got some stories for you this week.
[00:33:10] Speaker D: The first one's just here to annoy Ryan and me, I think, maybe.
[00:33:16] Speaker A: Azure is now offering you two PostgreSQL deployment options on AKS: Azure Container Storage with local NVMe for performance-critical workloads, achieving up to 26,000 TPS with sub-millisecond storage latency, and Premium SSD v2 for cost-optimized deployments with flexible IOPS and throughput, scaling up to 80,000 IOPS per volume. The CloudNativePG Kubernetes operator integration provides automated failover, built-in replication and native Azure Blob Storage backup capabilities, addressing the complexity of running stateful workloads on Kubernetes that has historically pushed enterprises towards managed database services. Benchmark results show local NVMe delivers 14,812 TPS at 4.3 milliseconds of latency on Standard_L16s_v3 VMs, while Premium SSD v2 achieved 8,600 TPS at 7.4 milliseconds on Standard_D16ds_v5s, with the NVMe option costing approximately $1,382 a month versus $348 a month for Premium SSD v2. This positions AKS competitively against AWS EKS and GCP GKE for database workloads, particularly as Postgres now shows up in 36% of all Kubernetes database deployments, according to the 2025 Kubernetes in the Wild report, up 6 points from 2022. See? 36%! Wow. Databases are on Kubernetes.
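For a sense of what the CloudNativePG integration actually looks like, here's a minimal Cluster resource created with the Kubernetes Python client. The CNPG CRD (postgresql.cnpg.io/v1) is real and handles the failover and replication mentioned above; the Azure storage class name is an assumption for illustration:

```python
# Sketch: declaring a CloudNativePG Postgres cluster on AKS via the
# Kubernetes Python client. The Cluster CRD is the operator's real API;
# the storage class name below is an assumption for Azure Container
# Storage local NVMe -- check your cluster's available storage classes.
from kubernetes import client, config

config.load_kube_config()  # assumes kubectl is already pointed at the AKS cluster

cluster = {
    "apiVersion": "postgresql.cnpg.io/v1",
    "kind": "Cluster",
    "metadata": {"name": "pg-main", "namespace": "databases"},
    "spec": {
        # One primary plus two replicas; the operator manages replication
        # and automated failover between them.
        "instances": 3,
        "storage": {
            "size": "500Gi",
            # Hypothetical storage class name for local NVMe-backed volumes.
            "storageClass": "acstor-ephemeraldisk-nvme",
        },
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="postgresql.cnpg.io",
    version="v1",
    namespace="databases",
    plural="clusters",
    body=cluster,
)
```

Note the trade-off the hosts raise next: the manifest is short, but everything behind it (the operator, the nodes, the storage, the DBA work) is now yours to run.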
So you guys think I'm crazy, but this article says otherwise.
[00:34:31] Speaker C: I know you're crazy; this isn't my only evidence. But I bristle at all the numbers, because they're comparing it to managed services on cost, and it's like, well, yeah, but you're also not counting the cost of the minimum three people it's going to take to support your Kubernetes cluster, then...
[00:34:52] Speaker D: ...your DBAs that you need on top of it, because now you have to manage your SQL deployments on top of that. And all the optimizations you have to do to make that work.
[00:34:59] Speaker C: So, I mean, there's just a lot of advantages that you're giving up in order to run it locally and have direct access to that layer. And you can't have any database-specific logic, because it's all tied to the Kubernetes ecosystem; whatever scaling and networking you get with Kubernetes and pods and that setup is what you're going to get. So, I don't know, maybe I've just never built a database orchestration mechanism at scale. Maybe RDS is just a whole bunch of containers under the covers. But I doubt it.
[00:35:42] Speaker A: So I just went and downloaded this Kubernetes in the Wild report, because I had not heard of it. It's from Dynatrace, so it's not something I would have been looking for.
But there's some interesting statistics in here. Just looking.
Apparently, in the cloud, the most common node size is 16 gigabytes, followed by 32 gigabytes and then 64.
Those are the three most common node memory sizes.
The average number of nodes is 6, and the average number of pods is right above 210-ish, it looks like; there's no scale on this chart, so it's not easy to read.
So that's kind of interesting.
But basically, they're showing that organizations continue to expand their use of Kubernetes, but it doesn't really show that it's around managing applications: pods of application workloads went from 37% in 2022 down to 27% in 2024, while pods of auxiliary workloads have gone from 63% to 73%. That's not a huge growth, but it's interesting.
And the auxiliary workloads they're seeing in Kubernetes are databases, open source observability, messaging, security compliance, continuous delivery, big data, service meshes and IDPs. For databases, number one is Redis at 71%. Yeah, this doesn't shock me.
[00:36:57] Speaker C: Not at all.
[00:36:57] Speaker A: Postgres was number two at 36%. MongoDB was 28%, and I cannot imagine running Mongo inside of Kubernetes.
I thought SQL Server in Kubernetes was a bad idea. Well, a good idea and a bad idea, but Mongo would be worse.
You mess up those shards and replication, that's a bad day. Yeah.
MySQL at 18% doesn't surprise me. Oracle, though: 9% on Oracle. That's interesting because of the licensing implications alone; that kind of blows my mind. MariaDB falls after that with 7%, which is really the same thing as MySQL in my opinion. Cassandra 7%, and then Memcached 7%. SQL Server doesn't make it onto the list.
Not surprised.
Open source observability: 66% was Prometheus, then 49% was kube-state-metrics, 36% was Fluent Bit, 31% Thanos, and then it drops down to below 20% for Jaeger, Fluentd and Kiali.
On the security and compliance side: Gatekeeper, Tigera, Twistlock, Kyverno, CrowdStrike, Aqua, Sysdig.
Yeah, but they drop from 27% for Gatekeeper down to 3% for Sysdig pretty fast. Wow.
And then messaging: Kafka, of course, is 39%, RabbitMQ is 27%, and apparently 6% of you still run ActiveMQ for some reason. Then continuous delivery: Argo CD is 25%, Flux was 18%, and then a couple of others there as well. Big data: Elasticsearch, of course, at 35%, Solr 9%, Airflow 9%. Service mesh: Istio was 25%, and Kong was 12%; I'm surprised that's as high as it is. And then internal developer platforms: Backstage, 6%. Java is the most common language, Go the second most common, Node after that, and .NET is the fourth most common. That's surprising to me; Python's a little bit higher.
Interesting. But yeah, interesting insights in this report. I'll have to dig into it a little bit more. But crazy.
[00:38:57] Speaker C: You'll be happy to know that the Gatekeeper security tool is Open Policy Agent.
[00:39:04] Speaker A: Oh perfect.
[00:39:05] Speaker D: That's true.
[00:39:05] Speaker A: So it's.
[00:39:07] Speaker C: I don't consider that much of a security tool, because it's more of a validation of your Kubernetes deployment against policy. Which is good, it's part of it, but it's not really a security tool in the same way that Twistlock and Aqua and Sysdig are.
[00:39:22] Speaker D: So how is GitLab on here? Is it just under CD because they run worker nodes there that roll stuff out, or...?
I don't think of GitLab as a CD tool. Maybe. I mean, I think of it as pipeline deployment. I guess it can do pipelines.
[00:39:42] Speaker C: It is the pipelines, and then also the ability to reference a catalog, much like GitHub Actions: you can reference public actions, or other actions in other repos and pipelines. It's got the same sort of setup.
[00:39:59] Speaker D: What I'm curious about is what's not here: are GitHub Actions worker nodes or anything like that running? Because running your worker nodes on your internal Kubernetes clusters has been out for a while.
[00:40:09] Speaker A: Maybe Dynatrace customers don't like it.
Maybe they're GitLab users or Jenkins users. Again, Dynatrace is the company behind this, you know, so it's biased by the fact that it's going out to Dynatrace customers.
I'm surprised that that's the company that's doing a Kubernetes in the Wild survey, to be honest.
Yeah, it's like Oracle doing a Kubernetes survey. Like, why? It's a weird fit.
[00:40:32] Speaker C: Yeah, like it doesn't not make sense, but it doesn't make any sense.
[00:40:37] Speaker A: No, I mean, Docker would make more sense to do that survey, or Google, maybe. I don't know. Anyways. Now, the DORA report, I trust that more.
[00:40:51] Speaker B: There are a lot of cloud cost management tools out there, but only Archera provides cloud commitment insurance. It sounds fancy, but it's really simple. Archera gives you the cost savings of a one- or three-year AWS savings plan with a commitment as short as 30 days.
If you don't use all the cloud resources you've committed to, they will literally put the money back in your bank account to cover the difference. Other cost management tools may say they offer commitment insurance, but remember to ask: will you actually give me my money back? Archera will. Click the link in the show notes to check them out on the AWS Marketplace.
[00:41:30] Speaker A: All right, let's go back to Azure. Sorry, sidetrack. Announcing the general availability of the Microsoft Purview SDK and APIs. For those of you who don't know what Purview is, I can tell you it's just Microsoft's DLP solution. The SDK and APIs are now generally available, enabling developers to embed enterprise-grade data security and compliance controls directly into custom gen AI applications and agents, addressing critical concerns around data leakage, unauthorized access and regulatory compliance. The SDK provides three key security capabilities: preventing data oversharing by inheriting labels from the source data, protecting against data leaks with built-in safeguards, and governing AI runtime data through auditing, data lifecycle management, eDiscovery and communication compliance. It positions Microsoft competitively against their peers, with Purview handling the complex compliance and governance requirements enterprises demand. Target customers include ISVs and enterprises building custom AI applications that need to meet strict data governance requirements, particularly in regulated industries where data security and compliance are non-negotiable for adoption. The SDK works across any platform and AI model, not just Azure, making it a flexible solution for multi-cloud environments while leveraging the Microsoft Purview data governance infrastructure that many enterprises already use.
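For a rough sense of what embedding a Purview check in a gen AI app might look like, here's a heavily hedged sketch. The Graph endpoint path and payload field names below are assumptions based on the announcement, not a documented SDK surface, so consult the Purview SDK docs for the real API:

```python
# Heavily hedged sketch of an inline "check this content against DLP policy"
# call from a gen AI app. Endpoint and payload shape are assumptions.
import requests

GRAPH = "https://graph.microsoft.com/beta"

def prompt_allowed(prompt: str, token: str) -> bool:
    """Ask Purview to evaluate a prompt before the model processes it."""
    resp = requests.post(
        f"{GRAPH}/me/dataSecurityAndGovernance/processContent",  # assumed endpoint
        headers={"Authorization": f"Bearer {token}"},
        json={
            "contentToProcess": {  # assumed field names throughout
                "contentEntries": [{"identifier": "prompt-1", "content": prompt}],
                "activityMetadata": {"activity": "uploadText"},
            }
        },
        timeout=10,
    )
    resp.raise_for_status()
    # Assumption: the response lists any triggered policy actions (block, warn).
    return not resp.json().get("policyActions")
```

The structural point is the same one Ryan makes next about Model Armor: it's an API the application calls in its runtime path, with policy decisions enforced by the app.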
[00:42:37] Speaker C: I mean, every cloud is going to have this. I think this is competitive with Model Armor, which I'm using in GCP; it's an API you put in your runtime, and you tell it, don't let it say anything about making a bomb.
You give it rules.
It is funny they're building that into their DLP solution, which sort of makes sense.
Like it's just I hadn't thought of it that way.
[00:43:01] Speaker D: They're definitely pushing Purview and a lot of its features recently, or maybe it's just the people I've been talking to, but it's something that's been coming up more and more.
So I think they're doing a push to make it a larger service, to be used not just in the corporate IT space, but in software dev. When you're talking about things like, hey, how are we going to make sure our AI bot isn't leaking information, or our MCP server isn't leaking information, or anything along those lines, you can build in these controls that will help along the way. So they are kind of pushing it.
[00:43:38] Speaker C: Well, yeah. And this will allow things that aren't AI-related, which I've been told still exist.
[00:43:45] Speaker D: Yeah, yeah.
[00:43:46] Speaker C: You know, if you're thinking about any kind of data transaction, you want to run it through some sort of solution that checks the data against policy to make sure it's okay. So I'm really happy to see an SDK and APIs for any Microsoft product. And since you're sort of forced to use this if you're an Office 365 shop, I'm hoping this applies to those Purview deployments as well.
[00:44:19] Speaker A: It's interesting.
Yeah, I was just thinking, does Amazon really have a DLP solution? Macie is the only one that comes to mind, and it's still primarily focused on S3.
And it's interesting that both Microsoft and Google have come up with DLP solutions, and I wonder if it's a connection to that Office component: the fact that they both do email for enterprises, and email is a huge area where people will exfil data, and that's why they felt the need to build DLP in both cases. It just seems logical that that would be the reason Amazon hasn't invested there, other than Amazon hating to compete with their security vendor partners for some reason. But that makes at least logical sense to me. It's not as big of an issue for Amazon, because we know Amazon WorkMail and everything was a failure and no one used it.
[00:45:05] Speaker C: I think you're onto something.
[00:45:05] Speaker D: Did they actually end up killing that?
[00:45:07] Speaker A: I thought they did.
[00:45:10] Speaker C: I mean, even if they didn't, it's gone off into obscurity. But yeah, DLP is such a huge pain for the IT side of the house, and so it makes sense that, for those product teams, even in cloud hosting, it's going to bleed in there.
So I like the theory. No way to prove it, but that's pretty great.
[00:45:30] Speaker A: WorkMail is still around, but WorkDocs is dead.
[00:45:32] Speaker C: Yes.
[00:45:34] Speaker A: So close. But I can't imagine WorkMail is that far behind.
[00:45:38] Speaker D: I mean, they released MFA support last year in October.
[00:45:41] Speaker A: Yeah. Oh, nice.
Good. Good for them.
[00:45:45] Speaker C: I mean, it's probably fine as long as it doesn't cost any money.
[00:45:48] Speaker A: Did they announce encryption for it yet? Have they encrypted the data at rest in WorkMail yet?
[00:45:51] Speaker D: No, no, no. That's a step too far.
[00:45:55] Speaker C: Wait, wait, what? They didn't encrypt the data at rest?
[00:45:59] Speaker A: Nah, we're just joking.
[00:46:00] Speaker C: Okay.
[00:46:00] Speaker A: We think that that's not real MFA. Yeah, yeah.
We're security people, we encrypt everything. We're just like, do you think they encrypt the data?
[00:46:07] Speaker C: Yeah, I would lose my mind.
[00:46:09] Speaker A: Yeah, I mean, I can't confirm or deny that WorkMail supports encryption.
[00:46:14] Speaker C: I like to have positive intent; I'm turning into an optimist in my old age.
[00:46:19] Speaker D: I would assume it doesn't support BYOK. How about that?
[00:46:22] Speaker C: Okay. BYOK would be a bridge too far.
[00:46:27] Speaker A: Yeah. It looks like it automatically encrypts all data at rest using KMS, but with their managed key. It does say users have the option to use their own KMS keys, customer-managed keys through KMS. Yeah.
[00:46:39] Speaker D: So KMS, CMKs.
[00:46:40] Speaker C: Yeah, that's all right.
[00:46:42] Speaker A: Yeah, yeah, good. Encrypt everything.
Well, it's 2025, and I'm pleased to announce Active Directory new features, which are just old features repackaged into Entra so we can announce them twice. When they were first announced in Windows 2000, we were like, oh, that's amazing. And now I just find it funny that 25 years later we're announcing them again. So this week: general availability of two-way forest trusts for Microsoft Entra Domain Services.
The two-way forest trust between Entra Domain Services and on-premises Active Directory enables bidirectional authentication and resource access, addressing a key limitation where only one-way trusts were previously supported. This feature allows organizations to maintain their existing on-premises AD infrastructure while extending authentication capabilities to cloud resources, reducing the need for complex identity federation or migration projects. The general availability release positions Azure more competitively against AWS Managed Microsoft AD, which has supported two-way trusts since launch. Come on, Azure. Primary use cases include hybrid cloud deployments, for applications in Azure that need to authenticate users from on-premises domains and vice versa, and it's particularly beneficial for enterprises with regulatory requirements to maintain on-premises identity systems. Organizations should evaluate the additional network connectivity requirements and potential latency impacts when implementing forest trusts.
Thank goodness this is finally here, by the way, because this is actually a pain point that I'm familiar with from the day job.
[00:47:59] Speaker C: Oh my God.
[00:48:00] Speaker A: Because the inability to connect your Entra ID to your local authorization domain is a big problem, and not having this capability actually causes a lot of weird edge cases and extra hoops that now Ryan won't have to solve. Yes.
[00:48:16] Speaker C: I don't know if I'm going to fall on that grenade, but...
So someone hopefully will fix it.
Yeah, no, I'm sort of shocked that this is in that state. I'm surprised that Amazon supported it from launch, but when I think through it, of course, with Amazon's sort of distributed nature...
[00:48:38] Speaker A: Well, Amazon literally just took AD and packaged it as a service. They did no modifications to AD, which is why AWS AD is actually problematic: it doesn't really understand the massive scale of a true large-scale AD forest. It's good for small workloads, but it definitely falls down pretty quickly in large ones.
[00:48:58] Speaker C: Yeah. And in AWS, I imagine it's very natural to have a lot more domains, separating those managed AD infrastructures into boatloads of them. So maybe that's it. Yeah, at scale it's going to fall down, but maybe that's the idea.
[00:49:15] Speaker A: Yeah, I mean, you just maybe gave me this horrible idea: if you're in multiple AWS accounts, maybe you have multiple AWS ADs, and then, oh yeah, you need trust sprawl. Oh no. Actually, isn't AWS AD one of those services you can share through the...
[00:49:34] Speaker C: I believe it is part of the.
[00:49:35] Speaker A: Resource manager VPC thing.
[00:49:36] Speaker C: Yeah, the resource manager. Yeah.
[00:49:38] Speaker A: For that exact reason. Because doing trusts across every account you have would be bad.
I don't think I would do that.
[00:49:45] Speaker D: But you really are running into this at your day job with the forest levels.
[00:49:49] Speaker A: You guys aren't? Between Entra and on-prem.
[00:49:52] Speaker D: Oh, these are completely different setups. There's not. Okay.
[00:49:55] Speaker A: Yep, yep.
So we use Office 365 for email and for virtual VDI and all these things. And literally even today I saw a note that said, well, the problem with this is that it doesn't attach to the AD properly, so we can't do authentication into our data center or into our GCP region. It's like, yeah, that's a problem. But this now solves that. So now I can go to that meeting and be the jerk: actually, Azure fixed this for you.
[00:50:18] Speaker D: Except for implementing it.
[00:50:21] Speaker C: No, no, no, think about it.
The problem is all the workarounds we've put in place because this wasn't an option. We'd have to unwind all of that.
[00:50:29] Speaker A: Yeah, it's not an easy fix no matter what you do.
[00:50:32] Speaker C: Yeah. But I mean it is good.
Hopefully it'll solve it for the next guy. I think it's too late for us.
[00:50:41] Speaker A: I mean, I don't know. Some areas, yes. Other areas, we're not too far down the path. But other areas...
[00:50:48] Speaker C: Specifically with AD, I think, for a lot of companies who are running AD, it's hard to design something like Active Directory that will scale and be future-proofed for your business 15 years after rolling it out. It's just hard.
[00:51:05] Speaker A: Well, I mean, now we're getting into my rant territory, where I start complaining about how bad AD is for the cloud world again. And that's just a rabbit hole we don't need to go down tonight.
[00:51:14] Speaker C: It was not built for it, it's not designed for it.
[00:51:16] Speaker A: You have to have. It needs to die and be replaced by the new modern equivalent of it. But Microsoft is refusing. They keep thinking Entre is going to solve all their problems.
[00:51:25] Speaker D: It's not. Should I start my rant, or should we just move on?
[00:51:31] Speaker A: Were you here for my last "Azure AD is not built for cloud" rant? That might have predated your joining us permanently; that might have been back in the Peter era. But I went on like a 25-minute rant about it one day.
[00:51:42] Speaker D: I'm sure I listened to it at one point. But then in Microsoft, if you're using their DNS with AD, which you kind of have to do to get things to work, then you have DNS cleanup, which works really great with autoscaling groups or scale sets. Then you definitely don't ever have legacy dangling DNS records if you're scaling at a given rate, and you definitely don't have any issues enabling automatic cleanup and tuning what ratio and time you do cleanup at. I can keep going if you would like, but we should really go talk about FQDNs.
[00:52:12] Speaker A: Yeah, you hit a couple of the ones that get me. The one in particular that is egregious is the auto scaling domain join problem, where you end up with a ton of orphaned AD objects. That's the one that sends me over the edge.
[00:52:25] Speaker D: Yeah, definitely. Definitely have that day job.
[00:52:27] Speaker A: Yeah. They have fixed that a little bit.
[00:52:30] Speaker C: It's bad. It's not fixed.
[00:52:32] Speaker A: Yeah, it's not fixed. But there's a certain attachment type you can do, which is not quite the same as a full trust domain join; it's like a partial domain join, but you still get these artifacts. So it's better, but not fixed.
[00:52:44] Speaker D: No, it definitely has many problems still.
[00:52:46] Speaker A: Yeah.
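Editor's note: a minimal sketch of the orphaned-record reconciliation this rant is about, assuming you can list your DNS A records and your currently live instances. All names and data here are hypothetical; a real version would query your DNS zone and your scale set APIs instead of using literals.

```python
# Hypothetical reconciliation sketch: names and data are invented. A real
# version would pull records from your DNS zone and hostnames from your
# scale set / auto scaling group APIs.

def find_dangling_records(dns_records: dict[str, str], live_hostnames: set[str]) -> dict[str, str]:
    """Return A records whose hostname no longer maps to a live instance."""
    return {host: ip for host, ip in dns_records.items() if host not in live_hostnames}

# Two scale-set instances were replaced, leaving their registrations behind.
records = {"web-001": "10.0.1.4", "web-002": "10.0.1.5", "web-003": "10.0.1.6"}
live = {"web-003"}

for host, ip in find_dangling_records(records, live).items():
    print(f"would delete stale A record {host} -> {ip}")
```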
All right, well, let's talk about the firewall that had another big problem, which was that it didn't support DNS, but now it does. Azure Firewall now supports fully qualified domain name filtering in DNAT rules, allowing administrators to route inbound traffic to backend resources using domain names instead of static IP addresses, which simplifies management when backend IPs change frequently. This feature addresses a common pain point where organizations had to manually update firewall rules whenever backend server IPs changed, particularly useful for dynamic infrastructure or services with rotating IP addresses. The addition brings Azure Firewall closer to feature parity with both Amazon and Google. Target use cases include load balancing to backend pools with changing IPs, routing to containerized applications, and managing multi-region deployments where IP addresses may vary across availability zones. Organizations should note that FQDN resolution adds slight processing overhead and DNS lookup time to DNAT operations, though Microsoft hasn't published specific latency metrics for this GA feature yet. This is probably one of the reasons why I constantly argue with customers about IP addresses versus DNS names, and why, when they ask for one IP, we give them a massive IP range: because I can't guarantee an IP, and they don't like that.
Hopefully this fixes it for at least all my Azure customers.
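Editor's note: a tiny illustration of the trade-off described in the story. An FQDN-based DNAT rule has to resolve a name at evaluation time, which costs a DNS lookup but survives backend IP rotation. resolve_backend() is a stand-in for whatever Azure Firewall does internally, not its actual implementation.

```python
import socket
import time

def resolve_backend(fqdn: str) -> tuple[str, float]:
    """Resolve a backend FQDN the way an FQDN-based DNAT rule must,
    returning the current address plus the lookup latency in ms."""
    start = time.monotonic()
    ip = socket.gethostbyname(fqdn)  # the per-lookup cost Microsoft hasn't published numbers for
    return ip, (time.monotonic() - start) * 1000.0

# A static-IP rule skips this lookup but breaks when the backend rotates;
# an FQDN rule pays the lookup and keeps pointing at the right place.
ip, ms = resolve_backend("example.com")
print(f"backend currently resolves to {ip} ({ms:.1f} ms lookup)")
```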
[00:54:00] Speaker D: We'll fix it internally.
[00:54:01] Speaker C: Yeah, we'll fix it. Yeah, we'll fix that. That's all security tooling and people wanting to manage their Internet traffic tightly.
[00:54:12] Speaker D: Yeah, I mean, this goes back to one of the first AWS setups I did, at a healthcare company, where they said no, we need a firewall, and they put in, I don't remember, insert firewall here. But then they needed multiple servers on the back end, and their team demanded auto scaling. Really it was just auto healing, like 2 of 2 or 3 of 3. But when instances rebuilt, the traffic just stopped working, and every time they'd have to put a request in to update the rules.
Yeah. And I was like, can we put a load balancer here? Yeah, but then the load balancer IPs change too, so then they wanted the load balancer out, even though those don't change nearly as often. But it still, still failed.
[00:54:51] Speaker C: I mean, the fact that routing traffic to the back end by anything other than a static IP address wasn't possible until now is crazy to me.
[00:55:00] Speaker D: I mean, that wasn't with us. This was AWS, circa 10 years ago or so, because somebody had to put a Palo Alto, Fortinet, insert-firewall-here in there to meet their checkbox compliance and monitor their traffic going in. But it just caused more problems than it was worth. Yeah, but in my head it's just a basic feature: oh, all firewalls can do DNS-based routing.
Clearly not.
[00:55:27] Speaker C: I mean it goes. Maybe I'll use my earlier point that Justin squashed or it's like, of course this has been an Amazon forever because they're distributed nature. You know, everything is designed to be isolated and you know, like features like this will force, you know, a centralized managed network, you know, architecture. So it doesn't change because it's the only way you could really operate it in a safe way.
So this is good. I do like this for the back end. I do think it'd be great to publish services and have things be a lot more dynamic and referenced by name, whether it be rules or backend services. Everything should stop statically addressing things.
[00:56:10] Speaker A: It would be nice, wouldn't it?
Azure has updated AZNFS to 3.0, introducing FUSE-based performance enhancements for Blob NFS that deliver up to 5 times faster single-file reads and 3 times faster writes compared to the native Linux NFS client. This addresses performance bottlenecks for HPC, AI/ML, and backup workloads that require high-throughput access to Blob storage via the NFS protocol. The update increases TCP connection support from 16 to 256 connections, enabling workloads to fully saturate VM network bandwidth with just four parallel operations. This brings Azure's NFS Blob access performance closer to EFS and GCP Filestore capabilities for demanding enterprise workloads. Key technical improvements include support for files up to 5 TB (previously limited to 3 TB), removal of the 16-group user limitation, and enhanced metadata operations with faster directory queries. These changes particularly benefit EDA and CAD workloads that process large simulation files and extensive file metadata. While Blobfuse offers Entra ID authentication and public endpoint access, Blob NFS still requires virtual network connectivity and lacks native Entra ID integration, so organizations must weigh protocol requirements against security needs when choosing between the two mounting options. The preview requires registration and targets customers using Linux-based HPC clusters, AI training pipelines, and legacy applications requiring POSIX compliance. Installation involves the AZNFS mount helper package available on GitHub, with no additional Azure cost beyond standard Blob storage pricing.
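Editor's note: the connection-count claim above boils down to keeping the NIC busy with a few parallel streams. This is a generic illustration of parallel reads in Python, not the AZNFS mount helper itself; the paths and worker count are assumptions for the example.

```python
# Generic parallel-read sketch, not the AZNFS helper: with enough TCP
# connections available, a handful of concurrent streams can keep a VM's
# NIC saturated.
from concurrent.futures import ThreadPoolExecutor

CHUNK = 8 * 1024 * 1024  # 8 MiB reads amortize per-call overhead

def read_file(path: str) -> int:
    """Stream one file to completion and return the byte count."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            total += len(chunk)
    return total

def parallel_read(paths: list[str], workers: int = 4) -> int:
    # Four parallel operations is the figure the announcement cites for
    # saturating VM bandwidth once 256 connections are available.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(read_file, paths))
```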
[00:57:31] Speaker C: Did they fix the FUSE-to-object-store issues, where this either just deadlocks or corrupts your objects?
[00:57:39] Speaker A: I mean, I don't know about Azure. I don't know the Blob architecture as well; for all we know it's just a NetApp with a Blob API endpoint on top of it. So it might not be as bad in Azure as it is in S3, but I don't recommend this in S3. Although they do have more native FUSE support in S3 these days too, but only really for read operations.
They don't support the full POSIX stack.
[00:58:04] Speaker D: Which I think is a better life choice.
It'll help you migrate there, but, you know, do your puts in a more native way at least.
[00:58:11] Speaker A: Yeah, so I don't recommend doing it this way, but if this is your need and use case, then great. That little note about weighing security needs when choosing between mounting options, that's a fun one.
I'd like to see Ryan explain that difference to a developer.
No, you can't use that one because it's not as authenticated as it needs to be.
[00:58:31] Speaker C: Yeah, no thank you.
[00:58:36] Speaker A: Let's move on to Azure AI Foundry introducing Deep Research, an API/SDK service that automates web-scale research using OpenAI's o3-deep-research model, enabling developers to build agents that can analyze and synthesize information from across the web with full source citations and audit trails. The service integrates with Azure's enterprise ecosystem through Logic Apps, Azure Functions, and other Foundry Agent Service connectors, allowing research to be embedded as a reusable component in multi-step workflows rather than just a standalone chat interface. Pricing starts at $10 per 1 million input tokens and $40 per 1 million output tokens for the o3-deep-research model. You put a little data in, get a lot of data out, and pay for it.
The architecture provides transparency through documented reasoning paths and source citations, addressing enterprise governance requirements for regulated industries where AI decision-making needs to be fully auditable.
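Editor's note: a quick worked example of the quoted pricing, with illustrative token counts. It excludes the separate Bing Search grounding charge mentioned just below.

```python
# Worked example of the quoted o3-deep-research pricing. Token counts are
# illustrative; the separate Bing Search grounding charge is excluded.
INPUT_RATE = 10.00 / 1_000_000    # $10 per 1M input tokens
OUTPUT_RATE = 40.00 / 1_000_000   # $40 per 1M output tokens

def research_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A short prompt fanning out into a long, fully cited report:
print(f"${research_cost(5_000, 200_000):.2f}")  # $8.05, almost all of it output
```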
[00:59:24] Speaker D: You forgot additional charges for Bing Search.
[00:59:27] Speaker A: On top of your tokens, yeah. Because everyone's using Bing Search for their grounding needs.
[00:59:32] Speaker C: Of course.
[00:59:33] Speaker D: Yeah.
[00:59:34] Speaker C: It is truly evil to do a four-times cost increase for the output, the part that you're not putting in.
[00:59:41] Speaker A: That you don't have any control of. Yeah, it's beautiful.
I guess.
[00:59:46] Speaker D: How else would you charge it?
Like, I get it, we're all so based on tokens.
[00:59:51] Speaker A: Yeah, I mean, there's no way not to do it that way. But it's just funny to me, because I could put a little bit in and get a lot out, and I had no control over what came out. Right? Or if it's even accurate, which it's probably...
[01:00:03] Speaker D: Yeah, well you got Bing as source.
[01:00:05] Speaker A: Of truth here, so it's definitely not accurate.
[01:00:08] Speaker C: I'm trying to think of other pricing models where they've split the token pricing between input and output. I'm drawing a blank. I'm sure it exists, but usually it's just per token, and that includes both input and output. Right?
[01:00:24] Speaker A: Yeah, I haven't seen it. In any of the research reasoning models, they always have two different prices.
[01:00:29] Speaker C: They do.
[01:00:30] Speaker A: So that's what I've seen to date.
[01:00:32] Speaker C: Maybe that's when that was introduced.
[01:00:35] Speaker A: So before we get to the next one, I'd like to remind you of an earlier Azure story we talked about, announcing the availability of the Microsoft Purview SDK and APIs, because you don't want data leaking out of your system. Which is awkward, because we have this final article for Azure this week.
Researchers have discovered a critical vulnerability in Azure's Managed Certificate Provider (MCP) that allows attackers to extract Key Vault secrets by exploiting certificate validation flaws in the authentication process. The vulnerability stemmed from the MCP's improper handling of certificate chains, enabling malicious actors to forge certificates that appear legitimate to Azure's authentication system and gain unauthorized access to sensitive Key Vault data. Microsoft has since patched the vulnerability, allegedly, but the incident highlights ongoing security challenges in cloud certificate management systems and the need for robust certificate validation mechanisms across all cloud providers.
Organizations using Azure Key Vault should audit their access logs and rotate any potentially exposed secrets, as the vulnerability could have been exploited without leaving obvious traces in standard monitoring systems. The discovery follows a pattern of certificate-related vulnerabilities across major cloud platforms, emphasizing that even mature cloud services require continuous security scrutiny and that customers should implement defense-in-depth strategies rather than relying solely on platform security.
Wow, nice job, Azure.
[01:01:53] Speaker C: A major certificate vulnerability, like, every year now? Is that what we can expect?
[01:02:01] Speaker D: I mean, I just am still surprised that they built an MCP service for this. Like, for certificates? MCPs are still so new for something this critical.
[01:02:14] Speaker C: Oh, I don't think this is model context protocol. I really hope not. I think it's the managed certificate provider.
[01:02:20] Speaker D: Oh, you're right. Okay, sorry.
[01:02:23] Speaker C: Yeah, you're right. If they end up with an MCP for certificates, it's like, yeah, we're gonna have to dive off cliffs.
[01:02:29] Speaker A: Yeah, sorry, it's a little late.
[01:02:30] Speaker D: Confused the acronyms in my head.
It's just amazing. And all I keep thinking back to is AWS's CISO, or Amazon's CISO, saying, we hold our providers to our security level, and that's why we're holding off moving to M365 or O365. And I'm just like, yeah, and these are the reasons. I mean, this was a big risk that was out there.
I mean, in theory you also should have had some controls set up on your Key Vault, so you also had to do a few things that were not best practice on the back end for this to bite you. But it's still not good.
[01:03:12] Speaker C: I mean, if you're.
[01:03:12] Speaker A: I mean, I have to say that the more I've learned about MCPs, the more I've played with them, the more I have created them and seen what gets created...
MCPs scare me in production, in areas where data is sensitive and I need to be concerned about it.
I don't know that I would trust an AI-generated MCP not to have this problem. I don't think we know enough about how to make MCPs good. We don't have enough best practices. We don't have enough good patterns.
And how the AI creates them is really dependent on the quality of the APIs it's talking to.
And so it's a bit of a concern, especially when you get into agent-to-agent. If you have a bad MCP out there, all of a sudden you've allowed an agent-to-agent hacker interface to your data in a bad way. So I think they're awesome, I think they're really cool, but they do kind of freak me out a little bit right now.
[01:04:13] Speaker C: So this really is the MCP server? It really is an MCP in front of their Key Vault service? I was convinced there's no way they would allow that kind of thing, so it's clearly a flaw in the managed certificate provider, which is what I thought MCP stood for.
[01:04:32] Speaker D: Well, now you guys have me confused, because they do have an Azure MCP, the managed certificate provider, which is like their ACM: yeah, give me SSL. It works on App Services and Front Door. It does not work on App Gateway, because why would you want a managed certificate on your load balancer? We'll save that rant for another day, because it's late already.
But yeah, I think this is both of them: the official Azure MCP itself, and then how it integrates, leveraging the certificates that you have in there, which could come through Azure certificate management or other things you put in there.
[01:05:21] Speaker C: Yeah, I'm reading through it in much more detail now, because I hadn't read it closely; it is sort of complex which key the MCP server is actually leaking. I thought it was just a certificate service, but it's not. This is crazy to me.
[01:05:38] Speaker A: Yeah.
This is why I go back to my comment.
I worry about MCPs just a little bit right now.
If you're an AppSec person and your company is thinking about building an MCP, my recommendation is that it gets red-teamed to death before you ship it.
[01:05:58] Speaker C: Yeah. Or, just like any other automation, in terms of privileges and the data it can access, scope it down incredibly tightly to only what it needs to do.
I mean, although this is an example of where it's...
[01:06:15] Speaker D: It's built to do that.
[01:06:17] Speaker C: Yeah, I mean, it's managing the API call between two different subscriptions and then leaking the secret data from there. Like, that's... holy. That's bad.
[01:06:30] Speaker D: Yeah, I'm gonna echo Justin there. MCPs do kind of terrify me, because I foresee a lot of people being like, hey, I followed this guide and we officially have an insert-company-name-here MCP, and it's really just a generic setup and a full pass-through to their SQL backend, which has their passwords unencrypted, not even salted, just stored in plain text. I foresee these things exposing more and more issues.
[01:07:00] Speaker A: And worse, there's actually... like, you go to Cline or Roo Code or Claude and you say, hey, I want to use an MCP tool to connect to Google Docs, for example. This is an example I tried, and I was watching what it was doing: it went and found this MCP controller, downloaded it, and added it to the code. And then I was like, wait, where did that come from?
There's just some guy on GitHub who created a Google Docs MCP. I don't know this guy from anybody. He could be anybody. It's Node.js, but worse, for AI.
[01:07:37] Speaker D: Npm js.
[01:07:42] Speaker A: Whatever. NPM and MCP, they're both three-letter acronyms you should be careful with.
Just a word of advice.
[01:07:50] Speaker D: Are we gonna have, like, MCP hell? Like we have Ruby gem hell?
[01:07:56] Speaker A: I mean, Ruby gems are nowhere near as bad as npm, so I'm offended you even said it.
[01:08:04] Speaker D: I've done a lot of arguing with Ruby gem lock files in my life.
[01:08:10] Speaker A: Nothing is worse than CPAN. I'm going to stand on that, and we're going to move on to a Cloud Journey.
[01:08:16] Speaker D: CPAN? No, that's just me.
[01:08:19] Speaker A: I know, Perl, that's always bad. All right, let's move on to Cloud Journey. We have two this week.
One because I just want to talk about it; it's exciting, it's cool. But first, before we get to the cool one... well, they're both cool, I guess.
Database DevOps: fix Git before it breaks your production environment. Database deployments often fail due to poor Git branching strategies, particularly the common practice of maintaining separate branches for each environment (dev, QA, and prod), which leads to merge conflicts, configuration drift, and manual patching becoming routine problems. Trunk-based development with context-driven deployments offers a more scalable solution by storing all database change logs in a single branch and using Liquibase contexts or metadata to control where changes are applied, eliminating duplication and conflicts. Database changes require different handling than stateless applications because they involve persistent state, sequential dependencies, and irreversible operations, making proper version control and GitOps practices essential for safe deployments. Harness Database DevOps currently supports Liquibase for change management and enables referencing change logs for any supported database from a CI/CD pipeline, with plans to add Flyway support in the future.
This article was from Harness, if you didn't guess. Automation capabilities include drift detection, automated rollbacks, and compliance checks, which are critical for production-grade database DevOps, ensuring consistency and traceability while reducing manual overhead and risk.
I was curious about database Git in general. I'm a big fan of putting database changes in Git, but Visual Studio makes me sad about it, because SQL changes are handled terribly in Visual Studio, and then integrating something like Liquibase or Flyway becomes a bit of a pain in the butt.
But if you're using Postgres or some others, it's not too bad. In general, there are definitely gotchas if you're not using trunk-based development in your database flow, for all the reasons they just mentioned, which will cause you a bad day if you screw up.
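Editor's note: a toy model of the context-driven changelog idea from the Harness article: one changelog on trunk, with per-environment applicability carried as metadata on each changeset instead of per-environment branches. The changesets, contexts, and SQL are invented for illustration; Liquibase's real mechanism is a context attribute on changesets in its changelog formats.

```python
# Toy model of context-driven changesets: invented IDs, SQL, and contexts.
from dataclasses import dataclass, field

@dataclass
class ChangeSet:
    change_id: str
    sql: str
    contexts: set[str] = field(default_factory=set)  # empty set = applies everywhere

    def applies_to(self, env: str) -> bool:
        return not self.contexts or env in self.contexts

CHANGELOG = [  # one ordered changelog, one trunk branch
    ChangeSet("001-create-orders", "CREATE TABLE orders (id INT PRIMARY KEY);"),
    ChangeSet("002-seed-test-data", "INSERT INTO orders VALUES (1);", {"dev", "qa"}),
    ChangeSet("003-add-index", "CREATE INDEX idx_orders ON orders (id);"),
]

def plan(env: str) -> list[str]:
    """Changes to apply to one environment, in changelog order."""
    return [c.change_id for c in CHANGELOG if c.applies_to(env)]

print(plan("prod"))  # ['001-create-orders', '003-add-index'], no test seed data
```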
[01:10:06] Speaker C: I mean, I feel the same way about applications in general. I do not believe in long-lived feature branches. If you're trying to support changes to three different things, you're going to have three different ways to do it. So why not just have one, one branch that you tag to separate things out? And I get it, with databases it's very complex, you can't roll back as easily, and stuff like that. But it's way better to have a version-controlled delta of changes that you can apply, and also write sequenced scripting against, to do rollbacks and to do migrations of your existing data.
[01:10:48] Speaker A: But.
[01:10:51] Speaker D: Yeah, I've definitely seen these types of changes done before, where you do GitOps for your database, and it can solve a lot of problems, because I always felt the biggest problem with application deployments is how you update your database. Do you update before, or do you update after?
How do you really manage that? Do you have the ability to roll back? Do you make sure you update application code to always be forward compatible and always do your changes after, or vice versa? If you get all of this in there and do it right from the start, it's good. Adding this to a currently running development system is not going to be easy. I've seen a couple of companies add something like this, and developers both loved and hated it: it fixed production issues but made dev a lot more difficult. And you really do have to stick with trunk-based development, because as soon as you start doing epic-based branching and merging that back in, you have to handle all that merging, and if two people change the same columns, it definitely requires more cross-team communication, especially if you're dealing with a monorepo.
[01:12:04] Speaker C: Well, a monorepo would have some advantages, as long as you have everything going through it, which is never the case. Right? So it's like...
Because then the pipeline would break, and then it would be easier.
[01:12:15] Speaker A: I mean, if you want to get into a monorepo argument... let's not.
[01:12:18] Speaker C: Right, right in the middle of the Cloud Journey. No, I mean, it's just difficult to do these types of changes. If you have the CI/CD pipeline for it, it's good. But for database deployments I very rarely even see that. I still see a lot of administrator-driven deployments for databases, and some of that is being conservative, trying to avoid changes. But a lot of it is just because the changes aren't really thought about in a way that is a change set or change log.
They're just like, oh, run this script. Do you have a rollback? Oh, we'll just undo that transaction, or we'll just delete all the entries in that table. That doesn't really work, and that's how you get foreign key conflicts and all kinds of things.
[01:13:09] Speaker D: Or you just restore your production database from a pre-deploy backup. What could possibly go wrong with that strategy? Never seen that used as a method before.
[01:13:19] Speaker C: It's fine as long as you've shut everything down so you don't have any data loss. Not a big deal.
[01:13:25] Speaker D: Who needs data? It's fine.
[01:13:28] Speaker A: The world doesn't need data. That's just silly.
Well, the second Cloud Journey is about TDD, which early in my development career I cursed and called stupid. I've learned my lesson over my career: TDD is actually pretty nice, although my ability to write good tests has always been somewhat limited.
And so this article kind of interested me. It's from 8th Light, and it makes a compelling case for Test-Driven Development, or TDD, as the missing piece for making AI coding assistants actually useful in real-world development.
The core insight is that we've been treating LLMs like they're human developers who understand context and intent. Really, they need structured, explicit instructions, and TDD provides exactly that framework by forcing us to break problems down into small, testable pieces.
The timing is particularly relevant for cloud developers, because we're seeing tools like GitHub Copilot, CodeWhisperer, and Google's Duet AI become deeply integrated into cloud development workflows. The article's a little dated; Duet is dead, it's Gemini Code Assist now. But without a proper protocol for communicating with these tools, developers get frustrated when the AI generates code that looks good but doesn't actually work or meet the requirements. What's clever about using TDD as a communication tool is that it solves multiple problems at once. You're not just getting better AI-generated code, you're also ensuring your code has proper test coverage, which is critical for cloud applications where reliability and scalability matter. The article shows how writing test descriptions first gives the AI clear boundaries and expectations, similar to how you define infrastructure requirements before deploying to the cloud. And the reason we're talking about this is because I actually did this. One of the problems with Bolt and its ability to create show notes was that the Google Docs API is terrible.
We talked about this when I talked about Graham Bolt for the first time.
[01:15:04] Speaker C: It killed two of the cloud pod hosts.
[01:15:07] Speaker A: Yeah. So one of the problems was that I constantly was regressing the input model for Google Docs. Basically, I would make a change, tweak something, and then all of a sudden the text would get inserted in the wrong heading style, or it would get imported in bold, or it would pull in markdown like it's not supposed to. All these things were just really annoying.
And so I stumbled across this article and read it, and I solved my problem by creating a Cloud Pod show notes test file that is a perfect version of the show notes.
And then I basically told the AI: that is the test. If you can do your imports into the test document and it looks like this, in this format, you've succeeded. By doing that, I have solved all of my regression problems. I don't have them anymore. I have some other tweaky issues which aren't really fixable that way, but it went from super frustrating, where I didn't want to touch Bolt because I would break it every time I touched it, to: I can change Bolt relatively simply and not worry about it, because it won't build and it won't run if it breaks the configuration, and the test will catch it. The AI knows what's broken, and it'll fix it every time without me having to go through constant loops of correction to solve the problem. And this is because of this article; that's why it's here on the show. Now, there were some fun things. Like, when it first saw the Google Doc, it thought all of the show notes should always match the text. And I was like, no, no, ignore the text. I don't care about the text. I only care about the formatting of the text. The format should match, not the content. And then it was like, oh, okay, I get it. So there were some tweaks and some iteration so that it didn't write the tests as, oh, the article from 8th Light is the first bullet. No: I want the first bullet to be a bullet, I want it to be indented, and I want it to be normal text. If you match those three things, you succeeded, pass. If you didn't match those three things, you failed, try again. It solved probably my biggest Bolt development headache. So that's why I wanted to talk about this one.
[01:17:20] Speaker C: So as you're vibe coding that, where do those instructions live? Are you having the conversation with the AI so that it's generating the tests that look for the indent and the bullet and everything? Okay, yeah.
[01:17:31] Speaker A: So basically it writes the test, which generates another test document, and then it does a comparison of that test document it created against the master test document, for the formatting. It's written the test in a way that compares the two docs to determine whether this works, which is why it's working as well as it is, I think: because it wrote tests that are correct, and it also uses this document to verify that its tests actually were correct as well.
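Editor's note: a minimal sketch of the golden-document test Justin describes: compare the formatting of a freshly generated doc against a known-good reference while deliberately ignoring the words. The Block shape here is a hypothetical stand-in for whatever structure the Google Docs API actually returns.

```python
# Hypothetical Block shape; the real Google Docs API structure differs.
from dataclasses import dataclass

@dataclass(frozen=True)
class Block:
    style: str    # e.g. "HEADING_2" or "NORMAL_TEXT"
    bold: bool
    indent: int   # bullet nesting level
    text: str     # present, but never compared

def format_signature(doc: list[Block]) -> list[tuple[str, bool, int]]:
    """Strip the words; keep only each block's formatting."""
    return [(b.style, b.bold, b.indent) for b in doc]

def assert_formatting_matches(golden: list[Block], generated: list[Block]) -> None:
    assert format_signature(generated) == format_signature(golden), \
        "generated doc drifted from the golden formatting"

golden = [Block("HEADING_2", True, 0, "Story title"),
          Block("NORMAL_TEXT", False, 1, "First bullet")]
fresh = [Block("HEADING_2", True, 0, "Other title"),
         Block("NORMAL_TEXT", False, 1, "Other bullet")]
assert_formatting_matches(golden, fresh)  # passes: same formatting, different words
```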
[01:17:58] Speaker C: It's funny, because I'm such an API backend guy, I've always thought about these things as function outputs and text or JSON objects. I never would have thought of a test document for test-driven development. Pretty smart. I like it.
[01:18:13] Speaker D: I mean, you could also have it write the tests and then have it run the pipeline when it's done: here, go run this, go trigger this pipeline with your branch. That's where I kind of saw this going. You always have that mythical, hey, we want code coverage from our testing, and with a pre-existing application, as opposed to a greenfield application, that's so hard. At least with greenfield you can start with that and say, here's my test, now write the code that matches the test.
So over time, what you might end up seeing, which is an interesting idea I kind of thought of based on this article, or maybe the article said it and I therefore thought of it, I don't remember anymore because we've talked about covering this for a few weeks, is that you're really going to have more QA people. If you have a QA person who's able to write all the tests that need to pass, you then just tell the AI to go write something that passes the tests.
And you can then start to build out and do more things along those lines. So you're not going to be working as much on the actual engineering; it's going to be more focused on how you automate and write those tests for the AI to write its code against.
[01:19:34] Speaker A: Yeah. Well, in the example they gave in the article, you basically write descriptive test cases covering your requirements, implement one seed test to establish the pattern you want to use, and then let the AI generate the remaining tests and then the implementation code; there's a sketch of a seed test below. That approach works particularly well for cloud microservices, where you need consistent patterns across multiple services and APIs.
If you're using AI coding assistants, it could be a game changer in terms of productivity and code quality. Instead of developers spending hours debugging AI-generated code that misses critical edge cases, they're using the AI to handle the repetitive implementation work while maintaining a high standard. And you can even use it to write the tests; that's one of the big advantages of AI, in my opinion, documentation and tests for sure. You can still write the right pattern, the format you want things to be in, and then make it comply with that standard as well. And the memory functions: you can make it remember those things, so you don't have to tell it multiple times that you need to do this thing.
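Editor's note: what a "seed test" can look like in practice, as a hedged sketch. The slugify() function and its expected behavior are invented; the point is that one human-written test pins down the naming, structure, and assertion style for the AI to copy when it generates the rest.

```python
import re

def slugify(title: str) -> str:
    """Lowercase a title and join the words with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def test_slugify_collapses_punctuation_and_spaces():
    # Seed test: the exact shape every generated test should copy.
    assert slugify("Azure Firewall: FQDN Edition!") == "azure-firewall-fqdn-edition"

test_slugify_collapses_punctuation_and_spaces()
```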
But yeah, I agree with you. I think the ability to write really good tests is going to be important. But I also think the ability to debug some of this complex code is going to be very important for developers in the next few years. Because I can see it in two or three years: all these startups that were vibe coding and getting solutions out to market hit some critical mass, then hit scalability challenges, and they're like, I can't get the AI to fix it, it just doesn't work. And you're gonna need people who know...
[01:20:52] Speaker C: How to. You can't vibe code a hot fix.
[01:20:56] Speaker A: Or if you don't understand how to do massive scale, you're not going to be able to tell the prompt how to design a system for massive scale. You just can't do it. So I do suspect that in the next 18 months we'll see the pendulum swing back the other way: oh yeah, we let all these developers go, and now we need those developers back to fix some of the garbage code these things vibe coded into existence. Even in my vibe coding experience, I spend a lot of time going back and refactoring. I look at what it generated, I'm not happy with it, but it works, so I ship it because it works. But I keep a document of things I need to go back and fix. And then I use a different model, Claude or Gemini or OpenAI, and I say: analyze my code base and make recommendations for simplification, refactoring, repeatability, classes. I'm going through that cycle at least once every couple of weeks, so I'm simplifying as I go. And especially if I start seeing the AI bog down, repetitively trying to make things work that aren't working, I'm like, no, no, it's time to stop. We need to simplify, make this better, easier, standardized. When you do that, it actually makes the AI more efficient too, so you're not wasting money on tokens in the repetitive loops it can get into. It's only because I've been doing this long enough now that I know the right patterns and can catch it in the middle of doing the wrong thing. And look, I'm not a strong coder at this point in my career; my heavy coding days are long behind me. But those principles still apply, right?
[01:22:29] Speaker C: Absolutely do.
[01:22:30] Speaker A: I mean, and so I don't...
[01:22:31] Speaker C: I don't agree with all of it, you know, because I still think test-driven stifles innovation. It's just hard to think about things from a test standpoint first. I agree with everything else about it. It's just that being able to write the expected outputs of each individual function beforehand is very difficult, folks.
[01:22:50] Speaker A: Well, see, again, I was not a big TDD developer.
It was becoming the big hotness right when I started getting into management, so I never felt my way through how to build it out properly. But having now watched Claude write tests, I'm like, oh, my concept of what test-driven development was, was too narrow in many cases. The tests don't have to be so specific. In my mind they were always these very, very specific requirements that had to be written into a test, and then I'd write code that would match the test. But looking at what it writes, I was like, okay, it's close, but it's not always exactly dictating implementation.
It's really about the business output, or the functional output you want. How it gets designed to do that is not really important, as long as the output is correct and matches the edge cases, et cetera. So I have learned more about writing good tests by watching the AI do it, which is nice. I'm also glad I didn't get heavily into TDD back then, because I think it would have driven me up a wall; it's still taking me a little bit to get my head around it. But I've always respected the process. It's just always been something I never did myself. So I always made Ryan do it, but never did it myself.
[01:24:10] Speaker C: I just said I did it, and then I just did normal development without any tests. But I said I had tests, because they were all just echo, you know, zero.
True.
[01:24:21] Speaker A: Yeah. It is interesting to see.
That's perfect.
[01:24:24] Speaker D: I can see him actually doing that.
[01:24:26] Speaker A: He has definitely done that. I think I've reviewed a pull request and approved one that had that in there before; it sounds very familiar to me.
Another thing I've seen a couple of times now is where you put a test coverage requirement on your code, and watching the AI struggle to increase coverage by like 2% on a really large code base is sort of hilarious. Bolt is all Python code, and there's only so much of it you can actually write tests for, because a lot of it's just boilerplate scaffolding. So it's like: this file has 400 lines, it has zero tests, I'm going to go write tests for it. And it does all this work and generates all these tests, and then: we increased the test percentage of the entire product by 3%; we still have 8% to go to get to 70%. I'm like, sucks to be you.
[01:25:19] Speaker D: Thank God you're an AI bot.
[01:25:21] Speaker C: Well, and it's funny, because you can catch it if you force that. In my early days of development with this, I was experimenting, trying to get to, I think I said, 90% code coverage, and forcing it through lots of iterations. And it started writing tests that were tests for tests' sake. It wasn't valuable inspection at all; it was just, this line should do this, so assert it does this. And the amount of mocking it had to do to accomplish some of those things didn't actually add anything to the value of testing the code. It was just struggling to figure out what to test for. One of my favorite interactions was the AI evaluating a test it had just written that I was challenging it on. It was like, oh, this is a perfect example of how you shouldn't do things. I'm like, you just wrote that!
[01:26:23] Speaker D: You mean arbitrary business rules that don't make sense don't actually help you achieve the business objective?
[01:26:30] Speaker C: Come on. No, yeah. Code coverage is a goal, and I think it should always be a goal, because it's good to have tests, and we all need to know what the expectation is.
[01:26:41] Speaker D: I write perfect code the first time, every time. Yeah, like every developer out there. Yeah, 100%.
[01:26:48] Speaker A: Yeah. The other thing I've been doing a lot more of is pre-commit hooks.
That's a lot of fun too.
You can get the AI really pissed off about pre-commit hooks real quickly. It's like, hey, I want to disable it. And I'm like, no, you can't; you have to fix it, because you can't disable the pre-commit hooks. You can't even...
[01:27:04] Speaker C: Oh, that's smart, actually. Because, yeah, I actually moved away from an experiment with Gemini because it automatically made a commit. I'm like, no, you didn't.
[01:27:16] Speaker A: Well, then it's great, because it's writing Python code and it very often puts in extra lines and doesn't follow good Python specs. So I use Black and a bunch of other Python linters, and I make it run through all of those. Most of the time it fails the first time and has to go figure out, oh yeah, I didn't follow the rules, let me use the automated fix. But it's nice, because now it's making sure everything's consistent and clean. So pre-commit hooks are a good one, and the security ones in particular are important. If you're not careful, and if you're doing anything with Docker containers you've got to be careful here, it'll shortcut to just inserting the secret into the container build. A pre-commit hook that looks for that is one I highly recommend, because I caught it, and then I had to go into Docker and delete a bunch of old containers. I'm like, you bastard, you put secrets into that build that you should not have.
And I had to rotate them. Not that I think they were exposed or anything, but it's just good practice, and I sleep better at night. So there we go.
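Editor's note: a bare-bones sketch of a pre-commit secret scan like the one discussed: grep staged files for obvious credential patterns and fail the commit on a hit. The patterns are illustrative only; purpose-built hooks such as detect-secrets or gitleaks are far more thorough.

```python
#!/usr/bin/env python3
# Illustrative patterns only; real hooks (detect-secrets, gitleaks) do far more.
import re
import subprocess
import sys

SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key ID shape
    re.compile(r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{8,}"),
]

def staged_files() -> list[str]:
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line]

def main() -> int:
    hits = []
    for path in staged_files():
        try:
            with open(path, encoding="utf-8", errors="ignore") as f:
                text = f.read()
        except OSError:
            continue  # deleted or unreadable path still listed in the index
        for pattern in SECRET_PATTERNS:
            if pattern.search(text):
                hits.append(f"{path}: matches {pattern.pattern}")
    for hit in hits:
        print(f"possible secret -> {hit}", file=sys.stderr)
    return 1 if hits else 0  # nonzero exit blocks the commit

if __name__ == "__main__":
    sys.exit(main())
```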
[01:28:17] Speaker C: Yeah.
[01:28:17] Speaker A: Well, gentlemen, I think we've covered this topic to death.
Yes, TDD for life, everyone, right? Yeah. And then test equals zero equals yes.
[01:28:27] Speaker D: So good. Return true. Don't... return true.
[01:28:31] Speaker A: All right, well have a good night, both of you. We'll see you next week here on the Cloud podcast.
[01:28:36] Speaker C: Bye, everybody.
[01:28:37] Speaker D: Bye, everyone.
[01:28:41] Speaker B: And that's all for this week in Cloud. We'd like to thank our sponsor, Archera. Be sure to click the link in our show notes to learn more about their services.
While you're at it, head over to our website at thecloudpod.net, where you can subscribe to our newsletter, join our Slack community, send us your feedback, and ask any questions you might have. Thanks for listening, and we'll catch you on the next episode.