[00:00:06] Speaker B: Welcome to the Cloud Pod, where the forecast is always cloudy. We talk weekly about all things AWS, GCP, and Azure.
[00:00:14] Speaker C: We are your hosts, Justin, Jonathan, Ryan, and Matthew.
[00:00:18] Speaker A: Episode 301, recorded for April 22, 2025. The Cloud Pod party rocks in the house tonight.
[00:00:26] Speaker C: Good.
[00:00:26] Speaker A: Good evening, Matt. How you doing?
[00:00:28] Speaker C: Good. How are you, Justin?
[00:00:30] Speaker A: You know, it's Tuesday, it's not Friday, so. 50 50. That's probably as close as I get to it.
[00:00:37] Speaker C: You have five days done. Yeah.
[00:00:40] Speaker A: Yeah. Well, you know, unfortunately, we cannot track down Ryan tonight. I don't know where he's at other than he's probably napping, as he usually likes to do.
[00:00:47] Speaker C: He's in a cave somewhere, and he'll walk outside to see the smoke signals tonight.
[00:00:51] Speaker A: Yep. And then Jonathan's out as well. And then you're... you're leaving us soon for maybe a week or two, as you have some exciting life changes happening, which you can share if you want to, but I won't force you to. But it's exciting times. Congratulations.
[00:01:03] Speaker C: Thank you. After.
[00:01:05] Speaker A: Yes, after. Good.
Well, last week we recorded our 300th episode, and it was our Google Next recap show, where we just focused on Google Next, which means we're talking about a lot of AWS and Azure news this week. Just a little preview. But, you know, I was trying to do a bunch of things with AI to talk about our show, and there are things that I had forgotten. Like, number one, I did not save any of our show notes from episodes one through 50. The first show notes that we have in the archive, back in 2018, are from episode 51. And I do vaguely remember realizing, like, hey, it's been a year, maybe we want these for historical reasons, and that's when we started saving all of our show notes. So now we have, you know, six files of show notes, from 2019 all the way to 2025, of everything we've ever talked about on the show. And so I thought I could just use AI to do, like, analysis of it. Like, how many stories have we covered on AWS across 300 episodes? How many stories on Azure? Yada, yada, yada.
The problem is that's a pretty big context window required to do all five documents, so you have to pare it down. I was trying to do just one document, just 2019, which is just 50 episodes, which is, I think, like 80 pages, if I remember right. It's not massive, but it's big enough that the AIs scan through the first, like, maybe two or three pages, and then just make a bunch of assumptions about the rest of the doc.
So, okay, I wasn't able to quite deliver on what I wanted to deliver for you guys, and I apologize for that. I tried going into, like, real Vertex, real SageMaker. I did spend $45 in Vertex on Veo 2, because I was in there playing and I made two cute little videos, and then I got a notice saying, hey, you went over your budget. And I was like, how much did that cost me? It was $45 for two sets of four 6-second videos in 4K. So, yeah, be careful with your AI usage. It gets expensive real quick.
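For anyone who wants to try the same experiment, a minimal sketch of the chunk-then-combine approach that gets around that skimming problem, assuming an OpenAI-style client; the model name, chunk size, and prompts here are placeholders, not a tested recipe.

    # Sketch: summarize a show-notes file too big for one context window by
    # summarizing chunks, then summarizing the summaries (map-reduce style).
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def ask(text: str, instruction: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4.1-mini",  # placeholder; any long-context model works
            messages=[{"role": "user", "content": instruction + "\n\n" + text}],
        )
        return resp.choices[0].message.content

    def summarize_large_doc(doc: str, chunk_chars: int = 20_000) -> str:
        # Map step: summarize each chunk so nothing past page three gets skimmed.
        chunks = [doc[i:i + chunk_chars] for i in range(0, len(doc), chunk_chars)]
        partials = [ask(c, "Tally and summarize the AWS, GCP, and Azure stories in this excerpt.")
                    for c in chunks]
        # Reduce step: combine the per-chunk tallies into one answer.
        return ask("\n\n".join(partials),
                   "Combine these partial tallies into overall story counts per cloud.")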
But anyways, you haven't been with us for all 300 episodes, but you've been with us for quite a while. And so I thought we'd just maybe reminisce for a moment here about 300 episodes. I did promise it to the listeners, and I hate to disappoint them. I was thinking a lot about the 300 episodes, and, A, I am kind of shocked how long we've been doing it, because 300 doesn't feel that long. I mean, technically we started in 2018, and it's now 2025, so I guess that's almost seven years of podcasting when you do the math. I mean, end of 2018, so I'm going to count it as six, really.
And yeah, I mean, 50 episodes divided by, you know. Yeah, it works out.
[00:03:45] Speaker C: Makes sense.
[00:03:46] Speaker A: Yeah, the math checks out, right? So it's been fun. My reflection is, like, I've had a lot of fun doing it. Is it making a lot of money for us? No, it doesn't; it breaks even at best. We've had a great time. Peter was with us for a few hundred episodes, and then you joined us, which was a great upgrade over Peter. No offense, Peter, but you're kind of bringing some more of the Azure stuff. Ryan's kind of gone from our container expert to our container expert plus security. Jonathan's become our AI guy. I'm still kind of, you know, the executive for hire here, just trying to keep the show notes going, keep the show on the rails, if you will. But I've really enjoyed the time and it's been a lot of fun. And, you know, would I love it to be tremendously successful, so we could all retire off the podcast? Yeah, that'd be awesome, because it's just fun hanging out with you all the time.
But, you know, I think we're having fun and I don't see us ending anytime soon, unless our wives make us do it, or family things come into play. But, yeah, I don't know. What's your feeling? You've been with us a long time. And Ryan's not here to pontificate, but yeah, he wasn't here for the whole thing either. It really was just Jonathan and myself and Peter initially, and then I think we brought Ryan in around, like, episode 70-ish, and then he's been here for a long while. And I don't remember what episode you joined us, but it's been years ago. Yeah.
[00:05:03] Speaker C: Because my LinkedIn notified me, um... I think I did some episodes earlier on, here and there. Yeah, I did a few for Peter.
[00:05:10] Speaker A: When you were filling in. Yep.
[00:05:11] Speaker C: Yeah. And I think I did a few before we all kind of said, yeah, I was gonna stay on. Because time zone math, you know, it's amazing. I jokingly tell people it's really difficult to coordinate four people's lives with a couple different jobs, you know, and families and two time zones and everything else. So, like, that's why I'm always like, yeah, there's four of us on the podcast, so we can normally at least get three. I understand the irony of me saying that now with only two of us here.
[00:05:42] Speaker A: Right.
[00:05:43] Speaker C: Just to point that out. You know, the goal is always to have at least three of us, and we can usually get that; to get all four of us is like a novelty, I feel like, just because it's so difficult to do. But, you know, even before I joined, I listened in. I used to listen, you know, during runs. And thinking back on the podcast, it's changed so much. I mean, I remember the first couple episodes you guys were, you know, joking about, like, what beer you were drinking that day or whatever, and moved on to, like, lightning round stuff and, you know, screwing with Peter, and Ryan coming in kind of as the containers guy. And it's really interesting to think, even since I've been here, how the show has evolved. Like, it was containers, and then blockchain stuff got thrown in there, and then containers slash Kubernetes came back, and now it's, like, AI everything. And when it first started, it was like, a little AI, and I'm like, okay, I can do this. And now you can't just do AI anymore.
It's AI everything.
[00:06:44] Speaker A: If it was still blockchain, I don't know how I'd feel, because the blockchain stuff was just so tiring. And it was like, everyone was all, "I happen to solve all these problems with blockchain." It's like, yeah, sure you do. Okay. Yeah, yeah. I mean, AI definitely is having its moment, and I kind of look forward to whatever's the next thing.
But it is interesting how it has changed, because, you know, back in 2018, that wasn't the re:Invent year they announced ECS, because I think that was in 2017 or 2016, a couple years before. Yeah, but I think that was the year they announced EKS, perhaps.
[00:07:16] Speaker C: Or maybe 2015, they announced ECS.
[00:07:20] Speaker A: Okay.
[00:07:20] Speaker C: And then when did EKS... 2018.
[00:07:24] Speaker A: Yeah, so that was the re:Invent they announced EKS. Okay. And then, you know, they had a preview in 2017, and then GA in 2018. Okay, makes sense. Yeah. So we were just a little bit after that.
But yeah, you know, the drinking thing was fun. When you think about it, it's really the four of us BSing about technology, and of course you're going to do it over a drink, because that's how we did it at re:Invent, and that's appropriate when you think about re:Invent and what you're doing. But then you kind of think, like, there's still drinking happening, we just don't talk about it, which is fine, because there are people out there who are struggling and who have challenges with alcohol, and hearing four people enjoying tasty beverages might be triggering for them. And so we kind of dropped that out. And we did the lightning round. The lightning round was fun, but then, you know, you can only write so many jokes about...
Yeah, and I won a lot of lightning rounds. But I think, you know, there are some things production-wise that have stuck with us pretty much the entire time.
You know, the AI section is new. We did retire the lightning round. We did kind of cut down on the number of stories we're covering, although I think today is not a...
[00:08:32] Speaker C: Good side of that.
[00:08:33] Speaker A: Today is not the day to say that, because we have a lot to go through. But it has been an evolution, and I think it's feedback from listeners like you, who are listening to us right now babbling on about our 300 episodes.
We take your feedback, and we take it seriously. If you're like, hey, less this, more that, then cool, we'll take it under advisement. We don't always do it, but a lot of times we take your feedback seriously and we do things. And I was having lunch today with a friend of mine who has a podcast as well, and he was raving about how exciting going to video is. And I was like, yeah, but what does it really take to go to video? And he's like, oh, I have a full-time producer and all that. I'm like, I don't know, I don't want to do that. That's a lot of work.
[00:09:12] Speaker C: Feels like a lot more work for.
[00:09:14] Speaker A: Yes. And so, you know, but like, you know, we are definitely getting a little bit of a following on YouTube. Even though it's audio only on YouTube, we do republish there. We do get people listening to us there, which blows my mind every time.
[00:09:24] Speaker C: I didn't even know we published to YouTube.
[00:09:26] Speaker A: Yeah, see, good job. Yeah, thanks. That's the nice thing about using an aggregator like Castos that just distributes things everywhere, so I don't have to worry about that as much. But it's good. But yeah, if you're listening to us, and you've been sticking with us as either a new listener or a long-term listener, we'd love a review on Apple Podcasts or on Spotify or wherever you listen to your podcasts. Or on YouTube, you can comment, although I won't read the comments on YouTube, because I learned a long time ago you just don't read comments on YouTube. It's bad news, even if you're not the person on video. We tried some things, too, that didn't work. I don't know that our re:Invent livestream worked as well as we hoped it would. You know, we had people come and join us, which was fun, but it was only like 15, 20 people, and, you know, the realization is that most people are probably at re:Invent doing re:Invent things, and not going to listen to us on the Internet babbling on top of Andy Jassy talking. So yeah, there's definitely lots of things for us to do. And, as Jonathan and I get together regularly, we talk about things we might want to do with the show going forward. You know, like, do we do memberships, where you guys can get an unedited version of us screwing this up, or talking about the show notes before we do the show, or talking about all the different titles that don't make the grade that we put in the show notes? Because there's a lot of fun dialogue that happens there.
We try to save as much for the show as possible, but sometimes it slips out. So we could offer a bootleg cut, that kind of thing. So we've had those kinds of conversations, and we'll see where we end up by the time we get to 500 episodes. But congratulations on 300 episodes, Matt.
[00:10:57] Speaker C: Congratulations to you. You've probably done, I would guesstimate, probably about 295.
[00:11:05] Speaker A: Probably like 293, I think, would be accurate, because I typically take a couple of vacations a year and then I make you guys all try to be me. And that's why you guys don't want to fire me, because I do all the hard work.
[00:11:15] Speaker C: You really do. I feel like I just kind of show up, you know, schedule-wise. I feel like I show up and I'm like, okay, we're doing this live. Whereas, like, I know Jonathan and you put in a lot of the time beforehand, and I feel like Ryan and I sometimes just show up. So I'm pulling Ryan down with me, but he's not here to defend himself, because he's out taking a nap today.
[00:11:40] Speaker A: Yeah, the show notes are a labor of love, but AI is starting to make things better. So Ryan promised a long, long time ago that he was going to create an RSS feed that would automatically summarize these things. And I am actually now 80% done with the code, vibe coding it out. So, you know, we're coming along. The reality is that the APIs for Google Docs are probably the worst part of the whole process.
[00:12:05] Speaker C: Well, I started it and got probably about 70% of the way done, and all I had to do was connect it up: have the RSS feed link in and then publish it to Google Docs. And I was like, okay, let me get the Google Docs part, you know, feeding it in. I was going to do the ridiculous thing of a Lambda on AWS hitting Azure OpenAI GPT APIs and Google Docs, just to be a true multi-cloud solution, because I hate my life. And I got to the Google Docs part and I was like, this is no fun anymore. I don't really want to understand their APIs. This doesn't work. So maybe the vibe coding will actually help, because it will just work. Or you'll hate yourself more.
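For the curious, a minimal sketch of the pipeline Matt describes, assuming the openai and feedparser packages and an AWS Lambda entry point; the endpoint, deployment name, and feed URL are placeholders, and the Google Docs step, the part where both attempts died, is deliberately left as a stub.

    # Sketch of the multi-cloud abomination: Lambda pulls an RSS feed,
    # summarizes entries with Azure OpenAI, then hands off for publishing.
    import os
    import feedparser
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2024-06-01",
    )

    def handler(event, context):
        feed = feedparser.parse("https://example.com/cloud-news.rss")  # placeholder feed
        summaries = []
        for entry in feed.entries[:10]:
            resp = client.chat.completions.create(
                model=os.environ["AZURE_OPENAI_DEPLOYMENT"],  # deployment name, not a model ID
                messages=[{"role": "user",
                           "content": f"Summarize for show notes: {entry.title}\n{entry.summary}"}],
            )
            summaries.append(resp.choices[0].message.content)
        publish_to_google_docs(summaries)

    def publish_to_google_docs(summaries):
        # The Google Docs API is where both attempts died; left as an exercise.
        raise NotImplementedError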
[00:12:49] Speaker A: I mean, I can tell you that I tried to use MCP for the first time as I walked through, you know, basically trying to get show notes parsed through the thing. I was like, well, maybe I can use MCP with Claude to basically do this. And so, you know, you're supposed to use Claude to create your MCP. And, like, I probably could have figured it out, but I lost interest about halfway through.
[00:13:12] Speaker C: I know where I got to. There's a GitHub repo with my code in it, if you ever want to look at terrible Python code.
[00:13:17] Speaker A: Yeah, exactly. So, like, I created a Google Docs provider. Then there were some, like, special steps: I had to go get API keys, and that required me to log into the Google Cloud console. And then I was sort of like, eh, I don't know if I need this. And so that's kind of how that died for me on this one. But then I was just like, well, I'll just export it as PDFs. But, you know, the reality is these files are big too, because, again, they're hundreds of pages long, the show notes files; they're like two to three megabytes each. But if there's a listener out there who is better at AI than us and wants to take our show notes, I will gladly give these files to you, and you can see if you can parse them into something meaningful: statistics and things that would be more interesting.
[00:13:52] Speaker C: But.
[00:13:52] Speaker A: All right, let's get to the news, because I mentioned there's a bunch of it today and we don't want to be here all night.
All right, general news. Google is being held to account for their illegal monopoly in ad tech, a US judge finds. The Department of Justice has won the ruling against Google, paving the way for US antitrust prosecutors to seek a breakup of its ad products. Google was found liable for willfully acquiring and maintaining monopoly power in the markets for publisher ad servers and the market for ad exchanges, which sit between buyers and sellers. Hearings will now be scheduled to determine what Google must do to restore competition in those markets. And US Attorney General Pamela Bondi called the ruling a landmark victory in the ongoing fight to stop Google from monopolizing the digital public square. Google says they will appeal the ruling, of course, pointing out they won half of the case and just lost the latter half. And the DOJ says that Google should have to sell off at least its Google Ad Manager, which includes the company's publisher ad server and ad exchange. This ruling is in addition to a recent ruling that Google used monopoly practices with the Chrome browser and would need to sell that off as well. So there are multiple cases against Google, all kind of coming to a head now, with Google not coming out the victor in all of these.
[00:14:58] Speaker C: No, it's fascinating.
It feels like a replay of when I was growing up, with Microsoft and IE and Windows and all those things. It's the same story. You know, we kind of let everything slide for years, and now it's all coming back. And, you know, it's good to see that they are at least trying to look at these companies and say whether they have monopolies or not. My wife's in marketing, and she talks about how pretty much it's Google; there's a little bit of Amazon, depending on whether you're a product shop or not, like, what you're selling. But overall you really have to integrate with Google if you want anything. And getting above the fold, aka like 3 or 4 in your SEO results, is still so critical, and they control everything.
[00:15:46] Speaker A: Well, Google didn't take this sitting down. They actually posted a blog post, technically responding to the Chrome browser ruling, but basically Google says that the DOJ's, or Department of Justice's, 2020 search distribution lawsuit is a backwards-looking case at a time of intense competition and unprecedented innovation, with new services like ChatGPT and DeepSeek thriving in the market, and the DOJ's sweeping remedy proposals are unnecessary and harmful. Google says at trial they will show how the DOJ's unprecedented proposals go miles beyond the court's decision and would hurt American consumers, the economy, and technological leadership. They have several points, including: the DOJ's proposals would make it harder for you to get services you prefer. People use Google because they want to, not because they have to. That's not what the DOJ says.
DOJ's proposal would force browsers and phones to default to search services like Microsoft's Bing, making it harder for you to access Google. And I definitely would not be using Bing as my search choice.
[00:16:42] Speaker C: Yeah, I'm good. Yeah, I probably would go to DuckDuckGo before Bing.
[00:16:45] Speaker A: Yeah. The DOJ's proposal to prevent us from competing for the right to distribute search would raise prices and slow innovation. Device makers and web browsers like Mozilla's Firefox rely on the revenue they receive from search distribution. Removing that revenue would raise the cost of mobile phones and handicap the web browsers you use every day. I mean, Apple phones keep going up regardless of this, so I don't think I'm getting any benefit from the money that they're paying Apple to make Google the default search engine in Safari. I'm pretty sure Apple's just pocketing that money and charging me more.
[00:17:13] Speaker C: Obviously. How else are they going to, you know, show their profit margins year over year?
[00:17:17] Speaker A: Exactly. The Firefox browser would definitely take a loss, though; that is the primary way they make money. And so that one, I... but again, it's sort of, you know, is it really for convenience?
[00:17:29] Speaker C: But also, why couldn't they still run the same programs? They could still make it the default in Firefox; like, nothing's stopping them from doing it, I guess. Unless they sell off the ad space from it, so they're not going to provide that revenue stream for them anymore. Which, at that point, feels kind of petty.
[00:17:49] Speaker A: Yep.
Third one: the DOJ's proposal would force Google to share your most sensitive and private search queries with companies you may have never heard of, jeopardizing your privacy and security. Your private information would be exposed, without your permission, to companies that lack Google's world-class security protections, where it could be exploited by bad actors. I mean, Google, you're kind of a bad actor on your own. I mean, like, you're literally using my data to basically give advertisers information to sell me better ads. So I don't know if I agree with you that it's going to be any worse than you already having that data, especially with...
[00:18:22] Speaker C: The ad exchange and everything else they integrate. You know, I remember a story on the news about, like, a US senator, and they were able to identify, based on like five different purchases of under $1,000 each, what, let's just go with unethical things, the senator was into.
[00:18:42] Speaker A: Yep.
[00:18:42] Speaker C: Just by pulling, like, generic data, they were able to kind of triangulate this one person from that. So I'm not really sure that "do no harm" is really still always there.
[00:18:53] Speaker A: Oh, it's been gone for a long time, I think, is the reality.
[00:18:57] Speaker C: Yeah, but they like to say it's still there.
[00:19:00] Speaker A: They like to not correct you when you say that.
[00:19:03] Speaker C: Yeah.
[00:19:04] Speaker A: And then, the DOJ's proposal would also hamstring how they develop AI and have a government-appointed committee regulate the design and development of their products, which would hold back American innovation at a critical juncture. We're in a fierce, competitive global race with China for the next generation of technology leadership, and Google is at the forefront of American companies making scientific and technological breakthroughs. That one I maybe agree with.
[00:19:24] Speaker C: How does the Chrome spin-off affect development of AI, though?
[00:19:28] Speaker A: Well because now they don't have all that search history to then use it to train models of how people browse the web to answer questions, I guess. I don't know.
[00:19:35] Speaker C: Trying to piece together the puzzle here.
[00:19:37] Speaker A: Yeah, I'm not sure. And then finally, the last one: the DOJ's proposal to split off Chrome and Android, which we built at great cost over many years and make available for free, would break those platforms, hurt businesses built on them, and undermine security. Google keeps more people safe online than any other company in the world. Breaking off Chrome and Android from our technical, security, and operational infrastructure would not just introduce cybersecurity and even national security risks, but also increase the cost of your devices.
[00:20:00] Speaker C: My phone's already expensive, and I use a Pixel, so I'm going to go with no on that first part. Look, they do get enough data, you know, from being in the security world; hell, they just bought Wiz and they already owned Mandiant and other stuff. Like, they are a...
They do have a lot of data in order to identify security risks and see patterns before a smaller company would. But I just don't really see how, like, Android spinning off... you know, I'm sure they send back a ton of data from my phone on everything I do, from the keyboard and everything you type, to even, like, you know, the private mode and everything else, all the way through, like, every webpage you search. So, you know, I don't know that I buy this 100%. I think they're stretching, but I kind of can see where, from a security honeypot aspect, it would add a level of risk. But that implies that everyone in Google is talking to everyone else, sharing the security risks they find along the way, and making proper notifications internally, which, at a company of Google's size, is that really something that I would 100% expect?
[00:21:16] Speaker A: Yeah.
So that's what Google says about what they're proposing so far; they have now given their suggested remedies to make competition better. So, first of all, browser companies like Apple and Mozilla should continue to have the freedom to do deals with whatever search engine they think is best for their users. The court accepted that browser companies occasionally assess Google's search quality relative to its rivals and find Google's to be superior. And for companies like Mozilla, these contracts generate vital revenue. So, do nothing. Good. Check, check. Second, their proposal allows browsers to continue to offer Google Search to their users and earn revenue from that partnership, but it also provides them with additional flexibility: it would allow for multiple default agreements across different platforms, i.e., a different default search engine for iPhones versus iPads, and across browsing modes, plus the ability to change their default provider at least every 12 months. The court's decision specifically referred to a 12-month agreement as presumed reasonable under antitrust law. So basically, Apple could have a deal for iPhones to use Google's search engine, but my iPad could have a deal with DuckDuckGo, blah blah blah. That's what they're saying. Again, not really anything changing here.
[00:22:19] Speaker C: Yeah, I was gonna say, it feels unlikely, too, that specifically Apple, which, if you really think about it, the two major players are Apple and Android, is gonna do something different, because their whole thing is a unified experience across all their devices, and they all integrate seamlessly. So would you really want the same search query on your phone to behave differently than on your tablet, and differently than in Safari? That doesn't feel logical, or like something that any company, let alone specifically Apple, is going to go do.
[00:22:49] Speaker A: Exactly.
For Android contracts, their proposal means device makers have additional flexibility in preloading multiple search engines and preloading any Google app independently of preloading Search or Chrome. Again, this will give their partners additional flexibility and their rivals, like Microsoft, more chances to bid for placement through a vendor. So basically, there are some restrictions that they put on Samsung and others who license the Android operating system, requiring them to include the Google apps; this would give them the flexibility to not do that, or to provide alternatives.
[00:23:17] Speaker C: It feels like they're purposely going out of their way to do something that they probably shouldn't do. And this is their, hey, we're going to throw one in to make it feel like we're doing something, but we're really not doing anything.
[00:23:31] Speaker A: Yep. And then finally their last point is oversight and compliance. Their proposal includes a robust mechanism to ensure we comply with the court's order without giving the government extensive power over the design of your online experience. So trust us, it worked for Boeing. It worked for us too.
[00:23:45] Speaker C: Boeing's doing a great job. No planes fell out of the sky, and they didn't rebuy the company they spun off.
[00:23:51] Speaker A: No doors blew out. You know, based on self-certification with the FAA, there's no problem, no issues.
[00:23:56] Speaker C: And like, you know, they spin off a company and then repurchase that company back for like 10x the price they spun off for and you know, sold off all their engineering wings to 12 other companies. It doesn't feel like anything that.
[00:24:10] Speaker A: Yeah, so that's their proposal. Their proposal doesn't seem very strong; that's kind of my take.
I don't know what you thought, but definitely not a great look for Google. So we'll see how this all ends up for them. I'm not as confident as they are that that's going to work out.
All right. OpenAI is releasing o3 and o4-mini, the latest in their o-series of models trained to think longer before responding to you. OpenAI says these are the smartest models they have released to date, representing a step change in ChatGPT's capabilities for everyone from curious users to advanced researchers. And for the first time, the reasoning models can agentically use and combine every tool within ChatGPT. This includes searching the web, analyzing uploaded files and other data with Python, reasoning deeply about visual inputs, and even generating images. OpenAI o3 is the most powerful reasoning model, pushing the frontier across coding, math, science, visual perception, and more, and it sets a new standard on benchmarks including Codeforces, SWE-bench, and MMMU. External experts say o3 makes 20% fewer major errors than OpenAI o1 on difficult real-world tasks, especially excelling in areas like programming, business consulting, and creative ideation. OpenAI o4-mini is a smaller model optimized for fast, cost-efficient reasoning, and it achieves remarkable performance for its size and cost, particularly in math, coding, and visual tasks. In expert evaluations, o4-mini outperforms its predecessor, o3-mini, on non-STEM tasks as well as domains like data science. Thanks to its efficiency, o4-mini supports significantly higher usage limits than o3, making it a strong high-volume, high-throughput option for questions that benefit from reasoning.
[00:25:45] Speaker C: You know what I think? We should have made the title of the podcast "the Cloud Podcast." Oh yeah.
[00:25:53] Speaker A: So, I mean, the thing about this: everyone thought they'd have GPT-5 out by now, and they don't. And now they're doing major updates to both o3 and o4-mini.
You know, it sort of feels like they're just making optimizations to the current models because they don't have anything better to provide and they need an update.
[00:26:19] Speaker C: Out there in the world.
[00:26:20] Speaker A: Yeah, they need something to respond to. Gemini 2.5 and Claude Sonnet 3.7 and Mistral, whatever version they're on, you know, like all these different things. And so this feels like maybe OpenAI is losing their way just a little bit. I don't know what you think. That's. That's my take.
[00:26:37] Speaker C: Yeah, I mean, I think it's weird. It felt like they were moving, like they were trying to be, here's o3, now we built you o4, go use o4. You know, the same way that, you know, AWS does. Here's the...
I don't know, M4, move to the M5. And that's when I realized I've been on the cloud too long, because I'm pretty sure there's M7 and M8 out, but we're going to bypass that point. You know, they would make the next model cheaper, better, and try to get people to move. And now they're like, well, there's also this thing that we're going to update on this old one. And it feels like they're going backwards, like you said; they're doing an incremental update, not a massive update, which is where a lot of the other vendors are going. So it feels like they had a really good, and hopefully I'm wrong, jumpstart in the whole LLM AI craze, and they stopped innovating. And now they're having issues getting it out, or they thought they were going to be able to do something and they can't. But, like, they're kind of dead in the water with press announcements and producing new things, and they're like, what can we get out there to show that we're still one of the leaders in this area?
[00:27:44] Speaker A: Yeah, I mean, it definitely feels like things are not where they want them to be. And you had to wonder like, is this partially because they've had a brain drain of talent? You know, OpenAI engineers are getting a lot of money in the market and they're taking advantage of that. You know, things with Microsoft don't seem quite as great as they once were.
You know, there's a lot more pressure and competition on them and so there's more pressure to deliver. And so are they.
Or they're like, look, we're focusing on AGI as the next evolution, and so the ability to do these types of step-function iterations doesn't make sense for us. I don't know, it's sort of weird. I wouldn't count them out; they still have the first-mover advantage. But it definitely seems like there's something going on here in their roadmap, where maybe they've run into some stumbles they're not talking about, or there's just other things happening that we don't understand yet.
[00:28:37] Speaker C: Yeah, I mean, the other thing is, you know, we'll talk about that during the Azure section, of course, but, like, Azure has pretty much gone all in on OpenAI and the GPT models, and I'm curious. I know there was a deal, I don't remember the details of it, but at one point Microsoft's going to have to take a step back and be like, hey, we're only allowing these models in here; maybe it's time we allow Claude models and other ones in our OpenAI service so that we're able to expand. Because if they're not getting it, people are going to start looking at other clouds or other solutions in order to provide these services that they need.
[00:29:13] Speaker A: Well, this also goes back to, you know, they, they made a big announcement without Microsoft about building brand new data centers for AI training.
[00:29:20] Speaker C: Yep.
[00:29:21] Speaker A: It's possible that, you know, Azure doesn't have the capacity that they need, either, to make the next-level model. Maybe that's a problem. Or it's the amount of data that's available: they've sucked all the data they can into the model to make it as good as they can, and really, until you change the underlying transformer model to something different, there's really not a lot of huge leaps. You know, we'll probably know in the next year, because Gemini and Anthropic, you would assume, would hit similar walls if this is the case.
[00:29:53] Speaker C: Yeah, I mean, everybody read something at one point where, at this rate, the jump from 3 to 3.5 to 4 to 5.0 was going to use every piece of data that we had on Earth, plus consume more power than we generate in a year, if there wasn't any other improvement. So I think they're having to look back at the general setup of everything and kind of reevaluate. So maybe they're just dealing with, for lack of a better term, all the tech debt they've left themselves.
[00:30:26] Speaker A: Yeah, could be.
Well, in addition to the new models we just talked about, o3 and o4-mini, they're also releasing three new models only in the API: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano. These models outperform GPT-4o and GPT-4o mini across the board, which makes you wonder why you just released 4o and 4o mini, with major gains in coding and instruction following. They also have larger context windows, up to 1 million tokens of context, and are able to better use that context with improved long-context comprehension. They also feature a more recent knowledge cutoff of June 2024. Which... it's April 2025, is it not, Matt?
[00:31:04] Speaker C: Yep, 100%. I know that for many reasons.
[00:31:06] Speaker A: Yeah. So isn't that almost a year? That's like 10 months ago.
[00:31:12] Speaker C: And nothing has changed in the last 10 months.
[00:31:15] Speaker A: Nothing's happened. You know, we don't have a new president, we don't have a new Congress, you know, all those things that you might ask a system for. That's a little weird. GPT-4.1 apparently excels at coding: GPT-4.1 scores 54.6% on SWE-bench Verified, improving by 21.4% over GPT-4o and 26% over GPT-4.5, making it a leading model for coding. On instruction following, on Scale's MultiChallenge benchmark, a measure of instruction-following ability, GPT-4.1 scores 38.3%, a 10.5% increase over GPT-4o. And on long context, on Video-MME, a benchmark for multimodal long-context understanding, GPT-4.1 sets a new state-of-the-art result, scoring 72% on the long, no-subtitles category, a 6.7% improvement over GPT-4o. Note that GPT-4.1 will only be available via the API, and they will also begin deprecating GPT-4.5 preview in the API, as GPT-4.1 offers improved or similar performance in many key capabilities at much lower cost and latency. GPT-4.5 was not a good product, and they've now basically said, no, no, it should have been GPT-4.1. And now they've fixed the branding because they got called on the carpet, basically.
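Since GPT-4.1 is API-only, a quick sketch of calling it with the OpenAI Python SDK, on the theory that the 1-million-token window might finally fit a year of show notes; the file path and prompt are hypothetical.

    # Sketch: throw a whole year of show notes at GPT-4.1's long context.
    from openai import OpenAI

    client = OpenAI()
    notes = open("show-notes-2019.txt").read()  # hypothetical export of old notes

    resp = client.chat.completions.create(
        model="gpt-4.1",  # API-only; gpt-4.1-mini and gpt-4.1-nano also exist
        messages=[{"role": "user",
                   "content": "How many AWS stories are in these show notes?\n\n" + notes}],
    )
    print(resp.choices[0].message.content)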
[00:32:24] Speaker C: So one impressive. You were able to read all those stats that quickly? And I think I used to listen to the podcast on like 1.5x with the way you just read all that, with all that data, I don't think I could have ever listened that quickly. So bravo.
[00:32:37] Speaker A: Yeah, I might have been a little fast. Sorry, sorry listeners, but there's a lot to get through on that. I was like, okay, we gotta get through these metrics on these benchmarks that don't mean a lot. So, yeah, I mean also like looking.
[00:32:47] Speaker C: At the benchmarks, I still question, you know: cool, you're 20% over what? What is the GPT-4o number? What's, like, the accuracy? What's the value of that? I can say I'm 20%, you know, taller than yesterday, but what are we comparing it to? You know, and was it accurate? Was, you know, the average person from their vibe coding getting enough done, or were they, you know, still struggling? This 20% can be a massive impact, but what is that 20% actually providing?
[00:33:17] Speaker A: Yeah, again, I don't know. And, like, do you care at the levels of performance where that matters? Right? You know, it was 3.4 seconds, and now a 23% improvement is, you know, 2.8 seconds. Like, do I care?
[00:33:31] Speaker C: That's the other side of it too.
[00:33:33] Speaker A: Yeah. So these are always the thing with these metrics and percentages. You're like, well, I don't really know one way or another what that means, but it's interesting.
You know, again, it's interesting, just the branding. Like, we didn't like the 4.0 branding, the 3.0 branding; we thought it was silly when they did it, you know, the GPT 4.5, 4.1, 4o branding. I feel like they've backed themselves into a corner, and now it's just causing brand confusion, 100%. So I think OpenAI needs to take a hard look at that at some point and figure it out. Maybe you go the Google way: you know, there's multiple flavors of Gemini 2.5, there's the mini, there's the deep thinking, but you're enabling the features that are really the differentiators, not enabling different models. Even though under the hood it is completely a different Gemini model, you just don't see that, and it helps with the branding confusion, I think. And you see the same thing with Anthropic too. All right, let's move on to cloud tools. So, we have complained at least a couple of times on the show about how far behind we felt GitHub Copilot was, and the fact that they did not have an agentic coding experience to do vibe coding. But they do, finally: they have now announced that agent mode in VS Code is rolling out to all users, complete with MCP support that unlocks access to any context or capability you want. They're also releasing the open-source and local GitHub MCP server, giving you the ability to add GitHub functionality to any LLM tool that supports MCP. And, keeping their commitment to offer multi-model choice, they're making Anthropic Claude 3.5 Sonnet, 3.7 Sonnet, and 3.7 Sonnet Thinking, Google Gemini 2.0 Flash, and OpenAI o3-mini generally available via premium requests, included in all paid Copilot tiers. These premium requests are in addition to the unlimited requests for agent mode, context-driven chat, and code completions that all paid plans have when using the base model, and with the new Pro+ tier, individuals get the most out of the latest models with Copilot. In addition, they have announced the GA of the Copilot code review agent, plus the general availability of next edit suggestions, so you can tab, tab, tab your way to coding glory.
[00:35:29] Speaker C: I mean, I'm totally going to end up vibe coding a bunch of stuff, and I've already played with it and it works decently well. Did we talk about, like, the Pro versus Pro+ and all that stuff?
[00:35:41] Speaker A: We didn't really.
[00:35:43] Speaker C: I mean I did a little bit of research at one point. I didn't see the massive benefits besides access to other models when I looked, but I don't remember all the details.
[00:35:52] Speaker A: There's, like, not a major difference. I mean, Copilot Pro is $10 per user per month, or $100 per year, and then Copilot Pro+ is $39 per month, or $390 per year.
But then, if you're getting into, like, Copilot for Business or Copilot for Enterprise, I think they include some of the features of Copilot Pro+. So this is another one of those: like, if you're an individual coder, you should probably just buy Pro+, because you get 1,500 premium requests versus 300 per month.
But in Enterprise and Business, you get less than the 1,500, so it's a little weird. Definitely something to look at. But you also do pay by the drip for additional premium requests, at 4 cents a request, basically.
[00:36:35] Speaker C: Yeah. I still don't really understand how you would track the number of requests and everything else. Like, is a tab a request? Is two tabs two requests? So, like, definitely, I would be a little bit careful before you do it. I mean, I use the Pro in my day job and whatnot, and it works fine for me. I haven't needed the premium-access models, or I haven't found a full reason for that yet.
[00:36:59] Speaker A: Yeah, me neither. But I think I'm on an Enterprise plan at work, so that's what I'm playing with. I don't have a personal version, because I've been using Claude. So I'm kind of curious to play with it at work, and if I like it at work, I might buy a license on the Copilot side for my personal usage. Because right now I'm getting Copilot Free, which only gets me 50 premium requests. And now it does have agent mode, but you'll run out of agent mode real quickly at that level; my experience, at least with using Claude, is that doing simple tasks and things, you know, uses a lot of requests.
[00:37:30] Speaker C: Yeah, I still use it for a lot of general scripting, just like, hey, I need to pull this data for this report. And this is where I slowly realize I'm trying to follow in Justin's footsteps of becoming an executive, where I jokingly say people don't let me do anything fun during my day job anymore. But I'll just go and be like, hey, I need this data, and that's where I end up using more of these types of things: hey, go write me this thing that queries the API of Azure or some other tool, so that I can jumpstart the 100-line script or 200-line script. I'm not using it in a full software development setup.
[00:38:11] Speaker B: There are a lot of cloud cost management tools out there, but only Archera provides cloud commitment insurance. It sounds fancy, but it's really simple. Archera gives you the cost savings of a one- or three-year AWS savings plan with a commitment as short as 30 days. If you don't use all the cloud resources you've committed to, they will literally put the money back in your bank account to cover the difference. Other cost management tools may say they offer commitment insurance, but remember to ask: will you actually give me my money back? Archera will. Click the link in the show notes to check them out on the AWS Marketplace.
[00:38:50] Speaker A: Next up, the general availability of VPC Route Server, to simplify your dynamic routing between virtual appliances in your VPC. Route Server lets you advertise routing information through BGP from virtual appliances and dynamically update the VPC route tables associated with your subnets and internet gateway. Before this, you had to create a custom script or use virtual routers with overlay networks to dynamically update VPC route tables. This removes the operational overhead of creating and maintaining overlay networks or custom scripts, and offers a managed solution for dynamically updating routes in route tables. With VPC Route Server, you can deploy endpoints inside your VPC and peer your virtual appliances with them to advertise routes using BGP.
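A rough sketch of what that flow looks like with boto3, as we read the announcement: create the route server, associate it with the VPC, drop an endpoint in a subnet, BGP-peer it with an appliance, and enable propagation into a route table. The action and parameter names are assumptions to verify against current boto3 docs, and all IDs and ASNs are placeholders.

    import boto3

    ec2 = boto3.client("ec2")

    # 1. Create the route server with an Amazon-side ASN.
    rs = ec2.create_route_server(AmazonSideAsn=64512)
    rs_id = rs["RouteServer"]["RouteServerId"]

    # 2. Associate it with the VPC whose route tables it should manage.
    ec2.associate_route_server(RouteServerId=rs_id, VpcId="vpc-0123456789abcdef0")

    # 3. Deploy an endpoint inside a subnet.
    ep = ec2.create_route_server_endpoint(RouteServerId=rs_id,
                                          SubnetId="subnet-0123456789abcdef0")
    ep_id = ep["RouteServerEndpoint"]["RouteServerEndpointId"]

    # 4. BGP-peer the endpoint with the virtual appliance advertising routes.
    ec2.create_route_server_peer(RouteServerEndpointId=ep_id,
                                 PeerAddress="10.0.1.10",  # appliance IP in the VPC
                                 BgpOptions={"PeerAsn": 65000})

    # 5. Let the advertised routes propagate into a route table.
    ec2.enable_route_server_propagation(RouteServerId=rs_id,
                                        RouteTableId="rtb-0123456789abcdef0")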
[00:39:27] Speaker C: Definitely done a lot of hacky stuff over the years that this replaces. If I remember correctly, I feel like this was two weeks ago that this went GA. At this point, yeah, it is not a cheap service, and it's priced by endpoint. So if you're going to look at this, yeah, it will simplify a lot of your life, but just be careful of the pricing.
[00:39:53] Speaker A: Yeah, the party keeps rocking over in PartyRock, which now has an Image Playground that leverages the Amazon Nova Canvas foundation model to transform ideas into customizable images, which you can access directly through the Images section, featuring an intuitive interface and comprehensive customization options. And when they say intuitive interface and they say PartyRock in the same sentence, it means for a child. Yeah, I do keep hoping they're going to kill PartyRock now that they have nova.amazon.com. I'm surprised; I mean, I assume this was already in development by the time they came out with nova.amazon.com. But again, I'm still rooting for the retirement of PartyRock at some point in the future.
[00:40:30] Speaker C: Yeah, but then we don't have fun show titles.
[00:40:33] Speaker A: That's true. Very, very true.
Good news: if you use Amazon VPC peering and you've always been confused about the inter-AZ VPC peering usage in your bill, especially within the same AWS region, they're now introducing a new usage type in your bill. The bad news? It's effective immediately, meaning it will break, or show up in, your FinOps tooling in a new and exciting way this month, as your VPC peering now appears with no notice. Thanks, Amazon, for that one. I appreciate it.
[00:40:59] Speaker C: Yeah, but there's always this black hole of networking on your AWS bill, or really any cloud bill.
[00:41:06] Speaker A: The EC2-Other category is, like, the bane of my existence. I'm like, what is in this bucket?
[00:41:12] Speaker C: So I know we're trying to pick up the pace, but a long tangent real fast: there was definitely a point where I was working with a client that was, like, a very small AWS shop, and they were digging into their bill, trying to understand it. It went to EC2-Other, and I must have spent, like, an hour going back and forth with support about that, because I was just curious. I was like, what is in here? How do I break this out? How do we look at it? And I'm sure it was, like, $40, and it wasn't actually worth the time. But even support at that point, and this was seven or eight years ago, and now I just feel like I've been on the cloud too long, had no idea. They're like, it's just all the other stuff related to everything, and it's just this black hole. So anything they can do to give visibility around networking specifically, because they've never reduced the price of it, is definitely going to be useful. I'm sure your FinOps person will be on everyone shortly when they get this, and everyone listening to this will have to be dealing with the new questions they get: hey, what's this charge, and why have I never seen it before? Where is it?
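If the new usage type just ambushed your bill, a hedged sketch of hunting for it with Cost Explorer: group EC2 charges by usage type and eyeball anything new. The dates are placeholders, and since the announcement doesn't name the exact usage-type string, the substring filter below is a guess.

    import boto3

    ce = boto3.client("ce")

    resp = ce.get_cost_and_usage(
        TimePeriod={"Start": "2025-04-01", "End": "2025-05-01"},  # placeholder month
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "USAGE_TYPE"}],
        Filter={"Dimensions": {"Key": "SERVICE",
                               "Values": ["Amazon Elastic Compute Cloud - Compute"]}},
    )

    # Print anything that smells like data transfer or peering charges.
    for group in resp["ResultsByTime"][0]["Groups"]:
        usage_type = group["Keys"][0]
        cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
        if "DataTransfer" in usage_type or "Peering" in usage_type:
            print(f"{usage_type}: ${cost:.2f}")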
[00:42:23] Speaker A: Why is this service costing so much money to talk to that service? Yeah, right. Can't wait.
[00:42:27] Speaker C: Good luck.
[00:42:28] Speaker A: Yeah, sorry, FinOps folks. It's going to be a bummer.
All right, well: announcing AWS Security Reference Architecture code examples for generative AI. This is for all those lovely security practitioners out there doing good work, like Ryan. AWS is giving you new Security Reference Architecture code examples for securing your generative AI workloads. These examples include two capabilities, focused on secure model inference and RAG implementations. The new code examples are available in the AWS SRA examples repository and include ready-to-deploy CloudFormation templates to assist application developers in getting started with network segmentation, identity management, encryption, prompt injection detection, and logging and monitoring. I always love a good example.
[00:43:07] Speaker C: I always love some good CloudFormation. Nothing can go wrong with that.
[00:43:11] Speaker A: Yeah, I definitely was curious, you know, how they were deploying some of the things like guardrails and some other things, which have some good examples. So if you are confused about some of the things, this was a pretty good repo to poke around at.
[00:43:25] Speaker C: Yeah, I'd be interested to see if this ends up falling into AWS Config as rules you can kind of just turn on in the future. I don't think they're ready to GA that, but it wouldn't surprise me.
[00:43:38] Speaker A: Yeah, I would think this is a lead-in to getting it added to the Well-Architected Framework, getting it added into Config, all those types of areas of focus.
[00:43:46] Speaker C: This is where, if I'm smart, I would take notes somewhere and say this is a re:Invent announcement. But I know myself. I'll start notes; I think I started one last time we had a really good idea, and I don't know where that note is, and it's only been one week.
[00:43:59] Speaker A: I mean, would you really think this was something they would mainstage? I don't know.
[00:44:03] Speaker C: Yeah.
[00:44:04] Speaker A: "Oh, we added a framework." I mean, back in 2017, I guess, they did the Well-Architected Framework on stage; that would have been a thing. Nowadays, you know, it's AI-related, but it would be a brief mention on stage at best.
[00:44:17] Speaker C: Yeah, it wouldn't be a full segment, but I could definitely see it being... maybe not mainstage. Monday's, what, hardware; Tuesday is mainstage. Maybe in the AI keynote, because they still have that.
[00:44:31] Speaker A: Yeah, that's true. A keynote might do it. The ML one. Yeah.
[00:44:34] Speaker C: Not that that would help me with any sort of predictions.
[00:44:38] Speaker A: You never know, it might. Well, if you wrote it down... but you won't write it down, and I'll remember it.
[00:44:42] Speaker C: No, I'll write it down. I'm just going to not know where I put it. So that's not useful.
[00:44:46] Speaker A: Maybe like the title of your Google Doc should be Prediction Show. Don't forget this Matt.
[00:44:51] Speaker C: Like, I need Google to, like, automatically notify me. Maybe I'll write a Lambda that will notify me around the time of the event.
[00:44:57] Speaker A: Yeah, put a reminder in the Slack that you have a Google Doc for this beforehand.
[00:45:01] Speaker C: There we go. That's what I'll do. I'll put it like November 1st and make slack notify me.
[00:45:05] Speaker A: Exactly. That's what you should do. So, helping you out. All right: introducing Amazon Nova Sonic. I love this combination name; this is such an Amazon name if I've ever seen one. This is a new gen AI model for building voice applications and agents. The new model, Amazon Nova Sonic, is a foundation model that unifies speech understanding and speech generation into a single model to enable more human-like voice conversations in AI applications. I don't know what they might want human-like voice conversations for. Perhaps Alexa.
Available in Bedrock via a new bidirectional streaming API, the model simplifies the development of voice applications such as customer service call automation and AI agents across a broad range of industries, including travel, education, healthcare, entertainment, and more. Nova Sonic solves challenges with traditional approaches to building voice-enabled applications, which require complex orchestration of multiple models; the unified model architecture of Nova delivers speech understanding and generation without requiring a separate model for each of the steps. I have a quote here from Rohit Prasad, SVP of Amazon Artificial General Intelligence: "From the invention of the world's best personal AI system with Alexa to developing AWS services like Connect, Lex, and Polly that are used across a wide range of industries, Amazon has long believed that voice-powered applications can make all of our customers' lives better and easier. With Amazon Nova Sonic, we are releasing a new foundation model in Amazon Bedrock that makes it simpler for developers to build voice-powered applications that can complete tasks for customers with higher accuracy while being more natural and engaging."
[00:46:29] Speaker C: I mean, it's straight up saying Connect, Lex, AWS... sorry, Alexa. Alexa. You know, it's 100% where they're going with all this, which is, how do we make these things not sound like robots? And it's getting creepily good, to the point where we're just gonna tell the podcast to do it for us, and we'll never know. You guys will never know if Justin, or sorry, Jonathan, ever actually makes it back, because he could be a bot at that point.
[00:46:55] Speaker A: Yeah, you never know. Which reminds me of a story I didn't cover this week: apparently, saying please and thank you to ChatGPT is costing them tens of millions of dollars.
[00:47:04] Speaker C: Yep.
[00:47:05] Speaker A: And, you know, apparently it's costing the firm substantial sums in electricity expenses, as it has to process both the please and the thank you. I'm like, why wouldn't you just strip that out if it's an unnecessary word? Yeah, I was like, I get it, but just strip it out.
I don't know if it's.
[00:47:24] Speaker C: And normally I've done it and it's. For me, it's like its own prompt.
[00:47:29] Speaker A: Right.
[00:47:29] Speaker C: So it almost, like, has to respond back to it, which is the thing: because it always has to have the last word. It's AI.
[00:47:37] Speaker A: Yeah, that's a weird, weird wrinkle.
If you're excited about Nova Sonic, I've also got Nova Reel 1.1 for you. Reel, of course, is their model for creating computer-generated video shots. Nova Reel 1.1 provides quality and latency improvements in 6-second single-shot video generation. Compared to Nova Reel 1.0, the update generates multi-shot videos up to 2 minutes long, maintaining a consistent style across each shot. And you can provide a single prompt for an up-to-two-minute video composed of six-second shots, or design each shot individually with custom prompts. So I guess they saw The Wizard of Oz and they were like, ooh, we should probably update Nova Reel to do more cool things.
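A sketch of kicking off a Nova Reel 1.1 multi-shot job through Bedrock's asynchronous invoke API; the modelInput schema here is a best reading of the Nova Reel docs and may differ, and the model ID, prompt, and S3 bucket are placeholders.

    import boto3

    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

    job = bedrock.start_async_invoke(
        modelId="amazon.nova-reel-v1:1",  # assumed ID; check the model catalog
        modelInput={
            "taskType": "MULTI_SHOT_AUTOMATED",  # one prompt, many six-second shots
            "multiShotAutomatedParams": {
                "text": "A two-minute tour of a rainy data center, consistent style",
            },
            "videoGenerationConfig": {"durationSeconds": 120, "fps": 24,
                                      "dimension": "1280x720"},
        },
        outputDataConfig={"s3OutputDataConfig": {"s3Uri": "s3://my-video-bucket/reels/"}},
    )

    # Video generation runs asynchronously and lands in S3 when done.
    print(job["invocationArn"])  # pass to bedrock.get_async_invoke() for status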
[00:48:15] Speaker C: That's what I was thinking. And they probably only had to generate six seconds to show what they showed you guys in the keynote presentation.
[00:48:24] Speaker A: It was longer than six seconds. But yeah, I had to wonder if the Sphere approached Amazon for doing that work, and if they decided they couldn't do it, or if Amazon decided, we're too busy. Because, I imagine, you know, they talked about in the interview that there were only, like, eight companies they reached out to that they thought could even possibly do it. And then they shortlisted that down to, like, three or four companies, and it was very clear that Google was the only one who was interested in partnering to do it as a scientific, you know, theory project.
[00:48:57] Speaker C: I think Google probably has the most R&D, like, cool things like that that they do, because they have enough other, you know, capital. It wouldn't surprise me also if the timing of it worked out well, where they wanted to do it with Google Next from a marketing storyline perspective. Whereas if AWS did it, they don't have a big conference, unless they did it at a summit, you know; all their other conferences aren't really relevant.
[00:49:25] Speaker A: Well, I mean, they're shipping this in August; re:Invent's not till late November. But I mean, again, I assume it's also, like, Amazon might have been able to do it, but then there was a question of, like, could they do it for free, or for a reduced price, or whatever Google did to do it. I mean, I don't know if there were any financial terms to that deal. But I imagine, you know, there was something; Google wanted the marketing piece more than Amazon did, because Amazon is all about, yeah, yeah, we'll do stuff with you, but you're going to pay for it.
[00:49:52] Speaker C: Right? Where Google is like, let's get the story out there and you know, it's a good marketing piece that they'll have forever.
[00:49:59] Speaker A: Exactly.
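For anyone who wants to poke at Nova Reel themselves, generation runs as an asynchronous job through Bedrock. Here's a minimal Python sketch, assuming the multi-shot task type and payload shape described in the Bedrock docs around this episode; the bucket name is made up, and the schema is worth double-checking before you burn your budget the way Justin did:

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Nova Reel runs as an async job; the payload shape below is a reading of
# the Bedrock docs and may drift, so treat it as an assumption.
job = client.start_async_invoke(
    modelId="amazon.nova-reel-v1:1",
    modelInput={
        "taskType": "MULTI_SHOT_AUTOMATED",  # one prompt, many six-second shots
        "multiShotAutomatedParams": {
            "text": "A storm rolling over a desert at sunset, consistent cinematic style"
        },
        "videoGenerationConfig": {
            "durationSeconds": 120,  # up to two minutes, in six-second shots
            "fps": 24,
            "dimension": "1280x720",
        },
    },
    outputDataConfig={
        # Hypothetical bucket; the finished video lands in S3.
        "s3OutputDataConfig": {"s3Uri": "s3://my-video-bucket/nova-reel/"}
    },
)

# Poll until the job finishes and the video shows up in the bucket.
status = client.get_async_invoke(invocationArn=job["invocationArn"])
print(status["status"])
```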
Last year Amazon released Bedrock Guardrails to standardize protections across generative AI applications, bridging the gap between native model protections and enterprise requirements and streamlining governance processes. Today they're announcing several new capabilities for Guardrails, including multimodal toxicity detection with industry-leading image and text protection, enhanced privacy protection for PII detection in user inputs, mandatory guardrail enforcement with IAM, and optimized performance while maintaining protection with selective guardrail policy application.
[00:50:28] Speaker C: I mean, I like that they're making it mandatory, and this kind of goes back to some of the other stuff, which is, you know, security and compliance are really starting to look at AI. I deal with our security and compliance a lot, as I help oversee them in my day job, and you're starting to have AI policies specific to everything, AI committees, AI everything. Having this forcibly set checks that box for a lot of those stories, and I think that's why we're going to see more and more of these out there across all the clouds.
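For a sense of what wiring a guardrail into a call looks like, here's a minimal sketch using the Bedrock Converse API; the guardrail ID, version, and model ID are placeholders for whatever exists in your account. The mandatory-enforcement piece is separate: an IAM policy condition on the bedrock:GuardrailIdentifier key can deny invocations that don't attach an approved guardrail.

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Attach a guardrail to a Converse call; the IDs below are placeholders.
response = client.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": [{"text": "Summarize this customer ticket."}]}],
    guardrailConfig={
        "guardrailIdentifier": "gr-1234abcd",  # hypothetical guardrail ID
        "guardrailVersion": "1",
        "trace": "enabled",  # shows which policy fired, handy compliance evidence
    },
)
print(response["output"]["message"]["content"][0]["text"])
```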
[00:51:03] Speaker A: Yep. Amazon is announcing a price cut. I remember we used to announce these all the time. They're fun. Now they rarely come, but you get one occasionally. And this one is specifically for S3 Express One Zone, which is a high-performance single-AZ storage class built to deliver consistent single-digit-millisecond data access for your most frequently accessed data and latency-sensitive applications. Amazon is announcing a 31% reduction in storage costs, a 55% reduction in PUT request costs, and an 85% reduction in GET request costs. In addition, S3 Express One Zone has reduced the per-gigabyte charges for data uploads and retrievals by 60%, and those charges now apply to all bytes transferred rather than just the portion after the first 512 kilobytes. A little bit of a price increase on that last one, with the first 512 kilobytes no longer being free, but I think net net, you're still going to save money on this one. So it's a net positive.
[00:51:52] Speaker C: Yeah, I think, you know, probably more people are leveraging this with AI and, you know, generating their own RAG models and stuff like that. So they probably got enough people into it that they figured out where the kinks were, and now they're able to drop the price down.
[00:52:05] Speaker A: Exactly. And I think, you know, for S3 Express One Zone type use cases, you're definitely going to be talking about training models or doing, you know, things like this, where data is put in a position where it can be quickly accessed by a model to train it.
[00:52:19] Speaker C: Makes sense, and you don't care if it's lost. I mean, I love it, and I still use S3 One Zone-IA for just cheaper storage too. So they kind of just built on that but gave you performance out of it.
[00:52:32] Speaker A: Yeah, I mean, I would even use One Zone for Glacier if they made it available to me. Because, you know, I have a backup of my NAS that goes to S3, and right now it goes straight to Glacier, which is fine. But I don't need all the redundancy of Glacier, because again, it's just a backup of my backup. So, a 3-1-1 policy. But yeah, this is a good feature for those who can take advantage of it, and a nice price cut to help drive more adoption. Always a win.
[00:52:58] Speaker C: I'm more upset that you said 3-1-1 and not 3-2-1.
[00:53:02] Speaker A: Whatever it is. I don't know, is it 3-2-1 or 3-1-1?
[00:53:05] Speaker C: You've been at this sysadmin stuff for too long, Justin.
[00:53:07] Speaker A: Yeah, I know.
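One nice thing about the S3 Express One Zone price cut is that nothing changes in code: directory buckets use the same PUT and GET calls, so the cheaper requests show up on the bill automatically. A rough sketch, assuming the directory-bucket naming convention and a made-up bucket name:

```python
import boto3

s3 = boto3.client("s3", region_name="us-west-2")

# Directory buckets carry a mandatory availability-zone suffix; this name
# is a made-up example following that convention.
bucket = "training-shards--usw2-az1--x-s3"

# Same API calls as regular S3, so the PUT/GET price cuts apply as-is.
s3.put_object(Bucket=bucket, Key="shard-0001.bin", Body=b"example bytes")
obj = s3.get_object(Bucket=bucket, Key="shard-0001.bin")
print(obj["ContentLength"])
```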
AWS STS now automatically serves all requests to the global endpoint in the same AWS region as your deployed workload, for enhanced resiliency and performance. Previously they were all served out of us-east-1. We warned you about this in a previous episode. We talked about it at length. I'll try to get a link back to the show notes for that one, but it's now live. So if your stuff broke, you now know why.
[00:53:30] Speaker C: Still don't really expect this to break. Unless if.
[00:53:32] Speaker A: Yeah, I would not expect this to break either.
[00:53:34] Speaker C: I don't really know how this would break your product, you know, but I.
[00:53:37] Speaker A: Mean if you're using. I can see if you're using an outbound NAT gateway with filtering where you are only you're using IP addresses which you shouldn't do anyways because you're in AWS and it's all changing.
[00:53:47] Speaker C: So you're already wrong.
[00:53:48] Speaker A: You're already wrong. And then maybe you hardcoded the us-east-1 endpoint. No, no, it's got to be just IPs. If you're using IP addresses, which was wrong, then yeah, that's how you screwed yourself.
[00:53:59] Speaker C: And you probably broke yourself multiple times. So if you were running an old-school firewall that didn't understand DNS, or you wanted to be able to blame DNS in the future, that's how you broke it.
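If you'd rather not rely on default endpoint behavior at all, the SDKs have let you pin STS to a regional endpoint for years, either with AWS_STS_REGIONAL_ENDPOINTS=regional in the environment or explicitly, as in this small boto3 sketch (the region choice is just an example):

```python
import boto3

# Pin STS to a specific regional endpoint instead of the legacy global one,
# so there's no surprise about where the call terminates.
sts = boto3.client(
    "sts",
    region_name="us-west-2",
    endpoint_url="https://sts.us-west-2.amazonaws.com",
)
print(sts.get_caller_identity()["Account"])
```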
[00:54:10] Speaker A: You'd be shocked how many companies run firewalls that don't understand DNS, or have security policies that don't allow them to use DNS even if the firewall supports it. Which is like.
[00:54:17] Speaker C: Why I have a lot of scar tissue at my day job and I've had a lot of arguments with security departments who tell me that they can't do it and that it's insecure and that I just. My brain dies a little bit.
[00:54:29] Speaker A: I mean if you have DNSSEC enabled on your domain and then, you know. Yes, if I go out of business and someone takes my DNS zone and then publishes bad stuff, you're screwed anyways.
You should already have closed those things off when the company went out of business. So I don't know. But yeah, I think with DNSSEC it eliminates the man-in-the-middle concern you might have, and I think that solves the one security complaint I can think of. So if you're running DNSSEC, I don't know why allowing DNS-based rules is a problem.
[00:54:59] Speaker C: Yeah, I think DNSSEC just got added to one of the clouds recently. I know AWS had it. I feel like AWS or Azure might have just added it, or expanded on it.
[00:55:07] Speaker A: I'll have to look it up maybe. Yeah, I mean, there's some interesting things if you want to have multiple DNS providers and run DNSSEC. There's a lot of sharp edges there. That's one of the reasons why we're reducing the number of DNS providers we have, so we can actually enable DNSSEC, because we think that's more important than having the redundancy of Amazon and NS1. And honestly, Amazon's reliability on Route 53 has been pretty darn good. Knock on wood.
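For the Route 53 case, turning on DNSSEC signing is mercifully short: register a key-signing key backed by KMS, then enable signing on the zone. A sketch with placeholder IDs; if memory serves, the KMS key has to be an asymmetric ECC key in us-east-1, and you still have to publish the DS record at your registrar afterward:

```python
import uuid
import boto3

route53 = boto3.client("route53")

ZONE_ID = "Z0123456789ABCDEF"  # hypothetical hosted zone

# Step 1: register a key-signing key backed by a KMS key (placeholder ARN).
route53.create_key_signing_key(
    CallerReference=str(uuid.uuid4()),
    HostedZoneId=ZONE_ID,
    KeyManagementServiceArn="arn:aws:kms:us-east-1:111122223333:key/abcd-ef01-2345",
    Name="cloudpod-ksk",
    Status="ACTIVE",
)

# Step 2: turn on DNSSEC signing for the zone.
route53.enable_hosted_zone_dnssec(HostedZoneId=ZONE_ID)
# Then publish the DS record at the registrar to complete the chain of trust.
```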
[00:55:33] Speaker C: I mean, if all of AWS is down, it's taking you down with it.
[00:55:36] Speaker A: Yeah, I mean, it's such a huge failure domain at this point, and it's got a lot of redundancy in it, that I'm not too terribly concerned about it. Now, if it was, you know, my own BIND servers, you bet I'd want multiple.
[00:55:46] Speaker C: Come on, what could go wrong? It's just BIND on an ECS container. What could go wrong?
[00:55:51] Speaker A: What could go wrong? Yeah, no problem.
[00:55:53] Speaker C: On a spot instance. It's reliable.
[00:55:55] Speaker A: Yeah, that's the perfect place for DNS, spot instances. That's the place for your AD controllers too, on spot, 100%. Yeah, yeah.
[00:56:02] Speaker C: I run my Windows servers on spot instance.
[00:56:05] Speaker A: Yeah, perfect.
[00:56:07] Speaker C: I got told AD was too expensive, so I decided to go to spot. What could go wrong?
[00:56:11] Speaker A: Yeah, what could go wrong?
All right. GCP. We don't have a lot of stories for them because they had Google Next, so these are just a couple things that came up. Spring cleaning with FinOps Hub: they have some new FinOps capabilities this week. FinOps Hub 2.0, released at Next, comes with new utilization insights to zero in on optimization opportunities. The FinOps Hub 2.0 capabilities focus exclusively on bringing utilization insights on your resources to the forefront, so you can see what potential waste may exist and take action immediately. And they say waste can come in many forms: from a VM that is barely getting used because it's over-provisioned, to a GKE cluster that's actually running hot because it's under-provisioned, to managed resources like Cloud Run instances that may not be optimally configured. So it gives you insights into all of those. Gemini Cloud Assist supercharges FinOps Hub to summarize optimization insights and send opportunities to engineering teams. This tool helps create personalized cost reports and synthesize insights, which has resulted in more than 100,000 FinOps hours saved by Google customers. And the final item, eliminating waste: a new IAM role permission for your tech solution owners to see and directly take action on these optimizations, via the new Project Billing Costs Manager role. Which is for all your managers who now have a budget they need to maintain, but you don't want them to have access to the rest of the infrastructure because they're not trustworthy.
This is something Ryan would do to me, I'm pretty sure. And to you a little bit, yeah. Just to mess with me. Totally.
[00:57:34] Speaker C: Yeah.
No, I mean, I think it's always good to add these things. You know, even in the most clean environment, there's always something else. There's a hard drive from a server that you deleted that you forgot to clean up. There's all these small things that just sit there and add costs, and honestly, it's the way the cloud providers make money, these things sitting there. So it's nice that they're slowly adding all these little pieces, and hopefully it keeps your environment clean and saves you some money along the way.
[00:58:09] Speaker A: If you're excited about Gemini 2.5 Flash, it's now available to you in preview via the Gemini API, via both Google AI Studio and Vertex AI.
[00:58:19] Speaker C: Yay.
[00:58:20] Speaker A: Yay.
[00:58:21] Speaker C: Sorry.
[00:58:22] Speaker A: No worries. Then our last Google story: Memorystore for Valkey is now generally available. Someone missed the deadline for Next, apparently. A significant step forward for open-source in-memory data management. Google is offering a four-nines availability SLA, along with features such as Private Service Connect, multi-VPC access, cross-region replication and persistence, and many more. In addition to that Private Service Connect I just mentioned, it gives you zero-downtime scaling, integrated Google-built vector similarity search, and managed backups, as well as that cross-region replication.
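Since Valkey keeps the Redis wire protocol, trying the GA service is about as boring as it gets: point any Redis client at the endpoint Memorystore gives you. A tiny sketch with a made-up private endpoint:

```python
import redis  # Valkey speaks the Redis protocol, so redis-py works unchanged

# The endpoint is whatever Memorystore hands you over Private Service
# Connect; this address is made up.
client = redis.Redis(host="10.0.0.12", port=6379, decode_responses=True)

client.set("episode", "301")
print(client.get("episode"))  # -> "301"
```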
[00:58:50] Speaker C: Every time I read one of these stories, I just feel like it's getting shoved in my face that Azure backed Redis and will probably never get Valkey, and I'm just mad at everyone.
[00:59:05] Speaker A: That's a true statement if I've ever seen it.
[00:59:08] Speaker C: I just can't do anything about it. I also thought this was GA, because I think AWS has been GA for a while, so I was surprised to see here that it wasn't GA yet.
[00:59:17] Speaker A: Yep.
All right, we're on the home stretch. We're down to Azure and Oracle, so we're almost there, almost there. We'll finish up with Oracle, which is a great story, lots of humor in that last one. So we finish on a high note.
[00:59:31] Speaker C: That's why we keep Oracle around.
[00:59:33] Speaker A: Yeah, that's why we keep it there at the last part of the show.
So Azure is releasing several new capabilities into AI Foundry this week. First of all is the general availability of the Agent Framework, an extension of Azure AI Foundry's open-source kit Semantic Kernel, specifically designed to simplify the orchestration of multi-agent systems. The Agent Framework makes it easier for agents to coordinate and dramatically reduces the code developers need to write. Organizations like KPMG are using Semantic Kernel to orchestrate workflows among specialized agents, dramatically reducing development complexity. And at the core of Azure AI Foundry is a feedback system and advanced observability platform that gives developers visibility into agent behavior and outcomes. They're also giving you a new AI Red Teaming Agent in preview. It's an agent that systematically probes AI models to uncover safety risks, integrating Azure AI Foundry's robust evaluation systems with Microsoft Security's PyRIT framework. The agent generates comprehensive reports tracking improvements over time, creating an AI safety testing ecosystem that evolves alongside your system. And finally, they're releasing the Azure AI Foundry extension for VS Code in preview as well. Developers can now build, test, and deploy agent-based applications entirely within their IDE. No context switching required.
[01:00:40] Speaker C: These all are great things. I like the concept of the AI red teaming agent. It's kind of cool. I don't know how to use it yet or how I want to use it, but being able to actually kind of pen test the AI models and see where they can go wrong from a safety perspective, and how you get around the guardrails, is definitely a big thing. So I think it's definitely going to add a lot to the ecosystem. I'm waiting to see other providers add these types of features.
Agreed.
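For a flavor of what the Semantic Kernel side looks like, here's a single-agent sketch in Python. The class names come from recent Semantic Kernel releases and the APIs have been moving quickly, so treat the exact imports and signatures as assumptions; AzureChatCompletion reads its endpoint, key, and deployment from environment variables.

```python
import asyncio

from semantic_kernel.agents import ChatCompletionAgent
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

# One agent; the GA Agent Framework layers multi-agent orchestration on
# top of building blocks like this.
agent = ChatCompletionAgent(
    service=AzureChatCompletion(),  # config comes from AZURE_OPENAI_* env vars
    name="Reviewer",
    instructions="You review cloud architecture decisions and flag risks.",
)

async def main() -> None:
    # get_response signature is from recent SK releases; verify per version.
    response = await agent.get_response(messages="Should we hardcode STS to us-east-1?")
    print(response.content)

asyncio.run(main())
```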
[01:01:15] Speaker A: All right, next up, Azure Storage drops some new capabilities at KubeCon Europe. New updates include an update to BlobFuse2: the 2.4.1 version allows you to access Blob Storage via the Container Storage Interface driver, providing a seamless way to store and retrieve data at scale. It speeds up your model training and inference, simplifies your data preprocessing, and ensures your data integrity at scale with parallel access to massive datasets.
[01:01:39] Speaker C: I love how a storage driver update is all about AI.
[01:01:45] Speaker A: Yeah. And it's the only thing they talked about from Kubecon. I'm like, that's. That's the best you could come up with? Like where's AKS in this story?
[01:01:53] Speaker C: Like, nothing else on AKS, just, you have this new driver and it can help with AI.
[01:01:58] Speaker A: Yeah.
[01:01:59] Speaker C: Bravo. Whoever wrote this on getting AI into the story for the driver. That's all I have to say.
[01:02:04] Speaker A: Yeah. Very well done.
Llama 4 models are now available in Azure AI Foundry and Azure Databricks. These models include the Llama 4 Scout 17B-16E, Llama 4 Scout 17B-16E Instruct, and the Llama 4 Maverick 17B-128E Instruct FP8. What all that meant, I don't know. Llama 4.
[01:02:25] Speaker C: The Llama 4 herd, Justin, come on.
[01:02:28] Speaker A: See, Llama 4 had several architectural innovations, including an early fusion multimodal transformer as well as cutting-edge mixture-of-experts architectures, all in the Llama 4 herd.
[01:02:38] Speaker C: Jonathan, translate this to English please.
[01:02:41] Speaker A: Basically it's an open source model that you can now use. I know they're into those, because they're not liking OpenAI as much.
[01:02:47] Speaker C: But what is an early fusion multimodal transformer?
[01:02:50] Speaker A: I don't.
[01:02:51] Speaker C: That's why I want Jonathan.
[01:02:52] Speaker A: Yeah, that's true.
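Since Jonathan isn't here to translate: early fusion means the text and image tokens go into one shared transformer from the first layer, rather than bolting separate encoders together afterward, and mixture of experts means a small router picks a couple of specialist sub-networks per token, so most of the model's weights sit idle on any given token. A toy illustration of the routing idea, nothing like the real Llama 4 internals:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mixture-of-experts layer: a router scores 8 small "experts" per token
# and only the top 2 do any work, so compute per token stays small even
# when total parameters are huge.
n_experts, d_model, top_k = 8, 16, 2
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
router = rng.normal(size=(d_model, n_experts))

def moe_layer(token: np.ndarray) -> np.ndarray:
    scores = token @ router                    # router logits, one per expert
    top = np.argsort(scores)[-top_k:]          # indices of the winning experts
    gate = np.exp(scores[top]) / np.exp(scores[top]).sum()  # softmax over winners
    # Only the chosen experts run; the other six contribute nothing.
    return sum(g * (token @ experts[i]) for g, i in zip(gate, top))

print(moe_layer(rng.normal(size=d_model)).shape)  # (16,)
```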
What is a herd of llamas called? Is it really just a herd?
[01:02:58] Speaker C: Feels like there has to be a better name.
[01:02:59] Speaker A: Yeah, like most of these things have, you know, a special name, like a pack, or kids, or whatever. Let's see.
[01:03:05] Speaker C: A herd of llamas is simply called a herd, according to Search Labs AI Overview on Google.
[01:03:11] Speaker A: Well, that's kind of disappointing.
[01:03:13] Speaker C: Well, apparently a group of llamas is also known as a cria herd. C-R-I-A. It's the technical name, but most people simply refer to it as a herd.
[01:03:25] Speaker A: I mean, apparently all kids right now have that llama haircut, so maybe they're just a pack of teenagers.
All right, moving on. Azure has announced that the GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano models are available to you via the Microsoft Azure OpenAI Service and GitHub Copilot, with supervised fine-tuning coming within the next week or two. They also announced the o3 and o4-mini models, because they also got the brand confusion, and those are also available to you in Azure AI Foundry as well as GitHub.
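Using the new models from code is the standard Azure OpenAI call; the only wrinkle is that "model" here is your deployment name, so "gpt-4.1" below is an assumption about what you called it when you deployed it:

```python
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",
)

# "model" is the deployment name you chose, not necessarily the model ID.
resp = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Summarize episode 301 in one line."}],
)
print(resp.choices[0].message.content)
```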
[01:03:56] Speaker C: Yeah, but as someone that has to deal with multiple regions and everything else with this, I question when they say it's available. I'm like, okay, so it's in East US 2 and one other place. Cool.
[01:04:07] Speaker A: Probably Europe for GDPR.
[01:04:09] Speaker C: Yeah. Actually, the o3-mini model is available in every region.
That's impressive. I did not see that coming.
[01:04:18] Speaker A: That's good.
[01:04:19] Speaker C: So yeah. But the o3-mini. Yeah, the o3-mini is available in pretty much every region, because o3-mini uses a lot less power than 4.0. Otherwise it's pretty much East US 2, and Sweden actually is their other large region. I've noticed for a lot of their stuff it's always available there, and that's their GDPR region.
[01:04:40] Speaker A: Interesting.
[01:04:41] Speaker C: Yeah.
[01:04:43] Speaker A: Azure announced the general availability of Copilot in Azure SQL Database. They're excited to share that Copilot in Azure includes a capability for Azure SQL Database that helps streamline database operations, troubleshoot issues, and optimize your performance, all designed to enhance productivity and simplify your complex tasks. Some of the AI capabilities include intelligent troubleshooting to provide guidance for common SQL errors, like the dreaded 40613 error; identifying and resolving issues related to scaling and replication, and assisting with login problems; helping you with performance issues, including helping you determine if the database is reaching a storage capacity limit or hitting its IO limits, and analyzing database connection timeouts and providing recommendations to optimize your connection settings; configuration guidance on selecting appropriate tiers for your database, where always the more expensive tier is recommended; clear directions for creating and using correct connection strings to ensure seamless connectivity; as well as troubleshooting issues with transparent data encryption, providing insights on replication issues, and offering solutions to secure and sync your data across geo-secondaries. So I'm glad to see AI coming to Azure SQL. I'm more excited about them bringing AI to SQL Server Management Studio, which will be much more interesting to me long term. It's the SQL Management.
[01:05:52] Speaker C: Studio, that's the big piece. I've tried the Copilot integration in Azure when I had an error that was literally called "internal server error." When I asked how do you fix it, it was like, contact Microsoft support. I was like, I don't want to do that. Please don't recommend that.
[01:06:09] Speaker A: Don't make me do that.
[01:06:10] Speaker C: Don't make me do that.
[01:06:11] Speaker A: Not like this, not like this.
[01:06:12] Speaker C: Yeah, you don't get very far unless you open a Sev A; otherwise it isn't worth it. So please, please don't make me do this. But yeah, they keep adding to it. Hopefully it will help, and, you know, hopefully streamline a lot of the tier-one support requests that they get, so their outsourced support is able to help with medium to real issues. Or they can just bypass all the fun-filled outsourced support, for sure.
[01:06:41] Speaker A: All right. Announcing the public preview of the new Hybrid Connection Manager, a powerful tool designed to enhance connectivity and streamline the management of hybrid connections. Key features in the preview include cross-platform compatibility supporting both Windows and Linux clients, allowing for seamless management of hybrid connections across platforms, plus an enhanced UI and improved visibility. And for those who are not familiar, which was me, because I did not know what the hell this thing was and had to go Google it: it's basically a zero-trust networking solution. Hybrid Connection Manager is a relay service that enables Azure App Services to securely communicate with resources in other networks, particularly on-premise systems, without requiring complex networking configurations like VPN or ExpressRoute, which is very ZTN. So I don't know why they don't just call this Hybrid ZTN Manager, because then people would know what they're talking about, but that's Azure's model and style.
[01:07:26] Speaker C: To be fair, it's all the clouds' model and style. I feel like, if we obscure the name enough, maybe we don't have to figure it out.
[01:07:35] Speaker A: Exactly.
[01:07:36] Speaker C: And no one's going to use it.
[01:07:38] Speaker A: Microsoft has a cool one-bit AI model story, coming from Ars Technica. Future computing may not need supercomputers, thanks to models like BitNet b1.58 2B4T, which is going to need a rebrand. Most traditional AI models rely on full-precision weights, which are 32-bit or 16-bit floating-point numbers. These weights require significant memory and computational resources, and each weight can represent millions of possible values for precise calculations. Typically, this math and these floating-point operations require high-powered GPUs consuming a lot of electricity; in fact, it's estimated right now that AI GPUs are consuming 4% of the global electricity supply today. The new BitNet approach from Microsoft that powers the b1.58 2B4T model uses extreme compression, reducing weights to just three possible values: negative 1, 0, or plus 1. This 1.58-bit approach dramatically reduces memory needs, and the model requires only simple addition operations instead of complex multiplications. Using a custom framework called bitnet.cpp, these models are designed to run efficiently on CPUs rather than GPUs, requiring only 400 megabytes of memory versus the 1.4 gigabytes of comparable models. This might be a great use case for edge computing, mobile devices, embedded systems, et cetera, in the future. I don't know what any of that meant, but it sounds cool.
[01:08:52] Speaker C: It sounds cool. I like the just negative one, zero and positive one kind of schema there because it really will make it be able to work on low power devices, phones.
They talk about embedded systems, anything along those lines, hopefully being able to get that feedback to users without having to have a network signal. So I'm also thinking offline systems that can start to do very simple AI. I would assume this is fairly far away from now, but it could have really cool potential and dramatically change the market once they get there. I'm still curious what the b1.58 2B4T stands for. It has to stand for something.
[01:09:42] Speaker A: I'm sure it does. What it is, I can't tell you.
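For what it's worth, the name appears to decode as: 1.58 because three states per weight carry log2(3) ≈ 1.58 bits, 2B for roughly two billion parameters, and 4T for roughly four trillion training tokens, though take that as a reading of the paper rather than gospel. The ternary trick itself is easy to show: with weights restricted to -1, 0, and +1, a matrix multiply collapses into additions and subtractions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ternary weights: only -1, 0, or +1 (log2(3) ~= 1.58 bits of information each).
w = rng.integers(-1, 2, size=(4, 8))   # toy weight matrix
x = rng.normal(size=8)                 # activations stay higher precision

def ternary_matvec(w: np.ndarray, x: np.ndarray) -> np.ndarray:
    # Add where the weight is +1, subtract where it's -1, skip the zeros:
    # no multiplications needed at all.
    return np.array([x[row == 1].sum() - x[row == -1].sum() for row in w])

# Same answer as a normal matrix-vector product, but via adds only.
assert np.allclose(ternary_matvec(w, x), w @ x)
print(ternary_matvec(w, x))
```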
All right. And then you dropped a story in here and wrote the notes up, so I'm going to turn it over to you, Matt. I'm going to take a breather, take five minutes.
[01:09:53] Speaker C: I can't speak nearly as fast as you.
So they released improvements to conversion to Hyperscale, which, if you don't know, is Azure's comparison to Aurora. Essentially, when you did the migrations in the past, you had a lot less visibility into what was happening: you pretty much would fire it off, sit there, twiddle your thumbs, and hope it was happening. So in this most recent update, they now give you the ability to have shorter cutover times, because they've done a bunch in the back end to actually handle that migration; a higher log generation rate, which means they can now actually run the source-to-destination conversion at about 2x now, and 3x in the future; and you can actually control the cutover. In the past you would press the button and it would just go, where now you can actually stage it. And, great, true observability: you can see how far along it is, where before, again, you would say, hey, convert my 3-terabyte database, and wonder how many hours it was going to take, because you had no way to see any of this.
So it's a really good improvement, especially if you're in the Microsoft SQL ecosystem. Having the ability to control this is going to be a big next step from the Hyperscale perspective, and hopefully it makes the migration to that service much easier than it was.
[01:11:12] Speaker A: Nice.
[01:11:13] Speaker C: Yeah, it's a good quality-of-life improvement. And I thought, if anybody's actually using it, or looking to migrate, it will hopefully help you with the story you tell your execs: hey, look, we can actually do this on a schedule, we're going to prep it on Friday and trigger the cutover on Saturday. Versus, hey, we're going to run it Friday night and it will be done sometime between Friday night and Sunday. We have no idea.
[01:11:35] Speaker A: Yeah, that yeah predictability is important.
[01:11:38] Speaker C: Yeah especially with SQL database.
[01:11:40] Speaker A: I can see why this, I can see why you're excited about this. I wasn't really sure now you explained it. I'm like oh yeah this makes sense.
[01:11:46] Speaker C: I've already done all the migrations, so it doesn't really affect me. We just went and did them, scheduled during off hours. But there were a few that took a lot longer than we anticipated, even with the information we got from Microsoft.
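For the SQL folks, the controllable migration is driven from T-SQL, which you can of course script. The sketch below uses pyodbc with placeholder connection details, and the MANUAL_CUTOVER / PERFORM_CUTOVER syntax is a reading of the current Azure SQL docs, so verify it before pointing it at a 3-terabyte database:

```python
import pyodbc

# Placeholder connection string; ALTER DATABASE can't run inside a
# transaction, hence autocommit.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=myserver.database.windows.net;DATABASE=master;UID=admin;PWD=...",
    autocommit=True,
)
cur = conn.cursor()

# Start the Hyperscale conversion but hold the final switch for our window.
cur.execute(
    "ALTER DATABASE [orders] MODIFY (EDITION = 'Hyperscale', "
    "SERVICE_OBJECTIVE = 'HS_Gen5_8') WITH MANUAL_CUTOVER;"
)

# The new observability piece: watch progress instead of twiddling thumbs.
cur.execute(
    "SELECT state_desc, percent_complete FROM sys.dm_operation_status "
    "WHERE resource_type_desc = 'Database';"
)
print(cur.fetchall())

# Saturday night, on our schedule: flip the switch.
cur.execute("ALTER DATABASE [orders] PERFORM_CUTOVER;")
```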
[01:12:00] Speaker A: Yeah. Well, in one of these articles they dropped a little tidbit that Microsoft Build is May 19th through the 22nd in Seattle, which is just around the corner, less than four weeks away.
So I hope people who were actually planning to go to Build knew that already, and that we were just behind the times here at The Cloud Pod.
[01:12:18] Speaker C: Yeah, it's been announced for a little bit. I dropped it at the bottom because I was debating whether I should make you guys do predictions, the same way you guys made me do predictions for GCP.
[01:12:27] Speaker A: I mean, we could try, like, I.
[01:12:28] Speaker C: Don't know, I don't think we're gonna.
[01:12:29] Speaker A: Do well, but I don't, I mean, I could definitely give it a shot, see if I can pull a win since I lost GCP to you.
[01:12:36] Speaker C: Let's, let's give us, let's see how the next few weeks go and who shows up.
[01:12:40] Speaker A: Yeah, let's see how we go. All right.
All right, Oracle. Before Next, there were rumors that Oracle had been hacked, and they were in denial, saying, no, no, no, we were not hacked. Well, guess what? They were hacked.
Their public cloud was compromised. They've come around to tell customers that there was a successful intrusion into their public cloud, as well as theft of their data, after previously denying they had been compromised. Claims emerged in late March, when a hacker with the handle rose87168 boasted of cracking into two of Big Red's login servers. Multiple information security experts analyzed the data from the hacker and concluded that Oracle's Classic cloud product was indeed compromised, as it wasn't patched against CVE-2021-35587, a vulnerability in Oracle Access Manager, a product of its Oracle Fusion Middleware. Oracle is apparently now facing a few lawsuits over this and has run afoul of European GDPR regulations, which require organizations to report theft of customer data to affected folks within 72 hours of discovery. Which is just not a great look, Oracle. And now you're going to try to fall back on the fact that, oh, it's our Classic cloud and not our new unbreakable cloud thing, but this vulnerability, based on the name CVE-2021, is almost four years old and you didn't patch this middleware. I get that it's your Classic cloud product, but either deprecate the Classic cloud or fix your vulnerabilities, folks.
[01:13:59] Speaker C: And it's in an Oracle product. It's not like they were leveraging.
[01:14:04] Speaker A: They can't blame Microsoft, like, well, Microsoft gave us shitty code that had a vulnerability. No, it's their code. It's the code they wrote. They wrote the patch for the CVE. It's just embarrassing.
[01:14:15] Speaker C: Like, this is bad. GDPR, you know, the EU is going to come down harder because, you know, when it was rumored, they pretty much denied it. I thought we talked about it on one of the shows, but I could be wrong.
[01:14:29] Speaker A: Yes.
[01:14:29] Speaker C: And then they can't. Then the guy published all the information and was like, look, I'm not lying to you, here's all the information. And they were like, no, no, this isn't real. Now they're like, oh wait, it's kind of real. Whoever is in charge of their security PR team is going to need to be taken out back and have a conversation.
[01:14:49] Speaker A: Well, speaking of PR and actual hacking, The Register took them to task. And I love The Register.
So they did a line by line breakdown of the memo that they sent to customers on April 7th. And so we're going to do a little summarizing here because you should definitely read the whole article.
[01:15:08] Speaker C: You should read the article. It's great.
[01:15:09] Speaker A: I don't want to steal all of it, but it starts out with, like, April 7, 2025, and then The Register says: the intruder put data stolen from Oracle up for sale on a cybercrime forum on March 20th. That it took Big Red 18 days to contact customers about the matter speaks volumes about the seriousness, or lack thereof, with which the database giant is treating this incident. Which, yes. And then the next line, "Dear Oracle customers," to which they say, well, this does sound better than "dear sheep waiting to get shorn," which is a great line. And then there's a bunch of other ones here.
"A hacker did access and publish usernames from two obsolete servers that were not part of OCI." And so The Register says: we admit we were compromised, and also that we leave obsolete unpatched servers like sitting ducks on the Internet. For indeed, the servers were broken into via a hole in Oracle's own middleware, on its own tin, that it forgot to patch. Yep, yep. So, yeah, these are very good jabs, right, at Big Red in particular, about how terribly they handled this thing. And the very final one was, "if you have any questions about this notice, please contact Oracle Support or your Oracle account manager." And The Register says: good luck with that. Despite around a dozen requests for details and clarifications, Oracle refused to apologize or explain and only denied everything. Those customers we talked to say Big Red has not been responsive or given them any real reassurance. But Larry knows best. Go buy another license or something. Which, ouch.
[01:16:26] Speaker C: Yeah, I mean, again, leaving your own software unpatched. Like, we know it happens. How many times have you seen a server where you're like, oh, whoops, this thing wasn't patched, it wasn't caught by any of our vulnerability management tools, it was spun up as a POC that went to production, or anything else along those lines. That just kind of happens. But it's your own stuff in your own cloud. It's a 2021 vulnerability. Which means, how many other vulnerabilities are on these servers?
You think they're doing OS patching of whatever OS is underneath the hood? Feels unlikely at that point. Feels like this was a forgotten server that somebody left up, which, you know, goes back to the.
[01:17:08] Speaker A: I mean, how forgotten can it be if it's the login server? I mean, come on.
You can try to underplay it as much as you want to, but the reality is that they screwed up, they got attacked, and they denied it all the way around. I mean, even if they came out.
[01:17:22] Speaker C: The denying it is what killed them.
[01:17:24] Speaker A: Yeah. If they had come out and said, you know, we have reports that potentially one of our servers was compromised and we're investigating, we don't believe it is us at this time, but we will let you know as soon as we can conclude our investigation, and then come back and said, we investigated and we found out it is our data, and we found the issue and we fixed it, it would have been a much better story than what they did.
[01:17:43] Speaker C: Yeah, trust me, I'm not trying to cover their assets in any way, shape or form. You know? You know, this is really like, you guys need to own up. And hiding anything is never the right approach when it comes to security. Because all I can think of is how. How many different ways can this bite you?
[01:18:02] Speaker A: Exactly.
[01:18:03] Speaker C: I didn't realize that it literally was login.us2.oraclecloud.com. Yeah, sorry, I missed that piece.
[01:18:10] Speaker A: It's a server that no one knows about, right?
[01:18:13] Speaker C: Yeah.
[01:18:13] Speaker A: Yeah. It's not like it's one that everyone who ever uses their system logs into every time or anything. Yeah, no big deal.
[01:18:20] Speaker C: Yeah.
[01:18:20] Speaker A: All right, well, Matt, we made it through. This was a long one, but we're now back on track, caught up on news for all the clouds. So next week we'll be back to a normal show, although we won't be seeing you next week, so we'll see you after you get back. And hopefully Ryan will join me next week, or else it'll just be me talking to the cloud, which I feel.
[01:18:40] Speaker C: Like it's not much of a difference of you and I talking at times. So it's fine.
[01:18:43] Speaker A: I mean, you always have. We always add quantity to each other's topics. It's fine. All right, gentlemen, have a great night. Have a great week in the Cloud, and we'll see you next time.
[01:18:53] Speaker C: See ya.
[01:18:57] Speaker B: And that's all for this week in Cloud. We'd like to thank our sponsor, Archera. Be sure to click the link in our show notes to learn more about their services.
While you're at it, head over to our website at thecloudpod.net, where you can subscribe to our newsletter, join our Slack community, send us your feedback, and ask any questions you might have. Thanks for listening, and we'll catch you on the next episode.