[00:00:06] Speaker B: Welcome to the cloud pod where the forecast is always cloudy. We talk weekly about all things AWS, GCP and Azure.
[00:00:14] Speaker C: We are your hosts, Justin, Jonathan, Ryan and Matthew.
[00:00:18] Speaker A: Episode 337, recorded for January 6, 2026: AWS discovers prices can go both ways, raises GPU costs 15%. Good evening Ryan, Matt, how you doing? Doing well. Well, we just finished recording 336, so we're still bushy tailed, but we ran too long on that episode, so we decided to cut it off there and Jonathan had to go.
[00:00:41] Speaker C: Breaking the fourth wall.
[00:00:44] Speaker A: I know, breaking it a little bit. But yeah, we'll just pretend like we didn't, we didn't stop and take a pause to do another episode.
So all good. All right, so we've got a bunch of things. First up, I gave you guys homework, which I don't think either of you did. Nope. Which was to watch the thing.
[00:01:00] Speaker D: I did all the other homework of the podcast.
[00:01:02] Speaker A: You did all the other homework? Come on. Yeah, that's good.
But Google. Basically, Google DeepMind released this documentary called The Thinking Game for free on YouTube back in November, for the fifth anniversary of AlphaFold, and we've been saving this story so people could watch it. The feature-length film provides a behind-the-scenes look at the AI lab and documents the team's work towards artificial general intelligence over a five year period.
The documentary captures the moment when the AlphaFold team learned that they had solved the 50 year protein folding problem in biology, a scientific achievement that recently earned Demis Hassabis and John Jumper the Nobel Prize in Chemistry. The film was produced by the same award-winning team that created the AlphaGo documentary, which chronicled DeepMind's earlier achievement in mastering the game of Go. While this is primarily a documentary release rather than a technical product announcement, it provides context for understanding Google's broader AI strategies. And I did watch it on my way to London to go check out some Christmas markets. I watched it on the plane, and if you don't know anything about DeepMind and how Demis Hassabis came to be and his background, all that, it is fascinating.
If you're not into technology, don't care about any of that, and don't care about AI and how they built all the AI models that are now powering the world of LLMs we have, you will not like this documentary.
But you know, it was cool to see them tackle beating games. They taught their AI how to play StarCraft and how to beat those games, and how to beat Go with AlphaGo and that application, and then getting into the protein folding problem, and why they decided to do it, and then the decision to just fold all the proteins and release them to the world as open source, as their gift to the world. It's just a fascinating piece. I definitely recommend checking it out if you have a chance. I did not force my wife to watch it; I don't think she would have cared for it as much as I did, but I nerded out and got a lot more perspective. Demis Hassabis, I learned, was a chess prodigy in his youth, and then it was like, I'm wasting my life, and he wanted to go solve real problems and ended up becoming the founder of the lab that is Google DeepMind. So, you know, really fascinating story and background.
[00:03:06] Speaker C: That's cool. Yeah, I'll definitely. I know, I know I was supposed to do this already, but it does seem interesting and I definitely, I know I watched the previous one and I really enjoyed that. So cool.
[00:03:19] Speaker D: You know it definitely sounds, I'm gonna just reiterate what Ryan said. It sounds really interesting.
I just need an hour of free time in my life, you know, to watch it.
But it definitely sounds amazing how they kind of jumped from one thing to the next and kind of grew. And I assume your wife didn't want to watch it, or maybe she did watch part of it to fall asleep on the plane; she's not interested in this.
[00:03:40] Speaker A: She was watching some other show on the plane, you know, was watching my show.
I don't, I mean I was gonna watch it and then I didn't because I was kind.
So she probably, she probably would have, she probably would have watched it with me. I don't know if she would have enjoyed it.
[00:03:53] Speaker D: But I also feel like you cheated, because the way you had time to watch this was you took a 10 hour flight to London.
[00:03:59] Speaker A: I was just about to say, it feels like a cheat code. It was a cheat code.
[00:04:04] Speaker D: I had a five hour drive with two kids in the car. Just saying I feel like I wasn't able to watch it.
[00:04:11] Speaker A: Well, I do recommend checking it out if you have time and when you do watch it we can talk about an after show or something, what your thoughts are. But definitely worth checking out at some point.
ServiceNow is acquiring Armis for $7.75 billion to integrate real time security intelligence with its configuration management database, allowing customers to identify vulnerabilities across IT, OT and medical devices and remediate them through automated workflows. The deal is expected to close in the second half of 2026 and aims to triple ServiceNow's current $1 billion annual security revenue. I don't know who this company is, and the fact that they were bought for $7.75 billion makes me think I should know who they are, but I really don't. The acquisition represents a strategic data play when combined with ServiceNow's recent purchase of data.world, giving the company both massive volumes of security asset data from Armis and the governance tools to make that data searchable and usable with AI tooling. This combination enhances ServiceNow's CMDB capabilities by an order of magnitude. ServiceNow has completed six acquisitions this year, including Armis, Veza for identity access management and data.world for data governance, signaling an aggressive expansion strategy focused on security and data management.
And this deal positions ServiceNow to eliminate the patchwork of security tools organizations currently use by embedding security capabilities directly into its AI platform. I mean, this is a bit of a shot across the bow at Palo Alto, like trying to get into their space, or into Google and Mandiant. It's sort of interesting. ServiceNow wants to get heavily into security as a big growth area for them.
[00:05:35] Speaker C: Yeah, it's, it's because it's sort of. I mean, I'm with you on like I have no idea what this company is or what they do and it is confusing.
And so like is this security tooling that you use for doing analysis and you know, threat hunting or whatever it is, or is this something that they're adding into their existing tooling and so it's more of an integration? I don't know.
[00:06:02] Speaker A: Yeah, I mean, the marketing team has gotten to their website, so I have no idea. Right. You know, they don't have a product area, they just have a platform. We're a platform play, which is always a marketing thing: the cyber exposure management platform. And it's really tied to assets, and again into their CMDB. It makes me think it's like vulnerability management and active threat intelligence hunting, and then, you know, maybe a little bit of UEBA-type work inside of this. Again, it's hard to say. It's interesting though. They do play into medical devices, which is an interesting area, and IoT as well.
So again, maybe it's an endpoint company. Yeah, it's one I don't really know, so if one of our listeners knows, we'd love to hear what this thing is beyond what we know from the press release and five minutes on their website.
[00:06:48] Speaker C: Well, yeah, their website is really. I was trying to do that too.
[00:06:53] Speaker A: Yeah, I don't know.
[00:06:53] Speaker C: Like I, I do think that this is more of like if you're not on a cloud hyperscaler and you've got a bunch of devices and you know, like ServiceNow has sort of the ability to inventory and do that sort of automatic detection of your, your on premise environment. And so I wonder if this is more of that play where it's sort of a cloud posture management service.
I don't know. Like, not necessarily an acquihire, not at 950 employees, but, you know, definitely a feature purchase.
Yeah.
[00:07:26] Speaker A: So according to Gemini, you know, I asked, is it a threat detection tool? A threat intelligence tool? It says no, not primarily; it's more of an early warning vulnerability intelligence detection system. So more like a user and entity behavior analytics tool. And it says it shares many features with those types of tools but is categorized as a cyber exposure management platform, while traditional UEBA focuses on user logs and identities.
So it's really unclear.
[00:07:50] Speaker C: I'm going with posture management. Yeah, I think that's what it is. Yeah.
[00:07:53] Speaker A: That's probably the closest I could get. Again, asset data makes sense with cmdb.
I can see some synergies, so it makes some sense.
[00:08:02] Speaker D: I was having the same conversation with Claude on my laptop, so it was interesting to see the difference.
[00:08:08] Speaker A: I was lost. I was like, is it this? Is it that? What is it? Then I was like, well, it's something new. Like, no, it's not. Nothing is new, ever.
[00:08:16] Speaker C: How do they already have that much revenue? And like, like.
[00:08:19] Speaker A: So, like, it's the medical device space, I guarantee it. It's gotta be the medical device thing. That's where they've sold heavily, is my guess.
Well, we briefly talked about this, or maybe I wanted to talk about it and we didn't. But there's a new alternative to JSON out there called TOON, or Token-Oriented Object Notation, which is a new data format designed to replace JSON in LLM prompts, claiming to reduce input token usage by approximately 40% while maintaining or improving accuracy. The format works by eliminating verbose JSON syntax and repeated tokens, converting structured data into a more compact representation that LLMs can still interpret effectively. I mean, JSON has always been a waste of space in commas and all kinds of things, but it had to be written that way because that's what the computer required. And I'd almost argue that TOON is more what I would have wanted.
Very simple comma separated values.
So, you know, I appreciate that we realized that, hey, if you don't do all this weird syntax formatting to make the computer understand it, it's easier to read. Imagine that, like English language but for LLMs. So maybe LLMs will finally solve all my JSON complaints, and YAML and all the other things. Or maybe not, and I'll just hate TOON as much.
But if you are very interested in reducing your token usage, apparently this is something you should look at. In their example, a JSON data set that used 172 tokens was reduced to 71 tokens; that's well over half. And many of the model providers now support TOON out of the box already, because again, it's just comma separated values in a lot of ways.
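To make the savings concrete, here's a toy sketch of the idea: the same records as JSON versus a TOON-style tabular block, with a minimal Python converter. It ignores the real spec's rules for nesting, quoting and scalar fields, so treat it as an illustration of the shape, not an implementation of the format.

```python
import json

def to_toon(key, rows):
    # Toy converter: emits "key[count]{fields}:" then one CSV row per item.
    # The real TOON spec also covers nesting, quoting and scalar fields.
    fields = list(rows[0].keys())
    out = [f"{key}[{len(rows)}]{{{','.join(fields)}}}:"]
    out += ["  " + ",".join(str(r[f]) for f in fields) for r in rows]
    return "\n".join(out)

users = [{"id": 1, "name": "Alice", "role": "admin"},
         {"id": 2, "name": "Bob", "role": "user"}]

print(json.dumps({"users": users}))  # braces, quotes, and keys repeated per row
print(to_toon("users", users))
# users[2]{id,name,role}:
#   1,Alice,admin
#   2,Bob,user
```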
[00:09:53] Speaker D: But without headers? Also, the first thing they released.
[00:09:57] Speaker A: Yeah, without headers.
[00:09:57] Speaker D: No, there's headers. Well, there's headers in the first row; in the example they have animal names. But yeah, I just like how the first thing they released was a Python converter from JSON to TOON. Like, that's what you do to release it, to get it out there.
[00:10:10] Speaker A: You know how many, you know how many enterprise service buses did nothing but convert from YAML to JSON or from SOAP to JSON?
There is a lot of that use case in a lot of places, quite often. So, Google has recapped their 2025 with a blog post highlighting several things, including Gemini 3 Pro, which dropped in November, and Gemini 3 Flash in December. They introduced several specialized AI models, including Nano Banana Pro, Veo 3.1 for video generation, and Imagen 4 for image creation. Google's AlphaFold celebrated its five year anniversary, which we talked about. Google's quantum work achieved recognition, with Googler Michel Devoret receiving the 2025 Nobel Prize in Physics. Where's Majorana in this one? And Google formed the Agentic AI Foundation with other AI labs to establish open standards for agentic AI.
Lots of big highlights from Google in the AI space in general.
[00:11:05] Speaker C: Some nice, pretty laudable achievements, you know, in here, like going way beyond just sort of big feature enhancements or market share gains. Right. Like actual changes
to our world. Right. Some of the stuff about the physics in quantum computing was really cool to read.
[00:11:30] Speaker A: It's kind of a neat.
Agreed.
Meta is acquiring Singapore-based AI agent firm Manus for over $2 billion, bringing on board a company that claims $125 million in revenue. Again, another company I've never heard of.
Just eight months after launching its general purpose AI agent, Manus will continue operating its subscription service while its team joins Meta to enhance automation across consumer products like the Meta AI assistant and business tools. Manus offers AI agents capable of executing complex tasks, including market research, coding and data analysis, having processed over 147 trillion tokens and supported 80 million virtual computers to date. The platform provides both free and paid tiers, and has already been tested on Microsoft Windows 11 PCs for tasks like creating websites from local files. The acquisition represents Meta's continued strategy of acquiring specialized AI startups to accelerate its AI capabilities and Llama large language model development. This follows Meta's $14.3 billion investment in Scale AI in June and the acquisition of AI wearable startup Limitless earlier this month. Manus originated as a product of Chinese startup Butterfly Effect before relocating headquarters from Beijing to Singapore in June, backed by investors including Tencent, HongShan Capital Group and Benchmark, which led a $75 million Series B round.
[00:12:44] Speaker C: You know, the upside: if they've just been around for eight months, they don't have the terrible tech debt that all these smaller firms have.
[00:12:54] Speaker A: Right, like that's what you think.
[00:12:57] Speaker C: Well, no, I mean, they've got eight months of it, because everything at this scale is just, you know, pay the credit card.
[00:13:04] Speaker D: I was going with 75 million Series B to over $2 billion purchase price.
[00:13:08] Speaker C: Yeah, I mean, this is the bubble though, right? There's no way this will continue unless these revenue streams are real. But I think people are desperate for answers and desperate to purchase solutions so that they can say they're doing something. I know it's an influence on my day job, which is, like, you know, I get asked some pretty complicated questions about how we're going to manage AI, both in product development as well as the impact it can have on incoming requests. And so the temptation is real to be like, oh, we have a solution that promised it can do it, whether it can or not. And so I think there's a lot of bloat in the marketplace with tools like this. So I don't know.
[00:13:59] Speaker D: All I can think of, based on what you said, was Filecorn's old slogan: the promise of the cloud, delivered. The promise of AI, delivered.
[00:14:06] Speaker C: It's totally that, right? Like, it's.
[00:14:10] Speaker A: Yeah. I mean, I think Meta's in a weird place as well. I think Llama has not been as successful as they wanted it to be. I think they are trying to figure out how it's going to impact Facebook and all their other properties like Instagram. They're trying to integrate these things. And so buying, you know, an acquihire maybe in some ways, and then also, how do I pull agents into the Facebook experience? This might make sense.
So it's definitely something that could be very interesting.
[00:14:37] Speaker C: They've had some really high-publicity failures with agents on Facebook, you know, like fake personas and fake accounts posting, which I thought was completely tone deaf given how much flak they get with bots and, you know, automated sort of things that are inciting riots.
But yeah, nice.
I don't understand Meta as a company anymore. Right. It made sense, you know, Facebook, and then even acquiring different platforms. But with the Metaverse sort of flopping, and all these wearable things, I don't really see these taking off.
[00:15:11] Speaker D: Like, I don't know, I forgot about the Metaverse.
[00:15:15] Speaker A: Most people did. I mean, most people did. That's why they pivoted to AI, because they thought the Metaverse would be the future.
[00:15:20] Speaker C: I mean, they renamed their company, like.
[00:15:22] Speaker A: That's a big deal.
[00:15:23] Speaker C: And it didn't work.
[00:15:25] Speaker A: Nope, it did not.
All right, let's move to AWS. AWS has split Security Hub into two services: the new Security Hub, with enhanced capabilities using the Open Cybersecurity Schema Framework, or OCSF, and Security Hub CSPM, which continues as a separate service focused on cloud security posture management.
The schema change from the AWS Security Finding Format, or ASFF, to OCSF means existing automation rules need migration to work with the new service.
AWS released an open source Python migration tool on GitHub that automatically discovers Security Hub CSPM automation rules, transforms them to the OCSF schema, and generates configuration templates for deployment. The tool handles regional differences intelligently, supporting both home region deployments, where rules apply across linked regions, and region-by-region deployments for unlinked regions. Not all automation rules can be fully migrated due to schema differences between ASFF and OCSF; the tool generates a migration report identifying rules that cannot be migrated or can only be partially migrated, and creates all new rules in a disabled state by default so administrators can validate them before enabling.
The migration tool preserves the original order of automation rules, which matters when multiple rules operate on the same findings or fields, and the migration capability is included in the Security Hub Essentials plan at no additional cost beyond standard Security Hub pricing. So great, you changed the format and you gave us a tool. I appreciate it.
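If you want to see what the migration tool will be working with before running it, the discovery step is just paging through your existing automation rules. A minimal sketch with boto3, assuming credentials and region are already configured:

```python
import boto3

# Enumerate Security Hub automation rules, the same inventory the AWS
# migration tool starts from. The ASFF-to-OCSF field mapping itself is
# rule-specific and omitted here.
securityhub = boto3.client("securityhub", region_name="us-east-1")

rules, token = [], None
while True:
    kwargs = {"MaxResults": 100}
    if token:
        kwargs["NextToken"] = token
    page = securityhub.list_automation_rules(**kwargs)
    rules.extend(page.get("AutomationRulesMetadata", []))
    token = page.get("NextToken")
    if not token:
        break

# Rule order matters when multiple rules touch the same findings.
for rule in sorted(rules, key=lambda r: r["RuleOrder"]):
    print(rule["RuleOrder"], rule["RuleName"], rule["RuleStatus"])
```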
[00:16:37] Speaker C: Well, and now, you know, the blog post is very big on how these are complementary services that you should have both of them enabled and working, and so now they get to charge you twice.
You know, it's. I do understand why they're doing this.
[00:16:51] Speaker A: Just because no one else adopted the AWS Security Finding Format.
[00:16:57] Speaker C: Well, there's that, but then there's also, you know, having a separate cloud posture management tool that's separate from your sort of SIEM/SOAR-like solution, which is what it sounds like they're going to make the thing that is now called just Security Hub more into. I think it's a launching point for that and the next steps.
[00:17:17] Speaker A: Makes sense.
[00:17:18] Speaker D: For the first time ever, I need to actually compare it in my head to Azure services, versus for my entire career I've always compared stuff back to AWS services, and I'm a little upset with myself right now, not gonna lie. Yeah, no, but I mean, Azure has done this, and I think GCP also has done this, where you have your Sentinel, your actual SIEM with all your alerts and everything, and your CSPM, which is separate. The problem I always have with CSPMs, and this is a larger rant conversation, is there's no interoperability.
So if you have a CSPM, and you want to then set up a GRC tool or another security tool that also runs scans, there's no interoperability. So you then have to acknowledge things in three different spots, and there's no single source of truth. And then I just go on a longer, longer rant, and we should move on to the next topic.
[00:18:11] Speaker C: Justin, you and I have made a career of writing glue code between these platforms though.
[00:18:17] Speaker D: Oh my God, it drives me up the wall.
[00:18:20] Speaker C: Well, you need your GRC software to talk to this. Yeah, I got, I got some Python.
[00:18:24] Speaker A: I can write for you.
[00:18:26] Speaker D: Oh my God. I mean, I have a problem where, you know, we have a GRC tool and it's running the same scans that Defender runs, and I'm like, cool. And then we have our, you know, other tool that security uses that throws in the same thing, and I'm like, cool, but they're yelling at me about the same thing. I'm like, I don't care. I'm going to solve the root problem, but I don't need three people yelling
[00:18:53] Speaker A: At me and the only alternative is.
[00:18:55] Speaker C: A huge data lake and then just custom stuff to pull that right like and it's just not good.
[00:19:01] Speaker D: Which at that point I'm building my own.
[00:19:03] Speaker A: You are?
[00:19:03] Speaker C: Yeah.
[00:19:05] Speaker D: CSPM. Which, you know, maybe is what I'll do in my career, that just, you know. Or I'll just build a standard format. That'll be my 2026 vibe coding objective.
Build a standard format that I can sell or provide to all of these just to make my day job easier. You don't even care. I will give it to you for free. Please just use it well.
[00:19:28] Speaker A: If you are looking for other ways to make your day job easier and you're using EKS, they're giving you new proactive EKS monitoring with the CloudWatch operator and AWS control plane metrics. This comes with EKS clusters running version 1.28 and above, which now automatically send control plane metrics to CloudWatch at no extra charge, covering API server health, scheduler performance and etcd database status. The new CloudWatch observability operator add-on extends this with Container Insights and Application Signals for deeper visibility into workloads and applications without code changes. The enhanced monitoring addresses common operational challenges like detecting pod scheduling bottlenecks through metrics such as scheduler_pending_pods and scheduler_schedule_attempts_unschedulable, which helps identify under-resourced worker nodes. Critical integration components like admission webhooks, which power AWS Load Balancer Controller and IRSA functionality, can now be monitored for failures and latency issues. And etcd database monitoring is particularly important, since EKS has an 8 GB recommended limit and exceeding it makes clusters read only. Yeah, that's a bad day.
[00:20:24] Speaker C: Ouch.
[00:20:25] Speaker A: Application Signals provides automatic integration for Java apps with pre-built dashboards tracking traffic in your EKS cluster. So guess what, Amazon runs a lot of Java containers.
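As a sketch of how you might actually use the new metrics, here's a CloudWatch alarm on sustained pending pods. The namespace, metric name and dimension are assumptions based on the announcement's description of vended per-cluster control plane metrics, so verify the exact names in your console before relying on this:

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Alarm when pods sit unscheduled for 15 minutes, a hint that worker
# nodes are under-resourced. Namespace/metric/dimension names below are
# assumptions from the announcement; confirm them in CloudWatch first.
cloudwatch.put_metric_alarm(
    AlarmName="eks-prod-pending-pods",
    Namespace="AWS/EKS",                    # assumed vended namespace
    MetricName="scheduler_pending_pods",    # metric named in the post
    Dimensions=[{"Name": "ClusterName", "Value": "prod-cluster"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=3,
    Threshold=25,
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",
    AlarmDescription="Sustained pending pods; check node capacity and webhooks",
)
```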
[00:20:36] Speaker C: I mean, I like this, except for the fact that it's an operator. Like I don't understand why this isn't just configuration options in your cluster, right? This is something that you have to deploy as a separate deployable artifact and maintain separately alongside your cluster. So it's just one more thing and it's like I get it, but I don't. I don't see that that's necessary.
Unless I misunderstand the instrumentation and how this is implemented, but it just seems like something they could have built into the service versus making it an add-on.
[00:21:10] Speaker A: Well, if you're looking to save yourself some money with ECS managed instances, Amazon is now supporting EC2 Spot capacity, allowing customers to run fault-tolerant containerized workloads at up to a 90% discount compared to on-demand pricing, while AWS handles all infrastructure management. You configure a new capacity option type parameter as Spot or On-Demand in your capacity provider settings. This extends ECS managed instances beyond its existing capabilities of automatic provisioning, dynamic scaling and cost-optimized task placement. AWS still handles infrastructure operations through AWS-controlled access in your account, but now you can choose between Spot and On-Demand capacity types alongside existing options for GPU, network optimized and burstable instance families. The feature is available in all AWS regions where ECS managed instances currently operates, and pricing includes both the Spot EC2 instance costs and an additional management fee for the compute provisioning service; the specific management costs are not disclosed in this announcement. This targets customers running stateless or fault-tolerant containerized apps like batch processing, CI/CD pipelines or web services that can handle interruptions.
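For a rough idea of the shape of this, here's a hypothetical sketch of a managed instances capacity provider with the new capacity option set to Spot. The field names are assumptions pieced together from the announcement's description, not the documented request schema, so check the ECS API reference before using anything like it:

```python
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

# Hypothetical request shape: a managed instances capacity provider that
# provisions Spot instead of On-Demand. Field names below are assumptions
# based on the announcement; the real schema may differ.
ecs.create_capacity_provider(
    name="batch-spot",
    managedInstancesProvider={
        "infrastructureRoleArn": "arn:aws:iam::123456789012:role/ecsInfraRole",
        "instanceLaunchTemplate": {
            "capacityOptionType": "SPOT",  # the new parameter; default On-Demand
            "networkConfiguration": {"subnets": ["subnet-0abc1234def567890"]},
        },
    },
)
```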
[00:22:08] Speaker C: One of my predictions that I didn't give in the last episode was that serverless technology is going to become fully stateful. And I feel like this is exactly that, right? You're scheduling an ephemeral workload on Spot that can be easily interrupted, but you still want to own, and I guess access, the compute. I don't really know if you can, like, SSH into these machines, or is it just configuration where you can assign it sort of your GPU and have a go. But pretty wild. I mean, it's great for people that are taking advantage of the service. I'd much rather have Amazon manage, you know, the patching and management of the services, as long as it's not exorbitantly expensive. And being able to leverage Spot compute for things that are container based, like, that was the first thing I did on ECS, was tie this to Spot. So this is cool.
[00:23:11] Speaker A: Yeah, I think it's really cool.
Well, for even more container joy, Amazon EKS now supports DNS-based and admin network policies, allowing teams to control pod traffic using stable domain names instead of constantly changing IP addresses. Thank you, Jesus. This eliminates the operational overhead of maintaining IP allow lists for AWS services, on-premise systems and third-party APIs, while providing centralized policy management across multiple namespaces. Admin network policies operate in two tiers with hierarchical enforcement that cannot be overridden by namespace-level policies, enabling platform teams to enforce mandatory security controls like blocking access to the EC2 instance metadata service at the 169.254.169.254 address. The policy uses label-based segmentation to apply security standards across multiple namespaces simultaneously, reducing the need for per-namespace policy management. DNS-based policies are available in EKS Auto Mode clusters version 1.29 and later, while admin policies work in both EKS Auto Mode and EC2-based clusters running VPC CNI version 1.21 or later. Policy evaluation order follows a strict hierarchy: admin tier deny rules take precedence over everything, followed by admin allow rules, then namespace-scoped policies, and finally baseline tier policies. Real world applications of this include multi-tenant environments where different applications need controlled access to specific AWS services like S3 or DynamoDB, and hybrid cloud scenarios where workloads access on-premise databases through stable DNS names that remain valid even as the underlying infrastructure changes.
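The IMDS example is a good one, because it's the kind of control a platform team wants everywhere, and a tenant's own NetworkPolicy can't undo it. A minimal sketch using the upstream AdminNetworkPolicy API via the Kubernetes Python client; the resource shape follows the upstream v1alpha1 spec, and the fields EKS actually supports may differ:

```python
from kubernetes import client, config

# Cluster-wide deny of the EC2 instance metadata endpoint. Admin tier
# deny rules are evaluated before any namespace-level policy, so tenants
# cannot override this. Shape follows upstream policy.networking.k8s.io
# v1alpha1; verify against what EKS supports.
anp = {
    "apiVersion": "policy.networking.k8s.io/v1alpha1",
    "kind": "AdminNetworkPolicy",
    "metadata": {"name": "deny-imds"},
    "spec": {
        "priority": 1,                   # lower number wins
        "subject": {"namespaces": {}},   # empty selector = every namespace
        "egress": [{
            "name": "block-instance-metadata",
            "action": "Deny",
            "to": [{"networks": ["169.254.169.254/32"]}],
        }],
    },
}

config.load_kube_config()
client.CustomObjectsApi().create_cluster_custom_object(
    group="policy.networking.k8s.io",
    version="v1alpha1",
    plural="adminnetworkpolicies",
    body=anp,
)
```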
[00:24:33] Speaker C: And if you ever wondered what's going to happen to all your on-premise, you know, network engineers that have been managing firewall rules forever, here you go. Because it's exactly the same sort of pattern as, you know, IP-based firewall rules and inspection. So, great. I mean, it's needed for sure, because these, you know, these Kubernetes networks have been open, and every workload on that cluster has access to everything, and the IP-based restrictions haven't been around all that long.
So like this is, this is just making that a lot easier to manage which I'm in favor of.
[00:25:09] Speaker D: I really want to see a Cisco firewall admin, or Juniper, or choose a firewall vendor, go deal with Kubernetes and see how they handle that.
[00:25:21] Speaker C: Yeah, you thought they were angry before.
[00:25:26] Speaker D: Like, I'm just saying, most of the network admins I know, you try to talk to them about namespaces and policies and, you know, they're not there. So this is going to be, like you said, you know, network admin work? I'm like, I don't think so. I think this is going to be a true DevOps admin that really wants to actually set stuff up securely, or, you know, a large enterprise that has a true, you know, security aspect in their DevOps team that really needs to manage this.
I don't think you're going to see the old school security people in that world. But maybe I'm wrong.
[00:26:02] Speaker C: No, I agree with you. It's definitely what I want, you know, as someone who really wants to focus on transformation of these teams. But, you know, those engineers have to be willing to grow and adapt their skills, right? There's just not going to be giant F5s and, you know, Juniper firewalls running in data centers forever. Right? It's moving. And then why would I run a virtual Palo Alto, with all its throughput limitations, and it's still expensive, to manage traffic within my, you know, virtual network in a cloud environment, when there's a cloud native service that will do the same thing, you know, just slightly differently?
[00:26:42] Speaker D: I mean, I'm not disagreeing with what you're saying.
[00:26:44] Speaker A: Oh, I know the reality of the facts.
[00:26:48] Speaker C: Traditional engineer listening to our show, this is like a, this is an example of something where you can take your skillset and add a ton of value. But yeah, it is going to be a different pattern and different tooling.
[00:26:58] Speaker A: Well, AWS has raised prices not in the way you would have expected, but apparently in ways that are problematic.
So it's not just a blanket increase to an instance type, but an increase to EC2 Capacity Blocks, which are used for ML. And the increase is approximately 15% over the weekend, with p5e.48xlarge instances jumping from $34.61 to $39.80 per hour in most regions. Which is a pretty big departure from AWS's two decade pattern of price reductions, and one of the first straight increases to a line item not tied to regulatory requirements or regional specifics like SMS. Capacity Blocks allow customers to reserve guaranteed GPU capacity for ML training jobs from one day to several weeks in advance, with locked-in rates paid upfront. AWS attributes the price increase to supply and demand patterns for this quarter, but it lands amid a global GPU shortage driven by increased AI workload demand across the industry. The price increase creates complications for customers with enterprise discount programs, as their percentage discounts remain the same but absolute costs rise by 15%. This gives competitors like Azure and GCP a direct talking point for enterprise sales conversations, which I'm sure they'll be using to their advantage. And the change sets a precedent that could extend to other resource constrained services, particularly RAM, which, if you're aware of the RAM market right now, is in dire straits, and RAM costs are running, I think, through the roof. Yeah, yeah.
So definitely a big impact to you if you're running a serious ML workloads. But you know, is this a broader concern to the overall AWS ecosystem?
[00:28:28] Speaker D: I don't think it's a broader concern, but I think it's the first real time we've seen it since, I guess you could say, the IP addresses, when they started charging for IPv4.
You know, I feel like it's the first real time that you're seeing a dramatic increase. It's been a fear for many companies for many years: what if they raise the prices and there's nothing we can do because we're already there? And they're doing it, and there's not much you can do is the thing. But that's just the fact that it's a scarce resource right now, and they need it to handle, I'm sure, power costs and everything else that they have to handle.
[00:29:11] Speaker C: I mean, we've long talked about Amazon's willingness to absorb costs in order to grow product and to grow market share. And I feel like this is just. They're being forced, I think, I don't think there's any way they can make it make sense given how scarce GPUs are.
And so I wouldn't be surprised to see this be less of a talking point for cloud competition, but more of something that they also do.
[00:29:36] Speaker A: EC2 Capacity Manager is adding three new Spot interruption metrics at no additional cost across all commercial AWS regions. It tracks total Spot instance counts, interruption counts and interruption rates across regions, availability zones and accounts to help optimize your Spot placement strategies. The new visibility helps customers make data-driven decisions about Spot instance diversification by identifying patterns in interruptions.
Or you could just make this a service, like, that I subscribe to, which would be really great.
Like, "this is the recommended Spot instance for this region" or whatever would be a really handy feature when trying to pick a Spot instance fleet.
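In the meantime, you can approximate that recommendation yourself by comparing interruption rates per AZ. A sketch of the query, with the caveat that the announcement doesn't spell out metric namespaces or names, so the ones below are placeholders to adapt:

```python
import boto3
from datetime import datetime, timedelta, timezone

cw = boto3.client("cloudwatch", region_name="us-east-1")
end = datetime.now(timezone.utc)

# Pull a week of per-AZ interruption rates and pick the calmest zone.
# Namespace and metric name are placeholders; the announcement only
# describes counts and rates surfaced via EC2 Capacity Manager.
def weekly_rate(az):
    stats = cw.get_metric_statistics(
        Namespace="AWS/EC2CapacityManager",   # placeholder
        MetricName="SpotInterruptionRate",    # placeholder
        Dimensions=[{"Name": "AvailabilityZone", "Value": az}],
        StartTime=end - timedelta(days=7),
        EndTime=end,
        Period=604800,
        Statistics=["Average"],
    )
    points = stats["Datapoints"]
    return points[0]["Average"] if points else 0.0

zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
print(min(zones, key=weekly_rate))  # least-interrupted AZ for the fleet
```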
[00:30:11] Speaker C: I'm sure Q will tell you.
[00:30:12] Speaker D: They don't want to make it too easy.
[00:30:14] Speaker A: No, you don't want to make it too easy.
[00:30:14] Speaker D: They're giving you the tools, so they're, they're almost making it easy.
[00:30:19] Speaker A: Yeah, I mean, the problem with Spot, especially as GPUs have become a thing and batching is such a big push now, is that you think you're good, and then all of a sudden someone runs a workload that just absolutely murders you.
So yeah, I mean, but this is.
[00:30:32] Speaker C: Better than the alternative of like you have your sort of availability group or instance node tracking and you just sort of have to assume that it's the Spot market interrupt. Right. Like, there was no easy way to correlate this unless you went and logged in and looked at the logs and stuff. So this is actually.
Now you could put it in a dashboard and see that this, this capacity change was directly related.
But yeah, I do understand the want of, like, is there a way to have sort of capacity prediction by instance type for workloads? But isn't that what companies like Spot, and there's another littler one, were built to do exactly, right?
[00:31:16] Speaker D: Yeah.
[00:31:17] Speaker A: Well, we'll see.
Let's move on to GCP and a story that I wanted to kill, but Ryan wanted to keep.
[00:31:24] Speaker D: Ooh, calling him out right at the start.
[00:31:27] Speaker A: Oh, it's fair.
[00:31:28] Speaker C: It's fair.
[00:31:28] Speaker A: It was fair. Yeah. Looker, which is a reporting tool made by GCP, now allows users to upload CSV and spreadsheet files directly into the platform through a drag-and-drop interface, and the new self-service explorers feature is currently available in public preview. This bridges the gap between governed data models and ad hoc analysis by letting users combine local files with existing Looker data while maintaining administrator oversight of uploads and permissions. The new tabbed dashboards feature helps organizations organize complex dashboards into logical sections with automatic filter propagation across tabs, reducing visual clutter by showing only relevant filters per view. Users can share specific tab URLs and export entire multi-tab dashboards as single PDF documents, making it easier to present cohesive data narratives. Internal dashboard theming is now available in public preview, enabling organizations to customize tile styles, colors, fonts and formatting to match corporate branding guidelines within the Looker application. And a new content certification flow helps distinguish between ad hoc experiments and vetted data sources, addressing governance concerns when users upload their own data sets. These features are available starting at Looker version 25.20 and can be enabled through the Admin Labs page, with no specific pricing changes announced, as they appear to be included in existing Looker subscriptions.
[00:32:38] Speaker C: Okay, hear me out: executive who likes pretty graphs and pictures, for everyone that has to supply you with pretty graphs and pictures, this is very important. It is very difficult to modify and work with existing data sets in any BI tool, and so this is another knob that you can turn. And I could use something like this, you know, for just uploading a very easy CSV of, like, product names or usernames or something that's just a list, versus having to parse that out of a very large data set, which may have a combination of structured and unstructured data, or just bad schema adherence. And so this is sort of a nice tool for being able to create those types of things.
[00:33:25] Speaker A: Yeah, I mean, having been a guy who wrote Crystal Reports back in the day and was certified in Crystal Reports 8, I would use a spreadsheet as a second data source that I would link to the master data source when I needed to look things up or do different label types. So I've used something very, very similar to this in desktop clients for reporting many, many times. So I get it as well. I just didn't want to talk about it.
[00:33:49] Speaker C: It's also a BI tool which is.
[00:33:52] Speaker A: Yeah, I mean, if QuickSight dropped this in, I don't know if I'd care about that either.
[00:33:57] Speaker C: Yeah, I don't think I care even I wouldn't fight for that.
[00:34:01] Speaker A: All right, well, let's move on to something more interesting. AlloyDB's AI natural language API is currently in preview, enabling developers to build agentic applications that translate natural language questions into SQL queries with near 100% accuracy. The system uses descriptive context like table descriptions, prescriptive context including SQL templates and facets for complex conditions, and a value index to disambiguate database-specific terms that foundation models wouldn't recognize. The API addresses a critical business need where 89% accuracy is insufficient, particularly in industries like real estate search and retail, where poor query interpretation directly impacts conversion and revenue. Users can iteratively improve accuracy through a hill climbing approach, starting with out-of-the-box capabilities and progressively adding context to handle nuanced questions like "homes near good schools" that require specific business logic for terms like "near" and "good". The system provides explainability features that show users how the API understood the request, and allows agents and end users to verify the interpretation even when accuracy is not perfect.
Integration options include the MCP Toolbox for Databases for developers writing AI tools, or Gemini Enterprise for no-code agentic programming, enabling conversational applications that combine web knowledge with database queries. Google plans to expand this natural language capability beyond AlloyDB to a broader set of Google Cloud databases, though specific timelines and pricing details for the preview were not disclosed.
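To picture the hill climbing loop, here's a hedged sketch against the alloydb_ai_nl extension from plain Python. The function and configuration names are paraphrased from memory of Google's docs, so treat them as assumptions and confirm against the current AlloyDB reference:

```python
import psycopg2

# Ask in natural language, inspect the generated SQL (the explainability
# step), then add context and retry if the interpretation is off.
# alloydb_ai_nl function and config names here are assumptions.
conn = psycopg2.connect(host="10.0.0.5", dbname="listings",
                        user="app", password="...")
cur = conn.cursor()

question = "homes near good schools under $900k"
cur.execute("SELECT alloydb_ai_nl.get_sql(%s, %s);",
            ("real_estate_config", question))
print(cur.fetchone()[0])  # review the generated SQL before executing it

# If "near" or "good" were misread, register business logic as context,
# e.g. a SQL template defining distance and rating thresholds, then retry.
```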
I mean, natural language querying, I am all here for it.
[00:35:19] Speaker C: Day of the week and I do love that. You know, it's like natural language API querying via API Right, because you're seeing this built into like, you know, graphical UIs and stuff. But the fact that you can directly make an API call to do the same query is pretty rad.
[00:35:36] Speaker A: Well, and it gives you the explain plan, which is cool too. So you can see, like, oh, you interpreted it not quite the way I wanted you to.
So that's good.
[00:35:44] Speaker D: Yeah, I've definitely seen this done. The biggest piece is giving it the context and having the table schema and everything well documented in advance. That's the trick to make this stuff actually work.
And if not, it just kind of has to guess.
[00:35:59] Speaker A: I hope it would discover a lot of it. And then again, like most agentic things, it can ask you questions, and so you can give it the context that it needs to decipher things. But then if it had a way to store those types of interpretations over time, in, like, a CLAUDE.md equivalent or whatever it would be for this particular tool, that would be helpful too, as it could then use that in RAG-type retrieval.
Google is introducing enhanced tool governance for Vertex AI Agent Builder through Cloud API Registry integration, allowing administrators to centrally manage and curate approved tools across the organization while developers access them via a new API Registry object in the Agent Development Kit. This addresses the duplicative work problem, where developers previously built tools separately for each agent, and gives enterprises better control over what data and APIs their agents can access.
The ADK now supports Gemini 3 Pro and Flash models with full TypeScript compatibility, plus improved state management features including automatic recovery from failure, human-in-the-loop pause and resume capabilities, and conversation rewind functionality. The new interactions API integration provides consistent multimodal input and output handling across agents, while A2UI enables agents to pass UI components directly to applications without the security risks of executable code.
So that's, that's helpful.
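For context, this is the pattern the registry centralizes. Today, in the Python ADK, a tool is typically a function wired into each agent by hand; the new API Registry object would let you reference a curated tool instead of re-declaring it per agent. A minimal sketch, with the model name being an assumption based on the Gemini 3 support mentioned here:

```python
from google.adk.agents import Agent

def get_order_status(order_id: str) -> dict:
    """Look up an order in the fulfillment system (stubbed for the sketch)."""
    return {"order_id": order_id, "status": "shipped"}

# Per-agent tool wiring, the duplication the API Registry aims to remove:
# every team re-declares functions like this instead of pulling an
# approved tool from a central registry.
support_agent = Agent(
    name="support_agent",
    model="gemini-3-flash",  # assumed model id; use one available to you
    instruction="Answer order questions using only the provided tools.",
    tools=[get_order_status],
)
```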
[00:37:13] Speaker C: Yeah, I mean, it just goes to show, you know, how early we are in this ecosystem. Right.
Companies are just starting to get wise to the fact that they've got a whole bunch of developers, you know, using these platforms, and they're all kind of doing their own things in separate little silos, and there's very little ability to share or get any kind of optimization with those central resources.
And it's, you know, I do think that this is a good thing. I do feel like we're not quite there yet with the solutions, but I think it'll, it'll get there.
I'm kind of glad to see these, these types of features being developed directly into like the Vertex AI platform and not just sort of packaged up and as part of Google's other AI products that are more sort of like turnkey enterprise ready, which is nice.
[00:38:05] Speaker A: Google's launching VM Extension Manager in preview to centralize and automate the installation and lifecycle management of OS agents across Compute Engine fleets.
The service eliminates manual scripting and startup script dependencies by providing policy-driven control that can reduce operational overhead from months to hours, according to Google. The preview supports three critical extensions at launch: the Cloud Ops Agent for telemetry collection, the Agent for SAP for SAP workload monitoring, and the Agent for Compute Workloads for workload evaluation. Administrators can pin specific extension versions or let the system automatically deploy the latest releases, with more extensions planned for the future. VM Extension Manager offers two rollout speeds: slow mode, which executes zone-by-zone deployments over five days by default to minimize risk, and fast mode, which enables immediate fleet-wide updates for urgent security patches. Zonal policies at the project level are available now, with global policies and organization or folder level policies coming in the next few months.
The service integrates directly with the existing compute.googleapis.com API without requiring new API enablement or discovery. I like this. I wish it already had third-party support beyond SAP, like all of your security tools, Qualys and CrowdStrike and other antivirus solutions. It'd be great to have that all available through this. Hopefully this gets a lot of vendors to jump on board very quickly, because this is a nice feature.
[00:39:22] Speaker C: And if it does, it would sure hook me, because I'm currently using the OS Config part of Compute for the day job, like the policy evaluation with validation and then enforcement that automatically fixes it. But it is a whole bunch of custom shell scripting I'm having to write and test across multiple operating systems to do this. So how awesome would it be if this already supported all of my security agents and the monitoring agents and stuff we had. But I will say that, in my experience with these in GCP, slow mode is going to be really slow. Five days, super conservative maybe, right? Because I will tell you that things I am promised will take anywhere between, you know, five and 20 minutes end up taking hours, and so I'm struggling with long iteration cycles working on these tools. So I don't know, I wish it was a little bit more.
[00:40:19] Speaker D: Well it's also because we're impatient.
[00:40:21] Speaker A: Yeah, interesting. If five days became an SLA they guaranteed, that'd be nice, right?
[00:40:26] Speaker C: Or tunable.
[00:40:27] Speaker D: I like that they released both of those day one, you know, slow and fast mode, because I feel like so many things would just do fast: okay, here you go, we just install this. Which is great, like they said, for patching, or, you know, a vendor having a bug that takes out OSes, anything along those lines, to get past it.
[00:40:47] Speaker A: Damn crazy.
[00:40:48] Speaker D: Yeah, I was trying to not.
[00:40:50] Speaker A: Say that they're not a sponsor today.
[00:40:52] Speaker D: But at the same point, like, it's nice they thought of both of them, but.
[00:40:55] Speaker C: Why not tunate, right? Like you have interruption capacities, you define number of this. This is just a big knob for slow versus fast.
[00:41:02] Speaker D: But at least they thought about that day one. Yeah, that's, I guess, my point. It's nice that they at least thought about that day one. Versus, you know, to me this is an MVP: they gave you slow and fast, and now they need to give you that knob to say, well, maybe we do the first burst quickly and the second burst slowly, et cetera, et cetera. And they add more features to this over time.
[00:41:23] Speaker A: Cloud SQL for MySQL Enterprise Plus edition now includes optimized writes, a feature that automatically tunes five different MySQL parameters and configurations based on real-time workload metrics to improve your write performance. The feature is enabled by default on all Enterprise Plus instances and requires no manual intervention or configuration changes. Google reports up to 3x better write throughput compared to the standard Enterprise edition, with reduced latency, particularly beneficial for write-intensive OLTP workloads. Performance gains vary based on machine configuration, and the feature complements the existing SSD-backed data cache that provides up to 3x higher read throughput. The optimized writes feature works by automatically adjusting MySQL flags, data handling and parameters in response to instance and workload characteristics.
And if you don't have Enterprise Plus, you can upgrade easily, unlike Azure, which requires you to redeploy.
[00:42:11] Speaker D: Yeah, that sounds about right actually.
[00:42:15] Speaker C: I mean, they have to write some sort of optimization to protect from badly written SQL queries, so it makes sense.
Like select star, join all tables, blah blah blah.
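Google doesn't enumerate which five parameters optimized writes adjusts, but if you want to watch the effect on an instance, snapshotting the usual write-path variables before and after is a reasonable start. A sketch, with the flag patterns below being common write-throughput knobs, not a confirmed list:

```python
import mysql.connector

# Snapshot write-path server variables so you can diff them as the
# workload changes. The patterns are typical write-throughput knobs,
# only a guess at what optimized writes actually touches.
conn = mysql.connector.connect(host="10.1.2.3", user="admin",
                               password="...", database="mysql")
cur = conn.cursor()
for pattern in ("innodb_flush%", "innodb_io_capacity%", "binlog%"):
    cur.execute("SHOW GLOBAL VARIABLES LIKE %s", (pattern,))
    for name, value in cur.fetchall():
        print(f"{name} = {value}")
conn.close()
```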
[00:42:26] Speaker A: Moving on to Azure, we're on the final stretch. Matt, it's getting late.
[00:42:30] Speaker D: Sorry I'm fading here.
[00:42:32] Speaker A: Yep. Microsoft is acquiring Osmos to bring agentic AI capabilities to Fabric for autonomous data engineering workflows. Osmos uses agents to automate the data preparation tasks that typically consume most of data teams' time, transforming raw data into analytics-ready assets in OneLake without manual intervention. The acquisition addresses a common enterprise challenge where organizations have abundant data but lack efficient ways to make it actionable. Osmos will integrate into Microsoft's Fabric unified data platform, allowing AI agents to handle data connection, preparation and transformation tasks that currently require significant manual effort and technical expertise. So what I see here is, before, we'd always talk about this as being ETL jobs and ETL tooling, and now we've given you AI agents to do the ETL work so you don't have to do the heavy lifting. And I'm sure there'll be a rash of these types of acquisitions in the next year as well.
[00:43:19] Speaker C: As long as they deliver on the promise, right? Because there have been solutions that make the same promise, not with AI. Like, you know, I've long bashed AWS Glue, because it's supposed to do automatic schema detection and be able to do that automatic translation, and it just never works the way it should. And it drives me nuts that it's so failure prone: one line in a log will not have a field and the whole thing shuts down, or its automatic schema detection is completely off base. And so, you know, as long as the AI and agentic additions to these things actually fix those common problems, that would be fantastic. And, you know, I think it could be a differentiator for Fabric, almost enough to make me use it.
[00:44:05] Speaker A: Almost.
Almost.
You just talked about how much you love reporting tools earlier.
[00:44:11] Speaker C: I do love reporting tools and data and parsing, but I also hate reporting tools, data and parsing. Nice.
[00:44:19] Speaker A: Well, Azure is apparently deploying Nvidia's next generation Rubin platform at scale, with infrastructure already designed to handle its power, cooling and networking requirements. Microsoft's Fairwater data centers in Wisconsin and Atlanta can accommodate Rubin's 50 petaflops per chip and 3.6 exaflops per rack without retrofitting, representing a 5x performance jump over the GB200 systems. I'm pretty sure I can't get this in a desktop form factor. The deployment leverages Azure's systems approach, where compute, networking, storage and software work as an integrated platform, with key technical elements including support for sixth generation NVLink with 260 terabytes per second bandwidth, ConnectX-9 1,600 gigabit networking, HBM4 memory, advanced thermal management, and a pod exchange architecture for rapid hardware servicing without extensive rewiring. Azure's track record includes operating the world's largest commercial InfiniBand deployments and being first to deploy both GB200 and GB300 NVL72 platforms at scale. The company's multi-year collaboration with Nvidia on co-design means Rubin integrates directly into existing infrastructure, enabling faster customer deployments compared to competitors who need infrastructure upgrades, per Microsoft.
I'm sure Google and Oracle and everyone say otherwise.
[00:45:26] Speaker D: I mean, I feel like part of what this means in my head is also: did they over-engineer to start off? Like, it's great that you're able to do it, and it's amazing, but did you over-engineer, or did you kind of already have this in mind, that you'd need this next level of scale?
Because I always say build to, you know, maybe the next level. But if you're building multiple levels ahead, you're probably building something you don't necessarily need yet.
Maybe it's different when it comes to actually building buildings. That's not at all confusing.
[00:45:55] Speaker C: But yeah, I do think that, you know, this is just the beginning of that arms race, and I don't know if it's over-engineered or if this is the product marketing tilt, because I think there's a lot of that usually with these announcements. But I do think they're solving problems that are going to plague everyone that has, you know, on-premise infrastructure to manage. It's always been sort of a challenge, you know, to get cooling and electricity at the density you need, and it's just getting worse and worse and worse.
[00:46:25] Speaker A: And I have one last story for us before we turn into pumpkins. Oracle is set to build and power a new data center in Michigan, in Saline Township specifically. This data center is specifically to serve OpenAI's infrastructure needs, marking another major cloud capacity expansion for AI workloads.
The facility will use closed loop, non-evaporative cooling systems that consume water comparable to an average office building, rather than the millions of gallons daily of traditional evaporative systems. The project includes a 17 year power agreement with DTE Energy where Oracle pays 100% of energy costs, including new transmission lines and on-site substations, with Michigan law prohibiting utilities from passing data center costs to existing ratepayers. Oracle claims its large customer contribution to DTE's fixed costs will reduce overall energy costs for other customers by approximately $300 million annually by 2029 and 2030. But in the meantime, you'll be paying through the nose. The facility will create 2,500 union construction jobs and 450 permanent on-site positions, plus an estimated 1,500 jobs across Washtenaw County, with construction scheduled to begin in Q1 of 2026. Oracle is developing only 250 of 575 acres, with the remaining land protected as open space, farmland, wetlands and woodlands, including 47.5 acres in conservation easements. This will be Oracle's 148th data center, with 64 more under construction globally, but the company provides no specific pricing or service details for customers beyond OpenAI.
[00:47:43] Speaker D: They have 148 data centers. Holy crap.
[00:47:46] Speaker A: I mean that's what I read and.
[00:47:48] Speaker D: A whole article that's what I took from it.
[00:47:50] Speaker C: So half of them are. They've got a couple servers running around in the back of a truck.
[00:47:54] Speaker D: Yeah.
[00:47:54] Speaker C: Like it's not.
[00:47:55] Speaker D: Yeah, yeah.
[00:47:56] Speaker A: Some of them are in some executives garages in countries and stuff.
[00:48:00] Speaker C: I mean, I do think it's interesting that the closed loop non-evaporative cooling systems are becoming sort of the thing because of water usage in AI. But didn't we move to evaporative cooling because all of the refrigerants were harmful to the environment, and that was disastrous for non-evaporative cooling? So I really wonder, maybe technology has improved to where this isn't as big of a concern.
[00:48:24] Speaker D: I think it's straight water, because I remember we talked about Microsoft's data center in Georgia, or someone's. Bolt will tell us at one point, when it just does the episode for us, when Justin gets too much spare time on his hands again. But the short version is, I think they said they pre-filled the system with water, and it's like, you know, 300,000 houses' worth or some massive sum, and then over time they don't need any more water for X number of years. So I think it's a closed loop, straight water system. They just have enough of a buffer, and they have enough cooling in there, that they don't need the extra coolants. That's at least what I understood from it, which could be a hundred percent wrong, I'm not gonna lie.
[00:49:09] Speaker A: I mean, I've seen other closed loop water systems. It typically doesn't require more than a few gallons to be added, you know, in a given week. So it's a very low consumption amount, and that's typically not a large amount for most data centers. But in some cases they can evaporate some water out through, you know, basically condensation going in and out of certain coolers. That's typically where they lose some water capacity. But most of it is recycled completely and cooled and stored.
[00:49:38] Speaker C: But are you trading the water concern for, you know, the high energy costs of generating refrigerant, and the transport of this refrigerant, and all the reasons why we, you know, moved away from it, which was mostly because it was expensive, let's be honest.
But it's, you know, because that's. You have to cool it with something.
[00:49:58] Speaker A: I assume that you are going to use solar at some level, or, you know, maybe nuclear power plants at some point in the future, because energy is going to become a big problem everywhere for lots of reasons. But the water usage is not really the big concern right now with these AI data centers. It's typically light pollution, apparently, from some Facebook data center I read about in Virginia.
[00:50:19] Speaker C: Well, that's nuts.
[00:50:20] Speaker A: Or it's typically power usage and pushing grids over the edge. So the protections in the law in Michigan are pretty impressive.
They can't pass that along to customers. So that's good. Yeah.
[00:50:32] Speaker C: And I do like that this is a suburb of Detroit too, which has seen, you know, economic downturns, and so this is bringing industry and jobs to that region, which is cool.
[00:50:42] Speaker D: Person that goes to that area quite often. For my day job, it's like south of Detroit or south of Ann Arbor, which, you know, major university and everything. It's not that far out of the way, though. I can honestly say I've never been down to that area, but it seems to be close to, you know, populated areas.
[00:51:01] Speaker A: All right, gentlemen, another week in the box. We're all done.
[00:51:06] Speaker D: First of the year done. Yeah.
[00:51:07] Speaker A: Next week, hopefully, the cloud providers start dropping some more news. It's been a slow couple of weeks here coming out of Christmas, which I did appreciate, but we also could use a little bit of stuff to talk about.
[00:51:17] Speaker C: This show's approaching an hour. I think we're okay.
[00:51:21] Speaker A: I think we're fine.
[00:51:22] Speaker C: Yeah.
Could do with less news.
[00:51:26] Speaker A: Yeah, less news would be fine, too. All right, I'll talk to you later.
[00:51:28] Speaker C: All right, bye, everybody.
[00:51:30] Speaker D: Bye, everyone.
[00:51:34] Speaker B: And that's all for this week in Cloud. Head over to our
website at thecloudpod.net where you can subscribe to our newsletter, join our Slack community, send us your feedback and ask any questions you might have. Thanks for listening and we'll catch you on the next episode.