[00:00:08] Speaker B: Welcome to The Cloud Pod, where the forecast is always cloudy. We talk weekly about all things AWS, GCP, and Azure.
[00:00:14] Speaker A: We are your hosts, Justin, Jonathan, Ryan and Matthew.
Episode 314 recorded for the week of July 22nd.
Vector? I hardly knew her! S3's new AI storage play. Good evening, Matthew.
[00:00:30] Speaker C: How are you?
Awake. How are you, Mr. Ryan?
[00:00:34] Speaker A: Awake.
I think it's a stretch this week.
[00:00:38] Speaker C: Did you nap today?
[00:00:40] Speaker A: I did not. I tried. It didn't work.
[00:00:42] Speaker C: That's the problem.
[00:00:44] Speaker A: Yeah. And it's just the two of us again because apparently we didn't submarine the entire show last time.
[00:00:51] Speaker C: I'm pretty sure we lost half of our listeners, but don't worry about that.
[00:00:54] Speaker A: Yeah.
For the other half of you that are still listening, thank you.
[00:00:58] Speaker C: They'll be gone shortly.
[00:01:02] Speaker A: Most likely.
All right, let's get started with a follow-up: SoftBank and OpenAI's $500 billion AI project struggles to get off the ground. So it's the AI effort that was unveiled at the White House about six months ago, where they pledged $100 billion up front; it now has a more modest goal of building a small data center by the end of the year in Ohio.
Like kind of a big, you know, change from their previous commitments. SoftBank had committed $30 billion earlier this year, one of the largest-ever startup investments by SoftBank, and it led them to take on new debt and sell some assets.
The investment was made alongside Stargate, giving them a role in physical infrastructure needed for AI.
Altman, though, has been eager to secure computing power as quickly as humanly possible and has proceeded without SoftBank funding. Publicly, they say it's a great partnership and they look forward to advancing projects in multiple states.
Oracle was part of Stargate, but the recent $30 billion deal just signed with OpenAI seems a little suspect. The commitment includes 4.5 gigawatts of capacity and would consume the equivalent power of more than two Hoover Dams, or about 4 million homes.
Oracle was also named as a partner in the deal with the UAE firm MGX. But Oracle CEO Safra Catz said last month that Stargate hadn't been formed yet.
[00:02:33] Speaker C: I mean, it doesn't surprise me. Everyone's like, oh, how hard can it be to build a data center? But it's city zoning, power consumption, grid improvements, water for cooling (or depending how you're cooling), and then getting communities to approve, and these things end up being a massive undertaking. It takes the hyperscalers a long time to get these things up and operational. So it doesn't surprise me that a large, or a small, data center by the end of the year is probably something that was already in the works beforehand, that they're just taking over other plans, because most data centers take a couple years to really get up and operational, get the equipment in, and then shipping all the equipment there takes time. That's why Microsoft, AWS and Google announce these things years in advance, as it takes them time to procure everything.
[00:03:30] Speaker A: Yeah, I'm just getting kind of tired of these grandiose sort of launches where they talk about the future and all these things, and then the reality is they haven't really planned through this enough. And so it always turns from this, you know, $500 billion effort to a data center.
And you know, I'm sure the plans haven't really, you know, gone away entirely that they had. They're just changing the timelines on these things. But it does, it just sort of bothers me.
[00:03:57] Speaker C: Yeah, I mean, the big things make the news. It makes you think that something's going to happen. And I think when we initially talked about the story, you read the fine print and it's like, wait a second, in 2040 they will have spent the $500 billion. You know, it's like, okay, cool, what's it actually going to do today? So what, a new data center is going to cost a billion dollars? Maybe. I mean, I can't imagine. I have no concept of it. I'm like, I deal with pennies every day. You're talking billions, you know, but I.
[00:04:28] Speaker A: Don't think it's a billion. I don't think it's.
[00:04:30] Speaker C: No, it's going to be in the probably 100 million or so, you know, pocket change, you know. But yeah, it doesn't surprise me that it's harder than they anticipated. Like, really, who would have thought? There's a reason why everyone said, eh, screw building my own data center and managing it. Let me go to the hyperscalers and pay the premium, even if I'm going to lift and shift.
[00:04:51] Speaker A: Yep.
[00:04:53] Speaker C: So as you all know, for some reason we all find undersea cables very interesting, and I haven't figured out why all of us do yet. But there's an interesting article I came across about transatlantic communication and how they're leveraging cables for other items. So scientists have developed new instrumentation that transforms existing undersea fiber optic telecommunication cables into ocean sensors, measuring variations in light signals between repeaters, enabling them to monitor water temperature, pressure and tidal patterns without disrupting phone or Internet service.
The technology uses fiber gratings at the cable repeaters, roughly positioned every 50 to 100 kilometers (so there's a lot when you cross the Atlantic Ocean), to reflect light back, allowing researchers to measure slight changes in travel time that indicate the effects of the surrounding water on the cable.
The 77-day test on the EllaLink cable from Portugal to Brazil successfully measured daily water temperatures, weekly temperature variations and tidal patterns across 82 subsections, demonstrating the potential for the global submarine cable network to serve a dual purpose. The technology could enable early tsunami warning systems and long-term climate monitoring by leveraging millions of kilometers of existing cables, providing valuable ocean data without requiring new sensor deployments.
[00:06:23] Speaker A: Yeah, it's super cool. I mean, yeah, I don't know, it feels like our version of like getting into World War II or something. But like, yeah, like all of us sort of research these cables and, and you know, how they work just seems a little bit of a mystery. How do those repeaters work? Like what's the infrastructure like?
The fact that they're now using these things for sensor data is just super cool - the scale of it all, to get those numbers over that distance, especially as things are changing and warming up. I think it's powerful to know the currents and things, because it has so much impact on our weather and all kinds of things. So, I don't know, I just think it's the coolest thing ever that not only are they building these huge information superhighways, but now they're even adding more value to them.
[00:07:18] Speaker C: And I'm sure you're measuring, like, nanoseconds in there from the light repeaters, because, what, crossing the Atlantic is 80 milliseconds from probably my house to London.
So you're measuring microseconds at that point to see each of these sections.
And I don't really understand how the repeaters even work. It's not like it's a switch. It's not like there's IP. I fundamentally don't understand. My brain doesn't go down to that level. It's magic, right? It's like somehow this thing takes the signal and repeats it, you know, and it's doing that across multiple fibers. Anyway, bypass that whole thing. But they're able to measure down to these microseconds, I would assume, and be able to figure out from there the temperature correlation. And leveraging existing infrastructure, versus deploying new things which would require manufacturing boats or whatever to deploy it, plus more infrastructure to manage it - leveraging the same thing when you're talking temperature changes and whatnot makes a big difference. And if they can get half of the several hundred, if not thousand, undersea fiber optic cables to provide this data, it could be another data point that we can leverage for hurricanes or anything else going on out there in the ocean that we just don't know about without satellites nowadays.
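To put rough numbers on that timing intuition - a back-of-envelope sketch in Python, assuming textbook silica-fiber values (group index around 1.468, thermo-optic coefficient around 1e-5 per degree C); the real instrumentation is far more sophisticated, but it shows why tiny per-span timing shifts translate into usable temperature data:

```python
# Back-of-envelope: how a temperature change shows up as a travel-time
# shift on one repeater-to-repeater span. Typical silica-fiber values
# assumed; real cables will differ.
C = 3.0e8            # speed of light in vacuum, m/s
N_GROUP = 1.468      # group index of standard single-mode fiber (assumed)
SPAN_M = 70_000      # repeater spacing, roughly 50-100 km per the story

transit_s = SPAN_M * N_GROUP / C   # one-way transit time across one span

# Thermo-optic effect: fractional change in optical delay per degree C.
# dn/dT for silica is on the order of 1e-5 per deg C (assumed figure).
DTAU_FRAC_PER_C = 1.0e-5

shift_per_c = transit_s * DTAU_FRAC_PER_C
print(f"transit per span  : {transit_s * 1e6:7.1f} microseconds")
print(f"shift per degree C: {shift_per_c * 1e9:7.2f} nanoseconds")
print(f"shift per milli-K : {shift_per_c * 1e12 / 1000:7.2f} picoseconds")
```

So a whole-degree swing moves one span's delay by only a few nanoseconds, which is why the reflectors at every repeater matter: they localize the measurement to a single 50-100 km section instead of smearing it across the whole ocean.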
[00:08:40] Speaker A: Yeah.
Yeah, it's wild. That's pretty cool.
[00:08:44] Speaker C: Yeah, that was a fun story.
[00:08:47] Speaker A: All right. Moving on to "AI Is How ML Makes Money." Amazon-backed Anthropic rolls out new Claude AI for financial services.
So this new service is a tailored version of Claude for Enterprise, specifically designed for financial professionals to analyze markets, make investment decisions and conduct research, using Claude 4 models with expanded usage limits.
The solution integrates with major financial data providers including Box, PitchBook, Databricks, S&P Global and Snowflake for real-time financial information access, with availability through AWS Marketplace now and Google Cloud Marketplace coming soon.
This represents Anthropic's strategic push into enterprise AI following their $61.5 billion valuation in March.
Targeting financial services as businesses increasingly adopt generative AI for customer-facing functions makes sense.
The offering includes Claude Code capabilities and implementation support, positioning it as a specialized alternative to general-purpose AI assistants for complex financial analysis tasks.
Cloud providers benefit from this vertical-specific AI approach, as it drives compute consumption through AWS and Google Cloud while demonstrating how foundation models can be packaged for specific industry needs.
Pretty neat.
[00:10:12] Speaker C: It's literally why we named this section "AI Is How ML Makes Money."
Ta da.
[00:10:19] Speaker A: Yeah.
[00:10:20] Speaker C: Took us what, three years after renaming that section to get to that point?
[00:10:24] Speaker A: But yeah, I mean, I don't know. I think we've proven this one over and over again, because it is sort of the funny bit: you are just sort of looking over large data sets and providing insights on them. Right? We've been doing that for a while. And the training of the models, I think, is the neat part of these things - it's the same foundational model, but with the financial tooling kind of layered on top of it. I assume that there's, like, RAG customizations and other things sort of at play.
[00:10:56] Speaker C: That's kind of what it is. Like, can you get enough in a base foundational model to just do a little bit of RAG on top of it? What's the minimal amount you can do? Because that's the expensive part to then run your model at that point. And if you're able to leverage all the data sets, then hopefully you don't need to do as much RAG if you're leveraging Claude 4 or the latest models out there.
[00:11:21] Speaker A: Well, and I wonder how much of these large data sets with these providers - you think about MCP and that sort of integration with these things, where it's much less of a generative AI approach of, like, you know, generating this opinion on the data, and much more just sort of asking it questions and it generating the query on the back end to actually get the data.
[00:11:43] Speaker C: So yeah, yeah, the thing that always annoys me is: hey, before you actually can do a support ticket - before, there were the dumb bots; now there's generative AI bots. I'm like, no, no, no. I've done the research. I can't find this API. It doesn't exist.
It's like, have you checked our API docs? No, I decided to not check your API docs. This might have been a pain in my existence this past week. Are you sure you've checked this? How about this API? It's like, please open a ticket with a human.
Humans are busy. It will take longer. I don't care. Yeah, sorry, that was my rant of the week about AI bots. But I'm like, I've officially tried this already, guys. Thank you for playing.
[00:12:26] Speaker A: Yeah, I'm more frustrated that the AI bots aren't better than that, because they're worse - you can't get around them as easily as the stupid bots.
But.
But they provide the same level of dumb in terms of answers, typically.
[00:12:41] Speaker C: Well, it's like, by the time I've decided I'm going to give in and call it quits, I've done my research. You know, I guess maybe I'm not the normal person, but I'm like, I don't want to talk to support - that means that I failed at that point. So let me go fight with this until I'm, you know, bleeding from my head from banging against the wall on the same error for 12 hours before I open the support ticket. And they're like, have you tried a comma true at the end? And you're like, wait, damn it. Yeah, you know, so, like, I try not to as long as I can, you know, before somebody tells me you're dumb.
[00:13:18] Speaker A: Yeah. So, yeah, I'm the same way. I think that we are atypical, and I think we just know too many things. And so it's like, well, I got an idea, I can try this. Oh, I got an idea, I can try this. I got an idea, I can try this. Until you're blue in the face.
[00:13:32] Speaker C: I still will never forget: I was debugging something, and it was after a red-eye flight, I didn't really sleep, and I was writing some CloudFormation back when you could only do JSON. And I was like, oh, I give up. And I opened a support ticket about CloudFormation. They were like, you're missing, like, the word reference or something there. And I was like, I hate everyone. Yeah, thank you, AWS support, for being actually useful.
[00:13:56] Speaker A: Yeah, it is always funny, because whenever I do get to that point of desperation and open the case, it's almost every time something stupid like that.
[00:14:05] Speaker C: Yeah. Back in the day when I had to manage firewalls, it was always a NAT rule.
I cannot get NAT rules on firewalls to save my life.
I just couldn't. TwelveLabs video understanding models are now available in Amazon Bedrock. TwelveLabs brings two specialized video understanding models to Bedrock: Marengo for video embedding and search, and Pegasus for generating text from video content.
These models enable natural language queries like "find the scene where the main characters first meet" to locate specific moments in video libraries. The models were trained on AWS SageMaker HyperPod and support both synchronous and asynchronous interface patterns (because I don't know why you would do one and not the other).
Key technical capabilities include video-to-text summarization with timeline descriptions, automatic metadata generation, and vector embeddings for similarity search. The models accept video inputs via S3 URIs and base64-encoded strings, which is a very long set of text.
Practical applications span multiple industries. Media teams can search for dialogue across footage libraries, marketing can personalize content at scale and security teams can identify patterns across multiple video feeds for your physical data center. Not what Ryan does.
Pricing follows standard Amazon Bedrock model pricing, and it's available in a limited number of regions.
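If you want to kick the tires on the Pegasus side, it goes through the standard Bedrock runtime API - a minimal sketch based on the launch materials, where the model ID, request-body shape, and S3 locations are all assumptions to verify against the Bedrock model catalog:

```python
# Hedged sketch: ask Pegasus a question about a video already in S3.
# invoke_model is the standard Bedrock runtime call; the model ID and
# body schema below follow the launch materials and may differ.
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

body = {
    "inputPrompt": "Find the scene where the main characters first meet.",
    "mediaSource": {
        # S3 URI input; base64 strings are the other option, but that
        # really is "a very long set of text" for any real video.
        "s3Location": {
            "uri": "s3://my-video-bucket/episodes/s01e01.mp4",  # hypothetical
            "bucketOwner": "123456789012",                      # hypothetical
        }
    },
}

resp = bedrock.invoke_model(
    modelId="us.twelvelabs.pegasus-1-2-v1:0",  # verify in the model catalog
    body=json.dumps(body),
)
print(resp["body"].read().decode())  # JSON containing the generated text
```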
[00:15:42] Speaker A: Marengo. Marengo.
[00:15:45] Speaker C: I definitely stumbled saying that. Hopefully they cut out my long pause.
[00:15:49] Speaker A: No, edit that to yeah, probably not now because we're talking about it.
[00:15:54] Speaker C: So it's fun.
[00:15:54] Speaker A: It's like, damn it. But yeah, I mean, Marengo - which I just learned is like a chicken dish or something like that. Like, what?
[00:16:03] Speaker C: It sounds like Argentinian but I don't know why. Yeah, I mean I feel like this is definitely something that came out of like Amazon video so that they were able to find stuff a lot faster. And this is like, hey, let's prioritize it. Like this is that next evolution.
[00:16:19] Speaker A: Yeah, I mean, I haven't heard of TwelveLabs before, and for all I know - because I don't follow the video AI space all that closely -
They're a powerhouse in the field, I'm sure. But it is interesting to see sort of that model garden sort of approach, where you have a whole bunch of these specialized AI models in a marketplace for their specialty things. And that's sort of the interesting thing, because generating video could be quite expensive. And so if you're flipping between, you know, the Amazon video model, this TwelveLabs model, and some other video model, like, do you have to spend 45 bucks each time to get the result you want? It's an interesting problem to have in these things, but it's pretty cool. I mean, if you're, you know, like, it's.
[00:17:16] Speaker C: It's another tool.
[00:17:17] Speaker A: It's another tool.
[00:17:19] Speaker C: Test the tool, see if it works for you.
All right.
[00:17:22] Speaker A: Moving on to cloud tools. Harness AI unveils advanced DevOps automation: smarter pipelines, faster delivery and enterprise-ready compliance.
Sure.
Harness AI brings context-aware automation to DevOps pipelines by understanding your organization's existing templates, tool configurations and governance policies to generate production-ready CI/CD pipelines that match internal standards.
Oh please God, don't look at our existing pipelines. That's not where we should start.
[00:17:54] Speaker C: Never, never start there. That's normally a bad life choice.
[00:17:57] Speaker A: Yeah.
The new automation platform uses large language models combined with a proprietary knowledge graph to provide AI-driven troubleshooting, natural language pipeline generation and automated policy enforcement, directly integrated into the Harness platform rather than as a separate add-on. This addresses the growing challenge of faster AI-generated code outpacing traditional pipeline capabilities, while managing increasingly fragmented toolchains and mounting compliance requirements.
Key capabilities include automatic pipeline generation, intelligent troubleshooting that understands your specific environment context, and built-in governance guardrails for enterprise-ready compliance without added complexity. I really doubt that - as someone who has to put in that compliance, it's always complex.
The solution is positioned as having an AI DevOps engineer on call 24/7 who already knows your system, helping teams move from idea to production faster while reducing the manual toil in the software delivery process.
[00:19:00] Speaker C: Totally, totally see an AI being a 24/7 on-call engineer that looks at our old pipelines, generates new ones, and definitely doesn't fall into the same old terrible patterns of things I've done to fix a bug at 2am that then stayed in production for five years.
[00:19:21] Speaker A: Yeah, I mean I really wonder like how this, you know, I do like that it's built into the existing tooling.
As a, you know, infosec professional - how is this compliance really put in? Because if I have to prompt it as the software engineer, that's not okay. But then how do I, from a central organization, provide that sort of governance at a level that's not actually just dragging everything to a screeching halt?
And so, like, you know, AI can do that somewhat, I suppose, but that's the level of decision-making where you're doing risk trade-offs. I don't know if I'm ready to offload that to AI.
[00:20:04] Speaker C: Yeah, I'm not there yet.
On to AWS. Last week was the New York City Summit, and there definitely were some good announcements both right before and during it. The first of which was a new feature for S3, because they just keep figuring out how to add more things to S3 at this point, which I am always impressed by. I think that's going to be one of my predictions for re:Invent: a new feature for S3. Like, just anything. I feel like that's not too general.
[00:20:36] Speaker A: It doesn't happen every year.
[00:20:38] Speaker C: You know what.
[00:20:39] Speaker A: Yeah, I think it's worth it. I think I'd let you make it, because...
I mean, you know, we haven't talked about it yet, but it is surprising when they add something of this scale to a foundational service that's been around for how long?
Crazy.
[00:20:54] Speaker C: That's their second service. SQS was the first. The only thing I learned from studying for one of the exams at one point.
[00:21:02] Speaker A: I would have thought it was EC2 Classic.
[00:21:04] Speaker C: No, no, no. S3 was definitely before then. Introducing Amazon S3 Vectors, the first cloud storage with native vector support at scale - because, you know, S3 is definitely at scale. Anyway, S3 Vectors introduces native vector storage in S3 with a new bucket type that can reduce vector search costs by up to 90% over traditional vector databases, because you're not running a database. This addresses the growing need for affordable vector storage as organizations scale their AI applications.
The service provides sub-second query performance with similarity search across tens of millions of vectors per index, with support for up to 10,000 indexes per bucket. Each vector can include metadata for filtering queries, making it practical for recommendation engines and semantic search applications.
Native integrations with Amazon Bedrock Knowledge Bases and SageMaker Unified Studio simplify building RAG applications, while OpenSearch - Ryan's favorite service - export features enable a tiered storage strategy, which is a very, very nice feature.
Organizations can keep infrequently accessed vectors in S3 Vectors and move high-priority data to OpenSearch for real-time performance. This is in preview in five regions with dedicated APIs for vector operations.
Pricing isn't specified yet, so hold on to your butts, guys. But the claimed 90% cost reduction suggests it can provide significant savings.
This positions AWS as the first cloud provider with native vector support in object storage, potentially completely disrupting the market. Thank you, commentary.
We can clearly tell we're playing with our bot still a little bit.
[00:23:02] Speaker A: Yeah, just a bit.
[00:23:04] Speaker C: I mean, if it does 50% of the claimed 90%, and especially with the Elasticsearch/OpenSearch integration - that's key right there, because you can really tier your data now at a much better level, for sure.
Even before, with OpenSearch, once you had to pull it in from S3, you saw a big performance hit.
That feature alone is well worthwhile.
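For the curious, the dedicated APIs look roughly like this - a minimal sketch assuming the preview s3vectors client in boto3, so names and parameters may still shift before GA:

```python
# Hedged sketch of the S3 Vectors preview APIs via boto3; bucket, index,
# and metadata names are hypothetical, and embeddings would normally
# come from a real model (Titan, Cohere, etc.), not hardcoded floats.
import boto3

s3v = boto3.client("s3vectors", region_name="us-east-1")

# Write an embedding plus filterable metadata into an index.
s3v.put_vectors(
    vectorBucketName="my-vector-bucket",
    indexName="docs",
    vectors=[{
        "key": "doc-001",
        "data": {"float32": [0.12, -0.05, 0.33]},  # real dims are much larger
        "metadata": {"team": "cloudpod", "year": 2025},
    }],
)

# Sub-second similarity search with a metadata filter.
resp = s3v.query_vectors(
    vectorBucketName="my-vector-bucket",
    indexName="docs",
    queryVector={"float32": [0.10, -0.02, 0.30]},
    topK=5,
    filter={"team": "cloudpod"},
    returnMetadata=True,
    returnDistance=True,
)
for v in resp["vectors"]:
    print(v["key"], v.get("distance"))
```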
[00:23:31] Speaker A: Yeah. I mean and it's, it is just, you know, I love this trend where instead of having just hardware running all the time, you know, you're. You're now accessing your data, you know, kind of at rest and with the tiering that they have, they've built into this, like that's just awesome. And I wouldn't think that, you know, vectorized searching would be something that anyone would think to put in S3. And so like it's like, you know, I read this article, I'm like that's genius, you know, because it's.
I want to know how it works and I still don't really, you know, like I can barely grasp, you know, what a vector DB is in general and how to use it.
But it is, it is sort of cool. Like I think that that's neat. I want to, you know, like play around and kick the tires and you know, because I think the devil's in the details of how those vectors get generated and making those and, and how usable they are. But that's. It's a pretty cool feature. I.
And it's just kind of neat.
[00:24:32] Speaker C: I'm curious about the pricing once they GA it. So, so expensive.
[00:24:36] Speaker A: It's gonna be all the money.
You know it is.
All the newer stuff on S3 is very expensive. Yeah, but it's all, you know, like in a lot of cases, worth it.
Announcing Amazon Nova customization in Amazon SageMaker: they've introduced customization capabilities for Nova directly within SageMaker AI.
The service offers both parameter-efficient fine-tuning using LoRA adapters for smaller datasets with on-demand inference, and full fine-tuning for extensive datasets requiring provisioned throughput, giving customers flexibility based on data volume and cost requirements.
Direct preference optimization and proximal policy optimization - wow - enable alignment of these models' output to specific requirements like brand voice and customer experience preferences, addressing the limitations of prompt engineering and RAG for business-critical workflows.
Knowledge distillation allows customers to create smaller, cost-efficient models that maintain the accuracy of the larger models - particularly useful when lacking adequate training data, such as samples for specific use cases.
Early adopters including MIT CSAIL, Volkswagen and Amazon's internal teams are already using these capabilities, with recipes currently available in US East through the SageMaker Studio JumpStart interface.
[00:26:01] Speaker C: I think I only understood about half the words you said there. Like, I got fine-tuning, but I don't know what parameter-efficient fine-tuning using a LoRA adapter is.
Like I'm just above my pay grade on some of these things.
[00:26:17] Speaker A: I mean, you do. It's such a fast field that, you know, I barely understand these things, and only because I've been working on a project in my day job to sort of get information based off of all of our internal IT data sets, right? Like, have a custom bot that simplifies our, you know, employee day-to-day and onboarding. And so, you know, I sort of understand the parameter-efficient fine-tuning - not in this direct context, and definitely not enough to talk about it more than, you know, it's how you weight certain things. And you have a lot of knobs behind the covers when you're using something like, you know, Google Agentspace.
Or SageMaker. Yeah, it's very complicated, and so it's everything about running a, you know, search index, but with a lot more knobs.
Kind of surprised it took this long to get into SageMaker like and you don't hear Nova talked about a lot. So it's cool to see it.
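For anyone else above their pay grade on that one: the core idea of parameter-efficient fine-tuning with a LoRA adapter fits in a few lines of numpy - purely illustrative of the math, not what the Nova recipes actually run:

```python
# LoRA in miniature: freeze the big pretrained weight matrix W and train
# only a low-rank update B @ A. Illustrative numpy, not a real recipe.
import numpy as np

d_out, d_in, rank = 4096, 4096, 8   # rank << dimensions
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))         # pretrained weight: FROZEN
A = rng.standard_normal((rank, d_in)) * 0.01   # trainable adapter half
B = np.zeros((d_out, rank))                    # trainable, zero-initialized

def forward(x, alpha=16.0):
    # Effective weight is W + (alpha/rank) * B @ A, but it is never
    # materialized; the adapter path is just two skinny matmuls.
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.standard_normal(d_in)
_ = forward(x)

full = W.size           # parameters a full fine-tune would update
lora = A.size + B.size  # parameters LoRA actually trains
print(f"trainable: {lora:,} of {full:,} ({100 * lora / full:.2f}%)")
```

That ratio - well under 1% of the weights - is why the LoRA path works with smaller datasets and on-demand inference, while full fine-tuning needs the provisioned-throughput treatment.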
[00:27:30] Speaker C: I mean, I feel like we're just so overwhelmed with Bedrock, and it just kind of consumes the life of all these things.
[00:27:38] Speaker A: Yeah, definitely.
[00:27:40] Speaker C: Introducing another feature of AWS: Amazon Bedrock AgentCore provides enterprise-grade infrastructure for deploying AI agents at scale, addressing gaps between proof-of-concept agents built with frameworks like CrewAI and LangGraph and production-ready systems. Because a POC never went to production in my day job, or in any day job I've ever seen before in my life. Just saying that this is a preview also, so don't mind that detail. The preview includes seven modular services: Runtime for serverless deployment, Memory for session management, Observability for monitoring, Identity for secure access controls, Gateway for API integration, Browser for web automation, and Code Interpreter for sandboxed code execution.
AgentCore Runtime offers isolated serverless environments with three network configurations - sandbox, public, and VPC-only - which feels weird that that wasn't part of the initial preview.
But don't worry about security. It enables developers to deploy agents with just three lines of code while maintaining session isolation and preventing data leakage. But don't worry, you can't run it in just your VPC yet.
AgentCore Identity implements a secure token vault that stores OAuth tokens and API keys, allowing agents to act on behalf of users with proper authorization across AWS services and third-party platforms like Salesforce, Slack and GitHub. This eliminates the need for developers to build custom authentication infrastructure - which is never a good life choice - while maintaining enterprise security requirements.
AgentCore Gateway transforms existing APIs and Lambda functions into agent-ready tools using MCP, providing unified access with built-in authentication, throttling and request transformation capabilities.
Because this is a preview, it's only available in a few locations, and more integrations will be added to the service. It is free right now for the next two months or so.
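The "three lines of code" claim translates to roughly this - a sketch based on the launch samples for the bedrock-agentcore Python SDK, with your actual CrewAI, LangGraph, or plain-Bedrock agent plugged in where noted:

```python
# Hedged sketch of wrapping an agent for AgentCore Runtime, following
# the launch samples (pip install bedrock-agentcore); details may vary.
from bedrock_agentcore.runtime import BedrockAgentCoreApp

app = BedrockAgentCoreApp()

@app.entrypoint
def invoke(payload):
    prompt = payload.get("prompt", "")
    # ... call your CrewAI / LangGraph / Bedrock agent here ...
    return {"result": f"echo: {prompt}"}  # placeholder response

if __name__ == "__main__":
    app.run()  # local test server; deploy via the AgentCore toolkit/CLI
```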
I mean, to me this is nice: we're trying to take the complexities and everything else and kind of standardize it in such a way that building these agents isn't everyone reinventing the wheel. It starts to be a little bit more standardized, and it's just as simple as, hey, build me a Flask app - cool, we have, you know, a standard way to install Flask - and it's not 500 different systems that everyone's custom-built in-house.
[00:30:13] Speaker A: Yeah, this takes me back to when ChatOps was first on the scene, and everyone built their ChatOps bot, and they could do all kinds of things. And then they realized that you basically just gave everyone in the company full admin access to tear down any machine or touch any interface. Right.
[00:30:34] Speaker C: That definitely wasn't something I've written a few times.
[00:30:37] Speaker A: These agent spaces - or AgentCore in this case - are, you know, how you sort of control that. So we are learning from our mistakes of the past, which is great, right? Because this allows, basically -
You have different parts of your business. They all want to access internal data sets, but there's, you know, who has access to what, you know, do you want, you know, you want your. Your support team to be able to sort of, you know, run queries across, you know, all their.
All your customers and all the customer support cases and all that. But there's a lot of financial data also in Salesforce, and a lot of sensitive data that, you know, if it got out, could put you in violation of SEC rules and a whole bunch of things like that. And so you have to put rules around these things, and that's really what these AgentCore and Agentspace services offer, which is great, because you can sort of apply rule sets around the data access - but you can still then grant that team the ability to write their own AI bot that's querying those services. And so I do really like this model, and I'm glad to see it come about so quickly. Because, you know, in the security space there's been a lot of, oh, well, we can't train AI on that data because there's too many sensitive things, we can't protect it. And so now we can actually unlock that, because there's tools now where we can go through and tailor it to our access profiles, and tune it so that certain things are only accessible indirectly, and put some rules around it. Which is cool.
[00:32:18] Speaker C: Well, where it needs to be is not in the training itself, but more in how you leverage, you know, returning that data. Because they want it to be trained on more things so it has more knowledge - but don't give me information like the company's current sales numbers and sales projection for the quarter if I'm an L1 support engineer. You know, so you gotta balance it out. But if a CEO comes in and is trying to, you know, not do what CEOs do - which would probably be ask somebody else for it - if they're trying to go play with it themselves and get the data themselves, they would say, hey, give me this, and yeah, they're obviously authorized. So I feel like it's also on the back end too, which is the hard part. So this is like one piece of it; now, how do you handle the data transformation to customers?
[00:33:08] Speaker A: Yeah, and it's also pretty cool. Like you know the, the isolated environments and so you can, you can put data sets and only make them available in the specific tier. And so like you can have from a user perspective, it's, it's all one thing. And you're, you're interacting with your making your agent or however you're using your internal chatbot, whatever it is all from one space. But you have these different levels of isolation that you can apply on the back end, which is really cool.
And I wonder, I bet you could even promote that depending on, you know, how you want to build out the application of those agents.
All right. Streamline the path from data to insights with new Amazon SageMaker catalog capabilities.
This is, like, the worst headline, because it should read: Amazon SageMaker now integrates with QuickSight directly in Unified Studio, allowing users to build dashboards from project data and publish them to SageMaker Catalog for organization-wide discovery and sharing.
This is awesome.
[00:34:20] Speaker C: Tell us more.
[00:34:22] Speaker A: I will once I learn how to make my mouth work.
This eliminates the need to switch between platforms and maintains consistent governance across analytics workflows. SageMaker Catalog adds support for S3 general purpose buckets with S3 Access Grants, enabling teams to discover and access unstructured data like documents and images alongside their structured data.
The integration automatically handles permissions when users subscribe to S3 assets, simplifying cross-team collaboration across diverse data types.
Automatic onboarding from the AWS Glue Data Catalog brings existing lakehouse datasets into SageMaker Catalog without manual setup, unifying technical and business metadata management.
This allows organizations to immediately explore and govern their existing data investments through a single interface.
The integrations require IAM Identity Center setup for QuickSight and appropriate S3 permissions, with standard pricing for each service applying. This is available in all commercial AWS Regions where SageMaker is supported. The features address the complete data lifecycle from ingestion to visualization. Real-world applications include medical imaging analysis in notebooks, combining unstructured documents with structured data for comprehensive analytics, and building executive dashboards - that's the number one use case, let's be honest - that stay automatically synchronized for Justin. Yeah, pretty pictures for Justin.
This has been the biggest challenge for.
[00:35:48] Speaker C: A long time coming.
[00:35:50] Speaker A: Yeah, it was funny to me, because I started getting into big data sets and nerding out - once you get the ability to query and generate insights from a very large data set, it's just super neat.
But then when you want to share that, it is super hard if you want to productionize it at all. It's just very complicated.
You know, similar to like the agent things, there's. There's permissions to different parts of the data sets. And if you, especially if you're hooking up to a data lake and then the ability to just have this natively integrate with, you know, your visualization BI tool is just awesome.
You know, I'm hoping that I don't have to configure my QuickSight dashboard. That AI will handle that for me and just make it all magically work. And pretty pictures generate themselves from the data set, but pretty cool. I like the automatic onboarding.
I like the permission sets where you're only showing data based off of IAM permissions.
Okay, cool.
[00:36:56] Speaker C: Yeah, I mean, they kind of piecemealed all those little things that you and I have probably done multiple times, you know, all those pieces - they glued it together. And building that service catalog, even though I think Justin still thinks that's a dirty word, you know, for people, is something that hopefully should make it easier to get people, like you said, onboarded, offboarded, and kind of make that person's life a lot easier.
[00:37:22] Speaker A: It's not a service catalog. It's just a Unified Studio catalog. Right. Yeah, I think it counts.
[00:37:29] Speaker C: Sorry.
[00:37:30] Speaker A: Yeah. Yeah.
Well, I mean, it's a different thing, right? You want to be able to sort of provide these packaged bits.
I just don't believe that you can have, you know - it's the argument of, like, how do you support reusable components? Anyway, we should not get into the service catalog offshoot again.
[00:37:52] Speaker C: Yeah.
So one of the oldest ways to get into AWS was the free tier, and the oldest way to get yourself upset with your credit card was to burn through that free tier and all of a sudden start getting charged. So AWS made it more complicated now, but slightly better: new customers can now get up to $200 to start exploring AWS. AWS introduces a new free tier structure with up to $200 of credits - hence the "up to." $100 is just for signing up, and the rest is gamification of the AWS console, with $20 for each of the following services: EC2, RDS, Lambda, Bedrock, and Budgets. I actually do really appreciate that Budgets is one of the key things to set up. These activities, I think, have to be completed in the first six months. New customers can choose between a free account - no charges for six months or until credits have expired, with limited service access - or paid plans with full AWS access, where the credits are automatically applied to the bill. Hopefully it doesn't break anyone's automation and take down an automatic deployment they're working on. The free account plan restricts access to enterprise-focused services but includes over 30 always-free services - I think S3 is one, and a few of the other ones - with automatic email alerts at 50%, 25% and 10% of credit remaining, and notifications at 15, 7 and 2 days before expiration, so you can actively ignore everything multiple times. This replaces the previous 12-month free tier with that always-complicated page - I don't know if you guys ever looked at it - of which service is free for how long and for where and what.
That's for new accounts created after July 15, 2025, while existing accounts remain on the legacy program - a notable shift in AWS customer acquisition strategy.
The required activities expose new users to core AWS services and cost management tools. Thank you for being a FinOps-focused company and teaching proper instance sizing and budget alerts from day one, rather than discovering that when your CFO or your personal credit card gets the bill. Definitely. I know we talked about cutting it, but I think it's kind of fun the way they gamified it a little bit and force you to go play with things - with the key one here being Budgets. I feel like that should have been required: in order to use EC2, RDS, and especially Bedrock, you had to set up that budget. It kind of forces people to fix, you know, a lot of those.
Hey, on Reddit - how many times have you seen, "Hey, I've accidentally caused a $300 bill because, you know, I launched an i3.4xlarge"? Like, yeah, that will do it.
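The Budgets activity they're gamifying boils down to a few lines of boto3 - a minimal sketch, with the account ID, limit, and email all hypothetical:

```python
# Minimal cost budget with an email alert at 80% of actual spend.
# create_budget is the standard AWS Budgets API; values are examples.
import boto3

budgets = boto3.client("budgets")

budgets.create_budget(
    AccountId="123456789012",  # hypothetical account
    Budget={
        "BudgetName": "free-tier-guardrail",
        "BudgetLimit": {"Amount": "10", "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    },
    NotificationsWithSubscribers=[{
        "Notification": {
            "NotificationType": "ACTUAL",
            "ComparisonOperator": "GREATER_THAN",
            "Threshold": 80.0,             # percent of the limit
            "ThresholdType": "PERCENTAGE",
        },
        "Subscribers": [
            {"SubscriptionType": "EMAIL", "Address": "me@example.com"},
        ],
    }],
)
```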
[00:40:46] Speaker A: Yeah. And if you think about, you know, AI and so you know the fact that Amazon Bedrock is in here, like it's easy to burn money with AI stuff like very unintentionally.
And so I do really like that they are putting budgets in there. I really think they should offer the first 20 bucks and then make you do budgets for the remaining 180.
[00:41:11] Speaker C: Yes.
[00:41:11] Speaker A: But, I mean, I will admit that I think I didn't play with a lot of services when I was experimenting using free tier, because, you know, the free tier was the free tier, with the timing and what was available. And so whatever my interest du jour was is probably where all that was focused, and I probably spent it all in one place. So I do think it's pretty cool to gamify exploring the different services, which is the entire point of the free tier.
So pretty cool. And I like all the notifications that they're giving you when your credits are about to expire, because that's always been a bane of my existence.
[00:41:56] Speaker C: Well, it's only if you're on that specific plan where you can use those services until then. So there are some caveats, because.
[00:42:04] Speaker A: Oh for sure.
[00:42:05] Speaker C: Yeah. But, you know, overall it's a great update. They're giving you a little bit more money - so maybe inflation's a cause too - but, you know, there's a lot of things that they're still kind of forcing.
[00:42:17] Speaker A: You to play with. Mm.
I imagine the return on that 200 bucks is like I bet you they don't even have to factor it in.
[00:42:24] Speaker C: Yeah, I'm pretty sure I've worked for many companies where that $200 for all customers for the year got burned with the one customer.
[00:42:34] Speaker A: Oh yeah.
[00:42:35] Speaker C: With one EVP or whatever they're calling it now.
[00:42:37] Speaker A: So like that's not even a big enough change. Right. To go look at. Like it would cost me more money in manpower to go figure out how we spent that $200. Yeah, yeah.
[00:42:49] Speaker B: There are a lot of cloud cost management tools out there, but only Archera provides cloud commitment insurance. It sounds fancy, but it's really simple: Archera gives you the cost savings of a one- or three-year AWS savings plan with a commitment as short as 30 days.
If you don't use all the cloud resources you've committed to, they will literally put the money back in your bank account to cover the difference.
Other cost management tools may say they offer commitment insurance, but remember to ask: will you actually give me my money back? Archera will. Click the link in the show notes to check them out on the AWS Marketplace.
[00:43:28] Speaker A: Monitor and debug event driven applications with new Amazon EventBridge logging.
EventBridge now provides comprehensive logging for event driven applications tracking the complete event lifecycle from receipt through delivery with detailed success failure information and status codes. This addresses a major pain point in debugging. Tell me about it.
where event flows were previously opaque, if accessible at all. The new feature supports three log destinations - CloudWatch Logs, Kinesis Data Firehose, and S3 - with configurable log levels and optional payload logging, which is notable, because usually you never get that data. Logs are encrypted in transit with TLS and at rest when using customer managed keys.
And the logs include valuable performance metrics like ingestion-to-start latency, target duration, and HTTP status codes, making it straightforward to identify bottlenecks between EventBridge processing and the target services.
What previously took hours of trial and error debugging.
[00:44:31] Speaker C: Yeah, tell me more.
[00:44:32] Speaker A: Can now be diagnosed in minutes. Yeah, exactly.
API destination debugging becomes significantly easier, as the logs clearly show authentication failures, credential issues and endpoint errors with specific error messages. This is particularly useful for troubleshooting integrations with external HTTPS endpoints and SaaS applications.
There's no additional EventBridge charge for the logging itself; customers only pay standard rates for S3, CloudWatch Logs, or Kinesis Data Firehose. Yeah, your Data Firehose - you're still paying for it.
[00:45:02] Speaker C: Just be clear.
[00:45:03] Speaker A: Yay.
The feature operates asynchronously with no impact to event processing latency or throughput.
Oh my God, where have you been all my life?
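Turning it on looks something like this - a sketch assuming the LogConfig shape described in the launch materials; the CloudWatch Logs, Firehose, or S3 destinations get wired up separately through CloudWatch's vended log delivery:

```python
# Hedged sketch: enable the new event-bus logging at TRACE level with
# full payload detail. Field names follow the launch docs; verify them.
import boto3

events = boto3.client("events")

events.update_event_bus(
    Name="my-app-bus",            # hypothetical bus name
    LogConfig={
        "Level": "TRACE",         # OFF | ERROR | INFO | TRACE
        "IncludeDetail": "FULL",  # include payloads -- mind PII and cost
    },
)
```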
[00:45:13] Speaker C: So many. Let me - there's a line: "previously took hours of trial and error debugging, can now be diagnosed in minutes." Yes. The amount of time I was like, okay, this talks to this, talks to this - where are we missing? What's the error? You know, trying to piecemeal it through was painful. So if this does, again, half of what it says, it's saved me a lifetime of effort.
[00:45:36] Speaker A: Yeah, like I don't know how many times I've had to set up like a dead letter queue and failure things just to get that payload to figure out what poison pill got into my stream.
Like what was in there that's causing this to just constantly retry and never stop, you know, like I've so many times have I like had to cobble together some sort of processing and routing and add it on to the application to actually get to that data.
And so this is a super handy feature that I hope comes to everything.
[00:46:11] Speaker C: It reminds me of when SQS gave you the ability in the console to see the first 10 items in the queue. It was like, oh my God, the amount of hacky scripts I've written to kind of just do this. It's one of those things that's so simple. Obviously it's a complex thing to set up, and you've got to be careful, because if you're pumping this to CloudWatch Logs.
[00:46:31] Speaker A: You know, that will be painful. That can get expensive.
[00:46:34] Speaker C: Yeah, real fast. You know, S3 not so much, but then you actually have to query it. So, you know, choose which way you want to pay - Athena or CloudWatch - and Firehose, where you're pumping all of it through, is still going to cost you money. But at least if you dump it to S3, you probably don't need it that often.
You're just using it when you set things up. So maybe you set up some sort of, like, hey, if dev, go to CloudWatch Logs, else go to S3, you know, and you put some logic in your infrastructure. Hopefully that will help you out.
[00:47:02] Speaker A: And how cool is it that you can configure the log level? So you may have something that's like, you know, send all error messages to CloudWatch Logs, and maybe you could set up the same type of thing with more configurable options for putting it in S3 for long-term storage and that kind of thing. It's kind of neat they offer that sort of tooling around it, where it's configurable at different levels.
[00:47:29] Speaker C: Because S3 has not gotten enough new features recently, and you love it: Amazon S3 Metadata now supports metadata for all your S3 objects.
S3 Metadata now provides complete visibility into all existing objects in S3 buckets through Apache Iceberg tables, eliminating the need for custom scanning systems and expanding beyond just tracking new objects and changes. The service introduces two table types: live inventory tables that provide a complete snapshot of all objects, refreshed within an hour (so, questionable for "live"), and journal tables that provide near-real-time changes for object auditing and lifecycle tracking. Good - you're about to figure out why. Pricing includes a one-time backfill of $0.30 per million objects, with no additional monthly fees for buckets under 1 million objects, and the journal table costs $0.30 per million updates - a 33% price reduction. The tables enable SQL queries through Athena for use cases like finding unencrypted objects, tracking deletes, analyzing storage costs by tags, and optimizing ML pipeline schedules by pre-discovering metadata. It's currently available in Ohio, Virginia and US West 1 - not 2, just to be clear - with tables automatically created and maintained by S3 Tables without requiring manual compaction or garbage collection.
Wow, there's a few weird ones in here. Like, really, US West 1? So somebody in the Bay Area definitely was the key driver for this, because...
Yes, yes. I mean, the other interesting piece is that the use cases are pretty interesting. Like, you know, cost by tags is a massive one, because I've definitely done a bunch of analysis of S3 objects - like, okay, do we take the one-time hit now, or get killed on it, you know, because it's costing us X amount? Versus do we accept tiering and everything else? There's a lot of pretty interesting things that can come down the line with this.
It's not terribly expensive to implement, either.
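As a sketch of that first use case - finding unencrypted objects - a query against the inventory table via Athena might look like this; the catalog, table, and column names are hypothetical, so check what S3 actually creates in your table bucket:

```python
# Hedged sketch: run a SQL query against the S3 Metadata inventory
# table through Athena. Table path and columns are illustrative only.
import boto3

athena = boto3.client("athena")

SQL = """
SELECT key, size, storage_class
FROM "s3tablescatalog"."b_my_bucket"."inventory"   -- hypothetical names
WHERE encryption_status = 'NOT-SSE'
LIMIT 100
"""

athena.start_query_execution(
    QueryString=SQL,
    WorkGroup="primary",
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
```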
[00:49:53] Speaker A: Well, I mean, it's really difficult to implement without making a huge mess, right? Like, I remember enabling exactly this kind of thing, and that was my first real "oh, I gotta go meet with the CFO" problem. Right.
[00:50:13] Speaker C: It's never a good day.
[00:50:14] Speaker A: Yeah, right. Because I wanted to know, you know, how the data was being used and the age of these things. And so I went querying a whole bunch of objects, and turns out, even at, what was it, like a fifth of a cent per query, there was enough data in there for that to be a problem.
[00:50:35] Speaker C: It's amazing how much cents, or fractions of cents, add up real fast.
I was explaining to my CFO and to my finance team - they're like, why can't you figure out exactly what things cost? And I'm like, okay, so tell me: if we have n number of unknown agents connected to n number of files uploaded into n number of places with n number of these, and everything costs different fractions of cents, you tell me.
And it adds up real fast.
[00:51:09] Speaker A: Oh yeah.
It's kind of interesting, because there's a lot of overlap with the existing S3 Inventory feature set as well, right? And so it's kind of taking that to the next level.
[00:51:20] Speaker C: Yeah.
[00:51:21] Speaker A: Because I think you get a lot of object metadata from that inventory.
But I guess this is more like - this is metadata you can organize yourself in Iceberg tables, maybe.
[00:51:34] Speaker C: Well, isn't there a cost also for tags in S3AFI?
[00:51:39] Speaker A: Oh, oh yeah, I'm sure there is.
There's at least API costs for tagging them.
[00:51:46] Speaker C: Right. So like you're going to get hit multiple times on this, so you got to be a little bit careful.
[00:51:51] Speaker A: Yeah, it's definitely very tricky and it's just, you know, that's, you know, it's not really S3's fault. It's just very easy to shoot yourself in the foot with anything. It's. That's kind of scale, right? Like that's.
Huh.
[00:52:05] Speaker C: In completely random news: apparently you actually get charged for dual-layer server-side encryption on S3 for blob storage. I didn't realize that.
[00:52:15] Speaker A: I did not either.
[00:52:16] Speaker C: You get free for SSE for S3, SSE for customer value key and SSE for kms, but you don't for double se kms.
Anyway, sorry. I've been playing with some dual-layer server-side encryption recently on some stuff at my day job, so it caught my eye when I saw that one.
[00:52:40] Speaker A: Well, I get why they do it, right? Because it's for any of these things.
If you don't have access to the keys then you can't dedupe it, right?
[00:52:49] Speaker C: But it's a cost per gigabyte to do it. So like. Yeah, I get it.
[00:52:53] Speaker A: Yeah, yeah, yeah.
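For reference, the line item in question comes from one extra header on put_object - a minimal sketch, with the bucket, key, and KMS alias hypothetical:

```python
# DSSE-KMS is just a different ServerSideEncryption value, but unlike
# SSE-S3 it bills per gigabyte encrypted. Names below are examples.
import boto3

s3 = boto3.client("s3")
s3.put_object(
    Bucket="my-bucket",
    Key="sensitive/report.pdf",
    Body=b"...",
    ServerSideEncryption="aws:kms:dsse",  # vs "AES256" or "aws:kms"
    SSEKMSKeyId="alias/my-key",           # hypothetical key alias
)
```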
[00:52:54] Speaker C: Anyway, enough of us squirreling.
[00:52:58] Speaker A: Yeah.
Simplify your serverless development with console-to-IDE and remote debugging for AWS Lambda. Hooray.
[00:53:08] Speaker C: We need, like, clapping - you know, when Justin's here, he's able to play, like, the clapping. Please don't do it. You'll blow my eardrums out.
[00:53:14] Speaker B: Ah.
[00:53:15] Speaker A: I was just loading it like I'm in charge. I can do what I want.
Except for I don't know how.
That's fine. Little delayed.
[00:53:39] Speaker C: This is why we're not allowed to do this without the adult supervision. You and I were like, oh, how many of these can we play? I think next time we should see how many of these we can play as we're doing it.
[00:53:49] Speaker A: But yeah, no, I think that our listeners are very lucky that you and I are both tired and kind of doing this last minute because otherwise we.
[00:53:56] Speaker C: Would probably have a sound effect per article.
That's the goal. Next time.
[00:54:01] Speaker A: Next time.
Yeah.
[00:54:02] Speaker C: Justin, this is what you get for traveling.
[00:54:08] Speaker A: Lambda now offers direct console-to-IDE integration with VS Code, adding an "Open in Visual Studio Code" button that automatically handles setup and opens functions locally, eliminating manual environment configuration and enabling developers to use full IDE features like integrated terminals and package management.
Remote debugging capabilities allow developers to debug Lambda functions running in their AWS account directly from VS Code, with full access to VPC resources and IAM roles. This is getting better the more I read it.
This solves the long standing problem of debugging cloud functions that interact with other production and AWS services.
The remote debugging feature supports Python, Node.js and Java runtimes at launch and automatically handles secure connection setup, breakpoint management, and cleanup after debugging sessions to prevent production impact.
Both features are available at no additional cost beyond standard Lambda execution charges, making it more cost effective for developers to troubleshoot issues in actual cloud environments rather than maintaining complex local emulation setups. Oh my God. You should see some of the nastiness I've set up over time.
This addresses a key serverless development pain point that I remember doing a focus group on, like, many years ago. So it's pretty awesome to see this, and no longer will you end up in the situation where your functions work locally but fail in production due to differences in permissions, network access and service integrations.
[00:55:38] Speaker C: I have bad news for Peter. It only supports Python, Node.js and Java. It does not support Ruby.
[00:55:45] Speaker A: Oh yeah, no one wants that except for Peter.
[00:55:51] Speaker C: I wonder if he still listens.
[00:55:53] Speaker A: No.
[00:55:57] Speaker C: I mean, this is one of those things where I've done some really hacky stuff over the years - like adding log lines printing out probably secrets into CloudWatch Logs, you know, to see where the hell I had problems. This is one of the few times where I'm like, full integration into VS Code? Yeah, sign me up. This is actually a good feature with direct integration, because it's almost the opposite: most things are like, deploy your app or function or everything with VS Code, and I'm like, please don't do deployments with VS Code. But now I'm like, oh, I can actually leverage it for the software dev capabilities of it. That is worthwhile. And it gives you that remote debugging - IAM-wise, you'd otherwise have to make users and deal with, you know, IAM stuff everywhere. This is pretty full-featured for day one. I think this is an amazing release.
[00:56:53] Speaker A: The fact that - yeah, the VPC access, that is awesome, because that's always been a huge pain point. You know, trying to re-emulate the environment in my dev environment by running, you know, code in a container that's supposed to be a facsimile of the Lambda environment, and doing all that. But then how do you plug it in? How do you get all these things actually talking to one another? And, you know, it's just so rare that I'm able to write something that can interact with only the Amazon services that are available, you know, on public APIs.
So pretty sweet of them to include that from day one for sure.
[00:57:39] Speaker C: Accelerate your software releases with blue/green deployments - but for ECS. Though I thought they already had this, and I meant to look it up beforehand. Oh no, I know what it is. ECS now includes built-in blue/green deployment at no additional charge, eliminating the need for teams to build custom deployment tooling and providing automatic rollback capabilities.
The feature introduces lifecycle hooks that integrate with Lambda functions, allowing teams to run validation tests at specific stages - like pre-scale-up, post-scale-up, and traffic-shifting phases - before committing to new versions. Blue/green deployments maintain both environments simultaneously during the deployment, enabling near-instantaneous rollback, because it keeps the old containers, without affecting end users. Except if your container broke your end users - but that's not going to be Amazon's fault.
Since production traffic only shifts after validation phases, the implementation requires configuring IAM roles, load balancers or Service Connect, and target groups through the ECS console. Each service revision maintains immutable configuration for consistent rollback. This addresses the challenge where development teams previously spent cycles building undifferentiated deployment tools, and instead lets them focus on business innovation.
And the feature I was thinking of was the built-in CodeDeploy integration, because I've done that many times, where I leveraged CodeDeploy with ECS to kind of handle it.
Yeah, there was one other ECS deployment option, and I think it was rolling deployments that was built in. There was like one built-in feature, and for blue/green you always had to go the other way.
[00:59:20] Speaker A: Yep, rolling deployments, that's exactly right, because that was always my go-to, and I never did blue/green because it was extra and I didn't want to do extra. So it was just like, well, some percentage of these things are going to fail, right? And until I hit a failure threshold, that's just going to be how it works. So this is nice, that you can actually do blue/green so much more easily, and you don't have to fail back the other way, where you're still basically serving failure responses to 10% of your customers or incoming requests, which is terrible. It's always awful watching that rollback.
So blue/green makes that much easier.
[01:00:04] Speaker C: Much better. Yeah. And before, you had to do CodeDeploy and then the load balancer would cut over, and you had to do the testing on it; it was complicated. You always had to do some weird hackiness to get the testing to work the way you wanted it to. So it's nice to set it up with Lambda too, so you can have your unit tests in Lambda, which you have Kiro or any other AI bot write for you now, that automatically fire against these things. It'll be a nice setup.
[01:00:34] Speaker A: Yeah, being able to validate environment-specific configurations right alongside your production network, without it breaking everything, is awesome. And I do love that Lambda integration, where you can have your smoke tests already built in. So pretty sweet.
All right. Amazon Braket adds new 54-qubit quantum processor from IQM. Amazon Braket announces a new superconducting quantum processor with square lattice topology, expanding the quantum computing options available to AWS customers alongside the existing trapped-ion and neutral-atom devices.
This is a sci-fi novel. The Emerald QPU features state-of-the-art gate fidelities and dynamic circuit support, enabling researchers to experiment with more complex quantum algorithms using familiar tools like the Braket SDK, NVIDIA CUDA-Q, Qiskit and PennyLane.
Hosted in Munich and accessible via the Europe (Stockholm) region, this addition strengthens AWS's quantum computing presence in Europe while providing on-demand access to latest-generation quantum hardware without requiring direct hardware investment.
Amazon Braket Hybrid Jobs offers priority access to Emerald for running fully managed quantum-classical algorithms, addressing the practical need of combining quantum and classical computing resources for real-world applications.
The AWS Cloud Credits for Research program supports accredited institutions experimenting with quantum computing, reducing the barrier to entry for academic research while maintaining standard Braket pricing for commercial users.
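If you want to poke at it, targeting a specific QPU through the Braket SDK is only a few lines. Here is a minimal Bell-pair sketch; the device ARN is an assumption (look up the real Emerald ARN in the Braket console):

```python
# Minimal sketch: run a Bell-pair circuit on the new IQM QPU via the Braket
# SDK. The device ARN below is an assumed placeholder; the real one follows
# the arn:aws:braket:<region>::device/qpu/iqm/... pattern.
from braket.aws import AwsDevice
from braket.circuits import Circuit

device = AwsDevice("arn:aws:braket:eu-north-1::device/qpu/iqm/Emerald")

bell = Circuit().h(0).cnot(0, 1)  # entangle qubits 0 and 1
task = device.run(bell, shots=1000)
print(task.result().measurement_counts)  # expect mostly '00' and '11'
```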
[01:02:26] Speaker C: All I can think of whenever I hear PennyLane is that Beatles song. I think it is, yeah.
[01:02:32] Speaker A: So yeah, yeah, I feel like we've moved on from like talking about quantum computing as this like thing that will happen someday in the future.
[01:02:43] Speaker C: Oh my God.
[01:02:43] Speaker A: We start talking about square lattice topology versus neutral-atom and trapped-ion. I'm like, oh, this is not made up anymore. Like, this has somehow gotten real.
And I've been sleeping on it.
[01:02:58] Speaker C: That's where I just go, hey British Jonathan, help me out here. I assume you understand this and I don't.
[01:03:04] Speaker A: Seriously? Yeah.
[01:03:05] Speaker C: Jonathan.
[01:03:06] Speaker A: Yeah, no, definitely. This would definitely be... yeah, I still don't get it, because I'm still stuck on the fact that I don't know how I would code anything for a quantum computer.
[01:03:16] Speaker C: You'll code it in Python. There'll be some emulation engine that converts it out. Call it a day.
[01:03:22] Speaker A: import quantum_computing. Yeah, yep. Definitely. That's...
[01:03:29] Speaker C: That's going to be me. Onto the world of GCP. I think this is going to be a long show, Ryan. I think we should have culled a few more stories.
[01:03:37] Speaker A: We did good. We did cut a few, and GCP kept it light for us.
[01:03:40] Speaker C: Thank God. I don't know about Azure. I think Azure actually showed up, so we'll see.
[01:03:45] Speaker A: Oh no. They did.
[01:03:47] Speaker C: Yeah.
New monitoring library for optimizing Google TPU resources. Google has released a new monitoring library for Cloud TPUs that exposes real-time metrics like TensorCore utilization, HBM utilization and buffer transfer latency, sampled at 1 Hz, enabling developers to dynamically optimize their AI workloads directly in code.
The library integrates with JAX and PyTorch installations through libtpu and allows programmatic adjustments. This addresses key gaps in TPU observability compared to AWS CloudWatch for EC2 GPU instances and Azure GPU monitoring. Completely different things; I think our AI went off the deep end. It gives Google customers similar visibility into performance on the TPU architecture.
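From the announcement, in-process usage is meant to look roughly like the sketch below. The module path and call names (libtpu.sdk.tpumonitoring, list_supported_metrics, get_metric), the metric name and the .data() accessor are all my reading of the release; treat every one of them as assumptions to verify against Google's docs:

```python
# Hedged sketch of sampling TPU metrics in-process with the new library.
# Module path, function names, metric name and the .data() accessor are
# assumptions based on the announcement, not a verified API.
import time
from libtpu.sdk import tpumonitoring

print(tpumonitoring.list_supported_metrics())  # discover available metric names

for _ in range(10):
    util = tpumonitoring.get_metric("duty_cycle_pct")  # assumed: TensorCore duty cycle
    print(util.data())
    time.sleep(1)  # the library samples at roughly 1 Hz
```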
I mean, getting this data is going to be important, because as we run out of resources, which we already have, and as we build more stuff, getting this level of detail and being able to decide what you run and where will hopefully create less resource contention and less power utilization, and let people spend less money, because they'll optimize their code through all of this.
So I mean, you're going to have to be fairly deep into it, but I assume for the people building the foundational models this stuff's going to be pretty critical.
[01:05:20] Speaker A: Yeah, you know, it seems kind of late coming that they're providing that sort of operational insight into TPU utilization, right? I guess they've been able to do this with standard APM tracing.
[01:05:36] Speaker C: But I assume it was there, and the key customers were kind of getting it, and now they're just GAing it. That's kind of the way I assume it went, maybe.
[01:05:45] Speaker A: I mean, it is an SDK that they're releasing in addition to the existing services, right? It's not a service by itself, but it is a neat little easy button, like any library, for "instrument my code and make it visible." So I do like that.
[01:06:04] Speaker C: I mean observability is important.
[01:06:08] Speaker A: Get to know Cloud Observability: application monitoring.
[01:06:12] Speaker C: I segued. You could have taken the segue there a little bit.
[01:06:15] Speaker A: Sorry, I wasn't paying attention, I was in my head, yeah. Google is introducing application monitoring, maintaining their terrible naming tradition of names that are impossible to search for.
It's an out of the box observability solution that automatically generates dashboards for applications defined in App Hub. This eliminates hours of manual dashboard configuration and provides immediate visibility into the four golden signals, traffic latency, error rates and saturation.
The service automatically propagates application labels across logs, metrics and traces in Google Cloud, enabling consistent filtering and correlation across all telemetry data without a bunch of manual tagging effort.
Integration with Gemini Cloud Assist Investigations provides AI powered troubleshooting that understands application boundaries and relationships, offering contextual analysis based on the automatically collected application data.
This positions Google Cloud competitively against AWS CloudWatch Application Insights and Azure Application Insights by reducing the upfront investment typically required for application monitoring setup, while also forcing Google's SRE best practices down everyone's throat like they usually do.
Which is fine because they're great, but I like to make fun of it.
Organizations can start using application monitoring immediately by defining their applications in App Hub, navigating to Cloud Observability and enabling Gemini features.
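Under the hood these are ordinary Cloud Monitoring time series, just pre-labeled by application. Here is a hedged sketch of pulling one golden signal yourself with the Python client, where the project ID, the metric type and especially the App Hub label key are assumptions:

```python
# Hedged sketch: query request latencies filtered by an application label
# via the Cloud Monitoring API. Project, metric type and the label key are
# hypothetical; the new Application Monitoring dashboards do this for you.
import time
from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
now = int(time.time())
interval = monitoring_v3.TimeInterval(
    start_time={"seconds": now - 3600},  # last hour
    end_time={"seconds": now},
)

series = client.list_time_series(
    name="projects/my-project",  # hypothetical project ID
    filter=(
        'metric.type = "run.googleapis.com/request_latencies" '
        'AND metadata.user_labels."apphub-application" = "checkout"'  # assumed key
    ),
    interval=interval,
    view=monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
)
for ts in series:
    print(ts.resource.labels, len(ts.points))
```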
[01:07:47] Speaker C: So, okay, explain it to me a little bit, the GCP part. Is it like, hey, we build templates and then we shove them onto every application that you link it to?
[01:07:57] Speaker A: So I'm not familiar with this Application Hub they're talking about, but I assume that it's some sort of application catalog where you're defining your components and then once that's in place and all that definition is there, you can I guess hit a button and get all your dashboards through this new service.
But that's new to me, the App Hub part.
[01:08:25] Speaker C: Even Gemini's AI overview on Google is not useful. App Hub is Google Cloud's product designed to help users manage and understand applications on Google Cloud infrastructure. It allows users and organizations to categorize resources, services and workloads into applications, providing a centralized view of an application's components and their relationships.
I don't know what that means.
[01:08:50] Speaker A: I mean, it's exactly what I said, which is it groups stuff. It's an application design center where, instead of just launching a whole bunch of servers, you're defining the service and the underlying infrastructure supporting it. And through that you're labeling each of these things. Like, if you've got a service component spread across many different Cloud Run functions, you're labeling those as a group, not relying on your naming convention or tagging on the functions themselves.
[01:09:21] Speaker C: Got it.
[01:09:22] Speaker A: And so apparently it looks like it has the ability to do a little bit of auto-discovery of resources.
So once you define your application setup, you can sort of let it go, which is pretty cool.
And it's free of charge.
Nice.
Just have to run your app, which.
[01:09:46] Speaker C: Hopefully you're already doing.
[01:09:48] Speaker A: Yeah.
[01:09:49] Speaker C: In other ways that Google made life better this week: DeepSeek R1 is now available in Vertex AI Model Garden. Google adds DeepSeek R1 to Model Garden as a managed service, eliminating the need to provision the eight H200 GPUs typically required to run the large language model yourself.
The model-as-a-service provides enterprise-grade security and compliance while offering both RESTful API and OpenAI Python client integration, positioning GCP alongside AWS and Azure in the managed LLM space. DeepSeek joins the Llama 4 models in Vertex AI, expanding the open models catalog and giving customers more flexibility to choose what they want. The service operates without outbound access for data security, so China will not be taking your data, and it is suitable for enterprises with strict compliance requirements that still want to leverage AI without compromising security and GDPR concerns.
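Since the managed endpoint speaks the OpenAI protocol, calling it can be sketched like this; the base_url pattern, API version and model ID are assumptions to check against the Model Garden card for DeepSeek R1:

```python
# Hedged sketch: call DeepSeek R1 as a managed model on Vertex AI through
# the OpenAI-compatible endpoint. The URL pattern, API version (v1beta1)
# and model ID are assumptions; verify them in the Model Garden docs.
import google.auth
import google.auth.transport.requests
from openai import OpenAI

creds, project = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
creds.refresh(google.auth.transport.requests.Request())

location = "us-central1"  # hypothetical region
client = OpenAI(
    base_url=(
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project}/locations/{location}/endpoints/openapi"
    ),
    api_key=creds.token,  # short-lived OAuth token stands in for an API key
)

resp = client.chat.completions.create(
    model="deepseek-ai/deepseek-r1",  # assumed model ID
    messages=[{"role": "user", "content": "Explain blue/green deployments."}],
)
print(resp.choices[0].message.content)
```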
[01:10:56] Speaker A: Yeah, I mean, this is really the power of using those public models in something like Model Garden. Instead of running a server, installing the models, getting it all in place and hooking it all together, you can now just provision this within your Vertex AI environment and have a web endpoint that you can send prompts to, and that makes it much, much easier to do. And because it's DeepSeek, they're like, yeah, this can't talk to anything. Because that's always funny; everyone's always concerned that China's going to steal all our data.
[01:11:37] Speaker C: They already have it, don't worry. They're at a lower level; they're in firmware and stuff like that. So, you know, don't worry about it.
[01:11:45] Speaker A: Exactly.
They're in my phone.
[01:11:48] Speaker C: No, that's Google.
[01:11:51] Speaker A: Yeah.
So pretty cool.
I haven't used DeepSeek. I haven't really needed to go model shopping, so I haven't experimented too much.
[01:12:00] Speaker C: So I played with it when it first came out because it was the first kind of reasoning model that really walked you through it and I always found that interesting.
So it's just nice to be able to choose it now. I mean, I pretty much live on Claude, I feel like. And if I'm playing with stuff, or I run out of credits, or I want to see how things compare, because I have tried running things through multiple models to see what they come up with, it's more just for edification. Slash, hey, I've got time in this meeting, I'm not paying attention.
[01:12:28] Speaker A: Yeah, you can see Google phoned it in. Just those two, and one's just "look, we have a new model." So we are moving on to Azure.
[01:12:36] Speaker C: No phone in joke or like that's all they do. Come on. Yeah, too easy.
[01:12:42] Speaker A: Too easy. I guess we have a boatload of Azure stuff. You were supposed to cull these down to the interesting ones, and I don't know...
[01:12:50] Speaker C: I deal with this so I always find them interesting.
[01:12:53] Speaker A: We made that mistake.
[01:12:54] Speaker C: Yeah, that's what you get for having an Azure host. I think maybe we'll cut one of these ones. We're going to merge a few as we go.
[01:13:00] Speaker A: Yeah, we are going to merge this one. I mean it's sort of.
[01:13:02] Speaker C: There's two we're merging together.
[01:13:04] Speaker A: So it won't. It won't be too bad. Yeah.
So, unified by design: you can mirror Azure Databricks Unity Catalog to Microsoft OneLake in Fabric.
Everything's in Fabric.
Microsoft Fabric now offers general availability of mirroring for Azure Databricks Unity Catalog, enabling direct access to Databricks tables in OneLake without data duplication or ETL pipelines.
This integration allows organizations to query Databricks data through Fabric workloads and Power BI Direct Lake mode, meaning just a single copy of the data. The feature addresses the key enterprise challenge of bringing the Azure Databricks and Fabric ecosystems together.
And there are technical improvements in the GA release that include support for ADLS with firewalls enabled, public APIs for CI/CD automation, and full integration with the OneLake security framework for enterprise-grade access controls.
They're also announcing a Cosmos DB integration with Microsoft Fabric.
So they're bringing data in Cosmos DB natively into Fabric, combining NoSQL database capabilities with Fabric's analytical platform to create a unified data environment for both operational and analytical workloads without maintaining separate services.
The service automatically mirrors the operational data to OneLake in Delta format for real-time analytics, enabling T-SQL queries, Spark notebooks and Power BI reporting on the same data without ETL pipelines or replication.
New vector and full-text search capabilities support AI workloads with multiple indexing options, including Microsoft's DiskANN for large-scale scenarios, positioning this as a direct competitor to AWS's DocumentDB vector search and GCP's AlloyDB.
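Once mirrored, the tables are just Delta in OneLake, so in a Fabric Spark notebook you would query them like any lakehouse table. A sketch with hypothetical names (note that spark is the session Fabric notebooks pre-create):

```python
# Sketch: query a mirrored table from a Fabric Spark notebook (PySpark).
# Database and table names are hypothetical; 'spark' is the SparkSession
# that Fabric notebooks provide automatically.
df = spark.sql("""
    SELECT customer_id, COUNT(*) AS order_count
    FROM mirrored_databricks.orders
    GROUP BY customer_id
    ORDER BY order_count DESC
    LIMIT 10
""")
df.show()
```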
[01:14:57] Speaker C: I mean, they just continue to shove everything into Fabric, which is good, because that's their data platform. So if you have your data in Cosmos DB or anywhere, now, trying to think of other places it doesn't cover, you can pull it directly in, versus some sort of ETL pipeline into a data lake, that headache that we've all sworn at multiple times in our careers and swore never to do again, and surprise, it's back. But if they have all these integrations, it kind of nicely flows in, and you still get to figure out what a Fabric billable unit is in your spare time.
[01:15:31] Speaker A: Yeah. And you know, how many times have you seen something where it makes sense when you're putting it together across one data set, and then you realize you have a whole wealth of data in this other data set, whether it's in Databricks or Cosmos DB or some other integration? It just hurts my soul when I have to copy that data and maintain two copies of it, just so that I can get the analysis done in the place I need it to be done. So I do like that they're offering these integrations deep down in the Fabric layer. Pretty cool.
[01:16:07] Speaker C: Public preview: CLI for migrating from availability sets and basic load balancers on AKS. As you all know, Amazon, sorry, Azure, likes to deprecate things, and there's a couple things on the chopping block: basic IP addresses and basic load balancers.
So Azure introduced CLI commands to migrate your AKS clusters from the deprecated availability sets and basic load balancers to virtual machine scale sets before the deadline of September 30, 2025.
Love how it's a public preview with T-minus three months, two months. Just want to call this out: if you're dealing with this, you really should get your butt into gear. The automation addresses critical gaps in basic load balancers, namely their lack of features like availability zones and SLAs; so you may be running Kubernetes with a production-grade workload without an SLA. Meanwhile, availability sets are being replaced by the far more resilient VM scale sets (VMSS). This positions AKS as relatively comparable with EKS and GKE, which already provide these. Organizations running AKS should prioritize this and test it immediately, because you have under three months. The deprecation announcement gave customers nearly two years to plan and execute the migration, but early adoption is key. So there's a bunch of upcoming deprecations (or maybe it's just my day job, where I keep getting the emails because our cloud is that old), and it is extremely nice that Azure is attempting to help you migrate away from some of these things and handle the migrations for you. But definitely test these in your lower-level environments, because I've never seen an upgrade deployment tool break anything, ever, in my career.
Dripping with sarcasm.
[01:18:02] Speaker A: Yeah, I mean, this is clearly a large architectural change to the Azure infrastructure, and they're deprecating these so that they don't have to carry the cost of maintaining two of them.
But hopefully they're passing on that savings by keeping their service costs reasonable, because it's painful for customers.
[01:18:28] Speaker C: Yeah, most of it's not a large increase that I've seen from doing my research. And a lot of it's feature sets, kind of like launch templates versus launch configurations; the new thing just has a lot more in it.
[01:18:42] Speaker A: All right.
Microsoft announces Azure Cloud HSM.
Azure Cloud HSM delivers FIPS 140-3 Level 3 certified hardware security modules as a single-tenant service, giving customers full administrative control over their cryptographic operations and key management infrastructure.
The single-tenant architecture ensures complete isolation of cryptographic operations, making it suitable for workloads requiring the highest levels of security assurance and regulatory compliance.
Key use cases include protecting certificate authorities, database encryption keys and code-signing certificates, and meeting specific regulatory mandates that require hardware-based key storage.
While pricing details are not available at this time, organizations should expect premium costs typical of a dedicated HSM service.
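Dedicated HSMs typically expose standard interfaces like PKCS#11, so client code tends to look something like this sketch using the python-pkcs11 library. The shared-library path, token label and PIN are hypothetical, and you would confirm which interfaces Azure's service actually ships:

```python
# Hedged sketch of talking to an HSM over PKCS#11 with the python-pkcs11
# library. Module path, token label and PIN are placeholder assumptions;
# check what interface Azure Cloud HSM actually exposes.
import pkcs11

lib = pkcs11.lib("/opt/cloudhsm/lib/libpkcs11.so")  # hypothetical module path
token = lib.get_token(token_label="my-cloud-hsm")   # hypothetical label

with token.open(user_pin="0000", rw=True) as session:
    # Generate an AES key that never leaves the HSM.
    key = session.generate_key(pkcs11.KeyType.AES, 256, label="db-encryption-key")
    iv = session.generate_random(128)               # 128 random bits for the IV
    ciphertext = key.encrypt(b"super secret", mechanism_param=iv)
    print(len(ciphertext))
```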
[01:19:39] Speaker C: Cha Ching.
[01:19:40] Speaker A: Yeah, as well it should be, right? The reality is that they're running a server with a whole bunch of protections and isolating it just for you. You're renting the whole device, right, for these things.
And these HSM devices are hilarious if you see them constructed in real life. They have the ink packs, as well as a whole bunch of auto-disabling if you're monkeying around with the hardware, and it's just sitting there in a server rack somewhere in your data center. These things are expensive, and that level of isolation, that level of management, where there's no ability for any cloud hypervisor to scale it across multiple clients, is expensive.
[01:20:28] Speaker C: I mean, I remember when AWS had it and you had to give them... it took them two weeks to launch it, you know, and two weeks after that, you know.
And that was their V1 of the HSM. So I get where they're going, and we're kind of coming a little bit full circle with some of these things.
[01:20:46] Speaker A: Yeah.
[01:20:46] Speaker C: But it really does give you that level of isolation. So if this is a requirement, and you're looking at Fortune 50 companies, credit card companies, financial services, healthcare, maybe money's not an object here; it just is what it is.
Hence the premium costs.
Hosted On Behalf Of, or HOBO, public IPs model for ExpressRoute gateways is now generally available.
The only reason the story is in here is because I wanted to say HOBO a few times.
I was trying to find a good show title with it, like "the hobos run the show" or something like that. Azure's new HOBO model for ExpressRoute gateways eliminates the need to manually assign public IP addresses, with Microsoft now managing this infrastructure for you. This reduces configuration complexity and potential mismatches in configurations between on-prem and Azure via ExpressRoute, particularly benefiting organizations with limited networking experience.
If you're setting up ExpressRoute, you probably have some networking experience, just saying. The HOBO model aligns Azure more closely with AWS's Direct Connect gateway approach, where public IPs are abstracted away, though Azure still requires customers to manage more networking components than the overall AWS implementation. This is a dramatic improvement and will hopefully make your engineering and corporate IT teams' lives much easier, as they attempt to avoid a mixed mode where some gateways are set up one way and some the other, because then your corporate IT will hate you.
[01:22:22] Speaker A: Yeah, it's funny, because this announcement made me think about AWS Direct Connect completely differently than I had before. Now that you can have Direct Connect gateways, it does sort of abstract the public communication, which I never really thought about. I thought of it much more as just a dedicated peering link in the data center. But the reality is that it's multiples of those, and you hook them all up to a gateway, and then you just don't care.
So kind of neat. I've always appreciated the service in Amazon, and I'm glad to see it's coming to Azure as well.
[01:22:59] Speaker C: Yeah, I think simplifying it makes life better.
[01:23:02] Speaker A: All right, now in public preview: orchestration versioning for durable functions and durable task SDKs. Amazon introduced... or Azure; we both did it. It's because it starts with an A. This addresses a critical challenge where modifying orchestration logic could break existing in-flight workflows. It allows developers to safely update their orchestration code without disrupting running instances.
This feature enables side-by-side deployment of multiple orchestration versions, letting new instances use updated logic while existing instances complete with their original code, similar to how AWS Step Functions versioning works, but with tighter integration into Azure's serverless ecosystem.
They've also released a durable functions PowerShell SDK as a standalone module on the PowerShell Gallery, making it much easier for developers to build stateful serverless applications using PowerShell without bundling it with the Azure Functions runtime, which is nice of them. The GA release provides PowerShell developers with native support for orchestration patterns like function chaining, fan-out/fan-in, and human interaction workflows, bringing PowerShell to parity with C# and JavaScript for durable functions development.
The standalone module simplifies dependency management and version control, allowing teams to update the SDK independently of their Azure Functions runtime version and reducing potential compatibility issues.
While AWS Step Functions and GCP Workflows offer similar orchestration capabilities, Azure's approach uniquely integrates with PowerShell's automation heritage, targeting IT operations teams who already use PowerShell for infrastructure management.
Organizations can now build complex workflows that combine traditional PowerShell automation scripts with serverless orchestration, enabling scenarios like multi-step deployment pipelines or approval workflows without managing state infrastructure.
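The announcement's new bits are PowerShell-side, but for a feel of what a durable orchestration is, here is a minimal function-chaining sketch using the Python durable functions SDK (v2 programming model); the activity names and logic are hypothetical:

```python
# Minimal function-chaining sketch with the Python durable functions SDK
# (azure-functions + azure-functions-durable, v2 programming model). The
# activities are hypothetical; the announcement's versioning and standalone
# SDK are PowerShell-focused.
import azure.functions as func
import azure.durable_functions as df

app = df.DFApp(http_auth_level=func.AuthLevel.FUNCTION)

@app.orchestration_trigger(context_name="context")
def orchestrator(context: df.DurableOrchestrationContext):
    # Each yield checkpoints state, so in-flight instances survive restarts;
    # the new versioning feature pins them to the code they started with.
    validated = yield context.call_activity("validate_input", context.get_input())
    deployed = yield context.call_activity("deploy_stage", validated)
    return deployed

@app.activity_trigger(input_name="payload")
def validate_input(payload: dict) -> dict:
    payload["validated"] = True
    return payload

@app.activity_trigger(input_name="payload")
def deploy_stage(payload: dict) -> dict:
    payload["deployed"] = True
    return payload
```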
[01:25:02] Speaker C: I mean, any of these improvements are just good. Durable functions is designed for that consistency, and potentially breaking things in flight kind of wasn't a good look for them. So having a bit more robustness with the versioning, and making sure you're able to control that a lot better, is just beneficial. It's one of those general quality-of-life improvements, a sharp edge that they hopefully made a little bit duller.
[01:25:37] Speaker A: Yeah, I mean, I remember running into this with Step Functions, and it was sort of a bad realization, where I realized I had to put in some sort of circuit breaker, because it was just publishing events from a queue in my case, right? And it's like, well, how do I stop it? I'm going to change the payload of the events, it's going to break, oh no. So it's great to see this as part of the early launch, because durable functions
has not been around for all that long.
[01:26:09] Speaker C: No.
[01:26:09] Speaker A: Very cool.
[01:26:11] Speaker C: Azure WAF for Application Gateway for Containers is now in public preview. Azure brings WAF capabilities to Application Gateway for Containers, extending layer 7 security to Kubernetes workloads with protection against common exploits, like a WAF does. This positions AWS and Azure kind of on the same playing field. The preview enables organizations to implement security policies across container applications without separate WAF instances, reducing overhead and complexity. Target customers include enterprise customers migrating to Kubernetes who need enterprise-grade security, which may or may not be the Azure WAF, without sacrificing agility in container deployment.
Pricing is not provided, but it's expected to be billed the same way as the Application Gateway WAF tiers. And if you do have Azure DDoS Protection, it is free.
So we'll see if this theory stays the same across the board.
[01:27:12] Speaker A: Yeah, I'm sort of surprised to see this; I feel like it's late. If you're publishing an endpoint for a Kubernetes service, which is common, a website or whatever, not being able to hook it to the WAF service is a huge risk. After managing WAFs for as long as I have, you see just all of the nonsense scraping and attempts against everything on the web. Having anything exposed publicly, you have to have a WAF in place for it, because otherwise you're running a huge risk of someone just waiting to figure out how to get in. And there are so many bots and so many automated things out there running against everything that's publicly available.
[01:28:04] Speaker C: Yeah, it's amazing what WAFs find and what they're able to block. As soon as you launch an EC2 instance with port 80 open, just see how fast that thing starts getting hit; you're like, I haven't even launched my code yet. In the first 15 seconds of the server booting up, it's already open. So having a WAF in front of everything at the app gateway level, or even further out at the CDN, depending on how you've architected your infrastructure, is just key.
[01:28:34] Speaker A: Yeah.
[01:28:36] Speaker C: That was a long show, Ryan, for two people. Sweet Jeebus. We're totally skipping Cloud Journey. Sorry, everyone.
[01:28:42] Speaker A: Yeah, we will.
The extra programming we'll pick up next week.
[01:28:49] Speaker C: We're going for like the C level here. Like, we achieved it. We showed up, we talked, somebody listened to us. I call it a day.
[01:28:56] Speaker A: Yeah. And hopefully in editing we can bring this down to a reasonable length. Because there was a lot, and we don't know this many things. Why are we conferring?
[01:29:08] Speaker C: Because you and I are like squirrels with acorns all over the yard.
[01:29:12] Speaker A: Yeah. And we're both sleep deprived, so we're just rambling incoherent nonsense on the Internet.
[01:29:18] Speaker C: But it was the AWS Summit; they released a bunch of stuff. Azure finally woke up, came back from summer break, and they'll be gone again for a few weeks.
So we'll kind of see where it ends up in the future.
[01:29:32] Speaker A: All right, well, that wraps up this.
[01:29:33] Speaker C: Week in the Cloud.
Bye, everyone.
[01:29:36] Speaker A: Bye, everybody.
[01:29:40] Speaker B: And that's all for this week in Cloud. We'd like to thank our sponsor, Archera. Be sure to click the link in our show notes to learn more about their services.
While you're at it, head over to our website, where you can subscribe to our newsletter, join our Slack community, send us your feedback and ask any questions you might have. Thanks for listening, and we'll catch you on the next episode.