282: Search - ChatGPT vs Google…. Fight!

Episode 282 November 14, 2024 01:04:10

Show Notes

Welcome to episode 282 of The Cloud Pod, where the forecast is always cloudy! This week Justin, Ryan, and Matthew are happy to be joining you in the clouds versus watching election information. This week we’re talking nuclear energy, AI Search tools, and all things Pre:Invent. Welcome, and thanks for joining us! 

Titles we almost went with this week:

A big thanks to this week’s sponsor:

We’re sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You’ve come to the right place! Send us an email or hit us up on our slack channel for more info. 

Follow Up

01:13 Energy regulators scrutinizing data center use reject Amazon bid 

3:21 Justin – “It’s sort of sad because I kind of like the idea of nuclear power to solve a bunch of problems, but it has to be done in the right way for sure.”

General News  

04:12  IT’S EARNINGS TIME!  

04:22 IBM revenue misses, but execs say AI will drive future growth 

05:32 Ryan – “…it seems like that’s a pretty good play to beef up, you know, your consultant side of the business to implement that. Because a lot of businesses are going to need to do that. And a lot of them don’t have the in house skills to do it.”

05:50 Alphabet stock soars as earnings crush estimates on strong cloud growth 

07:09 Matthew – “I mean, I was talking recently with some people and they were saying how a lot more of the really small companies are leveraging Google just because their developer experience inside the platform is much better than the other ones. It’s interesting to kind of see if that’s it, but it’s a ton of small companies to keep up with.”

07:40 Amazon stock jumps 6%; Q3 revenue up 11% to $158.9B; profits hit $15.3B; AWS sales up 19%

09:17 Justin – “…it’s a crazy amount of people, by the way. I can’t even fathom having 1.5 million employees. Like, what do they all do?”

09:39 Microsoft dips on weak guidance after beating on earnings 

10:47 Justin – “No, they were, they were applauded for doing well and beating expectations, but they were beaten because they predicted slower growth for this quarter and the next quarter. So it was more, I don’t think they lowered their guidance, but I think they basically said to expect it to be on the lower side of the range that they gave, which made investors unhappy.”
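
The beats and misses in this earnings rundown all reduce to one small calculation. Here is a rough sketch in Python that uses only the revenue figures read out in the episode (in billions of dollars); the function name `surprise_pct` is ours, purely for illustration.

```python
def surprise_pct(actual, expected):
    """Percent by which a reported figure beat (+) or missed (-) the estimate."""
    return (actual - expected) / expected * 100


# Revenue in billions of USD, as quoted in the episode: (reported, expected).
quarter = {
    "IBM": (14.97, 15.08),
    "Alphabet": (88.27, 86.44),
    "Microsoft": (65.59, 64.51),
}

for company, (actual, expected) in quarter.items():
    print(f"{company}: {surprise_pct(actual, expected):+.2f}% vs. estimates")
```

On those numbers IBM's miss is under one percent and Alphabet's beat is about two percent, a reminder of how small the gaps behind "crush" and "miss" headlines can be.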

AI is Going Great – Or How ML Makes All Its Money 

11:38 Introducing ChatGPT search 

12:58 Matthew – “I just like how their first real solution was, hey, let’s do a Chrome plugin, which is owned by Google. You’re just trying a weird next step.”

AWS

15:16 Amazon Virtual Private Cloud launches new security group sharing features  

16:02 Matthew – “They had something that I definitely used in the past, which was a Lambda that watched the Amazon SNS topic for the public IP addresses so you could block it. In theory, you could do the same thing. Well, especially because you were over the default 50 group limit, 50 rule limit. So every time you wanted to use it, you always had to request the limit upgrade.”
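
For anyone who wants to try the new security group VPC associations from code, here is a minimal sketch. It assumes the boto3 EC2 client exposes the feature as `associate_security_group_vpc` with `GroupId` and `VpcId` parameters, so check the current boto3 documentation before relying on it. The helper name and the IDs in the usage comment are made up.

```python
def share_group_with_vpcs(ec2_client, group_id, vpc_ids):
    """Associate one security group with several VPCs in the same account.

    Assumption: ec2_client behaves like a boto3 EC2 client whose
    associate_security_group_vpc call returns a dict with a "State" key.
    """
    states = {}
    for vpc_id in vpc_ids:
        response = ec2_client.associate_security_group_vpc(
            GroupId=group_id, VpcId=vpc_id
        )
        # Record whatever association state the service reports for this VPC.
        states[vpc_id] = response.get("State")
    return states


# Hypothetical usage (needs AWS credentials; the IDs are placeholders):
#   import boto3
#   ec2 = boto3.client("ec2")
#   share_group_with_vpcs(ec2, "sg-0123456789abcdef0", ["vpc-0aaa1111bbbb2222c"])
```

Passing the client in, instead of creating it inside the helper, keeps the function easy to exercise with a stub before pointing it at a real account.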

18:08 AWS enhances the Lambda application building experience with VS Code IDE and AWS Toolkit

18:40 Ryan – “My first thought when reading this is I’m curious on how this will like sort of fit in with my AWS SAM workflows, which does give you a CI/CD workflow because it’s publishing directly with CloudFormation. So it is sort of an interesting thing. I’m hoping that you could kind of seamlessly merge those experiences because it would be kind of nice if they made that easier.”

19:38 AWS Lambda now supports AWS Fault Injection Service (FIS) actions
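
The startup-delay experiment discussed for this launch is easy to picture with a local stand-in. The sketch below is not the FIS API. It is a plain Python decorator, with names of our own invention, that delays a configurable fraction of handler invocations so you can see how a 1% slow start behaves in a test.

```python
import random
import time
from functools import wraps


def inject_startup_delay(fraction=0.01, delay_seconds=1.0,
                         rng=random.random, sleep=time.sleep):
    """Delay roughly `fraction` of invocations by `delay_seconds`.

    rng and sleep are injectable so tests can make the behavior deterministic.
    """
    def decorator(handler):
        @wraps(handler)
        def wrapper(event, context=None):
            # Draw a number in [0, 1); only the unlucky fraction gets delayed.
            if rng() < fraction:
                sleep(delay_seconds)
            return handler(event, context)
        return wrapper
    return decorator


@inject_startup_delay(fraction=0.01, delay_seconds=1.0)
def handler(event, context=None):
    # Stand-in for a real Lambda handler.
    return {"statusCode": 200, "body": "ok"}
```

Because the delay hits only a small slice of calls, it tends to show up in tail latency rather than averages, which is why this kind of fault is useful to rehearse.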

12:32 AWS now accepts partial card payments

22:12 AWS announces Amazon Redshift integration with Amazon Bedrock for generative AI

Announcing general availability of auto-copy for Amazon Redshift

Amazon Redshift now supports incremental refresh on Materialized Views (MVs) for data lake tables

Announcing Amazon Redshift Serverless with AI-driven scaling and Optimization

AWS announces CSV result format support for Amazon Redshift Data API

22:51 Ryan – “I just keep thinking about the Redshift product team. Like, they must be just devastated because clearly these were made for mainstage announcements. It’s even got generative AI. They did all the things and they still didn’t make it.”

23:31 Amazon CloudWatch now monitors EBS volumes exceeding provisioned Performance

New Amazon CloudWatch metrics for monitoring I/O latency of Amazon EBS Volumes

Amazon ElastiCache for Valkey adds new CloudWatch metrics to monitor server-side response time
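
To pull the new EBS latency numbers yourself, a query might be built like the sketch below. The parameter names match the CloudWatch `get_metric_statistics` API as exposed by boto3. The metric name `VolumeAvgReadLatency` is our assumption based on how the metric was described in the episode ("volume average read latency"), so confirm the exact spelling in the EBS metrics documentation. The volume ID in the usage comment is a placeholder.

```python
from datetime import datetime, timedelta, timezone


def ebs_latency_query(volume_id, metric_name="VolumeAvgReadLatency",
                      minutes=60, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics.

    Assumption: metric_name matches the new per-volume latency metric name.
    """
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EBS",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "VolumeId", "Value": volume_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,  # one datapoint per minute
        "Statistics": ["Average"],
    }


# Hypothetical usage (needs AWS credentials):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   stats = cloudwatch.get_metric_statistics(**ebs_latency_query("vol-0123456789abcdef0"))
```

Swapping `metric_name` for the write-latency metric gives the other half of the picture for a volume that feels slow.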

26:40 Unlock the potential of your supply chain data and gain actionable insights with AWS Supply Chain Analytics

27:38 Amazon Aurora PostgreSQL Limitless Database is now generally available

28:44 Justin – “So one of the things that will mess people up a little bit is that the way you size this is minimum and maximum capacity measured by Aurora capacity units, which, you know, are magic numbers that they created that sort of represent CPUs and things. And so you can set up your 16 ACUs as your minimum, and then you can go up to as many as 6,144 ACUs as the maximum, which, that seems like a lot of shards.”

29:48 Amazon SES adds inline template support to send email APIs

31:49 AWS announces UDP support for AWS PrivateLink and dual-stack Network Load Balancers

33:09 AWS AppSync launches new serverless WebSocket APIs to power real-time web and mobile experiences at any scale

33:19 Justin – “If I knew what AppSync did and I knew what my use case would be for this, I’d probably be really excited about it, but I don’t really know either, so that’s all I’m gonna say about it.”

35:13 Amazon Route 53 announces HTTPS, SSHFP, SVCB, and TLSA DNS resource record support 

36:12 Ryan – “Well, if all problems are DNS, you should just add more complexity, right?”

40:00 How Executives Can Avoid Being Disrupted by Emerging Technologies 

GCP

42:03 Mandatory MFA is coming to Google Cloud. Here’s what you need to know

42:26 Ryan – “I am a little nervous about that phase three just because there’s always differences when you do MFA through Federation as I’ve learned through AWS integrations. And so it’s like, I hope that goes smoothly.”

43:59 Powerful infrastructure innovations for your AI-first future

46:01 Justin – “…we want you to know it’s coming because our competitors are going to be offering these, but we also are going to offer them. So we want you to know that, but we don’t know what they’re going to cost or anything about them because Nvidia hasn’t given us any details, but we want to announce first.”

48:45 C4A VMs now GA: Our first custom Arm-based Axion CPU 

50:10 Justin – “Yeah, that was a weird quote. For our customers that run on a different cloud than us, this works great. OK.”

51:13 Introducing an industry first: application awareness on Cloud Interconnect

52:29 Matthew – “I just wonder how much, how many people actually need this. Like for QOS, like I feel like I’ve really set it up on VoIP and like backups, offsite backups back in the day. Like that was about it…it just feels like the wrong way to manage it.”

55:01 Speed, scale and reliability: 25 years of Google data-center networking evolution    

Azure

1:00:03 No new Azure DevOps OAuth apps beginning February 2025  

1:00:16 Justin – “So run and provision those as quickly as possible so you have them if you’re working in the middle of a project before they go away and you have to redo all your work.”

1:01:29 Microsoft names Jay Parikh as a member of the senior leadership team 

24:29 Justin – “…all I can think of is Azure has been beaten up pretty bad on security. Charlie Bell’s been there about two years, hasn’t seemed to move the needle and I don’t know, but if I was a betting man, I’d say the former CEO of a security startup is probably going to maybe be in charge of something security wise.”

Closing

And that is the week in the cloud! Visit our website, the home of The Cloud Pod, where you can join our newsletter and Slack team, send feedback, or ask questions at theCloudPod.net, or tweet at us with the hashtag #theCloudPod.

Episode Transcript

[00:00:07] Speaker A: Welcome to The Cloud Pod, where the forecast is always cloudy. We talk weekly about all things AWS, GCP and Azure. [00:00:14] Speaker B: We are your hosts, Justin, Jonathan, Ryan and Matthew. [00:00:18] Speaker C: Episode 282, recorded for the week of November 5, 2024. Search: ChatGPT versus Google... and fight. Good evening, Ryan and Matt. How you doing? [00:00:30] Speaker B: Doing good, doing good, good. [00:00:31] Speaker D: How are you? [00:00:32] Speaker C: I'm just glad to be recording with you guys versus watching the election results. Personally, this is a much better use of my time than stressing over stuff I can't control. [00:00:41] Speaker B: Yeah, no, I'm hiding with my head in the sand. Trying not to look at any media whatsoever until later, versus getting all apprehensive about the results as they come in and change every five minutes. [00:00:56] Speaker D: And there's not much that you're going to see for the next few hours. So, you know, this is a good stalling tactic. [00:01:01] Speaker C: Yeah, yeah. I mean, we probably won't know today anyways. It'll probably be like Friday or next month or longer. [00:01:09] Speaker B: Yeah. [00:01:09] Speaker C: Or, you know, Supreme Court. It could be months. So we'll see. So we'll talk about more exciting news, I think, at this point in time. So first of all, we've been talking about nuclear power here quite a bit on the show, and we talked about AWS originally doing a deal with a company called Talen Energy where they were basically going to build their data center right next to a nuclear power plant so they could get that sweet, sweet nuclear power for cheap directly next to their data center. But unfortunately the Federal Energy Regulatory Commission rejected the proposal that would have allowed the data center to co-locate with the existing nuclear power plant. The commission voted it down 2 to 1. 
FERC Chairman Willie Phillips said the commission should encourage the development of data centers and semiconductor manufacturing as national security and economic development priorities. But Commissioners Mark Christie and Lindsay See, both Republicans, voted to reject the proposal, while David Rosner and Judy Chang, both Democrats, did not vote. And Willie Phillips is a Democrat as well. Talen Energy, who signed the agreement, drew challenges from neighboring utilities including AEP and Exelon, who challenged the novel arrangement, arguing it would unfairly shift the costs of running the broader grid to other consumers. And the FERC's order found the region's grid operator, PJM Interconnection, failed to show why the proposal was necessary and to prove such a deal would be limited to the Susquehanna plant. Given the widespread interest in placing data centers next to power plants, it could have a chilling effect on the region's economic development. And we'll see what Talen ends up doing here, and I guess we'll get to see what happens when the Microsoft and Constellation Three Mile Island restart project goes in front of the Nuclear Commission or the FERC. Maybe not a great start to those hopes. [00:02:46] Speaker B: It's kind of interesting that, you know, Amazon building a data center next to a power plant has to go through the Energy Regulatory Commission. You know, I think that was just, like, a strategy. But like, if Amazon built a data center, they would power it, presumably. And you know, like, I don't know if that would have to go through the same thing. Unless there was some sort of, you know, conditions or stuff that Amazon was looking for, like using the land around the power plant or something like that. [00:03:19] Speaker C: Yeah, I don't know. We will see. It's sort of sad because I kind of like the idea of nuclear power to solve a bunch of problems, but it has to be done in the right way. [00:03:31] Speaker B: For sure. Yeah. 
I mean, I want more information. Right. So this dying early I don't think is a good thing. [00:03:39] Speaker C: Yeah. Well, and like, how does that work? The commission, like, three people voted, but two people abstained. Like, okay. Seems like it didn't really get the fair shake that it needed in that particular vote. [00:03:50] Speaker B: Yeah. And I hope that that wasn't just, like, a failure to draw a position. Right. I hope it was like they weren't around or something like, I don't know. [00:04:00] Speaker C: The kids are sick, or... I hope it wasn't something like, we're up for reelection in two weeks and we don't want this to be used against us. Yeah. Which would be bad. [00:04:10] Speaker B: Yeah. Yes, it would. [00:04:12] Speaker C: All right, well, it's on to earnings. Favorite time of the quarter. I have a surprise guest in earnings today. IBM is here, having missed their earnings. You know, that's not surprising, not shocking. They only earned $2.30 a share, excluding non-recurring items, which was $0.08 better than the consensus estimate. Although revenue only rose 2% on a constant currency basis, which, I don't know, none of those words made any sense to me, but basically they made 14.97 billion and they were supposed to make 15.08, so. Yeah, whatever. But the reason why we're talking about it here is because of this quote that I picked out from Chief Financial Officer James Kavanaugh. We are very focused on ensuring we get an early lead position and establish IBM Consulting as a strategic partner of choice for generative AI. This is a long-term growth factor with a multiplier effect across our software, our platforms and our infrastructure. About 3/4 of the gen AI business is consulting and only 1/4 is software, says IBM. So they're coming for your AI dollars, baby. Wow. [00:05:22] Speaker B: I mean, I guess it makes sense. 
I don't know if I agree with the percentage of the amount of AI business in terms of the 1/3 versus 2/3, but there's a huge part of it for sure. So it seems like that's a pretty good play to beef up, you know, your consultant side of the business to implement that. Because a lot of businesses are going to need to do that. A lot of them don't have the in-house skills to do it. So, cool. [00:05:48] Speaker C: Yep, we will see. Well, Alphabet was up next, reporting earnings per share of $2.12 on revenue of 88.27 billion for the quarter ending September 30th. This represents a profit and sales increase from the same period last year of 37% and 15% respectively. So much better. Analysts had only expected $1.83 per share and 86.44 billion in revenue, which means they were 2 billion off because they're bad at their jobs. Advertising revenue topped expectations at 65.85 billion versus expectations of 65.5. Cloud revenue was 11.4 billion, up 35% from the same period last year, exceeding all expectations. Sundar Pichai said this business has real momentum and the overall opportunity is increasing as customers embrace generative AI. Google also says they're planning to spend 13 billion on capital expenditures to help grow that AI business as well as data center capacity, which, please, please get some more M3 capacity. [00:06:46] Speaker D: So like 15 GPUs, 20 GPUs. Yeah, I think so, yeah. [00:06:51] Speaker B: 13 billion. Yeah, yeah, 13 billion. [00:06:54] Speaker D: Yeah, that feels about right. Maybe, maybe 30. [00:06:58] Speaker B: It's fine. I'm shocked about how much the business is growing. You know, like, it's kind of crazy that it's still 35% year over year, or approximately 30% year over year. It's been like that for a while, I think. 
[00:07:12] Speaker D: I mean, I was talking recently with some people and they were saying how a lot more of the, you know, really small companies are leveraging Google just because the developer experience inside the platform is much better than the other ones. It's interesting to kind of see if that's it. But it's a ton of small companies to keep up with. [00:07:33] Speaker C: So next up was Amazon, who topped estimates for the third quarter, reporting $158.9 billion in revenue, up 11% year over year, and earnings per share of $1.43. Profits were 15.3 billion, up from 9.9 billion a year prior. AWS came in just below expectations at 27.4 billion in revenue, up 19%, with 10.4 billion of that being operating profit. So over two thirds of the operating profit for Amazon comes from AWS. Investors continue to keep a close eye on AI adoption at the cloud giant, and I thought it was interesting: despite layoffs and unfavorable RTO policies, they are currently at 1.55 million employees, up 3% year over year, and they are now hiring about 250,000 people for the holiday season. I'm curious to see after January if those numbers start going in a different direction. [00:08:24] Speaker B: It is kind of crazy, because you would think that... I mean, so I imagine for a lot of the retail stuff, the return to office for the vast majority of Amazon employees doesn't apply or matter, right? Because if you're a delivery driver or if you're working in a warehouse, you know, you're not working from home. So I bet you it's a small portion of the company compared, you know, if you have 1.5 million employees. It is kind of crazy. [00:08:55] Speaker D: And from people I know there, while it wasn't a formal policy, I definitely know a lot of people whose management teams were already demanding four to five days in the office either way. So I'm curious, like you said, what percentage of those 1.55 million people it actually affected. Yeah, but it's good news. 
[00:09:14] Speaker C: It's a crazy amount of people, by the way. I can't even fathom having 1.5 million employees. What do they all do? [00:09:21] Speaker B: Well, yeah, I mean, that explains why I can get stuff to my house in like six hours, right? [00:09:30] Speaker C: And then finally, wrapping up earnings, Microsoft reported an earnings and revenue beat for the fiscal first quarter, but was bludgeoned for predicting slower growth than analysts expected for the latter part of the year. Revenue was 65.59 billion, or $3.30 per share, versus the 64.51 billion, or $3.10 per share, expected. The CEO, Satya Nadella, said he feels pretty good that going into the second half of the year the supply and demand will eventually match up. Azure growth, which we care about the most, was 33%, with 12 points of that growth coming from AI services. And for those of you who tuned in to last week's episode, we did talk about how they are moving some of their things out of the Azure bucket into other areas, including mobile device management and Power BI. And they've added more of their AI dollars into Azure growth to help give a better perspective of their AI investment and return on that investment, along with their Azure cloud costs and revenue, to match up better to how AWS reports it. [00:10:24] Speaker B: And once again, I implore our listeners, if you can, to explain to me why, when you predict one set and you come in over that in terms of revenue, that's a bad thing. [00:10:38] Speaker C: No, they were applauded for doing well and beating expectations, but they were beaten because they predicted slower growth for this quarter and the next quarter. [00:10:48] Speaker B: Okay, so they lowered their... Okay, that makes more sense to me. [00:10:52] Speaker C: I don't think they lowered their guidance, but I think they basically said to expect it to be on the lower side of the range that they gave, which made investors unhappy. [00:11:00] Speaker B: Okay. 
[00:11:01] Speaker C: Which I think is just them hedging their bets on the election and Iran and Israel and, you know, all the other macroeconomic things. I think they're just hedging their bets. And if the economy starts turning around like we all hope it does, then they'll blow out those estimates, too, and get penalized for that. Probably too, Ryan. So you're probably right. [00:11:16] Speaker B: Yeah. See, I mean, I still don't understand the stock market. Probably never will. [00:11:22] Speaker C: All right, let's move to AI is Going Great, or how ML makes money, with ChatGPT back in the news. OpenAI announced ChatGPT search, which includes a Chrome extension to take over the search experience from Google. With ChatGPT search, you can search the web with fast, timely answers with links to relevant web sources, which you would have previously needed to go to a search engine for. ChatGPT will choose to search the web based on what you ask, or you can manually choose to search by clicking the web search icon. Search will be available at chatgpt.com and in their desktop and mobile apps, as well as in the extension. OpenAI says getting the answers on the web can take a lot of effort and sometimes requires multiple searches and digging through links to find quality sources and the right information. Now, with chat, you just ask in natural language, where's a great restaurant to eat in Philadelphia? You might be saying to yourself, but you know, AI is not real time and your LLMs are not up to date. And that's true. And so for real-time sources, ChatGPT has partnered with news and data providers to get things like weather, stocks, sports, news, and mapping updates to allow you to use those things. I did try the Chrome plugin personally. I uninstalled it after about 45 minutes, as I did not care for the results from ChatGPT. I actually realized I kind of like the hybrid that Google's given me with Gemini and their normal results. 
[00:12:35] Speaker B: That was my first thought with this. It was like, why would I give up the ability to do both? Like, you know, I can use ChatGPT as the model that, you know, I get it. But yeah, not yet. I think not yet. [00:12:52] Speaker D: I just like how their first, you know, real solution was, hey, let's do a Chrome plugin. Which, you know, just felt like a weird next step. [00:13:03] Speaker C: Well, I mean, if you're going to go after a browser, Chrome is probably the number one browser. [00:13:08] Speaker D: Chrome is. [00:13:09] Speaker C: I haven't looked in a while at browser market share, but last I looked, Chrome was well ahead of everybody else. [00:13:14] Speaker B: Yeah. Yeah. [00:13:15] Speaker C: As much as Microsoft is trying to get Edge to be a thing, it's not going to be a thing. Despite my work. [00:13:19] Speaker D: 66.65%, yeah. [00:13:22] Speaker C: Isn't Edge the number two, though? [00:13:24] Speaker B: For some terrible reason, probably because Microsoft. [00:13:28] Speaker D: Safari. Safari is 18 and Edge is 5. [00:13:34] Speaker B: Okay, that's good. [00:13:36] Speaker D: And Samsung Internet is 2.3. So a lot of Samsung phones, I guess. [00:13:40] Speaker B: Wow. [00:13:41] Speaker C: There are definitely a lot of Samsung phones, for sure. Well, I mean, I think it's pretty ballsy that of all the things you could do in AI and LLMs and things like that, your choice was, let's go poke the bear of our biggest competitor and go after their search business in their browser. It takes, well, some cojones there. [00:13:59] Speaker B: Yeah. [00:14:00] Speaker D: Wonder if it has anything to do with Microsoft backing them, and being like, hey, go poke the bear that we lost to and see what happens, you know, and kind of attacking it from a different angle. But I guess they don't have board seats anymore. Didn't Microsoft give up their board seats? 
[00:14:17] Speaker B: Oh, I can't remember what the end result of that fallout was. Like, it was interesting for a little while. Then I, I couldn't pay. [00:14:27] Speaker D: Drama. Drama. Drama. [00:14:28] Speaker B: Great. [00:14:29] Speaker D: Yeah. [00:14:29] Speaker B: Yeah. [00:14:31] Speaker C: All right. AWS has officially entered pre:Invent season, which means they are dumping a ton of great stories of product managers' dreams that did not make the re:Invent main stage. And there's some in here that I am very sad about, because in prior years they would have been very exciting re:Invent announcements that I would have been very happy about. And this just tells me that we're going to get a lot more AI this year, which makes me sad. So first up in the pre:Invent camp, AWS is making it easier to manage your security groups with a new security group sharing feature. You can now associate a security group with multiple VPCs in the same account using security group VPC associations. And when using a shared VPC, you can now also share security groups with participant accounts in that shared VPC using shared security groups. This ensures security group consistency and simplifies configuration maintenance for your admins, which, thank God you get this finally. Because if you had multiple VPCs in an account, you had to have multiple identical security groups that you'd potentially have to update all in sync if you needed to make changes, which was very unpleasant. Now if they could just make it possible for a company, potentially a SaaS company that has to have whitelisting available, to publish a managed security group that you could then subscribe to in your Amazon account to make access easier, I would be really appreciative of that. So Amazon, if you could add that quality of life improvement, I would be even happier. 
[00:15:48] Speaker D: They had something that I definitely used in the past, which was a Lambda that watched the Amazon SNS topic for the public IP addresses so you could block it. So in theory you could do the same thing. [00:16:02] Speaker C: It's a hack, though. It's not the same. Lambda spackle. [00:16:07] Speaker D: Well, especially because you were over the default 50 group limit, 50 rule limit. So every time you wanted to use it, you always had to request the limit upgrade. [00:16:16] Speaker C: Yeah, that's a hack to try to get there. But yeah, I'd like to publish this public, you know, security group that you can subscribe to, that I update when we update our SaaS product or you auto scale or you do any of the things that Amazon makes available to you to make things easier on yourselves that then kill all of your customers. So appreciate that. This makes me wonder what the. [00:16:36] Speaker B: There was a feature, the resource share, or I forget, it's the RAM thing. But anyway, a resource where you can share VPC security groups across accounts within your org. And now I'm like, wait, how did that work then? [00:16:52] Speaker C: If this is different, it's probably a hack too. [00:16:54] Speaker D: It was shared in the same VPC, so the security group was still associated with the VPC. [00:17:00] Speaker B: Oh, okay. [00:17:01] Speaker C: Yeah, so now you're gonna have multiple VPCs to one security group, which is a nice improvement, I think. [00:17:06] Speaker D: I'm just trying to figure out how the Terraform provider handles it, because everything's gonna have to change from a string to a list, which gives you a fun breaking change at one point. [00:17:19] Speaker C: Yeah, but I do appreciate the idea that, for every time I used to create security groups and chose the wrong VPC, I now don't have to delete it and recreate it. I can just associate the right VPC to it, which I do sort of appreciate. 
[00:17:33] Speaker D: Yeah, that was like EBS volumes and selecting the right subnet. [00:17:39] Speaker C: Yeah. The right availability zone, and how many times. [00:17:42] Speaker D: You would create something, you'd try to attach it. Why doesn't it show up? And 15 minutes later you finally figure out what the error was. Exactly. [00:17:53] Speaker C: AWS enhances the Lambda application building experience for those of you using the VS Code IDE. This basically is an experience that streamlines the code, test, deploy, debug cycle, providing a guided walkthrough that assists developers from setting up their local development environment to running their first application in the cloud, and adds an enhanced user experience in each step of the cycle. When you install the AWS Toolkit extension in VS Code, you'll be greeted with a new app building experience. It'll guide you through the necessary tooling installations and configurations required to set up your local environment for building Lambda-based apps. In addition, you get a curated list of sample apps which guide you through step-by-step coding, testing and deploying of those apps in the cloud on Lambda. I do wish this also gave you a CI/CD path, but I appreciate that, you know, not everyone needs that, and so this is a good little handy way to get started with Lambda development. [00:18:40] Speaker B: Yeah, my first thought when reading this is I'm curious on how this will like sort of fit in with my AWS SAM workflows, which does give you a CI/CD workflow because it's publishing directly with CloudFormation. So it is sort of an interesting thing. I'm hoping that you could kind of seamlessly merge those experiences because it would be kind of nice if they made that easier. We'll see, we'll try it out and I'll try to get back to you. [00:19:07] Speaker C: Okay, let us know. I didn't know you're doing a lot of Lambda coding these days. [00:19:11] Speaker B: All my personal stuff is always in Lambda now. That's true. 
[00:19:15] Speaker C: AWS Lambda now supports the AWS Fault Injection Service, so now you can do things like verify the Lambda errors for all language runtimes without code modification. And the tests that can be run are returning custom HTTP status codes from the gateway, if you have an API Gateway in front of your Lambdas, or adding a 1 second startup delay to 1% of your invocations, which can cause massive amounts of havoc, which could be a fun one to play on your developers, maybe in dev. But nice to have some fault injection opportunities for your Lambda functions as well. [00:19:46] Speaker B: Yeah, I really like that startup delay for just some of the invocations, just because that's such a real-world problem that I've run into, and you're just, like, trying to track down, you know, issues that come from that. [00:19:59] Speaker C: So I also wish you could do IP exhaustion as a FIS action. [00:20:03] Speaker B: Yeah, that'd be a good one. [00:20:06] Speaker D: I thought they got rid of a lot of the IP exhaustion once they moved to a single ENI that holds many Lambdas. [00:20:15] Speaker B: They did. And then that sort of caused other things because of the attachment being slower. So you'd have even more cold start issues as the Lambda was attaching to that shared ENI. So it was sort of this weird mix anyway. [00:20:30] Speaker C: Yeah, you got a little bit of both worlds. So I think they actually support both, I think: the ENI attachment way, and they still do the legacy way where you can just scale up inside the VPC, which would run you out of IPs. So yeah, still would like to test. And something that I am kind of shocked took this long: AWS now accepts partial payments via credit card. AWS customers can now pay the bill with multiple cards, or make a partial payment the first part of the month and then another payment at the end of the month. Until now, customers could only pay their entire bill at once prior to the due date. 
With partial payments, customers can now split the amount due into smaller payments, which would be charged on different cards or over different periods of time. [00:21:08] Speaker D: This. [00:21:09] Speaker C: Previously you would have had to call AWS customer service, but now you can just do it from your console without any hassle, any issues. [00:21:16] Speaker B: I'm just feeling very fortunate that I didn't ever run into this problem. And that just means that my AWS account spend wasn't big enough. I didn't know this didn't exist. Yeah. [00:21:28] Speaker C: Or your AWS spend was so big that you could not use a credit card because no credit card limit was that high. Right. So one of the two problems. [00:21:37] Speaker D: Or you just spent all your company's money. [00:21:40] Speaker B: Yeah, well, that's what I typically do. [00:21:42] Speaker D: Yeah. [00:21:43] Speaker C: In the interest of not boring you guys all with seven Redshift stories, I've combined them into one update for you guys. Several new features for Redshift, including Redshift integration with Amazon Bedrock, allowing you to leverage your large language model from simple SQL commands alongside your Redshift data. Next, AI-driven scaling and optimization in cloud data warehousing: Redshift Serverless now uses AI techniques to automatically scale with workload changes across all key dimensions such as data volume, concurrent users and query complexity. And the final announcement: the Redshift Data API now supports comma-separated values for your results versus JSON, which for many of us JSON is preferred, and for many of us CSV and Excel would be much better. So now you get both options. [00:22:23] Speaker B: Yeah, man, I just keep thinking about the Redshift product team. Like, they must be just devastated because clearly these were made for mainstage announcements. It's even got generative AI. They did all the things, they still didn't make it. 
[00:22:39] Speaker D: Yeah, the CSV one is questionable. [00:22:42] Speaker C: That one would definitely not have been mainstage, but they also made auto-copy generally available, which is the ability to auto copy data from your Redshift into other systems, and then also incremental refresh on materialized views for data lake tables, also GA, but I forgot to put them in the notes, so sorry. [00:22:57] Speaker B: I'm sorry. Those are both really cool features though. Yes. So I like it. [00:23:03] Speaker C: Amazon CloudWatch also had three announcements that I'm combining for your listening pleasure. CloudWatch will now monitor EBS volumes exceeding provisioned performance, which, thank goodness, this will allow you to quickly identify and respond to latency issues stemming from under-provisioned EBS volumes that may impact the performance of your applications. Ideally you have the other problem where you've over-provisioned your EBS volume and this will now tell you that you are wasting your money. So that'll be the other flip side of this. In addition to that you now get two new CloudWatch metrics for EBS volumes, volume average read latency and volume average write latency, to monitor the performance of those EBS volumes, which is a big complaint I had back in the day when I was trying to do any type of RAID across EBS volumes, which they tell you not to do, but you still did it because you had to sometimes. But one volume being not great was definitely a big problem. So this now will help you identify those. And then finally ElastiCache for Valkey node-based clusters now support server-side write request latency and read request latency metrics in CloudWatch as well. So very nice quality of life improvements to CloudWatch, and none of those made mainstage either. [00:24:05] Speaker D: No, no. But the EBS one for exceeding it is definitely useful for gp3. My brain went to GPT-3, and I was like, wait, that's not right. 
You know, because I definitely ran into issues, especially when it was first released, which I'm not even gonna try to remember how many years ago it was now. Just making me feel old, you know, I definitely provisioned some and all of a sudden started hitting like weird slowness and whatnot. And I kind of had to dig through the metrics. And then there was this weird math that every time I would open a support case, they would tell you the math. It would figure out how many iops and all the different things that you could do it. So it's really nice to have that metric so you can really easily see where your problems are. [00:24:51] Speaker B: Yeah, Especially the latency metrics. Right. Because I. Every issue that I've had at that level, it was such a black box that you end up, you know, opening a support case and working through with support, see it, and they're like, oh no, yeah, you, you're just exceeding the bandwidth. I'm like, it would have been nice to know some other way. [00:25:11] Speaker C: Yeah, I had similar problems with EFS attachments to Linux boxes where all of a sudden things just are going terribly and you figure out eventually that, oh, I've exceeded my burst capacity for the EFS volume and I need to provisioned. But then even provisioned, they didn't have great metrics. They fixed that a couple of years ago, which I appreciated. So now I can properly size my EFS volumes. Although I'm still waiting for AI to come help me. Save me for having to do that. Oh, it will someday. It'd be nice to be able to just auto scale the amount of IO I need for the website that I don't, you know, only does traffic during the day. So yeah, great, but then you'll just. [00:25:46] Speaker B: Complain because it's like, it's over optimizing, you'd be like, what are you doing, bot? And it's like, it's not. My credit card got me in. [00:25:53] Speaker D: But you can now pay with two credit cards. Don't worry, you'll be fine. It's fine. [00:25:57] Speaker C: Okay, sorry. 
Well, last week or the week before, one of you mentioned to me like, why haven't they deprecated AWS Supply Chain? And I said, well, they're not going to deprecate that because Amazon uses it for the store. And to prove that to you, they announced this in a full blog post. Not just a quick What's New post, but a full blog post explaining the new general availability of AWS Supply Chain Analytics powered by Amazon QuickSight. This new feature helps you to build custom report dashboards using your data in AWS Supply Chain. And with this feature your business analysts or supply chain managers can perform custom analysis, visualize data and gain actionable insights for the supply chain management operations of their business. And for the executive in me, some pretty graphs in the blog post if you check it out. But yeah, don't think Supply Chain is going away anytime soon. [00:26:47] Speaker D: Sorry, all I have to say is this whole article is here for Justin to tell us, told you so. [00:26:53] Speaker C: Yep, yep, that's the whole reason. [00:26:55] Speaker B: It's the only reason it's included. Absolutely. [00:26:57] Speaker C: Because when would I talk about Supply Chain for any other reason? Absolutely not. [00:27:01] Speaker D: Want to make sure everyone was clear. [00:27:03] Speaker C: Yeah, no, people who listen know that I'm petty like that. [00:27:12] Speaker A: There are a lot of cloud cost management tools out there, but only Archera provides cloud commitment insurance. It sounds fancy, but it's really simple. Archera gives you the cost savings of a one- or three-year AWS savings plan with a commitment as short as 30 days. If you don't use all the cloud resources you've committed to, they will literally put the money back in your bank account to cover the difference. Other cost management tools may say they offer commitment insurance, but remember to ask: will you actually give me my money back? 
Archera will. Click the link in the show notes to check them out on the AWS Marketplace. [00:27:52] Speaker C: Last year at re:Invent 2023 they announced Postgres Aurora Limitless. And now because re:Invent is just around the corner, they needed to make that crap generally available before it lapped itself. And so now it's generally available. Amazon Aurora PostgreSQL Limitless Database is a new serverless horizontal sharding capability for Aurora. You can scale beyond the existing Aurora limits for write throughput and storage by distributing the database workload over multiple Aurora writer instances while maintaining the ability to use it as a single database. [00:28:21] Speaker B: Yeah, this is like, the more I learn about actual relational databases, since my use cases are usually NoSQL, the more this type of thing is crazy, right? Like it's cool, it's absolutely neat. It scares the crap out of me. But you know, being able to group different tables on certain writers and have that load spread across is super interesting. And then you still get the sort of simplicity at the application, you know, connection pool level, just to point it in one place. [00:28:57] Speaker C: Yeah, so one of the things that will mess people up a little bit is that, you know, the way you size this is minimum and maximum capacity measured in Aurora capacity units, which, you know, are magic numbers that they created that sort of represent CPUs and things. And so you can set 16 ACUs as your minimum, and then you go up to as many as 6,144 ACUs as the maximum, which seems like a lot of shards. And so they do say if you need more than 6,144 ACUs to contact them, because they want to talk to you about what you did wrong. [00:29:32] Speaker B: Talk you out of your use case, most likely. [00:29:34] Speaker D: Yeah. You're doing something wrong. We'll let you know what it is shortly. [00:29:37] Speaker C: Yeah. 
And then basically each shard has a maximum capacity of 128 terabytes of disk space. And the reference tables have a size limit of 32 terabytes for the entire DB shard group. So there are some sharp edges. [00:29:48] Speaker B: There are limits to this limitless database. [00:29:50] Speaker C: Yeah, it's the limited limitless database. Do read the fine print on this one. SES, which you all think is a very simple SMTP product, has continued to get additional features and fixed one of the heavy lifting items that I've had to do many times. SES now allows customers to provide templates directly within the SendBulkEmail or SendEmail API request. And SES will use the provided inline template to do the mail merge to render and assemble the email content for delivery, reducing the need to manage template resources in the SES account, which was a very common use case, either through Lambda functions or through an EC2 box that just sat there spinning templates and emails out the door. [00:30:31] Speaker B: God, I remember you asking the SES product team for this, like forever ago. [00:30:37] Speaker C: Because if I have a template and you have an email address, you should be able to combine them in the platform. You shouldn't have to do it outside and then send a fully completed email. But my feature requests eventually get responded to 10 years after I needed them. [00:30:49] Speaker B: Yeah, that's pretty crazy. Yep. [00:30:54] Speaker D: I still get annoyed by the SES team, but that's all I got. [00:30:58] Speaker C: Their whole email. Like spam filtering? Yeah. They're so anal about the bounce rate and all that. It's like, I get why they have to do it, but it's also very frustrating. [00:31:08] Speaker B: Yeah. [00:31:09] Speaker D: Also, they only talk to the root email address and everything else and you can't actually ever talk to anyone. And like at one point they blocked one of my customers and it was because they were doing what their business was. 
But Amazon said it wasn't what their business was. And yeah, turns out you can just switch it to a new region and they don't care anymore. So it bypasses all this. So it's great. [00:31:33] Speaker C: So before when it used to be the global mail senders where they were managing reputation for the ips, that was why they were really anal about a lot of that. Now you can set up your own. You can specify your own IP address for it and then they don't care. [00:31:47] Speaker D: No, they still care. [00:31:49] Speaker C: They don't care as much. But some care because they don't want to be known as a blacklisted box. But if it's your ip, they don't care as much as they used to. [00:31:57] Speaker D: Yeah, yeah. [00:31:59] Speaker C: AWS is launching UDP protocol Support for AWS PrivateLink on IPv4 and IPv6 and on the network load balancer over IPv6. Previously you could only support TCP via the private link, while the network load balancer only supported UDP over IPv4. This enables customers who use PrivateLink and clients that use IPv6 to access UDP based applications such as media streaming, gaming, VoIP and other applications. None of them that I have to support because UDP is lossy and that typically doesn't work for a SaaS app. [00:32:30] Speaker B: No, it does not. And I haven't had to do it for a long time with just general infrastructure services. [00:32:37] Speaker C: I mean, I do sometimes dream about working for a gaming studio or something where you're in the wild, wild west of making game servers work and all these weird protocols and fun things you have to do. And then I'm like, no, no, I'm okay, I don't want to do that. But I like the idea of it. A cowboy in me is like, yeah. [00:32:51] Speaker B: I want to do work vacations. I want to figure out how to like, instead of like going to work, like go to work in like some specialty industry for a little while. [00:32:59] Speaker C: Like internships for adults. 
[00:33:00] Speaker B: Little like, yeah, but I still want to get paid. Yeah, but I want to do something cool. Yeah. [00:33:07] Speaker C: I just want to spend a month with you guys and I want to do something cool. Yeah, yeah, I get it. Makes sense. [00:33:11] Speaker B: And I'll be either exhausted or bored of you by then. And then I'll go home, you know, you don't have to put up with me anymore. [00:33:19] Speaker C: Well, let's talk about events. AWS is launching AWS AppSync Events, a new solution for building secure and performant serverless WebSocket APIs to power real-time web and mobile experiences at any scale. AWS AppSync Events lets you easily broadcast real-time event data to a few or a million subscribers using secure and performant serverless WebSocket APIs without needing to manage connections or resource scaling. And if I knew what AppSync did and I knew what my use case would be for this, I'd probably be really excited about it. But I don't really know either. So that's all I'm going to say about it. [00:33:50] Speaker B: Yeah, that's what I was sort of thinking. Like this is events for WebSockets. Like I don't get any of this. [00:33:58] Speaker C: Yeah, even their blog, like they have a full blog post on this too. I just linked to the What's New. But if you go to that, then you go to the thing, it talks about, you know, all the things it could do, but it doesn't give you a single use case example. So like clearly if you know what this is and when you need it, you're like, yes, this is lovely, but if you have no idea what it's needed for, then you're not the right people. I guess I do have one little sentence here. From live sports scores, group chat messages, price levels or location updates, developers want to build apps that give their customers pertinent up-to-the-second information. Okay, so this is like you're sending events to mobile apps with updates to like pertinent information they need, like sports scores. 
Okay, like that makes some sense to me. [00:34:32] Speaker B: Okay. [00:34:33] Speaker C: But not really. But why would you do the WebSocket, I guess, versus. Oh, I mean, I guess. [00:34:37] Speaker D: Well that makes sense for WebSockets. You can live push down. Yeah, yeah. [00:34:42] Speaker C: What was the technology before WebSockets? There was something else we used to do back in the day. To do like a section of a website that you wanted to constantly update, you would do a polling mechanism in the browser. I don't remember the name of that, but maybe this is probably just because I don't do web development anymore. I've lost the thread on what they did there and this is probably the future. [00:35:00] Speaker D: Yeah, I mean WebSockets is kind of the way that people keep that live connection to be able to send bidirectional traffic and whatnot back and forth. Just, WebSockets use memory and that crashes systems. So I really like this serverless WebSocket aspect. [00:35:16] Speaker C: Yeah, I like that I don't have to do anything. I like that part. All right, let's talk about Route 53 and things we don't understand either. So Route 53 now supports HTTPS and Service Binding, or SVCB, record types, which provide clients with improved performance. Instead of only providing the IP address of an endpoint in the response to a DNS query, HTTPS and SVCB records respond with additional information needed to set up connections, such as whether your endpoint supports HTTP/3, thereby letting supporting clients connect faster and more securely. In addition, you can create TLS Authentication, or TLSA, records with Route 53, which may be used to associate TLS server certificates or public keys with your domain name, leveraging DNS Security Extensions, or DNSSEC. 
This provides you with a prerequisite component of DNS-based Authentication of Named Entities, or DANE, a protocol frequently used in conjunction with SMTP to assure secure and confidential mail transport. [00:36:10] Speaker B: That's kind of cool. [00:36:12] Speaker C: I get the idea of like telling them in the DNS request like, hey, we support HTTP/3. That way you don't have to make an HTTP/2 call first and then upgrade your protocol to 3. So I get like, some of that makes sense to me, but it seems like maybe you're overloading a little bit into the DNS to do this, but I could see the advantage of it. [00:36:30] Speaker B: Well, if all problems are DNS, you should just add more complexity. Right, right, of course. Right. [00:36:35] Speaker D: Well, it's your own database, it's an infinitely scalable database, so let's keep adding things to it. [00:36:42] Speaker B: Right. [00:36:43] Speaker C: Oh, I also forgot. They now also support the ability to associate Secure Shell key fingerprints with your domain name through SSHFP records. SSHFP records provide you with a mechanism to record fingerprints in DNS through DNSSEC and to distribute them to clients via SSHFP for validation of the fingerprints published in DNS against the fingerprints offered by the server. As a result, when connecting to a server via SSH, clients are able to securely authenticate the server, which basically, I guess, is when you SSH to a box and it says, this is a server you don't know about, and you can hit yes, add it to my list of known hosts. I assume this is something around that. [00:37:15] Speaker B: Wow. Like I was all into the TLSA records, but that sounds nutso to me. [00:37:23] Speaker C: Yeah. Again, a use case would be helpful in this particular blog article. Like, what's my use case for this? [00:37:32] Speaker D: A lot of these things sound good, but how many other tools are actually looking for these records? 
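For anyone wondering what actually goes into one of those SSHFP records: per the SSHFP RFCs, the record data is an algorithm number, a fingerprint type, and a hex digest of the server's public key blob. Below is a minimal Python sketch of that calculation, not anything Route 53 specific. The function name `sshfp_rdata`, the host name `host.example.com` and the placeholder key blob are made up for illustration; in practice `ssh-keygen -r <hostname>` prints these records for a real host, and OpenSSH clients only consult them when `VerifyHostKeyDNS` is enabled.

```python
import base64
import hashlib

# Algorithm numbers from the SSHFP registry (RFC 4255, RFC 6594, RFC 7479).
SSHFP_ALGORITHMS = {
    "ssh-rsa": 1,
    "ssh-dss": 2,
    "ecdsa-sha2-nistp256": 3,
    "ecdsa-sha2-nistp384": 3,
    "ecdsa-sha2-nistp521": 3,
    "ssh-ed25519": 4,
}
SHA256_FINGERPRINT_TYPE = 2  # fingerprint type 1 is SHA-1, type 2 is SHA-256


def sshfp_rdata(public_key_line: str) -> str:
    """Build SSHFP record data from one line of an OpenSSH .pub file.

    The line looks like "<key-type> <base64 key blob> [comment]".
    """
    key_type, key_b64 = public_key_line.split()[:2]
    key_blob = base64.b64decode(key_b64)
    digest = hashlib.sha256(key_blob).hexdigest()
    return f"{SSHFP_ALGORITHMS[key_type]} {SHA256_FINGERPRINT_TYPE} {digest}"


# Placeholder blob, NOT a real host key: it only stands in for the bytes
# that would be base64-encoded in /etc/ssh/ssh_host_ed25519_key.pub.
placeholder_blob = b"not-a-real-host-key"
example_line = (
    "ssh-ed25519 " + base64.b64encode(placeholder_blob).decode() + " host.example.com"
)
print("host.example.com. IN SSHFP", sshfp_rdata(example_line))
```

The client side is the part the hosts are unsure about: the record only helps if the zone is DNSSEC-signed and the SSH client is configured to check DNS before prompting about an unknown host.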
[00:37:39] Speaker C: Well, I assume that the web browser, like Chrome, is pushing a standard that, you know, checks for this header and this data to see HTTP/3 versus HTTP/2. [00:37:50] Speaker D: Well, it wouldn't be a header, it would be its own. [00:37:52] Speaker C: Well, yeah, sorry. In the record. [00:37:55] Speaker D: So every time, Chrome or a web browser is making another DNS record request: as it calls google.com, it also calls the SVCB record too, at the same time. [00:38:09] Speaker C: I guess you make both DNS requests. [00:38:13] Speaker D: Now I'm wondering if my Pi-hole even supports this. So are these even going to be useful for myself? [00:38:18] Speaker C: Well, you're going to find out that your Pi-hole doesn't work anymore and. [00:38:21] Speaker D: So you'll be for the podcast. That was a different problem. [00:38:26] Speaker C: Actually, they have charts in this other article. I'll link to this in the show notes. But yeah, being able to validate the signature of the SSH host is a really interesting use case. [00:38:39] Speaker B: Seems like an edge case. Not something that you would want to do at that level, but yeah, no, I was looking through the use case I found on the Internet. [00:38:54] Speaker C: SSH fingerprinting helps mitigate man-in-the-middle attacks and improve the overall security of your system. Does that make sense? [00:39:00] Speaker B: I mean, no, I get how it's used, but it's also funny because is that extra layer of protection really needed, I guess, you know, because you already have the SSH, you know, exchange at the cipher level and verification of host keys. And I assume that you're still going to have to add this new key that's attached to the DNS record to your known hosts. [00:39:23] Speaker C: Look, the Route 53 product manager needed something to release for his performance reviews. [00:39:29] Speaker B: And so this is what he came up with. 
[00:39:31] Speaker C: And he doesn't care if you don't need it. Yeah, let's see. [00:39:34] Speaker D: The SSH thing could be really cool, like for Route 53 internal records, you know, your host boots up, it puts it in, but it also kind of goes against the whole like cattle, not pets, you know, I shouldn't be SSHing into my boxes that much. And also SSM. So, you know, I have many other questions. It's kind of a cool thing. [00:39:59] Speaker C: So in a blog post I can only call very ironic, Amazon posted how executives can avoid being disrupted by emerging technologies, which is great since they were disrupted very hard by AI. Amazon says innovation happens 50 times faster than it did five years ago. And to be good at staying ahead of it, you need to anticipate technology trends and be a bit of a technology fortune teller. And they give you five ways to do this technology fortune telling. Number one: engage in technology monitoring and scouting. Two: create a culture of curiosity and experimentation. Three: use technology roadmapping and scenario planning. And four: form external partnerships. I don't think any of those would have avoided them being disrupted by AI. [00:40:40] Speaker D: I wonder if this was their lessons-learned post, like, hey guys, this is everything we did wrong. Maybe we should go do these things. Also, I think you only listed four. [00:40:51] Speaker C: Not five, because they only have four in the actual article. [00:40:55] Speaker D: Very cool. [00:40:55] Speaker C: There you go. [00:40:57] Speaker B: That is funny. They really do only have four. [00:41:00] Speaker C: I wonder if I miscounted. [00:41:01] Speaker B: Oh no, I get it. [00:41:02] Speaker C: No. [00:41:02] Speaker B: 1. 2. [00:41:05] Speaker C: Anyways, if you're interested in how Amazon says to not get disrupted and how you can disrupt them, just read this guide and hide from them. 
[00:41:14] Speaker B: Well, before you recommend they read this guide, one of the tips is to hold internal hackathons and invite your technology providers to participate, so you know, your mileage may vary. [00:41:25] Speaker C: Indeed. I did enjoy "create a CoE." No, not a Center of Excellence, a Center of Engagement. In the Center of Engagement, everyone, technologists and business users, is encouraged to try new technologies and report the results. I'm sure your infosec team will love that. That's great. [00:41:41] Speaker B: I mean it's. I've done the same thing in my day job, so I can't really. [00:41:45] Speaker C: Sunglasses. [00:41:47] Speaker B: It's a center of enablement, not a center of excellence. [00:41:50] Speaker C: Right, but enablement is different than engagement. It sure is. It sure is. We'll let that go. All right. Google Cloud this week is telling us now that, just like Microsoft, they will be forcing mandatory MFA for Google Cloud in a phased approach that will roll out to all users worldwide during 2025. To ensure a smooth transition, Google Cloud will provide advance notification to enterprises and users to help plan their MFA deployments. Phase one starts right now, as I just told you about it, where they encourage you to do MFA adoption. And in early 2025, MFA will be required for password logins to the Google consoles and Google mail. And phase three, which will be end of 2025, MFA will be required for federated users. And those are users coming in through federation. So thanks. It's just the way it has to be at this point, folks. [00:42:39] Speaker B: I can't really argue. Like, I am a little nervous about that phase three just because there's always differences when you do MFA through federation, as I've learned through AWS integrations. And so it's like, I hope that goes smoothly. [00:42:54] Speaker C: It's also like, okay, well if I'm federating and my federated partner already did MFA. You're now forcing me to MFA through you. 
Well, that's. [00:43:01] Speaker B: That's exactly what I'm talking about. So, like that passing through that sort of MFA authenticated sort of property. That's a pain because it has to. You don't want to do MFA to MFA like you. And users will not. [00:43:15] Speaker D: They'll. [00:43:15] Speaker B: They'll revolt. [00:43:16] Speaker C: Like, I know. [00:43:17] Speaker B: Or maybe I'm touchy, but I almost threw a computer the other day, so I understood. [00:43:22] Speaker D: This is why Ryan can't have nice things. [00:43:24] Speaker B: Got it. [00:43:25] Speaker D: MFA is all it takes to set him off. [00:43:28] Speaker C: That's why his podcast laptop is a piece of junk. [00:43:29] Speaker B: That way when we make it mad. [00:43:30] Speaker C: He just can trash it and not be upset. [00:43:32] Speaker B: It's the. It's the oldest because it's like a brick that you cannot break. [00:43:37] Speaker D: Do you still have a CD drive in your laptop, Ryan? [00:43:41] Speaker B: No, I do not. It's not quite that bad. [00:43:43] Speaker D: Just for the audience: he actually had to look to check. [00:43:48] Speaker C: He's like, maybe it does. I haven't used it in a decade. But yeah. All right, well, Google is dumping money into AI hardware at an impressive pace. And so we got to geek out with some hardware today, guys. [00:43:58] Speaker D: Woohoo. [00:43:59] Speaker C: First up is Trillium. Their sixth-generation TPU is now available to Google Cloud customers in preview, which means it won't be available to you unless you pay them lots of money. The new Trillium chip offers over 4x improvement in training performance, up to 3x increase in inference throughput, a 67% increase in energy efficiency, an impressive 4.7x increase in peak compute performance per chip, double the high bandwidth memory and double the interchip interconnect bandwidth, which I'm sure is important for inference. [00:44:28] Speaker D: Those are pretty impressive stats though. 
[00:44:31] Speaker B: Yeah, specifically the energy part. [00:44:34] Speaker C: Right? [00:44:34] Speaker B: Because I think that's going to be the next arms race. And these chips for training. [00:44:39] Speaker C: 100%. [00:44:40] Speaker D: 100%. Especially as the EU and other countries all make you start to track all your carbon emissions, whether it's first, second or third source and how it affects everything. So yeah, these are definitely going to come into a lot of effect. [00:44:55] Speaker B: And apparently we're not going to be able to spin up new nuclear reactor sites just willy-nilly because the regulatory bodies say no. [00:45:03] Speaker D: Sorry, I just bought a nuclear reactor in my backyard. What could possibly go wrong? I have a hole. It'll be perfect. [00:45:09] Speaker C: Perfect. The new A3 and A3 Mega VMs are powered by the Nvidia H100 Tensor Core GPUs. We just talked about these last week, so I won't bore you too much about those, but they do have 2x the GPU-to-GPU bandwidth, up to 2x higher LLM inference performance and the ability to scale to tens of thousands of GPUs in a dense, performance-optimized cluster for large AI and HPC workloads. So they are pretty big. And if you want to learn more about them, check out last week's episode where we talked about them in quite a bit of depth. And the AI supercomputer support for the upcoming Nvidia GB200 NVL72 GPUs will be coming, with more details soon that we'll talk about in a future show. [00:45:47] Speaker B: But it sounds big. [00:45:48] Speaker C: That sounds big. [00:45:48] Speaker B: That's what I like about that. Yeah, yeah, they put it in there and it sounds big. [00:45:51] Speaker C: Yep. [00:45:52] Speaker D: I actually got their pre announcing it. [00:45:54] Speaker C: Yeah, like we want you to know it's coming because our competitors are going to be offering these, but we also are going to offer them so we want you to know that. 
But we don't know what they're going to cost or anything about them because Nvidia hasn't given us any details. But we want to announce first. That's what that was. [00:46:06] Speaker D: Yeah. So we're announcing something that we're going to add to our product line from a third party that we don't know anything about. But we're going to have guests. [00:46:14] Speaker C: Exactly. [00:46:14] Speaker D: We're good. Great. [00:46:15] Speaker C: Yeah, good. Titanium, which is their answer to Nitro, basically, now supports enhancements for AI workloads. Titanium reduces processing overhead on the host through a combination of on-host and off-host offloads to deliver more compute and memory resources for your workloads. And while AI infrastructure can benefit from all of Titanium's core capabilities, AI workloads are unique in their accelerator-to-accelerator performance requirements. And to meet those they've introduced a new Titanium ML network adapter that includes and builds on Nvidia ConnectX-7 NICs to further support VPCs, traffic encryption and virtualization. And I don't know what those are, but they're made by Nvidia so they had to be expensive and fast. [00:46:53] Speaker B: All I read is that with their Titanium solution, their NICs were melting down due to AI workloads so they had to solve this with something. That's all I read. [00:47:03] Speaker C: Those Nvidia ConnectX-7s are diamond plated. [00:47:06] Speaker B: Yes. [00:47:09] Speaker C: Hyperdisk ML is now generally available. Hyperdisk ML is their AI-focused block storage service that they announced in April at Next, and it is now generally available. It complements the computing and networking innovations discussed in the blog. With purpose-built storage for AI and HPC workloads, Hyperdisk ML accelerates data load times effectively. 
You can attach 2,500 instances to the same volume and get 1.2 terabytes per second of aggregate throughput per volume, which is more than 100 times higher than the offerings from major block storage competitors. And shorter data load times translate to less accelerator idle time and greater cost efficiency of your training jobs. And this supports GKE as well, to create multi-zone volumes for your data to live on in Hyperdisk ML. [00:47:51] Speaker B: Interesting. I wonder, like. Yeah, I was just thinking about, you know, the old days of doing object store and then block store and, you know, mounting things with like FUSE, and so this is kind of interesting. It's cool. AI is bringing back some pretty fundamental problems that we used to have to solve for different workloads, but now it's. [00:48:12] Speaker C: All magnified by all of our web apps. They have sort of the similar usage patterns. Unless your database didn't scale, you didn't need a lot of big iron. But now AI requires big iron and so all these problems come roaring back. [00:48:25] Speaker B: Everything old is new again. [00:48:27] Speaker C: Exactly. [00:48:28] Speaker D: Where's the mainframe? Let's have an AI mainframe. [00:48:32] Speaker B: No, no, no, no. [00:48:34] Speaker D: Too far. Okay. [00:48:35] Speaker B: Sorry. Yeah. [00:48:38] Speaker C: And this is what I'm excited about. The Arm-based Axion CPUs are now generally available. They were first announced at Google Next '24. Axion is the Arm-based CPU designed for the data center, and it is generally available in the C4A class of computers with up to 10% better price performance than the latest generation Arm-based instances available from leading cloud providers like AWS. C4A VMs are a great option for a variety of general purpose workloads like web and app servers, containerized microservices, open source databases, in-memory caches, data analytics and media processing, and AI inference applications. 
Andi Gutmans, VP and GM of Databases at Google, says Spanner is one of the most critical and complex services at Google, powering products including YouTube, Gmail and Google Ads. In our initial tests on Axion processors, we've seen up to 60% better query performance per vCPU over prior-generation servers. As we scale out our footprint, we expect this to translate to a more stable and responsive experience for our users, even under the most demanding of conditions. They have three different families in the VM portfolio: a standard, a high memory and a high CPU. They have different ratios of CPUs to memory: 1 to 4 for the standard, 1 to 8 for the high memory and 1 to 2 for the high CPU option in that portfolio. I did not look at pricing for this but expect them to be very competitive with their Intel and AMD counterparts. I do also have a quote from Liz Fong-Jones, field CTO at Honeycomb. Honeycomb.io helps engineering teams debug their production systems quickly and efficiently. Sampling is a key mechanism for controlling observability costs for our customers who are running applications on Google Cloud. We have validated that the new Axion CPUs and C4A VMs offer the best price performance on Google Cloud for running your Refinery sampling proxy to forward only the most important representative samples to Honeycomb on AWS. [00:50:22] Speaker B: Yeah, that was a weird quote. For our customers that run on a different cloud than us, this works great. Okay, yeah. [00:50:34] Speaker C: I just find it funny because Liz Fong-Jones has also been quoted many times for her love of Graviton CPUs on AWS. It was funny that they included that one. [00:50:44] Speaker B: And I'm guessing that's why they went to her for a quote. It's just so odd. [00:50:51] Speaker C: The Refinery proxy, that is, how you send your data from your servers to our servers, runs better and more efficiently. Thank you. Thank you for that. 
All right, so a while ago Google introduced Cross-Cloud Network to transform and simplify hybrid and multi-cloud connectivity and enable organizations to easily build distributed apps. As organizations modernize their infrastructure leveraging AI, ML and other managed services, they have adopted Cross-Cloud Network to reduce operational complexity and lower the total cost of ownership. The point of Cloud Interconnect was to provide robust, high-bandwidth, SLA-backed connectivity to Google Cloud to help you migrate your workloads to Google Cloud. With Cross-Cloud Interconnect they enable dedicated and private connectivity from Google to another cloud provider, and together they form the foundation for building hybrid and multi-cloud distributed apps, which is their way of saying please move your workload to Google. Customers have traditionally lacked the capability to prioritize traffic, though, which forced them to over-provision bandwidth or risk subpar performance during periods of congestion in their cross-cloud network. To address this need for traffic prioritization, Google is introducing application awareness on Cloud Interconnect in preview. Google Cloud is the first major cloud service provider to offer a managed traffic differentiation solution that empowers you to solve the critical challenges of traffic prioritization over Cloud Interconnect. You're also the only cloud that provides this, but you know, whatever. Application awareness enables flexibility with a choice of two policies: strict priority across traffic classes, and bandwidth share per traffic class. Application awareness on Cloud Interconnect provides multiple business benefits, including prioritization of business-critical traffic, lower total cost of ownership and a fully managed, SLA-backed solution. So I guess they went to the Riverbed store and bought a bunch of Riverbeds to do packet shaping.
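To make the difference between the two policies concrete, here is a toy model in Python. It is not Google's implementation or API: the traffic class names, the 10 Gbps link, the demands and the percentage shares are all invented for illustration, and the bandwidth-share function is deliberately simplified (it does not hand unused share to other classes).

```python
def strict_priority(capacity_gbps, demands):
    """Serve classes in priority order; lower classes get whatever is left.

    demands: list of (class_name, demand_gbps), highest priority first.
    """
    allocation = {}
    remaining = capacity_gbps
    for name, demand in demands:
        granted = min(demand, remaining)
        allocation[name] = granted
        remaining -= granted
    return allocation


def bandwidth_share(capacity_gbps, demands, share_percent):
    """Cap each class at its configured percentage of the link.

    Simplification: a class's unused share is not redistributed.
    """
    return {
        name: min(demand, capacity_gbps * share_percent[name] / 100)
        for name, demand in demands
    }


if __name__ == "__main__":
    # Hypothetical congested link: 20 Gbps of demand on a 10 Gbps link.
    demands = [("web", 4), ("ml-training", 8), ("backup", 8)]
    shares = {"web": 50, "ml-training": 30, "backup": 20}
    print(strict_priority(10, demands))
    print(bandwidth_share(10, demands, shares))
```

In this made-up scenario, strict priority gives the lowest class nothing once the higher classes fill the link, while the share policy leaves every class a bounded slice.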
[00:52:26] Speaker B: Yeah, I mean, QoS-ing subnets hasn't worked forever, you know, and I wonder if this is going to end up being the same thing, where everything ends up being prioritized so nothing ends up being prioritized. That's crazy.
[00:52:40] Speaker D: I just wonder how many people actually need this, like for QoS. Like, I feel like I've really only set it up on like VoIP and like backups, off-site backups back in the day. Like, that was about it.
[00:52:55] Speaker B: Well, I imagine this is going to be backup heavy, right? Because with Cloud Interconnect, you're going to ship, you know, large amounts of data for backups and/or ML training, and they don't want that to take down the web serving tier or something like that.
[00:53:09] Speaker D: I don't know, it just feels like.
[00:53:11] Speaker B: Feels like the wrong way to manage it.
[00:53:13] Speaker D: Right. And like bandwidth and net and pipes, you know, I'm just going to state something which I know is incorrect, but like, should be cheap enough nowadays, you know, except for, you know, no one's lowered the price of egress traffic in years, but we'll bypass that point. So, you know, it should be cheap enough that I'd be surprised if you really have to over-provision that much. But your nightly backup job runs, if you're still doing full backups. Yeah, I guess it can affect it.
[00:53:44] Speaker B: Well, actually you just reminded me of something. Like, if you are at capacity with your provisioned link, getting a new link isn't exactly trivial, right? You have to coordinate across all kinds of things. And so this might be a nice tool for, you know, people that are up against that threshold, where it's just like, well, as long as this traffic goes through, you know, the other thing can take the hit.
[00:54:06] Speaker C: So I mean, I don't exactly understand where you specify this policy.
Like, you know, reading through the documentation, I see where you configure the traffic destination on the Cross-Cloud Interconnect. But how do you tag the app traffic to the specific class of traffic that you want it to be in?
[00:54:24] Speaker B: This new world of cloud networking?
[00:54:26] Speaker C: I don't know. Right, yeah, I know like the old-school ways. What is this? We put it in a VLAN and then we have the VLAN. Oh, that's actually what they're doing. It's VLAN attachment.
[00:54:37] Speaker B: It is VLAN attachment. Okay.
[00:54:38] Speaker C: So basically you put the apps in different VLANs and then you're doing it.
[00:54:42] Speaker B: You just prioritize the VLAN. That makes sense.
[00:54:45] Speaker C: Okay, that's actually pretty rudimentary. But still, first cloud provider to offer it. All right. And finally, we talk all the time on the show about how important it is to know the principles behind how the hyperscaler of your choice is built. In the case of AWS, they have a very strong regional availability zone isolation model, and with GCP, we have talked about their common storage layer and what that common storage layer enables, as well as the global network. But this blog post I grabbed for us because it has key insights into the design thinking behind 25 years of the Google network. And I think it's important to understand how your network is built. So as Google says, Rome wasn't built in a day, and neither was Google's network. But 25 years in, they're sharing some of the details of how they started out small and now run the fifth-generation Jupiter data center network, which now scales to 13 petabits per second of bisectional bandwidth. And for perspective, this network could support a video call of 1.5 megabits a second for all 8 billion people on earth at the same time.
[00:55:41] Speaker B: Wow.
[00:55:42] Speaker D: What's bisectional bandwidth?
[00:55:45] Speaker C: Don't ask questions. I don't know.
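The "video call for everyone on earth" comparison can be sanity-checked with simple arithmetic. The sketch below uses only the two figures quoted (13 petabits per second and 8 billion people), assumes decimal prefixes, and uses function names invented for this example. Note that the comparison only works in bits: a call at 1.5 megabytes per second would need eight times the bandwidth.

```python
PETA = 10**15
MEGA = 10**6


def per_person_megabits(total_bits_per_second, people):
    """Bandwidth available to each person, in megabits per second."""
    return total_bits_per_second / people / MEGA


def required_petabits(call_megabits, people):
    """Total bandwidth needed if everyone makes one call, in petabits per second."""
    return call_megabits * MEGA * people / PETA


if __name__ == "__main__":
    total = 13 * PETA        # 13 petabits per second, as quoted
    people = 8 * 10**9       # 8 billion people, as quoted
    print(per_person_megabits(total, people))   # share of the fabric per person
    print(required_petabits(1.5, people))       # need at 1.5 megabits per call
    print(required_petabits(1.5 * 8, people))   # need at 1.5 megabytes per call
```

Dividing 13 petabits per second across 8 billion people leaves a little over 1.6 megabits per second each, which is why a 1.5 megabit call fits and a 1.5 megabyte call would not.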
[00:55:48] Speaker B: We didn't research that in advance.
[00:55:50] Speaker C: You should Google that. So their network evolution, the minimum number.
[00:55:54] Speaker D: Of wires that need to be cut when dividing a network into two equal segments of nodes.
[00:56:01] Speaker B: Cool.
[00:56:01] Speaker C: Not sure that helps make any sense.
[00:56:02] Speaker B: To me at all. No.
[00:56:05] Speaker C: Yeah, you didn't help us. The network evolution has been guided by a few key principles. First of all, anything, anywhere: their data center networks support efficiency and simplicity by allowing large-scale jobs to be placed anywhere among 100,000 servers within the same network fabric, with high-speed access to needed storage and support services. And this scale improves application performance for internal and external workloads and eliminates internal fragmentation. They needed it to be predictable and low latency: they prioritize consistent performance and minimize tail latency by provisioning bandwidth headroom, maintaining 99.999% network availability and proactively managing congestion through end-host and fabric cooperation. It needs to be software defined and systems centric: leveraging software-defined networking for flexibility and agility, they qualify and globally release dozens of new features every two weeks across the global network. Incremental evolution and dynamic topology: basically, incremental evolution helps them refresh the network granularly rather than bring it all down wholesale. A dynamic topology helps them continuously adapt to changing workload demands. The combination of optical circuit switching and SDN supports in-place physical upgrades and an ever-evolving heterogeneous network that supports multiple hardware generations in a single fabric.
And finally, traffic engineering and application-centric quality of service: optimizing traffic flows and ensuring quality of service helps them tailor the network to each application's unique needs. These principles led, in 2015, to Jupiter, the first petabit network, with 1.3 petabits a second of aggregate bandwidth, by leveraging merchant switch silicon, Clos topologies and SDN. In 2022 they enabled 6 petabits a second with deep integration of optical circuit switching, wave division multiplexing and the highly scalable Orion SDN controller. In 2023 they reached 13 petabits a second by enhancing Jupiter to support native 400 gigabits per second link speeds in the network core. The fundamental building block of the Jupiter network now consists of 512 ports of 400 Gbps connectivity, both to end hosts and to the rest of the data center, for an aggregate of 204.8 terabits a second of bidirectional non-blocking bandwidth per block. And that's just where they are today. They're charting the future with the next generation of network infrastructure. For example, they're busy working on the networking infrastructure needs of the A3 Ultra VMs, featuring the new NVIDIA ConnectX-7 networking and supporting non-blocking 3.2 terabits per second of GPU-to-GPU traffic over RDMA over Converged Ethernet. They will deliver significant advances in network scale and bandwidth, both per port and network-wide, in the next few years.
[00:58:29] Speaker B: I was about to get all, like, poo-poo on this article, because I'm like, well, they're just announcing that they've upgraded the hardware and they've done it a lot of times over 25 years. But then, reading through the article, they actually publish all of their white papers on how they did this, and so you can actually. Because, like, the blog post doesn't give you enough details, because you couldn't do that in a blog post.
But publishing all the white papers really gets you into the nitty-gritty, because that's always, for me, where the rubber meets the road on these things. Like, yeah, you can upgrade, you know, hardware, but how do you do that in a way that doesn't affect performance? How do you do that in a scaled-out way? Those details really matter, and it can be a little tricky. So yeah, I haven't read any of the white papers, but I'm hoping that they sort of go through not only the network design, which I'm sure is the vast majority of it, because network people are going to network. But yeah, look forward to that.
[00:59:24] Speaker C: Yeah, all these papers were published in SIGCOMM and presented at SIGCOMM, so many of these are scientifically backed as well. So yeah, if you want more details on this, the white papers are either good sleep reading or really interesting to you, one of the two, depending on how into networking you are.
[00:59:42] Speaker B: How much coffee I've had, exactly.
[00:59:45] Speaker C: All right, let's wrap up this marathon evening tonight with two Azure stories. First up, for those of you using Azure DevOps OAuth apps: beginning February 3, 2025, Microsoft will no longer accept new registrations for Azure DevOps OAuth apps, as their first step in sunsetting the Azure DevOps OAuth platform. So run and provision those as quickly as possible so you have them, if you're working in the middle of a project, before they go away and you have to redo all your work. Not.
[01:00:11] Speaker B: Going forward, stop provisioning these. It's just another identity provider in your ecosystem.
[01:00:17] Speaker C: Don't do it going forward. They're advocating for you to build apps on top of the Azure DevOps REST API, explore the Microsoft Identity platform and register a new Entra application instead.
[01:00:28] Speaker B: Yes, please.
[01:00:29] Speaker C: And all existing OAuth apps will work until the official end of life in 2026.
[01:00:33] Speaker B: Yeah, no, it's just, you see this with GitHub, where you can also do the same sort of flow, and it's just like, yes, I realize that otherwise you have to talk to your IT team, and you can do this very simply by registering an OAuth app in your repository. But it ends up being just total chaos once this is rolled out. And then you've got this weird sort of two-step. Well, not two-step, but like sort of two separate identity command centers, and it's just like, ugh, stop. Things are already hard. Anyway, rant over. Sorry.
[01:01:13] Speaker C: All right. Satya Nadella is welcoming Jay Parikh to Microsoft as a member of the Senior Leadership Team, or SLT, reporting directly to Satya. Jay was the global head of engineering at Facebook, which is now Meta, and most recently was the CEO of Lacework, which just had a successful exit as well. His focus will extend beyond technology, with his passion and dedication to developing people fostering a strong culture and building world-class talent. And Jay will be immersed in learning about Microsoft's priorities and culture, spending time with senior leaders and meeting customers, partners and employees around the world. And they will share more on what his role and focus will be in a few months. And all I can think of is, Azure has been beaten up pretty bad on security. Charlie Bell's been there about two years and hasn't seemed to move the needle. And I don't know, if I was a betting man, I'd say the former CEO of a security startup is probably going to maybe be in charge of something security-wise.
[01:02:10] Speaker B: Yeah.
[01:02:10] Speaker D: What would give you that idea?
[01:02:12] Speaker C: Justin? I know how corporations work.
[01:02:16] Speaker D: They're gonna say I'm bringing it to you, leaves.
[01:02:18] Speaker B: Yeah, I'm sure Charlie Bell's golden parachute is going to serve him probably just fine.
[01:02:25] Speaker C: Yeah, yeah, yeah.
[01:02:26] Speaker B: He's still counting all his AWS money. Yeah, but I mean, I don't know. No one likes.
[01:02:33] Speaker C: I mean, we said that the role Charlie Bell went to was a thankless job anyways. There was no way he was going to be successful. And you know, despite his success at AWS, I just didn't think, you know, it's a culture problem, which is why they're bringing in a guy who's really good at building culture. Because you have to change the culture to get security to be first and foremost at a company like Microsoft, who has not cared about it at the level they needed to, as shown by all the recent security issues.
[01:02:59] Speaker B: Yeah.
[01:03:02] Speaker C: All right, gentlemen, it's now officially time to go watch the election results. I guess I can tell you, no shock to either one of you, that the election is still not called, because it's only 7:00 Pacific Time and they are still counting votes across the board.
[01:03:17] Speaker B: It's not going to be called by the time we record the next episode.
[01:03:20] Speaker C: Yeah, that's probably likely, but I think it will be. We will find out. All right, gentlemen, it was great talking to you and keeping me distracted from all these election results. And we'll talk to you next week.
[01:03:32] Speaker B: Bye everybody.
[01:03:33] Speaker D: Have a good one.
[01:03:36] Speaker A: And that's all for this week in Cloud. We'd like to thank our sponsor, Archera. Be sure to click the link in our show notes to learn more about their services. While you're at it, head over to our website, where you can subscribe to our newsletter, join our Slack community, send us your feedback, and ask any questions you might have. Thanks for listening and we'll catch you on the next episode.
