[00:00:06] Speaker B: Welcome to The Cloud Pod, where the forecast is always cloudy. We talk weekly about all things AWS, GCP, and Azure.
[00:00:14] Speaker C: We are your hosts, Justin, Jonathan, Ryan and Matthew.
[00:00:18] Speaker A: Episode 285 recorded for the week of December 17, 2024. Six years of cloud news and we're still talking about FPGAs and PowerPC. Hello, Jonathan, Ryan and Matt, how's it going?
[00:00:30] Speaker B: It's great. Happy sixth podcast birthday.
[00:00:33] Speaker A: 6. I can't believe it. It feels like it's been less than that. Maybe Pandemic was in there. So that's a time distortion field.
[00:00:39] Speaker B: And yeah, it could be six or it could be 20. It feels very strange.
[00:00:44] Speaker A: Or 20.
[00:00:45] Speaker C: It's fair.
[00:00:46] Speaker D: I was gonna say two or 20.
[00:00:48] Speaker A: You know, for both Ryan and Matt, their podcast birthday is not six. They're only like four and two at this point, I think. Yeah, or maybe four and one. How long has Matt been with us now? It's been a solid two years, right? Or is it a year?
[00:01:03] Speaker C: Could also be 20.
[00:01:04] Speaker A: Could also be 20.
[00:01:05] Speaker D: Yeah, could also be 20.
[00:01:06] Speaker B: Feels like 20.
[00:01:08] Speaker A: I feel like we started fighting with your daughter's bedtime for recording right from her birth. So how old is your daughter now? We can probably use that as a litmus test.
[00:01:17] Speaker D: Yeah, yeah, she's almost three, so yeah, probably a solid two years.
[00:01:22] Speaker A: Yeah, probably a solid two years. Perfect. Well, I still enjoy talking about the cloud with you guys, so hopefully you guys are still enjoying it and all those good things.
Well, we are rapidly approaching the new year, and we'll wrap up the last of our re:Invent talk today and then cover what we missed during re:Invent from the other cloud providers. And then on next week's episode we'll do our favorite announcements of this year, then predictions for next year and how we did on our predictions from last year. And I haven't looked yet, but I have no confidence that I had any good predictions last year.
So yeah, we're wrapping up the year here in a hurry. It's been a little sketchy on recording here because we've had a lot of travel these last two months, but we'll get back on track hopefully in the new year, refreshed and ready to talk about FPGAs.
[00:02:10] Speaker B: Oh no, modern technology, but it's a...
[00:02:13] Speaker A: We've got to get to re:Invent first. HashiCorp had a couple announcements at re:Invent this year that I thought we should mention. HCP Vault Secrets auto-rotation is generally available. This allows you to have dynamic secrets that are kept in sync with Vault Secrets, so you can leverage cloud-native services in addition to Vault and keep it all synchronized. You might want to do this if you're multi-cloud and you'd like to use cloud-native services in your multi-cloud solutions, where perhaps you want serverless to talk to your database and that serverless code needs secrets that are managed through Vault. You don't have to add a third-party layer for that; you can just read them directly from AWS Secrets Manager, which is built into the SDK. So handy. Glad to see this one finally get GA'd.
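To make the pattern concrete, here's a minimal sketch of the consumer side, assuming HCP Vault Secrets is already syncing a secret into AWS Secrets Manager; the secret name and JSON key are hypothetical. The application reads natively with boto3 and never talks to Vault:

```python
# Minimal sketch: a serverless function reading a Vault-synced secret
# natively from AWS Secrets Manager. The secret name and "password"
# key are hypothetical; the sync is configured on the HCP Vault side.
import json

import boto3


def get_db_password(secret_name: str = "vault-sync/my-app/db-creds") -> str:
    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_name)
    secret_string = response["SecretString"]
    try:
        # Synced secrets are often stored as JSON key/value pairs.
        return json.loads(secret_string)["password"]
    except (json.JSONDecodeError, KeyError):
        # Fall back to treating the whole value as the secret.
        return secret_string


if __name__ == "__main__":
    print("fetched a secret of length", len(get_db_password()))
```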
[00:02:54] Speaker C: Yeah, this is fantastic.
So you don't have to sort of replicate secrets around between different systems, and then with Vault's existing capabilities of doing short-lived AWS sessions and using it as an application authentication engine, this is pretty sweet.
[00:03:12] Speaker A: Yeah, this is, this is definitely one of those things that's needed, I think on all three cloud providers because you either have to figure out how to synchronize secrets across all three of them or you need a middleman like Vault who can kind of do it for you.
[00:03:23] Speaker D: This qualifies under the category of things that I feel like we talked about so long ago, I just assumed it was already GA. I'm surprised that it wasn't.
[00:03:33] Speaker A: That's a bad assumption with HashiCorp products. Like, they don't go GA for, you know, seven years.
[00:03:38] Speaker D: It was that way, I think, for Terraform or something like that.
[00:03:40] Speaker A: Exactly, exactly.
[00:03:42] Speaker D: But I guess it is a 1.0.
[00:03:44] Speaker A: Yes, it is a 1.0 release. Speaking of Terraform, they also want to let you know that the AWS provider is now at 3 billion downloads, and they're hoping to replace it, of course, with the AWS Cloud Control provider, which is now generally available with its 1.0 release. For those of you who remember, this is the provider built around the AWS Cloud Control API, so they can bring new services to HashiCorp Terraform even faster than before. I assume at some point, as this continues to gain stability, they'll start deprecating the Terraform AWS provider or stop development. I'm not sure if they're going to keep both of them in active development; it just doesn't make sense to me. But you never know with HashiCorp, or now IBM HashiCorp, or soon-to-be IBM HashiCorp.
[00:04:25] Speaker D: I feel like I did a lot of Terraform 0.11 to 0.12 migrations as a consultant in a past life. I foresee a lot of, you know, consulting projects in the future to move people from the AWS provider to the Cloud Control provider too.
[00:04:44] Speaker A: Yeah.
They also announced that in June, AWS and HashiCorp partnered to develop a comprehensive set of Terraform policies for compliance with standards like CIS, HIPAA, and the AWS Well-Architected Framework. This is now in beta for you: prewritten Sentinel policy sets for AWS via the Terraform Registry. These support services including EC2, KMS, CloudTrail, S3, IAM, VPCs, RDS, and EFS, which is a nice improvement and will get you started in a big way on security. Then, also, Terraform Stacks are now in public beta to simplify your provisioning and management at scale. When deploying and managing infrastructure at scale, teams usually need to provision the same infrastructure multiple times with different input values, across multiple cloud provider accounts, regions, and environments. Before Stacks there was no built-in way to provision and manage the lifecycle of these instances as a single unit. Stacks resolves all of those problems for you.
[00:05:37] Speaker C: Sure there was. You just had to define like hundreds of providers. Yeah, exactly.
[00:05:44] Speaker A: Some pretty nice quality of life improvements for AWS with Terraform.
[00:05:47] Speaker C: Yeah, I mean, I'm a big fan of doing policy evaluation at Terraform invoke time, just to get that feedback directly to whoever's executing that Terraform, rather than have it be a security ticket later or just blocked by permissions. I feel like it's very good feedback. And having prebuilt policies makes life easy, because developing those policies isn't exactly fun.
[00:06:11] Speaker D: Yeah, it's much better, like you said. I always liked tfsec, which I think got deprecated a few years ago, but it was a really good tool to have built in. And if you're already in Terraform Enterprise, you can leverage Sentinel pretty easily. Having that there, like Ryan said, so AWS Config doesn't go off, or somebody yells at you that you pushed this to production before the rule got evaluated and somebody looked at GuardDuty and notified you, hopefully helps shorten that lifecycle of vulnerabilities in production.
[00:06:46] Speaker C: I'm glad to see that Stacks is also replicated in Terraform, because I used StackSets in CloudFormation for the first time recently and it was amazing.
So that's pretty exciting as well, just because it is really annoying when you have to deploy, say, a role in every one of your sub-accounts so you can access them across accounts. It's pretty awesome when you can just do that with one configuration and watch it go.
[00:07:16] Speaker D: That reason right there is why I kind of recommended people use CloudFormation StackSets to do roles, launch configs, anything else that you need to do, kind of before Control Tower really took off, things like that. And then Terraform for your infrastructure, a little bit more for platform than, you know, actual deployment teams or application teams. But Terraform Stacks really does replace that CloudFormation StackSets feature pretty nicely.
[00:07:43] Speaker C: I mean, if you're already in the Terraform ecosystem, you know. It's interesting, because I've seen many places where the central cloud team or the security teams will use CloudFormation StackSets for those kinds of resources, and then application developers and operational staff will use Terraform directly. So I can see it being used each way. Also, some of the organizational things get weirdly supported at the provider level. So we'll see.
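For a sense of what Stacks replaces, here's a minimal sketch of the scripted workaround many teams used before: looping the same Terraform root module over account/region pairs with one workspace each. The account IDs and variable names are hypothetical.

```python
# Minimal sketch of the pre-Stacks pattern: drive the same Terraform
# root module across account/region pairs with one workspace each.
# Account IDs and variable names are hypothetical.
import subprocess

deployments = [
    {"account": "111111111111", "region": "us-east-1"},
    {"account": "111111111111", "region": "eu-west-1"},
    {"account": "222222222222", "region": "us-east-1"},
]

for d in deployments:
    workspace = f"{d['account']}-{d['region']}"
    # Requires Terraform 1.4+ for the -or-create flag.
    subprocess.run(
        ["terraform", "workspace", "select", "-or-create", workspace],
        check=True,
    )
    subprocess.run(
        [
            "terraform", "apply", "-auto-approve",
            f"-var=account_id={d['account']}",
            f"-var=region={d['region']}",
        ],
        check=True,
    )
```

Stacks folds that orchestration into Terraform itself, so each account/region pair becomes a deployment of a single configuration instead of a shell loop.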
[00:08:13] Speaker A: And our last Hashi item this week: they released Terraform 1.10 with several new features. The most interesting of them is this one on handling secrets. If I recall, one of the big complaints from OpenTofu was that people didn't like the fact that secrets were actually stored in the state file. Basically, ephemeral values solve this by enabling secure handling of secrets that would previously be persisted in the plan or the state file. Since the secrets are stored in plain text within these artifacts, any mismanagement of or access to the files would compromise the secret itself. To address this, ephemeral values are not stored in the artifacts, neither the plan file nor the state file, and they are not expected to remain consistent from plan to apply, or from one plan-and-apply round to the next. Terraform 1.10 supports marking input and output variables as ephemeral, alongside ephemeral blocks, which declare that something needs to be created or fetched separately for each Terraform phase, used to configure some other ephemeral object, and then explicitly closed at the end of that phase.
So for those of you who are used to dealing with secrets in your Terraform files, this will probably mess you up, but it is a good improvement in general, addressing what has in my opinion been a pretty significant problem with the state file for a long time.
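To see why this matters, here's a minimal sketch of the exposure ephemeral values close off: Terraform state is plain JSON, so anyone with the file can pull secrets out of resource attributes. The "looks sensitive" name check below is purely illustrative.

```python
# Minimal sketch: Terraform state (format v4) is plain JSON, so any
# secret stored in resource attributes is readable by whoever has the
# file. The attribute-name filter here is purely illustrative.
import json

with open("terraform.tfstate") as f:
    state = json.load(f)

for resource in state.get("resources", []):
    for instance in resource.get("instances", []):
        for key, value in instance.get("attributes", {}).items():
            if any(word in key for word in ("password", "secret", "token")):
                print(f"{resource['type']}.{resource['name']}.{key} = {value}")
```

Values marked ephemeral never land in that file in the first place.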
[00:09:20] Speaker C: Oh yeah, this has been.
I've had to battle this with security teams who are looking at, you know, approving Terraform Enterprise. I've had people pull secrets out of the state file and use them inappropriately. This is a great feature to see, so I'm pretty psyched about it.
[00:09:38] Speaker A: All right. So while we were watching all the re:Invent news, and I think we even mentioned this a little bit last week, Intel fired their CEO, Pat Gelsinger. Which sort of makes sense, because on stage at AWS they made the claim that 50% of new CPU capacity on AWS was Graviton. So, not so great for Intel. CEO Pat Gelsinger was forced out after four years, handing control to two of his lieutenants while they search for a new CEO to hopefully lead the company away from doom.
Reports are that he left after a board meeting where the directors felt his plan to turn Intel around was too costly and ambitious, efforts so far weren't working, and the progress of change wasn't fast enough.
Because, in my opinion, replacing the top guy is a surefire way to make it go faster.
Gelsinger inherited a company in 2021 riddled with challenges, which he only made worse in many aspects. He made claims about AI chip deals that exceeded Intel's own estimates, leading the company to scrap a revenue forecast just a month ago. The full results of his turnaround won't be known until next year, when the planned flagship laptop chip is supposed to be released, made in Intel's own factories as well. Intel also started construction on a $20 billion suite of new fabs in Ohio and hired a larger workforce back during the pandemic, which eventually led to layoffs and potential sales or spinoffs of those assets. Gelsinger's plan included becoming a major player in contract manufacturing for others, a business model they called the foundry model. Intel has announced foundry customers including Microsoft and Amazon, but neither would bring Intel the volume of chips needed to reach profitability.
I mean, sure, okay. In addition, they were looking to TSMC to build some of their chips, but TSMC said, well, if you're going to compete with us, we're not going to give you great pricing. And so that didn't really work out for them either. So Intel is still a pile of garbage now, many years after those terrible CPU vulnerabilities that killed capacity and performance and set them back quite a while. And I still don't know if they've even made it below 10nm on a lot of their stuff, because they were stuck there for a long time on that architecture.
[00:11:32] Speaker B: I mean, you could write a whole book, we could do a whole episode, on the screw-ups that Intel's made over the years. I think they were in such a dominant position that they became complacent and too risk-averse. Which is kind of funny, to hear that the board was complaining that Gelsinger's plan was too risky, basically, is what they were saying. So they were too risk-averse. They still are risk-averse.
And yeah, the things they messed up: they never took AMD seriously soon enough as a competitor, they turned down Apple for building a mobile chip for the original iPhone, and yeah, they got stuck going from 14 to 10nm for a long time. That really slowed them down. I don't think anybody could actually have turned Intel around in four years.
I don't necessarily think Pat did the best job, but it's going to take them a long time to dig themselves out of the place they found themselves in. You know, they missed out on the mobile market completely.
They're not in good shape.
[00:12:36] Speaker A: Yeah, I mean, the problem is this is an R&D problem, and you have to have things in the pipeline that people are going to buy. And the problem is, because of where Intel is at, they need something that's going to have big volume, but nothing's going to have big volume until it's proven in the market. So it's kind of a chicken-and-egg problem. And it's interesting, I was reading a blog post by Bryan Cantrill, one of the founders and the CTO at Oxide Computer, which is one of the new hardware manufacturers that's come out in the last few years, really targeting data centers and hyperscalers as their customer base. And he was talking about partnering with Intel on a network chip, and how they were nervous the entire time that Intel was going to kill it, and sure enough, Intel ended up killing it right as the chip was about to GA. And Oxide is maybe not the biggest company in the world, although they're probably building a lot of servers right now for data centers.
It's still a market where, if you get proof that it works and delivers value, and you can show proof points, those become opportunities to sell to other companies in the networking space.
But unless they can figure out a path to a billion dollars on any initiative, it seems like they don't even want to try. Which is kind of a big problem when you have a deteriorating stable Intel business.
[00:13:53] Speaker C: Yeah, it doesn't make any sense to me.
[00:13:55] Speaker B: I think they just lost their way. For years they just worked on these tiny incremental changes, because they didn't have to do more, because they felt like they had no competitors.
[00:14:05] Speaker A: Well, even their 64-bit play, Itanium, was a disaster. I mean, the reason the instruction set is the AMD64 instruction set is because AMD beat them to it with something better than Itanium at a fraction of the price.
And in their dominance at that time, they thought that they were going to win just based on the fact they were Intel. And I think that culture is probably still a big part of what drives Intel. I wouldn't be shocked to see Intel get bought by somebody, but they have a bit of a challenge, because they took a bunch of money from the government to build that foundry service. So it's not a simple situation where you can just sell off assets, because if selling those assets would potentially deteriorate the investment from the government, the government is going to prohibit that from happening. So whoever's coming in as the new CEO has a tough row to hoe.
[00:14:53] Speaker B: Yeah, I think the only thing they have going for them is the fact that they're building this foundry. It's cost them a lot of money, for sure, but the way politics is going over the next few years, being able to manufacture in the US is going to be super important. And TSMC is in Taiwan, not China, but whether or not the new administration makes that distinction in terms of tariffs is yet to be seen, I guess. So, you know, if all goes well and they get the foundry working, then perhaps companies will be forced to manufacture with Intel rather than TSMC just because of cost. So maybe that's the light at the end of the tunnel for them.
[00:15:28] Speaker C: I would like to see more local development of those things just because the, you know.
[00:15:32] Speaker B: Sure, yeah.
[00:15:32] Speaker C: The supply chain shortages really highlighted a major concern. And so I like that, you know, if that's what it takes to keep the company limping along, I guess that's good. It'll buy them some time. But they still have to change the culture and find someone who can be way more strategic in the market.
[00:15:52] Speaker A: Yeah, I mean, I think the fact that Microsoft and Amazon were willing to be customers of theirs, that's a good sign. If you get successful there, maybe you're building the Graviton 5 on behalf of Amazon, or you're building whatever Microsoft's Arm solution is going to be, or you're building something completely new. The foundry definitely seems like it has potential; it's just behind schedule and costing a lot of money to build. But once it gets up and running, if it can be competitive with TSMC, it is a potential game changer. But you've got to be underselling competitors, and if you're trying to get TSMC to build your stuff in the interim, you're not going to get good pricing from TSMC. So you're kind of screwed both ways.
Well, while AWS was announcing Nova, OpenAI announced ChatGPT Pro. ChatGPT Pro is a $200-a-month subscription that enables scaled access to the best of OpenAI's models and tools. This plan includes unlimited access to their smartest model, OpenAI o1, as well as o1-mini, GPT-4o, and Advanced Voice mode. And I would care if I hadn't canceled my ChatGPT subscription a couple weeks ago. But how much of a power user are you that you're running out of usage on the $20-a-month plan and you need this?
[00:17:12] Speaker B: I canceled my ChatGPT subscription a long time ago and switched to Claude, and that's $20 a month, and I regularly run out of credits. I would imagine it's comparably priced in terms of the number of tokens in and out every day.
I know some people are shocked by the cost, like, oh my God, it's $200. But really think about the productivity increase that at least I've seen in using AI over the past few months.
I'd pay it in a heartbeat if Anthropic had an equivalent plan. $200 a month, unlimited access to Claude, even slightly slowed down. I don't necessarily need instantaneous responses, but the value you're getting for $200 is immense.
[00:17:58] Speaker A: I mean if Claude was doing it for the API and for the website, sure. But I think this ChatGPT Pro thing is just for the web interface to ChatGPT. It's not necessarily for the API, is it?
[00:18:10] Speaker B: I don't know. I'm sure they draw the distinction like anthropic do.
[00:18:14] Speaker A: I think in the case of OpenAI, using the APIs is a different costing model; it's always been a different costing model based on usage. So again, I haven't run out of usage on Claude web; I have run out of usage on the Claude API. But again, they're billed differently. So I'm impressed that you run out of usage on Claude web.
[00:18:35] Speaker B: I use their projects a lot, so I've got a lot of artifacts in there that consume a lot of tokens all the time, and I kind of wish they had better ways of managing those. You know, for this conversation I don't really care about these two things, but I'd like to leave them in the project for everything else. That's among my feedback to them. But no, I think $200 is just the tip of the iceberg for what people are going to be prepared to pay for things like this. I mean, I wouldn't be surprised if this time next year, or even in six months, there's going to be a thousand-dollar plan which will have built-in agents that go and do things for you.
I like.
[00:19:08] Speaker A: I can see paying 200 bucks for a certain type of agentic AI for sure. I think that something that's doing something for me like an AI assistant.
[00:19:16] Speaker D: Yeah.
[00:19:16] Speaker A: Or something that has very agentic, you know, things it's going to do for me for 200 bucks a month. Sure, I think that'd be cool. But yeah, it'll be interesting to see if you're right, if it gets more expensive. Maybe that's your prediction for 2025.
[00:19:26] Speaker D: I was going to say, my prediction now is that they end up with a spot market for these types of things, where you can bid on how much you're going to spend and get slotted in, so you get a slower response.
[00:19:38] Speaker A: Spot market for Nova. That'd be fascinating.
[00:19:41] Speaker C: Oh for. Oh wow.
[00:19:43] Speaker D: I don't think it's actually going to happen. I don't think it's there yet but in a couple years I think you'll.
[00:19:47] Speaker A: End up with something spot market for inference.
[00:19:49] Speaker C: I can see it.
[00:19:50] Speaker B: Oh definitely.
[00:19:50] Speaker A: I could see it for inference. I don't know, I don't know training but I see inference.
[00:19:54] Speaker C: Yeah, well training they kind of already have that like you can do scheduled workloads.
[00:19:59] Speaker A: That's true, if you can do it. But yeah, it's more like batch: if you can do it in batches, you can do a batch at this price, and then if the price goes up above that, you can drop the next batch until later. So yeah.
[00:20:13] Speaker B: There are a lot of cloud cost management tools out there, but only Archera provides cloud commitment insurance. It sounds fancy, but it's really simple. Archera gives you the cost savings of a one- or three-year AWS savings plan with a commitment as short as 30 days. If you don't use all the cloud resources you've committed to, they will literally put the money back in your bank account to cover the difference. Other cost management tools may say they offer commitment insurance, but remember to ask: will you actually give me my money back? Archera will. Click the link in the show notes to check them out on the AWS Marketplace.
[00:20:52] Speaker A: All right, let's move to aws.
So I'm rarely surprised by announcements from Amazon, but this one caught me by surprise. Number one, that they didn't cover this at re:Invent at all, and number two, that this even exists. They're announcing a second FPGA-powered instance, the F2 instance, with up to eight AMD field-programmable gate arrays, or FPGAs, AMD EPYC Milan processors with up to 192 cores, high-bandwidth memory, up to 8 terabytes of SSD-based instance storage, and up to 2 terabytes of memory. The new F2 instances are available in two sizes and are ready to accelerate your genomics, multimedia processing, big data, satellite communications, networking, silicon simulation, and live video workloads. And they actually give some cool examples of how you might use an FPGA. What was interesting is probably AstraZeneca, who used thousands of F1 instances to build the world's fastest genomic pipeline, able to process over 400,000 whole-genome samples in under two months, and who will adopt Illumina DRAGEN on F2 to realize better performance at an even lower cost. And then they had some other things around using FPGAs for SQL processing for Spark, and satellite operators using programmable radios with the development kit. So some interesting use cases. When this first came out, it sounded really cool; I really liked the idea. You basically program the FPGA to do the one specialized thing you really care about as fast as humanly possible and cement it into the chip, and then, when you're done with it, you just change the makeup of the processor to do something different. So it sounded cool, and I always wanted to come up with a use case for it. Never did. I was literally talking with Ryan and Jonathan, I think at our company two companies ago, when this was announced, a year or two pre-Cloud Pod, when the F1s were coming out, and I think we had talked about using it in the mortgage market at the time, and we couldn't work out a use case then either. But fascinating.
[00:22:48] Speaker C: I didn't understand what it did then, I don't understand what it does now.
[00:22:57] Speaker B: You just build a massive logic array. You program it so that.
How do we get.
[00:23:03] Speaker D: Yeah, see, I'm like, I'm like. I think you're gonna try to explain it to me, but I still don't think I understand your answer.
[00:23:09] Speaker B: Yeah, I think I already explained it, like years ago on the podcast, probably days ago.
[00:23:13] Speaker A: You did?
[00:23:14] Speaker B: Yeah, I mean, it's like building your own custom set of logic chips, logic gates with inputs and outputs. A little more complicated than that.
[00:23:25] Speaker D: Okay, more like five. Keep going down. You're at like 25. I need like five.
[00:23:32] Speaker B: All right.
[00:23:35] Speaker D: If I understand you, it's like a little kid's race car set where you can have forks in the road and you drive your car in certain ways, and you build that on top of the CPU so it always goes in that certain direction. Maybe.
[00:23:49] Speaker B: It's more like, imagine a matrix of squares, like a 10 by 10. Did you say 5? Okay, a 10 by 10 multiplication table, right? And you've got 10 rows and 10 columns, and the rows and columns can be inputs or outputs. And in each box you can put a logic gate, so you can have it make a connection to something, or not make a connection to something, or connect to something else. It could be an AND gate or a NAND gate or a NOT gate or whatever else. But essentially, in each box of the array you can put a logic gate.
[00:24:21] Speaker D: And then we're not doing a segment of Jonathan explaining, like, it's fine.
[00:24:26] Speaker C: Okay.
[00:24:26] Speaker A: So I went to Claude and I asked Claude to explain FPGA to me. Like, I'm a five year old, okay. And so I have this one. If you want me to try this one, John.
[00:24:35] Speaker B: Yeah, yeah, go ahead.
[00:24:36] Speaker A: Okay. So it says, imagine you have a huge box of LEGO bricks. And with these LEGO bricks, you can build anything you want: a house, a car, a spaceship. And when you're done playing with one thing, you can take it apart and build something completely different using the same bricks. You with us so far?
[00:24:50] Speaker D: Got it.
[00:24:51] Speaker A: And that cga.
[00:24:55] Speaker C: It'S a silicon CPU chip, right?
[00:24:57] Speaker A: We're getting there. We're getting there.
[00:25:00] Speaker D: We're legos, Ryan. Legos.
[00:25:02] Speaker C: Legos. Legos.
[00:25:02] Speaker A: I'll explain LEGOs to you first. So then, an FPGA is like that box of LEGO bricks, but for electronics. It's a special computer chip that has lots of tiny electronic building blocks inside of it, and engineers can tell these blocks how to connect together to make different electronic circuits, just like how you connect LEGO bricks to make different toys. For example, today you might tell the FPGA to be a calculator, but tomorrow you erase that and make it control lights, and the next day you erase it again and make it control music. And the really cool part is that unlike regular computer chips that can only do one specific job, an FPGA can be reprogrammed, like taking apart and rebuilding your LEGO creation, to do different jobs whenever you would.
[00:25:37] Speaker B: Like to, and then blazingly fast.
[00:25:41] Speaker C: Okay, now this makes sense to me. A reprogrammable CPU. So you're not stuck with the.
[00:25:47] Speaker A: Okay, yeah, with one instruction set.
[00:25:50] Speaker D: So, can I go with, maybe I'm going up, not down, but like firmware? It's essentially like a ROM in the firmware that you can reprogram, so it's very efficient at this one thing over and over and over again. And then you say, okay, today I want you to drive me to the store. And now you are a calculator.
[00:26:09] Speaker B: You know, software-defined networking is to networks as FPGA is to silicon. So you can.
[00:26:18] Speaker D: That works.
[00:26:19] Speaker B: Right?
[00:26:19] Speaker D: Got it.
[00:26:20] Speaker A: Okay.
[00:26:20] Speaker B: All right.
[00:26:21] Speaker A: Are we all... do we all know a thing? Can we get a conference? I think we know an FPGA.
[00:26:26] Speaker D: We know a thing.
[00:26:27] Speaker A: We know a thing now. Standard there.
[00:26:29] Speaker C: I want to. I want to steal the name from a. An old podcast.
[00:26:32] Speaker B: Exactly the.
[00:26:33] Speaker A: Yes. Yes.
[00:26:34] Speaker D: No. Yes. Yes.
[00:26:35] Speaker A: No.
[00:26:35] Speaker D: Yes.
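For anyone who wants the LEGO analogy in runnable form, here's a minimal Python sketch of a lookup table, the reconfigurable cell an FPGA is built from; the half-adder wiring is a hypothetical example of "rebuilding the bricks," not anything from the show.

```python
# Minimal sketch of the FPGA building block: a 2-input lookup table
# (LUT). "Programming" the chip amounts to filling in truth tables
# and wiring them together; refilling them yields different hardware.

class LUT2:
    """Four stored bits define any possible 2-input logic gate."""

    def __init__(self, truth_table):
        # Output bits for inputs (0,0), (0,1), (1,0), (1,1).
        self.table = truth_table

    def __call__(self, a: int, b: int) -> int:
        return self.table[(a << 1) | b]


# Configure identical cells as different gates, like re-sorting bricks.
AND = LUT2([0, 0, 0, 1])
XOR = LUT2([0, 1, 1, 0])


def half_adder(a: int, b: int) -> tuple[int, int]:
    """Wire two LUTs together: sum from XOR, carry from AND."""
    return XOR(a, b), AND(a, b)


for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        print(f"{a}+{b} -> sum={s} carry={c}")
```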
[00:26:39] Speaker A: All right. Introducing storage-optimized Amazon EC2 I8g instances, powered by Graviton4 and third-generation AWS Nitro SSDs. These new storage-optimized instance types provide the highest real-time storage performance among storage-optimized EC2 instances, with the third generation of AWS Nitro SSDs and AWS Graviton processors.
The I8g are the first instances to use a third-generation Nitro SSD. These instances offer up to 22.5 terabytes of local NVMe SSD storage, with up to 65% better real-time storage performance per terabyte and 60% lower latency variability compared to the previous-generation I4g. You get these new shiny instances with up to 96 vCPUs, 768 gigs of memory, and 22.5 terabytes of storage. The usual network caps and EBS caps are there for the smaller instance types, going up to pretty large throughputs as needed. Amazon suggests you consider these servers for your I/O-intensive workloads that require low-latency access to data, such as transactional databases, real-time databases, which I don't know are truly real time, NoSQL, and real-time analytics such as Spark.
[00:27:40] Speaker D: Should have named those podcasts where they explain, like, M5 things. And now we'll do a section on real-time databases. Reverse transactional databases.
[00:27:48] Speaker C: I think a real-time database would be like a, what do you call it, a streaming database. Like a time series. Time series in memory.
[00:27:57] Speaker A: A time series. Okay, that makes sense. All right. Why don't you say time series database? Come on Amazon.
[00:28:02] Speaker C: Because I'm probably wrong. It means something completely different.
[00:28:06] Speaker A: Wouldn't be the first time we're wrong on the show. We try the best we can.
[00:28:11] Speaker D: I always liked the I-series; I've used them a few times. The free storage there, when you don't care about the data and it's truly ephemeral, or you've built it so you have three NoSQL replicas, one in each AZ, gives you that free storage layer and doesn't really cost you that much extra. It's really nice, and the performance of it was blazingly fast. I think I did it with the i3, so I can't imagine what the i8s are like. And that's when I realized I'm old.
[00:28:41] Speaker A: I've used the i4 a few times. The i4 was quite fast as well, from my experience using it in the past.
[00:28:46] Speaker B: Had some reliability issues with SSDs on those instances though.
[00:28:51] Speaker A: Interesting.
[00:28:53] Speaker B: But then we were writing, you know, 25 terabytes of logs a day to them, so.
[00:28:58] Speaker A: Well, wasn't that an i3? Or was that an i2 back in those days?
[00:29:04] Speaker D: I still remember when I had to open a ticket because I ran the same performance test three times and got three different results. And it turned out it was an EBS block issue under the hood. So, you know, you always have these edge cases when you stress test the hell out of these things.
[00:29:25] Speaker A: Well, Jeff Barr is announcing that after 20 years, 3,283 blog posts, and 1,577,105 words, he is wrapping up as lead blogger on the AWS News Blog. Jeff is apparently stepping back to become a builder again, and says he went from a developer who could market to a marketer who used to be able to develop. And while there's nothing wrong with that path, he wants to go back to building. He will still appear on the AWS On Air Twitch show and will speak at community events around the globe, but will be primarily building something inside of AWS. Maybe we'll see what that is in the future. There is a robust AWS News Blog team that will replace him, and they'll keep cranking out the announcements that we talk about here every week. I look forward to seeing what Jeff does next, and whether there's a new lead blogger, or if the lead blogger becomes Nova over time.
[00:30:10] Speaker C: Yeah, I wonder. I mean, I doubt that's why he's stepping down, but maybe it's part of a larger decision. But I'm dying to know what he's going to work on, and I love to see people go back to directly contributing at a technical level from either a management or a non-technical role. So that's exciting for him.
[00:30:28] Speaker D: I mean, when I've talked to him in person, he is extremely technical and can go into detail. When we were having lunch with him, he was telling us how he kind of ended up being a bit of a QA tester, because to write these blogs before the product gets released, and have it be the announcement blog, he would end up with QA versions of stuff before they got GA'd and released. So he ended up still being extremely technical in this position. I can definitely see him going back to being an amazing builder of things.
[00:31:01] Speaker B: Yeah, definitely miss him.
[00:31:03] Speaker A: Yep. I always knew it was an important blog post when Jeff was the one writing it. So now I'll have to figure out who the new Jeff is. Oh, it's so-and-so? Okay, this is a good one.
[00:31:13] Speaker D: 164 blogs a year he did on average. That's impressive.
[00:31:17] Speaker A: That's impressive. Yeah.
All right, let's move to GCP. Cassandra, a key-value NoSQL database, is prized for its speed and scalability, and used broadly for applications that require rapid data retrieval and storage, such as caching, session management, and real-time analytics. The simple key-value pair structure gives you high performance and easy management, especially for large datasets. But that simplicity means poor support for complex queries, potential data redundancy, and difficulty in modeling intricate relationships. To help solve this, Google is making it easier than ever to switch from Cassandra to Spanner with the introduction of the Cassandra-to-Spanner proxy adapter, an open source tool for plug-and-play migrations of Cassandra workloads to Spanner without any changes to the application logic. And if you're wondering if this proxy adapter will scale for your needs, don't worry: it's been battle-tested by none other than Ryan's alma mater, Yahoo. Hooray.
I have a quote here from Patrick J.D. Noonan, Principal Product Manager of Core Mail and Analytics at Yahoo: The Cassandra adapter has provided a foundation for migrating the Yahoo Contacts workload from Cassandra to Spanner without changing any of our CQL queries. Our migration strategy has more flexibility, and we can focus on other engineering activities while utilizing the scale, redundancy, and support of Spanner without updating the code base. Spanner is cost-effective for our specific needs, delivering the performance required for businesses of our scale. This transition enabled us to maintain operational continuity while optimizing cost and performance.
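Based on how Google describes the adapter, the application keeps its CQL driver and only the contact point changes. Here's a minimal sketch assuming the proxy listens locally on the standard CQL port; the keyspace and table names are hypothetical.

```python
# Minimal sketch: point an existing Cassandra driver at the
# Cassandra-to-Spanner proxy adapter instead of a Cassandra node.
# Host, port, keyspace, and table names are hypothetical.
from cassandra.cluster import Cluster

# The proxy speaks the CQL wire protocol; Spanner serves the data.
cluster = Cluster(contact_points=["127.0.0.1"], port=9042)
session = cluster.connect("contacts")

rows = session.execute(
    "SELECT id, email FROM contacts_by_user WHERE user_id = %s",
    ("u123",),
)
for row in rows:
    print(row.id, row.email)

cluster.shutdown()
```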
[00:32:34] Speaker C: That's pretty neat. I didn't work with the Contacts team much when I was there, but I was on the platform engineering team that created and owned the internal services that provided this functionality. And one of the things that was just starting as I was leaving was the migration to Cassandra, away from our internal tool.
So it's exciting, you know, like that's how long ago it was.
From a Google perspective, that's a fantastic business model, right? If you can get people using your service by making it really easy to adopt, then as they slowly transition the application, they can probably get better functionality and more features by calling it natively. And it's a lot easier to consume than a giant migration-and-rewrite type of thing. So that's pretty sweet.
[00:33:27] Speaker A: Yeah, I like the idea. Could they create a Mongo proxy to do Mongo to Spanner? Could they do other NoSQL solutions? I'm kind of curious if this is something we might see in the future for other NoSQL databases, because Cassandra is giving you global scale. And once the data is in, sorry, not Cassandra, once it's in Spanner, you can query it in other ways, like with BigQuery and other things, which might enable other use cases. So I could see this being really powerful in the right contexts.
[00:33:56] Speaker C: I could also see this being very Cassandra-specific because of the technology under the hood of Spanner.
[00:34:01] Speaker A: Yeah, very key value pair based.
[00:34:05] Speaker C: Yes, that's true.
[00:34:06] Speaker B: But we'll see.
[00:34:07] Speaker C: I mean, it'd be neat to see other things, yeah. Because I can tell you, when evaluating technology for new applications, it's absolutely one of the things you ask: could I adopt this, and how easy or hard would it be?
[00:34:20] Speaker B: So I'm thinking of the, what was it called? Babelfish, the SQL Server to Postgres connector, which was touted and then kind of vanished.
[00:34:32] Speaker D: No, it's still there, it's still used.
[00:34:35] Speaker A: It's just not very popular.
[00:34:37] Speaker C: Yeah, well, think about who their customers are.
[00:34:39] Speaker A: Right?
[00:34:39] Speaker C: It's the. Yeah, it's.
[00:34:41] Speaker D: Yeah, I don't want to talk about it.
[00:34:44] Speaker C: Yeah, fair enough.
[00:34:48] Speaker A: I was wondering. I haven't been to the open source page in a while. Oh yeah, Babelfish support for Postgres 16. I mean, they're still definitely coding on it.
[00:34:57] Speaker B: Just kidding. But I mean, it hasn't made a dent in Microsoft's earnings, so I'm assuming it's still got issues.
[00:35:05] Speaker A: I mean, based on some of the case studies I've seen, there are definitely some challenges with the data storage format in certain contexts. So if math precision is really important in your app, I understand it isn't great for that.
[00:35:19] Speaker D: Yeah, it's a little bit more T-SQL specific, so it depends how you wrote your SQL too. From POCing this, many years ago now, it definitely required a lot of finesse to make it work, and the performance wasn't always there. So you couldn't just drop it in and be like, hey, I'm moving to Aurora. There was a lot of testing to really make it work. At that point, the person I was working with said, no, we're just going to rewrite it in Postgres natively and move on in life, because it was going to be harder than it was worth. It wasn't a case of, hey, let's put this in, and then as we deprecate the legacy, we'll start writing in Postgres native. It was a lot more work than it ended up being worth.
[00:36:03] Speaker C: Yeah, need the easy buttons to be easy.
[00:36:06] Speaker A: Speaking of not easy, Google is adding support for 30 additional services to custom org policies, originally limited to just GKE, Dataproc, Compute Engine, and Cloud Storage. They are now making my life more miserable by adding very common services like BigQuery, Server Manager, KMS, Load Balancing, Next-Gen Firewall, Cloud Run, Cloud SQL, Cloud VPN, Dataflow, Firestore, IAM, Identity Platform, Redis, PSC, Secret Manager, and VPCs. This allows you to enforce conditional restrictions, such as requiring specific roles for resources in your project. You can also now scope custom org policies for domain-restricted sharing principals, including all users of an org, specific partner identities, or service accounts and service agents.
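As a rough illustration of what these custom constraints look like, here's a sketch of the YAML spec you'd feed to `gcloud org-policies set-custom-constraint`; the organization ID, constraint name, and condition are hypothetical, and the exact resource field names vary per service, so check the per-service custom-constraint reference.

```yaml
# Hypothetical custom constraint: deny BigQuery datasets that grant
# access to allUsers. Field names in the condition are illustrative;
# the real resource schema is defined per service.
name: organizations/123456789012/customConstraints/custom.denyPublicBqDatasets
resourceTypes:
- bigquery.googleapis.com/Dataset
methodTypes:
- CREATE
- UPDATE
condition: resource.access.exists(a, a.iamMember == "allUsers")
actionType: DENY
displayName: Deny public BigQuery datasets
description: Datasets may not be shared with allUsers.
```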
[00:36:44] Speaker C: To be fair, the org policies aren't making your life miserable.
[00:36:48] Speaker A: I have to hear you talk about them. So yes, making my life miserable.
[00:36:53] Speaker C: Yeah, no, VPC security controls are where most of the pain points are. And these conditional policies I really like, because then you can actually enforce rules that you care about. There are the common ones that are part of the general offering, and those are good for things that are really obvious. Like, I want to grant everyone primitive roles so I don't have to manage very fine-grained policies, but I also don't want them to create API keys that are going to get proliferated everywhere. And so now with this policy you can say, even with all the permissions, you can't export this BigQuery dataset to somewhere public, or whatever the conditionals they allow are. So that's pretty cool. I like that.
Despite them making your life miserable. In fact, I think I like them more.
[00:37:43] Speaker A: It's mostly that you talk about it for so long, and I'm just like, okay, I know this is important, but he's just talking about it so much.
[00:37:54] Speaker C: Oh no, there have been a ton of things and changes. You just haven't noticed them, which is by design. So it's.
[00:38:00] Speaker A: Yeah, it's perfect.
[00:38:00] Speaker C: It's success.
[00:38:02] Speaker A: All right. Well, Google says: sit down, Nova. Announced a week after re:Invent, the Gemini 2.0 model is available and ready for the agentic era, apparently. And I just canceled my subscription two weeks ago, so I guess I won't know about this one. A year ago, Gemini 1.0 was launched with the intent to focus on information as the key to human progress, and the first Gemini model was built to be natively multimodal. Gemini 1.0 and 1.5 drove big advances in multimodality and long context to understand information across text, video, images, audio, and code, and to process a lot of it. Gemini 2.0 is Google's most capable multimodal model yet, with new advances in multimodality like native image and audio output and native tool use, and it'll enable them to build new AI agents that bring them closer to their vision of a universal assistant. The Gemini 2.0 Flash experimental model will be available to all Gemini users; it was available earlier today when I checked. And they are launching a new feature called Deep Research, which uses advanced reasoning and long-context abilities to act as a research assistant, exploring complex topics and compiling reports on your behalf, available to you if you have Gemini Advanced. 2.0 Flash replaces 1.5 Flash, outperforms it, and even outperforms 1.5 Pro on key benchmarks, several of which are in the article. There are also updates to Project Astra, which they announced at I/O: from feedback, they have made improvements to the Gemini 2.0 version of Astra, with better dialogue, new tool use including Google Search, Lens, and Maps, and better memory, allowing up to 10 minutes of in-session memory and improved latency. They also have Project Mariner, which is a new agent that helps you accomplish complex tasks, starting with your web browser. This research prototype is able to understand and reason across information on your browser screen, including pixels and web elements like text, code, images, and forms. And then Jules is a new AI agent to assist developers with code. It integrates directly into your GitHub workflow, and it can tackle an issue, develop a plan, and execute it, all under a developer's direction and supervision. So agentic AI is coming to you in Gemini.
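If you want to poke at the experimental model yourself, here's a minimal sketch using the google-generativeai Python SDK; the model id matches what Google published at launch, but treat the name and the API key handling as assumptions to verify.

```python
# Minimal sketch: call the experimental Gemini 2.0 Flash model via
# the google-generativeai SDK (pip install google-generativeai).
# The model id and API key handling are assumptions to verify.
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

model = genai.GenerativeModel("gemini-2.0-flash-exp")
response = model.generate_content(
    "Summarize this week's cloud news in three bullet points."
)
print(response.text)
```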
[00:39:51] Speaker C: I wonder how much of that's branding because they were really pushing the Agentic model in Gemini 1.0.
So I'm sure, you know, they're addressing some of the shortcomings that prevented that. But I imagine models are going to continue to evolve and get more functional over time. So this one's kind of weird to me.
[00:40:13] Speaker A: I think it's just the way to announce a new model, and then they give you some purpose-built agents, versus having to build agents from scratch like you said you had to do before.
[00:40:21] Speaker B: Yeah, I think it's new enough that they need to provide examples like this. Otherwise people are like, well, what do I do now? Great, you say it does this, but.
[00:40:27] Speaker A: Well, "the agentic AI, I don't understand." Yeah, be honest right there.
[00:40:30] Speaker C: Well, and even, you know, the 2.0 Flash replaces 1.5 Flash and outperforms it. I guess there are examples posted in the article of that, but it's kind of strange.
[00:40:42] Speaker B: Yeah, what if it's based on Gemma? Because there's now a Gemma 2 as well, which is the free one you.
[00:40:48] Speaker A: Can get, as a derivative of Gemini.
[00:40:53] Speaker C: I didn't think they were related at all.
[00:40:55] Speaker A: Maybe they're not.
[00:40:56] Speaker B: I don't know, maybe they're not very similar names though.
[00:41:00] Speaker A: Yeah.
Well, if you wanted to build your own Gemma: Trillium, the sixth-generation TPU, is now generally available. Trillium TPUs were used to train the new Gemini 2.0 model, Google's most capable AI model yet, and since they're done training it, clearly they're available for sale. Some of the key improvements of Trillium: a 4x improvement in training performance, a 3x increase in inference throughput, a 67% increase in energy efficiency, and a 4.7x increase in peak compute performance per chip. They doubled the high-bandwidth memory and doubled the interchip interconnect bandwidth. And 100,000 Trillium chips in a single Jupiter network fabric are available to you, with up to a 2.5x improvement in training performance per dollar and up to a 1.4x improvement in inference performance per dollar. So these Trillium chips are pretty powerful.
[00:41:43] Speaker B: Relative to the old ones.
[00:41:46] Speaker A: Relative to the old ones, the fifth-gen TPUs. They didn't compare to other cloud providers.
[00:41:50] Speaker B: Yeah, that's a slight red flag for me. Maybe an orange flag. They're not comparing it with things that people actually know.
[00:41:57] Speaker A: Oh, the people who used the fifth-gen know it.
[00:42:01] Speaker B: Yeah, that guy's not here today.
[00:42:05] Speaker A: Yeah, I don't know that guy. No idea. Well, anyways, there if you want to. If you're on Google and you're using Trillium, check on the sixth generation. You might have a good experience.
Google Next returns to beautiful Las Vegas at Mandalay Bay, April 9th through 11th, 2025. In fact, you can even register for it now using the last bits of your 2024 budget. Early-bird pricing is $1,000 for a limited time; on February 14th, or when tickets sell out, whichever comes first, prices will go up. Experience AI in action, they say, and forge powerful connections, or potentially meet the Cloud Pod hosts; most of us will be there. Build and learn live, all available to you on the Google Cloud Next agenda, which we'll keep track of as they announce more details heading into April.
[00:42:49] Speaker C: I'm terrified of what they mean by experience AI in action. Absolutely terrified.
[00:42:57] Speaker A: You're really selling going there now. Good job.
[00:42:59] Speaker B: Yeah.
[00:42:59] Speaker D: I mean, you were so excited before.
[00:43:02] Speaker A: Yeah.
[00:43:04] Speaker D: It started off with Vegas. So like, you know, you're already at level one here.
[00:43:08] Speaker C: Yeah.
[00:43:10] Speaker B: Yeah.
[00:43:10] Speaker C: I mean, at least Google Next is more approachable, unlike the huge behemoth of re:Invent. But, you know, I do remember that the last one was the first sort of AI-overload experience I had. So we'll see.
Sorry, I'm turning into a curmudgeon. So I'll have to.
[00:43:31] Speaker A: You don't have to put pants on. You don't have to leave your house. So I get it.
[00:43:34] Speaker C: Exactly. Go to an airport, go talk to people. Like.
[00:43:41] Speaker D: Yeah, there's a reason we do a podcast. We just Talk to each other. People have to listen to us.
[00:43:46] Speaker C: Yeah.
[00:43:46] Speaker B: It's not "old man shouts at the sky."
[00:43:48] Speaker A: They don't have to listen to us. They can unsubscribe at any time. Which they're doing right now, I'm sure.
[00:43:52] Speaker B: Yeah.
[00:43:54] Speaker A: In droves.
[00:43:55] Speaker B: Yeah. You are the "old man shouts at AI" guy.
[00:43:58] Speaker C: Yeah. Yes, I am.
[00:43:59] Speaker A: Yeah. And quantum and FPGAs, apparently, too.
All right. Google Cloud is opening their 41st cloud region, in Querétaro, Mexico. This is their third cloud region in Latin America, after Santiago, Chile, and São Paulo, Brazil. And I think they have the most; I don't think Azure or AWS has more than two in Latin America.
[00:44:22] Speaker D: I think Oracle does.
[00:44:24] Speaker A: And Oracle has them in garages.
[00:44:29] Speaker D: It's amazing how many regions all these cloud providers have. It used to be like, oh my God, they're opening a region. Now it's like, great, they're opening another region.
[00:44:40] Speaker A: And then like, wait, I can't wait for the data sovereignty laws that force me to put my app there. That's kind of the feeling with these regions.
All right, well, Google wants to remind you that they continue to offer IBM Power Systems on Google Cloud. Originally launched in 2020, they then partnered in 2022 with Converged Technologies to upgrade the service by enhancing network connectivity and bringing full support to the IBM i operating system. And today they're announcing, with Converged, Enterprise Cloud with IBM Power for Google Cloud, or simply IP4G, supporting all three major environments on Power: AIX, IBM i, and Linux, as well as now being available in more regions: two in Canada, two in EMEA, and two in North America, showing you truly what the demand for this is. There's a quote here from Scott Vash, Vice President of WMS Development with Warehouse Manager Services at Infor: Infor was one of the original IP4G subscribers, and years later we continue to run our mission-critical IBM Power workloads on IP4G for our clients. IP4G's availability and performance have more than met our requirements, and we are extremely satisfied with the overall IP4G experience.
[00:45:44] Speaker D: This just feels like you're not cloud native.
[00:45:48] Speaker A: I mean you are, you're on the cloud with Power.
[00:45:51] Speaker B: Well, I mean, before Graviton came along, Power was the original data-center-ready RISC chip. It's really, really efficient, it's really fast, lots of single-cycle instructions, unlike Intel. It's very, very predictable, really good for data handling. It's an awesome platform. It's a shame it was overtaken; it really could have been more popular.
[00:46:17] Speaker A: Yeah, look, I almost bought a bunch of Power boxes back in the mid-2000s, pre-cloud, for a bunch of Oracle workloads, because you could get something pretty powerful. We ended up not going that direction, but it was close, so I'm somewhat familiar with it. I think back in those days it was POWER2 or POWER3, and I don't know what the benchmarks are compared to ARM processors today or other workload types.
[00:46:41] Speaker B: Yeah, I don't think it scales quite as well as these tiny ARM cores.
[00:46:46] Speaker A: I don't think so either.
[00:46:47] Speaker C: So would this kind of workload be, you know, similar to, like, running Oracle in Oracle Cloud? Or like a hosted VMware?
I've never really heard of these systems so I don't know.
[00:47:03] Speaker A: It's a different processor architecture. I mean it's mainframe.
[00:47:09] Speaker D: Solaris.
[00:47:10] Speaker A: Yeah, Solaris, you know, all these different flavors. It's a flavor of processor; it has a different instruction set than x86, and it works really well for certain workloads that are high throughput, like databases, typically Oracle, the IBM i operating system, et cetera. I actually didn't know there was a Linux port for Power, so that was kind of interesting to me. I thought it was all IBM at this point.
I wouldn't start a new workload on it, but if I had a workload on my data center that I needed to move to the cloud to shut it down, I'd look at it for that use case. So it's nice to have.
All right, next up is: achieve peak SAP S/4HANA performance with the new Compute Engine X4 instances on Google Cloud.
The X4 is purpose-built to handle the demanding workloads of SAP HANA OLTP and OLAP, and Google is the only cloud provider that has standard memory configurations up to 32 terabytes actually certified for SAP HANA. You can get 960, 1,440, or 1,920 vCPUs respectively, pairing with your 16, 24, or 32 terabyte box. And all I've learned about SAP HANA is that I don't think I ever really want to manage it; it's going to cost me an arm and a leg. But I guess if you're an SAP customer, you are probably spending a lot of money already on your ERP, so this is probably a rounding error in those conversations. There is a quote here from Sean Lund, US Chief Technology Officer at Deloitte: In the past few years, our SAP HANA systems have seen significant data growth and an increasing need for higher and higher performance. With the 24-terabyte X4 machines and Hyperdisk storage, we have been able to raise the ceiling of our future data growth and are also looking to see improvements in our performance. Adding to this, Google X4 machines are cloud native, giving us opportunities to automate system management and operations. See, Deloitte says a big iron box is cloud native, so it must be true.
[00:49:00] Speaker C: Wonder if this is going to become the new mainframe that eventually gets chipped away.
This seems very similar to the IBM offering as well.
[00:49:10] Speaker A: I mean, I have to imagine so. I mean, other than all the default reports you get out of SAP HANA and the stuff they're doing, I think a lot of the core SAP databases have all moved into HANA now. A lot of the initial workloads for HANA were reporting, and I think all of that had been replaced by Spark, but once they started moving everything into it, I think that'll be quite a bit.
[00:49:32] Speaker C: And then customers, you know, they built the prebuilt integrations with their systems, and I think there are appealing sort of quality-of-life things surrounding it as well. Yeah, I don't know.
[00:49:46] Speaker A: All right, Google is introducing Google Agentspace, which is a terrible name.
Agentspace unlocks enterprise expertise for employees with agents that bring together Google's advanced reasoning, Google-quality search, and enterprise data, regardless of where it's hosted. It'll make your employees highly productive by helping them accomplish complex tasks that require planning, research, content generation, and actions, all with a single prompt. Agentspace unlocks enterprise expertise by doing the following: it gives you a new way to interact and engage with your enterprise data using NotebookLM Plus, where your employees can upload information to synthesize it, uncover insights, and enjoy new ways of engaging with data, such as the podcast-style audio we talked about here on the show previously; information discovery across the enterprise, including searching unstructured data such as emails and documents, if you give Google access to all that data; and expert agents that automate your business functions like expense reports or other multi-step processes, all through Google Agentspace.
[00:50:36] Speaker C: All I can hear is agents in space.
[00:50:42] Speaker B: It's going to sit on your shoulder like Clippy and be like, "I don't think you wanted to do that."
[00:50:46] Speaker C: I was trying to figure out what the user interface is, because expense reports and sort of trivial, boring day-to-day things, like, I am all for offloading that to AI.
[00:50:59] Speaker A: Anything that can make Concur expense reports easier for me, I'm going to give it a shot at least once.
[00:51:07] Speaker C: Well, I don't even want easier. I just don't want to do it.
What, what data do you need?
Get me out of this loop.
[00:51:17] Speaker A: I need you to go find the receipt that I sent to one of my many email addresses to attach to this expense.
If you can do that for me, that'd be great.
[00:51:25] Speaker C: I mean, I hope it's better than the sort of workflows that are in there now, where it's apply this rule set, fork out on the logic, then apply that rule set, and by the time you get to the end of it, it's "this is denied by policy." What policy? Not going to tell you. Why is it denied by policy? Not going to tell you that either.
[00:51:45] Speaker D: Why would you want to know that information? Don't worry about it. It's just your money.
[00:51:49] Speaker C: Yeah, exactly.
[00:51:51] Speaker A: Azure had one story for us this week. They're also telling Nova to get lost with the debut of Phi-4, a new generative AI model in research preview. They say it's improved in several areas over its predecessor, per Microsoft, particularly in math problem solving. Phi-4 is available in limited access via the Azure AI Foundry development platform, and only for research purposes today. This is Microsoft's smallest model, coming in at 14 billion parameters, and it competes with other small models such as GPT-4o mini, Gemini 2.0 Flash, and Claude 3.5 Haiku. Microsoft attributes the performance improvements to high-quality synthetic datasets, alongside high-quality datasets of human-generated content, and some unspecified post-training improvements. Which sounds like "we just fudged the numbers," but sure, okay. Unspecified post-training improvements it is.
[00:52:35] Speaker B: I believe you can download it for free as well from Hugging Face.
[00:52:38] Speaker A: Available for free.
[00:52:39] Speaker C: It's interesting because this announcement is for research purposes only. I wonder if that's, like, to get benchmarks or something.
[00:52:48] Speaker D: They don't trust it yet. It's like a private public preview for you.
[00:52:52] Speaker A: Yeah, they tell you it's for research purposes only, and then if it goes and becomes very toxic, you can just say, well, it was only in research, right?
[00:53:02] Speaker B: Yeah, there are some amazingly small models that are very, very good now. I think a lot more effort's going into them.
I think what Microsoft does is really focus on the data quality going into the training set, and the order in which the model is trained is apparently showing up as very crucial to getting good outcomes. So I think Microsoft's putting some work into data quality and data ordering in training. I'll check out Phi-4, because for the size of model, they're really impressive.
[00:53:32] Speaker A: Yeah. I have several on my laptop using the LM Studio you turned me on to, and I like to play with them occasionally. They run a little slow; I'm on a Mac that's forcing me to buy a new Mac at some point. But it is kind of nice to have the power of an LLM on my laptop that I can use when I'm on a flight and can't catch the Internet, which is pretty handy.
[00:53:54] Speaker B: Yep.
[00:53:55] Speaker A: Very, very cool.
[00:53:56] Speaker B: Yeah.
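For reference, since Phi-4 is downloadable from Hugging Face as mentioned above, here's a minimal sketch of running it locally with the transformers library. The microsoft/phi-4 model id and chat-style call are assumptions based on the Hugging Face release, and a 14-billion-parameter model realistically wants a large GPU or a quantized build to run comfortably.

```python
# Minimal sketch: running Phi-4 locally via Hugging Face transformers.
# The "microsoft/phi-4" model id is an assumption based on the HF release;
# expect to need a large GPU (or a quantized build) for a 14B model.
# device_map="auto" requires the `accelerate` package.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="microsoft/phi-4",
    torch_dtype=torch.bfloat16,  # halves memory use vs. float32
    device_map="auto",           # spread layers across available devices
)

messages = [
    {"role": "system", "content": "You are a concise math tutor."},
    {"role": "user", "content": "What is the sum of the first 50 odd numbers?"},
]

out = pipe(messages, max_new_tokens=128)
# The pipeline returns the chat with the assistant's reply appended last.
print(out[0]["generated_text"][-1]["content"])
```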
[00:53:57] Speaker A: All right, and our last announcement for this week: Oracle Database at AWS is available in limited preview.
So if you're waiting for it, you might still be waiting for it, because in a limited preview you have to be approved. "But up until now it has been impossible to replicate the performance and functionality of Oracle Database on Exadata in AWS," says Dave McCarthy, research vice president with IDC. "With Oracle Database at AWS, customers can finally enjoy the same experience with an easy migration path to the cloud for their on-premises mission-critical workloads. This allows them to reap the benefits of simplified daily management and operations and prioritize modernization initiatives." You know, the more I think about Oracle Database at AWS, and at GCP and Azure, the more I think about how they're getting all these companies locked into these things so they can just jack up the rates later. I look forward to people complaining about it five years from now. But yeah, it's available to you in limited preview if you're an Oracle customer looking to get your Oracle database on AWS, managed by Oracle and AWS.
[00:54:52] Speaker B: I really wonder why the performance didn't match. Is it literally a software thing? Is it like, if it runs on AWS, it runs 50% slower? You know, like, why?
[00:55:02] Speaker A: I feel like, I think this is Exadata. I think they're actually installing Exadata hardware in the data center, and this hardware is highly tuned for this purpose.
[00:55:12] Speaker B: That's.
[00:55:12] Speaker C: That's back in my data center days.
[00:55:14] Speaker D: It's running PowerPC.
[00:55:16] Speaker C: Yeah. We had to deploy a very specific physical infrastructure in order to make Oracle RAC work, right? And everything was different from our normal deployment pattern, where we used commodity hardware and had a very automated sort of deployment. Then we had these very complex Oracle RAC sections of the data center: very specific cabling, very specific colors on that cabling, very specific connections, all set up in this one way. And that was only one setup. If some property wanted to have another one, we had to build a whole new one. It was very expensive and very annoying.
[00:55:58] Speaker D: You're saying the colors make them go faster?
[00:56:00] Speaker B: Stripes on the side of the server. That's what it sounds like.
[00:56:03] Speaker D: I was just picturing them denying your support request, saying, hey, you have an incorrect physical setup because your Ethernet cable is green versus red.
[00:56:13] Speaker C: I mean, you're making jokes, but that was in the contract.
[00:56:19] Speaker A: My few experiences with RAC were never really great. It's not a scalability play as much as they want you to think it is, because the amount of throughput you need between each node to maintain the RAC is a lot.
[00:56:33] Speaker C: It's huge. And I imagine it's no different with this specialized Exadata deployment.
[00:56:38] Speaker A: I mean, Exadata has, I think, all this heavy-duty, InfiniBand-style connectivity between each node, so you get dedicated cards for that purpose. But yeah, it's expensive; Exadata is not cheap. If you want to get a mind-bogglingly large bill, just go to the Oracle Cloud calculator, choose an Exadata instance, and price it out. I mean, you start at almost $300,000 a year. Jeez, it's crazy. All right, well, that is it for another fantastic week here in the cloud, guys. One more week before the end of the year, so we'll see you next time here on the show.
[00:57:14] Speaker B: Yep. Thanks, everybody. Take care.
[00:57:16] Speaker C: Bye, everyone.
[00:57:17] Speaker D: Bye, everyone.
[00:57:21] Speaker B: And that's all for this week in Cloud. We'd like to thank our sponsor, Archera. Be sure to click the link in our show notes to learn more about their services.
While you're at it, head over to our website at thecloudpod.net, where you can subscribe to our newsletter, join our Slack community, send us your feedback, and ask any questions you might have. Thanks for listening, and we'll catch you on the next episode.