[00:00:07] Speaker A: Welcome to the Cloud pod, where the forecast is always cloudy. We talk weekly about all things aws, GCP and Azure.
[00:00:14] Speaker B: We are your hosts, Justin, Jonathan, Ryan and Matthew.
[00:00:18] Speaker C: Episode 281, recorded for the week of October 29, 2024. Happy birthday, ECS. You're still so much better than Kubernetes at 10. Good evening, Ryan. How's it going?
[00:00:29] Speaker B: It's going pretty well. Yeah.
[00:00:32] Speaker C: This is Halloween week. Lots of things going on, and so it's just going to be me and you this evening.
[00:00:37] Speaker B: Well, it's very fitting considering it's ECS's 10th birthday and you and I traversed that whole life cycle together.
[00:00:47] Speaker C: Yeah.
[00:00:47] Speaker B: Awesome.
[00:00:48] Speaker C: Yeah. I remember arguing with you very profusely that we should not deploy Kubernetes and we should just go with ECS.
And then you were mad because I was right.
[00:00:57] Speaker B: Yes. Well, I was mad at first, and now, 10 years later, I'm like, oh, thank God.
[00:01:05] Speaker C: I mean, now we're still doing Kubernetes at this current job. But yes, if I was on AWS, I'd still probably be making the argument that unless you have something super complicated, ECS is the way to go.
All right, well, let's get to that story here after we get through a couple others we need to hit up first. First of all, Dr. Matt Wood has started his new job as the Chief Innovation Officer at PwC. And that's all I have to say about that. I'm kind of bored with it.
[00:01:31] Speaker B: Wow.
[00:01:32] Speaker C: Not what I expected.
[00:01:33] Speaker B: Not what I expected either. And is that like, retirement? Is that what that is? Like, I can't imagine what a Chief Innovation Officer would do at PricewaterhouseCoopers.
[00:01:44] Speaker C: Right, exactly.
Very interesting. For sure.
All right, so moving on to actual news we care about. CrowdStrike is hitting back at Delta, claiming Delta skipped required security updates. So basically, Delta, who's been suing them, said in a court filing that CrowdStrike should be on the hook for the entire $500 million loss that Delta took, partly because CrowdStrike has admitted that it should have done more testing and staggered deployments to catch bugs. Delta further alleges that CrowdStrike postured as a certified best-in-class security provider who never cuts corners, while secretly designing its software to bypass Microsoft security certifications and make changes at the core of Delta's computer systems without Delta's knowledge. Delta says it would never have agreed to such a dangerous process if it had been disclosed. CrowdStrike, in its testimony to Congress, said that it followed standard protocols and that it protects against threats as they evolve. CrowdStrike is also accusing Delta of failing to follow laws, including best practices established by the TSA. According to CrowdStrike, most customers were up within a day of the issue, while Delta took five days. CrowdStrike alleges this was caused by Delta's negligence in following the TSA requirements designed to ensure that no major airline ever experiences prolonged system outages. CrowdStrike realized Delta failed to follow the requirements when its efforts to help remediate the issue revealed alleged technological shortcomings and failures to follow security best practices, including outdated IT systems, issues in Delta's Active Directory environment, and thousands of compromised passwords. Delta previously had threatened to sue Microsoft as well as CrowdStrike, but has only named CrowdStrike in the current lawsuit. We'll continue to keep an eye on what happens with this one, as if Delta wins, it's going to set a bad precedent for CrowdStrike.
[00:03:24] Speaker B: Well, there's not going to be any winning on this. It'll settle for some amount far less than 500 million, in my opinion. You know, after going through the CrowdStrike outage and working with our CrowdStrike team in my day job, and getting the full technical details of what was upgraded and why that upgrade path existed, it's a lot more reasonable when you understand it. Right? It's a tool that needs to evolve very quickly to emerging threats. And while the change that was pushed through shouldn't have gone through that particular workflow — and that's a mistake — I do think that workflow should exist. And yes, could they have done better with documentation and all that? Of course. But Delta being such an outlier for how long it took makes me wonder what their IT posture is. And I bet you it's not good.
[00:04:31] Speaker C: Yep.
[00:04:32] Speaker B: And we'll never know because they're going to settle out of court.
[00:04:35] Speaker C: Yeah, that's the most bummer part of it, but yeah, we'll see. Maybe we'll find out.
You know, we'll learn something if Delta decides to drag CrowdStrike through the mud during discovery — or whether it even gets to trial at all would be kind of the question. But they're definitely not done fighting, so we'll keep an eye on it.
All right. Google, Amazon, and Azure were all named Leaders in the latest Magic Quadrant for cloud platforms. AWS is still the top dog, which means they are at the top of the quadrant and farthest to the right. However, I would say that both Google and Azure have moved further to the right than AWS has, which means they have a better completeness of vision per Gartner's definitions — and I bet that's very heavily impacted by AI.
[00:05:25] Speaker B: Yeah, totally.
[00:05:29] Speaker C: We won't go through all of this, because again, you should go to the link we placed and read it for yourself if you're interested. But AWS strengths: operational excellence, solution support, and a robust developer experience. Their cautions: complex and inconsistent service interfaces, limited traction for proprietary AI models, and fewer sovereign cloud options. I actually think that third one, the fewer sovereign cloud options, is fair. I don't think they have as good of a story as Azure or GCP in terms of isolation and preventing non-American citizens from accessing certain environments, unless you do things like use GovCloud, which is a much more complicated implementation — and the same goes for all of your data sovereignty needs for Europe. I think you have limited options in AWS other than creating an account and shutting down regions.
[00:06:17] Speaker B: Well, they just announced the EU sovereign region as well. Right. Which is.
[00:06:22] Speaker C: Yeah, but again, it's creating another account that's set up for that purpose. It's not something you would apply to an existing account to protect certain things. Whereas in the Google world, you would set up a project and say this is an EU project, and that dictates certain rules that have to be followed and how that works.
[00:06:41] Speaker B: I don't know if that's how the implementation goes. You can do certain things with Assured Workloads and restricting access there, but the projects themselves are not regionally separated.
[00:06:50] Speaker C: But Assured Workloads gets applied at the project level. So that's the point.
[00:06:54] Speaker B: Right. But if you have regional things, you can deploy to multiple regions within a project.
[00:07:02] Speaker C: Sure.
[00:07:02] Speaker B: So it gets a little weird, right? Like if you have an EU sort.
[00:07:05] Speaker C: Of restriction. It's still a shared security model — you still have requirements you have to meet. So yeah, you're not off the hook completely by checking the Assured Workloads box, for sure.
[00:07:13] Speaker B: Yeah.
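For anyone curious what that project-scoped control actually looks like, here is a heavily hedged sketch using the google-cloud-assured-workloads Python client. The org ID, location, billing account, and even the exact compliance-regime enum value are assumptions for illustration, not a verified recipe:

```python
# Heavily hedged sketch: creating an EU-scoped Assured Workloads folder so
# that projects created under it inherit the compliance regime. The org ID,
# location, billing account, and enum value below are assumptions; check the
# current API reference before relying on any of them.
from google.cloud import assuredworkloads_v1

client = assuredworkloads_v1.AssuredWorkloadsServiceClient()

workload = assuredworkloads_v1.Workload(
    display_name="eu-sovereign-workloads",
    compliance_regime=assuredworkloads_v1.Workload.ComplianceRegime.EU_REGIONS_AND_SUPPORT,
    billing_account="billingAccounts/012345-ABCDEF-678901",  # placeholder
)

request = assuredworkloads_v1.CreateWorkloadRequest(
    parent="organizations/123456789/locations/europe-west4",  # placeholders
    workload=workload,
)

operation = client.create_workload(request=request)
print(operation.result().name)  # long-running op; blocks until created
```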
[00:07:15] Speaker C: Google strengths: AI-infused IT modernization, environmental sustainability, and digital sovereignty. Their cautions: incomplete understanding of traditional enterprise needs — hallelujah — uneven resilience, and distributed cloud inconsistencies. Can't argue with any of those. Azure strengths: cross-Microsoft capabilities, industry clouds, and the strategic partnership with OpenAI. Their cautions: ongoing security challenges, capacity shortages, and inconsistent service and support — all three of which I cannot argue with either. Although I would have probably thrown capacity shortages at Google too.
[00:07:57] Speaker B: Right.
[00:07:58] Speaker C: So overall, if you listen to Gartner and you care what Gartner has to say, check out the latest Magic Quadrant from Gartner.
All right. Cloudflare had their 19th edition of the Cloudflare DDoS Threat Report. They do this every quarter.
They said the number of DDoS attacks spiked in the third quarter of 2024, with Cloudflare mitigating nearly 6 million DDoS attacks, representing a 49% increase quarter over quarter and a 55% increase year over year in DDoS traffic. Out of those 6 million, Cloudflare's autonomous DDoS defense systems detected and mitigated over 200 hyper-volumetric DDoS attacks, exceeding rates of 3 terabits per second and 2 billion packets per second. The largest attack peaked at 4.2 terabits per second and lasted a minute. The banking and financial services industry was subject to the most DDoS attacks, China was the country most targeted, and Indonesia was the largest source of the attacks. Which is actually really interesting to me — that Indonesia is the top source. Like, how many computers are in Indonesia?
[00:08:55] Speaker B: Right.
[00:08:55] Speaker C: So you can run a 4.2 terabit attack.
[00:08:58] Speaker B: Well, I mean, computers, sure. But what's the pipeline, right?
[00:09:01] Speaker C: Yeah. We know those cables — we talk about them all the time when they build new ones. There's not that much pipe coming from Indonesia.
[00:09:07] Speaker B: Yeah. Yeah. Crazy. Yeah.
[00:09:10] Speaker C: I wonder if that's where the control plane for the attacks is, and the attacks themselves are typically globally distributed — because that would make more sense to me. They don't clarify that in the report. But you know, DDoS is not an if thing, it's a when problem for every company.
[00:09:27] Speaker B: Yeah. It's kind of funny that denial of service is still hitting the banking and financial services sector the most. I would expect ransomware and stuff to be more prevalent, but I guess the article doesn't actually.
[00:09:41] Speaker C: I mean, this is about DDoS. I think phishing and ransomware are probably the two largest attack vectors overall. But from a DDoS perspective, banking and financial services make sense: getting an error that gives you access to the operating system of a banking system could potentially get you access to things with money.
[00:09:58] Speaker B: Yeah, that's true.
[00:10:00] Speaker C: All right. Moving on to "AI is Going Great." GitHub Copilot is moving beyond the OpenAI models to support Claude 3.5 and Gemini directly inside of GitHub Copilot. In a sign of the continuing ruptures between OpenAI and Microsoft, Copilot will switch from being exclusively OpenAI GPT models to a multi-model approach over the coming weeks.
Anthropic's Claude 3.5 Sonnet is available basically now via the chat web and VS Code interfaces, while Google's Gemini 1.5 Pro will come shortly, in a few weeks. Apparently they'll also be supporting OpenAI's o1-preview and o1-mini, which are intended for stronger advanced reasoning capabilities and score higher on code.
Which makes sense. The new approach makes sense for users, as certain models are better at certain languages or types of tasks. And GitHub CEO Thomas Dohmke says there's no one model to rule every scenario — it is clear that the next phase of AI code generation will not only be defined by multi-model functionality, but by multi-model choice.
[00:10:59] Speaker B: Yeah, it's very interesting that GitHub is doing that given Microsoft's heavy involvement in OpenAI, but I also wonder if this is one of those things where the subsidiary is given a little more leniency — especially since it's not really divorcing OpenAI or ChatGPT in general.
[00:11:20] Speaker C: Yeah, well, I mean, I think we're seeing Microsoft kind of backing away from the OpenAI partnership in more ways than they used to. They have their own competitive models now, and they're partnering with other providers of foundation models. I think they're very clearly seeing — as in Google's case and Amazon's case, where each offers multiple open models as well as their own with Gemini and Titan — that to be competitive long term, you can't just be an OpenAI shop. So I think seeing this extend into GitHub is just a continuation of what we're already seeing happen in Azure with model gardens and the ability to do those things.
[00:11:57] Speaker B: I didn't see Model Gardens as a step back from OpenAI, just more, you know.
[00:12:02] Speaker C: Oh, it's giving more choice. It's not a step back. I'm not saying they're going to pull their investment or not invest in the future, but I think Microsoft is hedging their bets.
[00:12:12] Speaker B: All right.
[00:12:12] Speaker C: AWS EC2 Image Builder now supports building and testing macOS images.
macOS being supported by Image Builder will allow you to create and manage machine images for your macOS workloads, in addition to the existing support for Windows and Linux systems. I guess this is mostly for build systems, if you need certain prerequisites on your macOS workstations. I'm hoping this lets you get away from one of the big problems with macOS: if you spin up a box to create an image and an AMI, you have to pay for it for 24 hours, because EC2 Mac dedicated hosts have a 24-hour minimum allocation. I'm hoping that because this is a managed service, maybe you get a more partitioned pricing structure — but probably not. Amazon's not that nice.
[00:12:49] Speaker B: Yeah, I mean, you might save some money versus the old way of having to do this image build, which would be, like, through HashiCorp's Packer or something — or spinning it up yourself, going and installing everything, and saving it as an AMI. And then you'd have to pay for the full hour or, you know.
But I doubt this changes the pricing model at all.
[00:13:10] Speaker C: Probably not, but I can have dreams.
[00:13:13] Speaker B: Yeah, but hopefully, you know, whatever that thing is that they clearly install in the default OS that just vends money — that's the only way you can afford to run these things — you could now remove that from your OS and maybe save some cash. Agreed.
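If you want to kick the tires on the macOS support, a rough sketch of defining an image recipe with boto3 might look like the following. The recipe name, component ARN, and parent image ARN are hypothetical placeholders, and the builds still run on EC2 Mac dedicated hosts behind the scenes:

```python
# Rough sketch: defining a macOS image recipe in EC2 Image Builder with
# boto3. The recipe name, component ARN, and parent image ARN below are
# hypothetical placeholders; the actual macOS parent images come from AWS.
import boto3

imagebuilder = boto3.client("imagebuilder", region_name="us-east-1")

response = imagebuilder.create_image_recipe(
    name="macos-ci-agents",        # hypothetical recipe name
    semanticVersion="1.0.0",
    parentImage=(
        "arn:aws:imagebuilder:us-east-1:aws:image/"
        "macos-sonoma-arm64/x.x.x"  # placeholder for an AWS macOS image
    ),
    components=[
        {
            # A component that installs build prerequisites (say, Xcode
            # command line tools); this ARN is a placeholder for your own.
            "componentArn": (
                "arn:aws:imagebuilder:us-east-1:123456789012:"
                "component/install-xcode-clt/1.0.0"
            )
        }
    ],
)
print("Recipe ARN:", response["imageRecipeArn"])
```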
[00:13:29] Speaker C: As we said at the top of the show, ECS is now 10 years old, which is impossible, because I'm not that old — and re:Invent 2014 was like a couple years ago, right?
[00:13:38] Speaker B: Yeah, I think so.
[00:13:39] Speaker C: 10 years, that's crazy.
When it launched, I was there at re:Invent. I enjoyed going out into the Conex yard they built out, basically all these different shipping containers stacked around. And I believe the musical group that year was Martin Garrix, if I recall correctly — but I might be wrong, so don't quote me on that one. ECS has had quite the evolution over the last 10 years. So 2014, of course, it launched. 2015 it got auto scaling, which I forgot was not a day one feature.
2016 it got support for the ALB, which made sense because the CLB was basically starting to be retired. 2017, Fargate came, which allowed you to run your containers without running your own hosts, which was a great opportunity. And then in 2018, ECS auto scaling was supported. And that one, I'm surprised how late it came, because I thought it was always there — but apparently not.
[00:14:31] Speaker B: I think that's a misnomer somehow. We'll have to figure it out, because ECS auto scaling was released in 2015, so I think this is more the AWS Auto Scaling service.
[00:14:41] Speaker C: I think it's for the hosts themselves, but yeah, I don't really know.
[00:14:47] Speaker B: Growing cluster size.
[00:14:48] Speaker C: Yeah, I think it was growing the clusters. Let me go back to it here in the notes: with the release of AWS Auto Scaling, teams could now easily build scaling plans for their Amazon ECS tasks. That year also saw Amazon ECR move to its own console experience outside the ECS console, and the integration of Amazon ECS with Cloud Map. Yeah — AWS Auto Scaling, "unified scaling for your cloud apps."
[00:15:08] Speaker B: Like, I'm fairly certain that auto scaling was an option in 2015.
[00:15:14] Speaker C: Yeah: "This new service unifies and builds on existing service-specific scaling features. It operates on any desired EC2 Auto Scaling groups, Spot Fleets, ECS tasks, DynamoDB tables, DynamoDB global secondary indexes, and Aurora replicas."
So this has to just be kind of a node-level thing — node level, plus a unification of the process.
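For reference, the service-level scaling being puzzled over here is wired up through the Application Auto Scaling APIs rather than EC2 Auto Scaling, which is part of the confusion. A minimal boto3 sketch, with hypothetical cluster and service names:

```python
# Minimal sketch of ECS service auto scaling via Application Auto Scaling
# with boto3. Cluster and service names are hypothetical.
import boto3

aas = boto3.client("application-autoscaling", region_name="us-east-1")

# Register the ECS service's desired count as a scalable target.
aas.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId="service/demo-cluster/web-service",  # service/<cluster>/<service>
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=2,
    MaxCapacity=20,
)

# Target-tracking policy: keep average CPU around 60%.
aas.put_scaling_policy(
    PolicyName="web-service-cpu60",
    ServiceNamespace="ecs",
    ResourceId="service/demo-cluster/web-service",
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 60.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
        "ScaleOutCooldown": 60,   # seconds before another scale-out
        "ScaleInCooldown": 120,   # seconds before another scale-in
    },
)
```

The 2018 "AWS Auto Scaling" launch was essentially a unified scaling-plan layer over these same Application Auto Scaling targets.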
After that came Graviton2 support in 2019. 2020 got you Bottlerocket. 2021 was ECS Exec, which I was very happy about when it was announced.
[00:15:44] Speaker B: We'd been asking for that for six years when they announced it.
[00:15:46] Speaker C: Yeah, I'm like, oh yeah, we only asked for that in 2016. 2022 was ECS Service Connect, 2023 got you GuardDuty ECS runtime monitoring support, and 2024 brought EBS support. Those are some of the highlights of ECS's evolution. So despite Kubernetes dominating the market, ECS has continued to get a lot of innovation. I imagine it runs a lot of services under the hood at AWS, for their own use cases and for how they run the services you consume. I'm glad to see it still existing. I always worry a little bit that Amazon's just going to give in, move everything to EKS, and kill ECS. But it hasn't happened yet, and I hope it never does.
[00:16:24] Speaker B: Yeah. And the fact that they continue to announce enhancements, right? I hate that both Kubernetes and ECS are adopting EBS support and sort of giving up on stateless workloads. But that's just it — it's also natural. I get it. But yeah, it's crazy. How was Graviton2 supported that long ago?
Lots of things in this article make me feel old.
[00:16:48] Speaker C: Right.
I remember when the first Graviton came out, then Graviton2. And yeah, 2019 for Graviton2 feels about right, because you have the pandemic in the middle of it.
I even remember Bottlerocket getting announced right before the pandemic started, and we were all kind of like, what is this exactly? And we didn't really know.
[00:17:07] Speaker B: And.
[00:17:07] Speaker C: Yeah. Mm, yeah, it's interesting. But yeah, congratulations — happy birthday, ECS. Stop getting older, because I can't be aging this fast.
[00:17:16] Speaker B: Seriously.
[00:17:18] Speaker C: Well, if you are in the AI training business, Amazon is announcing the launch of a new interface type that decouples the EFA and the ENA — the Elastic Fabric Adapter and the Elastic Network Adapter. EFA provides high-bandwidth, low-latency networking crucial for scaling AI/ML workloads, and the new EFA-only interface allows you to create a standalone EFA device on secondary interfaces. This allows you to scale your compute clusters to run AI/ML applications without straining private IPv4 space or encountering IP routing challenges with Linux.
[00:17:50] Speaker B: I'm going to put you a little on the spot. Can you define — tell me the differences between — what an Elastic Fabric Adapter is versus an attached network device?
[00:18:02] Speaker C: So the fabric adapter, I believe, just connects to the storage backend, where the network adapter provides the other throughput items, like IP addressing for your load balancer to talk to the server. So by separating them out, you're basically getting dedicated capacity to the storage backends that Amazon provides you through the Elastic Fabric Adapter, without burning IP addresses that you're never going to use on the ENA side. Huh.
[00:18:28] Speaker B: That's pretty awesome actually.
[00:18:29] Speaker C: You're impressed that I could answer you on the fly, weren't you?
[00:18:33] Speaker B: I was nervous, actually. And I am impressed, of course, but.
[00:18:37] Speaker C: Only because I also didn't understand this announcement when I wrote the show notes. And so I had to go do that research. At that time I was like, I don't understand.
But I figured someone might ask.
[00:18:47] Speaker B: And now you're justifying my laziness because I had that thought when I originally read through and I was like, oh, I should go do that. And I didn't.
[00:18:53] Speaker C: Yeah, and I was even 20 minutes late, so you could have done that a while ago.
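For the curious, requesting one of these at launch time would look roughly like the boto3 sketch below. The "efa-only" interface type string reflects our reading of the announcement, and the AMI, subnet, and security group IDs are placeholders:

```python
# Hedged sketch: launching an instance with a primary ENA interface plus a
# secondary EFA-only interface. The "efa-only" InterfaceType value is our
# reading of the announcement; instance type, AMI, subnet, and security
# group IDs are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI
    InstanceType="p5.48xlarge",        # an EFA-capable instance type
    MinCount=1,
    MaxCount=1,
    NetworkInterfaces=[
        {   # Primary interface: a regular ENA with an IP, for normal traffic.
            "DeviceIndex": 0,
            "SubnetId": "subnet-0123456789abcdef0",
            "Groups": ["sg-0123456789abcdef0"],
            "InterfaceType": "interface",
        },
        {   # Secondary interface: EFA-only, used purely for the fabric; per
            # the announcement it doesn't consume private IPv4 space.
            "DeviceIndex": 1,
            "SubnetId": "subnet-0123456789abcdef0",
            "Groups": ["sg-0123456789abcdef0"],
            "InterfaceType": "efa-only",
        },
    ],
)
print(resp["Instances"][0]["InstanceId"])
```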
All right, let's move to GCP.
Google is announcing major updates to the AI Hypercomputer software layer for training and inference performance, improving resiliency at scale, as well as a centralized hub for Hypercomputer resources. This includes the centralized AI Hypercomputer resources on GitHub — a GitHub organization and central repository for developers to access reference implementations like MaxText and MaxDiffusion and orchestration tools like XPK.
MaxText itself will now support A3 Mega VMs, which are VMs powered by Nvidia H100 Tensor Core GPUs offering a 2x improvement in GPU-to-GPU network bandwidth over the legacy A3 VMs. There's also a collaboration with Nvidia to optimize JAX and XLA for overlapping communication and computation on GPUs, as well as a reference implementation and kernels for Mixture of Experts, or MoE, which basically allows you to do something I don't understand. Then there's monitoring for large-scale training, where they introduce a reference monitoring recipe to create a Cloud Monitoring dashboard in your Google Cloud project to monitor your Mega VMs. Also, SparseCore on Cloud TPU v5p is now generally available, and they've improved LLM inference performance with KV cache quantization and ragged attention kernels in JetStream. Again, these are a lot of words I don't understand exactly, but if you are into this Mega VM thing and centralized AI Hypercomputer stuff, I'm sure this is really cool.
[00:20:23] Speaker B: It's funny, because it really does show how much the AI branding is taking over everything — a lot of these things are the same things we were talking about for machine learning, right? It cracks me up how it's sort of taken over.
Yeah, but other than that, I know very little detail beyond the sort of GitHub organization. I like to see those reference models, just because, you know, you can't ask AI to write all your code — you've got to copy-pasta some of it. Cool.
[00:21:03] Speaker A: There are a lot of cloud cost management tools out there, but only Archera provides cloud commitment insurance. It sounds fancy, but it's really simple. Archera gives you the cost savings of a one- or three-year AWS savings plan with a commitment as short as 30 days. If you don't use all the cloud resources you've committed to, they will literally put the money back in your bank account to cover the difference. Other cost management tools may say they offer commitment insurance, but remember to ask: will you actually give me my money back? Archera will. Click the link in the show notes to check them out on the AWS Marketplace.
[00:21:42] Speaker C: Well, if you are trying to get your AI to work and you need to load your data, you might need to prep your data — and so BigQuery is now giving you AI-assisted data preparation, in preview.
This gives you a number of capabilities, including AI-powered suggestions for BigQuery data preparation, using Gemini in BigQuery to analyze your data and schema and provide intelligent suggestions for cleaning, transforming, and enriching that data; data cleansing and standardization capabilities; visual data pipelines, an intuitive low-code visual interface helping both technical and non-technical users easily develop complex data pipelines while leveraging BigQuery's rich and extensible SQL capabilities (what could go wrong with a low-code complex data pipeline?); and data pipeline orchestration, which automates the execution and monitoring of your data pipelines. All available to you now in BigQuery data preparation.
[00:22:27] Speaker B: Yeah, so any data pipeline is a complex data pipeline — they just don't make a simple version of these things, unfortunately. But I will say this tool is amazing. I've used it in my day job: I had a giant data set listing SSL certificates for incoming connections, trying to verify that we're using modern ciphers and TLS 1.2 and not allowing TLS 1.0, and this allowed me to replace the hacky thing I had that wasn't doing it very efficiently once I started asking questions. Pretty sweet, actually. I did not use the data pipeline part, just the schema validation and feedback — and I'm a SQL idiot, so it had a lot of recommendations, weirdly.
But yeah, very cool.
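To make the data preparation piece concrete: the suggestions ultimately compile down to ordinary BigQuery SQL. Here is a hedged sketch of the kind of cleaning transform it might propose, run through the BigQuery Python client — the dataset, tables, and transforms are invented for illustration:

```python
# Hedged example: the sort of cleaning/standardization SQL that data
# preparation suggestions boil down to, executed via the BigQuery Python
# client. Dataset/table names and the transforms are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

sql = """
CREATE OR REPLACE TABLE `my_project.sales.orders_clean` AS
SELECT
  TRIM(LOWER(customer_email))            AS customer_email,  -- standardize
  SAFE_CAST(order_total AS NUMERIC)      AS order_total,     -- fix types
  COALESCE(country, 'UNKNOWN')           AS country,         -- fill gaps
  PARSE_DATE('%Y-%m-%d', order_date_str) AS order_date       -- parse dates
FROM `my_project.sales.orders_raw`
WHERE order_id IS NOT NULL                                   -- drop junk rows
"""

client.query(sql).result()  # blocks until the job finishes
print("orders_clean rebuilt")
```

The preview's Gemini suggestions essentially propose transforms like these and let you review and apply them visually.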
[00:23:23] Speaker C: There's one more Magic Quadrant this week, this one for API management.
And Apigee was named the top leader again — furthest in completeness of vision and ability to execute. And I was mostly struck by this list of apparent API management tools, as most of them don't seem like API management tools to me.
[00:23:42] Speaker B: Right. I left it up because I was like, this is crazy. The little picture, the quadrant box, like.
[00:23:48] Speaker C: So, like, some of them make sense to me. Google Cloud's Apigee is there, Kong is there, even IBM is there — I kind of get them, I understand that. Salesforce MuleSoft? Okay, sure.
[00:23:58] Speaker B: What?
[00:24:01] Speaker C: And then, like, WSO2 is on the list. I know those guys.
Then things get weird. Yeah, Postman is on here.
I didn't understand Boomi being on here. I didn't understand Axway either. Either they're clearly innovating in ways I don't understand — because I'm not talking to them very often and I'm just in the dark — or I don't understand this quadrant. That's kind of my take. Like, SAP is not building an API gateway, are they? And they're in the Challengers quadrant. I'm like, I don't get it. Amazon Web Services, though, being rated very good on ability to execute but not on completeness of vision — so they're in the Challengers quadrant — speaks volumes about how little innovation API Gateway has gotten in years.
[00:24:44] Speaker B: Yeah, I mean, it hasn't gotten a lot of love in years. But I guess maybe I'm not the right customer, because I did find it lacking in a whole lot of areas. And again, looking at this quadrant, I feel like my definition of an API gateway and Gartner's definition are just way off from each other.
[00:25:05] Speaker C: I guess technically they call it API management — maybe that's where the rub is. I can see Postman being an API management thing, but I don't see it as a gateway, and I don't see Apigee necessarily being something that does Postman-type things. So that's where I just don't understand the inclusion criteria for this particular quadrant.
[00:25:25] Speaker B: Yeah, that is strange.
[00:25:28] Speaker C: So anyways, Apigee, top right, is apparently the best and definitely worth checking out. I won't share that sentiment myself, but that's what Gartner thinks.
So next week we are covering earnings, and Microsoft, which owns Azure, has basically said they are changing the way they report some Azure metrics to the stock market in their upcoming earnings call. Microsoft said the change will align Azure with consumption revenue and, by inference, more closely align with how AWS reports its metrics. The accounting change removes slower-growth revenue streams, which raises the growth rate for Azure, and also increases the AI contribution within the Azure spend line. So they're removing Enterprise Mobility and Security — basically their MDM and MAM products — and Power BI, and they're adding in ads and a couple other things into their numbers. We'll talk about this more next week, but as a little preview: Azure is trying to be more like Amazon in how they report their earnings.
[00:26:29] Speaker B: Yeah, there's a lot I don't understand about that statement specifically because it's sort of like.
[00:26:37] Speaker C: Basically, you have this revenue in these buckets, and you're going to count some of it in this bucket and move some of it to this other bucket, so that this number gets bigger than the other one.
[00:26:48] Speaker B: Oh, that part I understand — how to lie with numbers.
[00:26:50] Speaker C: I do that all the time.
[00:26:51] Speaker B: But yeah, it's just that they're trying to report similarly to Amazon, and yet the changes they made don't make any sense, right? Because they're still very different ways of reporting stuff — the things they're grouping together — versus Azure and Microsoft in general. I think it's pretty crazy.
[00:27:12] Speaker C: Okay, well, GitHub Universe happened this week, and there is a lot to talk about.
So first of all, Azure had a bunch of things for GitHub Copilot for Azure, which is now in preview. This is their plugin that plugs into Copilot — sorry, into VS Code and others — and provides very Azure-specific capability. In your IDE, when you have this installed, you can do an @azure, which will query your Azure information and give you personalized guidance to learn about services and tools without leaving your code window. This can accelerate and streamline development by provisioning and deploying resources through the Azure Developer CLI templates and the AI app templates, further accelerating your development by helping you get started faster and simplifying evaluation and the path to production for all your AI workloads, right from inside your IDE.
GitHub Models is now in preview to give you access to Azure AI's leading model garden. Again, if you want to access some of the other models and try them out inside of your IDE, you can do that directly. They also have a new capability for keeping your Java apps up to date, which can be time-consuming: the GitHub Copilot upgrade assistant for Java, which offers an AI-driven approach to simplify the process and lets you upgrade your Java apps with minimal manual effort. And you can now scale your AI applications with Azure AI evaluation and online A/B experimentation using CI/CD workflows.
[00:28:32] Speaker B: I like all of these, but I really don't like the keeping-Java-apps-up-to-date one. They're just furthering the life of that terrible, terrible language. And part of it is that, yes, it abstracts all these simple things away, but that's why I hate it. It shouldn't exist, it's terrible, and newer languages have moved on.
Sorry, I'm angry today. Playing the part of Jonathan the cynic is Ryan.
[00:29:00] Speaker C: Well, I mean, I appreciate that a lot of Java apps are just rotting on the vine with security vulnerabilities. So if this can help keep those more up to date, that's a good situation as well.
All right, so then we get into the heavier duty GitHub universe stuff. This is for GitHub in general. So first of all AI native GitHub copilot workspace plus code review plus copilot auto fix to allow you to rapidly refine, validate and land Copilot generated code. Suggestions from Copilot Code Review, Copilot auto fix and third party copilot extensions directly in your IDE they have a new GitHub Spark which is a new way to start ideas. It's powered by natural language and it sets the stage for GitHub's vision to help 1 billion people become developers. With live history previews and ability to edit code directly, GitHub Spark allows you to create micro apps that take that crazy small fun idea and bring it to life. None of my ideas are crazy small or fun, but I will be checking out GitHub Spark at some point because I'm curious what it could do for me raising the quality of Copilot power experience. They have added new features such as Multimodal Choice which we talked about earlier, an improved code completion, Implicit agent selection and GitHub copilot chat and better support for C&.NET and expanded availability and Xcode for all your Apple developers and Windows Terminal will now be able to access GitHub Copilot.
You can now edit multiple lines and files with Copilot in VS Code, applying edits directly as you iterate on your code base with natural language. That basically means it'll do updates across multiple files for you — say I'm changing a module that has references in other places, it'll do that for me automatically. GitHub Copilot code reviews provide Copilot-powered feedback on your code as soon as you create a pull request, which means no more waiting for hours for the feedback loop to start. Configure rules for your team and keep quality high with the help of your trusted AI pair programmer, now supporting C#, Java, JavaScript, Python, TypeScript, Ruby, Go, and Markdown. GitHub Copilot Extensions allow you or your organization to integrate proprietary tools directly into your IDE via the GitHub Marketplace — for example, Atlassian, New Relic, etc. are all available to you directly, so you don't have to install multiple plugins. Which is nice, because if you've been trying different AI bots or Amazon things, you end up with a lot of extensions in your VS Code; now you can do that through Copilot Extensions and not have to install all of that locally. And for the EU, you now get data residency for GitHub Enterprise Cloud, so you can put all your data in the EU and comply with GDPR. Finally, GitHub Issues continues to try to compete with Jira, with further improvements: sub-issues, issue types, advanced search, and increased project item limits. All available to you now in GitHub.
[00:31:30] Speaker B: I have so many comments, so many. I do like the addition of the code review and feedback abilities to GitHub.
I think that's a fantastic thing just to have built in. I hope it makes some of the "finding nine different people to validate my PRs so I can go to production" go away. But we'll see.
Spark looks amazing. And I only say that because I have a server dedicated to doing just this in my house.
Like, I have a whole bunch of little things — little APIs I built that serve different things — including even some of the examples they had directly, like tracking kids' allowance.
I have a chore chart, and I have all kinds of things that monitor or display data or do home automation. So I love that this is an easy button for people to be able to build that, because it's great having the technical expertise to put it together myself, but my kids are never going to learn all this stuff — they're just going to be able to ask for it in natural language, which is rad.
[00:32:44] Speaker C: Yeah, I love that when you create the little micro apps, they call them Sparks, which I think is cute.
[00:32:50] Speaker B: It's cute. Yeah.
[00:32:51] Speaker C: I just signed up.
[00:32:52] Speaker B: I hated it reading the headline and then I read the article, I was like, oh, this is fantastic.
[00:32:56] Speaker C: Yeah, exactly. I did sign up for the waitlist on this one — it's open for you to sign up, and hopefully you get added pretty quickly.
It worked out for me with Apple Intelligence: I signed up and got in almost immediately. So I'm hoping the same thing will happen here with GitHub Spark.
[00:33:09] Speaker B: Oh, that's cool.
The extensions thing is interesting too, because I didn't understand it at all, right? I was trying to figure out why it existed. Like, I don't know why I'd want Atlassian plugged into my IDE.
[00:33:24] Speaker C: Because they would give you access to all of your Confluence documentation directly from your IDE.
[00:33:28] Speaker B: Yeah, like I said, I don't know why I was.
[00:33:33] Speaker C: Actually, the Atlassian tool is called Rovo, and I believe it does way more than just Confluence. I think it leverages Jira stuff too, so you can find where changes happened that your code might require. So there's a couple interesting things there.
[00:33:47] Speaker B: Yeah, that's pretty cool actually. Yeah.
[00:33:51] Speaker C: All right, let's move on. You can now accelerate scale with the Azure OpenAI Service provisioned offering. Basically, Azure OpenAI Service Data Zones are coming, which allow enterprises to scale AI workloads while maintaining compliance with regional data residency requirements — hey, that sovereign cloud thing. It offers flexible multi-regional data processing within selected data boundaries, eliminating the need to manage multiple resources across regions. They now give you a 98% latency SLA for token generation, ensuring faster and more consistent token generation speeds, especially at high volumes, providing predictable performance for mission-critical apps.
Reduced pricing and lower deployment minimums for your provisioned global deployments: hourly pricing goes from $2 to $1 per hour, deployment minimums for provisioned global are reduced by 70%, and scaling increments are reduced by up to 90%, lowering the barrier for businesses to start using the provisioned offering versus just on-demand. In addition, they're giving you prompt caching capabilities, so you pay 50% less for your cached tokens, and simplified token throughput information, which provides a clear view of input and output tokens per minute for each provisioned deployment — eliminating the need for detailed conversion tables or calculators, or for building another AI model to manage this AI model, which is never ideal.
[00:35:02] Speaker B: Yeah, I'm glad to see these features around tokens being added, because I've noticed it being highlighted more in some of the other AI products as well. It furthers the situation where only Jonathan understands how this token thing works, and everyone else is just struggling.
[00:35:23] Speaker C: I think I talked about it last week: I implemented Claude in my VS Code, and when I ask it questions, it now tells me how many tokens I used, which has been really helpful for learning how many tokens I use and how much that costs me — especially when you're paying by the drip. Now, I have a Claude subscription as well, and for that one I just pay 20 bucks a month, and I see the value of paying 20 bucks a month if you're doing a lot of heavy-duty stuff. But if you need to integrate an app, you have to use the APIs, and that's where the tokens kill you. Yeah, definitely.
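If you want to see where that 50% cached-token discount shows up, it's in the usage block on each response. A hedged sketch with the openai SDK's AzureOpenAI client — the endpoint, API version, and deployment name are placeholders, and the cached-token field assumes the newer usage-details response shape:

```python
# Hedged sketch: observing prompt caching with the Azure OpenAI client.
# Endpoint, API version, and deployment name are placeholders; the
# cached-token accounting assumes the newer prompt_tokens_details shape.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # placeholder
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-01-preview",                       # placeholder
)

# Caching keys off long, stable prompt prefixes, so keep the big static
# context first and the per-request bits last.
long_shared_prefix = "You are a support assistant for ExampleCo. " + ("Policy text... " * 200)

for question in ["How do I reset my password?", "What is the refund window?"]:
    resp = client.chat.completions.create(
        model="gpt-4o",  # your deployment name, not the base model name
        messages=[
            {"role": "system", "content": long_shared_prefix},
            {"role": "user", "content": question},
        ],
    )
    usage = resp.usage
    cached = getattr(getattr(usage, "prompt_tokens_details", None), "cached_tokens", None)
    print(f"prompt={usage.prompt_tokens} cached={cached} completion={usage.completion_tokens}")
```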
Well, they're announcing AzAPI 2.0. The AzAPI provider is designed to expedite the integration of new Azure services with HashiCorp Terraform, and it's now at 2.0. This updated version marks a significant step in their goal to provide launch-day support for Azure services using Terraform. I mean, Azure doesn't release anything, so I don't know what launch-day support they need.
The key features of AzAPI 2.0 include resource-specific versioning, allowing users to switch to a new API version without altering provider versions; special functions like azapi_update_resource and azapi_resource_action; and immediate day-zero support for new services. Also, all resource properties, outputs, and state representations are now handled in HashiCorp Configuration Language instead of JSON — and any time you get rid of JSON is a win.
[00:36:33] Speaker B: But HCL? Yeah, HCL — I'd rather do JSON.
I'm surprised about updating API versions without updating the provider, because provider updates have saved me so much hassle, and defining fuzzy versions in my providers has caused me so much hassle, that I'm surprised they're even focusing at this level.
[00:37:00] Speaker C: I mean, I kind of like the idea of it, though, because if you change the API for the service and now you have to roll a whole brand-new provider, you have to maintain a lot of branches of providers — because if you push a new provider that has different syntax, that could be a breaking change.
So this allows you to take advantage of a newer API without the breaking change potentially.
[00:37:23] Speaker B: Oh, this is the API on the cloud side.
[00:37:25] Speaker C: Yeah.
[00:37:26] Speaker B: Oh no, this is — okay, sorry, I misunderstood this. Yeah, super, this is important. Cool.
[00:37:35] Speaker C: Oh, now that you understand it.
[00:37:36] Speaker B: Now I understand it.
[00:37:37] Speaker C: It's perfect.
[00:37:37] Speaker B: We should. Yeah, more of this. It's fantastic.
[00:37:39] Speaker C: I'm actually surprised this wasn't a thing a while ago, because this is definitely something that can burn you pretty quickly.
[00:37:45] Speaker B: Azure definitely has been very slow to update Terraform.
[00:37:48] Speaker C: I think it's, they're trying to get you to use their proprietary, you know, ARM stuff I think. And that's, you know, Google's probably the fastest now because Terraform is their first party language and then Amazon's, you know, they had a similar capability that Amazon and Hazard worked on too. Right. To create the zero day provider for AWS resources. So it's just Azure catching up finally.
All right. And then, Azure OpenAI Global Batch is now generally available, which will let you scale processing at 50% less cost. It's designed to handle large-scale, high-volume processing tasks: efficiently process asynchronous groups of requests with separate quota, a 24-hour turnaround, and 50% less cost than global standard — you get that 50% savings as your benefit. Excellent. It lets you efficiently handle large-scale workloads that would be impractical to process in real time, and minimizes the engineering overhead of job management, with high resource quotas allowing you to queue and process gigabytes of data with ease. Thanks — I appreciate this one.
Wow.
[00:38:45] Speaker B: I hadn't really thought about, you know, some of these pain points because I don't have any workloads where I'm doing AI at scale.
But that now I'm just looking at this going, oh, how much money must that cost?
So much money.
[00:39:00] Speaker C: So much money.
Yeah, if you're building foundation models — I think I saw something saying that to build something like Gemini 1.5 costs, you know, like 3 or 4 million dollars in compute processing.
[00:39:12] Speaker B: But wouldn't this be, like, straight inference — you're using those models to process things?
[00:39:16] Speaker C: I mean, it could be that too. Yeah, yeah, yeah.
[00:39:18] Speaker B: Because that's what I see when I see global batch. Global batch doesn't really make sense for training — that's true, at least for my rudimentary knowledge of how all this stuff works. But it's like, you know, take all of my incoming transactions and match them to criteria. Pretty cool, though. I mean, I'm glad it exists. I don't pay for it.
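Mechanically, Global Batch follows the familiar Batch API shape: upload a JSONL file of requests, create a batch with a 24-hour completion window, and poll for results. A rough sketch, where the JSONL body format, deployment name, and endpoint path are assumptions based on that pattern:

```python
# Hedged sketch of the batch flow: upload a JSONL of requests, then create
# a batch job with a 24-hour completion window. The JSONL body, deployment
# name, and endpoint path are assumptions based on the Batch API pattern.
import json
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # placeholder
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-01-preview",                       # placeholder
)

# One request per line; custom_id lets you match results back up later.
with open("batch_input.jsonl", "w") as f:
    for i, text in enumerate(["summarize doc A...", "summarize doc B..."]):
        f.write(json.dumps({
            "custom_id": f"task-{i}",
            "method": "POST",
            "url": "/chat/completions",
            "body": {"model": "gpt-4o-batch",  # your deployment name
                     "messages": [{"role": "user", "content": text}]},
        }) + "\n")

batch_file = client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/chat/completions",
    completion_window="24h",   # the async, 50%-cheaper tier
)
print(batch.id, batch.status)  # poll client.batches.retrieve(batch.id) later
```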
[00:39:48] Speaker C: Our final two stories come from Oracle this week. First, there's this article about creating a multi-cloud data platform with a converged database.
And when I read this initially, I was like, I don't know what this garbage is going to be. Then as I read through it, I realized, oh, this is why Oracle has probably been so big about getting partnerships with Amazon, Google, and Azure.
So basically, Oracle Autonomous Database will be available across all the major cloud providers by 2025. And they're introducing Oracle's converged database solution: a single database that manages all data types — structured, unstructured, graph, geospatial, and vectors — and can be deployed across private data centers and all major cloud platforms. So think about a global namespace for your single Oracle database that crosses all clouds and all data sovereignty needs and gives you one unified interface. That's the dream, and that's basically what they're talking about here. They reiterate their partnerships with all four big cloud providers and what they're providing with Oracle Exadata in the Azure data centers, the connectivity capabilities, and then the converged database capabilities, which include unified data management — handling multiple data types within a single database system, reducing the need for multiple specialized databases — as well as compliance with data residency regulations, ensuring minimal data replication and consistent data management across geographies to meet stringent regulatory requirements. And so, yes, I see you, Oracle. I see your play, and I'm impressed. I think it's also a massive failure domain, but you know.
[00:41:12] Speaker B: Yeah, that's my first thought too. Like what could go wrong?
[00:41:15] Speaker C: Yeah. Oh, crap — us-east-1 went down, and now my data partitioned in us-east-1 is unavailable to the global cluster.
But I like the idea of what they're trying to do. This is truly very multi-cloud — where you really want to leverage multi-cloud, not just the best cloud for the solution you need. And it's kind of interesting: I can see some really interesting data warehouse use cases, and some interesting global replication needs you might have where this could be really handy. So if you're already sending all the monies to Oracle, why not take advantage of something like this, if it makes sense for your solution?
[00:41:50] Speaker B: Yeah, the replication part seems kind of neat, but if it's a global namespace, I don't know how you really leverage it. Weird. Probably cool, I guess, if you have that need. I like data smaller, not bigger.
[00:42:07] Speaker C: Yep.
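For flavor, "converged" here mostly means one SQL engine speaking several data models at once. A hedged sketch with the python-oracledb driver, touching just the relational-plus-JSON corner of the pitch — connection details and schema are invented:

```python
# Hedged sketch: one Oracle table mixing relational columns with JSON,
# queried in a single SQL statement via python-oracledb. Connection details
# and schema are invented; this covers only the relational+JSON corner of
# the converged-database pitch.
import oracledb

conn = oracledb.connect(user="demo", password="demo", dsn="localhost/FREEPDB1")
cur = conn.cursor()

# A VARCHAR2 column constrained to valid JSON: the long-standing
# "converged" pattern that works across Oracle versions.
cur.execute("""
    CREATE TABLE orders (
        id      NUMBER PRIMARY KEY,
        region  VARCHAR2(20),
        details VARCHAR2(4000) CHECK (details IS JSON)
    )
""")

cur.execute(
    "INSERT INTO orders (id, region, details) VALUES (:1, :2, :3)",
    (1, "EU", '{"customer": "acme", "total": 42.5}'),
)
conn.commit()

# One engine, two models: a relational predicate and a JSON path in the
# same query.
cur.execute("""
    SELECT region, JSON_VALUE(details, '$.customer')
    FROM orders
    WHERE JSON_VALUE(details, '$.total' RETURNING NUMBER) > 10
""")
print(cur.fetchall())
```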
And then, of course, because Oracle still hates AWS, they've now released an automated EC2 VM instance migration tool for OCI, which will natively migrate your EC2 VMs to OCI. This fully managed toolset provides you with complete control over the migration workflow while simplifying and automating the process. Features include automatically discovering all the VMs in your source Amazon account so it can move them; creating and managing an inventory within OCI of the resources identified in the source environment; providing compatibility assessments, metrics, recommendations, and cost comparisons, so you know how much money you're going to save before moving to OCI; and creating plans and simplifying the deployment of the migration targets in OCI directly — making your whole migration process easy. If you have Windows: buyer beware.
[00:42:52] Speaker B: Yeah — you've done a little bit of Google, but are OCI's compute prices lower than Amazon's?
[00:43:01] Speaker C: Yeah, yeah, they are. In some cases.
It's one of those weird things, you know. Yes, if you have reserved instances and these things, you can potentially save some money, but then you're giving your soul to Oracle, and that cost is priceless. So.
[00:43:17] Speaker B: I do like the compatibility metrics and recommendations, because, you know, that giant machine you have in AWS? You can run a smaller machine, because that's almost always how it works out — people overprovision.
[00:43:33] Speaker C: Exactly.
[00:43:34] Speaker B: But the tool knows yeah.
[00:43:38] Speaker C: Well, that is it, Ryan, for another busy week in the cloud. We'll be back next week to talk about earnings — I know, your favorite topic. And I saw one of them today; it's definitely a good quarter. So maybe the tech recession is ending. Who knows?
[00:43:55] Speaker B: Ah, yeah, I hope so.
[00:43:57] Speaker C: Right?
[00:43:57] Speaker B: That'd be nice.
[00:43:59] Speaker C: All right, see you next week.
[00:44:00] Speaker B: All right, bye, everybody.
[00:44:04] Speaker A: And that's all for this week in Cloud. We'd like to thank our sponsor, Archera. Be sure to click the link in our show notes to learn more about their services.
While you're at it, head over to our website at thecloudpod.net, where you can subscribe to our newsletter, join our Slack community, send us your feedback, and ask any questions you might have. Thanks for listening, and we'll catch you on the next episode.