327: AWS Finally Admits Kubernetes is Hard, Makes Robots Do It Instead

Welcome to episode 327 of The Cloud Pod, where the forecast is always cloudy! Justin, Matt, and Ryan are here to bring you all the latest news (and a few rants) in the worlds of Cloud and AI. I’m sure all our readers are aware of the AWS outage last week, as it was in all the news everywhere. But we’ve also got some new AI models (including Sora in case you’re low on really crappy videos the youths might like), plus EKS, Kubernetes, Vertex AI, and more. Let’s get started!

Titles we almost went with this week

Oracle and Azure Walk Into a Cloud Bar: Nobody Gets ETL’d
When DNS Goes Down, So Does Your Monday: AWS Takes Half the Internet on a Coffee Break
404 Cloud Not Found: AWS Proves Even the Internet’s Phone Book Can Get Lost
DNS: Definitely Not Staffed – How AWS Lost Its Way When It Lost Its People
When Larry Met Satya: A Cloud Love Story
Azure Finally Answers ‘Dude, Where’s My Data?’ with Storage Discovery
Breaking: Microsoft Discovers AI Training Uses More Power Than a Small Country
404 Engineers Not Found – AWS Learns the Hard Way That People Are Its Most Critical Infrastructure
Azure Storage Discovery: Finding Your Data Needles in the Cloud Haystack
EKS Auto Mode: Because Even Your Clusters Deserve Cruise Control
Azure Gets Reel: Microsoft Adds Video Generation to AI Foundry
The Great Token Heist: Vertex AI Steals 90% Off Your Gemini Bills
Cache Me If You Can: Vertex AI’s Token-Saving Feature
IaC Just Got a Manager – And It’s Not Your Boss
From Musk to Microsoft: Grok 4 Makes the Great Cloud Migration
No Harness.. You are not going to make IACM happen
Microsoft Drafts a Solution to Container Creation Chaos
PowerShell to the People: Azure Simplifies the Great Gateway Migration
IP There Yet? Azure’s Scripts Keep Your Address While You Upgrade

Follow Up

00:53 Glacier Deprecation Email

Standalone Amazon Glacier service (vault-based with separate APIs) will stop accepting new customers as of December 15, 2025.
S3 Glacier storage classes (Instant Retrieval, Flexible Retrieval, Deep Archive) are completely unaffected and continue normally
Existing Glacier customers can keep using it forever – no forced migration required.
AWS is essentially consolidating around S3 as the unified storage platform, rather than maintaining two separate archival services.
The standalone service will enter maintenance mode, meaning there will be no new features, but the service will remain operational.
Migration to S3 Glacier is optional but recommended for better integration, lower costs, and more features. (Justin assures us it is actually slightly cheaper, so there’s that.)

General News

02:24 F5 discloses major security breach linked to nation-state hackers – GeekWire

F5 disclosed that nation-state hackers maintained persistent access to their internal systems over the summer of 2024, stealing portions of BIG-IP source code and vulnerability details before containment in August.
The breach compromised product development and engineering systems, but did not affect customer CRM data, financial systems, or F5’s software supply chain, according to independent security audits.
F5 has released security patches for BIG-IP, F5OS, and BIG-IP Next products and is providing threat-hunting guides to help customers monitor for suspicious activity.
This represents the first publicly disclosed breach of F5’s internal systems, notable given that F5 handles traffic for 80% of Fortune Global 500 companies through its load-balancing and security services.
The incident highlights supply chain security concerns, as attackers targeted source code and vulnerability information, rather than customer data, potentially seeking ways to exploit F5 products deployed across enterprise networks.

03:12 Justin – “A little concerning on this one, mostly because F5 is EVERYWHERE.”

AI is Going Great – Or How ML Makes Money

04:55 Claude Code gets a web version—but it’s the new sandboxing that really matters – Ars Technica

Anthropic launched web and mobile interfaces for Claude Code, their CLI-based AI coding assistant, with the web version supporting direct access to GitHub repositories and the ability to process general instructions, such as “add real-time inventory tracking to the dashboard.”
The web interface introduces multi-session support, allowing developers to run and switch between multiple coding sessions simultaneously through a left-side panel, plus the ability to provide mid-task corrections without canceling and restarting
A new sandboxing runtime has been implemented to improve security and reduce friction, moving away from the previous approach where Claude Code required permission for most changes and steps during execution
The mobile version is currently limited to iOS and is in an earlier development stage compared to the web interface, indicating a phased rollout approach
This positions Claude Code as a more accessible alternative to traditional CLI-only AI coding tools, potentially expanding its reach to developers who prefer web-based interfaces over command-line environments

05:51 Ryan – “I haven’t had a chance to play with the web version, but I am interested in it just because I found the terminal interface limiting, but I also feel like a lot of the value is in that local sort of execution and not in the sandbox. A lot of the tasks I do are internal and require access to either company resources or private networks, or the kind of thing where you’re not going to get that from a publicly hosted sandbox environment.”

08:36 Open Source: Containerization Assist MCP Server

Containerization Assist automates the tedious process of creating Dockerfiles and Kubernetes manifests, eliminating manual errors that plague developers during the containerization process
Built on AKS Draft’s proven foundation, this open-source tool goes beyond basic AI coding assistants by providing a complete containerization platform rather than just code suggestions.
The tool addresses a critical pain point where developers waste hours writing boilerplate container configurations and debugging deployment issues caused by manual mistakes. (Listener beware, Justin mini rant here.)
As an open-source MCP (Model Context Protocol) server, it integrates seamlessly with existing development workflows while leveraging Microsoft’s containerization expertise from Azure Kubernetes Service. (Expertise is a stretch.)
This launch signals Microsoft’s commitment to simplifying Kubernetes adoption by removing the steep learning curve associated with container orchestration and manifest creation – or you could just use a pass.

09:47 Matt – “The piece I did like about this is that it integrated in as an optional feature, kind of the trivia and the security thing. So it’s not just setting it up, but they integrated the next steps of security code scanning. It’s not Microsoft saying, you know, hey, it’s standard … they are building security in, hopefully.”

Cloud Tools

33:09 IaC is Great, But Have You Met IaCM?

IaCM (Infrastructure as Code Management) extends traditional IaC by adding lifecycle management capabilities, including state management, policy enforcement, and drift detection to handle the complexity of infrastructure at scale.
Key features include centralized state file management with version control, module and provider registries for reusable components, and automated policy enforcement to ensure compliance without slowing down teams.
The platform integrates directly into CI/CD workflows with visual PR insights showing cost estimates and infrastructure changes before deployment, solving the problem of unexpected costs and configuration conflicts.
IaCM addresses critical pain points like configuration drift, secret exposure in state files, and resource conflicts when multiple teams work on the same infrastructure simultaneously.
Harness IaCM specifically supports OpenTofu and Terraform with features like Variable Sets, Workspace Templates, and Default Pipelines to standardize infrastructure delivery across organizations.

13:04 Justin – “So let me boil this down for you. We created our own Terraform Enterprise or Terraform Cloud, but we can’t use that name because it’s copyrighted. So we’re going to try to create a new thing and pretend we invented this – and then try to sell it to you as our new Terraform or OpenTofu replacement for your management tier.”

HugOps Corner – Previously Known as AWS

41:08 AWS outage hits major apps and services, resurfacing old questions about

cloud redundancy – GeekWire

AWS US-EAST-1 experienced a major outage starting after midnight Pacific on Monday, caused by DNS resolution issues with DynamoDB that prevented proper address lookup for database services, impacting thousands of applications, including Facebook, Snapchat, Coinbase, ChatGPT, and Amazon’s own services.
The outage highlighted ongoing redundancy concerns as many organizations failed to implement proper failover to other regions or cloud providers, despite similar incidents in US-EAST-1 in 2017, 2021, and 2023, raising questions about single-region dependency for critical infrastructure.
AWS identified the root cause as an internal subsystem responsible for monitoring network load balancer health, with core DNS issues resolved by 3:35 AM Pacific, though Lambda backlog processing and EC2 instance launch errors persisted through the morning recovery period.
Real-world impacts included LaGuardia Airport check-in kiosk failures, causing passenger lines, widespread disruption to financial services (Venmo, Robinhood), gaming platforms (Roblox, Fortnite), and productivity tools (Slack, Canva), demonstrating the cascading effects of cloud provider outages.
The incident underscores the importance of multi-region deployment strategies and proper disaster recovery planning for AWS customers, particularly those using US-EAST-1 as their primary region due to its status as AWS’s oldest and largest data center location.
We have a couple of observations: this one took a LONG time to resolve, including hours before the DNS was restored. Maybe they’re out of practice? Maybe it’s a people problem? Hopefully, this isn’t the new norm as some of the talent have been let go/moved on.

17:53 Ryan – “If it’s a DNS resolution issue that’s causing a global outage, that’s not exactly straightforward. It’s not just a bug, you know, or a function returning the wrong value, or that you’re looking at global propagation, you’re looking at clients in different places, resolving different things, at the base parts of the internet for functionality. And so it does take a pretty experienced engineer to sort of have that in their heads conceptually in to order to troubleshoot. I wonder if that’s really the cause, where they’re not able to recover as fast. But I also feel like cloud computing has come a long way, and the impact was very widely felt because a lot more people are using AWS as their hosting provider than I think have been in the past. A little bit of everything, I think.”

AWS outage was not due to a cyberattack — but shows potential for ‘far worse’ damage – GeekWire

AWS’s US-EAST-1 region experienced an outage due to an internal monitoring subsystem failure affecting network load balancers, impacting major services including Facebook, Coinbase, and LaGuardia Airport check-in systems.
The issue was related to DNS resolution problems with DynamoDB, not a cyberattack.
The incident highlights ongoing single-region dependency issues, as US-EAST-1 remains AWS’s largest region and has caused similar widespread disruptions in 2017, 2021, and 2023. Many organizations still lack proper multi-region failover despite repeated outages from this location.
Industry experts warn that the outage demonstrates vulnerability to potential targeted attacks on cloud infrastructure monoculture. The concentration of services on single providers creates systemic risk similar to agricultural monoculture, where one failure can cascade widely.
The failure occurred at the control-plane level, suggesting AWS should implement more aggressive isolation of critical networking components. This may accelerate enterprise adoption of multi-cloud and multi-region architectures as baseline resilience requirements.
AWS resolved the issue within hours but the incident reinforces that even major cloud providers remain vulnerable to cascading failures when core monitoring and health check systems malfunction, affecting downstream services across their infrastructure.

Today is when Amazon’s brain drain finally caught up with AWS • The Register

AWS experienced a major outage on October 20, 2025 in US-EAST-1 region caused by DNS resolution failures for DynamoDB endpoints, taking 75 minutes just to identify the root cause and impacting banking, gaming, social media, and government services across much of the internet.
The incident highlights concerns about AWS’s talent retention, with 27,000+ Amazon layoffs between 2022-2025 and internal documents showing 69-81% regretted attrition, suggesting loss of senior engineers who understood complex failure modes and had institutional knowledge of AWS systems.
DynamoDB’s role as a foundational service meant the DNS failure created cascading impacts across multiple AWS services, demonstrating the risk of centralized dependencies in cloud architectures and the importance of regional redundancy for critical workloads.
AWS’s status page showed “all is well” for the first 75 minutes of the outage, continuing a pattern of slow incident communication that AWS has acknowledged as needing improvement in multiple previous post-mortems from 2011, 2012, and 2015.
The article suggests this may be a tipping point where the loss of experienced staff who built these systems is beginning to impact AWS’s legendary operational excellence, with predictions that similar incidents may become more frequent as institutional knowledge continues to leave.

-And that’s an end to Hugops. Moving on to the rest of AWS-

23:58 Monitor, analyze, and manage capacity usage from a single interface with \Amazon EC2 Capacity Manager | AWS News Blog

EC2 Capacity Manager provides a single dashboard to monitor and manage EC2 capacity across all accounts and regions, eliminating the need to collect data from multiple AWS services like Cost and Usage Reports, CloudWatch, and EC2 APIs.
Available at no additional cost in all commercial AWS regions.
The service aggregates capacity data with hourly refresh rates for On-Demand Instances, Spot Instances, and Capacity Reservations, displaying utilization metrics by vCPUs, instance counts, or estimated costs based on published On-Demand rates.
Key features include automated identification of underutilized Capacity Reservations with specific utilization percentages by instance type and AZ, plus direct modification capabilities for ODCRs within the same account.
Data exports to S3 extend analytics beyond the 90-day console retention period, enabling long-term capacity trend analysis and integration with existing BI tools or custom reporting systems.
Organizations can enable cross-account visibility through AWS Organizations integration, helping identify optimization opportunities like redistributing reservations between development accounts showing 30% utilization and production accounts exceeding 95%.

25:45 Ryan – “This is kind of nice to have it built in and just have it be plug and play – especially when it’s at no cost.”

26:21 New Amazon EKS Auto Mode features for enhanced security, network control, and performance | Containers

EKS Auto Mode now supports EC2 On-Demand Capacity Reservations and Capacity Blocks for ML, allowing customers to target pre-purchased capacity for AI/ML workloads requiring guaranteed access to specialized instances like P5s. This addresses the challenge of GPU availability for training jobs without over-provisioning.
New networking capabilities include separate pod subnets for isolating infrastructure and application traffic, explicit public IP control for enterprise security compliance, and forward proxy support with custom certificate bundles. These features enable integration with existing enterprise network architectures without complex CNI customizations.
Complete AWS KMS encryption now covers both ephemeral storage and root volumes using customer-managed keys, addressing security audit findings that previously flagged unencrypted storage.
This eliminates the need for custom AMIs or manual certificate distribution.
Performance improvements include multi-threaded node filtering and intelligent capacity management that can automatically relax instance diversity constraints during capacity shortages.
These optimizations particularly benefit time-sensitive applications and AI/ML workloads requiring rapid scaling.
EKS Auto Mode is available for new clusters or can be enabled on existing EKS clusters running Kubernetes 1.29+, with migration guides available for teams moving from Managed node groups, Karpenter, or Fargate.
Pricing follows standard EKS pricing at $0.10 per cluster per hour plus EC2 instance costs.

27:33 Ryan – “This just highlights how terrible it was before.”

29:33 Amazon EC2 now supports Optimize CPUs for license-included instances

EC2 now lets customers reduce vCPU counts and disable hyperthreading on Windows Server and SQL Server license-included instances, enabling up to 50% savings on vCPU-based licensing costs while maintaining full memory and IOPS performance.
This feature targets database workloads that need high memory and IOPS but fewer vCPUs – for example, an r7i.8xlarge instance can be reduced from 32 to 16 vCPUs while keeping its 256 GiB memory and 40,000 IOPS.
The CPU optimization extends EC2’s existing Optimize CPUs feature to license-included instances, addressing a common pain point where customers overpay for Microsoft licensing due to fixed vCPU counts.
Available now in all commercial AWS regions and GovCloud regions, with no additional charges beyond the adjusted licensing costs based on the modified vCPU count.
This positions AWS competitively against Azure for SQL Server workloads by offering more granular control over licensing costs, particularly important as organizations migrate legacy database workloads to the cloud.
Interested in CPU options? Check those out here.

30:20 Justin – “This is a little weird to me, because I thought this already existed.”

31:46 AWS Systems Manager Patch Manager launches security updates notification for Windows

AWS Systems Manager Patch Manager now includes an “AvailableSecurityUpdate” state that identifies Windows security patches available but not yet approved by patch baseline rules, helping prevent accidental exposure from delayed patch approvals.
The feature addresses a specific operational risk where administrators using ApprovalDelay with extended timeframes could unknowingly leave systems vulnerable, with instances marked as Non-Compliant by default when security updates are pending.
Available across all AWS Systems Manager regions with no additional charges beyond standard pricing, the feature integrates directly into existing patch baseline configurations through the console at https://console.aws.amazon.com/systems-manager/patch-manager.
Organizations can customize compliance reporting behavior to maintain existing workflows while gaining visibility into security patch availability across their Windows fleet, particularly useful for enterprises with complex patch approval processes.
The update provides a practical solution for balancing security requirements with operational stability, allowing teams to maintain patch deployment schedules while staying informed about critical security updates awaiting approval.

30:20 Ryan – “It sounds like just a quality of life improvement, but it’s something that should be so basic, but isn’t there, right? Which is like Windows patch management is cobbled together and not really managed well, and so you could have a patch available, but the only way to find out that it was available previously to this was to actually go ahead and patch it and then see if it did something. And so now, at least you have a signal on that; you can apply your patches in a way that’s not going to take down your entire service if a patch goes wrong. So this is very nice. I think for people using the Systems Manager patch management, they’re going to be very happy with this.”

35:26 Introducing CLI Agent Orchestrator: Transforming Developer CLI Tools into a Multi-Agent Powerhouse | AWS Open Source Blog

AWS introduces CLI Agent Orchestrator (CAO), an open source framework that enables multiple AI-powered CLI tools like Amazon Q CLI and Claude Code to work together as specialized agents under a supervisor agent, addressing limitations of single-agent approaches for complex enterprise development projects.
CAO uses hierarchical orchestration with tmux session isolation and Model Context Protocol servers to coordinate specialized agents – for example, orchestrating Architecture, Security, Performance, and Test agents simultaneously during mainframe modernization projects.
The framework supports three orchestration patterns (Handoff for synchronous transfers, Assign for parallel execution, Send Message for direct communication) plus scheduled runs using cron-like automation, with all processing occurring locally for security and privacy.
Currently supports Amazon Q Developer CLI and Claude Code with planned expansion to OpenAI Codex CLI, Gemini CLI, Qwen CLI, and Aiden – no pricing mentioned as it’s open source, available at github.com/awslabs/cli-agent-orchestrator.
Key use cases include multi-service architecture development, enterprise migrations requiring parallel implementation, comprehensive research workflows, and multi-stage quality assurance processes that benefit from coordinated specialist agents.
We definitely appreciate another tool in the Agent Orchestration world.

37:46 Amazon ECS now publishes AWS CloudTrail data events for insight into API activities

Amazon ECS now publishes CloudTrail data events for ECS Agent API activities, enabling detailed monitoring of container instance operations, including polling (ecs: Poll), telemetry sessions (ecs: StartTelemetrySession), and managed instance logging (ecs: PutSystemLogEvents).
Security and operations teams gain comprehensive audit trails to detect unusual access patterns, troubleshoot agent communication issues, and understand how container instance roles are utilized for compliance requirements.
The feature uses the new data event resource type AWS::ECS::ContainerInstance and is available for ECS on EC2 in all AWS regions, with ECS Managed Instances supported in select regions.
Standard CloudTrail data event charges apply – typically $0.10 per 100,000 events recorded, making this a cost-effective solution for organizations needing detailed container instance monitoring.
This addresses a previous visibility gap in ECS operations, as teams can now track agent-level activities that were previously opaque, improving debugging capabilities and security posture for containerized workloads.

39:33 Ryan – “This is definitely something I would use sparingly because the UCS API is agent API chatting. So this seems like it would be very expensive, very fast.”

GCP

41:22 G4 VMs powered by NVIDIA RTX 6000 Blackwell GPUs are GA | Google Cloud Blog

Google Cloud launches G4 VMs with NVIDIA RTX 6000 Blackwell GPUs, offering up to 9x throughput improvement over G2 instances and supporting workloads from AI inference to digital twin simulations with configurations of 1, 2, 4, or 8 GPUs.
The G4 VMs feature enhanced PCIe-based peer-to-peer data paths that deliver up to 168% throughput gains and 41% lower latency for multi-GPU workloads, addressing the bottleneck issues common in serving large generative AI models that exceed single GPU memory limits.
Each GPU provides 96GB of GDDR7 memory (up to 768GB total), native FP4 precision support, and Multi-Instance GPU capability that allows partitioning into 4 isolated instances, enabling efficient serving of models from under 30B to over 100B parameters.
NVIDIA Omniverse and Isaac Sim are now available on Google Cloud Marketplace as turnkey solutions for G4 VMs, enabling immediate deployment of industrial digital twin and robotics simulation applications with full integration across GKE, Vertex AI, Dataproc, and Cloud Run.
G4 VMs are available immediately with broader regional availability than previous GPU offerings, though specific pricing details were not provided in the announcement – customers should contact Google Cloud sales for cost information. (AKA $$$$.)

43:03 Dataproc 2.3 on Google Compute Engine | Google Cloud Blog

Dataproc 2.3 introduces a lightweight, FedRamp High-compliant image that contains only essential Spark and Hadoop components, reducing CVE exposure and meeting strict security requirements for organizations handling sensitive data.
Optional components like Flink, Hive WebHCat, and Ranger are now deployed on-demand during cluster creation rather than pre-packaged, keeping clusters lean by default while maintaining full functionality when needed.
Custom images allow pre-installation of required components to reduce cluster provisioning time while maintaining the security benefits of the lightweight base image.
The image supports multiple operating systems, including Debian 12, Ubuntu 22, and Rocky 9, with deployment as simple as specifying version 2.3 when creating clusters via gcloud CLI.
Google employs automated CVE scanning and patching combined with manual intervention for complex vulnerabilities to maintain compliance standards and security posture.

44:14 Ryan – “But on the contrary, like FedRAMP has such tight SLAs for vulnerability management that you don’t have to carry this risk or request an exception because of Google not patching Flink as fast as you would like them to. At least this puts the control at the end user, where they can say, well, I’m not going to use it.”

44:45 BigQuery Studio gets improved console interface | Google Cloud Blog

BigQuery Studio’s new interface introduces an expanded Explorer view that allows users to filter resources by project and type, with a dedicated search function that spans across all BigQuery resources within an organization – addressing the common pain point of navigating through large-scale data projects.
The Reference panel provides context-aware information about tables and schemas directly within the code editor, eliminating the need to switch between tabs or run exploratory queries just to check column names or data types – particularly useful for data analysts writing complex SQL queries.
Google has streamlined the workspace by moving job history to a dedicated tab accessible from the Explorer pane and removing the bottom panel clutter, while also allowing users to control tab behavior with double-click functionality to prevent unwanted tab replacements.
The update includes code generation capabilities where clicking on table elements in the Reference panel automatically inserts query snippets or field names into the editor, reducing manual typing errors and speeding up query development workflows.
This interface refresh targets data analysts, data engineers, and data scientists who need efficient navigation across multiple BigQuery projects and datasets – no pricing changes mentioned as this appears to be a UI update to the existing BigQuery Studio service.

46:00 Ryan – “Although I’m a little nervous about having all the BigQuery resources across an organization available on a single console, just because it sounds like a permissions nightmare.”

47:10 Manage your prompts using Vertex SDK | Google Cloud Blog

Google launches GA of Prompt Management in Vertex AI SDK, enabling developers to create, version, and manage prompts programmatically through Python code rather than tracking them in spreadsheets or text files.
The feature provides seamless integration between Vertex AI Studio’s visual interface for prompt design and the SDK for programmatic management, with prompts stored as centralized resources within Google Cloud projects for team collaboration.
Enterprise security features include Customer-Managed Encryption Keys (CMEK) and VPC Service Controls (VPCSC) support, addressing compliance requirements for organizations handling sensitive data in their AI applications.
Key use cases include teams building production generative AI applications that need version control, consistent prompt deployment across environments, and the ability to programmatically update prompts without manual code changes.
Pricing follows standard Vertex AI model usage rates with no additional charges for prompt management itself; documentation available at cloud.google.com/vertex-ai/generative-ai/docs/model-reference/prompt-classes.

47:43 Justin – “If your prompt has sensitive data in it, I have questions already.”

49:05 Gemini Code Assist in GitHub for Enterprises | Google Cloud Blog

Google launches Gemini Code Assist for GitHub Enterprise, bringing AI-powered code reviews to enterprise customers using GitHub Enterprise Cloud and on-premises GitHub Enterprise Server.
This addresses the bottleneck where 60.2% of organizations take over a day for code changes to reach production due to manual review processes.
The service provides organization-level controls, including centralized custom style guides and org-wide configuration settings, allowing platform teams to enforce coding standards automatically across all repositories.
Individual teams can still customize repo-level settings while maintaining organizational baselines.
Built under Google Cloud Terms of Service, the enterprise version ensures code prompts and model responses are stateless and not stored, with Google committing not to use customer data for model training without permission. This addresses enterprise security and compliance requirements for AI-assisted development.
Currently in public preview with access through the Google Cloud Console, the service includes a higher pull request quota than the individual developer tier. Google is developing additional features, including agentic loop capabilities for automated issue resolution and bug fixing.
This release complements the recently launched Code Review Gemini CLI Extension for terminal-based AI assistance and represents part of Google’s broader strategy to provide AI assistance across the entire software development lifecycle.
Pricing details are not specified in the announcement.

51:08 Ryan – “It’s just sort of the ability to sort of do organizational-wide things is super powerful for these tools, and I’m just sort of surprised that GitHub allows that. It seems like they would have to develop API hooks and externalize that.”

53:19 Vertex AI context caching | Google Cloud Blog

Vertex AI context caching reduces costs by 90% for repeated content in Gemini models by storing precomputed tokens – implicit caching happens automatically, while explicit caching gives developers control over what content to cache for predictable savings
The feature supports caching from 2,048 tokens up to Gemini 2.5 Pro’s 1 million token context window across all modalities (text, PDF, image, audio, video) with both global and regional endpoint support
Key use cases include document processing for financial analysis, customer support chatbots with detailed system instructions, codebase Q&A for development teams, and enterprise knowledge base queries
Implicit caching is enabled by default with no code changes required and clears within 24 hours, while explicit caching charges standard input token rates for initial caching, then a 90% discount on reuse, plus hourly storage fees based on TTL.
Integration with Provisioned Throughput ensures production workloads benefit from caching, and explicit caches support Customer Managed Encryption Keys (CMEK) for additional security compliance

54:18 Ryan – “This is awesome. If you have a workload where you’re gonna have very similar queries or prompts and have it return similar data, this is definitely nicer than having to regenerate that every time. They’ve been moving more and more towards this. And I like to see it sort of more at a platform level now, whereas you could sort of implement this – in a weird way – directly in a model, like in a notebook or something. This is more of a ‘turn it on and it works’.”

55:30 Cloud Armor named Strong Performer in Forrester WAVE, new features launched

Cloud Armor introduces hierarchical security policies (GA) that enable WAF and DDoS protection at the organization, folder, and project levels, allowing centralized security management across large GCP deployments with consistent policy enforcement.
Enhanced WAF inspection capability (preview) expands request body inspection from 8KB to 64KB for all preconfigured rules, improving detection of malicious content hidden in larger payloads while maintaining performance.
JA4 network fingerprinting support (GA) provides advanced SSL/TLS client identification beyond JA3, offering deeper behavioral insights for threat hunting and distinguishing legitimate traffic from malicious actors.
Organization-scoped address groups (GA) enable IP range list management across multiple security policies and products like Cloud Next Generation Firewall, reducing configuration complexity and duplicate rules.
Cloud Armor now protects Media CDN with Network Threat Intelligence and ASN blocking capabilities (GA), defending media assets at the network edge against known malicious IPs and traffic patterns.

56:59 Ryan – “These are some pretty advanced features for a cloud platform provided WAF. It’s pretty cool.”

Azure

58:44 Generally Available: Observed capacity metric in Azure Firewall

Azure Firewall’s new observed capacity metric provides real-time visibility into capacity unit utilization, helping administrators track actual scaling behavior versus provisioned capacity for better resource optimization and cost management.
This observability enhancement addresses a common blind spot where teams over-provision firewall capacity due to uncertainty about actual usage patterns, potentially reducing unnecessary Azure spending on unused capacity units.
The metric integrates with Azure Monitor and existing alerting systems, enabling proactive capacity planning and automated scaling decisions based on historical utilization trends rather than guesswork.
Target customers include enterprises with variable traffic patterns and managed service providers who need granular visibility into firewall performance across multiple client deployments to optimize resource allocation.
While pricing remains unchanged for Azure Firewall itself (starting at $1.25/hour plus $0.016/GB processed), the metric helps justify right-sizing decisions that could significantly impact monthly costs for organizations running multiple firewall instances.

Generally Available: Prescaling in Azure Firewall

Azure Firewall prescaling allows administrators to reserve capacity units in advance for predictable traffic spikes like holiday shopping seasons or product launches, eliminating the lag time typically associated with auto-scaling firewall resources.
This feature addresses a common pain point where Azure Firewall’s auto-scaling couldn’t respond quickly enough to sudden traffic surges, potentially causing performance degradation during critical business events.
Prescaling integrates with Azure’s existing capacity planning tools and can be configured through Azure Portal, PowerShell, or ARM templates, making it accessible for both manual and automated deployment scenarios.
Target customers include e-commerce platforms, streaming services, and any organization with predictable traffic patterns that require guaranteed firewall throughput during peak periods.
While specific pricing wasn’t detailed in the announcement, prescaling will likely follow Azure Firewall’s existing pricing model where customers pay for provisioned capacity units, with costs varying by region and SKU tier.
When you combine these two announcements, they’re pretty good!

1:01:35 Public Preview: Environmental sustainability features in Azure API Management

Azure API Management introduces carbon-aware capabilities that allow organizations to route API traffic and adjust policy behavior based on carbon intensity data, helping reduce the environmental impact of API infrastructure operations.
The feature enables developers to implement sustainability-focused policies such as throttling non-critical API calls during high carbon intensity periods or routing traffic to regions with cleaner energy grids.
This aligns with Microsoft’s broader carbon negative commitment by 2030 and provides enterprises with tools to measure and reduce the carbon footprint of their digital services at the API layer.
Target customers include organizations with ESG commitments and sustainability reporting requirements who need granular control over their cloud infrastructure’s environmental impact.
Pricing details are not yet available for the preview, but the feature integrates with existing API Management tiers and will likely follow consumption-based pricing models when generally available.

1:02:44 Matt – “So APIMs are one, stupidly expensive. If you have to be on the premier tier, it’s like $2,700 a month. And then if you want HA, you have to have two of them. So whatever they’re doing to the hood is stupidly expensive. If you ever had to deal with the SharePoint, they definitely use them because I’ve hit the same error codes as we provide to customers. On the second side, when you do scale them, you can scale them to be multi-region APIMs in the paired region concept, so in theory, what you can do based on this is route a cheaper or more environmentally efficient one, you could route to your paired region and then have the traffic coming that way.”

1:06:09 Unlock insights about your data using Azure Storage Discovery

Azure Storage Discovery is now generally available as a fully managed service that provides enterprise-wide visibility into data estates across Azure Blob Storage and Data Lake Storage, helping organizations optimize costs, ensure security compliance, and improve operational efficiency across multiple subscriptions and regions.
The service integrates Microsoft Copilot in Azure to enable natural language queries for storage insights, allowing non-technical users to ask questions like “Show me storage accounts with default access tier as Hot above 1TiB with least transactions” and receive actionable visualizations without coding skills. Because a non-technical person is asking this question. In the ever-wise words of Marcia Brady, “Sure, Jan.”
Key capabilities include 18-month data retention for trend analysis, insights across capacity, activity, security configurations, and errors, with deployment taking less than 24 hours to generate initial insights from 15 days of historical data.
Pricing includes a free tier with basic capacity and configuration insights retained for 15 days, while the standard plan adds advanced activity, error, and security insights with 18-month retention – specific pricing varies by region at azure.microsoft.com/pricing/details/azure-storage-discovery.
Target use cases include identifying cost optimization opportunities through access tier analysis, ensuring security best practices by highlighting accounts still using shared access keys, and managing data redundancy requirements across global storage estates.

1:08:35 Ryan – “Well, I’ll tell you when I was looking for this report, I had a lot of natural language – and I was shouting it at my computer.”

1:09:52 Sora 2 in Azure AI Foundry: Create videos with responsible AI | Microsoft Azure Blog

Azure AI Foundry now offers OpenAI’s Sora 2 video generation model in public preview, enabling developers to create videos from text, images, and existing video inputs with synchronized audio in multiple languages.
The platform provides a unified environment combining Sora 2 with other generative models like GPT-image-1 and Black Forest Lab’s Flux 1.1, all backed by Azure’s enterprise security and content filtering for both inputs and outputs.
Key capabilities include realistic physics simulation, detailed camera control, and creative features for marketers, retailers, educators, and creative directors to rapidly prototype and produce video content within existing business workflows.
Sora 2 is currently available via API through Standard Global deployment in Azure AI Foundry, with pricing details available on the Azure AI Foundry Models page.
Microsoft positions this as part of their responsible AI approach, embedding safety controls and compliance frameworks to help organizations innovate while maintaining governance over generated content.
We’re not big fans of this one.

1:10:12 Grok 4 is now available in Microsoft Azure AI Foundry | Microsoft Azure Blog

Microsoft brings xAI’s Grok 4 model to Azure AI Foundry, featuring a 128K-token context window, native tool use, and integrated web search capabilities. The model emphasizes first-principles reasoning with a “think mode” that breaks down complex problems step-by-step, particularly excelling at math, science, and logic puzzles.
Grok 4’s extended context window allows processing of entire code repositories, lengthy research papers, or hundreds of pages of documents in a single query. This eliminates the need to manually chunk large inputs and enables comprehensive analysis across massive datasets without losing context.
Azure AI Content Safety is enabled by default for Grok 4, addressing enterprise concerns about responsible AI deployment. Microsoft and xAI conducted extensive safety testing and compliance checks over the past month to ensure business-ready protection layers.
Pricing starts at $2 per million input tokens and $10 per million output tokens for Grok 4, with faster variants available at lower costs.
The family includes Grok 4 Fast Reasoning for analytical tasks, Fast Non-Reasoning for lightweight operations, and Grok Code Fast 1 specifically for programming workflows.
The model’s real-time data integration allows it to retrieve and incorporate external information beyond its training data, functioning as an autonomous research assistant. This capability is particularly valuable for tasks requiring current information like market analysis or regulatory updates.

1:11:04 Generally Available: Enhanced cloning and Public IP retention scripts for Azure Application Gateway migration

Azure releases PowerShell scripts to help customers migrate from Application Gateway V1 to V2 before the April 2026 retirement deadline, addressing a critical infrastructure transition need.
The enhanced cloning script preserves configurations during migration while the Public IP retention script ensures customers can maintain their existing IP addresses, minimizing disruption to production workloads.
This migration tooling targets enterprises running legacy Application Gateway Standard or WAF SKUs who need to upgrade to Standard_V2 or WAF_V2 for continued support and access to newer features.
The scripts automate what would otherwise be a complex manual migration process, reducing the risk of configuration errors and downtime during the transition.
Customers should begin planning migrations now as the 2026 deadline approaches, with these scripts providing a standardized path forward for maintaining application delivery infrastructure.
You know would be even easier than PowerShell? How about just doing it for them? Too easy?
(Listener alert: This time it’s a Matt rant.)

Oracle

1:14:59 Oracle Expands AI Agent Studio for Fusion Applications with New Marketplace, LLMs, and Vast Partner Network

Oracle AI Agent Studio expands with new marketplace LLMs and partner integrations for Fusion Applications, allowing customers to build AI agents using models from Anthropic, Cohere, Meta, and others alongside Oracle’s own models.
The platform enables the creation of AI agents that can automate tasks across Oracle Fusion Cloud Applications, including ERP, HCM, and CX, with pre-built templates and low-code development tools for business users.
Oracle is partnering with major consulting firms like Accenture, Deloitte, and Infosys to help customers implement AI agents, though this likely means significant professional services costs for most deployments.
The AI agents can handle tasks like expense report processing, supplier onboarding, and customer service inquiries, with Oracle claiming reduced manual work by up to 50% in some use cases.
Pricing details remain unclear, but the service requires Oracle Fusion Applications subscriptions and likely additional fees for LLM usage and agent deployment based on Oracle’s typical pricing model.

1:15:45 Ryan – “They’re partnering with these giant firms that will come in with armies of engineers who will build you a thing – and hopefully document it before running away.”

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod

[00:00:00] Speaker A: Foreign. [00:00:08] Speaker B: Where the forecast is always cloudy. We talk weekly about all things aws, GCP and Azure. [00:00:14] Speaker C: We are your hosts, Justin, Jonathan, Ryan and Matthew. [00:00:18] Speaker D: Episode 327 recorded for October 21, 2025. AWS finally admits Kubernetes is hard, but makes robots do it instead. Good evening, Matt and Ryan. How you guys doing? [00:00:30] Speaker C: Doing good, dude. [00:00:31] Speaker A: How are you? [00:00:33] Speaker D: Yes, it's been an interesting week in the cloud as Amazon was on fire yesterday. Yeah, the entire Internet was down. At least it felt like the whole Internet was down, except for me because I was on gcp, so. [00:00:44] Speaker A: And I was an Asher for once. [00:00:46] Speaker D: Small victories. Yeah. [00:00:50] Speaker C: Does it make the rest of the year worth it? Probably not, but at least we get. [00:00:54] Speaker A: This one for 18 hours, 16 hours or so. It was good times. [00:00:59] Speaker D: Yeah, we definitely will talk about that today, but let's hit some follow up first and then we'll get into the fun of Amazon's Hug Ops scenario yesterday. First of all, last week Ryan and I talked about the Glacier deprecation, or actually a bunch of deprecations that Amazon had announced. And then right after we recorded the show the next morning, I got an email clarifying something about Glacier. So we were a little bit surprised Glacier was in there and we had talked a little bit about potentially this is just the Glacier service standalone. And that's the email basically confirmed standalone. Amazon Glacier Service will stop accepting new customers as of December 15, but the S3 Glacier storage class, which includes instant retrieval, flexible retrieval, deep archive, are completely unaffected and continue normally and are basically part of the S3 service as a whole at this point. Existing Glacier customers can use it forever. Apparently there is no force migration required, at least not announced yet. I mean, forever is only until the next press release. And basically, you know, Amazon's essentially consolidating around S3 as a unified storage platform and standalone service, will be here getting no bug fixes. So yeah, enjoy. You will actually potentially get some cost savings if you move to the S3 glacier service. It is slightly cheaper than the Glacier standalone service. I was looking at some of the pricing charts and so then that made me email Synology, who I use to back up. I use Glacier to back up my Synology at my house. And I'm like, hey, are you going to do something about this? And they're like, we'll get back to you. [00:02:19] Speaker C: Oh, wow. [00:02:21] Speaker D: So I'm like, well that's good, that's a good sign. But yeah. So there you go, Ryan, clarification and kind of what we assume, but I just thought we should officially say that's what they said. [00:02:29] Speaker C: Yeah, that's cool. [00:02:33] Speaker D: All right. General news. F5 disclosed that nation state hackers maintained persistent access to their internal systems over the summer of 2024, stealing portions of the big IP source code and vulnerability details before containment in August. The breach compromised product development and engineering systems, but did not affect customer CRM data, financial systems, or F5 software supply chain. According to independent security audits, F5 has released security patches for Big IP, F5Os and Big IP Next products and is providing threat hunting guides to help customers monitor for suspicious activity. The first publicly disclosed breach of F5 control systems. Notable given that F5 handles traffic for 80% of the Fortune Global 500 companies through their load balancing and security security services. This incident will highlight your supply chain security concerns, attackers targeted source code and vulnerability information rather than customer data. Potentially looking for ways to break into your F5 product. So a little concerning on this one, mostly as F5 is everywhere and as well as they own NGINX, a couple other things. And if they're in, if they're in F5 company who has a lot of money for security in their source code, make sure you're also protecting your source code and your company's as well. [00:03:40] Speaker C: Yeah, I mean, the thing that scares me the most is the persistent access over, you know, a summer, which says to me like months. That's kind of spooky to me. And like, why wasn't that detected? And that's a long time to mess around inside the development environment. And you know, how are they verifying that nothing was sort of breached or inserted into the CSE pipe pipelines? [00:04:06] Speaker A: I mean, it also is. This is from 2024, this happened. So they're announcing it over a year later, which also feels kind of scary. [00:04:17] Speaker C: I mean, if. Especially if they detected it right now. [00:04:20] Speaker D: And so in the article it does talk about that they were asked by US entities not to disclose until now because of the risk to their business. So I. They were asked by Department of Justice to do that. [00:04:32] Speaker C: Okay, well, that's understandable. [00:04:34] Speaker A: Oh, I read that wrong when I read the article. Got it. Yeah, I read that Department Justin was asking them to disclose it now. I must have inverted it in my head. [00:04:44] Speaker C: Yeah, that makes sense. [00:04:45] Speaker D: They now are allowed to announce it. They were not allowed to announce it last year when they confirmed it. So definitely very concerning. [00:04:52] Speaker A: For sure. [00:04:56] Speaker D: All right. AI is how ML makes money. This week, Claude Code finally gets a web version. Yay. But it's the new sandboxing feature that really matters, says Ars Technica. The Web and Mobile interfaces for cloud code the CLI based AI coding system with a web version supporting direct GitHub repository access and the ability to process general instructions like add real time inventory tracking to the dashboard. The web interface introduces multi session support allowing developers to run and switch between multiple coding sessions simultaneously through a left side panel possibility to provide min task corrections without canceling and restarting and a new sandbox runtime has been implemented to improve security and reduce friction. Moving away from the previous approach where cloud code required permissions for most changes and steps during execution. The mobile version is currently limited to iOS and is in an earlier development stage compared to the web interface, indicating a phase rollout approach positions Cloud code is a more accessible alternative to traditional CI only AI coding tools and I'm looking forward to playing with this, although I've been busy this week dealing with other things so hopefully by next week I'll get to play with this. [00:05:53] Speaker C: Yeah, I haven't had a chance to play with the web version, but I am interested in it just because it is I found the terminal interface like limiting, but I also feel like the a lot of the value is in that local sort of execution and not in the sandbox because that's a lot of the tasks I do are internal and require you know, access to either like company resources or private networks or that kind of thing where you're not going to get that from publicly hosted sandbox environment. [00:06:21] Speaker A: Yeah, I was gonna say I find it useful also like everything I do I always say okay, build it in a Docker container. So I know that will run elsewhere too. So once you move it up to somewhere else, I don't know and I haven't played with it that much. I did open it up coincidentally by accident because I didn't realize it was new where they like I just don't think you can do that whole contest containerized workflow process. [00:06:47] Speaker C: I mean it seems like it's it's still interfacing with your code repository and and therefore it could be part of your CI C pipelines that build containers. [00:06:56] Speaker A: Just not. Yeah but if I'm testing something locally, like if I'm building a web app or something like that, which is what I've been on a roll recently. [00:07:03] Speaker D: A lot of people are moving away from local development anyway so they they need on server or they need other things. So maybe this solves those issues. But there's also been times where I've been, you know, because I use cloud code quite a bit and sometimes, you know, someone will call me and be like, hey, this is broken. And I'm like. Or like, you guys tell me Bolt's not doing something right, and I'm like, oh crap, I'm not anywhere near my laptop where I know exactly what I could fix with Claude. And you guys would just write the code yourselves. But, you know, has a whole different conversation. But, you know, I could just hop into the web interface potentially and do a quick Bolt fix if I needed to. For our show Note bot, for example. [00:07:36] Speaker C: Yeah, no, that is, I mean, that's a pretty powerful, you know, sort of option for that, especially, you know, like being able to like bug fix basically in public. You think about on call and that kind of certain things like that where you're less. Are you in, you know, working directly in this dev environment with all the special access and all the special tools and you're just doing natural language sort of prompts to. To fix things. [00:08:00] Speaker A: I mean, it also makes it more accessible to, you know, let's say junior person or potentially someone in your support. Org. Hey, there's typos. We've typo'd, you know, the word developer in 27 places and we have three tickets for it. You know, can a junior person or someone that doesn't really have full access do and handle it in this way versus having to handle local development of it? [00:08:27] Speaker D: Containerization Assist is automating the tedious process of creating Docker files and Kubernetes manifests eliminating manual errors that plague developers during the containerization process. Built on AKS's draft proven foundation, this open source tool goes beyond basic AI coding assistance by providing a complete containerization platform rather than just code suggestions. The tool is addressing a critical pain point where developers waste hours writing boilerplate container configurations and debugging deployment issues caused by manual mistakes. I mean, can I take a quick side note here? That's because you guys all chose Kubernetes. If you've chosen a platform as a service, you wouldn't have had to do any of that. That was a platform as a service supposed to do for us with all of you. [00:09:04] Speaker C: No, no, no. [00:09:04] Speaker D: Kubernetes is the future. We don't need any lambda or anything else like. No, no, this is on your own. This is on you. This is on all of you. As an open source, MCP server integrates seamlessly with existing development workflows while leveraging Microsoft's containerization expertise from Azure, Kubernetes service extratease is a stretch. The launch signals Microsoft's commitment to simplifying kubernetes adoption by removing the steep learning curve associated with container orchestration and manifest creation. Or you could just use a pass. [00:09:29] Speaker C: Yeah. Yeah. Enjoy your 17 nested layers of YAML documents. [00:09:36] Speaker A: The piece I did like about this is it integrated in as an optional feature kind of the trivia and the security thing. So it's not just setting it up but they integrated in you know, the full or you know, not the full but you know the next steps of security code scanning. So it's not Microsoft saying hey it's standard. You don't get security at premium do like which this isn't a service, it's a you know, git repository. But you know they are building security in. So hopefully as things grow this becomes more of a trend for Microsoft that you don't pay for security. But you know what a. Maybe it's just a pipe dream of mine. [00:10:13] Speaker C: I mean trivia is open source so it's like they're setting it up for you but it was already free. [00:10:19] Speaker A: Yeah, no, no, I know but like I guess it's more showing that they're thinking about a day one. [00:10:24] Speaker C: Yeah, no, and I do think it's cool. I do think having having that sort of built into the pipeline by default is neat. You know, seeing that more and more you see it directly on Docker Hub for example and it's just there's so many vulnerabilities and it. I didn't realize this coming from sort of an OS and server background like vulnerability management to me is easy. So it's just rebuild the thing. Figuring out which layer vulnerability is in like kind of came second nature. But I think if you don't come from that background it sort of feels a little bit like black magic and you know, and having zero idea on how to remediate whatever vulnerability it is. [00:10:58] Speaker D: So it's. [00:10:59] Speaker C: I do think that having this in there is. Is better and hopefully educates. [00:11:05] Speaker A: I just tell Claude once a week to update all my versions and fix any bugs it has. [00:11:11] Speaker D: Fix all bugs. [00:11:13] Speaker C: Cron job perfect. [00:11:15] Speaker D: See I can go wrong. Yeah, I do like that. You know looking at this repo they actually have, the other clouds are actually for citizens. There's actually, you know, here's how to authenticate against EKS and GKE and it's all actually in here where a lot of companies will say oh this is multi cloud. And then you actually go look at it like yeah, battery's not included for multi cloud. [00:11:36] Speaker C: Do yourself use your one cloud. Yeah, yeah it happens to Be ours. [00:11:40] Speaker D: You know, it's nice that they're trying something, so that's nice. All right. Moving on to Clouds Tools, Harness had a blog post that just mostly annoyed me, which is why we're talking about it. Basically, they said the title of this was Infrastructure as Code is great, but have you heard of Infrastructure as Code Management? Wow. Infrastructure as Code Management, or IACM for short, supposedly extends traditional infrastructure as code by adding lifecycle management capabilities, including state management, policy enforcement, and drift detection to handle the complexity of infrastructure at scale. Key features apparently include centralized state file manager with version control module and provider industries for usable components and automated policy enforcement to assure compliance without slowing down teams. The platform integrates directly with your CI CD workflow with visual PR insights showing cost estimates and infrastructure changes. And Infrastructure as Code Manager addresses critical pain points like config drift, secret exposure, and state files and resource conflicts with multiple teams working on the same infrastructure simultaneously. It also supports Open Tofu and Terraform with features like variable sets, workplace templates, and default pipelines. So let me boil this down for you. We created our own Terraform Enterprise or Terraform Cloud, but we can't use that name because it's copyrighted. So we're going to try to create a new thing and pretend we invented this and then try to sell it to you as our new Terraform or Open Tofu replacement for your management tier. Thanks. Thanks for that, Harness. I appreciate it. [00:13:08] Speaker C: Just, it was funny because I read this article and I got just as frustrated and I was really worried that this was something that was put in because it was like, supposedly like this really cool new thing. And I'm like, wait, no, this is just all how you're supposed to. This is what infrastructure's code has been forever. You just didn't manage it correctly now. So you're trying to invent a new thing to sell? Like, no. [00:13:28] Speaker D: Thank you. [00:13:29] Speaker A: Welcome to the sales and marketing team. [00:13:31] Speaker D: I mean, I. I'm sort of offended that you. You would have assumed that I would have thought this was a new thing. Like, I don't know. I don't necessarily. [00:13:38] Speaker C: I wasn't calling out names. You know, it's mostly blaming the AI. [00:13:41] Speaker A: Bot now, so I thought we were blaming Jonathan. [00:13:44] Speaker D: I did submit this article, so Jonathan's not here. You can blame him. Yeah, no, I. I mostly submitted this because I was super annoyed about it. Yeah, no, it's. [00:13:53] Speaker C: It is. It's just super frustrating. [00:13:55] Speaker D: These are all things that have existed. [00:13:56] Speaker C: And there's been tools and. And ways to do this fore. Dedicated platform. There's definitely ways to integrate a lot all of these things into your CIC pipelines. But yeah, definitely things that you need to address. You know, you do need drift detection, you do need to have, have state management in a way where multiple teams can interact with it, you know, separately and you know, having a place for reusable components and standards. But none of like some of this is like decades old. Thanks, Harness. [00:14:26] Speaker D: Yeah, well, Matt said that we had to create a new section today called HugOps Corner to Hug our friendly ops fellows at Amazon who had a really bad day on Monday here this week, October 20th. So basically for those of you who were not trying to use the Internet on Monday, aws used East 1 experienced a major outage, which is what it's known for, starting with midnight Pacific on Monday, caused by DNS resolutions through DynamoDB that prevented proper address lookup for database services impacting thousands of applications. Facebook, Snapchat, Coinbase, ChatGPT and Amazon's own services. I mean it's always DNS number one. And then, you know, this isn't the official, we have not seen an official rca. So this is speculative, but I think they basically said it was DNS in their status updates, but they haven't given us the root cause yet. So everything we're talking about is speculative today. But you know, first of all, the amount of press that this outage has gotten is crazy to me. [00:15:23] Speaker A: Insane. [00:15:24] Speaker D: It's like everyone forgot about when US East 1 used to fall over like every other month, which was like, you know, I mean, I realize it's had a really long streak of not having those problems for a while, which is appreciated. But yeah, it's like this isn't a new thing. US East 1 was called US Tire Fire 1 for a reason for a long time by a lot of us. And we always joked that, you know, friends don't let friends start an Amazon project in US east one for this exact reason. So, you know, they, it's a sort of interesting. Now there's a couple of things that I would point out and you know, we, we link to a bunch of articles, we will go through all of them. But you know, there are some things I noticed with this particular one. Number one, it took a long time for them to seem to get this thing on a positive direction. You know, I was, I happened to be awake because I couldn't sleep and so I saw people starting to say, hey, Amazon's having a problem. And I kind of ignored it, went to bed and nothing. My page didn't go off. So it was a win win. But, you know, it took a long time for them to resolve it basically until, you know, it was like hours before they had the DNS restored. Then they had, you know, all kinds of thundering herd problems and all kinds of the traditional issues you see with Amazon when they're trying to recover, you know, thousands of people's sites. And you know, the one thing that kind of struck me was, well, is it because they're out of practice like US East One hasn't, hasn't crapped out in a while and so they lost some muscle memory or is this really a byproduct of the fact that, you know, Amazon's talent retention has been terrible as of late? You know, between layoffs they've been doing or forcing RTO and you know, all the amount of people who have left Amazon in the last few years. Like, is this a reflection of the reality that, you know, when you let smart people go, sometimes things don't end up as well as you'd like them to be. And is this kind of going to be the new norm where we're just not quite seeing the level, same level of caliber from AWS as we were used to in the past? And is this a wake up call in some ways to Amazon that maybe what you're doing isn't the right thing? [00:17:19] Speaker C: Yeah, I mean, I'm not one for heroes and so like, I, I hope that it wasn't, you know, the earlier resolutions weren't just because people were kind of throwing themselves on, on the tracks getting these things resolved. But the, if it's a DNS resolution issue that's causing sort of a global outage, you know, like that's not exactly straightforward. It's not just, you know, a bug, you know, a function returning the wrong value or, or that you're, you're looking at global propagation. You're looking at all whole, you know, clients in different places resolving different things and some, you know, at the base, base parts of the Internet for functionality. And so it does take, you know, a pretty experienced engineer to sort of have that in their heads conceptually in order to troubleshoot. So, you know, like, I wonder if that's really the cause. It's, you know, where they're, where they're, you know, not able to recover as much. But I also feel like cloud computing has come a long way and the, you know, the, the impact was very widely felt because a lot of people, a lot more people are using AWS as, as their hosting provider than I think have been in the past. A little bit of everything, I think. [00:18:30] Speaker A: Yeah, I mean I definitely think it's a little bit of everything. I think it was interesting where they kind of made it sound like in the initial cloud status that they were kind of thought they were good at one point and then it kind of went south again on them because they were like, we're seeing recovery, we're seeing recovery, we're hold on, we're not. So it almost feels like there was multiple internal issues that kind of happened and they fixed one and you know, and thunder heard and other issues kind of caused other services to fail over, other managed platforms internally to fail over. And I don't think alas, if you're under an NDA, so we probably won't talk about it, we'll get that level of detail publicly. But they kind of felt like multiple issues there that kind of caused the global problem. I think the larger scale issue is that they still run a lot of these core services out of US East 1 and so many of them are based there. Like IAM is still based there, Cloudfront is based there. Can they. And I know some of them, they can't, you know, but can they start to figure out how to move those to other regions? So if a single region, aka US one goes down 37, things don't else go down that are global services because of that, you know, and that's I think kind of the bigger issue. And you know, they've tried to mitigate it some with was it over the summer or last year? Right. Since the time is skewed in life right now where they recommended and they adjusted the default for DNS to resolve for IM into each or for I was IM or STS into each region and they still left the default one to point there. So like they're trying to slowly do that but you know, they're literally flying 17 jumbo jets, you know, at once trying to keep, you know, change the engines on it without landing. You know, it's not like they can have a maintenance window of US east ones. They were going to take us down for two hours. The world would revolt, you know. So like I think there's probably multiple issues under the hood and I think that they are tracking in the right direction. But you know, this will probably help expedite some of the issues. Just like when was it 17 or 16 when S3 went down and took down the whole area again, you know. So core services kind of have big impacts. [00:20:50] Speaker D: Yep. I mean it'd be definitely interesting to See, you know, if this is a beginning of a trend of a lack of stability in the Amazon ecosystem, again, I mean, again, if you followed best cloud practices, you were multi region, you know, you, you designed for failure, you were not really that impacted by this. I mean, there definitely was things, you know, logging to the console might have been problematic for you, but you know, if you're, you had servers running in other regions, you know, traffic was routed there as it should, like, you know, things work normally. And again, there's still a lot of companies who don't invest in Dr. Or in getting outside of US east one in a serious way and then they're beholden to these issues. I also say, you know, welcome to October. We're a month and a few days and change away from reinvent. Things start getting rolled out at this time of year in the background. And so, you know, there's a lot more things getting pushed out that are new or had potentially have issues or bugs that could take down the control plane at Amazon. So, you know, it'd be interesting to see if we see a trend in a, in a negative direction or is this just a fluke and, you know, big things happen sometimes and you deal with it and you recover and they take lessons learned and then everything goes back on like normal. Or is this really the beginning of a trend that won't be great? I don't. My feeling is Amazon has a lot smart people still, even though they've let a lot go or a lot of them are left, they have a lot of systems that help control and keep their systems working. I imagine their RCA will be hopefully enlightening and will tell us what happened and what they're doing to fix it. And we'll feel better about that, but we'll keep you posted as we, we wait to hear 100% from them what the, what's going on. But just it's so funny how the media just jumped all over this one really did. [00:22:26] Speaker A: Well, it was just such a widespread outage, you know, like, I mean, every. [00:22:30] Speaker D: US61 outage has been widespread. It's just that we haven't had one in a while. [00:22:34] Speaker A: Right. [00:22:35] Speaker D: And it was a slow news day, I guess, or they needed a distraction from, you know, Trump's AI videos. Like, I don't know what was going on, but like it was, every news media jumped on it because I had people pinging me like, hey, are you impacted by Amazon outage day? I'm like, how'd you do it? Like, oh, it's on the top of the Wall Street Journal. Like, okay, yeah, nope, I wasn't. Thank goodness. Prior lives I would have been. But you know, luckily my Amazon ecosystem, the cloud pod, was not impacted because we run in US west too. So, you know, you're welcome. [00:23:03] Speaker A: Oh come on. You don't want to try us West One and just pay the extra 10%? [00:23:06] Speaker D: No, I don't. I don't. It also doesn't have any of the services that we actually use to run a website, so. Yeah, minor problems. U.S. 1. [00:23:15] Speaker A: Details. [00:23:16] Speaker D: Yeah, details. All right, well, exiting Hugops for AWS right into AWS. Let's talk about the new things that they launched that might have broken it. We'll find out. First up, apparently they did not use this tool to help prevent the outage. Or maybe they did, I don't know. EC2 capacity manager is a new service that provides a single dashboard to monitor and manage EC2 capacity across all accounts and regions, eliminating the need to collect Data from multiple AO services like Cost and Usage Reports, CloudWatch and EC2 APIs and is available to you at no additional costs in all commercial regions. The service aggregates capacity data with hourly refresh rates for on demand instances, spot instances and capacity reservations displaying utilization metrics by vcpu, instance count or estimated cost based on published on demand rates. Key features include automated identification of underutilized capacity reservations with specific utilization percentages by instance type and availability zone, plus direct modification capabilities for ODCRs within the same account. Data exports to S3, extended analytics beyond the 90 day console retention period, enabling long term capacity trend analysis and integration with existing BI tools or custom reporting systems and organizations can enable this for cross account visibility through of course AWS organizations helping identify observation opportunities like redistributing reservations between development accounts showing 30% utilization and production accounts exceeding the 95% utilization for example. [00:24:35] Speaker C: Yeah, I mean, hooray. You know these, these types of features where they're sort of trying to catch up from the, the separation between accounts and, and how companies have adopted, you know, a multi account strategy for a number of reasons and it made stuff like capacity management really difficult. Um, and it's kind of, you know, like these little islands of data and it, it's been, you know, a challenge to get it all sort of parsed in one place and where you can derive actual metrics out of it. So this is kind of nice to have it built in and just have it sort of be plug and play. You just turn it on. Especially when it's at no cost. I love it. [00:25:17] Speaker A: No cost. With exports, you know and which is which is pretty nice too. And day one features for organizations I feel like is like you've said the big one because I know I've written several scripts for myself that go cross accounts or an Azure Cross tenant cross subscription and kind of pull that same data too. [00:25:37] Speaker C: Yep. [00:25:39] Speaker D: Next up, EKS Auto Mode now supports EC2 on demand capacity reservations and capacity blocks for ML, allowing customers to target pre purchase capacity for your AI and ML workloads requiring guaranteed access access to specialized instances like the P5s. The addresses the challenge of GPU availability for trained jobs without over provisioning your infrastructure. There's new networking capabilities included separate POD subnets for isolating infrastructure and application traffic, explicit public IP control for enterprise security compliance, and forward proxy support with custom certificate bundles. These features enable integration with existing enterprise network architectures without complex CNI customizations. Thank you. Complete AWS KMS encryption now covers both ephemeral storage and root volumes using customer managed keys, addressing security audit findings that previously flagged unencrypted storage Performance improvements include multi threaded node filtering and intelligent capacity management that can automatically relax instance diversity constraints during capacity usage. EKS Auto mode is available for new clusters or can be enabled on existing clusters running Kubernetes 1.29 plus migration guides available for teams moving from managed node groups. Carpenter or Fargate pricing follows standard eks pricing at $0.10 per cluster per hour plus the EC2 instance cost. So nice to see those new features coming to EKS Auto mode this week. [00:26:51] Speaker C: Yeah, I mean this just highlights how terrible it was before, like managing your own, you know, compute layer, custom certificate bundles, like that's it's been a while since, you know, I've had to do plumbing at that layer, especially for, you know, eks, which is supposed to be a managed Kubernetes platform. So this is rough. So I'm glad they fixed this. I'm sure customers were clamoring for this. Torches and pitchforks, I imagine. [00:27:17] Speaker A: There's definitely times that, you know, I'm like are people going to know how all these things work under the hood when you have to go debug something that's gone wrong, you know. So I always kind of wonder that as we build these higher, higher level services, do people forget that? But it's again, it cleans up the toil and how to, you know, solve the larger scale problems too. [00:27:40] Speaker C: I mean if I never had to learn how to create a custom certificate bundle and distribute it to wide fleet of compute nodes, I'd be okay with that. Yeah, really no problems whatsoever. [00:27:50] Speaker A: I mean, it also goes back to like I was talking to somebody earlier today because I was generating an SSL cert because Azure doesn't have an SSL for app gateways because I have to get at least one to get in the app gateways per podcast and above your quota. I was like talking to them about how I was like, oh yeah, remember when you said upload it to IM and then attach your SSL cert? And now you just have acm. So, you know, I guess you're right. It doesn't really matter if you don't ever have to deal with it. [00:28:20] Speaker C: There's always going to be new problems. You know, that's the way I look at it. So I don't want to keep fixing the old ones. [00:28:26] Speaker D: I mean, I'd rather have those new problems. Or you know what you could do, Ryan? You could use Platform as a service. Yes, yes, I could. [00:28:35] Speaker C: This is Platform as a Service, which is the rough. [00:28:37] Speaker D: I mean, sort of. It's still Kubernetes. You're still defining pods. [00:28:40] Speaker C: Still Kubernetes. [00:28:41] Speaker D: Yeah, you're still defining pods and services. So yeah. Let's move on to Amazon. EC2 now supports optimizing your CPU for license included instances. This allows you, your customers, reduce VCPU counts and disable hyperthreading on Windows Server and SQL Server license included instances, enabling up to 50% savings on VCPU based licenses in cost while maintaining full memory and IOPS performance. This feature targets database workloads that need high memory and IOPS but fewer VCPUs. For example, an R7i8x large instance can be used from 32 to 16 VCPUs while keeping its 256 gigs of memory and 40,000 IOPS. The CPU optimization extends EC2's existing optimized CPUs feature to license included instances, addressing a common pain point where customers overpay from Microsoft licensing due to fixed VCPU counts. Available to you now in all commercial regions and Gov cloud regions, no additional charges. So this is a little weird to me because I thought this already existed. [00:29:36] Speaker C: It's because they're, they're announcing it wrong. Like, this is custom shapes. It's just they're not calling it custom shapes. [00:29:43] Speaker D: Right. [00:29:44] Speaker C: So you used to have to, you know, basically, oh, I know you might be paying for it. You know, I don't know if you're paying for the entire R7 IX large and all the compute that goes along with it, or do you actually get some sort of price change when you reduce the VCPUs running on it. I don't know, but that's always. It's been a challenge because you could always do it at the Windows OS layer, but I don't know if you could do it at like the sort. [00:30:07] Speaker A: Of console you can turn off Hyper threading at the Windows OS layer. I thought that was like a BIOS level. [00:30:12] Speaker C: No, but you can. You in a like the Microsoft SQL Server for instance, you can configure the the CPUs that it runs on so that you're not paying for all of them. But there'd still be accessible by the host os. [00:30:25] Speaker A: And then just go back to Justin Kameron, just use a managed service. Go use rds. I understand there are many issues with RDS and Microsoft SQL, but just use rds. I think we have a new theme of the show that probably should have made a title around select your managed service. [00:30:44] Speaker C: Yeah, I mean it won't be any surprise to our listeners that we're a big fan of managed service and not doing any real work. [00:30:55] Speaker D: Speaking of managed services, let's talk about patching and managed services too. [00:31:00] Speaker C: Too soon for me. This is my day job right now. [00:31:03] Speaker A: I know. [00:31:04] Speaker D: AWS Systems Manager Patch Manager now includes an available security update state that identifies Windows security patches available but not yet approved by patch baseline rules, helping prevent accidental slow exposure from delayed patch approvals. This feature addresses a specific operational risk where administrators using approval delay with extended time frames could unknowingly leave systems vulnerable with instances marked as non compliant by default when security updates are pending available from across all AWS Systems Manager regions with no additional charges beyond standard pricing Feature integrates directly into existing patch baseline configurations. So yeah, this makes sense because if you are doing delayed, you still maybe want your security patches sooner and you don't want them to be be out of compliance for security reasons. So that's nice. Quality of life improvement. [00:31:47] Speaker C: Well, I mean it's, it, it sounds like just a quality of life improvement, but it's something that should be so basic but isn't there, right? Which is like Windows patch management is. Is cobbled together and not really managed well. And so like you could have a patch available but the only way to find out that it was available previous to this was to actually run like go ahead and patch it and then see if it did something. And so now at least you have signal on on on that you can, you know, apply your patches in a way that's not going to take down your entire service if a patch goes wrong. So this is very nice. I think for people using the systems manager patch management they're going to be very happy with this. Still really angry at Windows patching in general because that's just the state of life. [00:32:33] Speaker A: Do like that Windows patching is. [00:32:36] Speaker C: If. [00:32:36] Speaker A: Once if you can get it wrangled and you spend the time to build a good process around becomes easier but it's never easy because you know it's perpetually having to deal with new changes between patch Tuesday the one off updates that they do out there. Everything else it's just a pain in the butt and you have to manage. [00:32:58] Speaker C: So much like you have to manage infrastructure in order to build a good patch management system with Windows. Like you can't like trying to do it just all with public sources and doing all those things like you I. There are patches that I know are applicable to the OS that are just not available to me unless I stand up a Windows Update service and manage that for the end of time. [00:33:18] Speaker A: I thought they deprecated wsus. [00:33:20] Speaker C: I thought they did too. I think they keep saying that but yet it's the only thing that works still. [00:33:28] Speaker A: Yeah it is. Here's an R. Yeah. [00:33:30] Speaker C: And it doesn't have API support and it never will because they've deprecated it. [00:33:34] Speaker A: Windows WSUS will no longer be developed starting in Windows Server 2025 encourages businesses to adopt cloud based solutions. Client server updates such as Windows autopatch Intune and Azure Update Manager which don't get me started on Azure Update Manager. [00:33:50] Speaker C: Yeah, yeah. It's just, it's incredibly frustrating and like the patches, the patches that I'm referring to specifically are like not. They're not random or like it's to the Edge browser. Like it's not crazy and it's like just could not patch the Edge browsers using public sources had to point at a WS server. Like it makes no sense to me. [00:34:11] Speaker D: Yeah, I mean you think about every you I would have thought this would got better. It just hasn't and I'm kind of shocked about it still. [00:34:19] Speaker C: I know like I. I remember asking you like how could it possibly work like this like a decade ago and it still works like this. [00:34:27] Speaker D: I mean you've heard my rants about ad. So anyways. [00:34:34] Speaker B: There are a lot of cloud cost management tools out there but only Archera provides cloud commitment insurance. It sounds fancy, but it's really simple. Archera gives you the cost savings of a one or three year AWS savings plan with a commitment to shortest 30 days. If you don't use all the cloud resources you've committed to they will literally put the money back in your bank account to cover the difference. Other cost management tools may say they offer commitment insurance, but remember to ask will you actually give me my money back? Achero will click the link in the show Notes to check them out on the AWS Marketplace. [00:35:14] Speaker D: Well, Amazon's giving us a new. [00:35:17] Speaker A: Q. [00:35:18] Speaker D: Coordinator so Amazon is introducing CLI Agent Orchestrator, or CAO for short, an open source framework that enables multiple AI powered CLI tools like Amazon QCLI and or CLAUDE code direct specialized agents under a supervisor agent, addressing limitations of single agent approach for complex enterprise development projects. CAO uses hierarchical orchestration with TMUX session isolation and model context protocol servers to coordinate specialized agents, for example orchestrating architecture, security, performance and test agent simultaneously during mainframe modernization projects. The framework supports three orchestration patterns handoff for synchronous transfers, assign for parallel execution or send message for direct communication plus scheduled runs like cron like automation with all processing occurring locally for security and privacy. It currently supports Amazon Q Developer CLI and Cloud Code with planned expansion to OpenAI, Codex, Gemini, Quin and Aden. No pricing has been mentioned as is open source available at the GitHub the.com AWS Labs CLI agent orchestrator, which I mean this is great except for probably mostly works well with Q and the other ones probably not so much. But I do appreciate that there's something one more tool in the agent orchestration world. [00:36:28] Speaker C: I mean, I wonder because it's. I mean that if you're interfacing with the cli, they're all kind of the same thing but at that level, like there's a kind of common element there. But I also wonder like this seems like a lot of configuration and a lot of management of the configuration to make work and I'm not sure that it would be worth it just to be able to use different tools as much as I like being able to select different models for different tasks. It seems a little heavy. [00:36:54] Speaker D: I mean this is how you get those, you know, CRON job security patches or you know, coding fixes. Fix all my bugs. Use an agent orchestrator like this. [00:37:02] Speaker C: If I wanted to run it off of my desktop, sure, why not? But I'm not going to do that. [00:37:07] Speaker A: Come on, what fun are you? [00:37:10] Speaker C: I'm not guy who's going to have critical production, you know, processes running on my desk that or a computer that's running in my desk. [00:37:18] Speaker D: I mean you're writing code on your desk that you push a production. So technically you have production things on your thing. [00:37:22] Speaker C: Well I have production code, but it's the functionality is pushed to the cloud. [00:37:29] Speaker D: Amazon ECS is Now Publishing AWS CloudTrail data events for insight into API activities. Basically this allows you to see ECS Agent API activities enabling detailed monitoring of container instance operations including polling, telemetry sessions and manage instance logging security operations. Teams gain comprehensive audit trails to detect unusual access patterns, troubleshoot agent communication issues and understand how container instance roles are utilized for compliance requirements. The feature uses the new data event resource type AWS ECS container instance and is available for ECS on EC2 and all AWS regions, but the ECS manages the support in selected regions. Standard CloudTrail data event charges apply and there's a previous visibility gap in ECS operations as teams can now track agent level activities. I mean I get that this wasn't tracked before and that's bad and security wants to know all the things that are happening, but what's the use case? Brian, as our security guy, I mean. [00:38:26] Speaker C: I know when I was managing ECs like at the hardware layer like running on you know, dedicated compute, this is definitely something that you know like captured in the ECS logs and pushed upstream so that they could make so larger orchestration engines who are more aware of multiple clusters could make decisions based off of it. And so I imagine as things have moved more into like the Fargate model where you don't have that option, the need for this is a little bit more. But it also like, you know, a lot of the reasons why I needed to make those larger orchestration levels is because I was managing things at a compute level and needed to make decisions on scaling and capacity management and things that are sort of part of the managed services when you're using, you know, Fargate, although it can be more expensive. So I mean and this, you know, is definitely something I would use sparingly because the, the ECS API is Agent API is chatty. So this seems like it would be very expensive very fast. [00:39:25] Speaker A: Yeah, that was mainly my concern when you know, I was reading about this is put system log event. That's that feels like it's going to be very often and seeing that millions of times. The other thing is it's just going to clog up your cloud trail. So if you are looking to debug something, how many times is going to appear in there for something you don't like you might not care about. [00:39:48] Speaker C: Oh, this is specific to that new managed instances. This is the thing that's in between Fargate and running your own computer. So that's why this makes sense. Yeah. So this is basically the same use case I had. Except for when you were running managed instances, you didn't have the ability to pull from the ECS agent logs on the host itself. But you do still have to manage capacity. You do still have to manage things like this is the one that has GPUs, and so that's probably what this unlocks for a lot of people. [00:40:19] Speaker D: Makes sense. I mean, I get it. It just. It's definitely. [00:40:24] Speaker A: Oh no. [00:40:25] Speaker C: I've worked very hard to avoid needing this and so yeah, I know there are people that need this and I'm sorry. [00:40:30] Speaker D: Yeah, maybe you should find a job if you need this. All right, let's move on to Google Cloud. This week they've got several things for us. G4 VMs powered by Nvidia RTX 6000 black hole GPUs are now generally available. They offer up to 9x throughput improvement over the G2 instances and support workloads from AI inference to digital twin simulations with configurations of 1, 2, 4 or 8 GPUs. The G4 VMs feature enhanced PCIe based peer to peer data paths that deliver up to 168 throughput gains and 41% lower latency for multi GPU workloads. Addressing the bottleneck issues common in serving large AI models that exceed single GPU memory limits. Each of you provides 96 gig of GDR7 memory, up to 768 gigabytes total native FB4 precision support and multi instance GPU capabilities that allows partitioning into four isolated instances, enabling efficient serving of models from under 30 billion to over 100 billion parameters. Nvidia, Omniverse and Isaac SIM are now available on Google Cloud Marketplace as turnkey solutions for G4VMs, enabling immediate deployment of industrial digital twin and robotics simulation applications with full integration across gka, gke, Vertex AI, dataproc and Cloud run. These are available immediately with broader regional availability than previous GPU offerings. Though specific price details were not provided in this particular announcement, I can tell you the answer is expensive. [00:41:54] Speaker C: Definitely expensive. [00:41:56] Speaker A: All I heard was GPUs and I'm like, yep, that's expensive. [00:42:01] Speaker C: Yeah, I mean it's. I'm still very far away from understanding these larger workloads because I don't do the type of development where I need to. Definitely. I'm more on the other side where I'm trying to like, how would I finally get AI on a very tiny little computer where very much on the edge so that I don't have to write complex mapping rules. [00:42:21] Speaker D: Indeed, for those of you who use Dataproc, Google's giving you Dataproc 2.3 which chooses a lightweight FEDERMP high compliant image that contains only essential Spark and Hadoop components, reducing the CVE exposure and meeting strict security requirements for organizations handling sensitive of data. Optional components like Flink, Hiveweb, HCAT and Ranger are now deployed on demand during cluster creation rather than prepackaged, keeping clusters lean by default while maintaining full functionality when needed. Custom images allow pre installation of required components to reduce cluster provisioning time while maintaining the security benefits of the lightweight base image. Image supports multiple operating systems including Debian 12, Ubuntu 22 and Rocky 9 deployment as simple as specifying version 2.3 when creating clusters via the GCloud CLI, Google employs an automated CVE scanning and patching combined with manual intervention for complex abilities to maintain compliance standards and your security posture. So it's Alpine for data Proc and when you need to use any of the tools you actually need, they won't be installed like vim making you install them and then your container that was lean is now no longer lean. [00:43:27] Speaker C: So nice but on the contrary like FedRamp has such tight SLAs for vulnerability management that you don't have to, you know carry this risk or request an exception because of, you know, Google not patching Flink as as fast as you would like them to. Yep, at least perspective control where at the end user where they can say well I'm not going to use that. [00:43:49] Speaker D: BigQuery Studio is getting a new interface, introducing an expanded Explorer view that allows users to filter resources by projects and type with a dedicated search function that spans across all BigQuery resources within an organization, addressing the common pain point of navigating through large scale data projects. The reference panel provides context or information about tables and schemas directly within the code editor, eliminating the need to switch between tabs or run exploratory queries just to check column names or data types, which is particularly useful for data analysts writing complex SQL queries. Google has streamlined the workspace by moving job history to a dedicated tab accessible from the Explorer pane and removing the bottom panel clutter, also allowing users to control tab behavior with double click functionality to prevent unwanted tab replacements. The update includes co generation capabilities where clicking on table elements and the reference panel automatically inserts query snippets or field names the editor reducing manual typing errors and speeding up query development workflows and driving all of us crazy when we have to troubleshoot and we go, what did you do? And they go, I clicked the button. Yeah, yeah, it didn't work. [00:44:47] Speaker C: SQL like syntax and I don't know either one the like or the SQL. I mean some of these things, the reducing clutter and stuff is definitely something that I've run into. Like having a bunch of job executions and not being able to sort of see your query and the job executions because the screen's too small. It sucks. This is definitely something I've run into, although I'm a little nervous about having all the BigQuery resources across an organization available in a single console just because it sounds like permissions nightmare and GCP permissions since it's evaluating things like tables and, and data sets sort of on demand. Evaluating a policy on demand. Like it typically shows you everything and then just gives you an error. So it's a little, a little rough. I'm a little nervous about this. [00:45:38] Speaker D: I mean, we'll see. I suspect that our data analyst friends will love it. I have tried to use this interface a couple times and I find it overwhelming initially. And so if I can make it, you know, if this is a step towards making it more approachable for, you know, people who are not as experienced yet, that would be helpful too. So we'll see. But I also worry about it's just AI all the way down, then what are you going to do? [00:46:02] Speaker C: I mean, this definitely makes a lot of Gemini query building things a lot easier and then also easy to share them across. GCP project. [00:46:11] Speaker D: Well, finally you can manage all of your prompts in Vertex and AI SDK, enabling developers to create, version and manage prompts programmatically through Python code rather than tracking them in spreadsheets or text files. Feature provides seamless integration between Vertex, AI Studio, visual interface for prompt design and SDK for programmatic management, with prompts stored as centralized resources within Google Cloud projects for team collaborations. Enterprise security features include CMIC or Customer managed encryption keys and VPC service controls addressing your compliance requirements for organizations handling sensitive data in their AI applications. I mean, if your prompt has a. Has sensitive data in it, I have questions already. [00:46:49] Speaker A: Who doesn't like PII in your prompt? [00:46:52] Speaker C: Come on. [00:46:52] Speaker D: Yeah, it doesn't matter. Like, hey, can you see if this Social Security number exists in any of my data, please? [00:46:57] Speaker C: Like, yeah, beautiful. [00:47:00] Speaker D: All right, well, that's nice, I guess. [00:47:03] Speaker C: I mean, it just shows that everyone's having the same sort of problem. Like, I know that this has recently become my life, which is trying to figure out, you know, in Different AI workloads. Like I'm copying and pasting prompts from different things rather than typing it new every time. And it's like patterns are starting to become a little ingrained and people are having their sort of workflows and. And if there's not a tool like this, you end up figuring a way out. And that can be a document or, you know, a spreadsheet or whatever. So this is, this is kind of nice to bring sort of structure to that. And then, you know, the fact that it's SDK with Python management makes me real happy because I don't have to click through some UI somewhere to figure out where my prompt is. [00:47:43] Speaker D: It's not in go for you, so that's nice. [00:47:46] Speaker C: Yeah, I mean that's. Yeah, I don't mind if it's in go, that's fine. [00:47:49] Speaker D: But you like it better. It's a Python, so it's easier to. [00:47:52] Speaker C: Easier to read and understand. Sure. [00:47:54] Speaker D: Only because you know it. [00:47:58] Speaker A: Well. [00:47:59] Speaker D: If you were excited about Claude Code Web earlier and being able to do things remotely, or if you're excited about GitHub Copilot, GitHub Enterprise is now supporting Gemini Code Assist as well. Now bringing your AI powered code reviews to enterprise customers using GitHub Enterprise Cloud and on premises GitHub Enterprise Server. This addresses the bottleneck where 60.2% of organizations take over a day for code changes to each production due to manual review processes. The services provides organizational level controls including centralized custom style guides and org wide configuration settings, allowing platform teams to enforce coding standards automatically across all all repositories. Individual teams can still customize repo level settings while maintaining organizational baselines. This is all built under the Google Cloud terms of service. The enterprise version ensures code prompts and model responses are stateless and not stored, with Google committing not to use customer data for model training without your permission. This addresses enterprise security and compliance requirements for AI assisted development and it's currently in public preview with access through the Google Cloud console. Service includes a higher pull request quota than the individual developer tier and Google is developing additional features including agentic loop capabilities for automated issue resolution and bug fixing. [00:49:04] Speaker C: I'm sort of fascinated on how much, you know, hooks this is into GitHub because a lot of these features are directly competitive with Copilot. [00:49:14] Speaker A: But if you're a Google shop and you've already have Gemini as your approved tool. [00:49:19] Speaker C: Oh, I mean, I get it. [00:49:20] Speaker A: I mean that's what they're targeting and hoping that it's going to kind of let People that are a Gemini shop completely take over and not. And still let you leverage Git because. Does Google have an alternative? [00:49:36] Speaker C: They do have their own source repository service. [00:49:39] Speaker A: Was it as good as codecommit? [00:49:40] Speaker C: I don't know. I wouldn't dare use it. [00:49:42] Speaker A: I'm kidding. Is it as good as Azure DevOps? [00:49:47] Speaker C: I mean you know they like, like Amazon, they have this sort of similar sort of context of like. [00:49:52] Speaker D: Well they have cloud build. [00:49:53] Speaker C: Right. So they don't have a CI pipeline built. They have a separate repository and they integrate together if you want to go and plug it all. Yeah, I mean it's, it's just sort of the, the ability to sort of do organizational wide things is, is super powerful for these tools and I'm just sort of surprised that you know, GitHub allows that. It seems like they would have to develop API hooks and externalize that. [00:50:16] Speaker D: They do these all through the marketplace. Right. So I mean they have the web hooks, they have the APIs already. They exist and so you're just plugging into that ecosystem they've already built for other tools to leverage now do they eventually get hostile towards other AIs other than GitHub Copilot? They could, but I think it's kind of anti what Microsoft has been trying to preach for the last several years. So I mean right now they're good citizens. But that may change in the future. We'll see. [00:50:40] Speaker C: I forgot about the marketplace interaction. You're right, that, that does make it makes it a little bit more approachable. This wasn't like requiring custom development to enable. Just what I was thinking. [00:50:49] Speaker A: Yeah, GitHub is a marketplace. I don't remember that. [00:50:52] Speaker D: Yes. [00:50:53] Speaker A: I don't used to. I'll have to look into that more. [00:50:56] Speaker D: It's the preferred way you do integrations with third party tools like if you want integrate Snyk into GitHub, you typically do it through a marketplace plugin for example versus doing API keys for individual. [00:51:07] Speaker C: Users, things like that. [00:51:10] Speaker A: Got it. We're more of a BitBucket shop so I've had to do that. [00:51:13] Speaker D: Well that's, that's unfortunate for you. I mean I assume problems. I assume BitBucket will eventually get AI. I mean JIRA and Confluence have it now. Why can't BitBucket? [00:51:22] Speaker A: Yeah, I think, I think they def. They definitely have. I've gotten some notifications for if I. They are behind the eight ball in my opinion. [00:51:29] Speaker C: Well, and it's Atlassian. They'll charge you through the nose. [00:51:31] Speaker D: Yeah, they're going to charge you through the nose for it. Number two, you, you probably have to upgrade your BitBucket, which you probably haven't upgraded since you installed it. Do they have a SaaS version? [00:51:41] Speaker A: No, no, it was the SAS version. [00:51:42] Speaker D: They do have a SAS version. [00:51:43] Speaker C: Yeah. Okay. [00:51:44] Speaker A: I mean, I still think BitBucket is their redhead stepchild that they just kind of tag along. There's like, you know, keep lights on team and that's about it. But they have released some small features though. [00:51:56] Speaker D: They sell it to the people who, you know, refuse to use GitHub for some reason because they hate Microsoft. That's a brilliant strategy. [00:52:04] Speaker A: It's significantly cheaper also. [00:52:06] Speaker C: Yeah. Yes. [00:52:09] Speaker D: All right. Vertex AI context caching is reducing your cost by 90% for repeated content in Gemini models. By storing pre computed tokens, implicit caching happens automatically while explicit caching gives developers controls over what to content to cache. For predictable savings. The feature supports caching up to 2048 tokens, up to Gemini 2.5 Pro's 1 million token context window across all modalities. Text, PDF, image, audio and video. Global and regional endpoint support. Use cases include document processing for financial analysis, customer support, chatbots with detailed system instructions, code based QA for development teams and enterprise knowledge based queries. Implicit caching is enabled by default with no code changes required and clears within 24 hours. While explicit caching charges standard input token rates for initial caching, then a 90% discount on reuse plus hourly storage fees based on the TTL itself. Integrated with provision throughput ensures production workloads benefit from caching and explicit cache supports customer managed encryption keys for additional security and compliance. [00:53:05] Speaker C: I mean this is awesome. Like this is if you have a workload where you're going to have very similar queries and or prompts and have it return similar data. Like this is definitely nice than. Rather than having to regenerate that every time. So this is. They've been moving more and more towards this and I like to see it sort of more at a platform level now. Whereas you could sort of implement this like in a weird way using directly in the model, like in a, in a, like a notebook or something. This is more. It sort of just. I think it's a turn it on and it works. [00:53:42] Speaker A: Is it? I. I still wonder is it. I mean it sounds like it's in line which is always good. So it kind of will handle like you said. But eventually is there going to be. If you are leveraging AI for any PII or anything else like that, you have to be careful with some of these tools. Because I'm just thinking it's like Redis or anything else with SQL back in the day. [00:54:04] Speaker C: Yeah, I mean, it is. It's exactly the same problem that you're going to have with like Redis, you know, where it's like, oh, we've got a bad thing in the cache, we got to clear the whole thing. Right? Or it could be just a prompt that's returning bad information. You know, it doesn't necessarily have to be pii, it's just less than desired and you have to go clear it all out. [00:54:21] Speaker A: I'm on PII hell at work, so, you know, that's where my brain is. [00:54:25] Speaker C: Gotcha then. [00:54:27] Speaker D: Our final Google announcement for today, Cloud Armor is introducing several new features for you and it was named A strong performer in the Forester wave. The new features launched include hierarchical security policies, now in general availability, that enable WAF and ddos protection at the organization, folder and project level Allowing centralized security management across large GCP deployments with consistent policy enforcement A new enhanced WAF inspection capabilities and preview Expanding request body inspection from 8 kilobytes to 64 kilobytes for all pre configured rules Improving detection of malicious content hidden in larger payloads while maintaining performance the GA4 network fingerprinting support is now generally available, providing advanced SSL, TLS, client identification beyond J, offering deep behavioral insights for threat hunting and distinguishing legitimate traffic from malicious actors. The organizational scoped address groups are now generally available, enabling IP rangeless management across multiple security policies and products like Cloud Next gen firewall, reducing configuration complexity and duplicative rules and Cloud Armor now protects media CDN with network threat intelligence and ASN blocking capabilities and general availability Defending media assets at the network edge against known malicious IPs and traffic patterns. [00:55:35] Speaker C: These are some pretty advanced features for, you know, like a cloud platform provided laugh. It's pretty cool. I find myself using this more and more of the day job just because it's easy to implement and then pretty configurable once you have it enabled through, you know, definition of policies and rules. I haven't run into the, you know, the eight kilobit sort of inspection limit yet, but that's because the rules are kind of expensive and I am turning them on very cautiously. So it's kind of. You gotta be careful there. But yeah, no, I look forward to using this and getting the better information when looking at the ginormous amount of logs that any waften generates. [00:56:16] Speaker A: I mean a lot of the other features from an organizational perspective sound really nice too. That Feel like something that Ryan should be implementing on his day job if he's not. But like IP addresses and stuff like that sound phenomenal for a security team that is managing a multi project strategy. [00:56:34] Speaker C: Oh definitely. I mean and you know there's, there's been ways to do this, but not good ways. You know, like it's like having a giant basically shared WAF layer. That's your, that you're, you have a poor platform team, they're responsible for it now they have to offer WAF to the rest of the company. And so this is nice because you can now delegate some of that while still maintaining that, you know, regions or you know, traffic from an embargoed country just never gets in and not having to work with 27 different teams to get that enabled. [00:57:04] Speaker D: Agreed. Moving on to Azure for this week. General availability the observed capacity metric in Azure Firewall. Azure Firewall's new observed capacity metric provides real time visibility into capacity unit utilization, helping administrators track actual scaling behavior versus provision capacity for better resource optimization and cost management. And you might want to combine that with the next announcement which is that they're now allowing firewall pre scaling which allows administrators to reserve capacity units in advance for predictable traffic spikes like holiday shopping seasons or product launches, eliminating the lag time typically associated with auto scaling your firewall resources. So now you can see the observed capacity so you can actually determine what that pre scaling need is actually required. So to combine these two announcements this week are actually pretty good if it was 10 years ago. [00:57:51] Speaker A: I mean that's just a core problem with Azure though is that there's still the concept of servers under the hood. You tell them how many servers you want of everything and it blows my mind that that's still the world you're in versus saying I want a load balancer, I want a firewall, I want a NAT gateway. I think NAT gateways actually might be managed for you. I'd have to double check but you still say I want X capacity where you know with AWS and a little bit I've done with gcp more of that's obfuscated from you. I guess with AWS you have to like pre warm the load balance like that. But I think a lot of that's. [00:58:31] Speaker C: Which is probably still in there. I just haven't touched it and I don't know how long because it's just not needed anymore for. Because I'm not the size fan. [00:58:39] Speaker A: Yeah I think the last time I think I read they, they don't recommend it and at that point you probably are on, you know you have a tam you, you know they know you're launching a big thing and they would I'm sure reach out. So you're at that massive level either way. But like everything with Azure feels like you are literally telling them the metrics, the size and it's just a, in my head it's anti cloud pattern because I don't want to manage capacity. This is why I'm paying a premium for you to do it for me. [00:59:10] Speaker C: Yeah, it's, you know, like it, I guess it's, it's part of the management. Right. You're not maintaining the underlying infrastructure in terms of patching and updating and, and but it's also like the having to do capacity management just sort of is that last little bit where you're like I want to do this last little bit. Like just take it away, just bill me. [00:59:31] Speaker A: It's like on soccer ISO. I don't remember I was doing one of the things and they're like tell me how you capacity plan. I was like I have my quotas and I have scale sets that scale up and down for me and I have alert set up with things fail. Like I leverage automation to do it for me and explain that to an auditor is always a fun time. [00:59:53] Speaker D: What can go wrong? [00:59:54] Speaker A: Yeah. [00:59:55] Speaker D: In public preview this week, environmental sustainability features are now available to you and the Azure API Management plane. Azure API Management introduces carbon aware capabilities that allow organizations to route API traffic and adjust policy behavior based on the carbon intensity data, helping reduce the environmental impact of API infrastructure operations. The feature enables developers to implement sustainably focused policies such as throttling non critical API calls during high carbon intensity periods or routing traffic to regions with cleaner energy grids. This aligns with Microsoft's broader carbon negative commitment by 2030 and provides enterprise with tools to measure and reduce the carbon footprint. Other digital services at the API layer CAR customers include organizations with ESG commitments and sustainability reporting requirements who need granular control over their cloud infrastructure's environmental impact. And in general I think this is weird. Like I'm going to use some type of API thing to determine that if I use this compute in this region that's on coal power right now, it would be less energy efficient than if I use the this other one that's using, you know, maybe hydropower is that kind of the gist of this. [01:01:02] Speaker A: So APIMs are one stupidly expensive. If you have to be on the premier tier it's like $2,700 a month and then if you Want ha. You have to have two of them. So like whatever they're doing under the hood is stupidly expensive. If you ever had to deal with the SharePoint, they definitely use them because I've hit the same error codes as we provide to customers. The second side, when you do scale them, you can scale them to be multi region apims in the paired region concept. So in theory what you can do based on this is route in the uk if UK north or was it central versus south, you know, has a cheaper or more environmentally efficient one, you could route to your paired region and then have the traffic coming that way. The reason why I think that this is interesting is one, I keep stating that green is going to be on the forefront of one of these keynotes at one point and I'm just going to keep putting that out there until I give up on our prediction show. But on the flip side, I think it's interesting that especially in Europe this is such a big deal that the cloud vendors AKA Azure is targeting building stuff and targeting these ESG things. I'm seeing it more in my day job when customers buy our SaaS product is they'll ask what the green capacity is or green metric, whatever it is for our product and how many CO2 tons of CO2 we're releasing in Tier 1, 2 and 3. So I think it's interesting that Azure is trying to take it on and put the power in the hands of their customers to maybe say this isn't the best thing to do versus Azure just taking care of it for you and letting the customer say it's okay to slow down this. We don't care right now about it. We don't care about our customers response time right now because our green capacity is more important. [01:03:00] Speaker C: Yeah, I mean it's all about balancing the books with carbon offsets. Right. It's both from a Microsoft perspective and then also an Azure perspective customer perspective. They're trying to have a policy where they can say like you know, this is how we're, we're managing our, our, our carbon sort of footprint and here's the things we're doing to reduce it and just trying to balance that with, with the things that are generating or increasing the carbon footprint. So it's, it seems like a. [01:03:30] Speaker A: A. [01:03:31] Speaker C: Lot right now I think because we're sort of early weirdly but I imagine if this becomes one of those things like just like routing for latency if we're, it becomes really standard for all of our tools where we're taking account the carbon impact. [01:03:50] Speaker A: I mean, I definitely think that it's a very specific company they're targeting for this. Most people don't care. [01:03:56] Speaker C: It's definitely a company of scale, right? Yeah, companies of scale at this point. [01:04:00] Speaker A: Yeah, yeah. And you're also talking about like you're already using the apm, which is a ridiculously expensive service, so you know, you're clearly burning tons of capital on Azure too. [01:04:14] Speaker D: All right, next up is Azure Storage Discovery is now generally available as a fully managed service that provides enterprise wide visibility into data states across Azure Blob Storage and Data Lake Storage, helping organizations optimize costs, ensure security compliance and improve operational efficiency across multiple subscriptions and regions. The service integrates Microsoft Copilot and Azure to enable natural language queries for storage insights, allowing non technical users to ask questions like show me storage accounts with default Access tiers hot above 1 TB. With these transactions, really a non technical user asked that question and receives actual visualizations without coding skills. Key capabilities include 18 month data retention for trend analysis insights across capacity generate activity, security configurations and errors of deployment taking less than 24 hours. Initial insights from 15 days of historical data pricing includes a free tier with basic capacity and configuration insights retained for 15 days while standard plan adds advanced activity error and security insights with 18 month retention. Specific pricing varies by region. Target use cases include identifying cost optimization opportunities through access tiered analysis, ensuring security best practices and still and managing data redundancy requirements across global storage estates. [01:05:27] Speaker C: We talked a little bit about this when it was announced in preview but like I remember like I don't have a lot of Azure experience but just when Focus Data came out with her FinOps report in the structure, just generating a new report, I had to set up a storage infrastructure and then once I set it up, got the report running and updating everyone, I could never find that storage bucket ever again. It was gone. I was, it was. I. It was, it wasn't gone because it was still being accessed but I could never find it in the console. So when I was asked to change it, you know, and I would go through and I would try to change my permission scopes and try to do all the things to try to find it but it was just invisible to me. And so I think that this is one of those things where hopefully it doesn't have same limitations and you'll be able to to actually see your entire storage footprint versus whatever permissions you have in the specific subscription based off of your billing scope at the time, which is for a non everyday Azure user is too much to keep pass. [01:06:26] Speaker A: Well that's why you naturally language query it because obviously you're caring about that. As a FinOps person, you can definitely care about what your redundancy level is for each of your storage and the finops person should totally be making that decision. [01:06:40] Speaker C: Well, I'll tell you, when I was looking for this report I had a lot of natural language and I was shouting it at my computer. [01:06:46] Speaker A: So I've actually leveraged it. It was free for a couple months. We set it up at my day job. We were kind of poking around it. It's good if you, you know, if you're a large company with multiple different teams kind of setting it up and figuring out what and you know, your centralized team doesn't know all the different storage accounts, it definitely shed a few interesting insights on us that you know, for my team that we definitely probably should have known to start off but learned a lot along the way. But it's a simple report. You know, there's been multiple press releases about just shows you what's there. It's no different than kind of the network security report they released a few weeks ago. It's an aggregate review for your organization about what you have running if you don't know. So they're really targeting the larger scale enterprises. If you're a small enterprise and you kind of know these things, it's no big deal. [01:07:44] Speaker D: No big deal, he says. All right, well there's two new models available for you today in Azure AI Foundry. First up is Sora2, which is populating the entire Internet with AI crap. Just terrible videos and all AI generated and even has his own Sora apps so you can go to the Sora social media to see all of your friends with their dumb videos made on Sora too. Not a big fan of this one for lots of reasons. I think Grok 4 is also available to you in Azure AI foundry featuring the 128k token context window, native tool use and integrated web search capabilities in the Grok product. Pricing for that one starts at 2 million per million input tokens and $10 per million output tokens for GROQ4 with faster variants available at lower costs in the future. The Sora pricing I did not find when I wrote this out, but they're available to you in Azure AI Foundry so I'm sure there's a billion model. I just don't know what it is. [01:08:39] Speaker C: I mean apparently it's not expensive enough considering how much crap there is. So like needs they should raise the. [01:08:44] Speaker D: Prices, I think, like the Sora stuff is all free right now if you're going through the app, which is part of the problem, because no one, no one has to do anything special. All right, and our final Azure story. Azure is releasing PowerShell scripts to help customers migrate from Application Gateway v1 to v2 before the April 2026 retirement deadline. I'm shocked the deadline wasn't, you know, September 20th, 25th, 2025. Addressing a critical infrastructure transition that's needed. The enhanced cloning script reverse preserves configurations during migration, while the public IP retention script ensures customers can maintain their existing IP addresses, minimizing disruption to your production workload. Migration tooling targets enterprises running legacy application gateway Standard or WAF SKUs who need to upgrade to standard V2 or WAF V2 for continued support to access newer features scripts automate what would otherwise be a complex manual migration process, reducing the risk of configuration errors and downtime during your transition. Customers should begin planning migrations now as 2026 deadline approaches, with these scripts providing a standardized path forward for maintaining application delivery infrastructure. Or, you know what, instead of writing all these scripts, they could just do it for you. [01:09:48] Speaker A: There's so many issues with this. [01:09:49] Speaker C: Yeah, this removes what otherwise would be a complex manual process. I'm like, you're giving people PowerShell. It's a complex manual process. [01:09:58] Speaker A: I mean, I've noticed this a lot when they go from V1 to V2 stuff where they don't. They're either they first just tell you it's now because the deprecation notice for this for V1 to V2 was April 28, 2023. They announced this. So they gave people five years, you know, to, sorry, three years to move over. [01:10:22] Speaker C: But migration through attrition at that point. Right, right. [01:10:25] Speaker A: And now they're like, okay, these are the customers that are left that are using it. But like, honestly, for somebody that leverages the app gateway at my day job, like, V2 is exponentially more powerful. There's definitely less in it in the worlds of the documentation. But I just love hating on app gateways also, which is why it's in here. At least I'm honest about what I hate. But also, like, the other part of this that I wanted to bring up was around the public IPs. Like, I totally understand IP addresses and people whitelisting or putting in a lot of this IP addresses. I just still don't believe that at a core principle that that is something that you really need to do. Plus, if you were on a public IP address, that Azure provided back then and you have to move over here. You probably also have to move from a single zone to a multi zone IP address, which means you need a new IP address to start off. So like I almost wish they didn't offer this because I want people to move to more modern stuff and they're just leaving themselves more tech debt, you know, to do it. So just take it on and fix the problem though. I say this at my day job. [01:11:39] Speaker D: I mean this feels right on brand for Microsoft. Like you're saying all these things and I'm just like this is. But this is Microsoft's mo. [01:11:45] Speaker A: I know. [01:11:46] Speaker C: And if you have fintech customers it is like pulling teeth. [01:11:50] Speaker A: Pulling teeth to get them to get. [01:11:51] Speaker C: Rid of their IP restrictions and get. [01:11:54] Speaker D: Rid of your URL. [01:11:55] Speaker A: Wait till you tell your fintech customer that you are having a CDN now and see how well that goes over with that. Because they can't whitelist allow this the IP address anymore which is loads of FUD also. Or you just tell them to put in the allow this all of Azure front door which then they look at you like you're crazy. I'm like, well this is what you asked for. [01:12:18] Speaker C: I do remember pointing a customer ages ago at the giant AWS JSON. I'm like, just run a thing that loads this every once in a while and then you'll have just what you need. And then just looking at their face as they were mentally stabbing me through the eyes, they could see it. [01:12:35] Speaker A: Yep. About two years ago I had this conversation. [01:12:40] Speaker D: Just two years ago. It seems that long ago at all. [01:12:42] Speaker C: Yeah. [01:12:43] Speaker D: All right, and our final story for this week is Oracle. Oracle's AI Agent Studio is expanding with new marketplace elements and partner integrations for fusion apps, allowing customers to build AI agents using models from Anthropic, Cohere, Meta and others alongside Oracle's own models. I mean, do you even know what Oracle's models are called? [01:13:01] Speaker A: Bad to ask, I forgot they had models. [01:13:04] Speaker D: Either do I. The platform enables creation of AI agents that can automate tasks across Oracle fusion cloud applications including erp, HCM and CX with pre built templates at low code development tools for the business users. Oracle is partnering with major consulting firms like Accenture, Deloitte and Infosys to help customers implement AI agents. Those likely mean significant professional services costs for most deployments. [01:13:28] Speaker C: That's exactly what that sounds like to me. Like oh yeah, they're partnering with these giant firms that will come in with armies of engineers and build you a thing and then hopefully document it before running away. [01:13:39] Speaker A: You have never had to deal with those people and or deal with the mess that they deal with after. Never. [01:13:45] Speaker C: Never had to come in and clean that up. Nope, nope. [01:13:49] Speaker A: Definitely never clean that up. As a consultant to a company, hey, this other person hired this and they do this. And then we were brought in to clean up that mess. That was always fun. Yeah. [01:14:01] Speaker D: It's a never ending cycle. Never ending. All right, gentlemen, it's another fantastic week here in the Cloud. Past us, so we'll keep an eye on Amazon. Hopefully we don't have any outages to next recording. See if we go to rca and we'll see you all next week here in the Cloud. [01:14:17] Speaker C: All right, bye, everybody. [01:14:18] Speaker A: Bye, everyone. [01:14:22] Speaker B: And that's all for this week in Cloud. We'd like to thank our sponsor, Archera. Be sure to click the link in our show notes to learn more about their services. While you're at it, head over to our [email protected] where you can subscribe to our newsletter, join our Slack community, send us your feedback, and ask any questions you might have. Thanks for listening and we'll catch you on the next episode.

Show Notes

Titles we almost went with this week

Follow Up

General News

AI is Going Great – Or How ML Makes Money

Cloud Tools

HugOps Corner – Previously Known as AWS

GCP

Azure

Oracle

Closing

Chapters

Episode Transcript

Other Episodes

Episode 128

128: Azure puts its gold in CloudKnox

Episode 68

68 – The Cloud Pod is as free as Github for Teams

Episode 199

199: All AI Products Agree, Earnings are down