345: Damn It… my excuse is now gone for Disaster Recovery

Welcome to episode 345 of The Cloud Pod, where the forecast is always cloudy! Justin, Ryan, and Matt are in the studio this week and are ready to bring you all the latest in cloud and AI news, including what’s going on between Anthropic, the DOD, and OpenAI, what the war means for Middle East data centers (Spoiler – I hope you have a good Disaster Recovery plan), and Transit Gateway pricing changes that are enough to make a grown man cry. And don’t bother waiting: Matt has completely forgotten almost two years of “bye everybody” and now claims full amnesia as to what his outtro is. Oh well. Let’s get into today’s show.

Titles we almost went with this week

Claude Learned to Use a Computer Better Than Your Dad **OpenAI
Amazon and OpenAI’s $138 Billion AI Bromance
When Two AZs Go Dark the Cloud Gets Crispy
Fifty Billion Reasons AWS Loves OpenAI Now **Anthropic
Azure Still Wins Even When AWS Thinks It Did
Fire, Water, and a Multi-AZ Assumption Goes Up in Smoke
Claude Refuses to Go Full Skynet for the Pentagon
GPT-5.3 Instant Finally Stops Lecturing You
No Killer Robots Without Human Approval Please
Terraform Finally Sees Your Forgotten Cloud Resources
Stage Before You Rage Deploy Azure Firewall
CrowdStrike to Zscaler AWS Wants Your Security Tab
One Hub to Rule Your API Sprawl
Transit Gateway Attachments Just Got Surprisingly Expensive
Azure Container Registry Finally Has Room for Your AI Hoarding
Bedrock Gets a Roommate OpenAI Moves In
Azure Firewall Gets a Safety on the Trigger
Stop Writing Scripts, Just Import the Dang Infrastructure
Audit Your APIs Before March 2026 Bites You
Damn it… my excuse not to DR is gone
I’m Epically Furious about DR

AI Is Going Great – Or How ML Makes Money

03:34 Anthropic acquires Vercept to advance Claude’s computer use capabilities

Anthropic acquired Vercept, a team specializing in AI perception and interaction, to strengthen Claude’s computer use capabilities.
The Vercept founders, including Ross Girshick, bring deep expertise in how AI systems visually interpret and interact with software interfaces.
Claude Sonnet 4.6 shows substantial improvement in computer use benchmarks, jumping from under 15% on the OSWorld evaluation in late 2024 to 72.5% today.
The model is now approaching human-level performance on tasks like navigating spreadsheets and completing multi-tab web forms.
Computer use enables Claude to operate inside live applications the way a human would, handling multi-step workflows across tools that cannot be automated through code alone.
This is relevant for enterprise use cases involving document processing, browser-based workflows, and cross-application task management.
This is Anthropic’s second acquisition in a short period, following the purchase of Bun, which was tied to the Claude Code milestone. The pattern suggests Anthropic is actively acquiring specialized engineering teams rather than just technology assets.
For developers and businesses building agentic workflows on Claude, the improved computer use performance means more reliable automation of complex, real-world software tasks without requiring custom integrations or APIs for every application involved.

05:18 Justin – “It seems like every day I have to update Claude Code because they released a new feature or a new capability.”

12:34 Improving skill-creator: Test, measure, and refine Agent Skills

Anthropic has updated its skill-creator tool for Claude Agent Skills, now available on Claude.ai, Cowork, and as a plugin for Claude Code.
The update brings software development practices like testing, benchmarking, and iterative refinement to skill authoring without requiring users to write code.
The core addition is an eval framework that lets skill authors define test prompts, describe expected outputs, and verify skill behavior across model updates.
A practical example given is the PDF skill fix, where evals isolated a positioning failure on non-fillable forms and guided a targeted fix.
A new benchmark mode tracks eval pass rate, elapsed time, and token usage, and can be integrated into CI systems or local dashboards. Multi-agent parallel eval execution is also included to reduce test time and prevent context bleed between runs.
Comparator agents enable A/B testing between two skill versions or between a skill and no skill, with blind judging to reduce bias in assessing whether a change improves output quality.
Anthropic notes that as base-model capabilities improve, some capability-uptake skills may become unnecessary, and the eval framework is positioned as a step toward skills being defined by natural-language descriptions of desired outcomes rather than detailed implementation instructions.

13:54 Justin – “For things that are actually in pipelines or agentic capabilities where you want things to be specific, this is great.”

14:35 Statement on the comments from Secretary of War Pete Hegseth

Anthropic has publicly refused to allow Claude to be used for mass domestic surveillance of Americans or fully autonomous weapons, citing concerns about current AI reliability and civil liberties.
These two exceptions led to a breakdown in negotiations with the Department of War after months of discussions.
The Department of War is moving to designate Anthropic as a supply chain risk under 10 USC 3252, a designation Anthropic states would be the first time applied to a US adversary. Anthropic has indicated it will challenge any such designation in court.
From a practical standpoint, the legal scope of a supply chain risk designation is narrow. It would only affect the use of Claude on Department of War contract work, leaving commercial API customers, Claude.ai users, and non-DoW contractor use cases completely unaffected.
This situation raises a broader question for cloud and AI vendors about the terms under which they can negotiate acceptable use policies with government customers.
The outcome could set a precedent for how American companies handle government contracts that conflict with their own usage restrictions.
Anthropic notes it has been deployed in US government classified networks since June 2024, making this dispute notable for the AI industry as more frontier model providers pursue federal contracts through programs like FedRAMP and classified cloud environments.

Statement from Dario Amodei on our discussions with the Department of War

Anthropic has publicly refused the Department of War’s requests to remove two specific safeguards from Claude: restrictions on mass domestic surveillance use cases and on fully autonomous weapons systems.
This is notable because Anthropic was already the first frontier AI company to deploy models in US classified networks, National Laboratories, and custom national security configurations.
The Department of War has threatened to label Anthropic a “supply chain risk,” a designation previously reserved for US adversaries, and to invoke the Defense Production Act to force removal of these safeguards. Anthropic notes that these two threats are contradictory since one frames Claude as a security risk while the other frames it as essential to national security.
The autonomous weapons position has a specific technical basis: Anthropic states current frontier AI systems lack sufficient reliability for fully autonomous target selection and engagement, and they offered to collaborate with the Department on R&D to improve reliability, an offer that was not accepted.
For cloud and enterprise listeners, this situation establishes a precedent in which an AI provider publicly declines government contract terms on safety grounds rather than on commercial grounds, with direct implications for how AI vendors structure acceptable use policies in high-stakes government and defense cloud deployments.
Anthropic has indicated it will support a smooth transition to another provider if offboarded, signaling that continuity planning for AI-dependent military operations is now a real operational consideration for defense cloud infrastructure teams.

Our agreement with the Department of War

OpenAI signed a classified AI deployment agreement with the Pentagon using a cloud-only architecture, meaning models run on OpenAI infrastructure rather than on edge devices or government-controlled hardware, which is central to how they enforce their safety constraints.
The agreement includes three stated red lines: no mass domestic surveillance, no directing autonomous weapons systems, and no automated high-stakes decisions without human approval.
OpenAI retains full control of the safety stack and has cleared personnel embedded with the deployment.
The cloud-only deployment model is the key technical differentiator here. By keeping models off edge devices, OpenAI argues it can run and update classifiers independently to verify red lines are not crossed, which would not be possible with on-premise or edge deployments.
The contract language locks in current surveillance and autonomous weapons laws as the standard, meaning even if those laws or DoD policies change in the future, usage must still comply with the standards in place at signing. This is a notable contractual mechanism for maintaining guardrails over time.
OpenAI requested that the same contract terms be made available to all AI labs, including Anthropic, framing this as an attempt to establish a consistent baseline for how the government engages with frontier AI providers on classified work.

21:04 Justin – “The precedent that could be set, potentially, that the government can declare any vendor they want to a supply chain risk feels like it’s gonna violate several amendments to the Constitution…”

New Model Section

21:38 Gemini 3.1 Flash Lite: Our most cost-effective AI model yet

Google launched Gemini 3.1 Flash-Lite in preview, available through the Gemini API in Google AI Studio and Vertex AI, priced at $0.25 per million input tokens and $1.50 per million output tokens, positioning it as a cost-focused option for high-volume workloads.
Compared to 2.5 Flash, the new model delivers 2.5x faster Time to First Answer Token and 45% higher output speed according to Artificial Analysis benchmarks, while scoring 86.9% on GPQA Diamond and 76.8% on MMMU Pro.
The model includes configurable thinking levels, letting developers dial reasoning depth up or down depending on task complexity, which is useful for balancing cost and quality across different workload types.
Practical use cases highlighted include high-volume content moderation, translation, UI generation, and real-time dashboard creation, with early adopters like Latitude, Cartwheel, and Whering already using it in production.
For GCP customers running inference at scale, the combination of low per-token pricing and higher throughput speed makes this a practical option to evaluate against existing model choices in Vertex AI pipelines.

22:09 Google reveals Nano Banana 2 AI image model, coming to Gemini today

Google has released Nano Banana 2, technically named Gemini 3.1 Flash Image, which replaces both the standard and Pro variants of the previous Nano Banana model across Gemini, AI Studio, Vertex AI, and Flow simultaneously.
The model draws on Gemini 3.1 LLM web knowledge to improve object fidelity and infographic accuracy, and Google claims it delivers text rendering quality comparable to the previous Pro tier at Flash-tier speeds.
For developers building multi-character or complex scene workflows, the model supports consistent rendering of up to five characters and up to 14 distinct objects per workflow, with expanded output options ranging from 512px square to 4K widescreen.
The full replacement of prior Nano Banana variants means GCP customers on Vertex AI have no migration choice here, so teams relying on the previous Pro model for production workloads should validate outputs against the new model promptly.
Pricing details were not disclosed in the announcement, so Vertex AI customers should check the Vertex AI pricing page directly for updated image generation costs tied to the Gemini 3.1 Flash Image model.

22:32 Justin – “I’m excited to plug this one into our show cover generator; I’ve been using Nano Banana 1, and if you’ve checked out our show covers lately, you’ve noticed they’ve become fun cartoons based on our show titles.”

22:54 GPT-5.3 Instant: Smoother, more useful everyday conversations

OpenAI released GPT-5.3 Instant as the new default model in ChatGPT, available to all users today and to developers via the API as gpt-5.3-chat-latest, with GPT-5.2 Instant remaining available for paid users until June 3, 2026.
The update targets conversational quality issues that benchmarks typically miss, specifically reducing unnecessary refusals, moralizing preambles, and overly cautious responses that users flagged as frustrating in GPT-5.2 Instant.
Hallucination rates show measurable improvement: 26.8% reduction in high-stakes domains like medicine, law, and finance when using web search, and 19.7% reduction using internal knowledge only, based on OpenAI’s internal evaluations.
Web search integration is notably improved, with the model now balancing retrieved results against its own reasoning rather than defaulting to link lists, producing more synthesized and immediately usable answers.
Developers should note this is a drop-in update to the existing model endpoint, meaning applications using gpt-5.3-chat-latest will automatically get the improved behavior, which could affect any downstream applications that relied on the previous refusal or response patterns.

25:07 Matt – “Testing the models before you roll them out into production. One of the things… how do you actually test these models and prove they’re working? And a lot of customers and questionnaires all require measurable statistics.”

AWS

27:58 Amazon DC Impacted in Operation Epic Fury

Two simultaneous outages hit AWS Middle East regions on March 1-2, with ME-CENTRAL-1 (UAE) suffering physical fire damage to a data center that knocked out two of three availability zones, and ME-SOUTH-1 (Bahrain) experiencing a localized single-AZ power failure.
The UAE incident demonstrated a critical edge case where S3, normally resilient to single-AZ loss, began failing for ingest and egress once a second AZ went down, highlighting that multi-AZ redundancy assumptions break down when two zones are simultaneously unavailable.
Recovery timelines extended beyond 24 hours in both regions due to the need for physical facility repairs, cooling system restoration, and coordination with local authorities, underscoring that some failure modes fall outside software-level remediation.
AWS recommended customers failover to EU regions for ME-CENTRAL-1 workloads, restore from EBS snapshots in unaffected regions, and use the allow-reassociation flag to migrate Elastic IPs to healthy AZs, which are standard DR playbook steps that many customers may not have pre-tested.
This incident is a practical reminder that multi-AZ deployments alone are insufficient for high-availability requirements in smaller regions with fewer AZs, and that cross-region DR plans with tested failover procedures are necessary for critical workloads.
Directly from Status Page: Due to the ongoing conflict in the Middle East, both affected regions have experienced physical impacts to infrastructure as a result of drone strikes. In the UAE, two of our facilities were directly struck, while in Bahrain, a drone strike in close proximity to one of our facilities caused physical impacts to our infrastructure. Finally, even as we work to restore these facilities, the ongoing conflict in the region means that the broader operating environment in the Middle East remains unpredictable. We recommend that customers with workloads running in the Middle East consider taking action now to back up data and potentially migrate your workloads to alternate AWS Regions

29:38 Justin – “This is a real big deal because as our show title said tonight… DR is going to become a real big deal now. If you’re in the business where you need to host data for other customers across the globe, your job just got a lot harder.”

37:26 Amazon invests $50B in OpenAI, deepens AWS partnership with expanded $100B cloud deal

Amazon is making a $50 billion investment in OpenAI as part of a $110 billion funding round that also includes SoftBank and NVIDIA, valuing OpenAI at $730 billion pre-money.
Separately, OpenAI and AWS are expanding their existing cloud agreement by $100 billion over eight years, which analysts estimate could add roughly $17 billion annually to AWS revenue.
A key technical component of the deal is OpenAI committing to consume 2 gigawatts of capacity on Amazon’s Trainium chips, giving AWS a high-profile validation of its in-house AI silicon at a scale that helps justify Amazon’s $200 billion capital expenditure plan for 2026.
AWS and OpenAI will co-create a Stateful Runtime Environment delivered through Amazon Bedrock, allowing enterprise customers to build AI agents that retain context and handle complex multi-step tasks, with AWS serving as the exclusive third-party cloud distribution provider for OpenAI Frontier.
Microsoft retains exclusivity over stateless OpenAI API calls, meaning simple one-and-done AI requests still route through Azure, while Amazon is positioning AWS as the infrastructure layer for stateful, context-aware, and agent-based workloads where the compute intensity and revenue potential are substantially higher.
Amazon also maintains its existing partnership with Anthropic, meaning AWS customers now have access to models from two of the leading AI labs, which broadens the options available through Bedrock without requiring customers to commit to a single model provider.

41:29 Justin – “I am more and more convinced every day that we are in an AI bubble. I do not see how they’re going to generate the revenues required to cover the capital investments that all of these cloud providers are making.”

43:18 AWS Security Hub Extended oﬀers full-stack enterprise security with curated partner solutions

AWS Security Hub Extended is a new plan that bundles curated third-party security tools from partners like CrowdStrike, Okta, Splunk, Zscaler, and Proofpoint directly into the Security Hub console, covering endpoint, identity, email, network, and cloud security in one place.
AWS acts as the seller of record for all partner solutions, meaning customers get a single consolidated bill, pre-negotiated pay-as-you-go pricing, and no long-term commitments, which removes the overhead of managing separate vendor contracts.
All security findings from both AWS native services and partner tools are normalized using the Open Cybersecurity Schema Framework (OCSF) and automatically aggregated in Security Hub, making cross-environment threat correlation more straightforward.
Enterprise Support customers get unified Level 1 support across all participating solutions, which reduces the friction of figuring out which vendor to contact when issues span multiple tools.
The Extended plan is generally available now across all commercial AWS regions where Security Hub is supported, with both consumption-based and flat-rate pricing options available at aws.amazon.com/security-hub/pricing.

44:11 Justin – “Thank you, Amazon. It’s only taken you 10 years to get to this point – because this is cool. Build partnerships with your security vendors, standardize the inputs, and make connections for those things so they all connect together, and if I can do all that through my cloud vendor, who I already have commitments with? I think that’s fantastic.”

Quick Hits

45:41 AWS announces pricing for VPC Encryption Controls

Just pricing BUT CRAZY
VPC Encryption Controls exits free preview on March 1, 2026, introducing a fixed hourly charge per non-empty VPC with the feature enabled in either monitor or enforce mode, with no charge for empty VPCs.
The feature offers two operational modes: monitor mode audits for unencrypted traffic flows, while enforce mode actively blocks resources that would allow unencrypted traffic within or across VPCs in a region.
A notable billing consideration is that enabling encryption support on a Transit Gateway triggers standard VPC Encryption Controls charges for all attached VPCs, regardless of their individual encryption mode setting, even if those VPCs are empty.
For compliance-focused organizations, this feature provides a centralized mechanism to audit and enforce encryption-in-transit across VPC traffic flows, which is a common requirement in regulated industries like finance and healthcare.
Customers should audit how many non-empty VPCs they plan to enable this on before March 1, 2026, and pay close attention to Transit Gateway attachment costs, as those charges can accumulate across a large number of attached VPCs. Detailed regional pricing is available on the VPC pricing page.

46:00 Matt – “Go cry a little bit.”

48:03 Policy in Amazon Bedrock AgentCore is now generally available

Policy in Amazon Bedrock AgentCore is now generally available, giving security and compliance teams a way to define and enforce tool access rules for AI agents without touching agent code, which is a meaningful separation of concerns for enterprise governance.
The natural language to Cedar conversion is a practical feature, letting non-developers author policies that automatically translate to the AWS open-source policy language, lowering the barrier for ops and compliance teams to participate in agent governance.
The AgentCore Gateway acts as an inline policy enforcement point, intercepting agent-tool traffic and evaluating each request before allowing or denying access, which mirrors familiar patterns from API gateway and service mesh architectures.
The feature is available across 13 AWS regions at launch, including major US, European, and Asia Pacific regions, giving organizations with data residency requirements reasonable coverage from day one.
Pricing details are not specified in the announcement, so teams evaluating this for production workloads should review the AgentCore pricing page and documentation at docs.aws.amazon.com/bedrock-agentcore/latest/devguide/policy.html before planning deployments.

49:27 Ryan – “I like the Cedar natural language processing, but I wonder how practical it is to write policies that allow agent-to-agent and tool communication.”

GCP

57:07 Combat API sprawl using Apigee API hub

Apigee API hub now integrates directly with API Gateway to automatically synchronize API definitions, OpenAPI specs, and gateway configurations in near real-time, giving platform teams a single control plane for APIs spread across multiple gateways and platforms.
The new specification boost add-on, currently in public preview, uses AI to scan API specs for gaps like missing usage examples or undefined error codes, then generates an enhanced parallel version labeled specboost-draft without overwriting the original, so teams can compare before adopting.
The core problem being addressed is that incomplete or undocumented APIs cause AI agents to fail at function calling or miss APIs entirely, so centralizing and enriching specs directly improves agent reliability in agentic workflows.
Both features are available now, with API Gateway users seeing onboarding prompts directly in the console.
Pricing details for the spec boost add-on are not specified in the announcement, so teams should check the Add-on management section of the API hub for current cost information.
Organizations running legacy specless proxies with no documentation stand to benefit most immediately, as the spec boost add-on can generate documentation for APIs that currently have none, making them visible to both developers and automated tools.

52:08 Matt – “Any undocumented API is always a problem, whether you’re using it or one team uses something they don’t know, or a client finds that should be a dark API that is public, and that always becomes a problem. So, a way to centralize that and kind of help address API sprawl in general is a great thing and will make people’s lives so much better.”

52:41 Improve chatbot memory using Google Cloud

Google Cloud’s polyglot storage approach for chatbot memory combines Memorystore for Redis, Cloud Bigtable, and BigQuery to handle short, mid, and long-term conversation history, respectively, addressing a common scaling challenge for conversational AI applications.
Memorystore for Redis handles the hot layer with sub-millisecond latency using Redis Lists and RPUSH commands, while Bigtable serves as the durable mid-term store using a user_id#session_id#reverse_timestamp key pattern to enable efficient range scans across millions of simultaneous sessions.
Bigtable’s garbage collection policies allow teams to retain only recent data, such as the last 60 days, in the high-performance tier, while older data flows asynchronously to BigQuery via Pub/Sub and Dataflow for archival and analytics without impacting live application performance.
Cloud Storage handles unstructured multimedia artifacts using a URI pointer strategy with signed URLs, keeping the primary databases lean while maintaining secure, time-limited access to files generated or uploaded during conversations.
This architecture is relevant to any team building production-scale agentic applications on Vertex AI Agent Builder, particularly in industries like customer service, healthcare, and financial services, where maintaining accurate long-term conversation context is a compliance or user experience requirement. Pricing varies across each component based on storage volume and query usage.
Ryan loves this almost as much as he loves The Eagles.

Quick Hits

55:42 Spanner columnar engine in preview

Spanner columnar engine is now in preview, adding columnar storage alongside traditional row-based storage to enable analytical query acceleration of up to 200x on live operational data without impacting transactional workloads.
This addresses the longstanding trade-off between OLTP and analytical performance in a single horizontally scalable system.
The engine uses vectorized execution to process data in batches rather than row-by-row, and Spanner automatically routes large-scan analytical queries to the columnar representation.
A new major compaction API also lets users manually trigger the conversion of existing data into columnar format.
A key use case is reverse ETL from Iceberg lakehouses, where processed analytical data from BigQuery, Databricks, Snowflake, or Oracle Autonomous AI Lakehouse gets loaded into Spanner for sub-second, high-concurrency serving. This targets scenarios like real-time dashboards, AI agent features, and user-facing applications that need low-latency access to precomputed insights.
The BigQuery integration is notably bidirectional, supporting federated queries via external datasets, reverse ETL pushes from BigLake Iceberg tables into Spanner, and live CDC streaming from Spanner back into BigQuery and BigLake Iceberg via Datastream. Oracle GoldenGate 26ai also now supports direct replication into Spanner.
The feature is available in preview and can be enabled on existing Spanner tables via a DDL change, with benchmark queries available on GitHub.
Pricing follows standard Spanner node pricing, with no separate cost structure announced for the columnar engine specifically.

55:52 Justin – “If you don’t know anything about columnar databases, you don’t know how cool that is.”

Azure

57:31 Announcing new public preview capabilities in Azure Monitor pipeline

Azure Monitor pipeline now supports TLS and mutual TLS for TCP-based ingestion endpoints in public preview, allowing teams to encrypt data in transit and enforce mutual authentication without relying on external proxies or custom gateways.
This is particularly relevant for regulated environments and edge deployments where plain TCP ingestion no longer meets security requirements.
The new execution placement configuration gives Kubernetes users direct control over how pipeline instances are scheduled across nodes, addressing practical problems like port exhaustion, multi-tenant isolation, and availability zone distribution.
Notably, if the cluster cannot satisfy placement rules, the pipeline simply will not deploy, making failures predictable rather than silent.
Data transformations allow teams to filter, aggregate, and normalize telemetry before it reaches Azure Monitor, including converting raw syslog or CEF messages into standardized schemas using KQL templates. This addresses the cost and complexity of ingesting high-volume noisy data and cleaning it up after the fact.
All three capabilities are in public preview today and target organizations running Azure Monitor pipeline on on-premises infrastructure, edge locations, and large Kubernetes clusters.
Pricing is not separately detailed for these features, so costs would follow existing Azure Monitor ingestion and data processing rates, which vary by volume.

58:38 Matt – “It’s their ETL pipeline service… that’s kind of why this is a big deal.”

59:43 Microsoft Sovereign Cloud adds governance, productivity, and support for large AI models securely running even when completely disconnected

Microsoft has expanded its Sovereign Cloud offering with three new capabilities targeting organizations that need to operate in fully disconnected environments: Azure Local disconnected operations, Microsoft 365 Local disconnected, and large model support in Foundry Local.
These are aimed at government, defense, and regulated industries where external connectivity may be intentionally restricted or prohibited.
Azure Local disconnected operations allow organizations to run infrastructure with Azure governance and policy controls without any cloud connectivity, meaning management and workload execution stay entirely within customer-operated environments. This is now generally available worldwide, though pricing is not publicly listed and would depend on hardware and licensing configurations.
Microsoft 365 Local disconnected brings Exchange Server, SharePoint Server, and Skype for Business Server into the sovereign private cloud boundary, with Microsoft committing support for these workloads through at least 2035. This extends productivity capabilities to teams operating in air-gapped or isolated environments without requiring a cloud connection.
Foundry Local now supports large multimodal AI models running on-premises using NVIDIA GPU infrastructure, enabling local inferencing entirely within customer-controlled data boundaries. This moves beyond the small model support Foundry Local previously offered and is currently available to qualified customers rather than broadly.
The overall architecture is designed to span connected, hybrid, and fully disconnected modes under a consistent governance model, which reduces the operational complexity of managing separate toolsets for different connectivity scenarios.
Organizations considering this stack should evaluate hardware requirements carefully, given the GPU dependencies for AI inferencing workloads.

57:25 Best Practice: Using Self-Signed Certificates with Java on Azure Functions

Winner of the dumbest feature of the week:

Java developers on Azure Functions Linux who connect to services secured by self-signed certificates frequently encounter SSL handshake errors because the JVM only trusts well-known Certificate Authorities by default. The recommended fix is creating a custom truststore in the persistent /home directory and pointing the JVM to it via JAVA_OPTS application settings.
The core reason to use /home for the truststore rather than system JVM directories is that the Linux Functions file system is ephemeral, meaning any changes outside /home are wiped on restart, scaling, or platform updates. Storing the keystore at a path like /home/site/wwwroot/my-truststore.jks ensures it survives those events.
One practical deployment gotcha worth noting is that ZipDeploy or Run From Package configurations can overwrite /home/site/wwwroot contents during code deployments, so storing the .jks file in a separate directory like /home/my-certs/ is a safer long-term choice.
Azure Functions Linux behaves differently from Azure App Service Linux in a notable way: App Service startup scripts often auto-import platform-managed certificates into the JVM keystore, but Functions does not, meaning OS-level tools like curl may succeed while Java code still throws handshake errors.
For teams that prefer not to manage server-side keystore files, two code-based alternatives exist: loading an Azure-managed certificate from /var/ssl/certs via custom SSLContext code, or bundling a locally built JKS file inside the application JAR. Both require application code changes, which adds maintenance overhead compared to the JAVA_OPTS approach.

1:03:46 Justin – “This is just a way for you to troubleshoot certificates even worse than you were troubleshooting it before.”

Quick Hits

1:05:19 Announcing general availability of Azure Intel® TDX confidential VMs

Azure has moved its Intel TDX confidential VMs to general availability, using 5th Gen Intel Xeon processors to provide hardware-enforced isolation that protects data while in use, which addresses a longstanding barrier for organizations running sensitive workloads in the cloud. Notably, existing applications can be deployed without any code changes.
The new VM series (DCesv6, DCedsv6, ECesv6, ECedsv6) introduces NVMe local SSD support as a first for Azure confidential VMs, delivering roughly 5x more throughput and about 16% lower latency compared to the previous SCSI generation, with IO latency reduced by approximately 27 microseconds.
These VMs are the first in Azure confidential compute to use the open-source OpenHCL paravisor, which increases transparency and allows customers to cryptographically verify workload integrity rather than simply trusting the cloud operator.
The open-source component is available at github.com/microsoft/openvmm.
Intel AMX acceleration is built in, making these VMs suited for confidential AI workloads such as protecting model weights and running cross-organization AI pipelines without exposing underlying data.
Azure Boost support adds up to 205k IOPS, 4 GB/s remote storage throughput, and 40 Gbps network bandwidth.
General availability is currently limited to the West US and West US 3 regions, with support for Windows Server 2025 and Ubuntu 22.04 and 24.04. Pricing is not specified in the announcement, and customers can request preview access in additional regions at aka.ms/acc/v6preview.

1:10:10 Generally Available: Draft & Deploy on Azure Firewall

Azure Firewall Policy now supports a two-phase Draft and Deploy workflow, meaning teams can stage policy changes before committing them, which reduces the risk of unintended disruptions during updates.
Previously, any policy change triggered a full firewall deployment, which could cause delays and service interruptions.
This feature separates the authoring phase from the deployment phase, giving teams more control over when changes go live.
The feature is particularly useful for organizations with strict change management processes, as it allows multiple edits to be batched and reviewed before a single deployment is executed, rather than deploying each change individually.
This is now generally available, so production workloads can rely on it. Azure Firewall Policy pricing remains consumption-based, and customers should check the Azure Firewall pricing page at azure.microsoft.com for current rates, as costs vary by policy tier and region.
Teams managing complex or high-traffic environments will benefit most, since reducing the frequency of full deployments directly translates to fewer maintenance windows and more predictable firewall behavior.

1:10:27 Azure Container Registry Premium SKU Now Supports 100 TiB Storage

Azure Container Registry Premium SKU now supports up to 100 TiB of storage, a 2.5x increase from the previous 40 TiB cap, with no configuration changes required for existing registries to benefit automatically.
The increase directly addresses a real operational pain point where enterprises were splitting workloads across multiple registries just to stay under limits, adding complexity to access control and networking that had nothing to do with actual business requirements.
AI and ML workloads are a clear driver here, as teams storing large model artifacts, training outputs, and inference containers were consuming registry capacity faster than anticipated, alongside normal container workload growth.
Microsoft also improved geo-replication data sync speeds for new replicas and added a storage consumption view in the Azure Portal Monitoring tab, two improvements that had been customer requests for some time.
The 100 TiB limit is exclusive to Premium SKU, so teams on Basic or Standard tiers will need to upgrade to access it, though Premium also includes geo-replication, private endpoints, and enhanced throughput.
Pricing details for Premium SKU storage are available at the Azure Container Registry pricing page.

1:10:47 Ryan – “So now instead of two windows container images you can store FOUR.”

1:13:37 New Azure API management service limits

Azure API Management is rolling out updated resource limits starting March 2026, aligning classic tier limits with v2 tier limits across entities like API operations, tags, products, and subscriptions. This affects all service tiers in a phased rollout over several months.
Existing classic tier customers whose usage exceeds the new limits will be grandfathered in, with their limits set 10% above observed usage at the time the new limits take effect.
New services and those under the new thresholds will be subject to the updated limits immediately.
Limit increase requests will only be considered for Standard, Standard v2, Premium, and Premium v2 tiers, with Premium customers receiving priority. Requests are evaluated case by case and are not guaranteed, so teams relying on high resource counts should audit their usage now.
Before requesting a limit increase, Microsoft recommends reviewing the Manage Resources Within Limits documentation at learn.microsoft.com, as some increases can introduce latency or affect service capacity.
This is a practical reminder that limits exist to protect shared infrastructure performance, not just to restrict usage.
Pricing for API Management tiers varies, with the Developer tier starting around $0 for testing and the Premium tier running substantially higher for production workloads. Customers on lower tiers, like Consumption or Developer, cannot request limit increases, so production workload planning should account for tier selection early.

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod

[00:00:00] Speaker A: Foreign, [00:00:08] Speaker B: Where the forecast is always cloudy. We talk weekly about all things aws, GCP and Azure. We are your hosts, Justin, Jonathan, Ryan And Matthew. [00:00:18] Speaker A: Episode 345 recorded for March 4, 2026. Damn it. My excuse is now gone for doctor. Good evening, Matt. How you guys? How you doing? [00:00:29] Speaker C: I'm good, how are you? [00:00:30] Speaker A: Good. Other than my car's in the shop because they said it would be done but then it wasn't done, so then I had to Uber home and so we're recording late and so now it's super late for you. Ryan's lost in feeding his kids and I don't. Jonathan I think turned into a pumpkin. So hopefully Ryan joins us here as soon as his kids are properly fed and no longer trying to take over the house, you know, it's fine. Yeah. I mean, is it Lord of the Flies? Like I just imagine Ryan's house just Lord of the flies all the time, so. [00:01:01] Speaker C: Well, he's outnumbered. [00:01:03] Speaker A: Yeah, he's outnumbered. So I did listen to Jonathan and you crushed last week's episode. Great job. Nice work. Appreciate it. [00:01:09] Speaker C: I don't know why I re. I really enjoyed that. I think that was one of our better ones for some reason, like at least of Jonathan Mines together. [00:01:14] Speaker A: You guys have some good discussions, which I was. I was thinking we'd spend a little more time on some topics talking about different ways to think about it and different stuff. And I can tell you right now, this week you probably have a bunch of that as well. So we're now in the new world which is going to probably change the cloud world forever. So it's going to be kind of interesting, but let's just get into some basic news then get into the fun stuff here. So first up, Anthropic acquired Versept to advance Cloud's computer use capabilities. There's a team specializing in AI perception and interaction to strengthen Cloud's computer use capabilities. I've used cloud computer use. It's very slow, so I am hoping Percept picks up the pace of that. It makes it a little bit better. Clouds on 4 to 6 did show substantial improvements in computer use benchmarks, jumping from 15% on the OS World evaluation in late 2024 to 72 and a half percent today. Models now approaching human level performance on tasks like navigating spreadsheets and completing multitab web forms. Computer use enables cloud to operate inside live applications the way a human would, handling multi step workflows across tools that cannot be automated through code alone. This is Anthropic's second acquisition in a short period following the purchase of bun, which was tied to the cloud code milestone. And the pattern suggests Anthropic is actively acquiring specialized engineering teams rather than just technology assets. So I look forward to seeing what we get out of Recept. [00:02:29] Speaker C: Yeah, I mean, I've been using Claude more and more in PowerPoint, in Excel. I have a side project I've worked where it's like generating the Excel files and it's fairly impressive. I mean, it's definitely making some of the same mistakes over and over again. And I'm like, no, no, no, please. I actually use formulas. Don't just dump the number in there. Therefore, if you ever tweak it in the Excel, it was all tied together. But overall, you know, I definitely started with 4.5 on a few of these projects and got upset with it and then moved to 4.6 and it seems to be doing a lot better, just kind of working through it on both, you know, leveraging the native built in, I'm going to call it native, but the plugin for Claude as well as the, you know, actually just interacting with it. [00:03:14] Speaker A: Yeah, I'm, I'm, you know, a couple things. It seems like every day I have to update cloud code because they've released new features and new capabilities and I've gotten agents work, sub agents working pretty well. I've gotten, you know, then there's all kinds of new skills coming out. There's all kinds of different things that are just, you know, make it better and better. And so it's, it's constantly reading, refining your pipeline, refining what you're doing, tweaking it, and then rinse, repeat and go through the whole process again. So it's, it's sort of interesting, but it's, it's so cool because it's such a fun time. It's sort of like the earliest of the Internet when like, you know, we were figuring out, oh, we don't have to just use HTML, we can use CSS and we can add graphics and we can do all this other cool stuff. I remember all that developing in the web and, you know, getting out of the box gardens of Prodigy and AOL and you know, then people start putting video on and it was super slow and dumb because it was all in up and you're like, well, that's not going to be it. That's not a reliable use case. And then, you know, everything's evolving. Broadband comes out. All that like these type of major paradigm shifts, if you're paying attention to them. Are really interesting. So it's for a history buff who likes these type of changes. It's kind of cool to see a new technology going now. I'm not super excited about, you know, risk to my future prosperity as a. As an employee slash career. I'm not super excited about some of that stuff, but I'm hoping to become an AI native who can use all these tools really well, and then that should hopefully give me some security because I'll be an expert at running, you know, people and ait. AI people. [00:04:39] Speaker C: Yeah, I mean, I'm not quite as advanced with you. I've started playing with sub agents here and there. I mean, I have a pretty good kind of workflow for a couple of my projects that I have. You know, hey, write it, review it by multiple different Personas, kind of iterate through it and whatnot. I do want to play with more subagents. I'm just not there yet. It's kind of one of those things I need to sit down and just dive deep into for a little while. You know, I'm trying to, you know, the other thing is I'm trying to teach, you know, especially different teams and whatnot at my company how to do it, because it's amazing how much faster you get at things, you know, and then let alone working on multiple projects at once. So, like, yesterday I was sitting there and I was like, oh, here, go do this spike work for this thing I need to do and go work on my little side Excel project. I know, hey, let me go jump on with one of my employees and kind of walk through a technical issue that they were having. So, you know, and doing it all three at once, you know, it was like losing track of the different projects, too, I was working on. But it's. It's kind of fun still. [00:05:38] Speaker A: Well, and, like, I still strongly encourage, like, everyone on my team. And even if you're, you know, here's listening to us and you're trying to figure out good ways to use it, like create your boss as an AI prompt. Like, you know, you know, the basic things you. I mean, assuming you have a good relationship with your boss, you probably know things he cares about, things he worries about. When he reviews your projects, reviews your work, what are the things he always asks you about? You know, what's his personality type? You know, all these things, you can add them into a prompt and you can save that prompt. You can even turn into a skill if you want to get fancy. But, you know, use what you're sending to your boss. And say, hey, I'm gonna about to send this email to my boss or to this project manager or to whoever. And you have these skills or you have these prompts and like, put it in the system and then you can find out, like, it gives you good feedback. I was telling someone trying to write a business case for a new headcount for something, and they had some data. And I was like, you know, this. This use case doesn't make any sense to me. The. The math's bad. And I said, you know, I had to go present this to our chief of staff, and if I give this to her, she's going to, you know, blow it up, so I need you guys to fix it. And they're like, well, what do you mean? I'm like, well, here. And I, like, have a prompt that's kind of, you know, some things I know about her and how she thinks about business and things like that. And. And I asked it, like, what's the percentage chance that you would even approve this as a headcount based on this email? [00:06:52] Speaker B: Email. [00:06:52] Speaker A: And it comes back and it's like 40%. I'm like, so, like, you really want this position? You really think this is important? You think a 40% is a good enough chance? And they're like, no, no. And so, like, they took my prompt and now they've been RA. I think they have up to like, a 90% chance. You know, again, I don't know if AI is that accurate on that, but I'm like, hey, you're still making a better case. It's helping you refine the case. And, and like, people don't default to that yet, because, again, like, the people just aren't used to these things. And that's where it's. It's interesting too, as you go through, like, transformation of your engineering team, getting them. You know, it's. It's the transition to cloud native all over again, but now it's AI native. And it's, you know, it's a culture change. It's thinking differently. It's how do I use the AI to do the task? Or how do I. How do I use agentic agents to do a bunch of different things? And so it's this. This paradigm shift. And it's cool and it's. But, like, some people just aren't trying. I'm sort of, like, frustrated because I'm like, my default mode is how do I use AI to make this better? And like, with anthropic tools, which are great, you know, it can. It has enough, I've had enough historic, historic conversation with it that it can mimic my writing sty. So like, it can help me write emails that are very similar to how I would write them, but I just have to tweak little minor things. And like, all my reviews this year were written with AI because it just, you know, I. Not that I didn't want to put the time into the reviews, but like, you know, I have really clear points I want to make, but I'm not really good at the warm and fuzzies of why, you know, these things are so it helps me put, you know, some of the wrappers around it and like, helps me suggest different action plans for the person to help improve in certain areas. I'm telling them they need to improve it. So like the, and I've been giving those reviews now finally. And like the feedback is, these are really good, this is really helpful. It gives me some guidance and it's like, so. And I told them all, I'm like, hey, I use AI a lot. I'm not gonna, not gonna hide that fact. I had a lot of reviews to do. But, you know, it's interesting to get the feedback and they're like, you know, and I tell them how I did it and I'm like, look, I incorporated the feedback from other people, incorporated your feedback, I incorporated mine and the things I thought you were doing well and things. And I, you know, we put this together together is per. They're like, this is cool. So they're, they're getting it. But AI native is just the same as cloud transformation transitions and it's going to take time for people. [00:08:54] Speaker C: Yeah, I'd say I'm not nearly as far as like the first thing I do, but you know, I think in the last two weeks I'm spending a lot more time being like, okay, the first thing I need to do is do AI and have it review it or have it scan it or have it kind of think about it, you know, versus even just the simple things of, hey, I'm having this error in Python, how do I fix it? Just. And rather than going to Google, just going to AI, you know, especially if it has the context of your entire repo and everything else, it really does make such a dramatic improvement, you know, and I remember seeing there was a stack overflow chart that I think we've talked about in the past that like the usage of stack overflow in the last like five years, it's just this like plummeting line downwards of, of, you know, barely getting used. And it's just, it's fascinating because it's so much better and so much faster to get the same results. And I feel like half the time I'm like, you know, on crack because how fast I'm getting stuff done or on speed because I'm getting stuff done so much faster than I was before. So it's, it's definitely a mind shift change and I'm not quite there yet, but I'm actively working on it for, you know, my, my own sanity. [00:10:07] Speaker A: Great. Except speaking of skills, I was, I could use the segue right ahead. [00:10:13] Speaker C: Yeah, I, I thought I set you up for a little bit. [00:10:15] Speaker A: A little bit. You tried and I just didn't pick up the pieces you were putting down. Anthropic has updated Skill Creator tool for Cloud Agent skills now available on Claude AI cowork and as a plugin for cloud code. This update brings software development practices like testing, benchmarking, and iterative refinement to skill authoring without requiring users to write the code themselves themselves. The Core Edition is an eval framework that lets skill authors define test prompts, describe expected outputs, and verify skill behaviors across model updates. And a practical example given is the PDF Skill Fix where evals isolated a position failure on non fillable forms and guided a targeted fix. A new benchmark mode tracks eval pass rate, elapsed time and token usage and can be integrated into CI systems or local dashboards. Multi agent parallel eval execution is also included to reduce test time and prevent context bleeds between your runs. Comparator agents enable AV testing between two skill versions or skill versus no skill with blind judging to reduce bias and evaluating whether a change actually improved output quality. The problem knows that as a base model capabilities improve some capability, upload skills maybe become unnecessary. And the eval framework is positioned as a step towards skill event skills eventually being defined by natural language descriptions or desired outcomes rather than detailed implementation instructions. So again, the skill builder getting these is great because being able to QA them and be able to test different iterations of it is important. Especially if these skills are like key to your SDLC pipeline. The ones I was giving you earlier about like you know, you can create a Persona and create a skill, those ones probably don't need this. But you know, for things that are actually in pipelines or in agentic capabilities where you you want things to be specific, this is great. [00:11:45] Speaker C: Yeah, I definitely like the, you know, A B testing and the, and the multi agent parallel, you know, for evaluation. Kind of really getting that multiple things it's the Ralph Wiggums, you know, like, hey, run the same thing over and over and over and over and over again until you kind of get something out there, you know, and get a consistent result. And I think that's just a great way to kind of build these things out and hopefully in a definable pattern where even though AI is by nature a little bit unpredictable to get it kind of close to where you want it to be. Agreed. [00:12:18] Speaker A: Well, Friday was a bad day for Anthropic. Basically, Pete Hesketh, the Secretary of War head basically has banned Anthropic from supporting the military, which then followed with President Trump banning usage in all federal agencies. Basically, you know, they're saying Anthropic publicly refused to allow CLAW to be used for mass domestic surveillance of Americans or fully autonomous weapons, citing concerns about current AI reliability and civil liberties. These two exceptions led to a breakdown in negotiations with the Department of War. After months of discussion, discussions prior War is moving to designate Anthropic as a supply chain risk at under, under 10 USC 3252, which has never been put onto a US company before and typically only been put onto US adversaries like Huawei, for example, was designated this in the past. Anthropic has indicated it will challenge any such designation in court. And from a practical standpoint, the legal scope of a supply chain association is narrow. It only affect use of cloud on Department of War contract work, leaving commercial API customers, cloud, AI users and non Dow contractor use case clearly unaffected. Maybe this is where things get tense because the way that the ban is awarded is that you're not allowed to use anything that came from Claude. And so if the output of a cloud code session into a software product that then is sold to the US Government, if that is considered part of a supply chain risk, that could be pretty devastating because a lot of companies, you know, it may not be that you're selling to the government directly, but you're selling to Boeing or you're selling to a Lockheed Martin or to another, you know, contractor. And so they are under these same regulations. So it could have really big ripples for Anthropic and Anthropic's future as a company. And interestingly enough, Anthropic in general is saying, you know, we're, we're basically saying that, you know, they wanted to add all lawful uses to the contract and they didn't feel that was appropriate and that they didn't want these things used for war and they have good reasons for not wanting to use for war. There was a study that came out two weeks ago where war games were performed with using all the leading AI models. And I think it was something like 30 out of 31 simulations. They all went for the nuclear option at some point and destroyed major countries with nuclear. So yeah, maybe you don't want to use these AI models yet in a gentic warfare perhaps, you know, if you want to, you know, not have nuclear Armageddon Terminator style, which I also feel [00:14:41] Speaker C: like goes against some of the core prompts that are in there. So I wonder if DOD when they set it up is, you know, getting their own custom core prompt which removes some of those things because there's things in the core prompt that's like do not harm people. War in theory always harms somebody, whether you believe which side of it there's still someone getting harmed. So it'd be kind of curious how either way it was getting around that then. That's the only thing I can really think of. [00:15:09] Speaker A: Well, I mean, I think, I mean, my understanding is that these models were deployed inside the DoD, on servers, inside DoD data centers. And so, you know, those models may not be identical to the models that we had, you know, for public consumption, of course, for obvious reasons. So I'm curious, you know, it's interesting because, you know, looking at this like, or you realize Anthropic was involved in the taking down of Maduro and even in this particular scenario they have six months to ramp down this relationship, which then leads them into the weekend where, you know, they basically declared war on Iran with Israel and attacked them, probably using cloud data as well. And so I could see potentially where Anthropic, you know, who has this desire to do no harm and to do the right things and you know, to be more aspirational than that, doesn't want their system used for government. But also they ran to the government very quickly. They're, they're the first cloud provider to be available in federal government. So it's all in very interesting. And to make it even more suspicious if you're in the conspiracy theory side, a few hours after they basically terminated the contract with Anthropic, OpenAI signed a classified AI deployment agreement with the Pentagon using a cloud only architecture, meaning models run on OpenAI infrastructure rather than on edge devices or government controlled hardware, which is central to how they enforce their safety constraints. The agreement includes three stated red lines. No mass domestic surveillance, no direct autonomous weapon systems, and no automated high stakes decisions without human approval. But they did allow them and they retain the full control, but they did allow them to keep the any lawful purpose, which people say is a very common way to get around some of these restrictions around domestic surveillance and autonomous weapons, if they can point to laws. And then also the point is, in my mind is that the current administration doesn't seem to really care about laws, so. Also somewhat problematic. [00:16:56] Speaker C: So. [00:16:57] Speaker A: So, you know, it looks like a pretty big coup for OpenAI, who happens to have a lot of investors who are either friends with the administration or in the administration, who basically took out their number one competitor in the federal market through this action and then basically signed the exact same contract with them with actually more restrictions. Cause they're still maintaining access to all the servers and they're prevent, you know, they control what goes through them and basically they're set to win a lot of money from the government based on this. All a little bit suspect. [00:17:31] Speaker C: Yeah. It's gonna be interesting to see kind of where all this flushes out because the news kind of always blows it up. And I think I saw that there was like, you know, after all that, like there was something I read it about like mass unsubscribing, deleting from OpenAI and ChatGPT and all these things. You know, even, you know, if you talk about like it being a supply chain risk, like going back to anthropic, it's like, how do you prove where it was used? And does that mean you can't use anthropic models anywhere? So even if you are in GitHub copilot, you can't use Opus or Sonnet, you know, and it me like in my head, it doesn't really mean you're gonna get anywhere. This is just people blowing smoke and being upset, you know, and saying because they got, they couldn't get something and they had a microphone at their hand. So all this really kind of will take a little bit of time. [00:18:24] Speaker A: Yeah, it's gonna go. It's gonna take years to record. And then also like the precedent that could be set potentially that, you know, the government can basically declare any vendor they want to as a threat to supply chain risk, you know, feels like it's going to violate several amendments of the Constitution. So, you know, I expect this is going to become a Supreme Court battle at some point. You know, how many years it takes to get to the Supreme Court will depend. Well, we'll come back to this in a couple of minutes here, but let's first jump into some new models this week. New toys, new shiny things. Gemini 3.1 flash light launched this is the in preview but with the Gemini API and Google AI. Studio and Vertex AI priced at 25 cents per million input tokens and a dollar 50 per million output tokens compared to 2 and a half 2.5 flash. The new model delivers 2 and a half x faster time to first answer token and 45% higher output speed according to artificial analysis benchmarks. Model includes configurable thinking levels letting developers dial reason dial reasoning depth up or down depending on task complexity, which is pretty nice. They also revealed nanobana two technically named Gemini 3.1 flash image, which replaces both the standard and Pro variants of the previous nanobana models across Gemini AI, Studio, Vertex AI and Flex Flow simultaneously. The model draws on Gemini 3.1 L on web knowledge to improve object fidelity and infographic accuracy, and Google claims it delivers text rendering quality comparable to previous Pro tier at Flash tier speeds. I'm excited to plug this one into our our show cover generator because I've been using Nano Beno 1, which is pretty good and if you've checked out our show covers lately, you've noticed they've become fun cartoons based on our show titles. So so that's all thanks to AI because I'm not artistic and neither is any of our other co hosts, but it's been fun to do those. Definitely not and then GPT 5.3 incident is our last model from OpenAI. This is a new default model in Chat GPT available to all users today and to developers by the API. There's GPT 5.3 chat latest and with GPT 5.2 instant remaining available for paid users until June 3rd. This is one of those things that's kind of annoying and some developers are really not liking this is these new models come out and they're great, but they don't have time to test and get things into the new models before they deprecate the old ones. So that's a bit of a complaint I've been starting to hear from some developers. The other thing is they say this has a 26.8% reduction in high stakes domains like Medicine, law and finance for hallucinations, and a 19.7% reduction when using internal knowledge bases based on OpenAI's internal evaluations. I do have to say that the amount of hallucinations I catch ChatGPT in is definitely higher than some of the other models that I play with. Not the not the open source ones. Open source ones are also bad hallucinations to be honest, but at least the big ones Anthropic Gemini. This one, I feel like they have the highest rate of hallucinations, at least from my experience. I know yours. [00:21:19] Speaker C: Yeah, I kind of feel the same. I don't have data to do it, but I live mainly in Claude and you know, I don't feel like I've the amount of hallucinations, especially in the 4.5 and 4.6 feel like has gone down. You know, I work in a Azure shop and I use Windows. So you know, we still do a good amount of programming in PowerShell. And you know, there is like you used to just pull up like, hey, get this thing. It would just make up a get dash powershell command. And I've kind of feel like that's gone down quite significantly in the last couple models, you know, 4.5, 4.6, which I think is great. I ran kind of a head to head on something. I think it was like two weeks ago. It was right before four, six came out. So somebody can figure out what the math of whenever that was. And GPT kind of went down like weird side paths and didn't produce anything useful out of it. The one thing I do want to come back to what you mentioned, which is kind of testing the models before you kind of roll them out into production. One of the things that, you know, as you work through any of the living in the security world also, you know, the ISO 4201 or any of the customer questionnaires is how do you actually test these models and prove that they're working and everything along those lines. And a lot of these, you know, customers and questionnaires and I think even the 4201 all require measurable statistics. And with the rate of, you know, these things coming out, you know, before I feel like it was like, oh, this new EC2 instance is now a little bit faster and better. Should we move to it? Great, let's run our test on it. But with things rolling out and you get about only about six months to really play with it, you know, before you're like, okay, I'm so far out of date. You know, you have to kind of automate that your entire test suite, which again, moreover, I had more things to do in life. [00:23:14] Speaker A: Agreed. Oh, hi Ryan. Welcome. Hey there. Do your kids fed the. [00:23:19] Speaker B: Yeah, the emergency is over and everyone is fed and quiet. [00:23:24] Speaker A: So far we described your house as a Lord of the flies type situation when the children are hungry and that you had to address that. So. [00:23:30] Speaker B: Yeah, and I'm solo parenting, so it's like I'M really Lord of the Flies right now. [00:23:35] Speaker A: Yeah. [00:23:36] Speaker C: I'm impressed you're recording, honestly. [00:23:38] Speaker A: Yeah, I know. I would have bailed out. I'd be like, yeah, that's not gonna happen. [00:23:42] Speaker C: To be fair, it's like 10 o', clock, so we kind of already went down that path a little bit. [00:23:46] Speaker A: Exactly. I mean, I was sort of hoping that you were, you know, you were busy on your roadie tour for the Eagles, since that's your, apparently your lifelong dream. [00:23:55] Speaker B: How dare you. [00:23:57] Speaker A: Hey, I thought me. Heather. Heather. [00:24:00] Speaker B: Yeah, Heather's gonna get a talking toot as well. [00:24:02] Speaker A: Like, that's all. Yeah, well, we. We just finished up the AI section, so did you have any comments you wanted to say about Anthropic getting kicked out by the federal government and replaced by OpenAI or, you know, before we get into the next juicy topic, which will probably also relate to that somewhat, [00:24:22] Speaker B: I just find it kind of annoying when like the. The part of the news cycle that I'm trying to ignore starts to interact with the part of the news cycle that I feel is like, necessary for my job to. I'm like, oh, no, that's really my only comment is that I've been really conflicted into do I read this or not? Because I want to know the details of what OpenAI. OpenAI acquiesce to. But I haven't done my research, so [00:24:47] Speaker A: I don't have a lot to say. Yeah, we were just talking about the oddness of all the timing and impacts of some of those things. [00:24:56] Speaker C: Chaos. Yeah. [00:24:58] Speaker A: Well, we're moving on to aws and this weekend again, the administration, you know, for the multiple times this year has decided to overthrow another government, this time with Israel. And in this case they attacked Iran, which, you know, Iran does not have any data center regions that we talk about for any of our cloud providers, except for the fact that Iran was not happy with being attacked, especially since we killed their Supreme Leader, you know, in the one of the attacks. And so they have been basically attacking the rest of the Middle east and key security targets. And they targeted two data centers in Dubai. So Operation Epic Fury from the US has impacted two regions, ME Central one and UAE and two A's and then also a debris that hit a adjacent target in Bahrain took out an A and ME south one and they basically were down for about three days. [00:25:52] Speaker C: You. [00:25:53] Speaker A: They're actually still recovering in those locations. They are. They do have power back on those locations. But you know, the most telling part to me is that the statement on the status page from Amazon is due to the ongoing conflict the Middle East. Both affected regions have experienced physical impacts as a result of drone strikes in the uae. Two of our facilities were directly struck, one Bahrain. A drone strike in close proximity to one of our cities caused physical impacts to our infrastructure. Even as we work to restore these facilities. The ongoing conflict in the region means that the broader operating environment in the Middle east remains unpredictable. And we recommend that customers with workloads running in the Middle east considering are taking action now to backup data and potentially migrate all workloads to alternative AWS regions, including the eu, US and other parts of Asia. So, you know, this is a big deal because, you know, as our show title said tonight, you know, both, both Matt and I were like, man, doctor is going to become a real big deal now because you know the, the chances of having a, you know, a military action that takes out a data center, all of a sudden now it came to fruition and you know, from very low percentages of that chance to now, you know their targets, these regions and it's shown now that, you know, wherever these regions are, they're going to be potential military targets. You know, you might want to consider that if you're looking at living near a data center, which is maybe why some people are very upset about data center buildouts in their backyards. And this is a big deal and it's going to become also a big deal for data sovereignty and sovereignty laws and if you're in the business where you need to host data for other customers in these across the globe, but your job just got a lot harder. So unfortunately that's the reality we now live in after this weekend, unfortunately. And so now we will continue to see the fallout from this and the long term implications to the US economy, US companies in the global stage. But you know, hug ops to the Amazon team who, you know, are recovering these data centers in the middle of potentially being attacked again. Definitely something I didn't really consider when I was racking and stacking servers back in the day that I could potentially be attacked militarily. So very scary times considering where you [00:27:57] Speaker B: did that as well. [00:28:00] Speaker A: Well, there I did expect it when I worked in Iraq. But you know, I was thinking I'm in the data centers in Seattle and in Las Vegas and I'm just reacting and sacking servers. I'm like, oh, you know, it's no big deal, I'm just doing that. And I was like, wow, this, you know, something goes down. This is a target why I'm here. [00:28:15] Speaker B: So yeah, no, I mean and it makes sense to be a target. [00:28:18] Speaker A: Right. [00:28:18] Speaker B: There's a lot of infrastructure if you want, you know, the same what we're seeing in Ukraine where they're talking power infrastructure. Like, I get it, you know, it's. [00:28:26] Speaker C: I mean, logically it makes sense, but the back of my head, it wasn't ever anything I've ever really thought about was, you know, to how do you know, have the most, you know, terror in the world Is go attack AWS's, you know, data centers in half the Internet it's going to be down permanently. And half the companies would probably be down if you did a coordinate attack across multiple regions and data centers and everything along those lines. And somewhere now I'm on some FBI watch list as I'm saying this, you know, but, you know, crazy again, you know, that that's kind of the world we live in where, you know, the data center because, you know, government or anything else is hosted there that, you know, my, my day job. Your day job. Anything now know, dramatically matters for these things. And now I actually to have a reason to do doctor versus just saying the odds of, you know, three zones going down and taking down the. The region. You know, Azure has more motivation to get it back up and running and can do their doctor probably faster than we can do all of ours. You know, and test and do all that is probably not true anymore. Yeah. [00:29:34] Speaker B: You know, if can't doctor a physical data center. [00:29:37] Speaker A: Yeah. Maybe it's time to create an agentic. Dr. Agent story. Because the other side of this is, you know, if you're a technologist and this is Corey Quinn's point, you know, if. If I'm in technology and an attack happens on American soil and takes down data centers, you know, you're not enacting the doctor. You're going to be selling yourself to the highest bidder who needs, you know, as a mercenary to basically help whatever you need to help with. And if someone's offering, you know, 500 an hour to get their infrastructure back up or help them get their systems back up, are you going to be worried about your day job at that point? Because already the day job may not exist. It may not. You know, the world has changed dramatically in that scenario. And so like these things, you know, they're scary to think about. And that's kind of the, you know, I go back in my career to, you know, mid 2000 SaaS is kind of new. You know, the number of customers who required, you know, proof of annual doctor tests and wanted to be involved was like pretty low. And it's kind of consistently been pretty low. And then soc, you know, you put your doctor test into your SOC and then kind of everyone just checks the box and I was like oh, they have Dr. And now with this I suspect that we're going to go back to the era where customers want to be involved in your doctor test. They want to be able to validate that their services all function and work. And this also puts pressure on even your doctor that you have today. So that becomes interesting too. So lots of potential future investment in the Dr. Automation space. And how do you make Dr. Easier? Because let's not kid ourselves. Dr. For most companies is difficult. Even the ones that are good at it, suck at it. Even Yahoo, where Ryan comes from, you know, they always talk about how they would fail out of complete data centers but when they did that they still had problems. And oh for sure Cloudflare, you know, today has problems where they find dependencies in a single data center that they didn't, they had accidentally built. Not because anybody intended to make it a single point of failure, just the way certain things connected and routed it became a single point of failure. And so, you know, the scrutiny on Dr. Has become now dramatically increased I think. [00:31:33] Speaker B: Yeah, I mean the only solution really is not, is to make it not a doctor Right. [00:31:38] Speaker A: Like it's, make it active. Active. [00:31:40] Speaker B: It has to be, it has to [00:31:41] Speaker A: be business as usual. [00:31:42] Speaker B: And that's, that was sort of the, the Yahoo way. And you're right, there's still absolutely issues with it. [00:31:47] Speaker C: But, but for the average company, hot hot is just so expensive. You know, you're running your databases in multiple regions, you're running your compute, you're taking, you know, whatever your traffic is and then you bring it in one region but your SQL writer is in the other region. So now if you happen to be on your Canary, you know, 5% of traffic that hits the, you know, slower region like now that it's end user experience gets affected, you know, because that's kind of one of the things I always told people. [00:32:17] Speaker B: Maybe we can actually use this to modernize to you know, newer, newer database technologies that have global, global span tables. [00:32:24] Speaker A: Yeah, like all I think is like this is one more reason that you know, no SQL and moving to spanner and moving to other solutions that don't have that same dependency are a good idea. And you know, distributed databases that actually are distributed versus tied to a legacy OCP model like Oracle and SQL Server. Yeah, I just I the long term ripples, you know, not Only of AI in our business, but also of now, this, this event and what happened to Amazon could have happened to anybody. I mean the only lucky side of this is that you know, Azure is not fully built out in that region. KSA has not been impacted for Google yet. But you know, like even in the KSA case, Google is only planning to do one region in ksa. And so you know, the problem is in the data sovereignty. Like where's the doctor? Well, we'll just use one of the azs as the doctor. Well, okay, two, two AZs got taken out. Now what's your story? So yeah, it's, it's definitely going to change things. Long term ramifications are here. Like it's going, it won't change it tomorrow, but as this military operation fury wraps up in four weeks to 45 years from now, you know, things will change and the world is no longer the same place it was a week ago, which is kind of scary in a lot of ways. So you know, I hug your loved ones. That's all I can say. [00:33:44] Speaker B: Yeah. And for anyone who ever heard me joke about like when the data center is a smoking crater I met from an asteroid. An asteroid, [00:33:55] Speaker C: yeah. [00:33:56] Speaker B: So not funny Matt. [00:33:58] Speaker A: That's not as funny. We actually have a story coming up in the Google section that like oh, that's just forwarding on that show title that, that blog post this week. Similar, similar problems but well, Amazon is making a $50 billion investment into OpenAI as part of its $110 billion funding round. That also includes SoftBank and Nvidia valuing OpenAI at $730 billion pre money. Separately OpenAI and AWS are expanding their existing cloud agreement by $100 billion over eight years. Man, I want to negotiate that EA which an analyst estimate could add roughly $17 billion annually to AWS revenue. A key dynamic component of the deal is OpenAI committing to consume 2 gigawatts of capacity on Amazon's Trainium chips, giving AWS a high profile validation of its in house AI silicon at a scale that helps just fire Amazon's 200 billion capital expenditure plan for 2026. I mean are we saying Anthropic's investment into Trainium was not a good enough of a high profile customer? Sort of a weird statement. AWS opening high will co create a stateful runtime environment delivered through Amazon Bedrock allowing enterprise customers to build AI agents that retain cont handle complex multi step tasks. With AWS serving as exclusive third party cloud distribution provider for OpenAI Frontier. Microsoft does retain exclusivity over stateless open AI API calls, meaning simple one and done AI requests still route through Azure. While Amazon is positioning AWS as the infrastructure layer for stateful context aware and agent based workloads where the compute intensity and revenue potential are substantially higher. They're also maintaining their existing partnership with Anthropic. And so, you know, again, this is just no one has enough GPU capacity for everybody and so the more you can spread the load around, the happier everyone's going to be. Anthropic, especially after this weekend's actions has become very popular for a lot of people and so that's definitely been a problem. Anthropic's had a lot of downtime this last week as there's just been a crush of new load coming onto their servers. [00:35:50] Speaker B: As my life revolves around cloud now, I noticed, yeah, [00:35:57] Speaker C: I feel like this is kind of like, you know, when clouds first started, you know, everyone's like we're, you know, we're signing a single cloud deal and that's kind of where we're going. And you kind of see that over time where it's like, nope, we're going to multi cloud. And these AI vendors are kind of having that same thing where you need features or you need in this case, you know, specific gpu, you know, and Trainium or you know, to these types of chips that just aren't in existence in these places. So you have to go multi cloud in order to really get it. Now the question is, you know, it sounds like most of these companies are these plotted OpenAI from what I'm reading through the, you know, tea leaves here is they're kind of sending certain types of traffic to each place and you know, as this kind of called out but you know, Anthropics kind of had the same type of thing in it where it means they're still not like we were just talking about going hot hot to both these cloud providers. There really still sounds like they're just choosing which traffic goes where. It kind of would like for them to go more true high high and just route the traffic to whichever way has capacity and available. So if you know, Azure has an issue in East 2 where a lot of their compute is or you know, or AWS does has us yellow one that goes down. You know, they can just dynamically route that traffic to the other place but it doesn't really feel like that's the way they're building out that backend system systems. [00:37:18] Speaker B: It's a big size too. 50 billion like Amazon's investment in anthropic is what, 10 billion? Like it's not, not nearly the same size. [00:37:28] Speaker A: Oh yeah, I, I, I, they might be bigger now. I think originally they started at 10 billion, but I think it, I think they have done further rounds. But I mean, I mean the big thing is I'm like $17 billion in new AWS revenue to pay for, you know, their $200 billion in capital investment. Like where does that come from in the global economy? And the only place that I think of is massive layoffs. That's all I can see. [00:37:53] Speaker B: I don't know. We've made up money for years, we'll just make up some more. [00:37:57] Speaker A: I mean you print more money than just make inflation worse and then, you know, nothing as good happens out of that. So I, yeah, again I, I am more and more convinced every day that we are in an AI bubble. Like I, I do not see how they're going to generate the revenues required to cover the capital investments that all of these cloud providers are making. It just doesn't, it doesn't pan. I mean I did some comparisons to the dot com era. You know, I normalize the capital spend between the two and I was looking at the charts and I'm like we're already spent 3 1/2 x what we spent in the dot com bubble and that was a bubble. And you know, I see the same problem. Like the issue there was we had all these companies investing a ton of capital to build out these businesses. Then people weren't using enough and they weren't making enough money. And then you look at AI today, it's heavily subsidized and so you have all of this, you know, are people going to be willing to pay? You know, is there is a goal right now, like keep the prices really low and then jack them up later when there's no one, you know, there's no other option. And now everyone's like, oh my God, all this stuff was super cheap, is now thousands of dollars and we have to go back to doing the show notes manually. I'll be really sad that day. But you know, it's like I just, I don't know, I, I, I don't get how they get 17 billion. I mean this is one, this is, this is just, this is the only one. This is open AI and they're gonna make $17 million new revenue. Like okay fine, but it's not just them. It's Azure and it's Oracle and it's Google and it's all these guys and they're all just saying the same thing. And I'm like, where's the revenue gonna come from? Like, that's what I don't understand. That's, that's the part I'm confused about. So, and now we have to re, you know, we have to build real, build brand new Patriot missiles because we just blew them all up in Iran. So the government's not buying it. It's just crazy. [00:39:42] Speaker B: Well, maybe those missiles are AI driven, so the government is buying. I don't know. [00:39:45] Speaker A: Yeah, maybe. I don't know. All right. Amazon Security Hub Extended offers full stack enterprise security with curated partner solutions this is a new plan that bundles curated third party security tools from partners like Crowdstrike, Okta Splunk, Zscaler and Proofpoint directly into Security Hub Console, covering endpoint, Ident, email network and cloud security in one place. AWS acts as the seller of record for all partner solutions, being customers get a single consolidated bill pre negotiated as pay as you go pricing and no longer term commitments, which removes the overhead of managing separate vendor contracts. All security findings from both AWS native services and part tools are normalized using the Open Cybersecurity Schema Framework or ocsf and on the aggregate into Security Hub, making cross environment threat correlation more straightforward. Enterprise support customers get unified level one support across all participating solutions, which reduces the friction of figuring out which vendor to contact when issues span multiple tools. And the extended plan is generally available now across all commercial regions where Security Hub is supported. And thank you Amazon, it's only taken you 10 years to get to this point because this is, this is cool. And this is actually, I hope all SIM vendors do this. Build partnerships with your security vendors, standardize the inputs and make connectors for those things that all connect together. And if I can do all that through my cloud vendor who I already have commitments with, I think that's fantastic. [00:41:06] Speaker B: And I don't have to make our security team spend six months writing parsing rules. [00:41:09] Speaker A: Awesome. Yeah, yeah, but just, I mean, but even like now I've taken all the friction out of doing other contracts and doing all that. It's all through my Amazon bill. Like, it's just, you know, these are the, these are the vendors that the security teams want to use and security teams love their vendors and they love their tools. So they don't want to all move to cloud native. So embrace the fact they don't want to all move to cloud native and make it easy to use the tools with the cloud. That's what I wanted and that's why I wanted for 10 years. And so Amazon has finally given this one to me, which I appreciate. [00:41:36] Speaker B: Yeah, well Security Hub is finally growing into a real tool. That too like the last few years, like you know they started with guardduty which was great but it wasn't a sim, you know. And so now this is, you know, they're getting, they're getting to that place where Security Hub has all the functionality you need for SIM and Soar. It's not quite there yet for, for Soar, but getting there. [00:41:57] Speaker A: All right, let's move. We have a couple quick hits this week. Amazon has finally announced the encryption controls are exiting the free preview on March 1st. Introducing a fixed hourly charge for non MTV PCs with a feature enabled either monitor or enforce mode. And if you were excited about this because you were like yeah, we get this thing out of the box and it's going to be amazing. Go check the price because oh my God. [00:42:16] Speaker C: And then cry. [00:42:17] Speaker A: And then cry a little bit. [00:42:19] Speaker C: Yeah, up to 22 cents I think I saw was the highest one. But like 15 to 22 cents per region per VPC that adds up real fast. [00:42:28] Speaker A: Price per non empty VPC per hour. So if you have a lot of VPCs out there, each of them costs that 22 cents. That's in the, that's in the Hong Kong region. But I mean like the main ones in the US that probably like 15 cents per hour for each of them. Each VBC that's not empty like oh my God, you think about a massive multi account project or company like this could go expensive very, very quickly. Yeah. [00:42:52] Speaker B: I mean the pricing means that this is, this is trying to get a compliance envelope encryption control met. [00:42:59] Speaker C: Right. [00:42:59] Speaker B: Like that's, that's what this is, right, where you're going to throw money at it because you kind of have to for some reason. Right. [00:43:05] Speaker A: Which I mean again it's good for finance and healthcare but man it's going to cost like so yeah, definitely. Glad it can't look, I'm glad it exists. I hope it gets a price cut at Marine event. How's that? [00:43:15] Speaker C: Yeah, I mean so if you just have one VPC and you have Dr. $225 out the door before you do [00:43:23] Speaker A: anything, but no one has just two VPCs. [00:43:26] Speaker C: No. [00:43:27] Speaker B: Well that's why like I was thinking about like the compliance bit, right? Where you've got a little island of workload that you're trying to like wrap all the, all the things around like. [00:43:36] Speaker A: Oh yeah, if you had a control, like if you had a control tier vpc. That's where you want to have the most security. Because all the stuff coming in out of the control tier from other VPCs, that's what you're thinking. I mean that makes some sense. [00:43:46] Speaker B: Yeah, either a control tier or like you, you have, you know, there's definitely ways that you sort of wrap everything up and then. And you just meet the compliance that way. So if you have like a mutual TLS or if you have certain things like that, you can, you know, kubernetes and you enable, you know, service mesh. You kind of get that as long as you're using the right encryption module. So I can see why they're sort of doing that in the same way. And a lot of companies do just throw everything on a single VPC with shared subnets. But yep, good luck with that. [00:44:14] Speaker A: Good luck. And then our final Amazon story policy and Amazon Bedrock Agent Core is now generally available, giving security and compliance teams a way to define and enforce tools across our tool access rules for AI agents without touching agent code, which is a meaningful separation of concerns for enterprise governments. The natural language to Cedar conversion is a practical feature letting non developers author policies that automatically translate to AWS open source policy language, lowering the barrier for ops and compliance teams to participate in agent governance. The Agent Core gateway acts as an inline policy enforcement point, intercepting agent tool traffic and evaluating each request before allowing or denying access, which mirrors familiar patterns of API gateway and service mesh architectures. The feature is available across 13 AWS regions at launch, including major US, European and Asia Pacific regions. Giving organizations with data residents requirements reasonable coverage from day one. Pricing details were not specified in the announcement, so I'll check that out and see what that's going to cost us. But in general, I really like this as an announcement. This is one area that I feel Vertex is a little bit behind is in particularly in security tooling around agents and really AI use cases in general. And so you know, Amazon may not have a great model like Nova that people actually want to use, but at least they have the tooling around the stuff to at least use other people's models. [00:45:29] Speaker B: Yeah, no, I mean this is my day job every day now and it's terrible because you're right, we are cheap Google Shop and it's, there's just not a lot of tooling there. You know, I like the, you know, Cedar Natural language processing, but I wonder how practical it is to write policies that allow agent to agent and tool communication like it's, it's kind of a Difficult thing to do when everything's the wild, wild west right now. Like, what policy do you write? [00:45:54] Speaker A: Yeah, I mean, I, I definitely think it's the fact that you can at least approach it that way I think is helpful because when you first get into Cedar, I remember when they first announced it that what was the old. Oh, what was the old security conference called? Reinforce. They announced the thing and they were showing us the math and you and I were like, this is cool, but I don't understand any of the math. And then the policy language came out and it was like, okay, it's not the math, but it's not easy either. And then, you know, to be able to make it easy for a compliance person to write something that's nice. So I hope that gets better. But yeah, it looks like consumption base for policy authorization requests are. I, you know, there's four zeros there, guys. I got on. That's 10,000, you know, point 25 cents per 10,000ths per authorization request. I don't know how. And 13 cents per 1,000 input tokens for the AI part of it. So, yeah, I mean, it's not terribly expensive, so that, you know, unlike the encryption service. So that's good. But yeah, definitely, we should really talk to Google about that. Like, what's their plan for Security and Vertex? Because I feel they're really behind. Yeah, it's already. [00:47:01] Speaker B: Yeah, already in the works for me because it is sort of struggling through the tooling right now has been difficult. [00:47:08] Speaker A: Yeah, you should invite me to that too, because I'm curious. [00:47:12] Speaker B: We'll do. [00:47:13] Speaker A: And then let's move on to gcp, whose title this week is not in good form. Combat API Sprawl. We have to remove those languages from everything soon. API Sprawl Using Apogee API Hub, this now integrates directly with the gateway to automatically synchronize API definitions. Open API specs and gateway configurations in near real time, giving platform teams a single control plane for APIs spread across multiple gateways and platforms. The new specification, Boost Add on, currently in public Preview, uses AI to scan API specs for gaps like missing usage examples or undefined error codes, and then generates an enhanced parallel version labeled Spec Boost Draft. With overriding the ritual, the teams can compare before adopting. The core problem being addressed is that Incomplete or undocumented APIs cause AI agents to fail at function calling or Ms. APIs entirely. So centralizing and enriching specs directly improves agent reliability in your agent workflow. Both features are available now with API gateway users seeing onboarding prompts directly in the console and pricing details for the spec boost add on are not specified yet, but coming soon. [00:48:12] Speaker C: I mean undocumented APIs is always a problem, you know, whether it's you're using it or one team uses something they don't know or a client finds what should be a dark API that is public and that always becomes a problem. So a way to centralize that and you know, kind of help address API sprawl in general is a great thing and will make people's lives so much better. [00:48:38] Speaker A: Agreed. Google's Cloud Google Cloud's polyglot storage approach for Chatbot memory combines Memory Store for Redis, Cloud, BigTable and BigQuery to handle short, mid and long term conversation history respectively, addressing a common scaling challenge for conversational AI applications. This is all for the new improved Chatbot Memory Memory server for Redis handles the hot layer with sub millisecond latency using redis lists and RP or R push commands with BigTable serving as durable midterm storage using a user ID session and reverse timestamp key pattern to enable efficient range scans across millions of simultaneous sessions and bigtables. Garbage collection policies allow teams to retain only recent data such as the last six 60 days in the high performance tier, while older data flows asynchronous to BigQuery via Pub sub and dataflow. Cloud Store handle is unstructured multimedia artifacts using URI pointer Strategy with sand URLs keeping the primary database lean while maintaining secure time. Limited access to files during or uploaded during your conversation architecture is relevant to any team building production scaleogentic applications on Vertex AI Agent builder particularly in industries like customer service, healthcare and financial services where maintaining accurate long term conversation context is a compliance or user experience requirement. [00:49:45] Speaker B: I love this this, I mean this type of thing is like this isn't a new service, this isn't new functionality of services. This is just reference architecture that people can use and and copy for a relatively new pattern of of application. [00:49:59] Speaker A: Right. [00:49:59] Speaker B: So a lot of people haven't really solved this at scale yet with how do you, how do you maintain chat chat history, memory and do all those things? So I love that they publish this and and how to do it and it's something that if I ever you know, get into the the chat, you know, app writing phase of my life going to copy. So it's fantastic. Agreed. I suppose it will have to make money though because it's using BigTable and BigTable's expensive. [00:50:27] Speaker A: Yeah, I mean like they're just, they're just driving money and other tools. That's all that's happening. But I mean by being able to have conversational memory is such a huge advantage for the agentic agent and even the ability to have it be able to reference back chat history to be able to query that. I mean there's so many benefits. Like it's, it's super annoying when you go chat bot and you're like well remember yesterday we talked about this thing and it's like I don't know what you're talking about. And it's, that's one of the reasons why I really love Claude again because it has a pretty robust memory infrastructure to you know, it like I'll referencing like hey, I'm working on this thing. He's like oh well are you using that for the cloud pod or using that for your other project? Or are you using it? Because depending on which of those contacts you want me to think about this in, I have different answers and I'm like oh well actually now I'm intrigued. What are the answer for all three of those contacts? And it gives you like different insights and intel and like oh, I had, I hadn't even thought about that. There was something I was talking to about, I don't remember it was, but it was something with my son and you know, I was asking him some advice on something and then like later on I was asking some other questions. This is about your son? Like okay, that's a little creepy. No, it's not. In fact it's about me. But you know, we are related so things happen. But you know, it's kind of interesting. Spanner is getting a columnar engine in preview, adding columnar storage alongside traditional row based storage to enable analytical query acceleration up to 200x on live operational data without impacting transactional workloads. If you know anything about columnar databases, you don't know how cool that is. But for those of you who are in the data lake space, you know columnar is a pretty big speed improvement for a lot of use cases when you're doing queries on different fields in the database that don't have necessarily have indexes. And so there you can take a query that runs on traditional OTP table that takes 30, 40 minutes, move it to a columnar setup where you want to query all the red houses for example and you know, get that back in seconds where before we had to do a table scan and it'd be a big pain in that. So columnar is great. I'm actually I didn't know Spanner didn't already support Columnar, so this is a bit of a news to me. [00:52:21] Speaker B: Yeah, me too. Me. [00:52:23] Speaker A: But you know, I am glad to see this now arrive as well. It's a great feature future, so I appreciate this one. [00:52:28] Speaker B: Yeah, I haven't really done deep enough into querying, you know, these stores to really know the differences in performance. I've been involved in lots of cost conversations and technologies. You know, based off of its ability to do this. It can get expensive. [00:52:43] Speaker A: It can get expensive. But again, for the people who use this, typically a data warehousing team, where it's already expensive. [00:52:49] Speaker B: Yeah, it's true. [00:52:53] Speaker A: All right, let's move on to Matt's Azure. [00:52:56] Speaker B: He owns the whole thing now. [00:52:57] Speaker C: Wow. I get all of Azure. Thank you, Justin. [00:53:01] Speaker A: You're welcome. [00:53:02] Speaker C: Thank you so much for that. [00:53:03] Speaker A: You're going to be on the hook for that. You know, hundreds of billions of dollars in capital they're spending this year, so just remember that. [00:53:10] Speaker C: Can I pocket it first before and then just not pay the bill at the end? [00:53:15] Speaker A: You know, you need a Scrooge McDuck room to go like swim in the cash that they're my golden coins. Yeah. All right. Azure Announcing new public preview capabilities in Azure Monitor Pipeline, which is a service that I don't actually know what it monitors, but let's assume it's a pipeline because again, naming for Azure is hard. It now supports TTLs and mutual TLS for TCP based ingestion endpoints and public Preview, allowing teams to encrypt data in transit and enforce neutral authentication without relying on external proxies or custom gateways. This is relevant for regulated industries and edge deployments where plain TCP injection no longer meets security requirements. The new execution placement configuration gives Kubernetes users direct control over how pipeline instances are scheduled across nodes, addressing practical problems like port exhaustion, multi tenant isolation and available zone distributions. Data transformations allow teams to filter, aggregate and normalize cemetery before it reaches Azure Monitor, including converting raw syslog or CEPH messages into standardized schemas using KSQL templates. All three capabilities are in public preview today and target organizations running Azure Monitor Pipeline in an on premise infrastructure. Edge locations and large Kubernetes clusters. Yeah, I just don't know what this is exactly. It feels like service mesh for certain services, but it's like a bridge to on prem though. Like that's where I'm like, huh? [00:54:28] Speaker C: It's their ETL pipeline service. So it's having mtls on your ETL Pipeline to essentially be able to say for this customer, for all of our stuff, it is encrypted and transient the entire way and we know it is because we're using mtls and it's encrypted with our certificate. So because it is their pipeline, their ETL service that's kind of why this is a big deal. But do you need mutual TLS for that? Typically like that's a. It's a checkbox to make sense of person. Happy. Annoying. Look at yourself. Tell me about it. [00:55:05] Speaker B: Oh I'm not. I'm not in the risk to side of the business. Careful now [00:55:13] Speaker A: them are fighting words. I don't know. [00:55:14] Speaker C: Yeah he's like I don't care about risk. I just want all the tools. [00:55:20] Speaker B: See now you got me. Yeah yeah Security engine. [00:55:23] Speaker C: I want to spend all the money and don't care about why we're spending it. Just want to spend the money. [00:55:27] Speaker A: We're good checking boxes. That's why. That's why you spend the money. Microsoft is rightfully timing expanding their sovereign cloud offering with three new capabilities targeting organizational need to operate in fully disconnected environments. The Azure Local Disconnected operations Microsoft 365 Local disconnected and large model support in Foundry Local these are aimed at government, defense and regulated industries where external connectivity may be intentionally restricted or prohibitive like potentially in a war zone in Iran. Azure Local operations allow organizations to run infrastructure with Azure governance and policy controls that any cloud connectivity meaning management and workload execution stay entirely within customer operated environments. This is now generally available worldwide. Perfect timing. The pricing is not publicly listed. I'm saying expensive depending on the hardware and licensing configuration you plan to use. Microsoft 365 Local Disconnected brings Exchange Server, SharePoint Server and Skype for Business Server into the sovereign private cloud boundary. [00:56:18] Speaker C: Oh, good question. I thought Skype was dead. [00:56:22] Speaker A: I thought so too so I. [00:56:24] Speaker B: But this is Skype for business. [00:56:26] Speaker A: It's completely separate. [00:56:27] Speaker C: Well yeah, I know when you see the team's card something like if you. If it doesn't load right like my laptop it actually will say like Skype in the URL still. [00:56:35] Speaker A: Yeah, I mean teams is really Skype. [00:56:37] Speaker C: So like so I know it's not dead but like I thought they were trying to kill it off completely. Maybe I'm wrong still. [00:56:43] Speaker B: And you'd think this is a name right? Just rename it. But no, no, no. [00:56:49] Speaker A: So. So this. Wow. This is. This is a journey. I just looked this up. Skype for business Service formerly Microsoft Office Communication Server which I never heard of and Microsoft Link server which I have heard of. [00:57:00] Speaker B: I have heard of Link. [00:57:01] Speaker A: Oh wow, I've heard of that. That's a throwback real time communication service offer that provide the issue for enterprise instant messaging presence and voip ad hoc and structured conferences and PSD activity through a third party gateway or SIP trunk. It does say this was released in 2019. The last stable release was released in August of 2025. So I mean it, it's definitely still being actively developed. So yeah, I thought they killed this guy brand but apparently not. [00:57:26] Speaker C: Oh, it's their subscription edition. [00:57:29] Speaker A: The article here the Foundry Local now supports large multimodal AI models running on premise using Nvidia GPU infrastructure, enabling local inferencing entirely within customer controlled data boundaries. The overall architecture designed to span connected hybrid and fully disconnected modes under a consistent governance model, which reduces the operational complexity of managing separate tool sets for different connectivity scenarios. Again released before things went down, but good timing. Microsoft's. [00:57:55] Speaker B: Yeah, definitely. I mean this is, this will save people from having to run like full data centers right in regions. [00:58:01] Speaker A: Yeah, I mean or you know, it's an option if you now all of a sudden find yourself in Europe and you don't trust a US entity, which you know, is a thing that's going to happen more and more. So [00:58:17] Speaker B: we're all going to need a drink after this. Our listeners are going to need two. [00:58:22] Speaker A: I mean it was like when Safe harbor fell, I was like that's really a bummer because that's going to be a pain. And then, you know, all the sovereignty stuff has now come out of Sly harbor. And then you're like, okay, but like it's not really that real. And like now it's like no, no, it's real. Real. [00:58:34] Speaker C: No, it's real. We just were, you know, going lower. [00:58:37] Speaker A: We were in denial. Yeah, you know, totally. So then the dumbest feature of the week in my opinion is a article [00:58:44] Speaker C: I found subject other. Okay, that's headlined of the week. [00:58:49] Speaker A: The headline is Using self signed certificates with Java on Azure functions for Linux is now something you can do with its best practice. Java developers on Azure functions who connect to services secure by self signed certificates frequently encounter SSL handshake errors because the JVM only trusts well known certificate authorities by default. Recommended fix is of course creating a custom trust store in the persistent home directory and pointing the JVM to it via Java Ops application settings. The core reason to use home slash home for the trust or rather than system JVM directories that the Linux functions file system Is ephemeral, meaning any changes outside slash home are wiped on restart. Oh, thanks Microsoft. One practical deployment gotcha worth noting is that zip deploy or run from package configurations can overwrite the WW root contents. And so this is just a way for you to troubleshoot certificates even worse than you're troubleshooting it before. And the fact that this is something they had to write a best practice for. Sort of interesting to me as well. This is hilarious. [00:59:41] Speaker C: Best practices for self sighting. [00:59:43] Speaker B: Self side certs. That's chef's kiss. [00:59:45] Speaker C: Yeah, for like bypassing security. It's like. Wait, what? Yeah, that's the whole reason this article's in here. [00:59:52] Speaker B: Yeah, I mean it's funny because I never. I don't create Java anything and so I only have to support it occasionally, but I've never had to support it on an ephemeral workload. Like yeah, what a pain in the ass. [01:00:04] Speaker C: I mean I've done. I've helped people with lambda Java functions [01:00:09] Speaker B: before and I imagine this is something if they're not. If you're not using a public, you know. URL. [01:00:16] Speaker A: Well, I mean the beauty is in Amazon you probably are using cert ACM and so this isn't a problem. But does Azure have an acm? [01:00:24] Speaker C: No, that's the problem. Just don't stab me more. [01:00:27] Speaker B: Well, you still gotta. I mean you. You have to be using a public domain though. [01:00:33] Speaker C: Correct. [01:00:34] Speaker B: So if you're not using. If it's internal traffic and spam for and you don't have a routable domain, it's still screwed. Even using ACM. [01:00:42] Speaker C: Well, ACM has private key. Private ACM. [01:00:45] Speaker B: I know, but the CAs don't know that. [01:00:49] Speaker C: Touche. Touche. Yeah, that would be the problem there. [01:00:54] Speaker A: A couple other quick hits for Azure this week they've announced a whole new series of Alphabet soup. The DCESV6, DCEDSV6, ECESV6 and ECEDSV6. [01:01:06] Speaker C: Excuse me. [01:01:07] Speaker A: Which are. Thank you. Azure is. These are all intel TDX Confidential VMs. They're generally available using the 5th gen Intel Xeon processors provide hardware enforced isolation that protects data while in use addresses a long standing barrier for organizations running sensitive for floats in the cloud. So thanks I guess. Again, I don't know, I still don't understand their naming convention. I say it every time they announce a new instance. I don't get it. [01:01:29] Speaker B: I've stopped understanding any of them though. Like yeah, Amazon and Google are the same. [01:01:34] Speaker A: Like I've. [01:01:35] Speaker C: It's too. [01:01:35] Speaker B: Too many now. [01:01:36] Speaker A: Yeah, I mean like at least the these, the main ones I get like the G versus the I versus the A. Those are pretty clear to me on Amazon and I'm pretty clear on the N is for the high end networking. But then like you, you throw the D in there and I get confused with what the D. It's something with the storage, but I don't really know [01:01:52] Speaker C: to use the storage. It actually they have a formula for it. [01:01:58] Speaker A: Is there a formula for the Azure ones? Because that's what I need. Like the. I need the key. [01:02:02] Speaker C: There's a formula. It's like family, sub family, number of CPUs constraints, additive features. That's like storage or anything along those lines. Accelerator type, memory capacity and inversion. So it's like C is for confidential. So the DC is for confidential. The EC is for confidential. So like that's that first part. And then I have to have multiple screens because I definitely did not memorize Azure's naming convention. If I did, God, I hate myself a little bit. [01:02:38] Speaker B: And this is why AI is going to take over the world because like I, my context window's not big enough to keep this together. [01:02:45] Speaker A: Right. [01:02:45] Speaker B: And so like even though you're going to go explain it to us all like that information is going to be gone in a second because it's not going to stick. [01:02:52] Speaker C: Yeah. The middle ease encrypted S is for premium SSD storage. And then V6. So I knew the V6 part. [01:03:01] Speaker B: That's about it. You mean the only intuitive part? Yeah. [01:03:08] Speaker A: I realized the dc DC sounds for. [01:03:11] Speaker C: It's for. Yeah. [01:03:13] Speaker B: The confidential. [01:03:14] Speaker A: Like okay. [01:03:15] Speaker C: Huh. Yeah. Then you have the list of all the families. The one I know is B is burstable, which I like a lot better than T series because T, I still don't know what. [01:03:25] Speaker B: T never made sense either. [01:03:26] Speaker C: Yeah, yeah. I, I assume there was a reason somewhere online. I never figured that out. But like B for burstable, I got that one. You know, I think it's C for compute and D for general. So that's like what the first letters I think always mean. And E is memory optimized because E is for something. [01:03:46] Speaker B: There was one of one of Amazon's. Didn't Amazon have a weird one like M wasn't the memory optimized one? [01:03:53] Speaker C: No, M is mixed where R was RAM and C is compute. [01:03:58] Speaker B: And that's where I would get confused. [01:04:00] Speaker A: So yeah, and they get weird ones like the X and you get the HPC ones and yeah, there's a bunch [01:04:07] Speaker B: of weird ones, but just the normal ones would throw me off just because Yeah, I would say M is the memory, but you're right, it's, that's the common workload or the generalized workload pretty much. [01:04:15] Speaker A: If you know, MC&R pretty much. Majority of people are doing GPUs and LLMs, that's the ones they need. And then there's, there's a bunch of other ones after that, but. [01:04:24] Speaker B: And it's N and GCP for some reason that I've never cared about. [01:04:30] Speaker A: That's the general compute for them. But they also, they also have M and they also have a couple other letters too. [01:04:35] Speaker B: Yeah, yeah, no, they have, they have a few. I mean they have the same thing. They're just different. Slightly different. And because they're only slightly different, they won't ever stick in my head because it's full of Amazon. [01:04:44] Speaker A: I wonder, you had to wonder, like if you all could redo your naming convention for instances, what would you change it to? Like, what's your, what's your ideal naming convention? And I, it's kind of like, you know, I, I evolved my server naming convention over the years. You know, like when I first got into it, I named all my computers after Simpsons characters. Like, that's cute, but dumb. And then you know, you're like, okay, I need a naming convention. I need like, you know, things like Rack location and you know, what's, what is the purpose of the server and so you evolve it over time. I, but like, you know, Amazon, you know, they designed that thing being a cloud and they had M1 instance right now it's that forever. Like would they go back and change it? And like when do you divorce yourself from this thing that's clearly no longer working for anybody on the Azure side? [01:05:24] Speaker B: And well, but that supposes that there is one that's going to work. [01:05:27] Speaker A: And I don't know that there is. I don't know there is one either. But I, you know, it's definitely, you know, let's use natural language and you just describe what you want, we'll just get it for you. That's how we're going to switch. Solve this problem. We talked about it in preview, but now it's generally available. Azure firewall policy now supports a two phase draft and deploy workflow, meaning teams can stage policy changes before committing them, which reduces the risk of unintended disruptions during updates. Congratulations, Ga. If you'd like to burn money, Amazon or sorry, Azure has a new SKU for you this week. Supports up to 100 terabytes of storage at 2 1/2 x. Increased from the previous 40 terabyte cap with no configuration changes required for existing registries to benefit automation. And of course the name is the Azure container registry premium SKU. [01:06:08] Speaker C: So no, the SKUs are there. They just bumped it from 40 to 100 terabytes. [01:06:13] Speaker A: Oh, oh, I see. I'm sorry. [01:06:14] Speaker B: So now instead of two Windows container images, you can store four. [01:06:18] Speaker A: Ooh, that's so fancy. Apparently it's addressed a real operational pain where enterprises were splitting workloads across multiple registries just to stay under limits because they're using Windows containers. Because that's just dumb. Adding complexity to access controls and networking that had nothing to do with actual business business requirements. So you. But why 100 terabytes? I mean again, like this is, this is something like why wouldn't you just back whatever your container history solution is with your Blob storage? [01:06:43] Speaker C: Because you're doing it wrong. [01:06:44] Speaker B: Oh, this is just charging your customers. That's all this is. It is, it is absolutely backed by the Blob store. [01:06:49] Speaker A: But I guess I just understand like why you wouldn't just charge for usage and then the. I understand the 100. I understand the artificial 100 terabit limitation. That's what I don't understand. [01:07:00] Speaker C: Because you have to put a limit somewhere otherwise someone stupid is going to be. I can't launch my container. That's 500. 500 terabytes. [01:07:07] Speaker B: Oh no. There's an architecture reason somewhere in the back end where they're, they're stringing two cans together and the minute you string enough cans together, you have to string another can to another place. It doesn't work out. [01:07:17] Speaker A: I mean if you have a hundred terabit container, I can't even have a mess. But again, I said this is the container registry. This is where I make a point is like you. Yeah, like this is where I struggle with it. I'm like, again, yes, you should have a per container storage limit and I'm sure there is 1 in AKS. I just know what it is. [01:07:34] Speaker B: Yeah, well, no, this is their tenant management. [01:07:36] Speaker A: Right? [01:07:37] Speaker B: Because this is the only reason you have a limit at the tenant limit is because you've, you've organized something about the way you're managing the back end into groups and something about that prevents scaling. [01:07:48] Speaker C: Right? [01:07:49] Speaker A: Yeah, it doesn't. [01:07:49] Speaker B: It's not the storage layer, you know, that's been solved many times. But their abstraction layer they're using for managing Azure tenants is thinking like, why [01:07:57] Speaker A: would, if you're going to go fix that, why would you go from 40 terabits to 100 terabytes. I. [01:08:01] Speaker C: Because it's. [01:08:01] Speaker B: Because it's easier, Justin. [01:08:03] Speaker C: That's why. [01:08:03] Speaker B: Otherwise, it's rewriting the whole thing. You know, it is like this. [01:08:06] Speaker A: This is your day job. [01:08:07] Speaker B: You sit in these meetings. [01:08:09] Speaker A: But I would argue, and if I was in the meeting, I argue like, this is like, we need to be able to bridge a hundred multiple. Like, you're basically saying customers are working around you by provisioning multiple of this thing. So. Okay, cool. So we could do the same thing. We can provision multiple of the 100 terabit thing or the 40 terabit thing, whatever, depending on how much contiguous space I need. But then, like, if I can abstract the complexity of that and then make it just look like it's unlimited to the user, but in the background, I'm blanking into 100 terabyte segments. Fine, I don't care. That's my point. It's like, I would abstract the complexity so your. [01:08:40] Speaker B: Your development team can rewrite the logic, or you could just make your infrastructure team lay bigger pipes. Which one will you choose? [01:08:48] Speaker A: I'm fixing that. Every time [01:08:53] Speaker C: he's like, I plead the fifth. [01:08:54] Speaker A: I mean, it's not what we would [01:08:55] Speaker B: choose, but it's the. It's the. [01:08:57] Speaker C: The. [01:08:57] Speaker A: The. [01:08:58] Speaker B: The good fight that we fight every day. [01:08:59] Speaker A: Yeah, that's true. Right, so, and then our final story for this week is Azure API Management is rolling out updated resource limits starting in March, aligning classic tier limits with the V2 tier limits across entities like API operations, tags, products, and subscriptions. And I hope when they say they're aligning them, that immediately they're lowering the V2 ones to the classic tier. I assume they figured out whether the tech debt was preventing the classic tier from scaling, and now they meet the V2 reason, which makes the whole point of why do you have V2 silly, but here we are. So there you go. [01:09:29] Speaker C: Because moving is hard unless V2 is [01:09:32] Speaker B: for something else as well as limits, [01:09:33] Speaker A: but it probably is. I. I don't know. [01:09:35] Speaker C: Yeah, it's. APIMS v1, v2 are all nightmares, so. But it's nice that they're kind of moving stuff in a more streamlined, you know, level. So migration and things like that should be a little bit easier over time. [01:09:51] Speaker A: That's a plus. Well, gentlemen, we have woken to our new national nightmare of Dr. Is real and anthropic is no longer allowed for the federal government. We don't know what that means yet, and we're gonna find all these things out and. [01:10:04] Speaker B: And what did OpenAI do to get that contract? [01:10:08] Speaker A: No, no. See, what happened was they sold. They. They apparently kissed, you know, the president's ass and treated him like the, you know, the authoritarian that he is. And the anthropic guy wouldn't do that. Yeah, I mean, that's what I heard. [01:10:23] Speaker B: Yeah. I mean, I'm. That's just how life is these days, so. [01:10:27] Speaker A: Yeah, indeed. Well, we will see you next week here in the Cloud, hopefully. And have a good one. Bye, everybody. [01:10:36] Speaker C: See ya. [01:10:39] Speaker B: And that's all for this week in Cloud. Head over to our [email protected] where you can subscribe to our newsletter, join our Slack community, send us your feedback, and ask any questions you might have. Thanks for listening, and we'll catch you on the next episode.

Show Notes

Titles we almost went with this week

AI Is Going Great – Or How ML Makes Money

AWS

GCP

Azure

Closing

Chapters

Episode Transcript

Other Episodes

Episode 257

257: Who Let the LLamas Out? Bleat Bleat

Episode

Episode 11 – Screaming in the last week of the cloud pod

Episode 136

136: Take us to your Google Cloud Digital Leader

345: Damn It… my excuse is now gone for Disaster Recovery

Show Notes

Titles we almost went with this week

AI Is Going Great – Or How ML Makes Money

AWS

GCP

Azure

Closing

Chapters

Episode Transcript

Other Episodes

Episode 257

257: Who Let the LLamas Out? *Bleat Bleat*

Episode

Episode 11 – Screaming in the last week of the cloud pod

Episode 136

136: Take us to your Google Cloud Digital Leader

257: Who Let the LLamas Out? Bleat Bleat