352: Google Next: Rebrandapalooza

Episode 352 May 05, 2026 01:45:30
The Cloud Pod | Weekly AI & Cloud News on AWS, Azure & GCP

Hosted By

Jonathan Baker, Justin Brodley, Matthew Kohn, Ryan Lucas

Show Notes

Welcome to episode 352 of The Cloud Pod, where the weather is always cloudy! Justin, Matt, and Ryan are safely back from Vegas (Ryan and Justin, anyway), and they have all the news and announcements from Google Next. Plus, we have Ryan’s take on Phish, news from Cloudflare, and a shoe company making a pivot. There’s a lot to cover, so let’s get started!

Titles we almost went with this week

 

A big thanks to this week’s sponsors:

There are a lot of cloud cost management tools out there, but only Archera provides insured commitments. It sounds fancy, but it’s really simple. Archera gives you the cost savings of a 1 or 3-year AWS Savings Plan with a commitment as short as 30 days. If you do not use all the cloud resources you have committed to, Archera will literally cover the difference. Other cost management tools may say they offer “insured commitments”, but remember to ask: Will you actually give me my rebate? Because Archera will. 

Check out thecloudpod.net/archera to schedule a demo today. 

We also wanted to tell you about something coming to the US for the first time — WeAreDevelopers World Congress! 

They’ve been doing this in Europe for years, 15,000-plus attendees in Berlin, it’s one of the biggest developer events over there. Coté from Software Defined Talk is actually speaking at their Berlin event this summer, so we’ve got some firsthand context here. In September, they’re launching the North America edition. San José, September 23 to 25. 500-plus speakers, 18 tracks — cloud, infrastructure, DevOps, security, AI, data engineering, all of it. Speakers from Datadog, Honeycomb, Sentry, Google, LinkedIn, and Stack Overflow. Olivier Pomel, Christine Yen, Milin Desai, Kelsey Hightower – plus workshops and masterclasses, not just talks. These are people who know how to do a developer conference at scale. wearedevelopers.us, code DEVPOD26 for 15% off. Group rates on top of that for 4 or more.

General News 

06:12 Amazon invests up to $25 billion in Anthropic as part of AI infrastructure

08:46 Justin – “The big question is going to be when one of these companies – OpenAI or Anthropic – finally goes public, and they start publishing these things; what people’s actual reaction is to their financials.” 

10:48 SpaceX Strikes $60 Billion Deal for Right to Buy Coding Startup Cursor

12:27 Justin – “The thing I don’t get is the $10 billion partnership versus the $60 billion acquisition. What’s the triggering events on those things? When is it a partnership, versus when is it now an acquisition? And does that mean that these people who are working at Cursor – if it’s a partnership, aren’t getting equity? That’s a bummer.” 

AI Is Going Great – Or How ML Makes Money 

13:36 The next evolution of the Agents SDK 

13:59 Ryan – “As long as it also logs and has permissions and some sort of boundaries, I don’t have to kill it. It’s just terrifying because we already have people that are just throwing questions into any chat tool, and just then running whatever command it spits out indiscriminately. And now that’s just going to happen at a faster rate.”

19:56 Introducing Claude Opus 4.7

21:50 Ryan – “I didn’t realize it’s the same price because every platform that I’m using this in, Opus 4.7 is so much more expensive than 4.6.”

28:14 Introducing Claude Design by Anthropic Labs 

30:56 Building the agentic cloud: everything we launched during Agents Week 2026

31:40 Justin – “I look forward to Cloudflare taking down Cloudflare, and then writing an RCA with these great tools.” 

32:14 Artifacts: versioned storage that speaks Git

32:54 Justin – “…another way it’s going to take down Cloudflare, so I look forward to that.” 

34:53 Cortex Agents: The Platform Powering Snowflake Intelligence and Enterprise AI Agents

35:06 Justin – “If you need your agents close to your data, this is a great way to do it. I definitely would look into cost with this one, because Snowflake is not cheap.” 

36:11 Introducing GPT-5.5

37:31 Ryan – “That’s kind of cool. That’s the first I’m hearing of those kind of frameworks for their testing, and testing the safety AI aspects and having a rating, which I like.”   

37:58 Introducing workspace agents in ChatGPT

38:42 Introducing OpenAI Privacy Filter 

38:52 Justin – “If you’re looking for a lightweight built-in option inside of Codex to find privacy PII, this little model sits on top of it and does great work.”

40:33 Introducing ChatGPT Images 2.0 

41:29 Matt – “I like that it can do multiple at the same time. That’s a nice feature.” 

Cloud Tools 

42:19 Register domains wherever you build: Cloudflare Registrar API now in beta

AWS

43:29 AWS Interconnect is now generally available, with a new option to simplify last-mile connectivity

44:29 Justin – “Good to see it in GA; hopefully it gets expanded out pretty quickly.” 

44:43 Amazon Quick for marketing: From scattered data to strategic action

48:12 Amazon CloudWatch now supports cross-region telemetry auditing and enablement rules

47:58 Ryan – “…this has always been a challenge, even before I was doing security and trying to do log governance across these things, trying to have different serving farms basically in multiple regions and having to log into different web pages to view the metrics on each one. They sort of fixed that with the ability to reference metrics in a foreign site a little while ago, but you could only do it for metrics. And so this is definitely something I’m glad to see that you can use.”

50:15 Introducing granular cost attribution for Amazon Bedrock

52:12 AWS Lambda functions can now mount Amazon S3 buckets as file systems with S3 Files

52:19 Justin – “Thanks. Could have announced that last week.” 

53:41 From developer desks to the whole organization: Running Claude Cowork in Amazon Bedrock

56:51 Justin – “The problem is that instead of building a proper enterprise backend that would do all the things they want, they partnered with Work OS. And so while Work OS has a bunch of things, it doesn’t have all the things that you would want, and this is a problem also for OpenAI, as well, because they also partner the same way. And Snowflake partners with them. But some have done a better job than others in how they lay out some of these tools.”

57:54 Get to your first working agent in minutes: Announcing new features in Amazon Bedrock AgentCore

58:49 Ryan – “This is a great feature; this now makes it competitive with Vertex AI’s AgentBuilder, and so now it’s a usable option on Amazon. Awesome.” 

GCP

 Pre-Next Announcements

59:40 Gemini 3.1 Flash TTS: New text-to-speech AI model

1:00:01 The Gemini App is now available on macOS

1:01:33 Create Expert Content: Deploying a Multi-Agent System with Terraform and Cloud Run

1:02:05 Google Next: the Conference

Justin: 

Ryan

Matt

This is genuinely the best we’ve ever done. Time to go buy a lotto ticket and lose. 

Runner Ups

How many times is AI said on stage? 

JUST THE FIRST KEYNOTE WAS 132 Times!! 

2nd Keynote: 55 Times

Matt – 99

Ryan – 75

Justin – 115 Winner

That makes Justin the overall winner for this year’s NEXT predictions. 

Here’s our Claude-based tier ranking of the 260 announcements:

1:09:21 TIER S — Headline

Agent platform (Vertex AI evolves)

1:10:27 Customer scale — agents actually in production

1:11:44 TPU 8t and 8i (the silicon split)

1:12:22 TIER A — Strong second tier 

Wiz expands (multi-cloud agent visibility)

1:13:46 Partner fund — $750M + Forward-Deployed Engineers

1:14:48 Antigravity + Data Agent Kit + Gemini 3.1 Pro

1:15:25 Agentic Data Cloud — Knowledge Catalog + Cross-cloud Lakehouse + Spanner Omni

1:17:17 TIER B — Solid block 

Workspace AI — Workspace Intelligence + Studio

1:17:53 Cloud Run grew up

Gemini Enterprise for CX

1:19:04 BigQuery AI 

1:19:32 TIER C — Lightning round 

Virgo Network

1:20:05 Rapid storage

1:20:54 Axion expands

1:21:54 Fraud Defense

1:21:50 Post-quantum crypto

1:22:00 GKE upgrades

1:22:33 Three themes that emerged

1:23:00 Conspicuously absent

1:23:24 Less important stuff

Google Cloud Next 2026 Wrap Up

Next ‘26 day 1 recap

Next ’26 day 2 recap

Partner-built agents available in Gemini Enterprise 

Level Up Your Agents: Announcing Google’s Official Skills Repository

Introducing Gemini Enterprise Agent Platform 

Gemini Cloud Assist at Next ‘26 

Unify analytical and operational data for AI 

Introducing the Google Cloud Knowledge Catalog

The future of data lakehouse for the agentic era 

What’s New in the Agentic Data Cloud 

Next 26 storage announcements 

Introducing Virgo Network megascale data center fabric 

What’s new for Google Cloud databases at Next’26 

Introducing Spanner Omni

TPU 8t and TPU 8i technical deep dive

Introducing Spend Caps AI Cost Visibility Next ’26

Next ‘26: Redefining security for the AI era with Google Cloud and Wiz 

Looker updates for agentic BI at Next ‘26

Next ‘26: Announcing new partner-supported workflows for Google Security Operations 

The new Gemini Enterprise: one platform for agent development 

What’s new in Gemini Enterprise 

Introducing Google Cloud Fraud Defense, the next evolution of reCAPTCHA

What’s new for Cloud Run at Next ‘26 

What’s new in GKE at Next 26

AI infrastructure at Next ‘26

Azure

1:30:36 Optimize object storage costs automatically with smart tier—now generally available

1:31:21 Justin – “Thanks, you finally got what Amazon’s had for a while.” 

1:38:37 What’s new in Microsoft Entra – March 2026 

1:34:17 New in Azure SRE Agent: Log Analytics and Application Insights Connectors

1:35:14 Justin – “So they REALLY want you to burn tokens.” 

1:35:41 Azure Key Vault HSM Platform One Retirement: What Purview BYOK Customers Need to Know 

1:36:19 Matt – “The thing is, Microsoft does give you a decent amount of time to do stuff, but what’s always fun is if you buy a three-year reservation you’re stuck with it, and you have to deal with returning it right now, because otherwise you’d have negative time…”

After Show

1:38:11 Allbirds shares soar 580% after pivot from shoes to AI

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod


Episode Transcript

[00:00:06] Speaker B: Welcome to The Cloud Pod, where the forecast is always cloudy. We talk weekly about all things AWS, GCP, and Azure. [00:00:14] Speaker A: We are your hosts, Justin, Jonathan, Ryan and Matthew. [00:00:18] Speaker B: Before we get into this week's news, we want to take a minute to tell you about WeAreDevelopers World Congress, which is finally making its way to North America this September. If you've spent any time in the European tech scene, you probably know the team behind it. They've been running World Congress in Berlin for over a decade, and it's a big deal over there, pulling in more than 15,000 developers every year. Our friend Coté from Software Defined Talk is actually speaking at the Berlin event this July. And from what we've seen, these are the people who know how to put on a good developer conference. This September 23rd through 25th, they're bringing it stateside to San Jose. Organizers are expecting more than 10,000 developers with over 500 speakers across 18 different content tracks covering the entire stack, including cloud, DevOps, AI, security, software architecture, data engineering, front end, and developer experience. If you've got a team, everyone's going to find a full schedule. It's not just sit-and-listen sessions. There are keynotes, workshops, masterclasses, and hands-on labs, the kind of stuff you can take back home and work on on Monday. There's an impressive list of speakers, including names from Datadog, Honeycomb, Sentry, Google, LinkedIn, Stack Overflow, Netflix, Microsoft, and Stripe, plus Kelsey Hightower, Olivier Pomel, Christine Yen, Scott Hanselman, and Angie Jones. Head over to wearedevelopers.us to grab your ticket and use code DEVPOD26 for 15% off. That stacks with their group rates if you're bringing four or more people, and honestly, at that price, you should probably bring the whole team. 
[00:01:51] Speaker C: Episode 352, recorded April 28, 2026: Google Next: Rebrandapalooza. Good evening, Ryan and Matt. How are you guys doing? Hello. [00:02:01] Speaker A: Doing good. [00:02:02] Speaker D: Good. Welcome back from Vegas. [00:02:04] Speaker C: I mean, Ryan and I survived it. That was the most important part of Vegas. [00:02:08] Speaker A: Barely, barely. [00:02:10] Speaker D: We survived, and we did not succeed in recording; there were some medical emergencies. So I'll take that as the excuse. [00:02:16] Speaker C: That's a completely valid excuse. We understand, and we appreciate your attempt. I mean, the most important part about Vegas for me was that it's over, because I gotta go home. But Ryan got to do his first Sphere experience, so I'm putting him on the spot. Yeah, he got to the Sphere. It was not the Eagles, as much as I've tried to get him to go to the Eagles at the Sphere. [00:02:35] Speaker A: It does not. [00:02:36] Speaker C: But I made the statement earlier this week that we should go see the Eagles at the Sphere, because the Sphere might actually redeem the Eagles for you. [00:02:44] Speaker A: Zero chance. [00:02:46] Speaker C: The visuals could be awesome. I don't know. I'm just saying there's a chance, because I've been to a couple concerts now at the Sphere and it's amazing. And so this is your first experience? [00:02:52] Speaker A: This is my first time going to the Sphere at all. [00:02:54] Speaker C: You should share how you feel about it. [00:02:56] Speaker A: Yeah, so I went and saw Phish with a buddy of mine at the Sphere, and I gotta say, I'm not a huge Phish fan, just a passing one, but that is the greatest meld of two things: a jam band with those visuals and the sound. And the sound for that venue is amazing. [00:03:17] Speaker C: Oh yeah. When you read the details behind that building, like how many speakers and how many directional speakers there are at your seats, it's a crazy impressive venue. 
[00:03:26] Speaker A: I mean, it's really, really impressive. And none of it is like a typical concert where there's this huge array of things; it's all behind basically this scrim that they use for all kinds of really cool visuals. And it really changes the way that you do concert lighting, which I thought was really neat, because they had sort of faux concert lighting, almost cartoon lighting, at certain parts of the show. But it really changed the lighting in the venue, in terms of looking around at the crowd and stuff. It's a lot brighter. It's a very different concert-going experience than I've ever had. I think it's really wild. Really cool. I look forward to going to other shows. [00:04:07] Speaker C: Yeah, I've seen, you know, of course there's a movie, like a Planet Earth-type movie where they show you nature, which was cool in there, although really big animals. And then I've seen, of course, The Wizard of Oz there. I saw the preview at Google Next last year, then I saw the actual movie there as well, which is kind of like a 4D experience where they add wind and all kinds of stuff to it. It's pretty neat. And then I've seen a couple of concerts there now. I saw the Backstreet Boys, and I'm not a Backstreet Boys fan, but my wife is, and so we went and it was great. I had an amazing concert experience. It was awesome. And so I'm definitely looking forward to Metallica, who's supposedly coming. [00:04:45] Speaker A: Yep. [00:04:47] Speaker C: I think it's official now. [00:04:48] Speaker A: Yeah, it's official, October through May or something like that. It's a huge residency. [00:04:52] Speaker C: Oh, wow. 
[00:04:52] Speaker D: It's a long time. [00:04:53] Speaker A: Yeah. [00:04:53] Speaker C: I mean, the Backstreet Boys were there forever, and then, you know, even Phish; this is not the first time Phish has been there. No, it's not. [00:05:00] Speaker A: I don't think they have a residency. I think they just hit it up. [00:05:03] Speaker C: But yeah, then the Eagles, you know, they keep coming back regularly too. So there's always that option for you. But yeah, no, it's a cool place, and it's worth going to Vegas for that concert experience. It's expensive; that's my only complaint. [00:05:15] Speaker A: It is very expensive. [00:05:16] Speaker C: It only holds, I think, what, 8,000 people? I mean, it's not a huge venue either, so it's expensive for the right reasons, but it's pretty cool. Definitely worth checking out if you're there. So I'm glad you got to do it. I'm glad you had a good time. It's amazing. [00:05:30] Speaker A: It really is. [00:05:31] Speaker D: I didn't realize how small the venue is. [00:05:34] Speaker A: I didn't realize it was only 8,000. I'm surprised by that. [00:05:36] Speaker C: Yeah. Never mind, you're low. Sorry. It might be the floor capacity versus seating capacity. Let's do real-time follow-up. [00:05:43] Speaker D: I said it with such certainty. [00:05:45] Speaker C: I did, I did. Standing capacity is 1,400, total capacity is 20,000, and seating capacity is 17,000. That makes more sense. [00:05:51] Speaker D: Yeah. [00:05:51] Speaker C: Yeah, that's about right. [00:05:52] Speaker D: Yeah, it feels better. [00:05:53] Speaker C: Yeah, but it's still. Yeah. And they keep trying to build more of them. I think the only place that's agreed to build another one is Dubai so far. Because of course, Dubai. [00:06:00] Speaker A: No, there's another one I heard that's being built out in Virginia. [00:06:04] Speaker C: Oh, really? 
I know they wanted to build one in London, and London was like, yeah, no, the light pollution. But I'm like, the outside of the Sphere is cool; it's definitely awesome on its own, especially as the whole thing. But, you know, the venue inside is actually way cooler than even the outside of it. So I don't know why they are insistent on having the outside of the Sphere other than the marketing potential of it. But apparently it got killed in London, which makes sense, because no one in London wants a giant eyeball looking at everybody. [00:06:30] Speaker A: Yeah, it is kind of like, where else, other than Dubai, would you build one where it makes sense? [00:06:37] Speaker C: Singapore makes sense to me. [00:06:39] Speaker D: Good call. [00:06:40] Speaker C: Maybe Sydney as well, but that one I'm not sure about. Tokyo would probably fit right in there. There are definitely some places I think it would work, but they're expensive to build. I think it was many, many billions of dollars; five billion, something like that. [00:06:57] Speaker D: Well, they did it once, so it should be cheaper the second time. [00:07:00] Speaker C: That's what they say. That's what they do say. But you have seen that inflation and tariffs have raised prices of everything, so maybe not. You never know. [00:07:09] Speaker D: It's kind of fitting, with the giant eye, being in D.C. with all the government stuff. Everyone's spying on each other, so it makes sense there. [00:07:18] Speaker C: Maybe they should make a ballroom out of it, and then they could just put it right next to the White House. Just saying. All right, let's get into real news here. Google Next happened. We'll get to that in a minute. But there was a lot of news before Google Next that we have to get through here quickly. 
First up, Amazon and Google have invested in Anthropic: Amazon $25 billion and Google $40 billion. Both of these partnerships are a commitment to buy more capacity, measured in gigawatts. On the Amazon side, they're going to be using Trainium 2 and Trainium 3 capacity, and on the Google side, they're going to be using the TPU architectures. So Anthropic is definitely taking advantage of all the GPUs they can get as their growth continues to be astronomical. And so, yeah, they invest the money and then it gets paid back to them, which is sort of a weird money-on-paper problem. I don't really understand how it works, but that seems to be what happens all the time right now in the AI space. [00:08:08] Speaker A: It does. [00:08:10] Speaker C: And so Anthropic continues to get big amounts of money, and their valuation continues to be absolutely crazy. [00:08:17] Speaker D: There's. [00:08:19] Speaker A: Yeah. If I understood how the money thing worked. I saw a funny thing where someone was plugging an extension cord into itself as a description of how this was working. [00:08:32] Speaker C: Yeah. [00:08:33] Speaker D: That's kind of what it feels like, though. [00:08:34] Speaker A: Yeah. [00:08:34] Speaker D: I give you money, you give it back to me because you're going to spend it on my platform, and somewhere in there somebody gets a salary. It's kind of like, you can't infinitely make power; there's always some loss. There's some lost money somewhere in here. [00:08:49] Speaker A: And you probably can't tax this in the same way. Right. 
So it's all just sort of how they're getting away with it; like, "we're investing," it's not buying. [00:08:58] Speaker C: But I mean, how many shares? Eventually that means Anthropic either has to return that investment, if it's a loan, and they get paid back with interest, or they own some portion of the company and the company has to go public to basically redeem that money for them. So it's a lot of money. I don't know. The IPO on this company is going to be massive when it happens. It's going to be amazing. [00:09:20] Speaker A: But I wonder about the financials then. I think it'll be one of those that balloons up right away and then craters, crashes, because I think AI still is very unsustainable in terms of cost. So it's kind of nuts. [00:09:36] Speaker C: I mean, the biggest thing is when they file their. Is it the 10-K that comes out before they go public, or the prospectus, whatever. That's what killed. What was the name of the co-working space? WeWork. When you actually looked at the financial numbers, you were like, oh my God, this is a house of cards, and it basically imploded their entire IPO. So the big question is going to be when one of these companies, OpenAI or Anthropic, finally goes public and they start publishing these things, what people's actual reaction is to their financials. [00:10:07] Speaker A: And can they sustain the level of growth that the Street wants? [00:10:13] Speaker D: Well, they can't sustain it, and they're essentially subsidizing it. Something I saw, I think it was for GitHub, where they were saying it's the $30 you pay per person. A low user, and don't hold me to these numbers, was costing them like 20 bucks. The medium was around 30, but the highs are like $90 per agent. They're losing money on over half the seats that they're selling right now, just to get people on it. 
[00:10:40] Speaker C: Well, GitHub announced today. [00:10:43] Speaker D: Today. [00:10:44] Speaker C: We'll talk about that next week, because it came after our cutoff for the week. But yeah, they've realized they're about to lose a bunch of money, especially with agentic. I'm sure they've been losing money, but they're going to lose a lot more money; that's the problem. So I think you're going to see everyone moving into consumption pricing pretty heavily. Even Anthropic has talked about moving away from the monthly plans to pure consumption, because the reality is they're subsidizing your use of AI at those cost tiers. Then people get mad because they change the models, they kick out OpenClaw, etc., and it's like, well, if you were truly paying for the consumption you're using, then you wouldn't be as upset about this. [00:11:25] Speaker D: I think it was like 220. I saw somebody look at the number of tokens and do the math out; it was like $220 worth of tokens that the $100 plan is essentially getting you. [00:11:36] Speaker C: Right. So we'll continue to see Anthropic get invested in; I'm sure there'll probably be two or three more cash infusions, because their side of it is they are paying a lot of money to these cloud providers, and to Nvidia itself, for TPUs and GPUs. And eventually that merry-go-round has to stop. So we'll see. SpaceX has apparently struck a deal with AI coding startup Cursor: either a $60 billion acquisition or a $10 billion partnership fee, giving Cursor access to xAI's Colossus supercomputer, which runs 200,000 Nvidia H100-equivalent GPUs for model training. Cursor has been compute-constrained despite reaching a billion dollars in annual recurring revenue and a $29.3 billion valuation. 
So this deal directly addresses their infrastructure bottleneck for scaling model intelligence. The partnership positions SpaceX to compete in the AI coding tool space against Anthropic and others, notably given that xAI's Grok has publicly acknowledged falling behind competitors in coding capabilities for developers and cloud users. This deal signals continued consolidation between compute vendors and AI coding tools, which could influence future pricing. SpaceX's recent xAI deal, combined with this Cursor deal, suggests a vertical integration strategy connecting rocket-company compute infrastructure directly to developer-facing AI products ahead of a potential IPO later this year. This IPO is going to be a lot of funny money. [00:12:58] Speaker A: It's so weird. It's already bad that they sold xAI to SpaceX. That's just giving money to yourself. [00:13:10] Speaker C: Yeah. I can't believe they didn't rename SpaceX to SpaceX AI. Wouldn't that make more sense? Is that what you would have done? I think people saw this announcement come out right before Next, and everyone was like, why is SpaceX buying Cursor? Then you have to remember how they bought xAI, and it's like, oh, okay, this makes perfect sense in the light of xAI. But it still doesn't make sense, though. That's my problem with it. I mean, the thing I don't get is the $10 billion partnership versus the $60 billion acquisition. What are the triggering events on those things? When is it a partnership, versus when is it now an acquisition? And does that mean that these people who are working at Cursor, if it's a partnership, aren't getting equity? Because that's a bummer. [00:13:50] Speaker A: Yeah, I didn't catch that it was one or the other. So that's kind of interesting. 
[00:13:55] Speaker C: I think it starts as a $10 billion investment and then it can turn into 60 billion; it's like a $60 billion option, is basically my understanding. Yeah. [00:14:04] Speaker D: That still feels strange. [00:14:07] Speaker C: It's a weird deal, because I think the reality is Elon doesn't have any money. It's not liquid. [00:14:12] Speaker A: Not real money. [00:14:12] Speaker C: Yeah. And so this is his way of, like, I can basically tie you down at a reasonable price, and then if the valuation goes down further, maybe I can get you cheaper later. [00:14:21] Speaker A: Right. [00:14:22] Speaker C: That's kind of my feeling. [00:14:23] Speaker A: Yeah. I mean, Tesla's no longer going to make cars, they're just going to make robots. SpaceX is doing AI. It's cats and dogs living together. It's mass hysteria. [00:14:32] Speaker C: Yeah, it's crazy talk. Crazy talk. All right, well, AI is how ML makes a lot of money, apparently. This week, OpenAI updated its Agents SDK to general availability, adding native sandbox execution, configurable memory, and filesystem tools modeled after Codex. Agents can now read and write files, run shell commands, install dependencies, and apply code patches within controlled environments, without developers building that infrastructure themselves. And Ryan says kill it with fire. [00:15:02] Speaker A: No, as long as it also logs and has permissions and some sort of boundaries, I don't have to kill it. [00:15:09] Speaker D: Admin rights for everyone. [00:15:11] Speaker A: No, it's just terrifying, because we already have people that are just throwing questions into any chat tool and then running whatever command it spits out indiscriminately. And now that's just going to happen at a faster rate, and people don't quite understand what's going on and, in a lot of cases, don't care to. [00:15:33] Speaker C: Right. 
[00:15:33] Speaker A: Just do what I want. [00:15:34] Speaker D: Ryan should never look at how I play with my dev environment, like my personal AWS account. Like, hey, Claude, just go run fully autonomous and run whatever commands you want. It's fine, don't worry about it. [00:15:46] Speaker C: So you're saying that I shouldn't go use this website that's by a company called Moonshot? That's Chinese, and I don't know what my data is doing. I mean, I think the reality is most people don't actually know what's in their prompts. And if you ever actually look at the prompt logs, you should be like, oh my God. Because it sends everything, all the things. Yeah. [00:16:05] Speaker A: I don't think people quite realize what data gets consumed. Credentials and environment files, all kinds of stuff gets sent somewhere. And it doesn't necessarily mean that it's exposed, but it also doesn't mean that it's secure. And so if there's ever kind of a leakage or any kind of breach at one of these AI companies where they can get a hold of that data, it'd be a problem. [00:16:26] Speaker D: I've definitely seen people, when it goes like, "insert key here," and they go, no, no, no, give me the script. Here's my key, create the script for me. I'm like, you understand that, one, you just killed, I don't know, three whales with this whole other prompt, and all you had to do was copy and paste the key into this spot. And now your key is, probably not. I mean, like you said, it's not insecure, but it's not the most secure thing anymore. It's somewhere in the logs of one of these companies. [00:16:59] Speaker A: The truth is, it's not really an agent problem. The truth is that we don't have the correct authorization and permissions model for people. Right. And so we give people broad standing access. 
And now that's a concern, because there's going to be a whole bunch of automation that's basically pretending to be me, running on these local environments. That's really the issue. And so I think it'll have to change the identity and access management plans that we use. There'll be a lot more just-in-time permissions and accepting of that kind of thing. So a continuation of what you already have with AI, of: is this okay? Click approve. Is this okay? [00:17:38] Speaker C: I mean, I think the fear you have to have in that idea is, okay, we are asking for a lot of permissions all the time in, like, Claude Code or Gemini CLI. It's like, I want to run this script, I want to do this thing. And you're like, yes, yes, always, always, always, because you get tired of it. It just becomes prompt fatigue, which is kind of what happened with Windows UAC back in the day. People just got to the point of, whatever, I just hit yes. I don't even read these things when it grays out your screen and makes you do a thing. And I think that's the risk you have even inside of some of these agentic identities: if you're going to constantly be having to approve things, then you don't actually get agentic, which was the whole point. And so the balance of security versus agentic versus least privilege is going to be a really interesting friction playing out over the next year or two. [00:18:29] Speaker A: Yeah. And it's crazy, because I don't have great answers. It's bad on both sides. I want the productivity of agentic. I am so much more productive because of these things. And I want to enable agents on my work laptop, where I can do much more sophisticated things. 
But I also am terrified of that, and of people being able to execute directly against production environments, you know, because they're port forwarding into that environment. They can just run a command, and the AI knows to do it because you left some other file that says how to do it. And so it's like, oh, this is useful information, I'll consume this and then just use that. [00:19:11] Speaker C: Yeah. Well, you know, with things like Cowork and these other tools, you can create whole agents that run on your laptop that are now, you know, invoking. They have access to your identity and things like that. And so yeah, the boundaries are evolving and changing, and the perimeterless enterprise has a suddenly new tone to it, doesn't it? [00:19:32] Speaker A: Oh yeah. [00:19:32] Speaker C: More sinister than it used to be. [00:19:34] Speaker D: Yeah, I mean, at one point you've got to have your end users be responsible for what they're doing and, you know, help them help themselves and help the company. Because that takes some ownership. [00:19:46] Speaker A: They have to take some ownership. But I don't know how to give them the same guardrails. And that's really the problem. As a security person, I don't want to say no to everything. That's the joke. But the goal really is to provide those same guardrails. The problem is, not everyone agrees what those guardrails should be, but that's fine. And with this one, I don't know, other than saying, yeah, if you do something bad, you're fired, I got nothing. Right. And I don't want to do that. [00:20:09] Speaker D: But at some level, at some point, it does come down to that. [00:20:14] Speaker A: Yeah, yeah, we all have to have some accountability. But also, when it's so quick to shoot yourself in the face, there has to be some sort of level of protections as well.
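The approval-fatigue problem described above can be sketched in a few lines: instead of asking the user about every single action, an agent harness can auto-approve a small read-only allowlist and escalate only the risky stuff. Everything here (the function name, the two command sets) is a hypothetical illustration, not the API of Claude Code, Gemini CLI, or any real agent framework.

```python
# Minimal sketch of a just-in-time approval gate for agent shell commands.
# SAFE_COMMANDS and DANGEROUS_TOKENS are illustrative placeholders only.
import shlex

SAFE_COMMANDS = {"ls", "cat", "git", "grep"}      # read-mostly tools
DANGEROUS_TOKENS = {"rm", "sudo", "curl", "ssh"}  # always escalate these

def triage(command: str) -> str:
    """Return 'auto-approve', 'ask-user', or 'deny' for a shell command."""
    tokens = shlex.split(command)
    if not tokens:
        return "deny"
    if any(t in DANGEROUS_TOKENS for t in tokens):
        return "ask-user"          # just-in-time human approval
    if tokens[0] in SAFE_COMMANDS:
        return "auto-approve"      # no prompt fatigue for read-only work
    return "ask-user"

print(triage("git status"))        # auto-approve
print(triage("rm -rf /tmp/x"))     # ask-user
```

The point of the sketch is the shape of the trade-off: the smaller the auto-approve set, the safer but more fatiguing the agent; real guardrails would need far richer policy than a token allowlist.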
[00:20:28] Speaker C: And it doesn't have to be, we just fire people. [00:20:30] Speaker A: Yeah, okay. Non calvary. [00:20:33] Speaker D: Blame AI for it. It's fine. [00:20:34] Speaker C: Yeah, it's fine. I mean, that's what they're doing now. AI layoffs for business problems. [00:20:39] Speaker A: That's true. [00:20:39] Speaker C: Popular. [00:20:40] Speaker A: Kind of a different way to do [00:20:41] Speaker C: it, but I guess, yeah. Well, if you'd like to burn money, Claude Opus 4.7 is now generally available across cloud products: the API, Amazon Bedrock, Google Cloud Vertex and Microsoft Foundry, all at the same price as Opus 4.6. So you get $5 per million input tokens and $25 per million output tokens. The model targets complex, long-running agentic coding workflows, with early testers reporting 13% higher resolution on a 93-task coding benchmark and 3x more production task resolution on a Rakuten software benchmark compared to Opus 4.6. Vision capabilities received a notable upgrade, with Opus 4.7 now supporting images up to 2,576 pixels on the long edge, more than three times the resolution of prior Claude models. This opens up use cases like computer-use agents reading dense screenshots and data extraction from complex technical diagrams, though higher resolution images do consume more tokens. Anthropic is using Opus 4.7 as a test bed for cybersecurity safeguards ahead of any broader release of its more capable Mythos preview model. The model includes automatic detection and blocking of prohibited cybersecurity uses, with a new cyber verification program available for legitimate security professionals doing penetration testing or vulnerability research. A new extra-high effort level, to burn those tokens faster, sits between the existing high and max settings, giving developers finer control over the reasoning versus latency trade-off.
Developers migrating from Opus 4.6 should note that the updated tokenizer can increase token counts by roughly 1.3 to 1.5 times depending on content type, and a migration guide is available out there for you. So yeah, this is a fun trick. We just make the tokenizer less efficient and then we make more revenue. It's beautiful. The last thing is file-system-based memory improvements that allow Opus 4.7 to retain notes across multi-session agentic work, reducing the need to reestablish context at the start of each task. This is particularly relevant for enterprise teams running parallel agent workflows, which are still not GA but I use every day. [00:22:26] Speaker A: Well, there's a lot of other platforms in which you can sort of enable this, right, where you're sort of tying it together with a glue ecosystem. It's funny. I didn't realize it's the same price, because on every platform that I'm using Opus 4.7 in, it's so much more expensive. [00:22:43] Speaker C: Well, it's more expensive because of that tokenizer and, you know, memory things and other reasons why it is a bit more expensive. And also you can put bigger images into it. The thing is, if you were pasting images in and they were too big, it would just automatically downsize them previously. Now it just takes a bigger image, so now there are more tokens it has to deal with. So there's lots of little things there that you need to be aware of in that conversation. [00:23:06] Speaker D: I mean, I definitely found it burning more tokens, and I switched back to 4.6 for a few things. But also, as I use AI more and more, I'm starting to try to be a little bit more conscious of what model I choose where, and not just saying let me use Opus all the time. Sonnet does perfectly fine for 90% of what I need, you know, except for when I'm doing deeper research and analysis and trying to fix a pretty deep bug, when I flip over to Opus.
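The pricing discussion above is easy to make concrete. This is back-of-the-envelope math only: the $5/$25 per-million-token rates come from the episode, while the 30% tokenizer inflation is an illustrative figure, not a claim about the actual tokenizer.

```python
# Cost of one request at the quoted Opus rates:
# $5 per million input tokens, $25 per million output tokens.
INPUT_PER_M = 5.00
OUTPUT_PER_M = 25.00

def request_cost(input_tokens: int, output_tokens: int,
                 tokenizer_multiplier: float = 1.0) -> float:
    """Dollar cost of one request; the multiplier models token-count inflation."""
    inp = input_tokens * tokenizer_multiplier
    out = output_tokens * tokenizer_multiplier
    return inp / 1e6 * INPUT_PER_M + out / 1e6 * OUTPUT_PER_M

# A 100k-token-in / 10k-token-out agentic task:
print(round(request_cost(100_000, 10_000), 2))       # 0.75
# Same task if a new tokenizer inflates counts by 30%:
print(round(request_cost(100_000, 10_000, 1.3), 3))  # 0.975
```

Which is the hosts' point in miniature: "same price per token" still costs more per task once the token counts themselves go up.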
[00:23:36] Speaker C: I mean, if you don't need reasoning, even Haiku can do a lot of work for you. So if you don't need a lot of reasoning and you have a very specific, clear ask, like, I need you to take these secrets and convert them to Parameter Store to save me money, Haiku can do that in its sleep. Yeah, you know, and so that's a very clearly defined this-and-that task. Where you need thought and real engineering, that's where you want the Sonnet or the Opus. But I've never been a big Opus user because, A, it is outrageously expensive. Now, if I'm in the middle of a production bug and I need to figure out what's broken with this code real quickly, yes, I might use Opus then. But also, for a lot of my coding stuff, I'm moving more and more into, like, Kimi 2.5, not via Moonshot, by the way, but through, you know, Ollama and other solutions that provide these, because those models are being rated near the same level as an Opus or a Sonnet, and they're like three or four dollars per million input tokens. You know, it's so much cheaper. So the reality is, you definitely should shop around. So I started using Bifrost in the last week just because I wanted an easier way to migrate between different model agents. And Bifrost is a super cool way to add multiple providers and be able to choose them just the way you would any other tool. And you can use Claude Code, you can use OpenCode, you can use Codex or any of the other agents that are out there, whichever one you like. So you don't need to change your tool, you just change the backend. Everything else works the same, and then, you know, if this is working great, I'll keep using it. If not, then you switch. But there's lots of opportunities to save yourself a lot of token money. [00:25:10] Speaker A: Definitely. You do have to kind of invest and understand that whole ecosystem, because it's fine once you get it up and running.
Getting it set up is a little complex. [00:25:19] Speaker D: That setup is what takes the time. But the payoff on it is a lot if you're doing a lot of development. And that's kind of the balancing act here: if you are doing a lot, then it makes sense to spend the time. If you're not, then something, you know, off the shelf. [00:25:38] Speaker A: Yeah, it is kind of funny, because I go back and forth with my wife, who's, you know, thankfully not in tech. I walk her through some of my setups, and about halfway through she stops listening. And then she understands that it's clearly too complex, and she just wants to ask a question somewhere. It's like, okay, yeah, yeah. [00:25:59] Speaker C: The reality is that, like, you know, don't give up your data center. Maybe buy some GPUs, because running these models yourself might be the only way to save yourself a lot of money. I mean, cautionary tales. I don't think we talked about it on the show. Maybe we did. But, you know, Uber, they spent their entire year budget for tokens in four months. An entire budget. They now have to figure out something else. And so, yeah, maybe don't use Opus, number one. But yeah, those are big risks for businesses. So I think this area of how do we be more efficient is going to become a bigger area, and I'm hoping there'll be some tools coming out that'll actually dynamically look at what you're asking and make the decision. Because right now that exists, sort of, in some of the APIs out there. If you have specific rules, like, oh, they're asking a question about this, or they're asking this, I will go to this model. So you can do really static routing rules. But I really wish there was a more dynamic, you know, interpretation of, oh, this is a super easy question, this is a more complex question, and then it can automatically route between them.
I think this is what we'll probably see as kind of the next phase of AI tooling: a bunch of tools that help do that exact thing. [00:27:05] Speaker A: I think that is an option for some of the coding stuff. Like Cursor, you have to configure it, and then Copilot can also do that. I'd like to see it expand outside of the coding stuff for, you know, maybe more just general use. [00:27:18] Speaker C: But then you also risk, you know. I think Anthropic is working on something like an adaptive mode where basically it'll adapt between the three models. But then, going back to this tokenizer: hey, we need more revenue this month. Change the adapter to use more Opus this month. [00:27:34] Speaker A: But on the flip side, they are doing granularity. You can now choose the amount of reasoning that you select, even above the model. And so there's other things that they're doing, giving more buttons. But you have to be flipping around a lot of [00:27:48] Speaker D: these things. That's where I want that model router, like Azure and other cloud providers have. That's the intelligence I want in that service. Weirdly, I don't say I want to pay the cloud providers much in general in life, because I give them way too much money either way, but I want to pay for something that will do that intelligence. Like, Justin, you said: hey, this is a simple query, send it to Haiku, it's an if-this-then-that. Versus, no, this is, hey, I need to architect this entire thing, go to Opus and burn a few extra dollars to get a little bit more detail, so we can use the cheaper models later on because everything is broken down in a much more simplified way. [00:28:31] Speaker C: Agreed. Model routers: hot, hot topic for sure.
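The static routing rules the hosts describe can be sketched very simply: keyword heuristics that send cheap, clearly-scoped asks to a small model and reasoning-heavy ones to an expensive model. The hint list and model names are placeholders for illustration, not any real router's configuration.

```python
# Sketch of static model routing by prompt complexity.
# REASONING_HINTS is an illustrative heuristic, not a real product's rule set.
REASONING_HINTS = ("architect", "debug", "root cause", "analyze", "design")

def pick_model(prompt: str) -> str:
    """Route a prompt to a placeholder model tier based on crude heuristics."""
    p = prompt.lower()
    if any(hint in p for hint in REASONING_HINTS):
        return "opus"      # expensive tier for deep reasoning
    if len(p.split()) > 200:
        return "sonnet"    # mid tier for long but routine work
    return "haiku"         # cheap tier for clearly-scoped tasks

print(pick_model("Convert these secrets to Parameter Store"))  # haiku
print(pick_model("Architect this entire service for me"))      # opus
```

The gap the hosts point out is exactly what this sketch can't do: a dynamic router would judge actual difficulty rather than matching keywords, which is why static rules break down outside narrow use cases.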
All right, moving to Claude Design, which is at least their answer to Google Stitch, basically. Anthropic launched Claude Design in research preview for Pro, Max and Enterprise subscribers. Powered by Claude Opus 4.7, it enables users to create interactive prototypes, pitch decks, wireframes and marketing assets through conversational prompts and inline editing controls. The new workflow features a Claude Code handoff: finished designs are packaged into a bundle that developers can pass directly to Claude Code for implementation, creating a tighter loop between design and engineering. Claude Design builds a team-specific design system during onboarding by reading code bases and design files, then automatically applies brand colors, typography and components to every subsequent project. Design teams can maintain multiple design systems simultaneously. Early user data from Brilliant suggests complex pages that required 20 prompts in other tools need only two prompts in Claude Design, indicating a meaningful efficiency gain for interactive prototype creation. Export options include Canva, PDF, PPTX and standalone HTML, with organizational-scope sharing and collaborative editing. And for enterprise customers, the feature is off by default and must be enabled by admins in the organizational settings. So yeah, design, it's great. Love it. [00:29:44] Speaker A: I haven't had a chance to play around with it. I mean, I really like these kinds of tools just because, you know, the big joke is my UX skills suck. Even with AI, my UX skills suck, which I find hilarious. [00:29:58] Speaker C: I mean, the reality is Claude was pretty good at design before this tool existed. You know, if you put the frontend design plugin into Claude, it does a really good job on its own. So, you know, it's nice to see.
I did, unfortunately, decide to use this and try it out in the middle of a coding thing, and Opus ran out my credits, so I had to wait. I had to wait two hours to get my next allotment of credits or use overages. But yeah, it's definitely impressive what it [00:30:24] Speaker D: can do. Now, is it as good as they say? Because I feel like a lot of the Anthropic, you know, we talked about this, I guess, two, three weeks ago, other stuff always feels a little bit overhyped. [00:30:35] Speaker C: So, I mean, I haven't tried Mythos, but, you know, the analysis of the market is that Mythos is a bit overhyped. I played with this personally and I was very impressed with the output. It's at par or better than Stitch, so that's the only thing I can compare it to. I don't use Canva, I don't use Figma, so I can't compare it to those tools. Maybe it's not as good as those tools. Again, those are specialty tools for designers. I am not a designer, nor do I ever want to be one. And so, you know, for what I need, it's pretty darn good and I like it. But again, I'm not a designer. So yeah, I'll keep an eye out for some articles about people saying it's really shit and I'll let you guys know. [00:31:07] Speaker D: We're so optimistic. Yeah. There are a lot of cloud cost management tools out there, but only Archera provides insured commitments. It sounds fancy, but it's really simple. Archera gives you the cost savings of a one or three year AWS Savings Plan with a commitment as short as 30 days. If you do not use all the cloud resources you've committed to, Archera will literally cover the difference. Other cost management tools may say they offer insured commitments, but remember to ask: will you actually give me my rebate? Because Archera will. Check out thecloudpod.net/archera to schedule a demo today. [00:31:55] Speaker C: Cloudflare held its first Agents Week, which was not that exciting, actually, but they had several things they released over the week.
The first one up was the Agentic Cloud, so now you can run all your agentic workloads on top of Cloudflare. The new environment supports both full operating system containers, for package installation and terminal commands, and lightweight isolates that start in milliseconds for high-scale environments. They also shipped a Git-compatible workspace designed for agent-generated code moving from prototype to production. Security and identity were treated as built-in defaults rather than add-ons, with new tools for connecting agents to private networks and managing autonomous actions taken on behalf of users across your organization. And the agent toolbox additions include inference, search, memory, voice, email and a browser primitive, giving agents the ability to perceive, remember and communicate without developers assembling separate third-party services. I look forward to Cloudflare taking down Cloudflare with these great tools and then writing an RCA. [00:32:44] Speaker A: So it'll be a very well documented RCA indeed. That's pretty funny. [00:32:50] Speaker D: Will the AI blame itself for it? [00:32:53] Speaker A: I mean, it does in my code all the time. [00:32:56] Speaker C: Apologizes all the time, he says. You're right, Justin, I was mistaken. It has to do that. I'm like, cool, cool, appreciate you. That's why having personas is great, because even Claude as the developer versus Claude as the QA, they argue with each other. It's fantastic. In addition, this Agents Week also launched Artifacts in private beta, a versioned file system built on Git that lets developers and agents programmatically create, fork and manage Git repositories at scale via a REST API and a native Workers API, with public beta targeted for early May.
The system is built on Durable Objects, with a Git server written in Zig that compiles to a roughly 100-kilobyte WebAssembly binary, enabling tens of millions of isolated repo instances per namespace while handling the full Git Smart HTTP protocol with zero external dependencies. Cloudflare is also open sourcing Artifact File System, a file system driver that mounts large Git repos using a blobless clone and lazy file hydration, reducing startup times for multi-gigabyte repos from 2 minutes to 10 to 15 seconds. So yeah, I guess that's good too. Also another way that it's going to take down Cloudflare, so look forward to that. Yeah, well, it's kind of interesting. [00:33:56] Speaker A: I don't know, it solves a problem I didn't know existed, in terms of, you know, being able to have an agent manage multiple Git repositories more quickly. [00:34:05] Speaker C: Oh, let me tell you how much I learned about worktrees. [00:34:07] Speaker A: Oh yeah, okay. [00:34:08] Speaker C: In the last month. Like, dealing with multiple agents that want to work on the same code base, you're like, well, your answer is worktrees. Worktrees are not designed for humans, so thank God there's AI, because AI can make them work. Because the idea of a worktree was, well, you're in the middle of doing code, but you need to do a hotfix against production, which is on the master branch, and so you need the ability to do this quickly. That was kind of the gist, when I read the documentation, of why they created it. But the idea was you only had one or two worktrees ever. Right. And now, you know, with my AI, I can have like 10, 15, 20 of them, and let AI deal with that nightmare, because I can't. It hurts my brain. [00:34:46] Speaker D: It's one of those Git features I didn't know existed, and it immediately solved the problem when I was trying to do what Justin's saying: hey, I need three things going on at once that are all going to collide.
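The worktree workflow being described boils down to a few commands: one clone, several working directories, so parallel agents (or a hotfix) never collide. A minimal sketch, assuming a repo whose default branch is `main`; the directory and branch names are just examples.

```shell
# One working directory per parallel task, all sharing one .git store.
git worktree add ../feature-a -b feature-a   # new branch in its own directory
git worktree add ../hotfix -b hotfix main    # hotfix branch off main, own dir
git worktree list                            # see every checkout at once

# When an agent is done with its branch, clean up the directory:
git worktree remove ../feature-a
```

Note that git refuses to check out the same branch in two worktrees at once, which is exactly the collision protection that makes this useful for multiple agents.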
And yeah, I tried to do it manually myself at one point, for one thing. I was like, no, too hard, not worth the effort. You know, let's go have AI do this thing for me. [00:35:08] Speaker A: You know, just merge and rebase in a cyclical fashion for these feature branches until you just end up ripping one out before committing to main. [00:35:17] Speaker D: No, I end up with a feature branch that just has too many things in it that's not one feature. Because I'm like, ooh, let me add this next thing as I'm doing it. Or I just commit and it's like, this code doesn't work and it's untested, and then I go back to my main branch, I check it out, I do the hotfix I need, and then I go back to my thing and I don't know where I am, because I just have this commit that's called untested code. [00:35:39] Speaker C: Nice. [00:35:39] Speaker A: Brilliant. [00:35:41] Speaker C: We'll work it out. We'll work on the QA process for your code. It's fine, Matt, we got you. [00:35:45] Speaker D: That's why I have a QA agent doing it for me. [00:35:47] Speaker C: Right, exactly. Snowflake is launching Cortex Agents as a full enterprise agent platform, with several capabilities now generally available, including multi-tenancy with row-level data isolation, agent versioning with commit-based rollback, and resource budgets with per-agent and per-team spending controls. So if you need your agents close to your data, this is a great way to do it. I definitely would look into your cost on this one, because Snowflake is not cheap for compute. [00:36:08] Speaker A: Nope. I do. I mean, I look forward to the MCP connector, which they announced as part of this. I think that's going to be a lot of fun. I think that's mostly because I don't like SQL interfaces in any kind of database. [00:36:22] Speaker C: So like. [00:36:23] Speaker A: But, you know, I do think that there's a lot of value in running tools directly on the data, just because there's.
It keeps it sort of contained and isolated, and it removes some of the concerns I have with running it locally with a direct session open to Snowflake, which is what everyone wants to do. So I do like this kind of model. But yeah, it's Snowflake, so how much can it cost me? [00:36:47] Speaker C: A lot. A lot, for sure. [00:36:49] Speaker A: Yeah. [00:36:50] Speaker C: Well, you know, normally we expect Amazon to be a spoiler on Next, and they did their part, but OpenAI came in hard and heavy on this one too. So first up, they announced GPT 5.5, generally available in both ChatGPT and Codex for Plus, Pro, Business and Enterprise users, with API access priced at $5 per 1 million input tokens and $30 per 1 million output tokens. So a little bit more expensive than Opus. And a Pro variant at $30 input and $180 output per 1 million tokens. The model shows notable agentic coding improvements, scoring 82.7 on Terminal-Bench 2.0 and 58.6 on SWE-bench Pro while using fewer tokens than GPT 5.4 to complete the same task, which, you know, is pretty nice for cloud and enterprise workflows. GPT 5.5 was co-designed with and served on Nvidia GB200 and GB300 NVL72 systems, with inference optimizations including dynamic load balancing heuristics that increase token generation speeds by over 20%. Print the money faster. Knowledge worker benchmarks are worth noting for enterprise buyers: 84.9 on GDPval across 44 occupations, 78.7 on OSWorld-Verified for autonomous computer use, and 98.0 on Tau2-bench telecom for customer service workflows. OpenAI is classifying GPT 5.5 as high under its Preparedness Framework for both cybersecurity and biological capabilities, and has introduced a Trusted Access for Cyber program through Codex that gives verified defenders expanded access with fewer restrictions, which has direct applications for security teams valuing AI-assisted vulnerability management. [00:38:19] Speaker A: That's kind of cool.
I mean, it's the first I'm hearing of those kinds of frameworks where they're testing those things and sort of testing the, [00:38:27] Speaker C: I don't know, [00:38:29] Speaker A: not sustainability, but the AI safety sort of aspects of it, and having a rating, which I like. But then having the ability to give people access beforehand. It's like reporting a vulnerability directly before going public. Kind of nice, yeah. [00:38:44] Speaker C: To make Ryan grumpy, OpenAI then launched workspace agents in ChatGPT as a research preview for Business, Enterprise, Edu and Teacher plans, positioning them as an evolution of GPTs, powered by Codex and designed for shared team workflows rather than individual use. These agents run persistently in the cloud, meaning they can continue working on long-running tasks without user interaction, and can be triggered on a schedule or deployed directly in Slack. Use cases OpenAI highlights include a lead outreach agent that reduced five to six hours of weekly rep work to automated background processing, and an accounting agent that handles month-end close tasks including journal entries and variance analysis. On the enterprise control side, they've added several things, including admins getting role-based access management, a compliance API for auditing every agent configuration and run, a built-in prompt injection safeguard, and the ability to suspend agents. And another feature: privacy filtering, an open-weight 1.5 billion parameter model for detecting and redacting PII in text, available now on Hugging Face and GitHub under the Apache 2.0 license. So if you're looking for a lightweight built-in option inside of Codex to find PII, this little model sits on top of it and does great work. [00:39:47] Speaker A: Yeah, so there's nothing in here that makes me angry, because it's not really running in your local work system pretending to be me. Right? Like it's.
I think that running on these platforms like, you know, Gemini Enterprise or OpenAI Enterprise is the right way to do it, and the safe way to do it, because they are sandboxed environments. They have dedicated agent identities that you can control and give specific permissions to. So I do like that sort of model for these things, and, you know, now they're making me really happy with the PII filters on top of these things, because I think that running in those platforms is the only way, at least that I know of, that you can enable that sort of sidecar AI watchdog, which is cool. [00:40:35] Speaker D: And this is great also, not just for that. Before, you could integrate this into your product, you know. So if you are taking any data from a customer, you could at least say, we're not just sending everything they type into the LLM. We do have it run through some sort of filter that attempts to catch it, with the caveat that it's not 100%. But we can at least say, hey, Matt's Social Security number 112-23344 is not actually going, and it should automatically catch that and at least obfuscate it. It's the old DLP coming back in [00:41:10] Speaker C: a new way. And it continues with GPT Image 2.0. Apparently it's way better at following instructions. It can now handle small text, UI elements, icons and complex layouts at up to 2K resolution. No more getting something close enough; it actually delivers what you asked for. It works in non-English languages. Previous versions struggled outside of Latin-based text. Now it solidly supports Japanese, Korean, Chinese, Hindi and Bengali, where language is baked into the design itself. It can think before it generates. When paired with a reasoning model, it can search the web, plan the image structure, self-check its work and even produce multiple distinct images from a single prompt. ImageGen meets agentic AI.
Flexible aspect ratios: it supports everything from wide 3:1 banners to tall 1:3 mobile screens, useful for social graphics, presentations, posters and more without manual resizing. And it wants to be your entire creative workflow. The big picture play here is replacing the back and forth between prompting, designing and editing. And as a person who does a lot of image generation with ChatGPT and Gemini Nano Banana, thank God. I like [00:42:09] Speaker D: that you can actually do multiple at the same time. That's a pretty nice feature, because when you don't necessarily know what you want. You know, I create for less important things than Justin does. Like, you know, he does artwork for the podcast. Mine are more like, let me just generate this to send to a coworker to troll them a little bit. [00:42:27] Speaker A: Oh, trust me, he does plenty of that as well. [00:42:29] Speaker D: Generate multiple. But it's nice that it will actually generate multiple of these right now, and then I could tweak it from there to make it as, you know. I don't know how to say it nicely, but, you know, as Matt-isms as possible for the person I'm sending it to. Matastic, maybe. I think it was Matissism. [00:42:56] Speaker C: If you were sad when Google got rid of the ability to register domains, and you're now using something like Hover or GoDaddy, heaven forbid, or just using Route 53 and pointing it right on over to GCP, which is what I do. [00:43:10] Speaker D: That's what I do. What most people do. [00:43:12] Speaker C: You can now use Cloudflare, as they have a Cloudflare Registrar API now in beta, allowing developers to search, check availability and register domains programmatically through three straightforward API endpoints. Keep the entire workflow inside editors, terminals or agent-driven tools, or go via Terraform, which is also quite nice.
The API integrates directly with Cloudflare's MCP server, meaning agents in environments like Cursor or Claude Code can already discover and call registrar endpoints without any additional integration or custom tool definitions. Also, the hackers' MCPs can now automatically create phishing domains off your poorly spelled domain name, so maybe buy some extra domains this week. [00:43:46] Speaker A: Yeah, and definitely check your domain registrations. [00:43:54] Speaker C: Okay, AWS. We're almost through, guys. At least through the AWS section. [00:44:01] Speaker A: We got through the AI section, so that's good. [00:44:03] Speaker C: We got through AI, so that was a lot. But first up, AWS Interconnect is now generally available. We previously talked about this, but basically it provides multicloud private layer 3 connections between AWS and other cloud providers, starting with Google Cloud and with Azure coming later this year. Last mile: this also covers anything at on-premises locations to AWS through network providers like Lumen, AT&T and Megaport as well. The multicloud option uses 802.1AE MACsec encryption by default on physical links, routes traffic entirely over private backbones without touching the public Internet, and includes built-in redundancy across at least two physical facilities. Pricing is a flat hourly rate based on bandwidth tier and region pair, so check the pricing page for sizing your connection, and provisioning is handled through the AWS Direct Connect console in a few clicks. Now, while I said it is supported on GCP, it is not supported in all regions yet. Currently it's US East, US West and Europe for Google Cloud, and it's US East (Northern Virginia) only for AWS. So thanks for that, but good to see it in GA. Hopefully it gets expanded out pretty quickly.
We'll ignore all of the future region expansion announcements that Amazon will make for the next six months about this. [00:45:15] Speaker D: They have to hit their blog post quota. [00:45:17] Speaker C: Yeah, they do. So Amazon Quick came out in October, and I think we mercilessly mocked it for its dumb name, and I had never used it. But today they apparently came out with a new desktop tool as well as a bunch of features to help marketing intelligence capabilities. So it connects tools like HubSpot, Salesforce, Slack and Adobe to create a unified knowledge graph from scattered marketing data, all available through the tool directly. They also now support multi-account sign-in for the same browser, which is interesting because it uses QuickSight to support all of its logins. So it's not your normal AWS console account, it is your QuickSight account. If you don't have one of those, you can create one, which is easy to do, and this allows you to basically jump into Quick. And so I downloaded Quick, because they also released a desktop client and I had never used it, and it's a lot like ChatGPT with more integrations into QuickSight. But it has the ability to schedule agents, it can do repetitive tasks, it can access all the other tools I just mentioned, like Adobe. All of them are out there and available. [00:46:16] Speaker A: Yeah, nothing makes me understand things less than something like this, where it's Amazon Quick, which makes sense, ties into QuickSight. Okay, I can kind of get the tie-in there. It's a local client that allows you to just run general AI, you know, agentic workflows. What does that have to do with a BI tool? [00:46:35] Speaker D: Well, they used the tool that added new marketing to figure out the name of it, which is the problem. [00:46:43] Speaker A: And they use the authentication mechanism for Amazon QuickSight, which has been a terrible, just, wart on Amazon identity for ages.
[00:46:51] Speaker D: No, you can make this identity set up with your IAM when you. [00:46:57] Speaker A: Only because you can. [00:46:57] Speaker C: Because I. [00:46:57] Speaker A: Only because you can do that with QuickSight now. [00:47:00] Speaker C: Right, right. [00:47:01] Speaker D: But at least it's tied to your [00:47:02] Speaker A: IAM. You still have to have a QuickSight account. I don't understand. This makes no sense to me. Quit doing things that don't make sense. [00:47:11] Speaker D: Well, I just think at one point they need to figure out how to do this better. QuickSight needs to go to, like, IAM or, you know, AWS SSO, or whatever they rebranded it to that my brain is not processing right now. Identity Center, you know, and this should tie into there, or IAM. It's just weird that they have a whole other, essentially, thing, and what is it under the hood? Is it just Cognito? I'm not sure. It kind of feels like it is, for some reason. There are these little nuances when you play with QuickSight where it always felt like, we're just going to run Cognito under the hood for this. [00:47:49] Speaker C: I mean, I'm 100% sure it's Cognito underneath. And then someone said, well, to use this new tool we have to run another Cognito pool. And they're like, yeah, over my dead body. Just use the QuickSight one. [00:48:01] Speaker A: Oh, you're right, that's exactly what happened. [00:48:06] Speaker C: They threatened him with Cognito. [00:48:08] Speaker D: I still remember, for a customer, we had to do DR for Cognito. And the answer was: you can't. You can't extract, you can't do backups, you can't do anything. I think you can do a few other things now, but you can't move it. So wherever you set it up, you damn well better hope you set that thing up correctly, because there's no going back. [00:48:29] Speaker C: Just use Okta, that's the right answer. Or Ping, or anybody who provides you a managed service for this.
Yeah, now that they support federated identity a [00:48:37] Speaker A: little bit more, it makes it easier. [00:48:40] Speaker C: Amazon CloudWatch now lets customers audit telemetry configuration and enable telemetry from services like EC2, VPC, and CloudTrail across multiple regions from a single control point, reducing the operational overhead of managing observability at scale. Enablement rules can be scoped to specific regions or all supported regions, and rules set to cover all regions automatically expand to include new regions as they become available, which is useful for organizations with growing AWS footprints. A practical use case is a central security team creating one organization-wide rule for VPC flow logs that consistently applies across every account and region, eliminating the gaps in telemetry coverage that could create blind spots. The feature is available in all AWS commercial regions, with standard CloudWatch pricing applying to telemetry ingestion. So cost will scale with the volume of logs and metrics collected rather than the feature itself carrying an additional charge. For teams managing multi-account AWS Organizations setups, this reduces the risk of misconfigured or missing telemetry in an individual account. [00:49:30] Speaker A: Yeah, I mean, this has always been a challenge. Even before I was doing security and trying to do log governance across these things, trying to have, you know, different server farms basically in multiple regions and having to log into different web pages to view the metrics on each one. They sort of fixed that with the ability to reference metrics cross-account a little while ago, but you could only do it for metrics. So this is definitely something I'm glad to see. [00:50:03] Speaker D: I have a really good tool, Ryan, that you should try. Elasticsearch. [00:50:10] Speaker A: He said the thing.
[00:50:13] Speaker C: Why would you do that to him? [00:50:14] Speaker A: Yeah, that's mean. [00:50:16] Speaker C: Yeah, so mean. [00:50:17] Speaker A: All right, now I'm a little twitchy. [00:50:21] Speaker D: See, I get to see Ryan's face when I say that, because he was all excited I was gonna say something useful, and then there was just pure sadness there. [00:50:29] Speaker C: He's got. [00:50:29] Speaker A: He's going to recommend Bindplane or one of these cool tools that actually does the thing. No, it's easy. [00:50:35] Speaker D: No, no, no. [00:50:36] Speaker A: A log aggregator. [00:50:38] Speaker D: Yeah, but not Amazon. [00:50:42] Speaker C: Bedrock is now automatically attributing inference costs to the IAM principal making the call. Novel thought. With data flowing into the CUR 2.0 format via a new line-item IAM principal column, this works across all Bedrock models at no additional cost and requires no changes to your existing workflows. The feature supports four distinct access patterns, including direct IAM users or API keys, application roles on AWS compute, federated identity through providers like Okta or Azure AD, and LLM gateway architectures. Each scenario has different configuration requirements, with the gateway scenario being the most complex, since it requires per-user assumed-role session management to avoid all traffic appearing under a single identity. Cost allocation tags can be attached to IAM users or roles, or passed dynamically as session tags through identity providers, and once activated in AWS Billing, they appear in Cost Explorer under an IAM principal prefix. This enables chargeback reporting by team, project, cost center, or tenant without building custom tracking infrastructure.
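The chargeback reporting described above boils down to grouping CUR line items by the new IAM-principal column. A minimal sketch; the exact column names and the sample rows here are assumptions for illustration, so check your actual CUR 2.0 export schema before relying on them:

```python
from collections import defaultdict

# Sample CUR 2.0-style rows (made up for illustration -- column names
# are assumed, not confirmed against the real CUR 2.0 data dictionary).
cur_rows = [
    {"line_item_product_code": "AmazonBedrock",
     "line_item_iam_principal": "arn:aws:iam::111122223333:role/team-search",
     "line_item_unblended_cost": 12.40},
    {"line_item_product_code": "AmazonBedrock",
     "line_item_iam_principal": "arn:aws:iam::111122223333:role/team-support",
     "line_item_unblended_cost": 3.10},
    {"line_item_product_code": "AmazonBedrock",
     "line_item_iam_principal": "arn:aws:iam::111122223333:role/team-search",
     "line_item_unblended_cost": 7.60},
]

def chargeback_by_principal(rows):
    """Sum Bedrock inference cost per IAM principal for simple chargeback."""
    totals = defaultdict(float)
    for row in rows:
        if row["line_item_product_code"] == "AmazonBedrock":
            totals[row["line_item_iam_principal"]] += row["line_item_unblended_cost"]
    return dict(totals)

for principal, cost in sorted(chargeback_by_principal(cur_rows).items()):
    print(f"{principal}: ${cost:.2f}")
```

In practice you'd run the same group-by in Athena or Cost Explorer against the real CUR export, but the rollup logic is this simple once the principal column exists.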
For organizations using LLM gateways like LiteLLM or Bifrost, or custom proxies, the solution requires the gateway to call AssumeRole per user and cache credentials for up to one hour, which keeps STS call volume manageable but introduces architectural challenges. Tags take 24 to 48 hours to appear in Cost Explorer and the CUR, and I assume it's in the CUR 2.0. It'll eventually end up in the FOCUS spec as well, so that'll be nice when it comes. [00:51:54] Speaker A: Yeah, it wasn't that long ago that you were pleading for this feature. [00:51:59] Speaker C: I mean, I'm still pleading for it on Vertex. [00:52:00] Speaker A: Yeah, I know. [00:52:02] Speaker C: At least you get it on Amazon Bedrock now. [00:52:04] Speaker A: Yeah. Because it is really complicated to see these things, and unless you're going to give everyone their own dedicated Amazon account, what are you supposed to do? [00:52:12] Speaker C: Yeah, I mean, part of the reason why you use a LiteLLM or Bifrost is to have a gateway help with this problem. So it's sort of weird to me that you would then want to also add the complexity of trying to map that back to your billing data, other than, I'm sure, all the billing tools don't have any understanding of what an AI gateway is. [00:52:28] Speaker A: No. And so it sort of abstracts the problem all over again. Right? [00:52:33] Speaker C: Right. Exactly. Yeah. Well, we talked about S3 Files last week, and I'm pleased to announce that AWS Lambda functions can now mount Amazon S3 buckets as a file system with S3 Files. Thanks. Could have announced that last week. Coming soon to you: Fargate support, I'm sure, is next. [00:52:51] Speaker A: I mean, this is a really neat thing if [00:52:56] Speaker C: S3 Files works out better than the FUSE mounting does. [00:52:58] Speaker A: I do really like this. I haven't used it yet, but I would.
I would use this because I do think it's a great thing to have in those stateless application components. [00:53:08] Speaker D: Sorry, I'm trying to think of all the other tools they're going to add it to. Yeah, but those are all Fargate-like container solutions. [00:53:16] Speaker C: I'm like, okay, they'll add it to, what's that, they'll add it to, like, Lustre, for some reason we won't understand, because we don't know what Lustre really does. You know, things like that. [00:53:26] Speaker A: Is it. [00:53:26] Speaker C: Let's do it. Yeah. I don't know. Or they'll add official support for S3 Files in S3 FUSE, and you'd be like, what? Isn't that already kind of a bridge to that? So there's all kinds of ways they can mess with us now. [00:53:42] Speaker A: That'd be kind of cool, actually. [00:53:44] Speaker D: It'll be NetApp supporting. Oh no, you're right. It will be NetApp supporting S3 Files. [00:53:50] Speaker C: Come on, that will really suck. S3 Files on NetApp. [00:53:54] Speaker A: Exactly. [00:53:54] Speaker D: I understand. [00:53:57] Speaker C: Yeah, that's a mess. Claude Cowork is a desktop application for Mac and Windows that lets knowledge workers delegate research, document analysis, data processing, and more to Claude, and typically to get Cowork you have to be a direct Anthropic customer. But now you can have all the model inference routed through Amazon Bedrock in your AWS account rather than over Anthropic's infrastructure, which could be super helpful. Pricing is consumption-based through your existing AWS agreement, with no per-seat licensing from Anthropic. That's a notable distinction from Claude Enterprise, and it could make cost modeling more predictable for organizations with variable usage patterns.
Enterprise security controls are central to the integration, including AWS IAM or Bedrock API key authentication, VPC endpoint network isolation, CloudTrail audit logging, and OpenTelemetry export to CloudWatch, with Anthropic receiving only aggregated telemetry that can be disabled. Setup relies on device management tools like Jamf, Intune, or Group Policy to push a managed configuration to the Claude desktop app specifying the model ID, Bedrock inference profile, and auth method, which means IT teams control rollout rather than individual users configuring their own credentials. Organizations already using Claude Code on Bedrock can reuse the same setup for Cowork, and both in-region and cross-region inference profiles are supported to address data residency requirements across different geographies. [00:55:10] Speaker D: I mean, this goes back to the prior one. You can actually now see who's doing what, if you have people set up with IAM or anything else. Like, I assume one of these was kind of waiting on the other. And it's going to be interesting to see, because I've talked to friends and whatnot in the industry where people are like, 30% of your job needs to be using AI. And now, with everything else you can do, if your company routes everything here, you could very easily see, okay, this person's using X number of tokens, this person's using Y, and start to compare people. But then I think the real question is how useful that data is, because maybe they're just not efficient with their tokens, and they're having conversations with the bot about Quick versus QuickSight. [00:55:59] Speaker A: No, I mean, for me it's more along the lines of, when you use Anthropic's enterprise offering directly with Anthropic, you lose a lot of visibility into who's doing what.
Like, you can see token usage for users, but you can't really see what permissions they're using or what the agents are doing. And I think using this, I haven't played around with it directly, but I think it would allow more visibility through the standard Amazon tooling to see those IAM transactions, see what's going on. Calls to VPC endpoints would be captured by flow logs, and so it would offer a lot more tools for, you know, a security org to route to a SIEM and be able to do investigations on anomalous traffic, or any kind of alerting rules, playbooks, which is not something that you can do directly using Anthropic Enterprise or Claude Enterprise. [00:56:48] Speaker D: So you're saying Claude Enterprise isn't very [00:56:50] Speaker C: enterprise-y. I mean, it is not in some ways, and it is in others. [00:56:55] Speaker A: They're trying to make it. It will be with time. [00:56:57] Speaker C: Right? [00:56:57] Speaker A: They're working on those things. Yeah, it's just not there yet. [00:57:00] Speaker C: The problem is that instead of building a proper enterprise backend that would do all the things they want, they partnered with WorkOS. And while WorkOS has a bunch of things, it doesn't have all the things that you would want. And this is a problem for OpenAI as well, because they also partnered the same way, and Snowflake partners with them too. But, you know, some have done a better job than others in how they lay out some of these tools. [00:57:21] Speaker A: I didn't know about Snowflake, because that does answer a lot of my questions. I have a lot of the same complaints with Snowflake. [00:57:29] Speaker C: So all these people are using basically this backend platform that helps them manage SaaS apps, and they're sort of limited to what's available in the tool, which is interesting. [00:57:39] Speaker D: Yeah, that does make a little bit more sense, though.
I then go back to the, well, AI is a SaaS killer. Why isn't one of these vendors just developing their own tooling for it? [00:57:49] Speaker C: Well, I mean, that's the thing. I was like, Anthropic hasn't killed WorkOS yet, and they're using it all over the place. You'd think they'd just build the right thing. SaaS-mageddon be damned. All right, our final Amazon story, then we get into the fun of GCP. Amazon Bedrock Agent Core now includes a managed agent harness feature that lets developers define the agent's model, tools, and instructions via API calls without writing orchestration code, reducing initial setup from days to minutes. It supports popular frameworks like LangGraph, LlamaIndex, CrewAI, and Strands Agents. The new Agent Core CLI, available on GitHub, keeps the full agent lifecycle in one workflow, covering local prototyping, deployment, and operations from a single terminal, with CDK support and Terraform coming very soon. Agent Core now includes persistent session state via durable file systems, enabling agents to suspend mid-task and resume where they left off, which makes human-in-the-loop workflows practical without custom storage plumbing. Pre-built coding agent skills give tools like Claude Code and Kiro curated knowledge of Agent Core best practices rather than just raw API access, with plugins for Codex and Cursor coming by the end of April. The managed agent harness is in preview across four regions, with no additional charges for the CLI, harness, or skills beyond standard resource consumption. [00:58:56] Speaker A: I mean, this is a great feature. This now makes it competitive with Vertex AI's agent builder, and so now it's a usable option in Amazon. Awesome. [00:59:06] Speaker C: I mean, other than it requires you to use npm install, which I hate. [00:59:12] Speaker D: Why? [00:59:13] Speaker A: Yeah, I know, me too. [00:59:14] Speaker C: But, you know, maybe it'll end up in Homebrew.
A lot of these end up in Homebrew pretty quickly. I mean, Claude [00:59:19] Speaker A: Code made you use npm, because, you know, it's evil too. [00:59:23] Speaker C: Well, at least they moved it to Homebrew as well, so you don't have to do that. [00:59:28] Speaker A: Oh, I didn't. [00:59:28] Speaker C: I don't. [00:59:29] Speaker A: I haven't installed it since. Cool. [00:59:31] Speaker C: All right, GCP. They had several nonsense pre-Next announcements, which is just mean. We already have to cover a conference. [00:59:38] Speaker A: Come on. [00:59:38] Speaker C: We already have to cover your conference. So we'll go through these kind of quick. First up, they have a new text-to-speech AI model, Gemini 3.1 Flash TTS, now available in preview, available in all the googly things. The model scored an ELO of 1211 on the Archel Daniel's TTS leaderboard, which I have no idea what that is, but it sounds impressive. So there you go. If you need text to speech, here you go. Next, if you've desperately wanted an app on your computer that's a client to access Gemini, which I have wanted, you can now have that wish fulfilled, as there's now a desktop client that works on Mac and Windows. So you can now run Gemini natively on your computer without having to go to a web page, because web pages are dumb. Now, if they'd only learn this about Gmail and make an actually good Gmail client, we could all be much happier with our lives. [01:00:22] Speaker A: That's really funny. [01:00:24] Speaker C: Yeah, yeah. [01:00:26] Speaker D: As you said that, I was like, well, [01:00:27] Speaker C: they're building it out. [01:00:28] Speaker D: Why did they do it for other things and not take on Apple Mail or Thunderbird or any of these other ones? [01:00:36] Speaker C: Because no one likes Outlook. So everyone else, we love Claude Code, we love Claude chat, we love ChatGPT, we love all these agents. And so Google's like, well, everyone else loves those things.
But when you talk about email, with Gmail and Outlook, everyone hates Outlook. So, like, well, we'll give you something different. I guess that's my theory, at least. [01:00:55] Speaker A: That's funny, actually, because I use Outlook because I prefer it over the Gmail web interface. [01:01:01] Speaker C: Yeah. So I actually found a Gmail client that I actually like, finally, that talks native Gmail on the back end, so it doesn't have all the weird IMAP BS. [01:01:11] Speaker A: Because that's the only problem with Outlook and Gmail. [01:01:13] Speaker C: Yeah. [01:01:13] Speaker A: Yeah. [01:01:14] Speaker D: That's the big problem. And that's why I've tried them multiple times, and I'm always like, nah, I'm good. [01:01:19] Speaker C: Yeah. So it's a product called Mimestream, which is a terrible name, and it's, I think, 20 bucks a year, and I've been using it and I love it. So look for it. [01:01:29] Speaker A: That's a good option. [01:01:31] Speaker C: Next up, they created expert content on how to deploy a multi-agent system with Terraform and Cloud Run. Google's Dev Signal is a four-part tutorial series showing how to build and deploy a production multi-agent system using Google ADK, MCP, Vertex Memory Bank, and Cloud Run. So if you've been confused about how to do this, they're giving you a full white paper on how to do it, which, only because of everything else they announced at Next, I thought we should talk about. You're welcome. All right, let's start with the most important thing: the conference. We're here. 32,000 people, three keynotes, 25 spotlights, 700 breakouts, 260 announcements. 260 announcements. Not all on the main stage, thank God. But, you know, the most important thing is that points were awarded, and I have to say, guys, we were competitive this year. [01:02:19] Speaker A: Yeah.
[01:02:20] Speaker C: So first up, I went first, and I nailed Wiz and GCP security, agentic defense, and a Wiz AI app, and the acquisition, of course, they acknowledged on stage. All hit beautifully on the first day. I was very happy. I did hope for a new Antigravity-plus-Gemini-CLI capability. I didn't get it. They did give us Data Agent Kit. I am not taking a point for it; I don't think it's legit as a point. But basically they give you, inside of Gemini CLI, access to the Data Agent Kit, which makes it easier to talk to BigQuery and to other tooling on the data side. It's a nice tool. I just didn't think it really fit what I intended with this, and since I was being nice and I didn't want to win on a technicality, I said no. [01:03:00] Speaker A: I would have argued that one to the death, too. [01:03:02] Speaker C: Oh yeah, as you should have, because I don't think it would be fair. And then I got an Ironwood successor, which they over-delivered on with two TPUs: a TPU 8T for training and a TPU 8I for inference. So I got two TPUs for the price of one. Pretty nice. Next up, Ryan came in strong with Gemini 3.1 Pro general availability and future model teases, so they talked about several new models coming later this year. Nicely done. And they gave you agentic enhancements. Basically the entire keynote was, Ryan. The entire conference. The whole thing. The age of agentic is definitely here. Yeah, we should have made [01:03:40] Speaker D: him define that one more next time. [01:03:44] Speaker C: I mean, he was looking for an agentic capability, and they delivered a bunch of agentic capabilities. So I was at an infrastructure summit on Tuesday that I was invited to, and they go, and we're going to GA GKE Agent. I was like, son of a bitch. Because I'm like, that's going to help Ryan right there. And then the next day I saw the opening of the keynote, and I was like, oh no, Ryan's going to own this. Yeah.
But they did not give you a VMware or Kubernetes interruption play. The closest it comes is maybe Spanner Omni, if I really stretch, but even then, I won't argue that. Yeah. And then, Matt, you did get two points as well. So, default guardrails for agent identity: Agent Gateway and Model Armor. I mean, they named all three of those on stage, so that's the cleanest point on here. There's no wiggle on that one. And then they did give an agentic SDLC through Data Agent Kit and an agentic task force, and the task force was really the key piece you needed for the SDLC. So congratulations on that. I will also tell you that you did have three non-AI announcements at the conference; they just did not mention them on the stage, other than Virgo Network was the one thing. And even Virgo Network, I would argue, is pretty close to basically being an agentic feature. So, you know, you were close on that one. Nice job, everyone. Congratulations. A round of applause. Now it takes us to the tiebreaker to see who wins the game. Now, I just want to point out that we guessed how many times they'd say AI or artificial intelligence on stage. And I will tell you, they did not say artificial intelligence very much. It was mostly AI. But in the first keynote they said it 132 times. [01:05:21] Speaker A: Yeah, yeah. [01:05:22] Speaker C: Double what they said last year. As I went back and looked, I was like, this is crazy, because I won this last year too with, like, 90 or something, which was still crazy high. So 132 times in the first keynote, and in the second keynote, 55 times. And because of that, I went high, which I wasn't going to do. I was going to go between Matt and Ryan. I stole the win from Matt with 115, which, Price Is Right rules, I did not go over. And so therefore I won this one by seven. I was within 17. Pretty close.
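The tiebreaker rule here is Price Is Right scoring: closest guess to the actual count without going over wins. A quick sketch of that rule; only Justin's 115 guess and the 132 count come from the show, and the other guesses are made up for illustration:

```python
# "Price Is Right" tiebreaker: closest guess without going over wins.
def tiebreaker_winner(guesses, actual):
    """guesses: {player: guess}; return the winner's name, or None if everyone overbid."""
    valid = {name: g for name, g in guesses.items() if g <= actual}
    if not valid:
        return None  # every guess went over the actual count
    return max(valid, key=valid.get)  # highest guess that didn't exceed actual

# Justin guessed 115 against an actual count of 132; the 108 and 140 are made up.
guesses = {"Justin": 115, "Matt": 108, "Ryan": 140}
print(tiebreaker_winner(guesses, actual=132))
```

Anyone who goes over is out entirely, which is why going high is the risky play Justin describes.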
[01:05:54] Speaker A: Yeah, I remember I was listening to the keynote, and I wasn't exactly counting, but it was just AI, AI, AI. Like, I knew in my head that if it came down to a tiebreaker, I didn't win. And I really did think, from how much they said it the year before, that maybe there'd be some sort of. [01:06:09] Speaker C: You know, I mean, at least Amazon took some of the feedback from last year and added non-AI announcements to their keynote. Google needs to take similar feedback. Yeah. [01:06:21] Speaker A: Like, I don't remember any. Even the number of announcements you said, I'm sort of surprised to hear that number, because I didn't feel it listening to the keynote. [01:06:28] Speaker C: Well, it definitely wasn't on stage. I mean, the reality is they announced a ton of stuff in sessions and all over the conference. But for the main crux of the conference, unless you cared about agentic, you cared nothing about the keynotes. [01:06:41] Speaker A: No, exactly. [01:06:42] Speaker C: Yeah. Which is kind of a shame, because I feel like it really diminishes the power of cloud. And they announced a lot of other cool stuff, and we'll talk about some of it here in a little bit. But in general, you've just got to mix it up. AI is cool. I get that you have a huge need to please your investors. I know everyone's excited about agentic. I get all that. Just don't make it everything. [01:07:05] Speaker A: Everything. I'm exhausted. [01:07:07] Speaker C: Yeah. I mean, to the point where I'm not even considering going to Next next year. I'm just like, if it's just going to be an agentic AI disaster like this, I don't know that I want to. And the energy was so off this year, I felt like. Because the reality is a lot of people are worried about agentic killing their job.
So we're literally at the conference to learn how agentic is going to destroy your job, which is kind of a depressing loop to be in. Like, we should be having a good time at this conference, celebrating what we do and learning a bunch of new cool stuff. And it's like, no, I'm learning about how my job's going to be destroyed. Cool. Yeah. [01:07:37] Speaker A: I think that's part of it. I've been really thinking about this since the end of the conference, trying to figure it out. A lot of the agentic stuff that they talked about, I have a hard time applying directly. And so it does kind of feel that way. All the things that you can do with agentic workflows now are generally replacing the people in the middle. It's not about new capabilities; it's just about doing the same things faster. And it left me sort of uninspired, which, usually for these cloud conferences, I come out and I want to do all the things and play with all the new toys, and I can't wait to introduce them to the rest of the business and try to get the budget and time to roll them out. But this one, I really didn't. [01:08:17] Speaker C: Yeah, it was kind of a shame. All right, so then we had to deal with the 260 announcements, and how do we even tackle them? So with Claude, we assessed and tried to rank them by buzz: we analyzed Twitter and saw what people were talking about and tweeting about; we looked at strategic significance to AI and all those things; and then, of course, the impact to you, the practitioner. And we put them into a tier ranking, basically S, A, B, and C. We didn't have any Ds, which is not normal for a ranking, but it's kind of hard to do on new tools. I don't even know what they are yet to say they're F tier. We should probably maybe do an.
Do a bonus episode, when we do bonus shows, on just ranking our favorite services or something. Yeah, that'd be kind of fun. It might take us hours, but it'd be exciting. So we'll jump into this. Tier S: first up is the story of the keynote. We just talked about it: agentic. And the most agentic part of the whole thing is that they're repositioning Vertex AI as the Gemini Enterprise agent platform. So basically they're killing the Vertex branding in favor of Gemini Enterprise and then basically something after that. There are 16 named sub-features in this announcement, which include your ADK, Agent Studio, and Agent Designer; Cloud Run for the agent runtime; Agent Sandbox, Memory Bank, sessions, and long-running agents. And the memory feature they built out is pretty impressive; in the developer keynote they have a pretty good demo of it. Govern-and-optimize covers Agent Identity, which is a cryptographic ID per agent, plus Agent Registry, Agent Gateway, anomaly detection, a security dashboard, simulation, evaluation, and optimization, all inside of that. So basically their strategic frame was: the agent is now the unit of work, not the model call itself. And it'll be interesting to see how this continues to evolve over the next year. [01:10:02] Speaker A: Yeah, no, I was excited, especially by the identity stuff, because I'm a security [01:10:06] Speaker C: nerd. It was very cool. The next one: customer scale. There are lots of mentions, and this is one of the things where the 260 is a bit of a cheat. If you look at the blog post, there are a lot of customer successes, which is great, and I love good customers. But the most important thing in this is that there are actually a lot of customers doing a lot of things with Gemini Enterprise, the agentic platform. Mars was highlighted, Merck was highlighted. GE had 800-plus agents across manufacturing, logistics, and supply chains. Just some really good case studies. So if you're looking for how do I use this thing?
There's probably a case study for your industry, which I think is always helpful, to know what people are doing. In the keynote they did have Home Depot's Magic Apron, basically talking about how they use it at Home Depot. Kind of an interesting idea. And Virgin Voyages was the other one, with Rovi, where they kind of covered a bunch of the things they're doing with these types of capabilities. So if you're looking for real-life examples of how to use agents in your day to day, there's a case study for you. I guarantee it. [01:11:01] Speaker A: They never get specific enough for me on these case studies. [01:11:04] Speaker C: No, but a lot of times you just need ideation, right? I get enough of the gist of it, like, oh, this is cool, I could see how I could apply that to my thing. That's basically what you're trying to do. [01:11:13] Speaker D: That's kind of the way I use them. It's like, okay, oh, there's this thing over here. Oh, I could do this, and tweak it in these 16 ways, and then it's [01:11:20] Speaker C: useful for my business. The TPU 8T and 8I are definitely S-tier items. You know, 3x compute versus Ironwood on the training side, and the inference chip being purpose-built, 80% better performance per dollar, optimized for MoE and agentic workloads. TorchTPU gives native PyTorch in full eager mode, which kills the JAX-only friction they were having before. And they're the only hyperscaler shipping dedicated inference silicon in this generation, which is a bit of a stretch, because there are the Inferentia chips over at Amazon. But since those didn't get updated, they have a chance to say theirs are the most current generation. So, well done, marketing at Google, for calling that out. Those are the S-tier items from the announcements. Going into Tier A: Wiz. You know, of course, the acquisition formally closed right before the conference. The Wiz AI app, which is a code-to-cloud-to-runtime AI application protection platform.
Wiz now supports AWS Agent Core, Azure Copilot Studio, Salesforce Agentforce, and Databricks, and I assume we'll also have Snowflake coming as well. So if you're looking for agent control, Wiz has a bunch of stuff for you there. Otherwise, news across the conference: inline AI security hooks in IDEs; Wiz skills, or validated attack-surface findings exposed to coding agents for auto-remediation; an AI bill of materials for auto-inventory of every AI framework, model, and IDE extension used across your environment, to kill all the shadow AI; and Lovable vibe-coding integration for security scanning inside of Lovable. I mean, it's going to be super interesting how this pays off over the next year or two or three or five, or a decade. But lots of cool opportunities in Wiz, and my interest in Wiz suddenly increased a lot after the conference. [01:12:53] Speaker A: Mine too. Yeah, specifically the agents in all these ecosystems. Right? That's a big problem. Agent sprawl is a thing. [01:13:02] Speaker C: Exactly. [01:13:02] Speaker D: You don't trust agents? Come on, what's wrong with you? [01:13:04] Speaker A: I do not. [01:13:08] Speaker C: So one of the things that we've been hearing a lot, if you've been tracking some of these things, is ChatGPT, or OpenAI, will actually put out a forward-deployed engineering team. Same thing with Anthropic. They'll basically get a big customer, like an Accenture or a McKesson or somebody, and they'll come in and basically say, hey, we'll put a bunch of engineers into your business to help accelerate your AI agent story and really help drive it forward. And so Google has gotten into this now with The Partner Fund, a 750 million dollar innovation fund for partner agent development. This allows you to get agents built into the agent marketplace and the agent gallery.
With already 70-plus partner-built agents out there at launch from big companies like Atlassian and Lovable and Palo Alto, Salesforce, ServiceNow, Workday, et cetera. They do have forward-deployed engineers at Accenture, Deloitte, and McKinsey. And Google is making its own end users available through the partner go-to-market strategy. Very Palantir-style move. So well done. Yeah, if you're looking for help, talk to your Google rep. [01:14:07] Speaker D: I think AWS has been pushing this for a while too, so I'm kind of surprised that they're just doing it — or did they just expand it and are highlighting it? [01:14:14] Speaker C: I mean, they're committing $750 million to it. They were doing it before, so I figure they added more money to it. Gemini — or sorry, Antigravity — and the Data Agent Kit: Gemini 3.1, so 3.1 Pro in preview across Vertex, Gemini, Antigravity, Android Studio, and AI Studio. The Data Agent Kit, which is a portable suite of skills, MCP tools, and plugins for VS Code and the Gemini CLI, into native data workspaces like BigQuery. A full stack of vibe coding from AI Studio to Cloud Run — which, that one, Ryan, you should kill. And so on the engineering and developer side, this is all about competing with Cursor, Claude, and Replit out there. If you're excited about data and you're looking for more agentic in your data cloud, you've got Knowledge Catalog, Cross-Cloud Lakehouse, and Spanner Omni. So the Knowledge Catalog is a universal context engine; it maps business meaning across the data estate and is a great foundation for accurate agent execution. The Cross-Cloud Lakehouse, formerly named BigLake, is the Iceberg REST catalog and now federates with AWS Glue, Databricks, Snowflake, SAP, and cross-cloud. Caching through the interconnect that we talked about will actually cut your egress costs with this tool. And then Spanner Omni is their Spanner running multi-cloud, on-prem, or even on a laptop. This is the most underrated announcement of the keynote, in my opinion.
You can now run Spanner on your laptop, which is great from a local development perspective. [01:15:39] Speaker D: Yeah, that's pretty cool. [01:15:40] Speaker C: Yep, so glad to see that one. And then Lakehouse Federation for AlloyDB: live joins between transactional and analytical workloads without doing the ETL first. So you don't have to do things like AWS Glue to move your data. Thank God. But the Cross-Cloud Lakehouse, that's a big one. Being able to access your data inside of Snowflake or inside of Databricks from the Cloud Lakehouse means you don't have to move that data around, which is a big sustainability thing. It also allows you to take advantage of different tools. So hey, maybe you bought Snowflake first and now you're looking at some Databricks agent capabilities. You're like, well, I don't want to move all my data. Well, you don't have to, with this federation. So the advantage of having Iceberg as a standard across the industry now makes this a powerful thing: you can make Iceberg endpoints available to everything and pull it all together with a single lakehouse-type configuration. [01:16:29] Speaker A: Yeah, I couldn't love this more. I mean, there are so many data sets that are spread out like that; if you have questions that need to be answered across them, it's such a pain. So this is so great. [01:16:41] Speaker C: I did see some of these features rolling out — in Tier B here, on Google Workspace, in the admin console I was kind of like, oh, I see some of the things coming together.
But if you're a Workspace admin, which is probably not most of our listeners, they've given you AI capabilities: unified semantic understanding across Docs, Slides, Gmail, projects, and org domains; Workspace Studio, a no-code agent builder with skills deployable across your workspace; a Microsoft 365-to-Workspace migration tool to make it easier to move than ever before; sovereign controls plus client-side encryption for all your US and EU sovereignty concerns; and auto-browse with Gemini, now available in Chrome Enterprise. So that's all available to you out of the box. Cloud Run got a bunch of features, including the ability to deploy vibe-coded apps from AI Studio; Nvidia RTX Pro 6000 Blackwell support, so now it can run 70-billion-parameter models without managing GPU infrastructure through Cloud Run; billing caps, where you can set a max monthly spend and resources deactivate when it's hit; Cloud Run sandboxes for ephemeral, isolated agent execution; and SSH into running containers, now in preview. And hot take: Cloud Run is going to be the default agent runtime, ahead of GKE, in my opinion. [01:17:51] Speaker A: Yeah, I saw many labs on how to run multi-agentic workflows on Cloud Run, and it definitely looked very alluring. I'm going to play with it. I've been using Cloud Run a lot, so I'm excited by this. You know, we make jokes about the full-stack vibe coding, but if this is running in my cloud ecosystem, where I have visibility into it, it's a lot better than whatever's running in Replit or Lovable, which I have no visibility or insight into what it's doing. [01:18:18] Speaker C: Agreed. And then finally, BigQuery AI was our last in this tier: AI parse document, a single SQL function for OCR plus layout plus chunking via the Gemini layout parser; tabular foundation models; BigQuery graph support; reverse ETL; Connected Sheets with TimesFM; BigQuery hybrid search; and a 35% year-over-year performance improvement with lower latency and cost. So that's not bad.
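For listeners wondering what the "chunking" step in that OCR-plus-layout-plus-chunking pipeline actually does, here's a toy sketch. This is purely illustrative — it is not the BigQuery function or Gemini's layout parser; the `chunk_text` helper, window size, and overlap are made-up parameters showing the general idea of splitting extracted text into overlapping windows so no fact gets cut at a boundary:

```python
# Toy illustration of document chunking for retrieval/AI pipelines.
# NOT the BigQuery/Gemini implementation -- just the general concept:
# overlapping windows so information straddling a boundary appears in two chunks.

def chunk_text(text: str, window: int = 40, overlap: int = 10) -> list[str]:
    """Split `text` into chunks of at most `window` chars, overlapping by `overlap`."""
    if window <= overlap:
        raise ValueError("window must be larger than overlap")
    step = window - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + window]
        if chunk:
            chunks.append(chunk)
        if start + window >= len(text):
            break
    return chunks

doc = "OCR output from a scanned invoice would land here as one long string of text."
pieces = chunk_text(doc, window=30, overlap=8)
print(len(pieces))   # a handful of overlapping 30-character chunks
print(pieces[0])
```

In a real pipeline the chunks would be layout-aware (tables, headers) rather than fixed-width, but the overlap trick is the same.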
Lots of work for data teams to get into, though. And then our Tier C lightning round: the Virgo network. If you're doing any type of training of any kind, this is amazing for you. It's a custom interconnect; 134,000 TPUs can be bridged together into a single fabric, 1 million-plus across sites. A5X is Nvidia Vera Rubin NVL72 with up to 960,000 GPUs cross-site. And they can scale further than anybody else, they said, which is impressive. [01:19:13] Speaker A: 130,000 TPUs is a lot for me to just ask, you know, AI what color I should paint my living room. [01:19:19] Speaker C: Yeah, it's a lot. A lot of water to destroy. Rapid storage capabilities — rapid buckets, rapid cache, and Managed Lustre — all available to you, so you can get faster snapshotting, faster capabilities there. [01:19:31] Speaker D: The rapid buckets is kind of cool, you know, especially with these data sets. 15 terabytes per second of bandwidth is just an impressive number to say. As I was looking at the notes, I swore it was a typo. At 15 gigabytes a second I was like, okay, that's cool. But 15 terabytes, you know. But I assume that's competing with the S3, like, one-zone — single-zone, whatever it is. Where it's tied to a single zone for speed. [01:20:01] Speaker A: I don't remember what they called it. [01:20:02] Speaker C: Express One Zone, I think. [01:20:04] Speaker A: Yeah. [01:20:05] Speaker D: It's too hard to keep all the naming conventions. [01:20:07] Speaker C: It is, it is. Axion, which is of course their custom Arm silicon: the N4A is now generally available, which gives you 2x price performance versus the x86 hardware and 30% better performance for GKE agent sandbox versus other hyperscalers. There's a new C4A metal instance available to you in preview. So if you wanted Axion but you need bare metal for some reason, you have that. And now you can get confidential computing on the G4.
And the C4, for all your confidential AI workloads — for the spooks, if you need it. reCAPTCHA, which is probably being killed by AI every moment of the day now because of all the image-recognition stuff — I can't imagine that's lasting — has been rebranded into the new Fraud Defense package, which is now a platform that distinguishes bots, humans, and agents, with agent capabilities coming for the digital commerce journey, from account to payment to checkout. And this is the closest anything got to the AP2 idea, but there was no mention of AP2 anywhere at the conference. So, bummer. Post-quantum crypto had a little bit of an appearance, with a KMS quantum-safe key import capability and PQC in Cross-Cloud Networking, which is great. And then GKE got several quality-of-life improvements too, including 4x faster node startup, 80% faster pod startup, and 5x faster model loading. The GKE Hypercluster is now available with a single control plane, millions of accelerators, and multi-regional, in private general availability; predictive latency boost in the GKE Inference Gateway, up to 7% lower time to first token; KV cache tiering across RAM, local SSD, and Cloud Storage for Lustre; and RL Scheduler, RL Sandbox, and RL Observability for reinforcement learning workloads on top of GKE. So, nice. So in general, I think the big thing is that the agent platform is the new operating system; Vertex becoming the Gemini Enterprise agent platform isn't cosmetic. Wiz is now going to be a huge part of their push, with Mandiant, into the security of your entire organization, and they're showing it already with heavy integration into agents. It is pretty impressive. And it's not all vaporware — lots of good customers up there. Yeah, you know, things that were kind of missing: I didn't see anything for robots. There was a Gemini Robotics area on the conference floor, which is interesting. I didn't really see a Nano Banana update. It was mentioned on stage, but no update to it.
No answer to Glasswing or Mythos, no TurboQuant in Vertex yet. So maybe those will come later this year, or at next year's Next. And then there were a couple of things below the cut that we didn't think were worth even putting on the chart. That was Claude's and my work, trying to make sense of this massive number of announcements. [01:22:43] Speaker A: It was. Yeah, it is not easy. [01:22:46] Speaker D: Bravo, Justin, for going through all that. Bravo. [01:22:49] Speaker C: Yeah, it was a lot. We have all of the deep-dive articles on all of these in our show notes. So if you're curious to read more about any of these capabilities, you can find all of the articles from the conference. There are a couple that I'd like to jump into that didn't make the cut up there but are cool. Google has announced an official agent skill repository. This is basically launching with 13 skills covering products like BigQuery, Cloud Run, GKE, Firebase, and the Gemini API. So this teaches your agent how not to be stupid with Google services, which I really appreciate. The fact that this is now available to you as officially sanctioned, with lots more coming later this year — I think that's a really great enhancement and something you should be plugging into all of your agents if you're doing something with Google Cloud, for sure. Gemini Cloud Assist, which last year we kind of panned a little bit — it's kind of your SRE agent, if you will — is moving from a reactive system to a proactive operational platform, using an agentic architecture to handle tasks like infrastructure troubleshooting, cost anomaly detection, and application design without waiting for prompts. The redesigned Application Design Center lets teams describe infra goals in plain language and get back visual architectures with deployable Terraform templates, integrated with Security Command Center to enforce organizational policies from the start.
A 24/7 FinOps agent monitors for cost anomalies and correlates spending spikes with specific triggers like auto-scaling events. MCP server support extends Gemini Cloud Assist beyond the Google Cloud console into IDEs, CLIs, and third-party tools like ServiceNow and Slack. And Petco was their case study on this one: they reported a 60% reduction in Google Cloud-related questions to their cloud team after adopting Gemini Cloud Assist. I would have loved to have had this in a prior life where we had a lot of Amazon accounts. [01:24:24] Speaker A: Oh yeah, I missed that this tied into Security Command Center Enterprise, so I'll have to play around with that. [01:24:31] Speaker C: Pretty cool. Well, being able to design through the Application Design Center — you describe what you want, and it basically gives you a diagram and how it needs to tie into the command center — is pretty nice. [01:24:40] Speaker A: It's pretty sweet. Yeah, yeah, yeah. They are going to replace my job. Sweet.
I think it was, like, the driest trip to Vegas I've ever had. I just was dying; my sinuses were killing me all week. So yeah, I was glad to go home, to be honest. Hit the moisture of the Bay when you land, like, ah, I can breathe again. So it's nice. [01:25:45] Speaker A: I do think the sessions were well organized. Things I wanted to go to, I could get to; I had time in between. They weren't spread out everywhere in different wings of the conference center, with the one exception of some of the EBCs being in two locations. [01:26:00] Speaker C: Oh yeah, you had to find the secret compute one that's in the corner of the show floor. Oh, great. Thank you for that. I mean, the one thing is I continue to feel like they've outgrown Mandalay. I felt that last year; it felt that way again this year. They did a lot of things to try to help with crowd control and to try to make it easier, but, you know, how many breakout overflow rooms did they have for the arena? I think I counted at least six. Plus FinOps X had their own viewing party they were doing in their restaurant. They ran out of room at the event. [01:26:34] Speaker A: I didn't know that. [01:26:36] Speaker C: So, you know, they are going to be there again next year — they've already announced dates and everything, and they'll be back. But I can't see them continuing to do Mandalay for more than a year or two more, especially if they're as successful as they want to be with Wiz and agentic. I think there's a lot more interest in this conference going forward. [01:26:54] Speaker A: I mean, it's definitely full, but I don't know if it's big enough to take over two yet. [01:26:58] Speaker C: I mean, I don't think it's a takeover-two, but, like, move it to the actual convention center or to the Sands convention center. Those are much bigger facilities than Mandalay.
I mean, Mandalay is a great facility, but it's not the size of the Venetian or the actual convention center in Vegas. [01:27:14] Speaker D: So the problem is once you have more than one spot. Like, I remember the first year Amazon was still at the Sands and they had this one thing at the other one, and you just don't make it over there in time. There's no time between stuff. So you don't want to outgrow it, because as soon as you do, it's too big. I like what Amazon started doing, which is splitting up the conferences a little bit, into like the security conference and [01:27:38] Speaker C: then they stopped doing that. They're not doing a re:Inforce anymore. They stopped. [01:27:42] Speaker D: But I like that idea, because then you can leave this for general and maybe shrink it down. But I think no one went to the other conferences. [01:27:50] Speaker A: That's the problem, yeah. I think they should do that at re:Invent. They should take all the different hotels and stuff they spread out to — and they try to do it a little bit, like they try to keep all the security breakouts in one section, all the compute in another — but I think they should double down on that model and figure out how to have overflow rooms per persona for all the main keynotes, and really have them be a little bit more self-contained. [01:28:15] Speaker C: My thing about re:Invent was the Sands. If you go out the back door, it's right next to the Linq, which has a monorail station, and the monorail goes right to the big convention center. And so when they decided they'd outgrown the Sands, why didn't they just go to the convention center and get both — very similar to what Comdex would do — and then just have free tickets on the monorail to take you over there?
Instead they said, oh, we're going to do MGM, and we're going to do Aria, and we're going to do the former Mirage, and we're going to have all these things and all these buses. And that was a nightmare. That was terrible. So if you could figure out how to simplify between two locations — I think two locations is workable. Seven locations that year? That was the year I swore off re:Invent. [01:29:01] Speaker A: Yeah, that was my last one too. [01:29:03] Speaker D: Was it the first year they went to, like, more than two? [01:29:06] Speaker A: It was the second. [01:29:07] Speaker C: The first year we kind of gave them the benefit of the doubt that they'd figure it out. And then the second year was terrible. Yeah, yeah, yeah. [01:29:12] Speaker D: I just remember I went to go do something and I saw the line, and it must have been like two or three thousand people deep to go from one place to the other. I was like, I'm just walking; I'll make it there faster. [01:29:24] Speaker C: Got it. And you did? [01:29:25] Speaker D: Probably. May have stopped for a beer on the way, so probably not, but, you know, it's fine. [01:29:30] Speaker C: Well, there are all kinds of frozen drink machines. We joked with Ryan about that; he should have one. All right, well, people are getting tired of us, seriously. Let's get through Azure here to wrap this up for the week. Azure has announced: optimize object storage costs automatically with Smart Tier. Smart Tier for Blob and Data Lake Storage is generally available, automatically moving objects between hot, cool, and cold tiers based on actual access patterns. Data inactive for 30 days shifts to cool, then cold after 160 days, and it immediately returns to hot upon re-access, with no retrieval or early deletion charges.
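As a quick mental model of that policy — this is not the Azure API, just a sketch in Python using the thresholds quoted above, and it assumes the quoted numbers mean days since last access:

```python
# Toy model of the Azure Blob Smart Tier policy as described on the show.
# Illustrative only: not an Azure SDK call, just the stated thresholds.
# 30+ days without access -> cool, 160+ days -> cold, any access -> back to hot.

def smart_tier(days_since_last_access: int) -> str:
    """Return the tier an object would sit in under the described policy."""
    if days_since_last_access >= 160:
        return "cold"
    if days_since_last_access >= 30:
        return "cool"
    return "hot"

# An object untouched for 45 days drifts to cool...
print(smart_tier(45))    # cool
# ...untouched for 200 days, to cold...
print(smart_tier(200))   # cold
# ...and the moment it's read again, it's back to hot, with no retrieval charge.
print(smart_tier(0))     # hot
```

The per-object monthly monitoring fee mentioned below is the cost of Azure tracking that last-access clock for you instead of you maintaining lifecycle rules.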
The feature eliminates the need to manually configure and maintain lifecycle rules, which is particularly useful for organizations managing large analytics workloads, telemetry data, or data lakes with unpredictable access patterns. Pricing includes standard hot, cool, and cold capacity rates with no tier-transition fees, but a per-object monthly monitoring fee applies to objects managed by Smart Tier. So, thanks — you finally got what Amazon's had for a while. [01:30:24] Speaker D: Like 10 years, I feel like. Intelligent-Tiering. [01:30:27] Speaker C: I mean, it might be 10 years. I don't feel like it's been that long, but it could be, I guess. [01:30:32] Speaker D: Ballpark, as we're doing it live. [01:30:34] Speaker C: Yeah. What's this? Microsoft Entra ID is adding synced passkeys, passkey profiles, and phish-resistant MFA support for Linux SSO, giving organizations more options to move away from passwords while meeting compliance requirements for stronger authentication. Hey, can we get this, Ryan? Yes, immediately. [01:30:49] Speaker A: Yeah, yeah. [01:30:50] Speaker C: Less passwords, less password changes — I would like that. [01:30:53] Speaker A: You might have to fix AD. [01:30:55] Speaker C: Yeah. Starting June 1, 2026, Entra Connect Sync and Cloud Sync will block hard-match operations for users with assigned Entra roles, closing a potential attack path where on-premise AD attribute manipulation could be used to take over privileged cloud accounts. Well, that's a fun bug. Admins should review their hybrid configurations before that date. Why wait until June 1st? Why not do this today? [01:31:15] Speaker D: I told everyone the problem. I think it's that they're turning it on by default. So maybe it's that. [01:31:23] Speaker C: Maybe giving people time to do it themselves. Yeah, that could be. You should turn this on too, Ryan.
The Microsoft Authenticator app now includes jailbreak and root detection for Android, with a phased rollout moving from warning to blocking to wipe mode. Users on non-compliant devices will eventually lose access to Entra credentials entirely. I'm sure people are going to love that. Yeah. Yeah, that one's not gonna go over well. [01:31:45] Speaker D: Ryan, you should turn this one on. [01:31:48] Speaker A: I haven't jailbroken my phone. It's not my problem. [01:31:50] Speaker C: Yeah, I mean, I'm an iOS user. I don't do that. You might, with your Android phone. [01:31:55] Speaker D: Mine's not. No, I'm just on the straight Pixel. It's easier. [01:31:57] Speaker C: All right. [01:31:58] Speaker D: I used to do jailbreaking. It's not worth it anymore. [01:32:01] Speaker C: Yeah, I tried jailbreaking an Apple phone one time, and I was like, this is terrible, I'm never doing this again, and I undid it. Agent management is consolidating under Agent 365 as the single control plane, with the existing Entra admin center agent registry and collections blades retiring May 1, 2026, and the current registry Graph API being deprecated and replaced. I mean, I don't know what any of that meant, but happy retirement. And then finally, Entra ID Governance added support for several notable features this quarter, including SCIM 2.0 API support, delegated workflow management in lifecycle workflows, and a new billing meter for guest users, which organizations relying on governance features for external identities should review for potential cost impact. Does SCIM 2.0 include the ability to sync now? Because that's my one big complaint about SCIM: it's always like, eventually I'll get your accounts provisioned. It's never instantaneous, and it never works as fast as I need it to. [01:32:50] Speaker A: And you always have to choose between SCIM and on-the-fly provisioning, right? You can't have both. That's always so frustrating, so I hope it does.
I don't actually know if it's in SCIM 2.0, but anything's better. And even if it's not, I am really excited about the increased workflow management — that means you can do it yourself. [01:33:05] Speaker C: Agreed. Oh, thanks. I normally don't like Entra updates, but those seem good. Azure SRE Agent now supports Log Analytics and Application Insights — what did you do before? — as native connectors, allowing the agent to run KQL queries directly against workspaces and App Insights resources during incident investigations, replacing the previous approach of shelling out to the Azure CLI. Really? You shelled out to the Azure CLI before and piped it to grep? [01:33:31] Speaker D: Yep, same things that everyone's always done. [01:33:37] Speaker C: Great, thanks. That's awesome. No wonder your tokens get used so fast. [01:33:41] Speaker A: Yeah, exactly. [01:33:41] Speaker C: Setup is simplified compared to the manual RBAC approach: selecting a resource from the drop-down grants the agent's managed identity Log Analytics Reader and Monitoring Reader on the target resource group, with a fallback to the resource level if that fails, and you can remove or move those permissions later. But that's fun, right? The feature is backed by the Azure MCP server using the Monitor namespace, giving the agent read-only tools like monitor workspace log query and monitor table list. Practical use cases include AKS cluster investigations, where the agent can automatically query container logs, Kube events, and application traces. So they really want you to burn tokens. [01:34:10] Speaker A: Well, but that said, that's the only way you'll ever understand Kubernetes logging. [01:34:15] Speaker C: That's true. That's true as well. Yeah. All right. Well, I mean, I don't log on Azure ever, so good luck to you, Matt. I hope those work out well for you. [01:34:25] Speaker D: Thank you. I think. I'm not really sure. I mean, if I go to the shell, it's going to be better. [01:34:31] Speaker A: Yeah.
[01:34:32] Speaker C: And finally, I look forward to hearing how your Azure Key Vault retirement plan is going to go, as apparently Azure Key Vault is retiring its legacy HSM Platform 1 on September 15, 2028, and customers using Microsoft Purview Information Protection with bring-your-own-key will need to migrate their tenant root keys to the modern FIPS 140-2 Level 3 certified HSM platform before that date. Because who loves a good HSM migration? I don't. [01:34:55] Speaker A: Yeah. No. I can't imagine that's easy. I've never actually done one. [01:34:58] Speaker C: Because why would you? I don't ever want to do it if I don't have to. [01:35:04] Speaker A: Exactly. [01:35:04] Speaker D: That's one of those, I may go get a new job before I have to go deal with this. [01:35:09] Speaker C: Yeah, yeah. 2028 — you've got time, so that's good. [01:35:12] Speaker D: The thing is, Microsoft does normally give you a decent amount of time to do stuff, but what's always fun is if you buy, I don't know, a three-year reservation, you're stuck with it. You have to deal with returning it right away, because otherwise you'd have negative time once it's there. Unrelated — I don't know if they do that for the HSM1 platform or not, but I've been burned by that a few times now. [01:35:35] Speaker A: Lovely. [01:35:36] Speaker C: Well, gentlemen, we made it through this Google marathon. Wow. [01:35:41] Speaker A: I didn't think it was going to happen. [01:35:43] Speaker D: An AI marathon. I feel like it was very top- and middle-heavy this week. [01:35:47] Speaker C: It was. Yes. AI — I mean, all the AI players wanted to get their announcements out before Google dropped all theirs. So, you know, I didn't see that coming, but now, in hindsight, I'm like, oh yeah, that makes sense. [01:35:57] Speaker A: If we had sponsors, you know, trying to throw money at us, we'd probably have The Cloud Pod and The AI Pod. We'd have to break it all out.
[01:36:04] Speaker C: I don't know if I could do that. We can't get you guys together for one podcast recording. [01:36:07] Speaker A: Oh. [01:36:07] Speaker C: I mean, that's what I'm saying. [01:36:09] Speaker A: It'd be average. [01:36:09] Speaker C: The AI pod would have to be just AI. Just AI. That's true. [01:36:14] Speaker D: Right. [01:36:17] Speaker C: All right, gentlemen. See you next week, hopefully with wildlife stories. [01:36:20] Speaker A: All right, bye, everybody. [01:36:22] Speaker C: Bye, everyone. [01:36:23] Speaker A: Another week of cloud news wrapped up. Bolt will collect the news, Justin will get the notes, Jonathan will write some code, Ryan will watch the perimeter, and Matt will reluctantly watch Azure. Till next week, for AI, Amazon, Google Cloud, and Azure — and hey, maybe even Oracle, who knows — check out thecloudpod.net for our newsletter, join our Slack, message us on socials, or leave a review. I got new headphones, so that sounded amazing. [01:36:59] Speaker C: But — you want to talk about shoes, Ryan? [01:37:01] Speaker A: Well, you know, why not talk about shoes on a technology podcast? [01:37:05] Speaker C: I mean, the technology of shoes is really impressive. There's a lot of rubber involved. There's a lot of 3D printing of shoes now, like they do at Nike. I mean, there are definitely some cool technology angles, but I don't think this is that. [01:37:17] Speaker A: Oh, it's not. [01:37:18] Speaker C: Oh, I mean, I don't know. You tell me. [01:37:21] Speaker A: No, it's not. This is a shoe company named Allbirds announcing a $50 million deal to rebrand as New Bird AI — shifting from making shoes to GPU compute infrastructure, an on-demand cloud service built to host your AI workloads. Which, okay, sure, that's a turnaround. [01:37:44] Speaker C: I mean, did they just have a lot of warehouse space they didn't know what to do with?
They're like, well, if we turn this into a data center, we can do a lot better for ourselves. I mean, it's a crazy pivot. They sold off the shoe part of the business for, like, pennies. Like, nothing. [01:37:58] Speaker A: Like, they didn't try to get any money back on that. I don't know — it had been going downhill for years, and this is how they tried to pivot something. [01:38:06] Speaker C: I mean, my understanding is that Allbirds as a brand had kind of fallen out of favor and they hadn't innovated any new shoes. And so while there's a very loyal tech-bro following of Allbirds in the Bay, it wasn't really sustainable as a growth business anymore. So they were going to be in a situation where they would be optimizing margins, which is never a fun business to be in. And so, yeah, pivot to AI. I guess that's the way. [01:38:30] Speaker D: That's what... [01:38:30] Speaker A: It's the new hotness. [01:38:31] Speaker D: Right. So it's crazy. [01:38:33] Speaker A: The thing that really annoyed me is that on this announcement their stock surged 580%, which — I would be mad if I invested in a shoe company and they pivoted to some random other business. But apparently that's not the... [01:38:48] Speaker D: You would actually be very happy, I feel like. [01:38:51] Speaker A: Well, that's true. Yeah, I'd be angry, but I'd have money, which softens the blow a bit. [01:38:57] Speaker D: From $2.50 to just south of $17. I mean, it's now sitting — you know, that was unsustainable, but it's now sitting at, like, seven bucks. Six and change. High six and change. So it's definitely staying higher than it was, just from saying, we're doing AI, we're doing GPUs. And I don't understand. [01:39:17] Speaker C: And this company has never done well. Its IPO was November 1, 2021, at $520 a share, which is crazy to me for a shoe company. [01:39:26] Speaker A: Right.
[01:39:27] Speaker C: And basically it's never retained the heights of its IPO. You know, within a year — let's see, by November '22 — it was down to 56 bucks a share. So they literally lost like 90% of their stock price. And then their peak after that was February 2023, at $54, and it's just basically been slowly declining since. It's been under $10 a share now for almost a full year. So, I mean, they had to do something if they wanted to keep the stock price going. But yeah, what are they going to do, become a cloud player? Are they going to be a neocloud? Like, they're going to buy all these GPUs from Nvidia, and then they're going to put them somewhere, and then they're going to do what with them? That's what I don't understand. [01:40:13] Speaker A: And with GPU scarcity, they're going to be a small fish in this larger pond of competition. So if they've got anything new, it's going to be delayed. [01:40:22] Speaker D: Like what? [01:40:23] Speaker A: And what business acumen did they have that they could capitalize on to offer something new in this market? Like, I really don't understand. [01:40:33] Speaker D: There's something else going on, whether it's like a back-room deal or something like that. [01:40:39] Speaker A: Oh, I think the CEO had a nervous breakdown and was just like, AI! [01:40:43] Speaker D: And then, yeah — I told you you needed to use AI, so I'm going to buy AI for you. [01:40:51] Speaker C: Yeah, we're gonna sell the shoes and we're just gonna... But yeah, basically they're saying they're going to lease AI GPU capacity to customers who need dedicated AI access. So, I mean, this is a complete change of the business model. I'm glad the stock came back down — that original pop was so ridiculous — and I would be shocked if this doesn't return to two or three dollars a share unless they can really show momentum in the space.
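Show-notes sanity check on the figures quoted in this segment — the numbers are the ones mentioned on air (rounded), and the two helper lines are just arithmetic, not financial data from anywhere else:

```python
# Back-of-the-envelope check on the stock figures quoted above.

def pct_change(old: float, new: float) -> float:
    """Percent change from an old price to a new price."""
    return (new - old) / old * 100

# IPO at $520, down to roughly $56 a year later: about a 90% loss, as stated.
print(round(pct_change(520, 56)))          # -89

# A 580% pop off a ~$2.50 base lands just south of $17, matching the quote.
print(round(2.50 * (1 + 580 / 100), 2))    # 17.0
```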
But like, like, you know, when are they going to get their GPUs? When are they going to get all these things? Because, you know, they're on the clock now. They've got to prove that they can do something.
[01:41:24] Speaker A: Yeah, I mean, I, yeah, I, I don't suspect we'll see this company around long term. I, I, I just don't think they're going to be competitive in the space. They're not offering anything new to me, and, you know, they're not. There's no narrative on why the shoe company is going to be good at this. So I'm out. I'm out.
[01:41:45] Speaker D: So there, there's an article that, you know, not the one I think we referenced in here, but, you know, I saw, I just found it again. It's like, a lot of companies today didn't start anywhere near what they do nowadays. Like, Nokia, the Finnish telecom company, started as a paper pulp factory, and Nintendo was originally playing cards, and Samsung was exporting dried fish and produce.
[01:42:12] Speaker C: I mean, in theory, Nintendo's still playing cards. Have you seen Yu-Gi-Oh and Pokemon?
[01:42:17] Speaker D: Yeah, I know, but I've never seen
[01:42:18] Speaker A: someone pivot like completely. Like, Amazon was a bookseller company, right? But they didn't give up the commerce business.
[01:42:23] Speaker C: Yeah, I mean, like, even back in the 20s and, you know, the 1910s, like, there were pivots there. But, you know, again, like, they were typically somewhat tangential. Like, oh, I thought I was going to build cabinets, and then I realized that the real money was in countertops, so I moved into countertops, you know. Or like 3M: you know, I invented a bunch of chemicals, and then I stumbled across Post-it Notes and I made a bajillion dollars on Post-it Notes, so I became a paper company. Like, they're very, very clear transition points.
Like, if Allbirds became a 3D printing shoe company who, you know, sold designs for 3D printed shoes, I can say that makes sense. It's related to what you did before, but it's different. And maybe it's making more money than the original business. I can see that. But this, this is like, you know, oranges and apples.
[01:43:11] Speaker D: Well, you're in the fruit family.
[01:43:12] Speaker C: And, like, what? I mean, unless I'm desperate for GPU capacity and I can't get it from any other place in the entire world, why would I go to Allbirds? What's. What's the play for me? What's the value proposition? Like, why am I going to. Are you selling it to me for pennies on the dollar? Because that's not going to be good for your business either.
[01:43:25] Speaker A: Right? It's so strange.
[01:43:26] Speaker D: Well, you'll get people using the product at least. It won't be cost effective for you, but people would use it.
[01:43:33] Speaker C: Yeah.
[01:43:35] Speaker A: Not for long.
[01:43:36] Speaker D: Not for long.
[01:43:38] Speaker C: All right, well, we'll keep an eye on this one. I don't have a lot of hope for them. I think this is just a slow, slow decline into nothingness. But maybe, maybe Allbirds AI will be amazing. I mean, do they have a new website? Can I sign up for interest?
[01:43:51] Speaker A: Yeah.
[01:43:52] Speaker C: Do they have a blog that we can follow on the podcast?
[01:43:54] Speaker D: What happened to AI.com after the Super Bowl?
[01:43:58] Speaker C: Yeah, I have no idea what happened.
[01:44:00] Speaker A: Good call. Yeah.
[01:44:01] Speaker C: Yeah.
[01:44:01] Speaker A: We need.
[01:44:03] Speaker D: We need. Maybe I'll look at adding a feature to Bolt of, like, these random things, like, you know, kind of a reminder, like, one year from now to do something like this. Like, where is AI.com? Where is Allbirds.com?
[01:44:15] Speaker C: I would tell you to. You should write it into Bolt, but I don't know if you have the time for that.
[01:44:22] Speaker D: No, more of a. It will be a future project. I'll put it on my to-do list.
[01:44:26] Speaker C: There you go. So it looks like it's still the same thing it was on the. On the Super Bowl. It's still just a landing page where you can sign up for a handle. My handle is still reserved, because I was trying to sign up for it again and it said, nope, it's still. So it's already used up. But yeah.
[01:44:40] Speaker D: So they dropped their database. Got it.
[01:44:42] Speaker C: I mean, it's still. It's still. It looks like X is still involved somehow, or I don't know, terms and conditions. Like, there's AI.com appended. I mean, yeah, like, when are they going to launch something, do something with it? Yeah, next Super Bowl. They just spent another $10 billion.
[01:44:57] Speaker A: Just a new one. Yeah, yeah, it's just a laundry, you know, a money laundering scheme.
[01:45:02] Speaker C: I'm going to post it to my Twitter. See, people. Anyway, AI.com? Where'd it go?
[01:45:07] Speaker D: This is fascinating for our viewers.
[01:45:09] Speaker C: I know, it's great. Sorry.
[01:45:11] Speaker A: See the show. We'll edit all this out, right?
[01:45:13] Speaker C: Yeah. That's why this is the after show. It's a great, great time. Yeah. Well, I think on that note, we'll let Allbirds become an AI company. Yeah. See how that goes for them. And in the meantime, we'll keep following the cloud. So, gentlemen, I'll see you next week.
[01:45:27] Speaker A: Until next week. Bye now. Bye.
