Welcome to episode 257 of the Cloud Pod podcast – where the forecast is always cloudy! This week your hosts Justin, Matthew, Ryan, and Jonathan are in the barnyard bringing you the latest news, which this week is really just Meta’s release of Llama 3. Seriously. That’s every announcement this week. Don’t say we didn’t warn you.
Titles we almost went with this week:
- Meta Llama says no Drama
- No Meta Prob-llama
- Keep Calm and Llama on
- Redis did not embrace the Llama MK
- The bedrock of good AI is built on Llamas
- The CloudPod announces support for Llama3 since everyone else was doing it
- Llama3, better known as Llama Llama Llama
- The Cloud Pod now known as the LLMPod
- Cloud Pod is considering changing its name to LlamaPod
- Unlike WinAMP, nothing whips the llama's ass
A big thanks to this week’s sponsor:
Check out Sonrai Security's new Cloud Permission Firewall. Just for our listeners, enjoy a 14-day trial at www.sonrai.co/cloudpod
Follow Up
01:27 Valkey is Rapidly Overtaking Redis
- Valkey continues to rack up support: initial backers AWS, Ericsson, Google, Oracle, and Verizon have now been joined by Alibaba, Aiven, Heroku, and Percona.
- Numerous blog posts have come out touting Valkey adoption.
- I’m not sure this whole thing is working out as well as Redis CEO Rowan Trollope had hoped.
AI Is Going Great – Or How AI Makes All Its Money
03:26 Introducing Meta Llama 3: The most capable openly available LLM to date
- Meta has launched Llama 3, the next generation of their state-of-the-art open source large language model.
- Llama 3 will be available on AWS, Databricks, GCP, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, Nvidia NIM, and Snowflake with support from hardware platforms offered by AMD, AWS, Dell, Intel, Nvidia and Qualcomm
- Includes new trust and safety tools such as Llama Guard 2, Code Shield and Cybersec eval 2
- They plan to introduce new capabilities, including longer context windows, additional model sizes and enhanced performance.
- The first two models in the Meta Llama 3 family are the 8B and 70B parameter variants, which can support a broad range of use cases.
- Meta shared benchmarks comparing the Llama 3 8B model against Gemma 7B and Mistral 7B, showing improvements across all major benchmarks, including math, where Gemma 7B scored 12.2 versus 30.0 for Llama 3 8B.
- The 70B model performed comparably against Gemini Pro 1.5 and Claude 3 Sonnet, scoring within a few points on most benchmarks.
- Jonathan recommends using LM Studio to start playing around with LLMs, which you can find at https://lmstudio.ai/
04:42 Jonathan – “Isn’t it funny how you go from an 8 billion parameter model to a 70 billion parameter model but nothing in between? Like you would have thought there would be some kind of like, some middle ground maybe? But, uh, but… No. But, um, I’ve been playing with the, um, 8 billion parameter model at home and it’s absolutely amazing. It blows everything else out of the water that I’ve tried. And it’s fast and it’s incredibly good.”
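If you want to try Jonathan's suggestion programmatically: once a model is loaded, LM Studio can serve an OpenAI-compatible API on localhost. This is a minimal sketch, not official sample code; the port, endpoint path, and model name below are assumptions based on LM Studio's defaults, so adjust them to whatever your local server actually reports.

```python
import json
import urllib.request

# LM Studio's local server speaks the OpenAI chat-completions dialect.
# The URL and model name here are assumptions -- check the LM Studio UI.
LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "llama-3-8b-instruct",
                       temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt: str) -> str:
    """POST a prompt to the local LM Studio server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        LMSTUDIO_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]

# Example (requires the LM Studio local server to be running):
#   print(ask("Summarize the Llama 3 release in one sentence."))
```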
07:08 Building Enterprise GenAI Apps with Meta Llama 3 on Databricks
- Now prepare yourselves for a slew of Llama 3 support announcements with the first one coming from Databricks.
- Databricks AI capabilities let you access production-grade APIs for Llama 3 and easily compare and govern Meta Llama 3 alongside other models.
- They also offer the ability to customize Llama 3 with fine-tuning support on your private data.
- Want to have a go? Check out the Databricks AI playground here.
07:37 OpenAI’s commitment to child safety: adopting safety by design principles
An update on our child safety efforts and commitments
- Both OpenAI and Google have announced their partnership to implement robust child safety measures in the development, deployment, and maintenance of generative AI technologies, as articulated in the Safety by Design principles. The initiative, led by Thorn, a nonprofit dedicated to defending children from sexual abuse, and All Tech Is Human, an organization dedicated to tackling tech and society's complex problems, aims to mitigate the risks generative AI poses to children.
- OpenAI commits to developing, building, and training generative AI models that proactively address child safety risks, including detecting and removing child sexual abuse material (CSAM) and child sexual exploitation material (CSEM) from training data and reporting any confirmed CSAM to the authorities, as well as incorporating feedback loops and iterative stress-testing strategies into the development process and deploying solutions to address adversarial misuse.
- They will release and distribute generative AI models only after they have been trained and evaluated for child safety, combat and respond to abusive content and conduct, incorporate prevention efforts, and encourage developer ownership of safety by design.
- They will maintain model and platform safety by continuing to actively understand and respond to child safety risks, including removing new AIG-CSAM generated by bad actors from the platform, investing in research and future technology solutions, and fighting CSAM, AIG-CSAM, and CSEM on their platforms.
10:45 Introducing more enterprise-grade features for API customers
- OpenAI has released new enterprise-grade features for its API customers.
- They now support Private Link, ensuring customers' traffic between Azure and OpenAI has minimal exposure to the Internet.
- Support for MFA and SSO, data encryption at rest using AES-256 and in transit using TLS 1.2, and role-based access control.
- They can also now offer Business Associate Agreements (BAAs) to healthcare companies.
- The new Projects feature gives organizations more granular control and oversight over individual projects in OpenAI.
11:45 Matthew – “There have been some organizations I worked with in the past that literally just, you don’t have an internet route. You don’t have a zero zero zero out in your V net VPC, wherever it is. You know, you have to use private links for every single thing for every single service. And that’s the only way out to the internet. So it’s probably trying to target those large enterprises that are like it’s ok to spend a third of your bill on private links.”
12:24 Ryan – “I was sort of conflicted about the feature to allow an organization to have granular control and oversight of projects. And like on one hand as a platform provider by day, I’m like, that’s great. There’s teams that’ll use that. And on the other hand, as a user, I’m like, oh, that’s terrible.”
AWS
13:44 Meta Llama 3 foundation models now available on AWS
- A Llama 3 announcement! How unexpected.
- Llama 3 is now available in SageMaker JumpStart, a machine learning (ML) hub that offers pre-trained models, built-in algorithms, and pre-built solutions to help you quickly get started with ML.
14:22 Amazon Inspector agentless vulnerability assessments for Amazon EC2 are now Generally Available (GA)
- Amazon Inspector now continuously monitors your Amazon EC2 instances for software vulnerabilities without installing an agent or additional software. Until now, Inspector relied on the widely deployed AWS Systems Manager (SSM) agent to assess your EC2 instances for third-party software vulnerabilities. With this expansion, Inspector offers two scan modes for EC2 scanning: hybrid scan mode and agent-based scan mode.
- In Hybrid scan mode, Inspector relies on SSM agents to collect information from instances to perform vulnerability assessments and automatically switches to agentless scanning for instances that do not have SSM agents installed or configured.
- For agentless scanning, Inspector takes snapshots of EBS volumes to collect software application inventory from the instances to perform vulnerability assessments.
- In agent-based scan mode, Inspector only scans instances that have an SSM agent installed and configured. New customers enabling EC2 scanning get hybrid mode by default, while existing customers can migrate to hybrid mode by simply visiting the EC2 settings page within the Inspector console. Once enabled, Inspector automatically discovers all your EC2 instances and starts evaluating them for software vulnerabilities.
- Hybrid mode is available in all regions where Inspector is available.
15:35 Ryan – “…managing third party vulnerabilities as agents is nightmarish, right? With the license management and the registration and the deregistration as you’re auto scaling and having services like that, like this one where it’s built in, you’re likely already running the agent. If you’re in the Amazon ecosystem, then how nice would it be to just not have to do one other thing. It’s something that you don’t have to pay attention to. It’s the benefit of a managed service.”
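To make the two scan modes concrete, here's a toy sketch of the selection rule as the announcement describes it. This is our own illustrative logic, not Inspector's actual implementation; the function name, mode strings, and instance shape are invented for the example.

```python
# Toy model of Inspector's EC2 scan-mode behavior as described above:
# agent-based mode only scans instances with a working SSM agent, while
# hybrid mode falls back to an agentless EBS-snapshot scan when no agent
# is present. Names here are hypothetical, for illustration only.

def pick_scan_method(instance: dict, scan_mode: str) -> str:
    """Return how Inspector would assess a single EC2 instance."""
    has_agent = instance.get("ssm_agent_configured", False)
    if scan_mode == "agent-based":
        # No SSM agent means the instance simply isn't scanned.
        return "ssm-agent" if has_agent else "not-scanned"
    if scan_mode == "hybrid":
        # Hybrid mode automatically switches to agentless scanning,
        # which snapshots EBS volumes to collect software inventory.
        return "ssm-agent" if has_agent else "agentless-ebs-snapshot"
    raise ValueError(f"unknown scan mode: {scan_mode}")
```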
17:24 Unify DNS management using Amazon Route 53 Profiles with multiple VPCs and AWS accounts
- If you manage many accounts and VPC resources, sharing and associating many DNS resources to each VPC can present a significant burden. You often hit limits around sharing and association, and you may have even built your own orchestration layers to propagate DNS configurations across your accounts and VPCs.
- In a prior life we did this with subdomains, pushing them into each account's Route 53 configuration.
- Amazon has decided there is a better way with Amazon Route 53 profiles, which provide the ability to unify management of DNS across all your organization’s accounts and VPCs.
- Route 53 Profiles let you define a standard DNS configuration, including Route 53 private hosted zone (PHZ) associations, Resolver forwarding rules, and Route 53 Resolver DNS Firewall rule groups, and apply that configuration to multiple VPCs in the same AWS region.
- With Profiles, you can easily ensure that all your VPCs have the same DNS configuration without the complexity of handling separate Route 53 resources. Managing DNS across many VPCs is now as simple as managing those same settings for a single VPC.
- Profiles are natively integrated with AWS Resource Access Manager, allowing you to share your Profile across accounts or with your AWS Organization.
- Profiles integrate seamlessly with Route 53 private hosted zones, allowing you to create and add existing private hosted zones to your Profile so that your organization's accounts get the same settings when the Profile is shared across accounts.
- This one, though, comes with a hefty price tag: $0.75 per Profile per hour for up to 100 VPC attachments, with an additional fee per VPC over 100.
- It’s a lot of cost to take care of a little bit of work, in our opinion.
20:09 Jonathan – “It’s kind of weird. It’s this static configuration where once it’s in place, it’s in place. And so, I mean, I guess you’re monitoring it to make sure it doesn’t drift, but an hourly charge for that? Yeah, no, I’m not jazzed by the price model.”
21:25 Justin- “I do hope this one comes down in price. Yeah, the other way that we did this in a prior life was we just created subdomains. And then we delegated subdomains to each team’s route 53. Now we paid a lot of money, probably an extra hosted zones that we had to support. But again, I think a hosted zone’s only like 10 or 15 cents, 50 cents, yeah. So it’s not a month, yeah, not per hour.”
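The back-of-the-envelope math behind that reaction, using the figures quoted on the show (these rates are assumptions; verify against the current Route 53 pricing page before planning around them):

```python
# Rates as discussed on the show -- assumptions, not authoritative pricing.
PROFILE_RATE_PER_HOUR = 0.75       # per Profile, up to 100 VPC attachments
HOSTED_ZONE_RATE_PER_MONTH = 0.50  # per hosted zone per month
HOURS_PER_MONTH = 730              # common billing approximation

def profile_monthly_cost(profiles: int = 1) -> float:
    """Monthly cost of Route 53 Profiles, which are billed hourly."""
    return profiles * PROFILE_RATE_PER_HOUR * HOURS_PER_MONTH

def subdomain_monthly_cost(zones: int) -> float:
    """Monthly cost of the DIY alternative: one delegated hosted zone per account."""
    return zones * HOSTED_ZONE_RATE_PER_MONTH

# One Profile runs ~$547.50/month, while even 100 delegated hosted zones
# come to about $50/month -- hence the grumbling about the price model.
```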
22:19 Lots of Bedrock News, Including – you guessed it – Llama 3
Amazon Bedrock Launches New Capabilities as Tens of Thousands of Customers Choose It as the Foundation to Build and Scale Secure Generative AI Applications
Meta’s Llama 3 models are now available in Amazon Bedrock
Guardrails for Amazon Bedrock now available with new safety filters and privacy controls
Agents for Amazon Bedrock: Introducing a simplified creation and configuration experience
Amazon Bedrock model evaluation is now generally available
Import custom models in Amazon Bedrock (preview)
Amazon Titan Image Generator and watermark detection API are now available in Amazon Bedrock
- Amazon dropped a press release and six new features for Bedrock.
- First up Llama 3 support….
- The new custom model import lets you bring your proprietary model to Bedrock and take advantage of Bedrock's capabilities.
- Guardrails for Bedrock provides customers with best-in-class technology to help them effectively implement safeguards tailored to their application needs and aligned with their AI policies.
- “Amazon Bedrock is experiencing explosive growth, with tens of thousands of organizations of all sizes and across all industries choosing it as the foundation for their generative AI strategy because they can use it to move from experimentation to production more quickly and easily than anywhere else,” said Dr. Swami Sivasubramanian, vice president of AI and Data at AWS. “Customers are excited by Amazon Bedrock because it offers enterprise-grade security and privacy, a wide choice of leading foundation models, and the easiest way to build generative AI applications. With today’s announcements, we continue to innovate rapidly for our customers by doubling-down on our commitment to provide them with the most comprehensive set of capabilities and choice of industry-leading models, further democratizing generative AI innovation at scale.”
- Additional model choices include Titan Text Embeddings, Titan Image Generator, and the new Llama 3 and Cohere models.
- Bedrock= simple to configure? We don’t think that word means what AWS thinks it means.
24:53 Amazon RDS Performance Insights provides execution plan for RDS SQL Server
- RDS Performance Insights now collects the query execution plans of resource-intensive SQL queries in Amazon RDS for SQL Server and stores them over time.
- It helps identify if a change in the query execution plan is the cause of the performance degradation or stalled query.
- A query execution plan is a sequence of steps the database engine uses to access relational data.
- This feature allows you to visualize a SQL query with multiple plans and compare them.
21:25 Justin – “It’s also annoying that this isn’t just built into SQL Server, that it would keep history of stored SQL plans forever. So I do appreciate that Amazon has built this, but come on Microsoft, you can do better.”
GCP
26:18 Meta Llama 3 Available Today on Google Cloud Vertex AI
- Guess what… Meta Llama 3 is available in Vertex.
26:56 Direct VPC egress on Cloud Run is now generally available
- After missing the Google Next deadline…the Cloud Run team is pleased to announce GA of Direct VPC egress for Cloud Run.
- This feature enables your Cloud Run resources to send traffic directly to a VPC network without proxying it through Serverless VPC Access connectors, making it easier to set up, faster, and cheaper.
- Direct VPC egress delivers approximately twice the throughput of both VPC connectors and the default Cloud Run internet egress path, offering up to 1 GB per second per instance.
27:31 Ryan – “Yeah, this is just one of those usability things when you get all excited to use Cloud Run and then you realize you can’t do anything with it because you have all this other configuration that you have to do. And that’s just to get to the internet. And then trying to get it into your other environment is this whole other peering nonsense, you know, like just awful. It just makes it difficult to adopt and, you know, like it, you didn’t get my attention in the first five minutes. I’m probably not going to use that solution.”
34:36 Introducing the Verified Peering Provider program, a simple alternative to Direct Peering
- In today’s cloud-first world, customers need a simple and highly available connectivity solution, says Google.
- Many customers access Google Workspace, Cloud and APIs using direct peering, a solution designed for carrier-level network operators. These operators have in-house expertise to manage the peering connectivity, which can require complex routing designs. Because of the complexity, not all customers want to do this work.
- At the same time customers also access latency-sensitive secure access service edge solutions or are migrating to SD-Wan solutions hosted on GCP using the internet as transport. Customers need to know where internet service providers (ISPs) are connected to Google’s network with the appropriate level of redundancy and HA.
- To solve this, Google is announcing the Verified Peering Provider program, a new offering that simplifies connectivity to Google’s network. The program identifies ISPs that offer enterprise-grade internet services and have met multiple technical requirements, including diverse peering connectivity to Google.
35:38 Ryan – “I want to love this, but then when you read deep into this, you realize that it’s just a site that lists all their existing providers, their locations, and their resiliency offerings. And then you still have to go through all the setup of creating Direct Connect and working with your providers.”
37:04 Jonathan – “So it’s the Angie’s List of Peering Providers.”
Azure
37:28 CISA Review of the Summer 2023 Microsoft Exchange Online Intrusion
- A full review by CISA’s Cyber Safety Review Board of the summer 2023 Exchange Online hack by a threat actor has now been completed, and frankly the report is unkind to Microsoft and its practices.
- The executive summary alone is pretty damning, and there are even more interesting items in the details.
- They concluded that the intrusion was preventable and should never have occurred. They also concluded that Microsoft’s security culture is inadequate and requires an overhaul, particularly in light of the company’s centrality in the technology ecosystem and the level of trust customers place in the company to protect their data and operations.
- They reached this conclusion based on 7 points:
- The cascade of avoidable Microsoft errors that allowed the intrusion to succeed
- Microsoft’s failure to detect the compromise of its cryptographic crown jewels on its own, relying instead on customers to reach out to identify anomalies the customer had observed
- The board’s assessment of security practices at other cloud service providers, which maintained security controls that Microsoft did not
- Microsoft’s failure to detect the compromise of an employee’s laptop from a recently acquired company before allowing it to connect to Microsoft’s corporate network in 2021
- Microsoft’s decision not to promptly correct its public statements about the incident, including a corporate statement that Microsoft believed it had determined the likely root cause of the intrusion when, in fact, it still has not; even though Microsoft acknowledged to the Board in November 2023 that its September 6, 2023 blog post about the root cause was inaccurate, it did not update that post until March 12, 2024, as the Board was concluding its review, and only after the Board’s repeated questioning about Microsoft’s plans to issue a correction
- The Board’s observation of a separate incident, disclosed by Microsoft in January 2024, the investigation of which was not within the purview of the Board’s review, that revealed a compromise allowing a different nation-state actor to access highly sensitive Microsoft corporate email accounts, source code repositories, and internal systems
- How Microsoft’s ubiquitous and critical products, which underpin essential services that support national security, the foundations of the economy, and public health and safety, require the company to demonstrate the highest standards of security, accountability and transparency.
- The Board believes that to resolve this, Microsoft’s board and CEO need to directly focus on the company’s security culture and develop and share publicly a plan with specific timelines to make fundamental, security-focused reforms across the company and its full suite of products.
- The board recommends that the CEO hold senior officers accountable for delivery against the plan.
- In the interim, Microsoft leaders should consider directing internal teams to deprioritize feature development across the company’s cloud infrastructure and product suite until substantial security improvements have been made, to preclude resource competition.
- OUCH. Charlie Bell has *a lot* to deliver now.
- AWS Response to March 2024 CSRB report
- How the unique culture of security at AWS makes a difference
40:45 Jonathan – “I think it really emphasizes the need to separate production implementations of software and the development of the software. Having access to source code, it shouldn’t be the end of the world. But yeah, getting the access they did to the date they did is completely unacceptable.”
41:49 Justin – “Now in, in punching down on your competitor, Amazon decided to respond to this. And they posted a, you know, one of their quick little blog posts, basically saying, Amazon is aware of the recent cyber safety review board report regarding the 2023 Microsoft online exchange issue. We are not affected by the issue described in the report and no customer action is required…To learn more, please refer to our blog post…security is everyone’s job and distributing security expertise and ownership across AWS as a thing and scaling security through innovation. And it just feels dirty.”
44:12 Manufacturing for tomorrow: Microsoft announces new industrial AI innovations from the cloud to the factory floor
- Manufacturing faces ongoing challenges with supply chain disruptions, changes in consumer demand, workforce shortages and the presence of data silos. These issues are making it crucial for the industry to adapt and change.
- AI is, of course, the solution, enabling companies to fundamentally change their business models and tackle pervasive industry challenges. AI acts as a catalyst for innovation, efficiency, and sustainability in manufacturing.
- Key Benefits
- Enhanced time-to-value and operational resilience: AI solutions help streamline processes, reducing the time from production to market.
- Cost optimization: AI optimizes factory and production costs through better resource management and process automation.
- Improved productivity: AI tools empower front-line workers by simplifying data queries and decision-making, thereby enhancing productivity and job satisfaction.
- Innovation in factory operations: new AI and data solutions facilitate the creation of intelligent factories that are more efficient and can adapt to changes quickly.
- Microsoft is looking to help by leveraging data solutions like Fabric and Copilot, which are designed to unify operational technology (OT) and information technology (IT) data, accelerate AI deployment, and enhance the scalability of these solutions across manufacturing sites.
Small Cloud Providers
45:38 Meta Llama 3 available on Cloudflare Workers AI
- Oh hey Cloudflare supports Llama 3.
- Betcha didn’t see that coming. (Or hear it coming?)
Aftershow
46:31 Amazon Fresh kills “Just Walk Out” shopping tech—it never really worked
- If you’ve been amazed by the Just Walk Out technology and likened it to magic, it’s probably for good reason, as it’s built on a house of lies… (reminds us of Tesla’s FSD)
- “Just walk out” is supposed to let customers grab what they want from a store and just leave.
- Amazon wanted to track what customers took with them purely via AI powered video surveillance; the system just took a phone scan at the door, and shoppers would be billed later.
- There are reportedly a ton of tech problems, and Amazon has been struggling with them for six years since the initial announcement.
- The report indicated that Amazon had more than 1,000 people in India working on Just Walk Out in mid-2022, whose jobs were to manually review transactions and label images from videos to train the ML model.
- Training is part of any project, but, even after years of work, they have been unable to deliver on the promises.
- Amazon will be switching to a more reasonable cashier-less format: shopping carts with built-in check-out screens and scanners. Customers can leisurely scan items as they throw them in the Amazon Dash Cart and the screen will show a running total of their purchases.
- It’s not the first time we’ve run into an AI project that was built on a bed of lies.
51:14 Justin- “Well, maybe it’ll come back someday in the future when people figure out technology. But yeah, I sort of this, you know, I still feel like we’re dangerously close to like a lot of FUD around gen AI happening and the trough of disillusionment happening very quickly and stories like this don’t help. And so, you know, it’s going to be interesting reckoning in the economy as all these companies have laid people off with saying AI is making them more efficient. And I’m like, is it? Or are you just hoping it’s going to be? And then are you going to be suffering in another year from now when things aren’t working in your organization?”
Closing
And that is the week in the cloud! Go check out our sponsor, Sonrai, and get your 14-day free trial. Also visit our website, the home of the Cloud Pod, where you can join our newsletter and Slack team, send feedback, or ask questions at theCloudPod.net, or tweet at us with hashtag #theCloudPod