Welcome to episode 250 of the Cloud Pod podcast – where the forecast is always cloudy! Well, we’re not launching rockets this week, but we ARE discussing the AI arms race, AWS going nuclear, and all the latest drama between Elon and OpenAI. You won’t want to miss a minute of it!
Titles we almost went with this week:
- The Paradox of AI choice
- Amazon Just Comes Across as Super Desperate RACING to AI Foundation Model Support
- Your New Jr. Developer: TestGen-LLM
- If you can’t beat OpenAI, sue them
A big thanks to this week’s sponsor:
We’re sponsorless this week! Interested in sponsoring us and having access to a specialized and targeted market? We’d love to talk to you. Send us an email or hit us up on our Slack Channel.
General News
01:12 IT Infrastructure, Operations Management & Cloud Strategies: Chicago (Rosemont/O’Hare), Illinois
- Want to meet cloud superstar Matthew Kohn in person? He’s going to be giving a talk in Chicago, if you’re going to be in the neighborhood. *Maybe* he’ll have some stickers.
- 11:30am – 12:30pm: Using Data and AI to Shine a Light on Your Dark IT Estate
AI Is Going Great (Or, How ML Makes All Its Money)
03:42 Anthropic claims its new models beat GPT-4
- AI startup Anthropic has announced the latest version of Claude.
- The company claims that it rivals OpenAI’s GPT-4 in terms of performance.
- The Claude 3 family of models includes Claude 3 Haiku, Sonnet, and Opus, with Opus being the most powerful.
- All show “increased capabilities” in analysis and forecasting, Anthropic claims, as well as enhanced performance on specific benchmarks versus models like GPT-4 (but not GPT-4 Turbo) and Google’s Gemini 1.0 Ultra (but not Gemini 1.5 Pro).
- Claude 3 is Anthropic’s first multimodal model.
- In a step up from rivals, Claude can analyze multiple images in a single request (up to 20), allowing it to do compare-and-contrast operations.
- However, there are limits to its image capabilities. It’s not allowed to identify people.
- They admit it is also prone to mistakes on low-quality images under 200 pixels, and struggles with tasks involving spatial reasoning and object counting.
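That multi-image support maps onto Claude’s Messages API as a list of content blocks sent in one request. Here’s a minimal Python sketch of assembling such a payload; the base64 image-block shape follows Anthropic’s documented format, but the model ID is an assumption, and actually sending the request would require the `anthropic` SDK and an API key (not shown here):

```python
import base64

def build_multi_image_request(images, question, model="claude-3-opus-20240229"):
    """Assemble a Claude 3 Messages API payload that compares several images.

    Claude 3 accepts multiple images per request (up to 20); each one is
    passed as a base64-encoded content block ahead of the text prompt.
    """
    content = [
        {
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": "image/png",
                "data": base64.b64encode(img).decode("ascii"),
            },
        }
        for img in images
    ]
    # The text question goes last, after all the image blocks.
    content.append({"type": "text", "text": question})
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": content}],
    }

# Two toy "images" plus one text block -> three content entries.
payload = build_multi_image_request([b"png-1", b"png-2"], "What changed between these?")
```

Pass the resulting dict to the SDK’s `messages.create(**payload)` to run the actual comparison.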
05:42 Justin – “Overall, this looks like not a bad model. I do see a little bit of chatter today actually. Some people say it’s not quite as good in some areas, but it’s pretty good in others. And it is not connected to the internet, this model. So it is dated only through August of 2023. So anything that happened after that, like the Israeli Hamas conflicts, it doesn’t know anything about those. So just be aware.”
06:08 Matthew – “You know, it’s actually interesting now. There’s so many models out there. You know, you have to start to look at what makes sense for your data and what you need, along with also price. You know, I look too closely at what the price is, but you might be able to get away with running this over GPT-4 turbo, and you might not need the latest and greatest, and you’re leveraging this in your company’s product or just in general.”
07:38 Meta’s new LLM-based test generator is a sneak peek to the future of development
- This article comes from the Engineer’s Codex blog, which Justin is a subscriber to.
- The post is about a recent paper released by Meta, Automated Unit Test Improvement Using Large Language Models at Meta.
- The idea is to use AI to make developers more productive, from Google using AI for code reviews to now this Meta article.
- Their AI recommends fully formed software improvements that are verified to be both correct and an improvement on current code coverage.
- Compared to ChatGPT where things have to be manually verified to work, it’s a nice improvement.
- TestGen-LLM uses an approach called Assured LLM-based software engineering.
- It uses private, internal LLMs that are probably fine-tuned with Meta’s codebase.
- This means it uses LLM to generate improvements that are backed by verifiable guarantees of improvement and non-regression.
- TestGen-LLM uses an ensemble approach to generate code improvements.
- A good way to think about it is it’s a junior dev with the task of creating more comprehensive tests for existing code.
- Other devs have more important things to work on, so this LLM gets the fun task of improving unit tests.
- The tests the junior AI developer creates are often good, and sometimes trivial or pointless. Occasionally, a test it produces is really good or inadvertently uncovers a bug.
- Regardless, this work wouldn’t have been done by humans anyway, due to other priorities.
- All of the pull requests from it require a human reviewer before being pushed into the codebase.
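The “assured” part of the approach is essentially a chain of hard filters between the LLM and the codebase. A rough Python sketch of that gatekeeping logic (function names here are ours for illustration, not Meta’s actual API):

```python
def assured_filter(candidates, builds, passes, coverage_gain):
    """Filter LLM-generated test candidates through TestGen-LLM-style gates:
    a test is only kept if it builds, passes reliably, and measurably
    improves coverage. Anything that survives still goes to a human reviewer.
    """
    survivors = []
    for test in candidates:
        if not builds(test):            # gate 1: must compile/build
            continue
        if not passes(test):            # gate 2: must pass (rerun to weed out flakes)
            continue
        if coverage_gain(test) <= 0:    # gate 3: must add coverage, not just pass
            continue
        survivors.append(test)
    return survivors

# Toy run: only "t2" builds, passes, AND adds coverage, so only it survives.
kept = assured_filter(
    ["t1", "t2", "t3"],
    builds=lambda t: t != "t1",
    passes=lambda t: t != "t3",
    coverage_gain=lambda t: 2 if t == "t2" else 0,
)
```

This is why the output can carry “verifiable guarantees of improvement and non-regression”: nothing that fails a gate ever reaches a pull request.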
09:21 Matthew – “It’s amazing where we’re going with all this stuff. And the fact that it’s able to actually take your own, do the analysis and produce anything, you know, is great. Unit tests are one of those things that if you’re not doing test-driven development, which I feel like very few people do TDD, it’s a great way to start to really find all these bugs. Slightly terrifying on the same level of how good it gets at some of these things, you know, as I play with Copilot and a few of the other, you know, technologies that I play with often, but it’s getting there and, you know, it can start to automate some of these things, which is great because let’s be honest, what developer really likes to write unit tests?”
10:43 Struggling to Pick the Right AI Model? Let’s Break It Down.
- Last week, Ryan and Justin were talking about how difficult it is to choose between all these foundational models, and how do you really compare them beyond speed and accuracy? (Or in the case of Google, how “woke” the model is.)
- Now this week, Cohere has a blog post about picking the right AI model… and we’re going to take a wild guess that it’s a little biased… but there may be some nice generic ideas.
- Cohere’s Sudip Roy and Neil Shepherd published a paper on How to Choose the Right AI model for your Enterprise
- Some of the advice:
- Open-Source or Proprietary – Consider not only the upfront costs, but also the time-to-solution, data provenance, and indemnity options to avoid any unwanted surprises like indemnity obligations some open-source providers include. Then review the level of support and engineering know-how you will need, and the frequency of updates made to the models.
- General or Tailored – Rightsizing the model to your use case and performance requirements at scale is critical. For example, does your solution need advanced reasoning (and the costs it entails) for every query? Consider how a fine-tuned model with advanced RAG capabilities may outperform a large general model at a fraction of the cost. Look for models optimized for performance with methods like quantization, transformer efficiencies and model compression techniques.
- Transformation or Incremental Adoption – Most organizations start with solutions for tactical benefits, like increasing productivity and lowering costs. A growing trend among customers is improving information retrieval systems with simple integration of a Rerank solution.
13:27 Justin – “I’m sort of hoping for a Forrester Wave or a Magic Quadrant of models. You know, some kind of general guidance that would be helpful as well. But, you know, I assume it’s going to be an area that’s rapidly maturing here over the next few years as people get more experience and more use cases behind these things.”
AWS
13:56 New AWS Region in Mexico is in the works
- Feliz Cloudidad, Feliz Cloudidad
Feliz Cloudidad, prospero año y felicidad
I wanna wish you a cloud-based welcome
From the heart of Mexico’s land
Where the servers hum, and the data it streams
AWS brings power in hand
(The rest of the team apologizes for this)
- AWS is announcing a new region is coming in Mexico.
- The Mexico Central Region will be the second Latin American region, joining the São Paulo region.
- The new region will have three availability zones.
14:24 AWS to Launch an Infrastructure Region in the Kingdom of Saudi Arabia
- AWS is announcing that they will launch an AWS infrastructure region in the Kingdom of Saudi Arabia in 2026. This will allow customers who want to keep their content in the country to do so.
- This commitment also includes investing more than $5.3 billion in the KSA.
- “Today’s announcement supports the Kingdom of Saudi Arabia’s digital transformation with the highest levels of security and resilience available on AWS cloud infrastructure, helping serve fast-growing demand for cloud services across the Middle East,” said Prasad Kalyanaraman, vice president of Infrastructure Services at AWS. “The new AWS Region will enable organizations to unlock the full potential of the cloud and build with AWS technologies like compute, storage, databases, analytics, and artificial intelligence, transforming the way businesses and institutions serve their customers. We look forward to helping Saudi Arabian institutions, startups, and enterprises deliver cloud-powered applications to accelerate growth, productivity, and innovation and spur job creation, skills training, and educational opportunities.”
- The new region will have 3 availability zones at launch.
20:22 AWS Acquiring Data Center Campus Powered by Nuclear Energy
- Talen Energy Corp has sold its Cumulus data center campus, near a Pennsylvania nuclear power station, to AWS.
- It’s a 960 MW data center campus that can house multiple data center facilities, all powered by the Susquehanna Nuclear Power Plant.
- The data center campus comprises 1200 acres. AWS has minimum contractual power commitments for the data center that will ramp up in 120MW increments over several years, with a one-time option to cap commitments at 480MW.
21:17 Justin – “The interesting thing about this: I thought maybe it was close enough on the Pennsylvania border to Northern Virginia that it wouldn’t be a big deal, but it’s actually like 300 miles or so. It’s not close. So I was trying to figure out, is this going to be a new US East region, or is this going to be somehow extended into us-east-1? I’m not even sure Amazon’s planning to use this thing, because they haven’t announced it, and all this news comes directly from Talen Energy, who publicly had to announce it because they’re a publicly traded company.”
22:53 Amazon EKS announces support for Amazon Linux 2023
- EKS now supports AL2023.
- AL2023 is the next generation of Amazon Linux from AWS, and is designed to provide a secure, stable and high performance environment to develop and run your cloud applications.
- EKS customers can enjoy the benefit of AL2023 by using standard AL2023-based EKS optimized AMIs with managed node groups, self-managed nodes, and Karpenter.
- It brings several improvements over AL2: a secure-by-default approach to help improve your security posture, with preconfigured security policies, SELinux in permissive mode, and IMDSv2 enabled by default, as well as an optimized boot time to reduce the time from instance launch to running applications.
- ECS got its AL2023-optimized AMI in March 2023.
23:52 Matthew – “I was looking up to see, I was like, did I miss something? Cause I feel like AL2 became what’s originally was, supposed to be Amazon Linux 2022, which got renamed to 2023 I thought. And this just feels like a really long time for them to get support. So either it wasn’t a priority, which sounds weird, because, you know, I thought they were trying to kill off AL2 or…They had to do a whole lot to make it get there. Like I’m just trying to figure out why it took them so long.”
26:10 Anthropic’s Claude 3 Sonnet foundation model is now available in Amazon Bedrock
- Amazon was quick to announce that Claude 3 Sonnet is now available in Bedrock – the same day as Anthropic announced it to the world.
- Sonnet is available today, with Opus and Haiku coming very soon.
- Amazon points out that Claude 3 Sonnet is 2x faster than Claude 2 and Claude 2.1, with increased steerability and new image-to-text vision capabilities.
- Claude 3 also has expanded its language understanding beyond English to also include French, Japanese and Spanish.
- “Anthropic at its core is a research company that is trying to create the safest large language models in the world, and through Amazon Bedrock we have a chance to take that technology, distribute it to users globally, and do this in an extremely safe and data-secure manner.” — Neerav Kingsland, Head of Global Accounts at Anthropic
27:04 Mistral AI models now available on Amazon Bedrock
- Last week we told you it was coming
- This week it’s here.
- Come on, Amazon, you’re just looking desperate.
- Mistral 7B and Mixtral 8x7B are now available in Bedrock.
27:22 Justin – “I have to say, at Amazon, this just looks desperate. Couldn’t have waited a week. Couldn’t have just, you know, let, you know, Hey, it’s now available today. You know, you didn’t have to tell me preannouncement last week. I mean, it’s one thing to pre-announce and like it waits, it takes a month or so, but like literally you preannounced, we recorded and like three days later you announced it was available. Uh, it just smells of desperation. And this is where I was commenting earlier about weird named models.”
28:39 Introducing the AWS WAF traffic overview dashboard
- AWS has introduced new WAF traffic overview dashboards to make it easy to see your security-focused metrics, so that you can identify and take action on security risks in a few clicks, such as adding rate-based rules during DDoS events.
- The dashboards include near-real-time summaries of the Amazon CloudWatch metrics that WAF collects.
- These dashboards are available by default, and require no additional setup.
- With default metrics such as the total number of requests, blocked requests, and common attacks blocked, you can customize your dashboard with the metrics and visualizations that are most important to you.
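Since the dashboards sit on top of plain CloudWatch metrics, you can pull the same numbers yourself. A hedged sketch of building the query parameters in Python (the `AWS/WAFV2` namespace and `BlockedRequests` metric are real, but the web ACL name below is a placeholder; we only construct the parameters here, so no AWS credentials are needed):

```python
from datetime import datetime, timedelta

def waf_blocked_requests_query(web_acl, region, start, end):
    """Build CloudWatch GetMetricStatistics parameters for WAF blocked requests.

    Pass the result to boto3's cloudwatch.get_metric_statistics(**params);
    the WebACL/Region/Rule dimensions follow the AWS/WAFV2 metric namespace.
    """
    return {
        "Namespace": "AWS/WAFV2",
        "MetricName": "BlockedRequests",
        "Dimensions": [
            {"Name": "WebACL", "Value": web_acl},   # placeholder web ACL name
            {"Name": "Region", "Value": region},
            {"Name": "Rule", "Value": "ALL"},       # aggregate across all rules
        ],
        "StartTime": start,
        "EndTime": end,
        "Period": 300,            # 5-minute buckets, matching the near-real-time view
        "Statistics": ["Sum"],
    }

end = datetime(2024, 3, 1)
params = waf_blocked_requests_query("my-web-acl", "us-east-1", end - timedelta(hours=1), end)
```

With a boto3 CloudWatch client in hand, `client.get_metric_statistics(**params)` returns the same blocked-request counts the dashboard charts.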
30:56 Justin – “Or what you can do is what I did, just put the CloudPod website behind CloudFlare, enable their WAF and DDoS capabilities, and you’re done. I don’t think about it ever now. And so it’s a pretty nice package over there. So I definitely recommend that if you’re not interested in implementing the Amazon WAF, or you’re looking for something that’s maybe multi-cloud, CloudFlare would be your friend.”
33:07 Free data transfer out to internet when moving out of AWS
- AWS sees you, Google, and calls your bluff with its own “free data transfer out” when moving out of AWS.
- AWS feels they’re the best choice for a broad set of services – including over 200 fully featured services for all your workload needs.
- But even so, starting today, AWS now believes this must include the ability to migrate your data to another cloud provider or on-premises, and so now they are waiving data transfer out to the internet (DTO) charges when you want to move outside of AWS.
- They point out that over 90% of their customers already incur no data transfer expenses out of AWS because they provide 100 gigabytes per month free from AWS regions to the internet.
- If you need more, it’s as simple as reaching out to AWS support to ask for free DTO rates for the additional data.
- AWS says you must go through support because customers make hundreds of millions of data transfers each day, and they don’t know whether data transferred out to the internet is a normal part of your business or a one-time transfer as part of a switch to another cloud provider or on-premises.
- Is the math math-ing here? We have questions.
- All review requests will be done at the AWS account level. Once approved, they will provide credits for the data being migrated. “We don’t require you to close your account or change your relationship with AWS in any way. You’re welcome to come back at any time. We will, of course, apply additional scrutiny if the same AWS account applies multiple times for free DTO,” says Amazon.
- “We believe in customer choice.” Sure, Jan.
- The data transfer waiver also follows the directives set by the European Data Act and is available to all AWS customers around the world.
- **Listener Note from Justin** Not so fast: After your move away from AWS services, within the 60-day period, you must delete all remaining data and workloads from your AWS account, or you can close your AWS account.
GCP
Announcing Anthropic’s Claude 3 models in Google Cloud Vertex AI
- In Google’s secondary hope for a foundation model after Gemini went “woke”… Anthropic’s Claude 3 models are also now available in Google Cloud Vertex AI.
- They’re following a similar playbook to AWS: announcing it’s coming in the upcoming weeks, but they had to get the news out.
- All the cool things about Claude apply here just as they do on AWS.
37:11 Google Cloud databases stand ready to power your gen AI apps with new capabilities
- At Next ‘23, Google laid out its vision to help developers build enterprise gen AI applications, including delivering world-class vector capabilities, building strong integration with the developer ecosystem, and making it easy to connect to AI inference services.
- Google has been hard at work building this and now is announcing the GA of AlloyDB AI, an integrated set of capabilities in AlloyDB to easily build enterprise Gen AI apps.
- AlloyDB AI is available in both AlloyDB and AlloyDB Omni. It:
- Is optimized for enterprise gen AI apps that need real-time and accurate responses
- Delivers superior performance for transactional, analytical, and vector workloads
- Runs anywhere, including on-premises and in other clouds, enabling customers to modernize and innovate wherever they are
- “AlloyDB acts as a dynamic vector store, indexing repositories of regulatory guidelines, compliance documents, and historical reporting data to ground the chatbot. Compliance analysts and reporting specialists interact with the chatbot in a conversational manner, saving time and addressing diverse regulatory reporting questions.” – Antoine Moreau, CIO, Regnology
- We’ve talked about vector search a few times here on the show; now Google is announcing vector search across Cloud SQL for MySQL, Memorystore for Redis, and Spanner in preview.
- Cloud SQL for MySQL also now supports both approximate and exact nearest neighbor vector searches, adding to the pgvector capabilities launched last year in Cloud SQL for Postgres.
- LangChain has grown to be one of the most popular open-source LLM orchestration frameworks. In its efforts to provide application developers with tools to help them quickly build gen AI apps, Google is open-sourcing LangChain integrations for all of its Google Cloud databases.
- They will support three types of LangChain integrations that include vector stores, document loaders and chat messages memory.
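For context on what the “approximate and exact nearest neighbor” distinction in the Cloud SQL announcement looks like in practice, here is a minimal pgvector-style sketch (table and column names are made up; the `<->` distance operator and `ivfflat` index are standard pgvector syntax). The SQL is held in Python strings you could run via any Postgres driver:

```python
# Exact search scans every row and computes every distance; approximate
# search uses an ivfflat (or hnsw) index that trades a little recall for
# much lower latency at scale. The ORDER BY query is identical either way.
EXACT_KNN = """
SELECT id, embedding <-> %(query)s::vector AS distance
FROM documents
ORDER BY embedding <-> %(query)s::vector
LIMIT 5;
"""

# Creating this index switches the same ORDER BY query from an exact scan
# to approximate nearest-neighbor search.
ANN_INDEX = """
CREATE INDEX ON documents
USING ivfflat (embedding vector_l2_ops)
WITH (lists = 100);
"""
```

The design point worth noting: because the index, not the query, controls exact vs. approximate behavior, applications can start exact and add the index later without rewriting queries.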
40:53 Justin – “All I want to say is this whole announcement makes me feel like I want to say bingo on the amount of tech buzzwords that they can throw in here. Like we have Redis, we have Memcached (or sorry, we have Memorystore), we have SQL, we have pgvector for Postgres. There’s just everything in the one, which goes back to your prior point that Google just throws every announcement into one. As you were saying it, I was like, okay, AWS would… they could potentially do the Cloud SQL, the Memorystore for Redis, and the Cloud Spanner. I can see that being one or three for them. If it’s around re:Invent, they would do three different slides. Otherwise, I could see them doing it as one. Plus the whole next one; there’s seven announcements in this.”
Azure
42:41 Introducing Microsoft Copilot for Finance – the newest Copilot offering in Microsoft 365 designed to transform modern finance
- Microsoft is announcing its latest Copilot, Copilot for Finance, designed for business functions; it extends the standard Copilot for Microsoft 365 and revolutionizes how finance teams approach their daily work.
- Finance departments are critical partners in strategic decisions impacting the company’s direction.
- Eighty percent of finance leaders and teams face challenges taking on more strategic work outside the operational portions of their roles.
- However, 62% of finance professionals say they are stuck in the drudgery of data entry and review cycles.
- Copilot for Finance includes Copilot for Microsoft 365, which means it supercharges Excel, Outlook, and other widely used productivity apps with workflow and data-specific insights for the finance professional.
- Copilot for Finance draws on essential context from your existing financial data sources, including traditional enterprise ERPs such as Dynamics and SAP, and the Microsoft Graph.
- Key features of Copilot for Finance:
- Quickly conduct a variance analysis in Excel using natural language prompts to review data sets for anomalies, risks, and unmatched values. This type of analysis helps finance provide strategic insights to business leaders about where the company is meeting, exceeding, or falling short of planned financial outcomes.
- Simplifies the reconciliation process in Excel with automated data structure comparisons and guided troubleshooting to help move from insight to action, which helps ensure the reliability and accuracy of financial records.
- Provides a complete summary of relevant customer account details in Outlook, such as balance statements and invoices, to expedite the collections process
- Enables customers to turn raw data in Excel into presentation-ready visuals and reports ready to share across Outlook and Teams.
44:26 Matthew – “I’m not gonna lie, I’m kind of looking forward to playing with this, mainly with our Azure Cloud Bill. Like I want to see, you know, I already kill Excel and it consumes like 10 gigabytes on my Mac, you know, every time I open it with our Cloud Bill. And then like I have pivot tables and you know, a bunch of data analysis I do every time about it, but I kind of want to see what Copilot for Finance does with this.”
45:59 Microsoft and Mistral AI announce new partnership to accelerate AI innovation and introduce Mistral Large first on Azure
- Mistral AI is a recognized leader in generative AI. Its “commitment to fostering the open-source community and achieving exceptional performance aligns harmoniously with Microsoft’s commitment to develop trustworthy, scalable and responsible AI solutions.”
- The partnership gives Mistral AI access to Azure’s cutting-edge AI infrastructure to accelerate the development and deployment of its next-generation large language models (LLMs), and represents an opportunity for Mistral AI to unlock new commercial opportunities, expand to global markets, and foster ongoing research.
- Microsoft’s partnership with Mistral AI is focused on three core areas:
- Supercomputing infrastructure: MS will support Mistral AI with Azure AI supercomputing infrastructure delivering best-in-class performance and scale for AI training and inference workloads for Mistral AI’s flagship models.
- Scale to market: MS and Mistral AI will make Mistral AI’s premium models available to customers through the Models as a Service (MaaS) in the Azure AI studio and Azure Machine Learning model catalog.
- AI research and development: Microsoft and Mistral AI will explore collaboration around training purpose-specific models for select customers, including European public sector workloads.
Aftershow
48:52 Elon Musk sues OpenAI and CEO Sam Altman, claiming betrayal of its goal to benefit humanity
- Elon Musk is suing OpenAI and CEO Sam Altman over what Elon says is a betrayal of the ChatGPT maker’s founding aim of benefiting humanity rather than pursuing profits.
- Musk says that when he bankrolled OpenAI’s creation, he secured an agreement with Altman and president Greg Brockman to keep the AI company a non-profit that would develop technology for the benefit of the public.
- However, by embracing a close relationship with Microsoft, OpenAI and its top executives have set the pact aflame and are perverting the company’s mission.
- The lawsuit states, “OpenAI, Inc. has been transformed into a closed-source de facto subsidiary of the largest technology company in the world: Microsoft.”
- It goes on further to say they are refining the AGI to maximize profits for MS, rather than to benefit humanity.
51:07 OpenAI and Elon Musk
- OpenAI has clapped back with a blog post. Surprise, surprise.
- They also intend to get all of Elon’s claims dismissed, while sharing what they have learned in pursuing their mission and ensuring its benefits reach humanity.
- #1 They realized building AGI will require far more resources than they initially imagined.
- In late 2015, Greg and Sam initially wanted to raise $100M; Elon said they needed a bigger number, as $100M sounded hopeless, and that they should start with a $1B funding commitment, with him covering whatever they couldn’t raise.
- As they progressed through 2017, they figured out they would need a massive amount of compute, and way more capital to succeed at their mission: billions of dollars per year, which was far more than they or Elon thought they would be able to raise as a non-profit.
- Elon and OpenAI recognized a for-profit entity would be necessary to acquire those resources.
- As they discussed the for-profit entity, Elon wanted to merge OpenAI with Tesla, or he wanted full control. Elon left OpenAI, saying there needed to be a relevant competitor to Google/DeepMind, that he was going to do it himself, and that he would be supportive of OpenAI finding its own path.
- Elon pulled funding in the middle of these discussions, and Reid Hoffman bridged the funding gap to cover salaries and operations. Elon believed their probability of success was zero and that he would build an AGI competitor within Tesla.
- OpenAI couldn’t agree on for-profit terms with Elon because they felt it was against the mission for any individual to have absolute control over OpenAI.
- He sent an email wishing them the best of luck in finding their own path.
- OpenAI says it advanced its mission by building widely available, beneficial tools.
- OpenAI is making its tools broadly usable in ways that empower people and improve their daily lives, including via open-source contributions.
- They provide free access as well as paid offerings, and highlight several uses of OpenAI systems for the greater good: Albania using OpenAI tools to accelerate its EU accession by 5.5 years; Digital Green helping boost farmer incomes in Kenya and India by dropping the cost of agricultural extension services 100x by building on OpenAI; Lifespan, a health provider in Rhode Island, using GPT-4 to simplify its surgical consent forms from a college reading level to a 6th-grade one; and Iceland using GPT-4 to preserve the Icelandic language.
- They always intended the “Open” in OpenAI to be about the benefits of AI after it was built, not necessarily about sharing the science… to which Elon had previously replied, “Yup.”
- And so not only did they write this blog post.. But they published the FULL email exchanges between Sam, Greg and Elon. Spilling the proverbial tea, as it were.
54:03 Matthew – “…this is why legal departments don’t like 10-year-old emails. You know, you put it in writing, you have to expect it to be used against you in court at some point, or in the court of public opinion, in this case.”
Closing
And that is the week in the cloud! Just a reminder – if you’re interested in joining us as a sponsor, let us know! Check out our website, the home of the Cloud Pod, where you can join our newsletter and Slack team, send feedback, or ask questions at theCloudPod.net, or tweet at us with the hashtag #theCloudPod.