War of the Machines: In the great battle of LLMs, what side should enterprises be on?
By Ram Menon, CEO and Co-founder of Avaamo
It took about three months for ChatGPT to become a household phrase, eliciting media headlines, one-liners from comedians, and tweets from average tech users about the latest magical task they managed to accomplish with it.
Suddenly, “AI” is real. Thousands of enthusiastic self-publishing authors are holding forth in LinkedIn posts, Reddit forums, and YouTube tutorials on exciting new ways to use the technology. What was once an obscure computer-science term, LLM (large language model), has entered the lexicon of the masses. For a deep dive into LLMs, I recommend setting time aside to read this blog post from Stephen Wolfram, creator of Wolfram Alpha, which goes under the bonnet of large language models to take a peek at their inner workings.
The War is deadly serious.
What is less well understood is that this has kicked off a multi-year war between tech companies, and it is deadly serious. Old players, new players, pretenders, VCs, and founders are all jostling for position with increasingly strident “announcements” that scream for attention in the new ChatGPT landscape. The knives are drawn, and no quarter is given.
Satya Nadella of Microsoft defined it well when he gleefully announced, “I want people to know that we made them dance,” referring to Google. The media landscape has been chock-full of hasty announcements, tall claims, and flubbed demo events.
Google is the 800-pound gorilla in search. I want people to know that we made them dance.
- Satya Nadella, CEO of Microsoft
Companies have persisted, knowing that this is the first inning of a war that will be fought on every front of software. In tech’s first innings, #mindshare is #marketshare. This war is coming to search, browsers, operating systems, and applications.
The LLM traffic jam
The market is ablaze with a confusing array of chest-thumping claims and model releases from all over the globe. Screaming media headlines follow yet another announcement, and the cycle starts all over again the next week. Let’s take a quick look at the roster:
Microsoft
Certainly in the #mindshare pole position with its OpenAI investment, Bing search, and its recent announcement that it is incorporating ChatGPT into all its applications.
Google
Google arguably has the largest library of LLMs, including LaMDA, PaLM, Imagen, MusicLM, and DeepMind’s Chinchilla. However, it has chosen to release a more demure chatbot named Bard, based on LaMDA, and followed it up quickly with a Microsoft-style copycat announcement of adding generative AI to its work apps.
Meta
Meta’s contribution has been more subdued, including OPT, Sphere, and Galactica, but its new language model LLaMA made news when it was leaked on 4chan a week after release.
Hugging Face
Hugging Face, ostensibly founded with the intention of breaking big tech’s stranglehold on LLMs, released BLOOM. The company announced a partnership with Amazon Web Services (AWS) that makes Hugging Face’s products available to AWS customers as building blocks for their custom applications.
Baidu
Baidu developed the in-house models ERNIE and Plato but flubbed its demo event, causing the stock to drop 10%. It’s out there now, and Alibaba and JD have announced similar projects.
AI21 Labs
A startup in Israel that plans to release an LLM called Jurassic-1 Jumbo, which contains 178 billion parameters, 3 billion more than GPT-3. In machine learning, the correlation between the number of parameters and sophistication has held up remarkably well. I guess we will have to see…
Other Research Projects
Megatron-Turing from Nvidia has been around for a while, released as a research project rather than a commercial product, followed by the recent announcement of FALCON from Abu Dhabi.
And the list goes on, with several startups entering the fray after fattening their coffers with hundreds of millions in new funding, including Adept.
So what are enterprises supposed to do?
Corporate IT
Realize that implementing large language models in their raw form is a tall order, despite calls from in-house developers: the cost of compute can be extraordinary, and the complex dev environment and the intricacies of competing architectures make this a daunting task. Counseling developers to build initial “business use cases” that can be tested for usability, accuracy, and security is a good first step.
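To make that first step concrete, here is a minimal sketch of what such a narrowly scoped test might look like, assuming Python and the open-source Hugging Face transformers library; the model choice, ticket texts, and labels are hypothetical placeholders rather than recommendations.

from transformers import pipeline

# A narrow, testable "business use case": classify the intent of a support ticket.
# Assumption: the publicly available "facebook/bart-large-mnli" model is a stand-in
# for whatever model your team ultimately evaluates.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

tickets = [
    "I was charged twice for my subscription this month.",
    "How do I reset my password?",
]
labels = ["billing", "account access", "product defect"]

for ticket in tickets:
    result = classifier(ticket, candidate_labels=labels)
    # Review the top label and its score against examples with known answers;
    # this is where usability and accuracy get judged before anything touches production.
    print(ticket, "->", result["labels"][0], round(result["scores"][0], 2))

Even a toy harness like this forces the team to define what “accurate” means for the use case before any larger investment is made.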
Enterprise CXO/CIO
In this ensuing chaos, enterprise decision makers are being inundated with demands for ChatGPT-style functionality. It’s time to cool your jets: wait for the fracas to settle down and learn what ChatGPT-style technology can and cannot do. Bring your security and compliance teams into the mix to decide what’s possible, and talk to the vendors and partners who are embracing the technology by building toolsets so it can be used safely in an enterprise environment.
ChatGPT itself agrees with me: