By Ram Menon, CEO & Co-founder, Avaamo
What powerful, foundational language models mean to consumers, enterprises, and the world at large.
The release of ChatGPT by OpenAI only a couple of months ago introduced the power of AI to the general public. Briefly, ChatGPT is a chatbot that can follow human instructions for tasks such as writing essays and poems or explaining and debugging code. It displays impressive reasoning capabilities and performs significantly better than prior language models.
Large Language Models: What are they? Why do they matter?
Large Language Models (LLMs) are designed to process and understand natural language. They are typically trained on large amounts of text data, allowing them to accurately analyze and generate human-like text. Models such as PaLM, LaMDA, GPT-3, and the model behind ChatGPT have achieved state-of-the-art performance on a variety of natural language processing tasks. They are typically trained using unsupervised learning: rather than being explicitly given the correct output for each input, they must learn to generate reasonable outputs from the input data itself.
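The core idea of learning to predict text from text alone can be illustrated with a deliberately tiny sketch: a bigram model that counts which word follows which. Real LLMs use neural networks over billions of parameters, but the objective (predict the next token, with no human-labeled answers) is analogous. The function names and toy corpus below are illustrative, not from any real system.

```python
from collections import defaultdict

def train_bigram_model(corpus):
    """Count next-word frequencies: a miniature analogue of the
    unsupervised next-token objective LLMs are trained on."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(model, word):
    """Return the most frequently seen token after `word`, or None."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return max(followers, key=followers.get)

corpus = [
    "large language models generate text",
    "language models generate human-like text",
]
model = train_bigram_model(corpus)
print(predict_next(model, "models"))  # "generate"
```

No one labeled "generate" as the right answer; the model inferred it from raw text, which is the sense in which such training is unsupervised.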
The human experience: LLMs have demonstrated impressive capabilities in generating human-like text, producing output that is difficult to distinguish from text written by humans.
Change in consumer behavior: Gen Z and millennials prefer communication using natural language rather than the “Press 1 to continue” style of communication.
A large language model having absorbed the entire world
The future of LLMs:
Many LLMs are throwing their hats into the ring. Theoretically, the question boils down to compute: how much money are you willing to spend training billions or trillions of parameters? In a few years, it is arguable that if LLMs are trained on “enough data sets,” all of them could zero in on similar answers, especially if the content is open-source internet content, since there is then no differentiation in the data-set corpus. Is the future state a series of commoditized foundation models, all of which can theoretically perform at almost the same level?
The Legal and Governance Structure
The concept of training on data and images available on the internet has shaken the foundations of “fair use” clauses that form the underpinnings of modern copyright law. It’s a very murky area still being debated.
Will such technologies be locked behind corporate walls? For-profit entities have explicit incentives to downplay risks and discourage security “probing.” Or will alternative governance structures, such as open-source and not-for-profit entities, prove more popular? What will GOOG do vs. Nvidia vs. OpenAI? Will we see a series of open-source foundational models delivering ChatGPT-level capabilities, available for developers to fine-tune, quickly commoditizing the LLM market?
Consuming LLMs
For consumers, there will be a mind-boggling array of tools and applications for writing blog posts, term papers, and more. Some will be for entertainment, but real-world uses will proliferate, where questions of accuracy, toxicity, and bias will trump ease of use.
Entrepreneurs will have to figure out how to use LLMs as foundation models to build useful applications that can make money, while managing the litigation that may arise from fair-use lawsuits.
For enterprises, the tendency of LLMs to hallucinate, their inability to do anything transactional, the cost of retraining, the compute cost of inference across 175 billion parameters, and the danger of framing an answer in a way that would violate compliance (financial advice, company policy, or medical advice) all add up. Even if you manage to reduce each of those risks, compliance requirements in most industries would still shut you down if LLMs were used in their raw form today.
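A back-of-the-envelope calculation shows why inference at that scale is expensive. The numbers below are assumptions for illustration (16-bit weights, 80 GiB accelerators), not measurements of any specific deployment:

```python
import math

# Assumption: a 175B-parameter model stored as fp16 (2 bytes per parameter).
params = 175e9
bytes_per_param = 2
weight_bytes = params * bytes_per_param

gib = weight_bytes / 2**30
print(f"weights alone: {gib:.0f} GiB")  # ~326 GiB

# Assuming 80 GiB accelerators, the weights alone span several devices,
# before counting activations, KV caches, or redundancy for throughput.
gpus = math.ceil(gib / 80)
print(f"minimum 80 GiB GPUs just to hold the weights: {gpus}")
```

Every query pays for that multi-device footprint, which is why raw per-inference cost is a first-order concern for enterprise deployment.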
Commercial solutions now have a clear way forward: like a gourmet chef, they can sample the various foundation LLMs as they develop and appear, pick the one best suited to an enterprise customer's use case, and fine-tune it on downstream tasks like customer support or employee support. Combining task automation with foundational LLMs brings exciting new self-serve capabilities to the enterprise.
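The "pick and choose" step above can be sketched as a simple evaluation harness: score each candidate model against a small labeled set from the target task and keep the winner. The model stubs and eval questions here are hypothetical stand-ins, not real LLM calls:

```python
def pick_best_model(candidates, eval_set):
    """Score each candidate model on a downstream eval set and return
    (best_name, scores). `candidates` maps names to answer functions."""
    def accuracy(answer_fn):
        correct = sum(answer_fn(q) == a for q, a in eval_set)
        return correct / len(eval_set)
    scores = {name: accuracy(fn) for name, fn in candidates.items()}
    return max(scores, key=scores.get), scores

# Hypothetical stand-ins for two foundation models on a support task.
eval_set = [("reset password?", "use the self-service portal"),
            ("office hours?", "9am to 5pm")]
model_a = lambda q: "use the self-service portal"  # always the same answer
model_b = lambda q: {"reset password?": "use the self-service portal",
                     "office hours?": "9am to 5pm"}.get(q, "")

best, scores = pick_best_model({"model_a": model_a, "model_b": model_b},
                               eval_set)
print(best)  # "model_b"
```

In practice the scoring would use task-specific metrics and much larger eval sets, but the selection loop stays the same shape.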
Commercial solutions like Avaamo are cherry-picking various LLMs available to build new generative AI solutions to make enterprise deployments cheaper, easier, and faster.
In conclusion
LLMs have many uses; they are amazing, offer a glimpse of the future, and deserve our deference. However, more transparent, secure, explainable, and controllable architectures that can be throttled and filtered are needed to scale in an enterprise context.
All images in this post were created using Midjourney.