A Primer on Large Language Models
Nov 20, 2023 · 5 mins read
If you are a startup founder who’s not necessarily technical, staying on top of tech trends might feel overwhelming. And in this constantly evolving landscape, a basic understanding of AI technologies, like Large Language Models (LLMs), can make all the difference. Becoming more familiar with such concepts, and getting to know their structure and functionalities, can serve as a powerful tool in your entrepreneurial journey and open your business to new opportunities.
This post offers a concise yet comprehensive look at LLMs, their functionalities, learning structures, and concepts like tokens, context windows, and prompt engineering, enabling more informed discussions around potential AI integrations.
So, let’s start building that foundation today, by first understanding what Large Language Models are, and how they form the backbone of current AI technologies.
Understanding LLMs
Just as the brain is the central organ of the human nervous system, large language models are the artificial neural networks at the heart of current AI tech. These models learn from massive collections of text, covering a wide range of topics and diverse types of content. A good part of their training involves learning conversational patterns from human interactions on platforms like Reddit. The goal is to help these AI models approximate human conversation as closely as possible.
The Basic Functionality of AI Models
At their core, here’s how LLMs operate: the model takes an input, a question or some text that you provide, and predicts the next tokens. A token is a key AI concept: it is the smallest unit of text the model works with, often a whole word or a piece of one. As an illustration, a word such as “impactful” might be split into two tokens, “impact” and “ful”. The mapping from words to tokens is not one-to-one, and it varies from model to model, since each model uses its own tokeniser.
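To see tokenisation in action, here is a minimal sketch using the tiktoken library. This is an assumption for illustration: each model ships its own tokeniser, so the exact split you get may differ from the “impact” + “ful” example above.

```python
# A minimal sketch of tokenisation using the tiktoken library (assumed here;
# other models use their own tokenisers, so the exact split may differ).
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # an encoding used by several OpenAI models

tokens = encoding.encode("impactful")
print(tokens)                                    # a short list of integer token IDs
print([encoding.decode([t]) for t in tokens])    # roughly ['impact', 'ful']
```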
Based on the input tokens, the AI model predicts the most likely token to follow the sequence so far. This process of next-token prediction is known as “inference”.
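To make “inference” concrete, here is a toy sketch. This is not how any production model is implemented: a real model scores tens of thousands of candidate tokens, while the tiny vocabulary and probabilities below are invented for illustration.

```python
# Toy illustration of next-token prediction (inference). The input sequence is
# "The cat sat on the"; the model assigns a probability to each candidate token.
candidate_probabilities = {
    " mat": 0.62,
    " sofa": 0.21,
    " roof": 0.09,
    " moon": 0.08,
}

# Pick the most likely next token, append it to the sequence, then repeat.
next_token = max(candidate_probabilities, key=candidate_probabilities.get)
print(next_token)  # " mat"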
The Concept of the Context Window in AI
Another concept in the AI lexicon is the “context window”. This is the maximum number of tokens an AI model can process at once. Context windows typically span 2,000 to 4,000 tokens, but some models offer much larger windows of up to 100,000 tokens. This idea matters most in extended conversations: as the conversation grows, the context window fills up, older tokens get trimmed off, and the model effectively forgets the earliest messages. It’s worth keeping this in mind when sending a prompt to the model.
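Here is a minimal sketch of how a chat application might trim old messages to stay inside the context window. The token-counting helper and the 4,000-token limit are assumptions for illustration; a real application would count tokens with the model’s own tokeniser.

```python
# Sketch: drop the oldest messages until the conversation fits the context window.
def count_tokens(message: str) -> int:
    # Crude stand-in for a real tokeniser, good enough for the sketch.
    return len(message.split())

def trim_to_context_window(messages: list[str], max_tokens: int = 4000) -> list[str]:
    trimmed = list(messages)
    while trimmed and sum(count_tokens(m) for m in trimmed) > max_tokens:
        trimmed.pop(0)  # the model "forgets" the earliest message first
    return trimmed
```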
Delving into Prompt Engineering
Another intriguing aspect of AI is “prompt engineering”: a set of methods focused on shaping the prompt, which is the input you provide to the AI model. It matters because the prompt fed into the model dramatically changes the probability of the tokens produced during inference.
Simple adjustments to your prompt, such as instructing the model to “act as my assistant” or to respond in a designated language, can drastically change the generation probabilities. Through these customisations, the AI model can better align its responses towards desired outcomes.
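Here is a short sketch of what such an adjustment looks like in practice, assuming the OpenAI Python client; the model name and the exact wording of the instructions are illustrative, not prescriptive.

```python
# Sketch of sending a prompt with an "act as..." instruction, assuming the
# OpenAI Python client. The model name is illustrative.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Act as my assistant and reply in French."},
        {"role": "user", "content": "Summarise our launch plan in three bullet points."},
    ],
)
print(response.choices[0].message.content)
```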
Three prominent methods used in prompt engineering are context building, embeddings, and system prompts:
- Context Building: This technique provides the model with the relevant or necessary information it needs to give an appropriate response. The added context is crucial because an AI model’s knowledge stops at the date its training data was collected, and it isn’t privy to any information beyond that.
- Embeddings: This more advanced method is employed when dealing with large volumes of data that wouldn’t otherwise fit into the context window. Embeddings convert text into vectors, lists of numbers that capture meaning. These vectors are used as references to find content similar to a user’s query, and only the most similar content is then added to the context for the model to draw on during inference (see the sketch after this list).
- System Prompts: This technique is one of the most basic yet influential ways to manipulate the AI’s response probabilities. We can tell the model to “act” as something specific, influencing its behaviour. For example, when using ChatGPT, the model begins with a default command that asks it to follow user instructions carefully. However, users can customise this instruction, allowing the model to assume a distinct role for the conversation, for example, to be an AI career coach, or to impersonate a character.
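The embeddings idea becomes clearer with a small sketch. The 3-dimensional vectors below are invented for illustration (real embeddings come from a model and have hundreds or thousands of dimensions), but ranking chunks by similarity to the query works the same way.

```python
# Sketch of embedding-based retrieval: rank text chunks by cosine similarity
# to the query's embedding, then add only the best match to the model's context.
import numpy as np

document_chunks = {
    "Our refund policy allows returns within 30 days.": np.array([0.9, 0.1, 0.0]),
    "The office is closed on public holidays.":         np.array([0.1, 0.8, 0.1]),
}
query_vector = np.array([0.85, 0.15, 0.05])  # pretend embedding of "How do refunds work?"

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

best_chunk = max(document_chunks, key=lambda text: cosine_similarity(document_chunks[text], query_vector))
print(best_chunk)  # only this chunk is added to the prompt as context
```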
Understanding these underlying structures and concepts helps us not only use AI more effectively but also gain a deeper appreciation of the technology and its capabilities. Through learning and experimenting, we can shape these models into tools that augment products in unprecedented ways.