LLMs in 2024: A Year of Integration and Innovation (And What’s Next)

The world of Large Language Models (LLMs) has seen a fascinating shift in 2024. We’ve moved beyond the initial awe of their language prowess and into an era where these models are becoming genuinely useful tools. This year wasn’t just about making models bigger or better at generating text; it was about integrating them into our workflows, making them more accessible, and pushing the boundaries of what they can do.

From Understanding to Action

One of the most significant trends of 2024 has been the evolution of LLMs from text generators to active decision-makers. Models are no longer just predicting the next word; they’re being equipped to reason, plan and interact with the outside world.

OpenAI’s “o1” is a prime example of this shift. The model is trained to work through a multi-step chain of reasoning before it answers, which lets it “think through” a problem and tends to produce more accurate results on complex tasks. Similarly, Anthropic’s introduction of the Model Context Protocol (MCP), an open standard, has paved the way for LLMs to interact with external tools and data sources in a standardised way, simplifying the creation of autonomous systems.
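To make the idea of a standardised tool interface more concrete, here is a minimal sketch of a tool call in the JSON-RPC 2.0 style that MCP builds on. It is loosely modelled on the protocol's "tools/call" request and is not the official MCP SDK; the get_weather tool, its arguments and the handle_request dispatcher are hypothetical names used only for illustration.

```python
import json

# Hypothetical tool: the name and behaviour are illustrative assumptions.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Hypothetical registry mapping tool names to local functions.
TOOLS = {"get_weather": get_weather}


def handle_request(raw: str) -> str:
    """Dispatch a JSON-RPC 2.0 style 'tools/call' request to a local function."""
    req = json.loads(raw)
    base = {"jsonrpc": "2.0", "id": req.get("id")}

    if req.get("method") != "tools/call":
        # -32601 is the standard JSON-RPC "method not found" code.
        return json.dumps({**base, "error": {"code": -32601, "message": "Method not found"}})

    params = req.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        # -32602 is the standard JSON-RPC "invalid params" code.
        return json.dumps({**base, "error": {"code": -32602, "message": "Unknown tool"}})

    text = tool(**params.get("arguments", {}))
    # The result is wrapped as a list of typed content items.
    return json.dumps({**base, "result": {"content": [{"type": "text", "text": text}]}})


request = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Oslo"}},
})
print(handle_request(request))
```

The point of the sketch is that the model only needs to emit a structured request with a tool name and arguments; the host decides how that request maps onto real systems.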

But perhaps the most exciting development in this area is the rise of Agent systems. Tools like Aider and OpenHands are demonstrating how LLMs, when combined with well-crafted prompts and access to sandboxed environments, can become autonomous agents capable of writing and executing code. This opens up a world of possibilities for automating complex tasks and streamlining development workflows.
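The loop behind such agents can be sketched in a few lines. This is a simplified illustration and not how Aider or OpenHands are actually implemented: fake_model is a hypothetical stand-in for a real LLM call, and run_in_sandbox uses a plain subprocess in a temporary directory, which is an assumption made for brevity and does not provide real isolation.

```python
import subprocess
import sys
import tempfile


def run_in_sandbox(code: str, timeout: float = 5.0) -> str:
    """Run code in a separate Python process inside a throwaway directory.

    NOTE: a subprocess is only a stand-in here; a real agent would use a
    container or VM for isolation.
    """
    with tempfile.TemporaryDirectory() as workdir:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    return proc.stdout + proc.stderr


def fake_model(history: list[str]) -> str:
    """Hypothetical stand-in for an LLM: proposes code once, then stops."""
    if len(history) == 1:
        return "print(sum(range(10)))"
    return "DONE"


def agent_loop(task: str, max_steps: int = 5) -> list[str]:
    """Alternate between asking the model for an action and executing it."""
    history = [task]
    for _ in range(max_steps):
        action = fake_model(history)
        if action == "DONE":
            break
        observation = run_in_sandbox(action)
        history.append(observation)  # feed the result back to the model
    return history


if __name__ == "__main__":
    print(agent_loop("Add the numbers 0 to 9"))
```

The essential design choice is the feedback step: the output of each execution is appended to the history, so the model can react to errors or results on its next turn.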

Breaking the Text Barrier

2024 also saw significant strides in multimodal capabilities. While many providers already offered models that could process multimodal inputs, the ability to generate multimodal outputs remained a challenge. This year, both OpenAI and Google (with Gemini) have released models that can output not just text, but also images and audio.

This move towards any-to-any conversion is a game-changer. It allows for more natural and intuitive interactions with LLMs and opens up new avenues for creative expression and problem-solving. We’re also seeing exciting experiments in the open-source community to develop fully any-to-any models, further democratising access to these advanced capabilities.

The Open Source vs. Closed Source Dynamic

The interplay between open-source and closed-source models continues to be a defining feature of the LLM landscape. While closed-source models, backed by substantial resources, continue to push the boundaries of what’s possible, open-source models are rapidly catching up.

In fact, we’re seeing open-source models with under 30 billion parameters achieving performance comparable to models with over 70 billion. This is thanks to advancements in pre-training data quality and optimisation of training techniques. Many open-source models primarily use instruction tuning, while closed-source models often employ Reinforcement Learning from Human Feedback (RLHF), which may contribute to their performance edge. However, the rapid development of open-source models suggests this gap may narrow, and new techniques beyond RLHF could emerge. The implication is clear: open-source models are becoming viable alternatives for a wide range of tasks, making advanced AI more accessible, local, private and ubiquitous.

Making Models More Accessible

Another key trend of 2024 has been the focus on making LLMs cheaper to run, in both compute and memory. This is being driven by breakthroughs in model compression and optimisation techniques.

Methods like BitNet and VPTQ are pushing quantisation down to around 1 to 2 bits per weight, significantly reducing the memory and compute required to run these models, while frameworks such as Apple’s MLX make it practical to run quantised models locally on Apple silicon. Additionally, neural attention memory models (NAMMs) learn which tokens to keep in the attention key-value cache, further reducing memory requirements during inference.
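As a rough illustration of what extreme low-bit quantisation means, here is a minimal sketch of ternary weight quantisation in the spirit of the absmean scheme described for BitNet b1.58: each weight is divided by the mean absolute value of the tensor, rounded, and clipped to -1, 0 or 1. The function names are our own, the example weights are made up, and real implementations operate on whole tensors during training rather than on Python lists.

```python
def absmean_ternary_quantise(weights: list[float], eps: float = 1e-8):
    """Map each weight to -1, 0 or 1 using a single per-tensor scale."""
    # Scale is the mean absolute value of the weights.
    scale = sum(abs(w) for w in weights) / len(weights)
    # Round to the nearest integer, then clip into the ternary range.
    quantised = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return quantised, scale


def dequantise(quantised: list[int], scale: float) -> list[float]:
    """Recover approximate weights from ternary values and the scale."""
    return [q * scale for q in quantised]


weights = [0.9, -0.05, 0.4, -1.2]  # made-up example values
q, scale = absmean_ternary_quantise(weights)
print(q)                     # ternary codes
print(dequantise(q, scale))  # coarse approximation of the originals
```

Only the ternary codes and one scale need to be stored, which is where the memory saving comes from; the price is that each weight is reconstructed only approximately.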

These advancements are crucial for democratising access to LLMs. They make it possible to run larger, more powerful models on smaller, less expensive hardware, opening up opportunities for wider adoption and innovation.
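A quick back-of-the-envelope calculation shows why this matters for hardware requirements. The sketch below estimates the memory needed for the weights alone at different bit widths; the 70-billion-parameter figure is just an example size, and the estimate deliberately ignores activations, the key-value cache and any per-block quantisation metadata.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


params = 70e9  # example model size, chosen for illustration
for bits in (16, 8, 4, 1.58):
    print(f"{bits:>5} bits per weight: {weight_memory_gb(params, bits):6.1f} GB")
```

At 16 bits per weight such a model needs roughly 140 GB for its weights, while at 4 bits the same weights fit in about 35 GB, which is the difference between a multi-GPU server and a single high-memory workstation.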

Looking Ahead to 2025

As we approach 2025, several key questions offer opportunities for innovation. We’re nearing the point where we’ve used most of the available human-written content for pre-training. This raises the question: what’s next? How do we continue to improve models when the data well starts to run dry?

The future of RLHF is also uncertain. While it has been instrumental in shaping model behaviour, it’s a costly and time-consuming process because it depends on collecting large volumes of human preference data. We’re likely to see exploration of alternative methods for fine-tuning and aligning models with human values.

While closed-source models may retain an edge in certain specialised, high-performance applications, the accessibility and rapid development of open-source models are making them increasingly attractive for many use cases. We can expect to see growing adoption of open-source models across various sectors, fuelled by collaborative efforts and an expanding community of developers.

In conclusion, 2024 has been a pivotal year for LLMs. We’ve seen them evolve from impressive text generators to practical tools capable of reasoning, interacting with the world and expressing themselves in multiple modalities. As we look ahead, the focus will likely shift towards refining these capabilities, exploring new training paradigms, and making these powerful technologies accessible to everyone.

They have the potential to reshape industries, redefine workflows, and ultimately, impact society in profound ways. As LLMs become more integrated into our lives, navigating the opportunities and challenges they present will be a defining task for the years to come.

We believe that collaboration is key to unlocking the full potential of this transformative technology, and we’re always eager to connect with others who share our passion for building a future where AI benefits all of humanity. If these advancements resonate with you, and you’re exploring how to leverage the power of LLMs, we’re keen to hear your perspective.