Teaching AI Assistants New Tricks

The Promise and Pitfalls of Conversational AI

Conversational AI is becoming increasingly popular in consumer and enterprise applications, with chatbots providing capabilities like customer service, lead generation and workflow automation. However, even advanced large language models (LLMs) like GPT-4 have limitations when it comes to customisation for specific business needs. This opens up an opportunity for innovative startups building products to solve domain-specific problems.

A Practical Example

In one exploratory experiment, we fine-tuned a small conversational model on a dataset of blog posts from our website, with the goal of validating the concept of a copywriter assistant that learns our company's tone and voice. After continued pre-training on this specialised dataset, we observed traces of our language seeping into the responses from the fine-tuned model. While further fine-tuning would be needed to fully instil that tone and voice, this early finding highlights the potential to nudge a model's behaviour in directions aligned with business domains and written conventions.
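As a rough illustration of how blog posts might be prepared for this kind of continued pre-training, the sketch below splits each post into fixed-size word chunks and writes one JSON object per line. It is not the pipeline used in the experiment: the `chunk_post` and `to_jsonl` helpers, the 200-word chunk size and the `"text"` field name are all assumptions chosen for the example, and the sample posts are placeholders.

```python
import json


def chunk_post(text, max_words=200):
    """Split one post into chunks of at most max_words words (assumed size)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def to_jsonl(posts, max_words=200):
    """Serialise every chunk as one JSON line with a hypothetical "text" field."""
    lines = []
    for post in posts:
        for chunk in chunk_post(post, max_words):
            lines.append(json.dumps({"text": chunk}))
    return "\n".join(lines)


# Placeholder posts standing in for real blog content.
posts = ["word " * 450, "short post about our product"]
jsonl = to_jsonl(posts, max_words=200)
```

Most training toolkits expect their own input format, so the field name and chunking strategy would need to be adapted to whichever one is used.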

Strategic Fine-tuning for Specialised Chatbots

By fine-tuning existing foundation models on specialised data, startups can adapt models for differentiated functionality aligned with their products. Our experiments show that smaller LLMs (with fewer parameters) make better candidates for learning efficiently from small datasets. Models with over 10 billion parameters need extensive compute resources and large datasets before fine-tuning has a noticeable effect on their responses.

It’s also important to match text lengths between training examples and desired chatbot responses. Short text exchanges are best for fine-tuning simple responses while longer contextual examples are necessary for multi-turn conversations. Fine-tuning is most impactful when aiming to adapt the linguistic style, formatting, vocabulary and domain-specific terminology rather than just updating facts and data.
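One simple way to act on the length-matching advice is to filter training examples by word count before fine-tuning. The sketch below is a minimal example under stated assumptions: the `filter_by_length` helper, the target of 10 words and the 50% tolerance are illustrative choices, not values from our experiments, and word count is only a rough proxy for the token count a model actually sees.

```python
def word_count(text):
    """Count whitespace-separated words as a rough proxy for length."""
    return len(text.split())


def filter_by_length(examples, target_words, tolerance=0.5):
    """Keep examples whose length is within +/- tolerance of target_words."""
    low = target_words * (1 - tolerance)
    high = target_words * (1 + tolerance)
    return [ex for ex in examples if low <= word_count(ex) <= high]


# Placeholder examples: too short, about right, and far too long.
examples = [
    "Thanks for reaching out!",
    "Our team will reply within one business day with next steps.",
    "A much longer passage " + "with extra words " * 40,
]

# Only the 11-word example falls inside the 5 to 15 word window.
kept = filter_by_length(examples, target_words=10, tolerance=0.5)
```

For multi-turn conversations, the same check could be applied to whole exchanges rather than single messages, with a correspondingly larger target.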

Agile Editing Frameworks

Emerging alternative methods aim to streamline the model editing process to make it even more efficient. These methods allow AI developers to inject fresh or customised knowledge into LLMs for specific domains in a fraction of the time that traditional fine-tuning takes. This type of agile editing to adapt conversational abilities or domain expertise shows major promise for further advancing customised chatbots.

Realising the Potential

As this technology continues to mature, there is ample room for innovation by startups looking to stand out with uniquely tailored conversational AI. By taking an informed and strategic approach, startups have tremendous potential to deploy conversational agents with capabilities that exceed industry norms. Unique and highly customised chatbots that leverage the latest methods for adapting large language models will delight users in consumer and business applications alike.