Grounding Generative AI
As more software companies start to build out their Generative AI features, their ability to ground the AI in their data and user context will be critical.
Imagine having access to an intelligent and knowledgeable personal assistant. This assistant is mostly accurate and factual. The problem is that for every five factual and accurate responses you receive, you will also get one response that seems correct but contains false information. Would you be able to trust the assistant with critical tasks?
This is the situation with today's cutting-edge Large Language Models. (A Large Language Model, or LLM, is a type of generative AI such as ChatGPT by OpenAI.)
This is called the 'hallucination' effect, where the LLM might invent a figure or other details within a response, and do so in a manner which instils confidence. Furthermore, if the model is pushed for sources or citations for the incorrect information, it might even reply with invented ones.
The research and techniques being developed to reduce this risk are often referred to as grounding and aligning the model.
One alignment technique is to fine-tune the model using Reinforcement Learning (RL), further training it on which replies are factual and which aren't. This works by providing positive or negative feedback on a response and using that feedback for reinforcement. The feedback can either be human-led (Reinforcement Learning from Human Feedback, or RLHF) or provided by other AI models acting as adversaries.
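To make the idea concrete, here is a minimal, hypothetical sketch of the pairwise preference loss commonly used to train a reward model from human feedback. The scores and the `preference_loss` helper are illustrative, not any lab's actual implementation:

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise (Bradley-Terry style) loss for reward modelling:
    the loss is small when the human-preferred response scores higher."""
    # -log(sigmoid(r_chosen - r_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected))))

# Hypothetical scores from a reward model for two candidate replies.
print(preference_loss(2.1, -0.3))  # preferred reply scored higher -> low loss
print(preference_loss(-0.3, 2.1))  # preferred reply scored lower -> high loss
```

The trained reward model then provides the positive or negative signal used to fine-tune the LLM with RL.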
As we start to see more applications built upon foundational AI models, we will also see an increase in the use of external datasets, articles, networks and databases to 'ground' the model in factual data and relevant user context. With this technique, the underlying model is not only using its internal pre-trained network weights to infer a response but is also able to access external data sources that give it a more relevant, narrow context.
Using Internet Search Results
An example of this would be an LLM grounded on traditional search results. The user's prompt to the LLM is converted into one or more internet search queries. The top results can then be parsed and fed back into the LLM in a subsequent prompt to give it context (in-context learning). The output from the LLM can then be verified and referenced against the results before being sent to the user.
A good example of this would be a user employing such a model to plan a holiday. The user asks the model to recommend hotels meeting specific requirements. The model generates a search query and uses it to perform a web search, returning a list of webpages containing information about matching hotels. The text from these web pages is parsed and sent back to the model using in-context learning (the information is included in the next prompt). The LLM then uses its pre-trained weights to select information that may be relevant to the user's original input. It can then summarise the results and allow the user to chat with it about them.
The actual web search results are therefore used for grounding: the hotel search context lets the model give factual responses based upon real hotels returned in the search. In addition, any citations and references inserted from the web search results can give the user additional confidence, as they can click through to view the external data used for grounding and compare it to the output of the model before booking the hotel.
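A minimal sketch of this search-grounding loop is shown below. The helper functions `web_search`, `fetch_text` and `call_llm` are hypothetical stand-ins for a real search API, an HTML parser and an LLM API; here they return canned data so the example is self-contained:

```python
def web_search(query: str, top_k: int = 3) -> list[str]:
    # Stand-in for a real search API.
    return ["https://example.com/hotel-guide"]

def fetch_text(url: str) -> str:
    # Stand-in for fetching and parsing a webpage.
    return "Hotel Aurora: 4-star, sea view, from 120 EUR/night."

def call_llm(prompt: str) -> str:
    # Stand-in for a real LLM API call.
    return "Based on https://example.com/hotel-guide, Hotel Aurora matches your requirements."

def ground_on_search(user_prompt: str) -> str:
    # 1. Convert the user's request into one or more search queries.
    queries = call_llm(f"Rewrite as web search queries, one per line:\n{user_prompt}").splitlines()
    # 2. Fetch and parse the top results for each query.
    snippets = [f"Source: {url}\n{fetch_text(url)}"
                for q in queries for url in web_search(q)]
    # 3. In-context learning: include the retrieved text in the next prompt
    #    and ask the model to answer only from those sources, with citations.
    return call_llm("Answer ONLY from these sources, citing the URLs:\n\n"
                    + "\n\n".join(snippets) + f"\n\nUser request: {user_prompt}")

print(ground_on_search("Recommend a 4-star hotel with a sea view"))
```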
Using External Knowledge Bases / Vector Databases
In addition to grounding using web search results, LLMs can be given access to data and context from any external knowledge base or database. This can be done by first retrieving the external data and including it in the prompt to the LLM, or by giving the LLM access to APIs and allowing it to query the data itself (both allow for in-context learning).
Vector databases are proving to be very powerful in this area because they allow the LLM to perform semantic search instead of keyword search. By storing text embeddings as vectors in a vector database, words, phrases and concepts that are related or similar are plotted close to each other. The query from the LLM is sent to the vector database, and the search returns the nearest neighbours to the query vector (as opposed to matching string keywords).
The data extracted from these knowledge bases and vector databases is then used to build better prompts and to ground the responses from the LLM.
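As a toy illustration of nearest-neighbour semantic search, the sketch below fakes the embedding step with fixed vectors; a real system would use an embedding model and a vector database, so `embed` and the vectors are purely illustrative:

```python
import math

FAKE_VECTORS = {
    "How do I reset my password?": [0.90, 0.10, 0.00],
    "Resetting a forgotten login password": [0.85, 0.20, 0.05],
    "Quarterly revenue figures": [0.00, 0.10, 0.95],
}

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model mapping text to a vector.
    return FAKE_VECTORS.get(text, [0.80, 0.15, 0.10])

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest_neighbours(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    # Semantic search: rank documents by vector similarity, not keyword overlap.
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:top_k]

# The query is phrased differently from the stored passages, but its vector
# sits close to the password-related ones, so they are retrieved first.
print(nearest_neighbours("I forgot my password", list(FAKE_VECTORS)))
```

The retrieved passages would then be included in the next prompt to the LLM.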
As companies build and deploy Generative AI features on top of their current SaaS offerings, we will see them use their existing APIs and platforms to ground the underlying AI models in this manner.
Using Model System Messages
In another example, OpenAI's GPT-4 API introduces a system message, which allows you to give the model context that helps ground and align it to the user's requirements.
For example, you could provide the following system message: 'You are a leading Physics Professor who is giving a lecture to PhD students and all responses must be factual and respect the laws of physics'.
This should increase the accuracy of the physics explanations in the responses. Other examples would be a system message instructing the model to act as a lawyer basing all replies on a specific legal jurisdiction, or as a developer using a certain code library or style.
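A minimal sketch of setting the system message, using the `openai` Python package's chat completion call (the API key is a placeholder):

```python
import openai  # pip install openai

openai.api_key = "YOUR_API_KEY"  # placeholder

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        # The system message grounds and aligns every subsequent reply.
        {"role": "system",
         "content": "You are a leading Physics Professor who is giving a "
                    "lecture to PhD students and all responses must be "
                    "factual and respect the laws of physics."},
        {"role": "user",
         "content": "Explain why nothing can travel faster than light."},
    ],
)
print(response["choices"][0]["message"]["content"])
```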
User Provided Context and Data Sources
Until now, the amount of context a user could provide to a model has been limited in terms of maximum input tokens. This is starting to change, with new models such as GPT-4 allowing for much longer input context.
Building on this capability, a very useful way to ground the model is to allow the user to provide documentation links and ask it to use the new data to ground itself and gain context. An example provided by OpenAI in the GPT-4 launch demo was of a developer using the model to output Python code based on specified requirements. Running the code resulted in an error message, so the user provided a link to the relevant technical documentation along with the error message, and the model was able to infer the correct bug fixes from the user-provided information.
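A hypothetical sketch of that documentation-grounded bug-fixing flow is below; `fetch_text` and `call_llm` again stand in for a URL fetcher and a long-context LLM API:

```python
def fetch_text(url: str) -> str:
    # Stand-in for fetching the documentation page at the given URL.
    return "Documentation text for the library in question..."

def call_llm(prompt: str) -> str:
    # Stand-in for a long-context LLM API call.
    return "Suggested fix based on the documentation..."

def fix_bug(code: str, error: str, docs_url: str) -> str:
    docs = fetch_text(docs_url)
    # With a long-context model, the full documentation can simply be
    # placed in the prompt alongside the failing code and the error.
    prompt = ("Fix the bug in this code, basing your fix ONLY on the "
              "documentation provided below.\n\n"
              f"### Documentation ({docs_url})\n{docs}\n\n"
              f"### Code\n{code}\n\n"
              f"### Error\n{error}")
    return call_llm(prompt)
```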
As context windows get longer and the model gains access to user- and company-specific data, the model's output will become more powerful.
Examples of this include:
A team of developers using Generative AI to code, able to input the entire codebase of their product to the model in real time for grounding and context.
A novelist using Generative AI to output the next paragraphs of their book, while allowing the model access to all the paragraphs already written for grounding and context.
Using Multimodal Capabilities
While this might not technically be a form of grounding, new multimodal models that accept both text and images as prompts provide new strategies to help the model to adhere to a user’s scope and context.
For example, a user could ask the model to suggest recipes for dinner. They could also provide a photo of the items in their pantry or fridge, and the model will use the context from the image to suggest only recipes based upon the ingredients available.
More sophisticated examples could involve users asking for repair instructions for an appliance or device, and inputting photos back to the model at each step to maintain context.
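A hypothetical sketch of such an image-grounded request, where `call_multimodal_llm` is a stand-in for any model endpoint that accepts both text and an image:

```python
import base64

def call_multimodal_llm(text: str, image_b64: str) -> str:
    # Stand-in for a multimodal model endpoint accepting text plus an image.
    return "You have eggs, spinach and cheese: try a spinach omelette."

def suggest_recipes(photo_path: str) -> str:
    with open(photo_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    # The image constrains the model's scope: only ingredients actually
    # visible in the photo should appear in the suggestions.
    return call_multimodal_llm(
        "Suggest dinner recipes using ONLY the ingredients visible in this photo.",
        image_b64,
    )
```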
Granting Access to External Tools
At some point, new models will have access to one or more external tools, which they could learn to use to verify, reference and ground their responses. For example, a model able to interact with the Google Maps API or a flight provider API might be more accurate when assisting a user to plan a holiday or business trip.
Currently, if you ask an LLM for driving directions along a route, it will comply and provide turn-by-turn directions. These are based upon its training dataset, which is only updated up to a fixed point in time. By having access to these external tools (and being trained to use them), the response could take into consideration road closures and traffic conditions. If the model also had access to the user's personal information via other third-party APIs, the route could be personalised accordingly (are electric charging points needed, etc.).
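A hypothetical sketch of such a tool-use loop is below: the model chooses a tool as structured output, the program executes it, and the live result grounds the final answer. `call_llm` and both tool functions are illustrative stand-ins, not a real maps or flight API:

```python
import json

def get_route(origin: str, destination: str) -> dict:
    # Stand-in for a live maps/routing API.
    return {"route": [origin, "A1 closed - diverted via A2", destination],
            "duration_minutes": 95}

def find_chargers(route: list) -> list:
    # Stand-in for a charging-point lookup API.
    return ["Charging point at A2 services"]

TOOLS = {"get_route": get_route, "find_chargers": find_chargers}

def call_llm(prompt: str) -> str:
    # Stand-in for an LLM trained to pick tools via structured output.
    return json.dumps({"tool": "get_route",
                       "args": {"origin": "Leeds", "destination": "London"}})

def assist(user_prompt: str) -> str:
    # 1. Ask the model which tool it needs and with what arguments.
    decision = json.loads(call_llm(
        f"Available tools: {list(TOOLS)}. Reply with JSON "
        f'{{"tool": ..., "args": ...}} for: {user_prompt}'))
    # 2. Execute the chosen tool with the model-supplied arguments.
    result = TOOLS[decision["tool"]](**decision["args"])
    # 3. Ground the final answer on the live tool output, not training data.
    return f"Grounded answer based on live data: {result}"

print(assist("Drive me from Leeds to London, avoiding closures"))
```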
In conclusion, as more software companies start to build out their Generative AI features on top of foundational AI models (such as GPT), their ability to ground the AI in their proprietary data and user requirements will determine whether they are successful.
Last updated on 11 May 2023