Chatbot with Your Data

ChatGPT + Retrieval Augmented Generation

Posted on July 15, 2023 · 5 mins read
  • How do you use LLMs to create chatbots to answer questions for your domain?
  • How do you keep the knowledge up-to-date after the model has already been trained?
  • How do you ensure generated answers are accurate and avoid hallucinations?

Those are some common questions I’ve seen floating around. Unless you’ve been living under a rock, you’re probably aware that ChatGPT, the large language model (LLM) from OpenAI released last November, has taken the world by storm. It’s created excitement but also raised alarm as powerful AIs on the horizon no longer seems as far-fetched science fiction. As such, the group that I’ve been working with to curate quality answers to AI Safety and Alignment questions has been getting an influx of attention and volunteers. In fact, the entire field has seen a flurry of activity.

AI Safety & Alignment Chatbot

Several months ago bounty was placed on LessWrong for a conversational agent educating people on the different approaches to AI Safety and Alignment. Since I’ve been swamped with other projects, I only had a couple days to put together a quick prototype. Fortunately, many of the components needed for the project are related to previous work I’ve already done with extractive question answering:

  1. Processing, chunking, and embedding the alignment research dataset
  2. Using semantic search to retrieve relevant chunks
  3. Answering the questions based on the chunk

Chatbot with ChatGPT

Extracting an answer from a chunk of text is slow and the short few-word answer always felt unsatisfying. On March 1, OpenAI made ChatGPT available to programmers using the gpt-3.5-turbo model through their cloud based API.

# TODO code for chat completion, model, temperature, context, chat history messages
import openai

openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is AI Safety?"},
    ]
)

Collaboration

We’ve been fortunate collaborate with other teams who also submitted prototypes and are dedicated to scaling AI safety and alignment education. Developers from the Stampy team also made incredible contributions to clean up our underlying dataset, which serves as the knowledge base, and also ensure we had proper infrastructure resources to get deployed. It’s been great learning from everyone and leveraging the best aspects from each project. Some of the notable features from other projects include:

  • Detailed citations of sources with a streaming GUI
  • Using langchain to rephrase standalone improving conversational capabilities
  • Suggesting followup questions to keep the conversation flowing

There are definitely other areas of improvement we’re still working on:

  • Cleaning up the underlying dataset
  • Prompt engineering to ensure accuracy and appropriate tone
  • More thorough benchmarked testing, UX research, UI design
  • Customizing answer generation options based on user’s level and needs
  • Experimenting with other document splitting, vector DBs, embeddings & retrieval, LLMs

Together look forward to building a better product that can help answer questions based on the rapidly growing body of alignment research literature. If you’re interested in helping, please join us on Rob Mile’s Discord. Feel free to see our current progress on GitHub for the chatbot and the dataset.

As I’ve been conducting additional research, I’ll continue adding more resources on LangChain, prompt engineering, and fine-tuning a SBERT model below.

Prompt Engineering

As LLMs are getting better at understanding language and generalizing knowledge, the need to train or fine-tune models may be replaced with a need to understand how to interact or prompt LLMs to yield the best results. Below are some resources and guidelines for doing so:

LangChain

LangChain is a valuable library that makes it easier to work with LLM from ingesting data by splitting up text and offering abstractions for different vectorstores, embedding models, prompt templates and text generation LLMs. It also handles conversational memory and chaining multiple calls to the LLMs, treating them like agents:

General Resources

Cover Photo by Mike Lewis HeadSmart Media on Unsplash