Retrieval-Augmented Generation

CS123, Intro to AI

Topics
Overview of AI
Neural networks and deep learning
AI Problem Solving Revisited
Machine Learning Part 1
Applications of AI
Generative AI + Prompt engineering
Machine Learning Part 2
RAG & Custom chatbot creation
History of AI + Midterm
Social and ethical issues of AI
Final

What is RAG?

Retrieval-Augmented Generation (RAG) is a technique that pairs a large language model (LLM) with an information retrieval system. The retrieval system fetches relevant information from external sources, and the LLM uses that information as context to generate more accurate and reliable responses.

Why Use it?

What problems does it solve?

How it works

Retrieval-Augmented Generation (RAG) combines generative AI with information retrieval to improve accuracy and reduce hallucinations (plausible-sounding but incorrect output). Here are the key processes involved:

[Figure: complete RAG architecture]
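The overall flow described in this section (retrieve, augment the prompt, generate) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the documents are invented, `retrieve` ranks by simple word overlap instead of the semantic or vector search covered in these notes, and `generate` is a placeholder where a real system would call an LLM.

```python
# Minimal RAG flow: retrieve relevant text, add it to the prompt, then generate.
# Every name and document here is a hypothetical stand-in, not a real library API.

DOCUMENTS = [
    "The library is open from 8 am to 9 pm on weekdays.",
    "Parking permits are sold at the campus security office.",
    "The cafeteria serves breakfast until 10:30 am.",
]


def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Toy retriever: rank documents by how many words they share with the question."""
    question_words = set(question.lower().replace("?", "").split())

    def overlap(doc: str) -> int:
        doc_words = set(doc.lower().replace(".", "").split())
        return len(question_words & doc_words)

    return sorted(documents, key=overlap, reverse=True)[:top_k]


def build_prompt(question: str, context: list[str]) -> str:
    """Augment the user's question with the retrieved passages."""
    context_block = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}"
    )


def generate(prompt: str) -> str:
    """Placeholder for an LLM call (assumption: a real system calls a model here)."""
    return f"[LLM response to a {len(prompt)}-character prompt]"


question = "When is the library open?"
context = retrieve(question, DOCUMENTS)   # step 1: retrieval
prompt = build_prompt(question, context)  # step 2: augmentation
answer = generate(prompt)                 # step 3: generation
```

The key point is that the LLM never has to "remember" the library hours. The retriever supplies them in the prompt at question time.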

 

Searching the Source Data

RAG systems can use semantic search, vector search, or both.

Semantic search goes beyond keyword matching to understand the meaning and intent behind a query. It uses techniques like text embeddings to interpret relationships between words and concepts.

Vector search uses mathematical representations of text (vectors) to find similar items based on how close they are in vector space. It is efficient for handling large datasets and finding contextually similar results.
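As a rough illustration of "numerical proximity," the sketch below ranks a few items by cosine similarity to a query vector. The three-dimensional vectors are made up by hand for this example. In a real system they would come from an embedding model and have many more dimensions, and large collections would typically use an approximate index instead of comparing against every item.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written vectors (not produced by any real model).
VECTORS = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "car": [0.1, 0.2, 0.95],
}


def nearest(query_vector: list[float], vectors: dict, top_k: int = 2) -> list[str]:
    """Brute-force vector search: score every item, return the closest top_k names."""
    scored = [(cosine_similarity(query_vector, v), name) for name, v in vectors.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


query = [0.88, 0.85, 0.1]  # made-up vector standing in for a query like "feline"
results = nearest(query, VECTORS)
```

Here the two animal vectors point in nearly the same direction as the query, so they are returned ahead of "car".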

Embeddings

Embeddings represent text as lists of numbers (vectors) that capture the meaning and context of words or phrases. Texts with similar meanings get vectors that are close together, so the relationships between words are preserved in a form that a model can compare.
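To make the idea concrete, here is a toy sketch of turning text into vectors. It assumes a tiny hand-written table of two-dimensional word vectors. Real embeddings are produced by a trained model and usually have hundreds of dimensions, so treat `WORD_VECTORS` and `embed` as hypothetical stand-ins.

```python
import math

# Made-up word vectors for illustration only.
# Dimension 0 loosely means "animal", dimension 1 loosely means "vehicle".
WORD_VECTORS = {
    "dog": [1.0, 0.0],
    "puppy": [0.8, 0.0],
    "car": [0.0, 1.0],
    "truck": [0.0, 0.8],
}


def embed(text: str) -> list[float]:
    """Embed a phrase by averaging the vectors of the words we know."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return [0.0, 0.0]  # no known words: fall back to the zero vector
    return [sum(component) / len(vectors) for component in zip(*vectors)]


pet = embed("my dog")
young_pet = embed("a small puppy")
vehicle = embed("red truck")

# Phrases with related meanings end up closer together than unrelated ones.
pet_to_young_pet = math.dist(pet, young_pet)
pet_to_vehicle = math.dist(pet, vehicle)
```

Even though "my dog" and "a small puppy" share no words, their vectors are close, while "red truck" lands far away.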

Embeddings Store

The embeddings store is a specialized database (often called a vector database) that stores and manages text embeddings. It enables semantic search by matching a query with relevant documents based on their embeddings rather than on exact keyword matches.
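A minimal in-memory version of such a store might look like the sketch below. The class name `EmbeddingStore`, its methods, and the hand-written vectors are all assumptions made for illustration. A production system would use a dedicated vector database and vectors from an embedding model.

```python
import math
from dataclasses import dataclass, field


@dataclass
class EmbeddingStore:
    """Toy store: keeps (text, vector) pairs and searches them by cosine similarity."""

    records: list = field(default_factory=list)

    def add(self, text: str, vector: list[float]) -> None:
        self.records.append((text, vector))

    def search(self, query_vector: list[float], top_k: int = 1) -> list[str]:
        def similarity(vector: list[float]) -> float:
            dot = sum(q * v for q, v in zip(query_vector, vector))
            norms = math.hypot(*query_vector) * math.hypot(*vector)
            return dot / norms if norms else 0.0

        ranked = sorted(self.records, key=lambda r: similarity(r[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


store = EmbeddingStore()
# Vectors below are invented by hand, not computed by a model.
store.add("How to reset your password", [0.9, 0.1, 0.0])
store.add("Campus parking rules", [0.0, 0.2, 0.9])
store.add("Changing your account email", [0.7, 0.4, 0.1])

# Made-up vector standing in for the query "forgot my login".
hits = store.search([1.0, 0.2, 0.0], top_k=2)
```

The query shares no keywords with the stored titles. The match comes entirely from the vectors being close.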

 



Intro to AI lecture notes by Brian Bird, written in , are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


Note: Microsoft Copilot with GPT-4 was used to draft parts of these notes.