
What is Retrieval-Augmented Generation (RAG)?
RAG (Retrieval-Augmented Generation) is an AI framework that combines the strengths of traditional information retrieval systems (such as databases) with the capabilities of generative large language models (LLMs).
How does Retrieval-Augmented Generation work?
RAG operates in a few main steps to enhance generative AI outputs (a minimal code sketch follows this list):
- Retrieval and Pre-processing: RAG uses search algorithms to query external data sources, such as web pages, knowledge bases, and databases. Once retrieved, the relevant information is pre-processed, for example through tokenization, stemming, and stop-word removal.
- Generation: The pre-processed information is then incorporated into the prompt of the pre-trained LLM. This added context gives the model a more complete understanding of the topic and grounds its output in the retrieved material.
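To make the two steps concrete, here is a minimal, self-contained Python sketch. The corpus, the overlap-based retriever, and the prompt template are hypothetical stand-ins; a production system would use a real search backend and an actual LLM call where the final comment sits.

```python
# Minimal RAG sketch: (1) retrieve relevant documents, (2) augment the
# LLM prompt with them. The corpus and scoring are toy stand-ins.

CORPUS = {
    "doc1": "Vector databases store embeddings for fast similarity search.",
    "doc2": "RAG augments an LLM prompt with retrieved context.",
    "doc3": "Stemming reduces words to their root form during pre-processing.",
}

def preprocess(text: str) -> list[str]:
    """Toy pre-processing: lowercase, tokenize, drop common stop words."""
    stop_words = {"a", "an", "the", "with", "for", "to", "of"}
    return [t for t in text.lower().split() if t not in stop_words]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Toy retrieval: rank documents by token overlap with the query."""
    q_tokens = set(preprocess(query))
    scored = sorted(
        CORPUS.values(),
        key=lambda doc: len(q_tokens & set(preprocess(doc))),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str) -> str:
    """Generation step: fold retrieved context into the LLM prompt."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

prompt = build_prompt("How does RAG use a vector database?")
print(prompt)  # In practice, send this prompt to an LLM for generation.
```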
RAG operates by first retrieving relevant information from a database, using a query generated by the LLM. It typically relies on vector databases, which store data as embeddings so that semantically similar items can be searched and retrieved efficiently.
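The sketch below shows the core idea behind a vector database lookup: documents and the query are embedded as vectors, and retrieval becomes a nearest-neighbor search. The three-dimensional vectors here are invented for illustration; real systems use learned embedding models with hundreds of dimensions and an approximate-nearest-neighbor index instead of brute force.

```python
import numpy as np

# Hypothetical embeddings: in a real system these come from an embedding
# model, and the database would use an ANN index rather than a full scan.
doc_vectors = {
    "returns policy": np.array([0.9, 0.1, 0.0]),
    "shipping times": np.array([0.2, 0.8, 0.1]),
    "warranty terms": np.array([0.7, 0.2, 0.3]),
}
query_vector = np.array([0.8, 0.15, 0.1])  # embedding of the user's query

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank documents by similarity to the query embedding.
ranked = sorted(
    doc_vectors.items(),
    key=lambda item: cosine_similarity(query_vector, item[1]),
    reverse=True,
)
for name, _ in ranked:
    print(name)
```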
Vertex AI Search
Vertex AI Search functions as an out-of-the-box RAG system for information retrieval.
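As a hedged sketch, a basic query against a Vertex AI Search data store can be issued with the google-cloud-discoveryengine client library, following its standard search pattern. The project, location, and data store IDs below are placeholders to replace with your own; consult the library's documentation for the exact options available in your version.

```python
from google.cloud import discoveryengine_v1 as discoveryengine

# Placeholder identifiers: substitute your own project and data store.
PROJECT_ID = "your-project-id"
LOCATION = "global"
DATA_STORE_ID = "your-data-store-id"

client = discoveryengine.SearchServiceClient()
serving_config = client.serving_config_path(
    project=PROJECT_ID,
    location=LOCATION,
    data_store=DATA_STORE_ID,
    serving_config="default_config",
)

request = discoveryengine.SearchRequest(
    serving_config=serving_config,
    query="How does RAG work?",
    page_size=5,
)

# Each result carries a retrieved document that can ground an LLM prompt.
for result in client.search(request).results:
    print(result.document.id)
```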
Vertex AI Vector Search
A vector search technology that enables semantic similarity search and retrieval from large collections of embeddings.
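Querying a deployed Vertex AI Vector Search index can be done through the Vertex AI SDK. This sketch assumes an index endpoint and deployed index already exist; the resource names and the query embedding are placeholders.

```python
from google.cloud import aiplatform

aiplatform.init(project="your-project-id", location="us-central1")

# Placeholder resource name for an already-deployed index endpoint.
endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name="projects/123/locations/us-central1/indexEndpoints/456",
)

query_embedding = [0.1] * 768  # stand-in for a real embedding vector

# Retrieve the stored embeddings nearest to the query embedding.
neighbors = endpoint.find_neighbors(
    deployed_index_id="your-deployed-index-id",
    queries=[query_embedding],
    num_neighbors=5,
)
for neighbor in neighbors[0]:
    print(neighbor.id, neighbor.distance)
```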
BigQuery
A data warehouse that stores large datasets you can use to train machine learning models, including models for Vertex AI Vector Search.
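As a brief illustration, the google-cloud-bigquery client can pull rows from a table to build a corpus for embedding or training; the dataset and table names here are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client(project="your-project-id")

# Hypothetical table of documents to embed for vector search.
sql = """
    SELECT id, text
    FROM `your-project-id.your_dataset.documents`
    LIMIT 100
"""

for row in client.query(sql).result():
    print(row.id, row.text[:80])  # e.g. feed row.text to an embedding model
```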