Retrieval Augmented Generation (RAG) combines retrieval-based and generation-based approaches to improve the quality of generated text.
RAG is used in natural language processing tasks where the goal is to generate text that is both relevant and informative, such as question answering and summarization. It works by first retrieving relevant documents or passages from a large corpus and then using a generative model to produce a coherent response based on the retrieved information.
Generally accessible information is already represented in the training data and is therefore available to the model. As AI tasks become more specific, however, they require more precise instructions and additional information, especially information that was missing from the training data or not prominent enough in it. Users can enter this additional information manually, but it can also be supplied automatically. Retrieval Augmented Generation (RAG) is the approach for doing so: using semantic search, RAG supplements the prompt with potentially relevant hits from databases, documents, websites, or other sources.
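The retrieve-then-augment flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the function names (`embed`, `retrieve`, `build_prompt`), the sample corpus, and the prompt wording are all hypothetical, and the "embedding" is a toy bag-of-words vector that only matches shared words. A real system would typically use a learned embedding model and a vector index to get true semantic search, and would then pass the resulting prompt to a generative model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector.

    Assumption: real RAG systems use dense embeddings from a trained model.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Supplement the user's question with the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical sample data for illustration only.
corpus = [
    "The warranty for the X200 printer lasts 24 months.",
    "Our office is closed on public holidays.",
    "The X200 printer supports duplex printing.",
]
query = "How long is the warranty for the X200 printer?"

passages = retrieve(query, corpus, k=2)
prompt = build_prompt(query, passages)
print(prompt)  # this augmented prompt would be sent to the generative model
```

The generation step is deliberately left out: the point of the sketch is that the model receives the question together with the retrieved passages, so it can answer from information that was not part of its training data.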
For example, in a question-answering system, RAG can retrieve relevant articles from a database and then generate a precise answer to the user’s query by synthesizing information from those articles.
RAG is an essential technique in natural language processing for building systems that generate high-quality, relevant, and informative text by combining the strengths of retrieval-based and generation-based approaches.
Related terms
- Prompt Engineering
- Knowledge Retrieval
- Text Generation