Overview
Marina Danilevsky, Senior Research Scientist at IBM Research, explains how Retrieval-Augmented Generation (RAG) addresses key limitations of LLMs. This is the dominant pattern for giving AI systems access to your organization's knowledge — and understanding it is essential for both implementation and security.
Key Takeaways
The Problems RAG Solves
LLMs have several fundamental limitations:
- Outdated information — Models are frozen at their training cutoff date. They can't know about recent events or changes.
- No source citation — LLMs typically provide answers without showing where the information came from.
- Hallucination — Models can generate confident but incorrect or misleading answers.
- Data leakage risk — Without grounding, models may expose information from training data inappropriately.
How RAG Works
Instead of relying solely on internal training data, the LLM first consults an external content store. This store can be:
- Open — Public sources like the internet
- Closed — Private sources like internal documents, databases, or knowledge bases
The Three-Part RAG Framework
- Instruction — The LLM is instructed to first retrieve relevant content from the external store before answering
- Retrieval and Combination — Retrieved content is combined with the user's question
- Generation — The LLM generates an answer grounded in the retrieved evidence
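A minimal sketch of that three-part flow in Python. The content store, the keyword-overlap scorer, and the `llm_generate` callable are illustrative placeholders rather than anything shown in the video; a production system would use a vector index and a real model API.

```python
# Minimal RAG sketch: instruction + retrieved evidence + user question -> grounded answer.
# The retriever is a toy keyword-overlap scorer; `llm_generate` is a hypothetical
# stand-in for whatever model API you actually call.

from typing import Callable

CONTENT_STORE = [
    {"id": "policy-001", "text": "Visiting hours are 8am to 8pm on all units."},
    {"id": "policy-014", "text": "PHI must not be shared over unencrypted email."},
]

def score(query: str, doc_text: str) -> int:
    """Toy relevance score: count of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc_text.lower().split()))

def retrieve(query: str, k: int = 2) -> list[dict]:
    """Return the top-k documents from the content store by toy relevance."""
    return sorted(CONTENT_STORE, key=lambda d: score(query, d["text"]), reverse=True)[:k]

def build_prompt(question: str, docs: list[dict]) -> str:
    """Combine the instruction, retrieved evidence, and the user's question."""
    evidence = "\n".join(f"[{d['id']}] {d['text']}" for d in docs)
    return (
        "Answer using only the evidence below. Cite document ids. "
        "If the evidence is insufficient, say you don't know.\n\n"
        f"Evidence:\n{evidence}\n\nQuestion: {question}"
    )

def answer(question: str, llm_generate: Callable[[str], str]) -> str:
    docs = retrieve(question)                 # 1. retrieval
    prompt = build_prompt(question, docs)     # 2. combination with the question
    return llm_generate(prompt)               # 3. grounded generation
```

The same three steps hold whatever retriever or model you swap in: retrieve, combine, then generate from the evidence.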
Key Advantages
- Up-to-date information — Update the data store without retraining the entire model
- Grounded responses — Answers are based on primary source data, reducing hallucination
- Source attribution — The model can cite where information came from
- Appropriate uncertainty — RAG helps models recognize when they don't have enough reliable information and say "I don't know"
Ongoing Research
Researchers are improving both sides of the RAG equation: better retrievers that find high-quality grounding information, and enhanced generators that deliver richer, more accurate responses.
Practitioner Notes
RAG is likely the first AI architecture pattern your organization will deploy. Here's what matters for healthcare security:
Your content store is now part of your attack surface
Whatever you connect to RAG becomes accessible to the AI — and potentially to anyone who can query it. If your content store includes clinical documentation, policies, or internal procedures, you need to think carefully about access controls. The LLM doesn't inherently understand "this user shouldn't see that document."
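One way to approach this, extending the toy retriever sketched above, is to filter documents against the caller's entitlements before anything is assembled into a prompt. The `allowed_groups` metadata and the group model here are assumptions for illustration, not a prescribed control.

```python
# Sketch: enforce the caller's entitlements before anything reaches the prompt.
# Assumes each document carries an "allowed_groups" metadata set; map this to
# your real identity and document-classification systems.

def authorized(doc: dict, user_groups: set[str]) -> bool:
    """A document is visible only if the user belongs to an allowed group."""
    return bool(doc.get("allowed_groups", set()) & user_groups)

def retrieve_for_user(query: str, user_groups: set[str], k: int = 2) -> list[dict]:
    """Filter first, then rank, so unauthorized text never enters the context window."""
    visible = [d for d in CONTENT_STORE if authorized(d, user_groups)]
    return sorted(visible, key=lambda d: score(query, d["text"]), reverse=True)[:k]
```

Filtering before retrieval, rather than asking the model to withhold information, keeps the unauthorized text out of the context window entirely.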
"Closed" doesn't mean "secure"
The video distinguishes between open (internet) and closed (internal) content stores. Don't assume closed means safe. Internal documents can contain PHI, credentials, or sensitive business information. RAG can surface things you didn't intend to expose.
Retrieval quality directly impacts security
If the retriever pulls the wrong documents, the LLM generates answers based on irrelevant or inappropriate content. In healthcare, this could mean clinical guidance based on the wrong protocol, or answers that mix patient information inappropriately. Retrieval accuracy isn't just a quality issue — it's a safety issue.
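One hedged mitigation, reusing `build_prompt` from the earlier sketch, is to gate generation on retrieval confidence and decline to answer from weak evidence. The threshold value and the normalized score scale are illustrative assumptions to calibrate against your own retriever.

```python
# Sketch: refuse to generate when retrieval confidence is too low, rather than
# letting the model answer from weakly related documents. The 0-1 score scale
# and cutoff are assumptions; calibrate them against your retriever.

MIN_RELEVANCE = 0.35  # assumed cutoff on a normalized similarity score

def grounded_answer(question: str, hits: list[tuple[float, dict]], llm_generate) -> str:
    """`hits` are (similarity, document) pairs from your retriever, best first."""
    if not hits or hits[0][0] < MIN_RELEVANCE:
        return "No sufficiently relevant source was found; please refine the question."
    docs = [doc for sim, doc in hits if sim >= MIN_RELEVANCE]
    return llm_generate(build_prompt(question, docs))
```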
Data leakage works both directions
The video mentions RAG reducing data leakage from training data. But consider the reverse: queries and retrieved content may be logged, cached, or sent to external APIs. Where does that data go? Is PHI flowing through systems you don't control?
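If queries and responses must be logged at all, one option is to scrub obvious identifiers before the logs leave systems you control. The regex patterns below are purely illustrative and are not sufficient for real PHI de-identification; a dedicated de-identification service is the safer path.

```python
# Sketch: scrub obvious identifiers from query/response logs before they leave
# systems you control. Regex redaction is an illustration only; it does not
# meet the bar for real PHI de-identification.

import re

PATTERNS = {
    "mrn": re.compile(r"\bMRN[-\s]?\d{6,10}\b", re.IGNORECASE),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matched identifiers with tagged placeholders before logging."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED-{label.upper()}]", text)
    return text

def log_interaction(logger, query: str, response: str) -> None:
    """Log only the redacted forms; raw text stays inside the trust boundary."""
    logger.info("rag_query=%s rag_response=%s", redact(query), redact(response))
```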
The "I don't know" behavior needs to be verified
Danilevsky mentions RAG helping models say "I don't know" when they lack reliable information. This is valuable, but it's not automatic — it depends on implementation. Test whether your RAG system actually admits uncertainty, or whether it confabulates when retrieval fails.
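A simple way to verify this is a regression test that asks an off-corpus question and checks that the pipeline admits uncertainty rather than confabulating. The `rag_answer` entry point and the marker phrases are assumptions to adapt to your own interface.

```python
# Sketch: a regression test that checks the system admits uncertainty when the
# content store has nothing relevant. Adapt the entry point and phrases to your
# pipeline's actual interface.

UNCERTAINTY_MARKERS = ("don't know", "do not know", "no relevant source", "cannot answer")

def test_admits_uncertainty(rag_answer) -> None:
    """`rag_answer(question)` is your end-to-end pipeline; the question is off-corpus."""
    reply = rag_answer("What is the visiting policy on the lunar ward?").lower()
    assert any(marker in reply for marker in UNCERTAINTY_MARKERS), (
        "RAG system confabulated instead of admitting uncertainty: " + reply
    )
```

Run checks like this whenever the content store, retriever, or prompt template changes, since any of them can silently break the refusal behavior.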
Content store governance is now AI governance
Your document management practices directly affect AI behavior. Outdated policies in the content store mean outdated AI answers. Duplicate or conflicting documents create inconsistent responses. RAG makes your content hygiene a first-order concern.
Continue Learning
This is the fourth resource in the AI Foundations learning path.