Overview
Marina Danilevsky, Senior Research Scientist at IBM Research, explains how Retrieval-Augmented Generation (RAG) addresses key limitations of LLMs. This is the dominant pattern for giving AI systems access to your organization's knowledge — and understanding it is essential for both implementation and security.
Key Takeaways
The Problems RAG Solves
LLMs have several fundamental limitations:
- Outdated information — Models are frozen at their training cutoff date. They can't know about recent events or changes.
- No source citation — LLMs typically provide answers without showing where the information came from.
- Hallucination — Models can generate confident but incorrect or misleading answers.
- Data leakage risk — Without grounding, models may expose information from training data inappropriately.
How RAG Works
Instead of relying solely on internal training data, the LLM first consults an external content store. This store can be:
- Open — Public sources like the internet
- Closed — Private sources like internal documents, databases, or knowledge bases
The Three-Part RAG Framework
- Instruction — The LLM is instructed to first retrieve relevant content from the external store before answering
- Retrieval and Combination — Retrieved content is combined with the user's question
- Generation — The LLM generates an answer grounded in the retrieved evidence
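A minimal sketch of that three-part flow in Python. The content store, the keyword-overlap scorer, and the `llm_generate` callable are illustrative placeholders rather than anything shown in the video; a production system would use a vector index and a real model API.

```python
# Minimal RAG sketch: instruction + retrieved evidence + user question -> grounded answer.
# The retriever is a toy keyword-overlap scorer; `llm_generate` is a hypothetical
# stand-in for whatever model API you actually call.

from typing import Callable

CONTENT_STORE = [
    {"id": "policy-001", "text": "Visiting hours are 8am to 8pm on all units."},
    {"id": "policy-014", "text": "PHI must not be shared over unencrypted email."},
]

def score(query: str, doc_text: str) -> int:
    """Toy relevance score: count of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc_text.lower().split()))

def retrieve(query: str, k: int = 2) -> list[dict]:
    """Return the top-k documents from the content store by toy relevance."""
    return sorted(CONTENT_STORE, key=lambda d: score(query, d["text"]), reverse=True)[:k]

def build_prompt(question: str, docs: list[dict]) -> str:
    """Combine the instruction, retrieved evidence, and the user's question."""
    evidence = "\n".join(f"[{d['id']}] {d['text']}" for d in docs)
    return (
        "Answer using only the evidence below. Cite document ids. "
        "If the evidence is insufficient, say you don't know.\n\n"
        f"Evidence:\n{evidence}\n\nQuestion: {question}"
    )

def answer(question: str, llm_generate: Callable[[str], str]) -> str:
    docs = retrieve(question)                 # 1. retrieval
    prompt = build_prompt(question, docs)     # 2. combination with the question
    return llm_generate(prompt)               # 3. grounded generation
```

The same three steps hold whatever retriever or model you swap in: retrieve, combine, then generate from the evidence.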
Key Advantages
- Up-to-date information — Update the data store without retraining the entire model
- Grounded responses — Answers are based on primary source data, reducing hallucination
- Source attribution — The model can cite where information came from
- Appropriate uncertainty — RAG helps models recognize when they don't have enough reliable information and say "I don't know"
Ongoing Research
Researchers are improving both sides of the RAG equation: better retrievers that find high-quality grounding information, and enhanced generators that deliver richer, more accurate responses.
Practitioner Notes
RAG is likely the first AI architecture pattern your organization will deploy. Here's what matters for healthcare security:
Your content store is now part of your attack surface
Whatever you connect to RAG becomes accessible to the AI — and potentially to anyone who can query it. If your content store includes clinical documentation, policies, or internal procedures, you need to think carefully about access controls. The LLM doesn't inherently understand "this user shouldn't see that document."
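One way to approach this, extending the toy retriever sketched above, is to filter documents against the caller's entitlements before anything is assembled into a prompt. The `allowed_groups` metadata and the group model here are assumptions for illustration, not a prescribed control.

```python
# Sketch: enforce the caller's entitlements before anything reaches the prompt.
# Assumes each document carries an "allowed_groups" metadata set; map this to
# your real identity and document-classification systems.

def authorized(doc: dict, user_groups: set[str]) -> bool:
    """A document is visible only if the user belongs to an allowed group."""
    return bool(doc.get("allowed_groups", set()) & user_groups)

def retrieve_for_user(query: str, user_groups: set[str], k: int = 2) -> list[dict]:
    """Filter first, then rank, so unauthorized text never enters the context window."""
    visible = [d for d in CONTENT_STORE if authorized(d, user_groups)]
    return sorted(visible, key=lambda d: score(query, d["text"]), reverse=True)[:k]
```

Filtering before retrieval, rather than asking the model to withhold information, keeps the unauthorized text out of the context window entirely.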
"Closed" doesn't mean "secure"
The video distinguishes between open (internet) and closed (internal) content stores. Don't assume closed means safe. Internal documents can contain PHI, credentials, or sensitive business information. RAG can surface things you didn't intend to expose.
Retrieval quality directly impacts security
If the retriever pulls the wrong documents, the LLM generates answers based on irrelevant or inappropriate content. In healthcare, this could mean clinical guidance based on the wrong protocol, or answers that mix patient information inappropriately. Retrieval accuracy isn't just a quality issue — it's a safety issue.
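One hedged mitigation, reusing `build_prompt` from the earlier sketch, is to gate generation on retrieval confidence and decline to answer from weak evidence. The threshold value and the normalized score scale are illustrative assumptions to calibrate against your own retriever.

```python
# Sketch: refuse to generate when retrieval confidence is too low, rather than
# letting the model answer from weakly related documents. The 0-1 score scale
# and cutoff are assumptions; calibrate them against your retriever.

MIN_RELEVANCE = 0.35  # assumed cutoff on a normalized similarity score

def grounded_answer(question: str, hits: list[tuple[float, dict]], llm_generate) -> str:
    """`hits` are (similarity, document) pairs from your retriever, best first."""
    if not hits or hits[0][0] < MIN_RELEVANCE:
        return "No sufficiently relevant source was found; please refine the question."
    docs = [doc for sim, doc in hits if sim >= MIN_RELEVANCE]
    return llm_generate(build_prompt(question, docs))
```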
Data leakage works both directions
The video mentions RAG reducing data leakage from training data. But consider the reverse: queries and retrieved content may be logged, cached, or sent to external APIs. Where does that data go? Is PHI flowing through systems you don't control?
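If queries and responses must be logged at all, one option is to scrub obvious identifiers before the logs leave systems you control. The regex patterns below are purely illustrative and are not sufficient for real PHI de-identification; a dedicated de-identification service is the safer path.

```python
# Sketch: scrub obvious identifiers from query/response logs before they leave
# systems you control. Regex redaction is an illustration only; it does not
# meet the bar for real PHI de-identification.

import re

PATTERNS = {
    "mrn": re.compile(r"\bMRN[-\s]?\d{6,10}\b", re.IGNORECASE),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matched identifiers with tagged placeholders before logging."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED-{label.upper()}]", text)
    return text

def log_interaction(logger, query: str, response: str) -> None:
    """Log only the redacted forms; raw text stays inside the trust boundary."""
    logger.info("rag_query=%s rag_response=%s", redact(query), redact(response))
```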
The "I don't know" behavior needs to be verified
Danilevsky mentions RAG helping models say "I don't know" when they lack reliable information. This is valuable, but it's not automatic — it depends on implementation. Test whether your RAG system actually admits uncertainty, or whether it confabulates when retrieval fails.
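A simple way to verify this is a regression test that asks an off-corpus question and checks that the pipeline admits uncertainty rather than confabulating. The `rag_answer` entry point and the marker phrases are assumptions to adapt to your own interface.

```python
# Sketch: a regression test that checks the system admits uncertainty when the
# content store has nothing relevant. Adapt the entry point and phrases to your
# pipeline's actual interface.

UNCERTAINTY_MARKERS = ("don't know", "do not know", "no relevant source", "cannot answer")

def test_admits_uncertainty(rag_answer) -> None:
    """`rag_answer(question)` is your end-to-end pipeline; the question is off-corpus."""
    reply = rag_answer("What is the visiting policy on the lunar ward?").lower()
    assert any(marker in reply for marker in UNCERTAINTY_MARKERS), (
        "RAG system confabulated instead of admitting uncertainty: " + reply
    )
```

Run checks like this whenever the content store, retriever, or prompt template changes, since any of them can silently break the refusal behavior.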
Content store governance is now AI governance
Your document management practices directly affect AI behavior. Outdated policies in the content store mean outdated AI answers. Duplicate or conflicting documents create inconsistent responses. RAG makes your content hygiene a first-order concern.
Continue Learning
This is the fourth resource in the AI Foundations learning path.