AI Development Lesson 8: LangChain & RAG

AI Development Course · Lesson 8 of 10

LangChain is a framework for building LLM-powered applications. RAG (Retrieval-Augmented Generation) lets LLMs answer questions about YOUR data.

Why RAG?

// Problem: LLMs only know their training data (cutoff date)
// LLMs hallucinate facts they don't know
// LLMs can't answer about YOUR documents/data

// RAG Solution:
// 1. Embed your documents as vectors
// 2. Store in vector database (Pinecone, Chroma, Qdrant)
// 3. User asks question
// 4. Find relevant document chunks (similarity search)
// 5. Give chunks to LLM as context
// 6. LLM answers based on YOUR data
// Result: answers grounded in your content, which reduces (but does not eliminate) hallucination
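
Steps 4 and 5 can be sketched without any external services. The example below is a minimal illustration, assuming a toy word-count "embedding" in place of a real embedding model and a plain Python list in place of a vector database. The function names (`embed`, `cosine`, `retrieve`, `build_prompt`) and the sample chunks are hypothetical and exist only for this sketch.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(question, chunks, k=2):
    """Step 4: return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(question, context_chunks):
    """Step 5: place the retrieved chunks in the prompt as context."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical document chunks for illustration
chunks = [
    "Refund policy: items can be returned within 30 days.",
    "The device ships with a USB-C charging cable.",
    "Contact support by email for warranty claims.",
]
question = "What is the refund policy?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)  # this string is what would be sent to the LLM in step 6
```

A real pipeline replaces `embed` with a learned embedding model, so that "refund" and "money back" land close together even though they share no words, and replaces the list scan with a vector database index.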

Simple RAG with LangChain

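# pip install langchain langchain-community langchain-openai chromadb pypdf
# Uses the LangChain 0.x import paths; RetrievalQA is a legacy chain in newer releases.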
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import PyPDFLoader

# 1. Load documents
loader = PyPDFLoader("docs/manual.pdf")
docs = loader.load()

# 2. Split into chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)

# 3. Create vector store
embeddings = OpenAIEmbeddings()  # requires OPENAI_API_KEY in the environment
vectorstore = Chroma.from_documents(chunks, embeddings)

# 4. Create QA chain
llm = ChatOpenAI(model="gpt-4o-mini")
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())

# 5. Ask questions!
result = qa_chain.invoke({"query": "What is the refund policy?"})
print(result["result"])  # Answers from YOUR document!
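
To see what `chunk_size` and `chunk_overlap` control in step 2, here is a simplified sketch of fixed-size character chunking with overlap. It is not how `RecursiveCharacterTextSplitter` is implemented (that splitter also tries to break on separators such as paragraph and line breaks). The function `chunk_text` is a hypothetical helper for illustration only.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size character windows.

    Consecutive windows share chunk_overlap characters, so a sentence
    cut at a boundary still appears whole in one of the two chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
pieces = chunk_text(sample, chunk_size=10, chunk_overlap=3)
for p in pieces:
    print(p)
# abcdefghij
# hijklmnopq
# opqrstuvwx
# vwxyz
```

Each chunk starts with the last three characters of the previous one. Larger overlap lowers the chance that an answer is split across two chunks, at the cost of storing and embedding more text.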

Use Groq (Free) Instead

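# pip install langchain-groq
from langchain_groq import ChatGroq
# Model IDs are retired over time; check Groq's current model list before running.
llm = ChatGroq(model="llama-3.1-70b-versatile", api_key="YOUR_FREE_KEY")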

🏋️ Practice Task

Build a “Documentation Q&A” bot. Take a local text file or PDF. Chunk it. Embed with a free embedding model (use sentence-transformers: pip install sentence-transformers). Store in ChromaDB. Build CLI: user types question, gets answer from the document.

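💡 Hint: from sentence_transformers import SentenceTransformer; model = SentenceTransformer("all-MiniLM-L6-v2"); embeddings = model.encode(texts), where texts is a list of chunk strings (for LangChain documents, [c.page_content for c in chunks]).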
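
One possible shape for the CLI part of the task is a read-and-answer loop that stays independent of how answers are produced. The sketch below assumes a hypothetical `answer_fn` callable that you would replace with your own retrieval-plus-LLM function. `run_cli` and `echo_answer` are illustrative names, not part of LangChain or ChromaDB.

```python
import io
import sys

def run_cli(answer_fn, stdin=sys.stdin, stdout=sys.stdout):
    """Read one question per line and print answer_fn(question).

    Stops on 'quit', 'exit', or end of input. Blank lines are skipped.
    """
    stdout.write("Ask a question (type 'quit' to exit)\n")
    for line in stdin:
        question = line.strip()
        if not question:
            continue
        if question.lower() in {"quit", "exit"}:
            break
        stdout.write(answer_fn(question) + "\n")

def echo_answer(question):
    """Placeholder: swap in a function that retrieves chunks and calls your LLM."""
    return f"(no model connected) You asked: {question}"

# Demo with in-memory streams so the loop can be exercised without a terminal
demo_in = io.StringIO("What is the refund policy?\nquit\n")
demo_out = io.StringIO()
run_cli(echo_answer, stdin=demo_in, stdout=demo_out)
print(demo_out.getvalue())
```

In the finished bot, calling `run_cli(my_answer_fn)` with the default streams reads questions from the terminal. Passing the answer function in as a parameter keeps the loop testable before the embedding and retrieval code exists.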
