I write here about how you could leverage RAG by keeping the content in a vector Database and by using Graphs to get great answers to the queries.
Go through the introduction to LangGraph if you do not know about it!
Table of Contents
Use case: Create a study plan based on the context
I thought about a use case where one wants to learn about Binary search trees based on the context provided. The context includes Red-Black trees and some information about AVL trees as well.
The context is here https://github.com/sushmasush/langGraph/blob/main/data/BinarySearchTrees.pdf
from the Book “Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein.
Step 1: Create embeddings & store them in vector DB
In my previous posts, I wrote about langchain and vector databases here.
Load the PDF file using PyPDF Loader by LangChain. Split the document into chunks. Create embeddings for the text. Creating a vector store from the PDF and storing them locally can sometimes help you to skip step 1 next time.
Step 2: Create the RAG context tool
Using the vector store as a retriever to the RAG. Create a tool to access the store using the retriever as an interface.
from langchain.tools.retriever import create_retriever_tool
retriever=db.as_retriever()
retriever_tool=create_retriever_tool(retriever,"BinarySearchTreesBook", "This contains information about binary search tree concepts. Use this to create a study plan and create mock tests when requested by the user.")
Step 3: Initialize any LLM model and bind the tool
from langchain_openai import ChatOpenAI
tools=[retriever_tool]
llm=ChatOpenAI(model_name="gpt-4o")
llm_with_tools=llm.bind_tools(tools=tools)
Step 4: Define the graph nodes
View my previous post about LangGraph if you need more understanding.
Create 2 nodes, one is the bstGuide that will be called by the graph. Another node will be the tool node that contains the resources available for the BST Node( explained in the next step)
def bstGuide(state:State):
return {"messages":[llm_with_tools.invoke(state["messages"])]}
tool_node = ToolNode(tools=tools)
Step 5: Create the graph
The graph contains only 1 node called BST which has access to tools that is connected to the vector store.
from langgraph.graph import StateGraph,START,END
from langgraph.prebuilt import ToolNode,tools_condition
workflow= StateGraph(State)
workflow.add_node("BST", bstGuide)
tool_node = ToolNode(tools=tools)
workflow.add_node("tools", tool_node)
workflow.add_conditional_edges(
"BST",
tools_condition,
)
workflow.add_edge("tools", "BST")
workflow.add_edge(START,"BST")
Step 6: Compile the graph and visualize
graph=workflow.compile()
from IPython.display import Image, display
try:
display(Image(graph.get_graph().draw_mermaid_png()))
except Exception:
# This requires some extra dependencies and is optional
pass
Step 7: Query
Query 1: “I want to learn about binary search trees. Can you give me a study plan based on the topics to study along with the time required”
Response:
Based on the given text and general knowledge of Binary Search Trees (BSTs), here is a structured study plan along with the time required for each topic:
### Week 1: Introduction and Basics
1. **Day 1-2: Understanding Trees and Binary Trees**
- **Topics:**
- Definition of Trees
- Types of Trees (Binary Trees, Full Binary Trees, Complete Binary Trees)
- **Time Required:** 4 hours
2. **Day 3-4: Introduction to Binary Search Trees**
- **Topics:**
- Definition and Properties of BSTs
- BST vs Binary Trees
- **Time Required:** 4 hours
3. **Day 5-7: Basic Operations on BSTs**
- **Topics:**
- Insertion
- Deletion
- Search
- **Time Required:** 6 hours
### Week 2: Intermediate Concepts
4. **Day 1-2: Tree Traversals**
- **Topics:**
- In-order Traversal
- Pre-order Traversal
- Post-order Traversal
- Level-order Traversal
- **Time Required:** 4 hours
5. **Day 3-4: Balancing BSTs**
- **Topics:**
- Need for Balancing
- Simple Balancing Techniques
- **Time Required:** 4 hours
6. **Day 5-7: Advanced Operations and Analysis**
- **Topics:**
- Successor and Predecessor
- Height of a BST
- Time Complexity Analysis
- **Time Required:** 6 hours
### Week 3: Advanced Topics
7. **Day 1-3: Self-Balancing Trees**
- **Topics:**
- AVL Trees
- Red-Black Trees
- **Time Required:** 6 hours
8. **Day 4-5: Augmented BSTs**
- **Topics:**
- Order Statistic Trees
- Interval Trees
- **Time Required:** 4 hours
9. **Day 6-7: Practical Applications**
- **Topics:**
- Use cases in programming and real-world applications
- Problem-solving using BSTs
- **Time Required:** 4 hours
### Week 4: Review and Practice
10. **Day 1-2: Review and Recap**
- **Topics:**
- Revisiting key concepts
- Clarifying doubts
- **Time Required:** 4 hours
11. **Day 3-4: Hands-on Practice**
- **Topics:**
- Coding exercises
- Implementing various operations
- **Time Required:** 4 hours
12. **Day 5-7: Mock Tests and Problem-solving**
- **Topics:**
- Solving BST problems
- Taking mock tests
- **Time Required:** 6 hours
### Total Time Required: 50 hours
This study plan covers the essential topics related to Binary Search Trees, including basic concepts, operations, advanced topics, and practical applications. Each week is structured to build upon the previous one, ensuring a comprehensive understanding of BSTs.
Query 2: “I want to learn about binary search trees. I have 8 hours allocated for this preparation. I can give 2 hours per day. Can you give me a study plan based on the topics to study in 8 hours”
Based on the topics from the book, here's a detailed 4-day study plan with 2 hours per day for a total of 8 hours:
### Day 1: Introduction and Basic Concepts
1. **Introduction to Binary Search Trees (BST)**
- Definition and Properties
- Use cases and applications
2. **Basic Operations**
- Search
- Insertion
- Deletion
**Time Allocation:**
- Introduction to BST: 30 minutes
- Search Operation: 30 minutes
- Insertion Operation: 30 minutes
- Deletion Operation: 30 minutes
### Day 2: Traversal Techniques and Balanced Trees
1. **Tree Traversal Methods**
- In-order Traversal
- Pre-order Traversal
- Post-order Traversal
2. **Balanced Trees**
- AVL Trees
- Red-Black Trees
**Time Allocation:**
- In-order Traversal: 30 minutes
- Pre-order Traversal: 30 minutes
- Post-order Traversal: 30 minutes
- AVL Trees: 30 minutes
### Day 3: Advanced Concepts and Analysis
1. **Advanced Operations**
- Finding Minimum and Maximum
- Successor and Predecessor
2. **Complexity Analysis**
- Time Complexity of Operations
- Space Complexity
**Time Allocation:**
- Finding Minimum and Maximum: 30 minutes
- Successor and Predecessor: 30 minutes
- Time Complexity Analysis: 30 minutes
- Space Complexity Analysis: 30 minutes
### Day 4: Applications and Problem-Solving
1. **Real-world Applications**
- Databases
- File Systems
- Network Routing
2. **Problem-Solving Session**
- Practice Problems
- Common Interview Questions
**Time Allocation:**
- Real-world Applications: 30 minutes
- Practice Problems: 60 minutes
- Common Interview Questions: 30 minutes
This plan covers the essential topics and provides practice time to solidify your understanding. Feel free to adjust the time allocations based on your comfort with each topic.
Query 3: “Thanks! Can you create a cheat sheet for traversal mechanisms”
### Binary Search Tree Traversal Mechanisms
#### 1. Preorder Traversal
- **Description**: In a preorder traversal, the nodes are recursively visited in this order: root, left subtree, right subtree.
- **Algorithm**:
1. Visit the root node.
2. Traverse the left subtree in preorder.
3. Traverse the right subtree in preorder.
- **Example**:
```
1
/ \
2 3
/ \
4 5
```
Preorder traversal of the above tree: `1, 2, 4, 5, 3`
#### 2. Inorder Traversal
- **Description**: In an inorder traversal, the nodes are recursively visited in this order: left subtree, root, right subtree. This traversal prints the nodes in non-decreasing order.
- **Algorithm**:
1. Traverse the left subtree in inorder.
2. Visit the root node.
3. Traverse the right subtree in inorder.
- **Example**:
```
1
/ \
2 3
/ \
4 5
```
Inorder traversal of the above tree: `4, 2, 5, 1, 3`
#### 3. Postorder Traversal
- **Description**: In a postorder traversal, the nodes are recursively visited in this order: left subtree, right subtree, root.
- **Algorithm**:
1. Traverse the left subtree in postorder.
2. Traverse the right subtree in postorder.
3. Visit the root node.
- **Example**:
```
1
/ \
2 3
/ \
4 5
```
Postorder traversal of the above tree: `4, 5, 2, 3, 1`
#### 4. Level Order Traversal
- **Description**: In a level order traversal, the nodes are visited level by level from top to bottom and from left to right within each level.
- **Algorithm**:
1. Use a queue to keep track of nodes at each level.
2. Start with the root node and enqueue it.
3. While the queue is not empty, dequeue a node, visit it, and then enqueue its left and right children.
- **Example**:
```
1
/ \
2 3
/ \
4 5
```
Level order traversal of the above tree: `1, 2, 3, 4, 5`
This cheat sheet covers the basic traversal mechanisms for binary search trees, providing a clear and concise reference for each type of traversal.
How did the Graph work behind the scenes?
For each query made, the graph calls the tools and loops multiple times until it finds a reasonable answer. You can see that it calls the tool multiple times to gain more information using the context provided. The real power of the graph can be seen here.
Integrate the application with a LangSmith Project to be able to see these analytics.
The code is in GitHub you can check it out at this link: https://github.com/sushmasush/langGraph/blob/main/langGraphwithRAGTools.ipynb
Feel free to check my portfolio here, https://telestreak.com/home/portfolio/