Llamaindex Router
Read Markdown documents from a folder, index them with OpenAI embeddings, and answer questions about their content.
Name | URL
---|---
Building Agentic RAG with Llamaindex | https://learn.deeplearning.ai/courses/building-agentic-rag-with-llamaindex/
LlamaIndex |
Getting Started |
Starter Tutorial (Local Models) |
RAG CLI | https://docs.llamaindex.ai/en/stable/getting_started/starter_tools/rag_cli/
RAG CLI using Local Model |
The Python code in this chapter demonstrates how the LlamaIndex library processes and queries text data, in this case a directory of Markdown files. It sets up a system that can both summarize documents and retrieve specific information, using two query engines: one for summarization and one for context retrieval.
Data Loading:

- The `data_dir` variable specifies the directory containing the text documents.
- The `SimpleDirectoryReader` class from LlamaIndex loads the documents from this directory.
Name | URL
---|---
SimpleDirectoryReader |
```python
import argparse

from llama_index.core import SimpleDirectoryReader

# Read the target directory from the command line.
parser = argparse.ArgumentParser()
parser.add_argument("data_dir", help="Directory containing Markdown documents")
args = parser.parse_args()
print("Data dir: " + args.data_dir)

# Load all documents found in the directory.
documents = SimpleDirectoryReader(args.data_dir).load_data()
```
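Since the chapter targets Markdown files, the reader can optionally be restricted to that extension; a minimal variant, assuming only that `required_exts` and `recursive` (both standard `SimpleDirectoryReader` parameters) behave as documented:

```python
from llama_index.core import SimpleDirectoryReader

# Only pick up .md files, descending into subdirectories.
documents = SimpleDirectoryReader(
    args.data_dir,
    required_exts=[".md"],
    recursive=True,
).load_data()
print(f"Loaded {len(documents)} documents")
```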
Text Processing:

- The `SentenceSplitter` is initialized with a `chunk_size` of 1024, so documents are split into chunks of up to 1024 tokens.
- `get_nodes_from_documents` breaks the documents down into manageable nodes (chunks of text).
Name | URL
---|---
SentenceSplitter |
Using LLMs | https://docs.llamaindex.ai/en/stable/module_guides/models/llms/
```python
from llama_index.core.node_parser import SentenceSplitter

# Split the documents into ~1024-token chunks (nodes).
splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)
```
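To sanity-check the split before indexing, the resulting nodes can be inspected directly; a short sketch using standard node attributes (`metadata` and `get_content` are part of the LlamaIndex node interface):

```python
# Each node is a chunk of one source document, with metadata
# such as the originating file name attached by the reader.
print(f"Created {len(nodes)} nodes from {len(documents)} documents")
print(nodes[0].metadata.get("file_name"))  # source file of the first chunk
print(nodes[0].get_content()[:200])        # first 200 characters of its text
```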
Model Setup:

- The code defines a language model (LLM) and an embedding model: OpenAI's `gpt-4o-mini` for language processing and `text-embedding-3-small` for generating text embeddings.
- Both are registered globally via the `Settings` class from LlamaIndex.
Name | URL
---|---
Settings | https://docs.llamaindex.ai/en/stable/module_guides/supporting_modules/settings/
Ollama |
OpenAIEmbedding |
OllamaEmbedding |
```python
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Register the LLM and embedding model globally.
Settings.llm = OpenAI(model="gpt-4o-mini")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
```
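The Ollama links in the table above point at a local alternative to this OpenAI setup. A hedged sketch of that swap, assuming the `llama-index-llms-ollama` and `llama-index-embeddings-ollama` packages are installed, an Ollama server is running locally, and the named models have been pulled (the model names are illustrative):

```python
from llama_index.core import Settings
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding

# Local models served by Ollama instead of the OpenAI API.
Settings.llm = Ollama(model="llama3.1", request_timeout=120.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")
```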
Index Creation:

Two types of indices are created:

- `SummaryIndex`: used for summarizing documents.
- `VectorStoreIndex`: used for retrieving specific context or information from the documents.
Name | URL
---|---
Document Summary Index | https://docs.llamaindex.ai/en/stable/examples/index_structs/doc_summary/DocSummary/
VectorStoreIndex | https://docs.llamaindex.ai/en/stable/examples/vector_stores/SimpleIndexDemoLlama2/
Starter Tutorial (OpenAI) | https://docs.llamaindex.ai/en/stable/getting_started/starter_example/
```python
from llama_index.core import SummaryIndex, VectorStoreIndex

# Build both indices over the same set of nodes.
summary_index = SummaryIndex(nodes)
vector_index = VectorStoreIndex(nodes)
```
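Rebuilding the `VectorStoreIndex` re-embeds every node, which costs API calls on each run. One common refinement, sketched here with an illustrative directory name, is to persist the index and reload it on later runs:

```python
from llama_index.core import StorageContext, load_index_from_storage

# Save the index (including embeddings) to disk once...
vector_index.storage_context.persist(persist_dir="./storage")

# ...and on subsequent runs load it instead of rebuilding.
storage_context = StorageContext.from_defaults(persist_dir="./storage")
vector_index = load_index_from_storage(storage_context)
```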
Query Engine Initialization:

Two query engines are initialized:

- `summary_query_engine`: configured for summarization, using the tree-based summarization response mode and asynchronous processing.
- `vector_query_engine`: configured for retrieving specific context from the documents.
```python
# Summarization engine: tree_summarize aggregates partial answers
# hierarchically, and use_async parallelizes the per-chunk LLM calls.
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
vector_query_engine = vector_index.as_query_engine()
```
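Each engine can also be queried on its own, which is a useful way to compare their behavior before wiring them into the router; the questions below are illustrative:

```python
# Ask each engine directly, bypassing any routing.
print(summary_query_engine.query("Give a high-level summary of the documents."))
print(vector_query_engine.query("What does the text say about indexing?"))
```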
Tool Definition:

`QueryEngineTool` instances are created for each query engine to facilitate query processing:

- `summary_tool`: for summarization queries.
- `vector_tool`: for context-retrieval queries.
```python
from llama_index.core.tools import QueryEngineTool

# Wrap each engine in a tool; the descriptions guide the router's choice.
summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to the documents."
    ),
)
vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the documents."
    ),
)
```
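Optionally, the tools can be given explicit names, which makes the router's verbose output easier to follow; `name` is an accepted `QueryEngineTool.from_defaults` argument, and the names below are illustrative:

```python
# Same tools as above, with explicit names for clearer routing logs.
summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    name="summary_tool",
    description="Useful for summarization questions related to the documents.",
)
vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    name="vector_tool",
    description="Useful for retrieving specific context from the documents.",
)
```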
Router Query Engine:

- A `RouterQueryEngine` is set up with a selector (`LLMSingleSelector`) that chooses the appropriate query engine tool based on the query type.
- `verbose=True` enables detailed logging of the query routing.
```python
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector

# The LLM-based selector reads the tool descriptions and
# picks exactly one tool for each incoming query.
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
    verbose=True,
)
```
Query Execution:

- The `query_engine` executes a query asking for a summary of the document.
- The response is printed, along with the number of source nodes (text chunks) used to generate it.
```python
# A summary-style question; the selector should route it to summary_tool.
response = query_engine.query("What is the summary of the document?")
print(str(response))
print(len(response.source_nodes))
```
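A retrieval-style question should route to the vector tool instead; with `verbose=True`, the selector logs which tool it chose. The question below is illustrative:

```python
# Phrased to target specific context, so the selector should
# pick vector_tool rather than summary_tool.
response = query_engine.query("What does the document say about embeddings?")
print(str(response))
print(len(response.source_nodes))
```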
In summary, this code loads text data from a directory, builds indices for summarization and context retrieval, and executes queries through a router query engine that selects the appropriate tool for each query.