LangChain is a framework for building applications powered by large language models (LLMs). It simplifies development by providing tools for prompt management, memory, chains, agents, and third-party integrations.


Core Concepts

1. PromptTemplates

  • Purpose: Define templates for LLM prompts with variables.

  • Usage:

from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables=["name"],
    template="Hello {name}, how can I assist you today?"
)
print(prompt.format(name="Alice"))  # Hello Alice, how can I assist you today?

2. Chains

  • Purpose: Combine multiple components (e.g., prompts, LLMs, tools) into a pipeline.

  • Types:

    • LLMChain: Single LLM-based step.

    • SequentialChain: Sequence of multiple steps.

    • Custom Chains: User-defined logic.

  • Example:

from langchain.chains import LLMChain
from langchain.llms import OpenAI

llm = OpenAI(temperature=0.7)
chain = LLMChain(llm=llm, prompt=prompt)
response = chain.run({"name": "Alice"})

3. Agents

  • Purpose: Dynamically decide which tool to use based on user input.

  • Tools: Integrations like search, calculators, etc.

  • Example:

from langchain.agents import load_tools, initialize_agent
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
# "serpapi" requires a SerpAPI key; "llm-math" uses the LLM as a calculator.
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
response = agent.run("What is the square root of 144?")

4. Memory

  • Purpose: Maintain conversation state (context).

  • Types:

    • ConversationBufferMemory: Store the full message history verbatim.

    • ConversationSummaryMemory: Summarize the conversation to keep context short.

  • Example:

from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

memory = ConversationBufferMemory()
conversation = ConversationChain(llm=llm, memory=memory)
response = conversation.predict(input="Hello!")

5. Document Loaders

  • Purpose: Load and process external data (PDFs, text, etc.).

  • Example:

from langchain.document_loaders import PyPDFLoader

loader = PyPDFLoader("example.pdf")
documents = loader.load()

6. Text Splitters

  • Purpose: Split large documents into smaller chunks.

  • Example:

from langchain.text_splitter import CharacterTextSplitter

splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text("Long text here...")

7. Vector Stores

  • Purpose: Store and search embeddings.

  • Integrations: Pinecone, FAISS, Weaviate, etc.

  • Example:

from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings

embedding = OpenAIEmbeddings()
vectorstore = FAISS.from_texts(["sample text"], embedding)
docs = vectorstore.similarity_search("sample", k=1)  # nearest texts by embedding

8. Retrieval QA

  • Purpose: Perform QA over a document store.

  • Example:

from langchain.chains import RetrievalQA

retriever = vectorstore.as_retriever()
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
response = qa_chain.run("What is this document about?")

Key Integrations

1. LLMs

  • OpenAI, Hugging Face, Cohere, etc.

  • Example:

from langchain.llms import OpenAI

# Prefer the OPENAI_API_KEY environment variable over a hardcoded key.
llm = OpenAI(openai_api_key="your_key", temperature=0.5)

2. Tools

  • Search Engines: Google, Bing.

  • Calculators: Wolfram Alpha.

  • Custom APIs: Wrap arbitrary HTTP endpoints with LangChain's Requests tools.

3. Databases

  • Vector DBs: Pinecone, FAISS, Chroma.

  • Relational DBs: SQL integrations (e.g., SQLDatabaseChain).

4. Document Formats

  • PDFs, CSVs, HTML, Markdown, etc.


Best Practices

  1. Debugging:

    • Use verbose=True in chains and agents for detailed logs.

  2. Performance:

    • Optimize chunk size and overlap in text splitting for large documents.

  3. Security:

    • Avoid hardcoding API keys. Use environment variables.

  4. Customizability:

    • Extend base classes to create custom tools, memory, or chains.
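
The security practice above (point 3) can be sketched with the standard library alone; OPENAI_API_KEY is the conventional variable name the OpenAI wrapper reads, and no key is hardcoded anywhere in source:

```python
import os

# Read the key from the environment instead of embedding it in code.
api_key = os.environ.get("OPENAI_API_KEY", "")

# Most LangChain LLM wrappers pick the key up automatically when the
# environment variable is set, e.g.:
#   llm = OpenAI(temperature=0.5)  # uses OPENAI_API_KEY under the hood
```

Set the variable in your shell (`export OPENAI_API_KEY=...`) or a gitignored .env file rather than committing it to version control.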