LangChain Cheat Sheet 🚀
1 · Quick Start
```bash
# Install the SDK
pip install langsmith                     # Python (JS/TS: npm i langsmith)

# Minimum environment variables
export LANGCHAIN_TRACING_V2="true"        # enable tracing
export LANGCHAIN_API_KEY="sk-..."         # get from smith.langchain.com
export LANGCHAIN_PROJECT="my-project"     # logical bucket for traces

# Optional
export LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
export LANGCHAIN_HIDE_INPUTS="true"       # redact PII
```
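
With these variables set, LangChain invocations are traced automatically. Plain Python functions can opt in with the SDK's `@traceable` decorator; a minimal sketch (the function body is a stand-in for your own code):

```python
from langsmith import traceable

@traceable  # records each call as a run under LANGCHAIN_PROJECT
def answer(question: str) -> str:
    # call your model or chain here; a stub keeps the sketch self-contained
    return f"You asked: {question}"

answer("Where's my order?")  # shows up as a trace in the project
```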
2 · Core Concepts
| Concept | What it is |
|---|---|
| Project | Top-level bucket that groups runs/traces. |
| Run / Span | Single invocation (chain, LLM, tool) recorded with metadata. |
| Dataset | Collection of input → expected output pairs, used for eval & fine-tuning. |
| Evaluation | Programmatic or LLM-as-judge scoring of a chain/LLM on a dataset. |
| Monitor | Dashboard card that tracks latency, cost, quality, etc., over time. |
| Prompt Canvas | GUI for prompt versioning, diffing, and collaboration. |
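
Most of these concepts map directly onto `langsmith.Client` calls (covered in Section 5); a quick sketch with placeholder names:

```python
from langsmith import Client

client = Client()

# Project → the bucket you query runs/traces from
for run in client.list_runs(project_name="my-project"):
    # Run → one recorded invocation and its metadata
    print(run.name, run.run_type, run.start_time)
    break

# Dataset → referenced by name when running evaluations
dataset = client.read_dataset(dataset_name="orders-test")
```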
3 · Tracing & Observability
Python (one-off)

```python
from langchain.callbacks import tracing_v2_enabled

with tracing_v2_enabled(project_name="demo"):
    result = chain.invoke({"input": "Hi"})
```
JS/TS (global)

```ts
import { traceable } from "langsmith/traceable";

const tracedChain = traceable((input: string) => chain.invoke({ input }), { name: "demo" });
await tracedChain("Hello");
```
Tips
- Tag runs: `chain.invoke(..., config={"tags": ["prod", "user123"]})`.
- Add custom metadata: `chain.invoke(..., config={"metadata": {"order_id": 42}})`.
- Log user feedback: `client.create_feedback(run_id, key="user_score", score=1, comment="👍")`.
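
The three tips combined in one sketch (`chain` is any LangChain runnable; the feedback key and `run_id` are placeholders you supply):

```python
from langsmith import Client

client = Client()

# Tags and metadata travel in the standard `config` argument
result = chain.invoke(
    {"input": "Hi"},
    config={"tags": ["prod", "user123"], "metadata": {"order_id": 42}},
)

# Attach user feedback to the recorded run (run_id comes from the UI,
# a tracing callback, or client.list_runs)
client.create_feedback(run_id, key="user_score", score=1, comment="👍")
```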
4 · CLI Reference (npm i -g langsmith)
```bash
langsmith login                     # OAuth or API key
langsmith projects list             # show all projects
langsmith logs -p my-project        # tail live traces
langsmith dataset create -n eval1   # make dataset from runs
langsmith eval run -d eval1 --chain app.py --evals qa embed_distance faithfulness
```
5 · Python Client Snippets
```python
from langsmith import Client
from langchain.smith import RunEvalConfig

client = Client()

# Read a run
a_run = client.read_run("<run-id>")

# Create dataset + examples
dataset = client.create_dataset("orders-test", description="Edge cases")
client.create_example(
    inputs={"query": "Where's my order?"},
    outputs={"answer": "Your order ships tomorrow."},
    dataset_id=dataset.id,
)

# Evaluate a chain on the dataset
client.run_on_dataset(
    dataset_name="orders-test",
    llm_or_chain_factory=lambda: chain,
    evaluation=RunEvalConfig(evaluators=["qa"]),
)
```
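
Newer SDK versions also ship a standalone `evaluate()` helper that takes any callable as the target; a hedged sketch (the `predict` wrapper and experiment prefix are illustrative):

```python
from langsmith.evaluation import evaluate, LangChainStringEvaluator

def predict(inputs: dict) -> dict:
    # `inputs` mirrors each example's inputs dict
    return {"answer": chain.invoke({"input": inputs["query"]})}

evaluate(
    predict,
    data="orders-test",                           # dataset name
    evaluators=[LangChainStringEvaluator("qa")],  # built-in LLM-as-judge
    experiment_prefix="orders-eval",
)
```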
6 · JS/TS Snippets
```ts
import { Client } from "langsmith";

const client = new Client();
const run = await client.readRun(runId);
```
7 · Evaluators Deep Dive
Built‑in evaluators
| Evaluator | Metric | Ideal Use Cases | Key Kwargs |
|---|---|---|---|
| qa | LLM-as-judge (0–1) correctness vs. reference | Q&A, summarization, generation quality | |
| string_distance | Levenshtein / ROUGE-L / BLEU | Deterministic text comparison | |
| embed_distance | Cosine similarity in embedding space | Semantic similarity, retrieval | |
| faithfulness | Context-grounded hallucination score | RAG answers, citation compliance | |
| json_diff | Structural diff of JSON | Tool responses & structured data | |
| toxicity | Perspective API score | Content safety & moderation | |
Running evaluators
```python
from langchain.smith import RunEvalConfig

client.run_on_dataset(
    dataset_name="orders-test",
    llm_or_chain_factory=lambda: chain,
    evaluation=RunEvalConfig(evaluators=["qa", "embed_distance", "faithfulness"]),
    max_examples=100,
)
```
CLI equivalent:

```bash
langsmith eval run -d orders-test --chain app.py --evals qa embed_distance faithfulness --max-examples 100
```
Custom evaluator skeleton
```python
import re

from langsmith.evaluation import RunEvaluator, EvaluationResult
from langsmith.schemas import Example, Run

class RegexPassFail(RunEvaluator):
    def __init__(self, pattern: str):
        self.pattern = re.compile(pattern)

    def evaluate_run(self, run: Run, example: Example | None = None) -> EvaluationResult:
        # The output key depends on your chain; "output" is a common default
        prediction = str((run.outputs or {}).get("output", ""))
        ok = bool(self.pattern.search(prediction))
        return EvaluationResult(key="regex_pass", score=int(ok), comment="✅" if ok else "❌")

# Plug it in via RunEvalConfig(custom_evaluators=[RegexPassFail(r"order_\d+")])
```
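
With the `evaluate()` helper, the same check can be a plain function that returns a score dict; a sketch reusing the `predict` target from the Section 5 sketch (the `"answer"` output key is an assumption about your target):

```python
import re

from langsmith.evaluation import evaluate

def regex_pass(run, example):
    # run.outputs is whatever the evaluated target returned
    prediction = str((run.outputs or {}).get("answer", ""))
    ok = bool(re.search(r"order_\d+", prediction))
    return {"key": "regex_pass", "score": int(ok)}

evaluate(predict, data="orders-test", evaluators=[regex_pass])
```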
Interpreting results
- score: float 0–1; `None` means skipped or errored.
- comment: model or evaluator reasoning; view it in the Evals tab.
- Dashboards automatically aggregate p50, p90, and pass-rate.
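
Scores are stored as feedback on the experiment's runs, so they can also be pulled programmatically; a sketch (copy the experiment's project name from the Evals tab):

```python
from langsmith import Client

client = Client()

# Each eval run writes its traces to its own experiment project
runs = list(client.list_runs(project_name="<experiment-project-name>"))
for fb in client.list_feedback(run_ids=[r.id for r in runs]):
    print(fb.key, fb.score, fb.comment)
```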
Tips & tricks
- Start with string/embedding metrics; layer LLM judges for samples that pass.
- Weight metrics to blend: `--weights qa:0.7 faithfulness:0.3`.
- Cache expensive LLM calls by setting `LANGCHAIN_CACHE="redis://host:6379"` (an in-code alternative is sketched below).
- Publish evaluators to the workspace registry → reuse via the UI's "Add Metric".
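
Caching can also be configured in code through LangChain's global LLM cache; a minimal in-memory sketch (swap in a Redis-backed cache for persistence across runs):

```python
from langchain_core.caches import InMemoryCache
from langchain_core.globals import set_llm_cache

# Identical prompts are served from the cache instead of re-hitting the model,
# which keeps repeated eval runs cheap and fast
set_llm_cache(InMemoryCache())
```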
8 · Monitoring Dashboards
- Go to Monitor → Charts in any project.
- Windows: 1h • 24h • 7d.
- Charts: token usage, latency p95, error rate, evaluation score.
9 · Security & Self‑Hosting
| Option | Purpose |
|---|---|
| LANGCHAIN_HIDE_INPUTS | Masks sensitive JSON keys. |
| Share links | Share a read-only project link. |
| Self-host | Deploy the full stack via Docker Compose. |
10 · Troubleshooting FAQ
| Symptom | Fix |
|---|---|
| No runs appear | Verify LANGCHAIN_TRACING_V2="true" and LANGCHAIN_API_KEY are set. |
| 403 errors | Check API key scopes & workspace membership. |
| Duplicate traces | Remove legacy callback handlers or disable DEBUG logs. |
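
A quick sanity check from Python when traces are missing (a sketch; any authenticated call will surface key or endpoint problems immediately):

```python
import os

from langsmith import Client

assert os.environ.get("LANGCHAIN_TRACING_V2") == "true", "tracing not enabled"
assert os.environ.get("LANGCHAIN_API_KEY"), "no API key set"

client = Client()
for project in client.list_projects():
    print("connected; first project:", project.name)
    break
```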
11 · Resources
- Docs & Quick Start → docs.smith.langchain.com
- Evaluator library → docs.smith.langchain.com/evaluation
- GitHub SDK → github.com/langchain-ai/langsmith
- Community Help → discord.gg/langchain