From 3,000 Support Tickets to a Searchable Brain.
Your company has thousands of tickets and 200-page manuals. No one has time to read them. Until now. Build a TypeScript RAG system that answers questions with page-level citations.
This is Part 2 of 4. If you missed the foundation, check out Part 1: The LlamaIndex 3-Phase Architecture.
What We're Building
The Project Roadmap
- Goal: Internal Knowledge Q&A.
- Tech: ChatEngine (multi-turn).
- Data: Markdown/Text policies.

- Goal: PDF Deep Search.
- Tech: QueryEngine + Source Attribution.
- Data: Real 200-page reports.

- Goal: Production RAG API.
- Tech: Express.js + TS.
- Data: Live endpoints for your frontend.
The Core Abstraction: The Document Object
Before building, you must understand how data is represented internally.
- ID: Unique identifier (doc-123-abc).
- Text: The raw content string.
- Metadata: Key-value pairs like file_name, page_label, or department.
- Impact: Metadata stays with every chunk, enabling perfect source tracking.
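To make "metadata stays with every chunk" concrete, here is a minimal sketch. The Doc and Chunk types, the toChunks helper, and the refund-policy.md file name are hypothetical stand-ins for illustration, not LlamaIndex's own classes, and the split is a naive character split rather than a real node parser.

```typescript
// Simplified stand-ins for illustration; not LlamaIndex's actual classes.
type Metadata = Record<string, string>;

interface Doc {
  id: string;
  text: string;
  metadata: Metadata;
}

interface Chunk {
  id: string;
  text: string;
  metadata: Metadata; // copied from the parent document
  sourceDocId: string;
}

// Split a document into fixed-size character pieces, copying metadata to each.
function toChunks(doc: Doc, size: number): Chunk[] {
  const chunks: Chunk[] = [];
  for (let start = 0; start < doc.text.length; start += size) {
    chunks.push({
      id: `${doc.id}-chunk-${chunks.length}`,
      text: doc.text.slice(start, start + size),
      metadata: { ...doc.metadata },
      sourceDocId: doc.id,
    });
  }
  return chunks;
}

// Hypothetical sample document.
const policy: Doc = {
  id: "doc-123-abc",
  text: "Refunds are issued within 14 days. Contact support to start a claim.",
  metadata: { file_name: "refund-policy.md", page_label: "3", department: "support" },
};

const policyChunks = toChunks(policy, 40);
```

Every entry in policyChunks carries the same file_name, page_label, and department as its parent, which is what lets an answer point back to a specific file and page later.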
Chunking & Tuning
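- Input: 2,000-token document.
- Process: Split into nodes using chunkSize=512 and chunkOverlap=50.
- Output: About 5 overlapping nodes that preserve context at boundaries. Each new node starts 462 tokens after the previous one, and the exact count depends on where sentences end.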
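The arithmetic behind the split can be checked with a small sketch. This is a plain fixed-window split over token positions, assuming a hypothetical windowRanges helper. A sentence-aware splitter such as SentenceSplitter cuts at sentence boundaries, so its node count can differ from this idealized version.

```typescript
interface TokenWindow {
  start: number; // inclusive token offset
  end: number;   // exclusive token offset
}

// Hypothetical helper: compute the token ranges of a fixed-size sliding window.
function windowRanges(totalTokens: number, chunkSize: number, chunkOverlap: number): TokenWindow[] {
  if (chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  const stride = chunkSize - chunkOverlap; // how far each new window advances
  const ranges: TokenWindow[] = [];
  for (let start = 0; start < totalTokens; start += stride) {
    const end = Math.min(start + chunkSize, totalTokens);
    ranges.push({ start, end });
    if (end === totalTokens) break; // the last window reached the end of the document
  }
  return ranges;
}

const tokenWindows = windowRanges(2000, 512, 50);
// Ranges: 0-512, 462-974, 924-1436, 1386-1898, 1848-2000
```

With a stride of 512 - 50 = 462 tokens, a plain fixed-window split of 2,000 tokens produces five windows, the last one shorter than the rest. Each window shares its first 50 tokens with the tail of the previous one, which is how context survives at the boundaries.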
Chunk Size Sweet Spots
- Small chunks. Best for: FAQ lookup, specific data points. Trade-off: Minimal context.
- Medium chunks. Best for: Most RAG apps (The Golden Rule). Trade-off: Balanced speed & context.
- Large chunks. Best for: Legal contracts, research papers. Trade-off: Slower, higher token cost.
Choosing Your Engine
QueryEngine vs. ChatEngine
QueryEngine
- Mode: Single-Turn Q&A.
- Memory: None (Stateless).
- Use Case: Search bars, batch processing.
ChatEngine
- Mode: Multi-Turn Conversation.
- Memory: Full (Stateful).
- Use Case: Support bots, interactive tutors.
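The stateless versus stateful difference comes down to who holds the conversation history. The sketch below uses hypothetical StatelessEngine and StatefulEngine classes and a fake answer function in place of an LLM call. It mirrors the idea only and does not reproduce LlamaIndex's QueryEngine or ChatEngine interfaces.

```typescript
type Turn = { role: "user" | "assistant"; content: string };
type Answerer = (question: string, history: Turn[]) => string;

// Hypothetical stand-in for an LLM call: reports how much history it was given.
const fakeAnswer: Answerer = (question, history) =>
  `answer to "${question}" (saw ${history.length} earlier turns)`;

// Single-turn: every call starts from an empty history.
class StatelessEngine {
  private readonly answer: Answerer;
  constructor(answer: Answerer) {
    this.answer = answer;
  }
  query(question: string): string {
    return this.answer(question, []);
  }
}

// Multi-turn: the engine keeps the transcript and passes it along each time.
class StatefulEngine {
  private readonly answer: Answerer;
  private readonly history: Turn[] = [];
  constructor(answer: Answerer) {
    this.answer = answer;
  }
  chat(message: string): string {
    const reply = this.answer(message, [...this.history]);
    this.history.push(
      { role: "user", content: message },
      { role: "assistant", content: reply },
    );
    return reply;
  }
}

const searchBar = new StatelessEngine(fakeAnswer);
const supportBot = new StatefulEngine(fakeAnswer);

const secondQuery = (searchBar.query("What is the refund window?"), searchBar.query("Does it apply to sale items?"));
const secondChat = (supportBot.chat("What is the refund window?"), supportBot.chat("Does it apply to sale items?"));
```

The follow-up question "Does it apply to sale items?" only makes sense if the engine remembers what "it" refers to. secondQuery sees zero earlier turns, while secondChat sees the two turns from the first exchange.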
Implementation Roadmap
6 Steps to Production
1. Use SimpleDirectoryReader to ingest PDFs, Markdown files, and CSVs.
2. Configure SentenceSplitter for the optimal chunk size.
3. Create a VectorStoreIndex from your documents.
4. Decide between ChatEngine and QueryEngine.
5. Wrap the engine in an Express.js server for frontend access.
6. Expose sourceNodes so users can verify every answer.
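For the last step, one way to turn source nodes into something a frontend can render is sketched below. The SourceNode shape here is an assumption for illustration (text, a similarity score, and file_name/page_label metadata). Check the actual response object of your LlamaIndex version before relying on these field names. The sample file names are hypothetical.

```typescript
// Assumed shape of a retrieved chunk; field names are illustrative only.
interface SourceNode {
  text: string;
  score: number;
  metadata: { file_name: string; page_label?: string };
}

interface Citation {
  source: string;
  page: string | null;
  score: number;
  excerpt: string;
}

// Sort by relevance and keep only what the UI needs to show a "Source" link.
function toCitations(nodes: SourceNode[], excerptLength = 80): Citation[] {
  return [...nodes]
    .sort((a, b) => b.score - a.score)
    .map((node) => ({
      source: node.metadata.file_name,
      page: node.metadata.page_label ?? null,
      score: node.score,
      excerpt:
        node.text.length > excerptLength
          ? node.text.slice(0, excerptLength) + "..."
          : node.text,
    }));
}

// Hypothetical retrieval result.
const citations = toCitations([
  {
    text: "Refunds are issued within 14 days of purchase.",
    score: 0.78,
    metadata: { file_name: "refund-policy.md" },
  },
  {
    text: "Annual report, section 4: support ticket volume.",
    score: 0.91,
    metadata: { file_name: "annual-report.pdf", page_label: "42" },
  },
]);
```

Returning this array next to the answer text in the API response gives the frontend a file name and, where the loader recorded one, a page number for each claim.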
Key Takeaways
Don't just load text. Add custom metadata like department or last_updated to your documents; it makes filtering much more powerful.
In production, never show an answer without a 'Source' link. LlamaIndex's sourceNodes makes this a one-line implementation.
Build your index when the server starts. Rebuilding the index on every request re-embeds every document each time, which wastes tokens and time.
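The build-once advice can be sketched as a cached promise. Here buildIndex is a hypothetical placeholder for the expensive load, chunk, and embed step, and BuiltIndex, getIndex, and handleRequest are illustrative names rather than library functions.

```typescript
// Hypothetical placeholder for the real index type.
type BuiltIndex = { builtAt: number; docCount: number };

let buildCount = 0; // counts how often the expensive step actually runs

// Stand-in for loading documents, chunking them, and computing embeddings.
async function buildIndex(): Promise<BuiltIndex> {
  buildCount += 1;
  return { builtAt: Date.now(), docCount: 3000 };
}

// Cache the promise, not the result, so concurrent callers share one build.
let indexPromise: Promise<BuiltIndex> | null = null;

function getIndex(): Promise<BuiltIndex> {
  if (indexPromise === null) {
    indexPromise = buildIndex();
  }
  return indexPromise;
}

// Every request handler reuses the same index.
async function handleRequest(question: string): Promise<string> {
  const index = await getIndex();
  return `answered "${question}" using ${index.docCount} documents`;
}

// Call once at startup so the first user does not pay the build cost.
void getIndex();
```

Caching the promise means two requests arriving during startup await the same build instead of starting two. In an Express.js server, the route handler would call getIndex() in the same way handleRequest does here.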