RAG without Vectors: The End of Document 'Chunking' in AI?
The Dirty Problem of RAG
If you work with generative AI, you’ve heard of RAG (Retrieval-Augmented Generation). It’s the technique that allows AI to consult your company’s documents to give precise answers.
The concept is brilliant:
- AI doesn’t need to memorize everything
- Searches for information in your documents
- Combines search + text generation
- Answers based on real data
But traditional RAG has a “dirty” problem:
It chops your documents into pieces, and in the process the model often loses the thread or misses tables and important context.
How Traditional RAG Works (and Fails)
The Standard Process
Step 1: Chunking
100-page document
↓
Divided into 500 chunks of ~200 words
↓
Each chunk becomes a mathematical vector
↓
Stored in vector database
Step 2: Search
User asks: "What was net profit?"
↓
Question becomes vector
↓
Searches similar chunks (semantic similarity)
↓
Returns 3-5 most relevant chunks
Step 3: Generation
AI receives chunks + question
↓
Generates answer based on chunks
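As a concrete (toy) sketch of these three steps, here’s a minimal pipeline in Python. The word-count “embedding” stands in for a real embedding model, and `chunk`, `embed`, and `retrieve` are hypothetical helper names; a production system would use an embedding API plus a vector database such as FAISS or pgvector:

```python
import math
import re
from collections import Counter

def chunk(text: str, size: int = 200) -> list[str]:
    """Step 1: split the document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    """Toy 'vector': bag-of-words counts. Real systems call an embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Step 2: rank chunks by similarity to the question, keep the top k."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

# Step 3 would send the retrieved chunks + question to the LLM.
doc = "Revenue grew to $500M in 2025. Costs were $300M. Net profit was $200M."
top = retrieve("What was net profit?", chunk(doc, size=6))
print(top[0])  # the chunk containing "Net profit"
```

The tiny chunk size here is only to make the ranking visible; the same mechanics apply at ~200-word chunks.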
What Goes Wrong?
Problem 1: Broken Context
Chunk 237: "...as shown in Table 5"
Chunk 238: [Table 5 was here, but ended up in another chunk]
Chunk 239: "Based on this data..."
Result: AI doesn't see Table 5 when needed
Problem 2: Scattered Information
Document says:
Page 12: "Revenue: $500M"
Page 87: "Costs: $300M"
Page 143: "Profit: $200M"
Question: "What's the profit margin?"
Traditional RAG: May grab only 1 or 2 chunks
Answer: Incomplete or wrong
Problem 3: Destroyed Tables
Original table:
| Product | Q1 | Q2 | Q3 |
|---------|-----|-----|-----|
| A | 100 | 150 | 200 |
| B | 50 | 75 | 90 |
After chunking:
Chunk X: "| Product | Q1 | Q2"
Chunk Y: "| Q3 | |---------|-----|"
Chunk Z: "75 | 90 |"
AI: "I can't understand this table"
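You can reproduce this failure mode in a few lines: slice the table above at a fixed character length (the 40-character window is arbitrary) and no chunk keeps a complete, parseable table:

```python
table = (
    "| Product | Q1  | Q2  | Q3  |\n"
    "|---------|-----|-----|-----|\n"
    "| A       | 100 | 150 | 200 |\n"
    "| B       | 50  | 75  | 90  |"
)

# Naive fixed-size chunking: split every 40 characters, ignoring structure.
chunks = [table[i:i + 40] for i in range(0, len(table), 40)]

for i, c in enumerate(chunks):
    print(f"Chunk {i}: {c!r}")

# The full table has 20 pipe characters; no single chunk retains them all,
# and chunk boundaries cut rows in half.
```

Structure-aware chunkers mitigate this, but only if they recognize the table as a unit in the first place.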
Problem 4: Structure Loss
Document has:
- Section 1: Introduction
- 1.1 Context
- 1.2 Objectives
- Section 2: Methodology
- 2.1 Approach
- 2.2 Data
Traditional RAG: Ignores hierarchy
AI doesn't know 1.1 and 1.2 are related
PageIndex: The Radical 2026 Solution
Looking toward 2026, one radical solution stands out: PageIndex (also known as RAG without vectors).
From “Search by Similarity” to “Structured Reasoning”
In common RAG:
- AI searches for similar words
- Mathematical vectors
- No structure understanding
In PageIndex:
- More human approach
- Understands logical structure
- Navigates like you’d read an index
How PageIndex Works
1. Content Tree
Instead of chopping text, AI reads the entire document and creates a tree structure, like an ultra-detailed table of contents (in JSON format) that stays in the model’s “working memory”.
Example of generated tree:
{
  "document": "Annual Report 2025",
  "sections": [
    {
      "id": "1",
      "title": "Executive Summary",
      "page_range": [1, 5],
      "subsections": [
        {
          "id": "1.1",
          "title": "Financial Highlights",
          "page": 2,
          "content_summary": "Revenue $500M, profit $200M",
          "has_table": true,
          "table_ref": "Table_1_Financial_Summary"
        }
      ]
    }
  ],
  "tables": [
    {
      "id": "Table_1_Financial_Summary",
      "location": "page 2",
      "columns": ["Metric", "2024", "2025"],
      "referenced_in": ["1.1", "3.2"]
    }
  ]
}
The AI created a “mind map” of the document.
2. Intelligent Navigation
When you ask a question, the AI doesn’t search for loose keywords.
Reasoning process:
Question: "What was revenue growth?"
AI thinks:
1. "This is about finance"
2. "Probably in Executive Summary or Financial Analysis"
3. Consults tree → identifies section 1.1
4. "Section 1.1 has a financial table"
5. Goes directly to Table_1_Financial_Summary
6. Reads relevant data
7. Calculates: ($500M - $400M) / $400M = 25%
8. Answers: "25% growth"
It looks at the table of contents, reasons about which section should have the answer (e.g., “this should be in Section 4”) and goes straight to the point.
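That navigation loop can be sketched against the JSON tree from the previous section. The keyword-matching `pick_section` is a crude stand-in for the LLM call that decides which branch to follow, and the inline `rows` data is invented for the demo:

```python
tree = {
    "document": "Annual Report 2025",
    "sections": [
        {"id": "1", "title": "Executive Summary",
         "subsections": [
             {"id": "1.1", "title": "Financial Highlights",
              "content_summary": "Revenue $500M, profit $200M",
              "has_table": True, "table_ref": "Table_1_Financial_Summary"}
         ]}
    ],
    "tables": [
        {"id": "Table_1_Financial_Summary",
         "rows": {"Revenue": {"2024": 400, "2025": 500}}}
    ],
}

def pick_section(question: str, tree: dict) -> dict:
    """Stand-in for the LLM's 'which section should hold the answer?' step."""
    terms = question.lower().split()
    for sec in tree["sections"]:
        for sub in sec.get("subsections", []):
            text = (sub["title"] + " " + sub.get("content_summary", "")).lower()
            if any(t in text for t in terms):
                return sub
    return tree["sections"][0]

def answer_growth(question: str, tree: dict) -> float:
    sub = pick_section(question, tree)
    # Follow the table reference instead of hoping a chunk happens to contain it.
    table = next(t for t in tree["tables"] if t["id"] == sub["table_ref"])
    rev = table["rows"]["Revenue"]
    return (rev["2025"] - rev["2024"]) / rev["2024"]

print(answer_growth("What was revenue growth?", tree))  # 0.25 → 25%
```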
3. Cross-Reference
Problem solved:
Text says: "As shown in Table 3..."
Traditional RAG:
- Doesn't know where Table 3 is
- Ignores the reference
PageIndex:
- Sees reference to Table 3
- Consults tree
- Finds: table_ref: "Table_3_Market_Share"
- Navigates to table
- Connects information
If the text says “see table 3”, AI can navigate the tree, find the table, and connect information.
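The resolution step itself is little more than a lookup in the tree’s table index. A sketch (the `number` field is an assumption added for the demo):

```python
import re

tree = {
    "tables": [
        {"id": "Table_3_Market_Share", "location": "page 41", "number": 3},
    ]
}

def resolve_table_reference(sentence: str, tree: dict):
    """Turn 'As shown in Table 3...' into the actual table node."""
    m = re.search(r"table\s+(\d+)", sentence, flags=re.IGNORECASE)
    if not m:
        return None
    n = int(m.group(1))
    return next((t for t in tree["tables"] if t["number"] == n), None)

hit = resolve_table_reference("As shown in Table 3, share grew.", tree)
```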
The Result: Crushing Precision
Real Benchmarks
In financial benchmark tests, this approach has been reported to reach 98% precision, far surpassing traditional RAG.
Comparison:
| Metric | Traditional RAG | PageIndex |
|---|---|---|
| Precision | 73% | 98% |
| Recall (finds all) | 65% | 95% |
| Tables | 45% | 97% |
| Cross-references | 20% | 92% |
| Cost per query | $0.02 | $0.15 |
| Latency | 2s | 8s |
For companies dealing with complex contracts or annual reports of hundreds of pages, this changes the game.
Perfect Use Cases
✅ Excellent for:
- Complex legal contracts
- Annual financial reports
- Structured technical documentation
- Manuals with many tables/references
- M&A due diligence
- Compliance and auditing
❌ Not worth it for:
- Simple FAQs
- Short documents (<10 pages)
- Search across thousands of documents
- Cases where speed > precision
The “Price” of Intelligence
It’s not all roses, though. This technology faces two real challenges:
1. Cost and Latency
The problem:
Since AI needs to make multiple “calls” to navigate the content tree, the process is slower and more expensive than a simple search.
Navigation example:
Call 1: Create document tree ($0.05)
Call 2: Analyze question and decide section ($0.02)
Call 3: Read specific section ($0.03)
Call 4: Search referenced table ($0.02)
Call 5: Synthesize answer ($0.03)
Total: $0.15 per query (vs $0.02 traditional RAG)
Time: 8 seconds (vs 2 seconds)
Trade-off:
- 7.5x more expensive
- 4x slower
- But 25 percentage points more accurate (73% → 98%)
Worth it? Depends on the use case.
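The trade-off numbers follow directly from the per-call costs above, which a few lines of arithmetic confirm:

```python
# Per-call costs from the navigation example above (illustrative figures).
calls = {
    "build_tree": 0.05,
    "pick_section": 0.02,
    "read_section": 0.03,
    "fetch_table": 0.02,
    "synthesize": 0.03,
}
pageindex_cost = sum(calls.values())   # $0.15 per query
vector_cost = 0.02                     # traditional RAG

print(round(pageindex_cost, 2))                 # 0.15
print(round(pageindex_cost / vector_cost, 1))   # 7.5 — times more expensive
print(8 / 2)                                    # 4.0 — times slower
```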
2. Memory Limit
The problem:
The tree structure needs to fit in the AI’s context window.
Real numbers:
Claude 3.5 Sonnet: 200k context tokens
100-page document:
- Text: ~50k tokens
- JSON tree: ~20k tokens
- Space for answer: ~10k tokens
Total used: ~80k tokens
✅ Works!
Library with 50 documents:
- 50 × 50k = 2.5M tokens
❌ Doesn't fit!
Trying to apply this to an entire document library is not yet viable.
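Before committing to this approach, the same budget check can be automated. The token figures below are the rough estimates used above, and the 200k limit matches the Claude 3.5 Sonnet figure:

```python
CONTEXT_WINDOW = 200_000  # e.g. Claude 3.5 Sonnet

def fits_in_context(n_docs: int, tokens_per_doc: int = 50_000,
                    tree_tokens: int = 20_000,
                    answer_reserve: int = 10_000) -> bool:
    """Rough check: do the documents + tree + answer budget fit at once?"""
    needed = n_docs * tokens_per_doc + tree_tokens + answer_reserve
    return needed <= CONTEXT_WINDOW

print(fits_in_context(1))   # True  — one 100-page document (~80k tokens) fits
print(fits_in_context(50))  # False — a 50-document library (~2.5M tokens) does not
```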
Partial solutions:
- More compact trees (summaries)
- Layered hierarchy (search document first, then detail)
- Models with larger context (Gemini 1.5: 1M tokens)
The Evolution of the AI Professional
The Orchestrator in Action
This reinforces our thesis of the AI Orchestrator.
The successful professional is not just someone who “installs” RAG, but someone who understands:
When to use Traditional RAG (vectorial):
Scenario: Product FAQ
- 1000 common questions
- Short answers
- Speed matters
- Cost matters
Decision: Vectorial RAG (fast and cheap)
When to use PageIndex:
Scenario: $10M contract analysis
- 200-page document
- Needs 98% precision
- Error can cost millions
- Client expects 1-day analysis
Decision: PageIndex (slow but precise)
When to use Hybrid:
Scenario: Technical support system
- 80% simple questions → Vectorial RAG
- 15% medium questions → RAG + human validation
- 5% complex questions → PageIndex
Decision: Intelligent routing
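The routing logic itself can start as a simple heuristic. The complexity scorer below is illustrative only; in practice you would likely use a small classifier model:

```python
def classify_complexity(question: str) -> str:
    """Crude stand-in for a real classifier: questions that mention tables,
    clauses, or comparisons count as harder."""
    q = question.lower()
    score = sum(w in q for w in ("table", "clause", "compare", "cross-reference"))
    if len(question.split()) > 25:  # long, multi-part questions
        score += 1
    return ("simple", "medium", "complex")[min(score, 2)]

def route(question: str) -> str:
    """Map complexity level to the pipeline that handles it."""
    return {
        "simple": "vector_rag",             # ~80% of traffic: fast and cheap
        "medium": "vector_rag_plus_human",  # retrieval + human validation
        "complex": "pageindex",             # precision-critical queries
    }[classify_complexity(question)]

print(route("What are your opening hours?"))  # vector_rag
print(route("Compare the indemnity clause in table 4 "
            "with the 2024 version."))        # pageindex
```

The thresholds and keyword list are not tuned; the point is that routing is ordinary code sitting in front of two retrieval backends.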
The New Skills
❌ No longer enough:
- Knowing how to install RAG library
- Running LangChain tutorial
- Applying same solution to everything
✅ Necessary:
- Understanding trade-offs (cost vs precision vs speed)
- Architecting hybrid solutions
- Measuring what matters (not just “works”)
- Optimizing costs without sacrificing quality
- Knowing when new technology is worth the investment
The Future of Document Search
2026-2027: Three Approaches Coexisting
Level 1: Vectorial RAG (commodity)
- Simple cases
- High scale
- Low cost
- 70-80% precision
Level 2: Hybrid RAG (emerging standard)
- Vectorial for initial filter
- PageIndex for refinement
- 85-92% precision
- Medium cost
Level 3: Pure PageIndex (premium)
- Critical cases
- Maximum precision (95-98%)
- High cost justified
- Acceptable latency
The right choice depends on context, not fashion.
Conclusion
RAG without vectors (PageIndex) is not the replacement for traditional RAG.
It’s an additional tool in the AI professional’s arsenal.
Main lessons:
1. New technology ≠ always better
- PageIndex is more accurate
- But it’s also more expensive and slower
- It’s not always worth it
2. Context is king
- Simple FAQ? Vectorial RAG
- Critical contract? PageIndex
- Hybrid? Probably
3. Orchestration is the skill
- Knowing which tool to use, and when
- Optimizing cost without losing quality
- Measuring real impact
4. Precision has a price
- 98% vs 73% precision comes at 7.5x the cost
- Sometimes it’s worth it (legal analysis)
- Sometimes it isn’t (email search)
5. The professional evolves
- From installer to architect
- From executor to orchestrator
- From technician to strategist
What Do You Prefer?
A fast AI that “guesses” based on similarity or a slightly slower AI that understands your document’s logical structure with 98% precision?
In your field, does the extra precision justify the cost?
How would you decide between the two approaches?
Share your opinion:
- Email: fodra@fodra.com.br
- LinkedIn: linkedin.com/in/mauriciofodra
The future isn’t about having the newest technology. It’s about using the right technology for the right problem.
Read Also
- Neural Networks: Understanding the Brain Behind Modern AI — The fundamentals behind how AI processes documents.
- Introduction to Machine Learning for Beginners — Base concepts to understand why RAG is necessary.
- The Illusion of Intelligence: Why AI Still ‘Freezes’ When Facing the New — RAG tries to solve exactly this context limitation.