RAG without Vectors: The End of Document 'Chunking' in AI?
The Dirty Problem of RAG
If you work with generative AI, you’ve heard of RAG (Retrieval-Augmented Generation). It’s the technique that allows AI to consult your company’s documents to give precise answers.
The concept is brilliant:
- AI doesn’t need to memorize everything
- Searches for information in your documents
- Combines search + text generation
- Answers based on real data
But traditional RAG has a “dirty” problem:
It chops your documents into pieces, and in the process the model often loses the thread or misses tables and important context.
How Traditional RAG Works (and Fails)
The Standard Process
Step 1: Chunking
100-page document
↓
Divided into 500 chunks of ~200 words
↓
Each chunk becomes a mathematical vector
↓
Stored in vector database
Step 2: Search
User asks: "What was net profit?"
↓
Question becomes vector
↓
Searches similar chunks (semantic similarity)
↓
Returns 3-5 most relevant chunks
Step 3: Generation
AI receives chunks + question
↓
Generates answer based on chunks
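As a concrete (toy) sketch of these three steps, here’s a minimal pipeline in Python. The word-count “embedding” stands in for a real embedding model, and `chunk`, `embed`, and `retrieve` are hypothetical helper names; a production system would use an embedding API plus a vector database such as FAISS or pgvector:

```python
import math
import re
from collections import Counter

def chunk(text: str, size: int = 200) -> list[str]:
    """Step 1: split the document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    """Toy 'vector': bag-of-words counts. Real systems call an embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Step 2: rank chunks by similarity to the question, keep the top k."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

# Step 3 would send the retrieved chunks + question to the LLM.
doc = "Revenue grew to $500M in 2025. Costs were $300M. Net profit was $200M."
top = retrieve("What was net profit?", chunk(doc, size=6))
print(top[0])  # the chunk containing "Net profit"
```

The tiny chunk size here is only to make the ranking visible; the same mechanics apply at ~200-word chunks.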
What Goes Wrong?
Problem 1: Broken Context
Chunk 237: "...as shown in Table 5"
Chunk 238: [Table 5 was here, but ended up in another chunk]
Chunk 239: "Based on this data..."
Result: AI doesn't see Table 5 when needed
Problem 2: Scattered Information
Document says:
Page 12: "Revenue: $500M"
Page 87: "Costs: $300M"
Page 143: "Profit: $200M"
Question: "What's the profit margin?"
Traditional RAG: May grab only 1 or 2 chunks
Answer: Incomplete or wrong
Problem 3: Destroyed Tables
Original table:
| Product | Q1 | Q2 | Q3 |
|---------|-----|-----|-----|
| A | 100 | 150 | 200 |
| B | 50 | 75 | 90 |
After chunking:
Chunk X: "| Product | Q1 | Q2"
Chunk Y: "| Q3 | |---------|-----|"
Chunk Z: "75 | 90 |"
AI: "I can't understand this table"
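You can reproduce this failure mode in a few lines: slice the table above at a fixed character length (the 40-character window is arbitrary) and no chunk keeps a complete, parseable table:

```python
table = (
    "| Product | Q1  | Q2  | Q3  |\n"
    "|---------|-----|-----|-----|\n"
    "| A       | 100 | 150 | 200 |\n"
    "| B       | 50  | 75  | 90  |"
)

# Naive fixed-size chunking: split every 40 characters, ignoring structure.
chunks = [table[i:i + 40] for i in range(0, len(table), 40)]

for i, c in enumerate(chunks):
    print(f"Chunk {i}: {c!r}")

# The full table has 20 pipe characters; no single chunk retains them all,
# and chunk boundaries cut rows in half.
```

Structure-aware chunkers mitigate this, but only if they recognize the table as a unit in the first place.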
Problem 4: Structure Loss
Document has:
- Section 1: Introduction
- 1.1 Context
- 1.2 Objectives
- Section 2: Methodology
- 2.1 Approach
- 2.2 Data
Traditional RAG: Ignores hierarchy
AI doesn't know 1.1 and 1.2 are related
PageIndex: The Radical 2026 Solution
Looking toward 2026, one radical solution stands out: PageIndex (also known as RAG without vectors).
From “Search by Similarity” to “Structured Reasoning”
In common RAG:
- AI searches for similar words
- Mathematical vectors
- No structure understanding
In PageIndex:
- More human approach
- Understands logical structure
- Navigates like you’d read an index
How PageIndex Works
1. Content Tree
Instead of chopping text, AI reads the entire document and creates a tree structure, like an ultra-detailed table of contents (in JSON format) that stays in the model’s “working memory”.
Example of generated tree:
{
  "document": "Annual Report 2025",
  "sections": [
    {
      "id": "1",
      "title": "Executive Summary",
      "page_range": [1, 5],
      "subsections": [
        {
          "id": "1.1",
          "title": "Financial Highlights",
          "page": 2,
          "content_summary": "Revenue $500M, profit $200M",
          "has_table": true,
          "table_ref": "Table_1_Financial_Summary"
        }
      ]
    }
  ],
  "tables": [
    {
      "id": "Table_1_Financial_Summary",
      "location": "page 2",
      "columns": ["Metric", "2024", "2025"],
      "referenced_in": ["1.1", "3.2"]
    }
  ]
}
The AI created a “mind map” of the document.
2. Intelligent Navigation
When you ask a question, the AI doesn’t search for loose keywords.
Reasoning process:
Question: "What was revenue growth?"
AI thinks:
1. "This is about finance"
2. "Probably in Executive Summary or Financial Analysis"
3. Consults tree → identifies section 1.1
4. "Section 1.1 has a financial table"
5. Goes directly to Table_1_Financial_Summary
6. Reads relevant data
7. Calculates: ($500M - $400M) / $400M = 25%
8. Answers: "25% growth"
It looks at the table of contents, reasons about which section should have the answer (e.g., “this should be in Section 4”) and goes straight to the point.
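That navigation loop can be sketched against the JSON tree from the previous section. The keyword-matching `pick_section` is a crude stand-in for the LLM call that decides which branch to follow, and the inline `rows` data is invented for the demo:

```python
tree = {
    "document": "Annual Report 2025",
    "sections": [
        {"id": "1", "title": "Executive Summary",
         "subsections": [
             {"id": "1.1", "title": "Financial Highlights",
              "content_summary": "Revenue $500M, profit $200M",
              "has_table": True, "table_ref": "Table_1_Financial_Summary"}
         ]}
    ],
    "tables": [
        {"id": "Table_1_Financial_Summary",
         "rows": {"Revenue": {"2024": 400, "2025": 500}}}
    ],
}

def pick_section(question: str, tree: dict) -> dict:
    """Stand-in for the LLM's 'which section should hold the answer?' step."""
    terms = question.lower().split()
    for sec in tree["sections"]:
        for sub in sec.get("subsections", []):
            text = (sub["title"] + " " + sub.get("content_summary", "")).lower()
            if any(t in text for t in terms):
                return sub
    return tree["sections"][0]

def answer_growth(question: str, tree: dict) -> float:
    sub = pick_section(question, tree)
    # Follow the table reference instead of hoping a chunk happens to contain it.
    table = next(t for t in tree["tables"] if t["id"] == sub["table_ref"])
    rev = table["rows"]["Revenue"]
    return (rev["2025"] - rev["2024"]) / rev["2024"]

print(answer_growth("What was revenue growth?", tree))  # 0.25 → 25%
```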
3. Cross-Reference
Problem solved:
Text says: "As shown in Table 3..."
Traditional RAG:
- Doesn't know where Table 3 is
- Ignores the reference
PageIndex:
- Sees reference to Table 3
- Consults tree
- Finds: table_ref: "Table_3_Market_Share"
- Navigates to table
- Connects information
If the text says “see table 3”, AI can navigate the tree, find the table, and connect information.
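The resolution step itself is little more than a lookup in the tree’s table index. A sketch (the `number` field is an assumption added for the demo):

```python
import re

tree = {
    "tables": [
        {"id": "Table_3_Market_Share", "location": "page 41", "number": 3},
    ]
}

def resolve_table_reference(sentence: str, tree: dict):
    """Turn 'As shown in Table 3...' into the actual table node."""
    m = re.search(r"table\s+(\d+)", sentence, flags=re.IGNORECASE)
    if not m:
        return None
    n = int(m.group(1))
    return next((t for t in tree["tables"] if t["number"] == n), None)

hit = resolve_table_reference("As shown in Table 3, share grew.", tree)
```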
The Result: Crushing Precision
Real Benchmarks
In financial benchmark tests, this approach has been reported to reach 98% precision, far surpassing traditional RAG.
Comparison:
| Metric | Traditional RAG | PageIndex |
|---|---|---|
| Precision | 73% | 98% |
| Recall (finds all) | 65% | 95% |
| Tables | 45% | 97% |
| Cross-references | 20% | 92% |
| Cost per query | $0.02 | $0.15 |
| Latency | 2s | 8s |
For companies dealing with complex contracts or annual reports of hundreds of pages, this changes the game.
Perfect Use Cases
✅ Excellent for:
- Complex legal contracts
- Annual financial reports
- Structured technical documentation
- Manuals with many tables/references
- M&A due diligence
- Compliance and auditing
❌ Not worth it for:
- Simple FAQs
- Short documents (<10 pages)
- Search across thousands of documents
- Cases where speed > precision
The “Price” of Intelligence
It’s not all roses, though. This technology faces two real challenges:
1. Cost and Latency
The problem:
Since AI needs to make multiple “calls” to navigate the content tree, the process is slower and more expensive than a simple search.
Navigation example:
Call 1: Create document tree ($0.05)
Call 2: Analyze question and decide section ($0.02)
Call 3: Read specific section ($0.03)
Call 4: Search referenced table ($0.02)
Call 5: Synthesize answer ($0.03)
Total: $0.15 per query (vs $0.02 traditional RAG)
Time: 8 seconds (vs 2 seconds)
Trade-off:
- 7.5x more expensive
- 4x slower
- But 25 percentage points more accurate (73% → 98%)
Worth it? Depends on the use case.
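The trade-off numbers follow directly from the per-call costs above, which a few lines of arithmetic confirm:

```python
# Per-call costs from the navigation example above (illustrative figures).
calls = {
    "build_tree": 0.05,
    "pick_section": 0.02,
    "read_section": 0.03,
    "fetch_table": 0.02,
    "synthesize": 0.03,
}
pageindex_cost = sum(calls.values())   # $0.15 per query
vector_cost = 0.02                     # traditional RAG

print(round(pageindex_cost, 2))                 # 0.15
print(round(pageindex_cost / vector_cost, 1))   # 7.5 — times more expensive
print(8 / 2)                                    # 4.0 — times slower
```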
2. Memory Limit
The problem:
The tree structure needs to fit in the AI’s context window.
Real numbers:
Claude 3.5 Sonnet: 200k context tokens
100-page document:
- Text: ~50k tokens
- JSON tree: ~20k tokens
- Space for answer: ~10k tokens
Total used: ~80k tokens
✅ Works!
Library with 50 documents:
- 50 × 50k = 2.5M tokens
❌ Doesn't fit!
Trying to apply this to an entire document library is not yet viable.
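Before committing to this approach, the same budget check can be automated. The token figures below are the rough estimates used above, and the 200k limit matches the Claude 3.5 Sonnet figure:

```python
CONTEXT_WINDOW = 200_000  # e.g. Claude 3.5 Sonnet

def fits_in_context(n_docs: int, tokens_per_doc: int = 50_000,
                    tree_tokens: int = 20_000,
                    answer_reserve: int = 10_000) -> bool:
    """Rough check: do the documents + tree + answer budget fit at once?"""
    needed = n_docs * tokens_per_doc + tree_tokens + answer_reserve
    return needed <= CONTEXT_WINDOW

print(fits_in_context(1))   # True  — one 100-page document (~80k tokens) fits
print(fits_in_context(50))  # False — a 50-document library (~2.5M tokens) does not
```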
Partial solutions:
- More compact trees (summaries)
- Layered hierarchy (search document first, then detail)
- Models with larger context (Gemini 1.5: 1M tokens)
The Evolution of the AI Professional
The Orchestrator in Action
This reinforces our thesis of the AI Orchestrator.
The successful professional is not just someone who “installs” RAG, but someone who understands:
When to use Traditional RAG (vectorial):
Scenario: Product FAQ
- 1000 common questions
- Short answers
- Speed matters
- Cost matters
Decision: Vectorial RAG (fast and cheap)
When to use PageIndex:
Scenario: $10M contract analysis
- 200-page document
- Needs 98% precision
- Error can cost millions
- Client expects 1-day analysis
Decision: PageIndex (slow but precise)
When to use Hybrid:
Scenario: Technical support system
- 80% simple questions → Vectorial RAG
- 15% medium questions → RAG + human validation
- 5% complex questions → PageIndex
Decision: Intelligent routing
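The routing logic itself can start as a simple heuristic. The complexity scorer below is illustrative only; in practice you would likely use a small classifier model:

```python
def classify_complexity(question: str) -> str:
    """Crude stand-in for a real classifier: questions that mention tables,
    clauses, or comparisons count as harder."""
    q = question.lower()
    score = sum(w in q for w in ("table", "clause", "compare", "cross-reference"))
    if len(question.split()) > 25:  # long, multi-part questions
        score += 1
    return ("simple", "medium", "complex")[min(score, 2)]

def route(question: str) -> str:
    """Map complexity level to the pipeline that handles it."""
    return {
        "simple": "vector_rag",             # ~80% of traffic: fast and cheap
        "medium": "vector_rag_plus_human",  # retrieval + human validation
        "complex": "pageindex",             # precision-critical queries
    }[classify_complexity(question)]

print(route("What are your opening hours?"))  # vector_rag
print(route("Compare the indemnity clause in table 4 "
            "with the 2024 version."))        # pageindex
```

The thresholds and keyword list are not tuned; the point is that routing is ordinary code sitting in front of two retrieval backends.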
The New Skills
❌ No longer enough:
- Knowing how to install RAG library
- Running LangChain tutorial
- Applying same solution to everything
✅ Necessary:
- Understanding trade-offs (cost vs precision vs speed)
- Architecting hybrid solutions
- Measuring what matters (not just “works”)
- Optimizing costs without sacrificing quality
- Knowing when new technology is worth the investment
The Future of Document Search
2026-2027: Three Approaches Coexisting
Level 1: Vectorial RAG (commodity)
- Simple cases
- High scale
- Low cost
- 70-80% precision
Level 2: Hybrid RAG (emerging standard)
- Vectorial for initial filter
- PageIndex for refinement
- 85-92% precision
- Medium cost
Level 3: Pure PageIndex (premium)
- Critical cases
- Maximum precision (95-98%)
- High cost justified
- Acceptable latency
The right choice depends on context, not fashion.
Conclusion
RAG without vectors (PageIndex) is not the replacement for traditional RAG.
It’s an additional tool in the AI professional’s arsenal.
Main lessons:
1. New technology ≠ always better
- PageIndex is more accurate
- But it’s also more expensive and slower
- It’s not always worth it
2. Context is king
- Simple FAQ? Vectorial RAG
- Critical contract? PageIndex
- Hybrid? Probably
3. Orchestration is the skill
- Knowing which tool to use, and when
- Optimizing cost without losing quality
- Measuring real impact
4. Precision has a price
- 98% vs 73% precision comes at 7.5x the cost
- Sometimes it’s worth it (legal analysis)
- Sometimes it isn’t (email search)
5. The professional evolves
- From installer to architect
- From executor to orchestrator
- From technician to strategist
What Do You Prefer?
A fast AI that “guesses” based on similarity or a slightly slower AI that understands your document’s logical structure with 98% precision?
In your field, does the extra precision justify the cost?
How would you decide between the two approaches?
Share your opinion:
- Email: fodra@fodra.com.br
- LinkedIn: linkedin.com/in/mauriciofodra
The future isn’t about having the newest technology. It’s about using the right technology for the right problem.
Read Also
- Neural Networks: Understanding the Brain Behind Modern AI — The fundamentals behind how AI processes documents.
- Introduction to Machine Learning for Beginners — Base concepts to understand why RAG is necessary.
- The Illusion of Intelligence: Why AI Still ‘Freezes’ When Facing the New — RAG tries to solve exactly this context limitation.