
Ordering Retrieved Chunks
Strategically position your documents within the prompt to maximize the LLM's attention and accuracy.
Where you place a document in your prompt matters. Research on long-context behavior (the "lost in the middle" effect) shows that models reliably use information at the beginning and end of a prompt, while information buried in the middle is often overlooked.
The "Best at Bottom" Strategy
For most models (especially Claude and GPT-4), placing the highest-ranked (most relevant) document closest to the user's question, at the very bottom of the context, leads to better grounding.
Example Prompt Structure (a minimal assembly sketch follows the list):
- System Instructions
- Document 3 (Moderately Relevant)
- Document 2 (Highly Relevant)
- Document 1 (Most Relevant)
- User Question
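A minimal assembly sketch in Python, assuming the retriever returns (text, score) pairs where a higher score means more relevant; the function and variable names here are illustrative, not a fixed API:

```python
def build_prompt(system_instructions: str,
                 retrieved: list[tuple[str, float]],
                 question: str) -> str:
    # Sort ascending by score so the most relevant chunk lands last,
    # directly above the user question ("best at bottom").
    ordered = sorted(retrieved, key=lambda pair: pair[1])
    context = "\n\n".join(text for text, _score in ordered)
    return f"{system_instructions}\n\n{context}\n\nQuestion: {question}"
```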
XML-Based Separation (Claude)
Claude responds well to XML-structured prompts. Using XML tags to delimit your documents helps it keep them distinct and lets you reference them by id.
<documents>
<doc id="2"> Highly relevant content... </doc>
<doc id="1"> Most relevant content... </doc>
</documents>
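Here the most relevant document sits last, directly above the question, consistent with the best-at-bottom strategy. A sketch of producing this block programmatically, again assuming ranked (text, score) pairs; the tag names simply mirror the example above:

```python
def to_xml_block(retrieved: list[tuple[str, float]]) -> str:
    # Ascending by score: least relevant first, most relevant last.
    ordered = sorted(retrieved, key=lambda pair: pair[1])
    # id="1" marks the most relevant document, which is emitted last.
    docs = [f'<doc id="{len(ordered) - i}"> {text} </doc>'
            for i, (text, _score) in enumerate(ordered)]
    return "<documents>\n" + "\n".join(docs) + "\n</documents>"
```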
Handling Contradictions
What if Document A says "Revenue is $10M" and Document B says "Revenue is $12M"?
- If Document B is more recent, place it at the bottom (the highest-priority slot).
- The LLM will generally prefer the information it saw most recently (recency bias); a sorting sketch follows below.
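One way to enforce this preference is to sort on a timestamp before assembling the context, so the newest chunk ends up closest to the question. A sketch, assuming each chunk carries a publish_date field (an assumption about your metadata schema; the revenue figures are the ones from the example above):

```python
from datetime import datetime

# Illustrative chunks; in practice these come from your retriever's metadata.
chunks = [
    {"text": "Revenue is $12M.", "publish_date": datetime(2024, 3, 1)},
    {"text": "Revenue is $10M.", "publish_date": datetime(2022, 3, 1)},
]

# Oldest first, newest last: the most recent figure sits at the bottom,
# where recency bias works in its favor.
chunks.sort(key=lambda c: c["publish_date"])
context = "\n\n".join(c["text"] for c in chunks)
```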
Sorting by Metadata vs. Semantic Rank
- Chronological Order: Good for news or financial feeds.
- Semantic Order: Good for technical manuals or Q&A.
- Source Authority: Place authoritative sources (e.g., official manuals) lower in the prompt than community posts, so they sit closer to the question and carry more weight (see the sketch after this list).
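These criteria can be combined into a single composite sort key. The sketch below orders chunks first by a source-authority tier and then by semantic score, so authoritative, highly relevant chunks land last (closest to the question); the tier values and field names are assumptions, not a standard schema:

```python
# Higher tier = more authoritative; adjust to your own source taxonomy.
AUTHORITY = {"community_post": 1, "official_manual": 2}

def sort_for_prompt(chunks: list[dict]) -> list[dict]:
    # Ascending sort on (authority, score): low-authority, low-relevance
    # chunks come first; high-authority, high-relevance chunks come last.
    return sorted(chunks, key=lambda c: (AUTHORITY.get(c["source_type"], 0),
                                         c["score"]))
```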
Exercises
- Experiment: Place the "Answer" in the middle of 10 irrelevant documents. Does the model find it?
- Move the "Answer" to the very end of the list. Does accuracy improve?
- How would you programmatically sort chunks by publish_date before building the prompt string?