In consumer products, a confident answer is enough. In the enterprise, a confident answer with no source is a liability. What separates a demo from a system people actually deploy is almost always traceability.
Retrieval is the product
Teams obsess over the model and under-invest in retrieval. But the quality of a RAG system is bounded by what it can find. I spend most of the effort on chunking strategy, metadata and ranking, not on prompt wording.
Citations are a first-class requirement
Every generated claim should point back to the exact passage it came from. That means carrying source identifiers through the entire pipeline, from ingestion to the rendered answer, and refusing to answer when retrieval confidence is low.
- Chunk with stable, addressable IDs.
- Return the supporting passages alongside the answer, not buried in logs.
- Make “I don’t know” a valid, well-handled output.
Trust compounds
Once users can verify answers, adoption stops being a training problem and becomes a habit. Verifiable systems get used, opaque ones get abandoned.