azamzam.ai

Blog

Essays on the engineering decisions that determine whether a RAG system is fit for regulated work. Long-form, opinionated, with worked examples from real builds.

How I evaluate legal RAG systems

1st May 2026

Most RAG projects skip evaluation. This is how I do it on legal builds: the three axes that matter, how to construct a test set without going crazy, and how to use LLM-as-judge without fooling yourself. Includes a link to the open-source harness.