Mastering LLM Evaluation: Build Reliable, Scalable AI Systems