LLM-as-a-Judge — Structural Reference
Independent, jurisdiction-neutral, non-advisory reference.
Orientation
Large language models can generate outputs.
They can also act as evaluators of outputs in model-based assessment.
A system produces an output. An LLM evaluates that output.
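The produce-then-evaluate relationship can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `Verdict`, `stub_judge`, `echo_system`, and the 1 to 5 score scale are hypothetical names and choices introduced here, and the stub stands in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    score: int       # hypothetical 1-5 scale, assumed for illustration
    rationale: str


# A judge is any callable that maps (criteria, output) to a Verdict.
Judge = Callable[[str, str], Verdict]


def stub_judge(criteria: str, output: str) -> Verdict:
    """Stand-in for an LLM call; scores by word count only."""
    score = 5 if len(output.split()) >= 5 else 2
    return Verdict(score, f"Judged against: {criteria}")


def echo_system(prompt: str) -> str:
    """Stand-in for the system under evaluation."""
    return f"Answer to: {prompt}"


def evaluate(system: Callable[[str], str], judge: Judge,
             prompt: str, criteria: str) -> Verdict:
    output = system(prompt)          # a system produces an output
    return judge(criteria, output)   # a judge evaluates that output


verdict = evaluate(echo_system, stub_judge,
                   "What is a judge model?", "Is the answer complete?")
print(verdict)
```

In practice `stub_judge` would be replaced by a function that sends the criteria and the output to a model and parses its reply; the separation between system and judge stays the same.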
Problem Space
Evaluation Consistency
Assessment outcomes may vary across prompts, models, or evaluation conditions.
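One way to make this variation visible is to score the same output under several conditions and summarise the spread. The sketch below assumes numeric judge scores have already been collected; `consistency_report`, the condition names, and the score values are made up for illustration.

```python
from statistics import mean, pstdev


def consistency_report(scores_by_condition: dict[str, list[int]]) -> dict[str, float]:
    """Summarise how much judge scores move across evaluation conditions."""
    all_scores = [s for scores in scores_by_condition.values() for s in scores]
    condition_means = {name: mean(scores)
                       for name, scores in scores_by_condition.items()}
    return {
        "overall_mean": mean(all_scores),
        "overall_stdev": pstdev(all_scores),
        # Largest difference between the mean scores of any two conditions.
        "max_gap_between_conditions": (max(condition_means.values())
                                       - min(condition_means.values())),
    }


# Illustrative scores only; each list holds repeated judgments of one output.
scores = {
    "prompt_a": [4, 4, 5],
    "prompt_b": [3, 4, 3],
    "model_x_prompt_a": [5, 5, 4],
}
report = consistency_report(scores)
print(report)
```

A large gap between conditions suggests the outcome depends on the evaluation setup rather than on the output alone.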
Judge Reliability
The evaluation mechanism itself may exhibit uncertainty, variability, or bias.
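One commonly discussed form of judge bias in pairwise comparison is sensitivity to the order in which candidates are shown. A simple probe, sketched here under the assumption that the judge returns the string "first" or "second", is to ask twice with the order swapped and check that the same answer wins. Both judges below are hypothetical stand-ins, not real models.

```python
from typing import Callable

# Assumed interface: the judge sees two candidates and returns "first" or "second".
PairwiseJudge = Callable[[str, str], str]


def position_consistent(judge: PairwiseJudge, a: str, b: str) -> bool:
    """Return True if the same candidate wins regardless of presentation order."""
    forward = judge(a, b)    # a shown first
    backward = judge(b, a)   # b shown first
    winner_forward = a if forward == "first" else b
    winner_backward = b if backward == "first" else a
    return winner_forward == winner_backward


def first_biased_judge(x: str, y: str) -> str:
    """Hypothetical judge that always prefers whatever is shown first."""
    return "first"


def longer_wins_judge(x: str, y: str) -> str:
    """Hypothetical judge that prefers the longer candidate."""
    return "first" if len(x) >= len(y) else "second"


short, long_ = "short", "a much longer answer"
print(position_consistent(first_biased_judge, short, long_))
print(position_consistent(longer_wins_judge, short, long_))
```

The second judge passes this probe while still being biased toward length, so order consistency is one check among several rather than evidence of overall reliability.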
Assessment Transparency
Evaluation outcomes may be difficult to interpret without explicit evaluation criteria.
System Boundary
The evaluation boundary separates model-based assessment from contexts outside the defined evaluation scope.
Within Boundary
Outputs are assessed through model-based evaluation relative to defined criteria.
At Boundary
The evaluation itself is examined: its processes, scoring mechanisms, or comparative assessments.
Outside Boundary
Assessment cannot be performed because criteria are missing, the evaluation scope is undefined, or evaluative context is absent.
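The three outside-boundary conditions can be expressed as a pre-check that runs before any judge is invoked. This is a sketch under assumptions: `EvaluationRequest` and its fields are hypothetical, and treating an empty value as "missing" is a simplification.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EvaluationRequest:
    """Hypothetical container for what a judge would need."""
    output: str
    criteria: list[str] = field(default_factory=list)
    scope: Optional[str] = None
    context: Optional[str] = None


def outside_boundary_reasons(req: EvaluationRequest) -> list[str]:
    """Return the reasons assessment cannot be performed; empty means in scope."""
    reasons = []
    if not req.criteria:
        reasons.append("missing criteria")
    if not req.scope:
        reasons.append("undefined evaluation scope")
    if not req.context:
        reasons.append("absence of evaluative context")
    return reasons


bare = EvaluationRequest(output="Some model output")
complete = EvaluationRequest(
    output="Some model output",
    criteria=["factual accuracy"],
    scope="single-turn answers",
    context="customer support FAQ",
)
print(outside_boundary_reasons(bare))
print(outside_boundary_reasons(complete))
```

Returning the list of reasons, rather than a single boolean, keeps the refusal to assess as interpretable as an assessment would be.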
Structure
Context and positioning are described in About.
Formal definition, scope boundaries, and structural models are provided in Method.