LLM-as-a-Judge — Structural Reference

Independent, jurisdiction-neutral, non-advisory reference.

Orientation

Large language models can generate outputs.

They can also act as evaluators, a role commonly called LLM-as-a-judge.

A system produces an output. An LLM evaluates that output.
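The produce-then-evaluate relationship can be sketched in a few lines. This is a minimal illustration, not a prescribed design: `Verdict`, `Judge`, `keyword_judge`, and `judge_output` are hypothetical names, the 0.0 to 1.0 score scale is an assumption, and the stand-in judge is a plain keyword check so the example runs without any model. In practice the `judge` callable would wrap a call to an LLM.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    score: float      # assumed scale: 0.0 (fails criteria) to 1.0 (meets criteria)
    rationale: str    # short explanation of how the score was reached


# A judge is any callable taking (criteria, output) and returning a Verdict.
Judge = Callable[[str, str], Verdict]


def keyword_judge(criteria: str, output: str) -> Verdict:
    """Hypothetical stand-in for an LLM judge: checks that each criterion term appears."""
    terms = [t.strip().lower() for t in criteria.split(",") if t.strip()]
    hits = [t for t in terms if t in output.lower()]
    score = len(hits) / len(terms) if terms else 0.0
    return Verdict(score=score, rationale=f"matched {len(hits)} of {len(terms)} terms")


def judge_output(output: str, criteria: str, judge: Judge) -> Verdict:
    """Evaluate one system output against explicit criteria."""
    return judge(criteria, output)


system_output = "Paris is the capital of France."
verdict = judge_output(system_output, "paris, capital", keyword_judge)
print(verdict.score, verdict.rationale)
```

Keeping the judge behind a plain callable means the same calling code works whether the evaluator is a stub, a single model, or several models compared side by side.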

Problem Space

Evaluation Consistency

Assessment outcomes may vary across prompts, models, or evaluation conditions.
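One simple way to make this variation visible is to judge the same output under several conditions and measure how often the verdicts agree. The sketch below assumes categorical verdicts; the function name `agreement_rate` and the sample labels are illustrative, not taken from any specific framework.

```python
from collections import Counter
from typing import Sequence


def agreement_rate(labels: Sequence[str]) -> float:
    """Fraction of judgments that match the most common label."""
    if not labels:
        raise ValueError("at least one judgment is required")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Hypothetical labels for the same output under five evaluation conditions
# (for example, different prompt wordings or different judge models).
labels = ["pass", "pass", "fail", "pass", "pass"]
print(agreement_rate(labels))  # 0.8
```

A rate of 1.0 means every condition produced the same verdict. Lower values indicate that the outcome depends on how the evaluation was run and not only on the output being assessed.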

Judge Reliability

The evaluation mechanism itself may exhibit uncertainty, variability, or bias.
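One form of judge bias that can be probed directly is sensitivity to presentation order in pairwise comparisons. The sketch below is an order-swap check under stated assumptions: the judge is a callable that returns "first" or "second", and both example judges are hypothetical stand-ins, not real models.

```python
from typing import Callable

# A pairwise judge returns "first" or "second" for the preferred candidate.
PairwiseJudge = Callable[[str, str], str]


def is_order_consistent(a: str, b: str, judge: PairwiseJudge) -> bool:
    """True if the judge prefers the same candidate in both presentation orders."""
    forward = judge(a, b)   # a is shown first
    reverse = judge(b, a)   # b is shown first
    winner_forward = a if forward == "first" else b
    winner_reverse = b if reverse == "first" else a
    return winner_forward == winner_reverse


def always_first(x: str, y: str) -> str:
    """Hypothetical biased judge: always prefers whichever candidate is shown first."""
    return "first"


def longer_wins(x: str, y: str) -> str:
    """Hypothetical judge that prefers the longer candidate (ties go to the first shown)."""
    return "first" if len(x) >= len(y) else "second"


print(is_order_consistent("short", "a longer answer", always_first))  # False
print(is_order_consistent("short", "a longer answer", longer_wins))   # True
```

Running the check over many pairs and reporting the share of consistent verdicts gives a rough reliability signal for the judge itself, separate from the quality of the outputs it scores.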

Assessment Transparency

Evaluation outcomes may be difficult to interpret without explicit evaluation criteria.

System Boundary

The evaluation boundary separates model-based assessment from contexts that fall outside the defined evaluation scope.

Within Boundary

Outputs are assessed through model-based evaluation relative to defined criteria.

At Boundary

The evaluation processes, scoring mechanisms, or comparative assessments themselves are examined.

Outside Boundary

Assessment cannot be performed because criteria are missing, the evaluation scope is undefined, or evaluative context is absent.
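The three out-of-boundary conditions can be expressed as a pre-check that runs before any judge is called. This is a sketch under assumptions: `EvaluationRequest` and its fields are hypothetical, chosen only to mirror the conditions named above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EvaluationRequest:
    output: str
    criteria: Optional[str] = None   # explicit evaluation criteria
    scope: Optional[str] = None      # what the evaluation is meant to cover
    context: Optional[str] = None    # task or reference material needed to evaluate


def out_of_bounds_reasons(request: EvaluationRequest) -> List[str]:
    """Return the reasons a request falls outside the boundary; empty if none."""
    reasons = []
    if not request.criteria:
        reasons.append("missing criteria")
    if not request.scope:
        reasons.append("undefined evaluation scope")
    if not request.context:
        reasons.append("absence of evaluative context")
    return reasons


request = EvaluationRequest(output="Some answer", criteria="factual accuracy")
print(out_of_bounds_reasons(request))
```

Returning the reasons instead of a bare boolean lets the caller report why an assessment was not attempted, which keeps the refusal itself interpretable.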

Structure

Context and positioning are described in About.

Formal definition, scope boundaries, and structural models are provided in Method.