https://docs.ragas.io/en/stable/

Metrics - https://docs.ragas.io/en/stable/concepts/metrics/available_metrics/

Metrics

1. Context Precision

Measures the proportion of relevant chunks in the retrieved contexts. It is calculated as the mean of precision@k over the chunks in the context.

<aside> 💡

Precision@k is the ratio of the number of relevant chunks among the top k retrieved chunks to the total number of chunks in the top k.

</aside>
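The definition above can be sketched in plain Python. This is a minimal illustration rather than the library's implementation: the function names are made up for this note, `verdicts` is an assumed list of 0/1 relevance judgments in rank order, and the mean is assumed to run only over ranks that hold a relevant chunk. The 0.9999999999 printed in the examples further down is consistent with a small epsilon being added somewhere in the library's division, but that is an inference.

```python
from typing import Sequence


def precision_at_k(verdicts: Sequence[int], k: int) -> float:
    """Fraction of the top-k chunks judged relevant (k is 1-indexed)."""
    return sum(verdicts[:k]) / k


def context_precision(verdicts: Sequence[int]) -> float:
    """Mean of precision@k over the ranks k that hold a relevant chunk.

    Assumption: ranks holding an irrelevant chunk do not contribute.
    """
    relevant_ranks = [k for k, v in enumerate(verdicts, start=1) if v]
    if not relevant_ranks:
        return 0.0  # no relevant chunk was retrieved
    total = sum(precision_at_k(verdicts, k) for k in relevant_ranks)
    return total / len(relevant_ranks)


# Same three chunks, with the relevant ones ranked first vs. last.
good_ranking = context_precision([1, 1, 0])
bad_ranking = context_precision([0, 1, 1])
print(good_ranking, bad_ranking)
```

Ranking the relevant chunks first gives the higher score, which is the behaviour the metric is meant to reward.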

It can be calculated with or without a reference:

1.1 Context Precision without reference

This can be used when you have retrieved contexts and a generated response for a user input, but no reference.

from ragas import SingleTurnSample
from ragas.metrics import LLMContextPrecisionWithoutReference

# evaluator_llm is assumed to be an evaluator LLM already configured for Ragas
context_precision = LLMContextPrecisionWithoutReference(llm=evaluator_llm)

sample = SingleTurnSample(
    user_input="Where is the Eiffel Tower located?",
    response="The Eiffel Tower is located in Paris.",
    retrieved_contexts=["The Eiffel Tower is located in Paris."], 
)

await context_precision.single_turn_ascore(sample)
# 0.9999999999

An LLM judges the relevance of each chunk in retrieved_contexts by comparing it against the response.

1.2 Context Precision with reference

This can be used when you have both retrieved contexts and a reference answer associated with a user input.

from ragas import SingleTurnSample
from ragas.metrics import LLMContextPrecisionWithReference

# evaluator_llm is assumed to be an evaluator LLM already configured for Ragas
context_precision = LLMContextPrecisionWithReference(llm=evaluator_llm)

sample = SingleTurnSample(
    user_input="Where is the Eiffel Tower located?",
    reference="The Eiffel Tower is located in Paris.",
    retrieved_contexts=["The Eiffel Tower is located in Paris."], 
)

await context_precision.single_turn_ascore(sample)
# 0.9999999999

An LLM judges the relevance of each chunk in retrieved_contexts by comparing it against the reference.
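Sections 1.1 and 1.2 differ only in the text each chunk is judged against: the response in one case, the reference in the other. The sketch below shows that shared flow under loud assumptions: `toy_judge` is a hypothetical word-overlap stand-in for the LLM call, and `score_context_precision` is an illustrative name, not a Ragas function.

```python
from typing import Callable, Sequence

# Hypothetical judge signature: (user_input, chunk, answer) -> is the chunk relevant?
Judge = Callable[[str, str, str], bool]


def _long_words(text: str) -> set:
    """Lower-cased words longer than three characters, punctuation stripped."""
    cleaned = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in cleaned if len(w) > 3}


def toy_judge(user_input: str, chunk: str, answer: str) -> bool:
    """Stand-in for the LLM verdict: relevant if chunk and answer share a long word."""
    return bool(_long_words(chunk) & _long_words(answer))


def score_context_precision(
    user_input: str,
    retrieved_contexts: Sequence[str],
    answer: str,  # the response (1.1) or the reference (1.2)
    judge: Judge = toy_judge,
) -> float:
    hits, total = 0, 0.0
    for k, chunk in enumerate(retrieved_contexts, start=1):
        if judge(user_input, chunk, answer):
            hits += 1
            total += hits / k  # precision@k at a relevant rank
    return total / hits if hits else 0.0


question = "Where is the Eiffel Tower located?"
answer = "The Eiffel Tower is located in Paris."
chunks = ["Bananas are yellow.", "The Eiffel Tower is located in Paris."]
score = score_context_precision(question, chunks, answer)
print(score)
```

Passing the generated response as `answer` mirrors the without-reference variant, and passing the reference answer mirrors the with-reference variant.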

1.3 Context Precision with reference contexts

Use this when both retrieved contexts and reference contexts are available for a user input. The relevance of each retrieved context is determined with a non-LLM similarity measure instead of an LLM call.

from ragas import SingleTurnSample
from ragas.metrics import NonLLMContextPrecisionWithReference

context_precision = NonLLMContextPrecisionWithReference()

sample = SingleTurnSample(
    retrieved_contexts=["The Eiffel Tower is located in Paris."], 
    reference_contexts=["Paris is the capital of France.", "The Eiffel Tower is one of the most famous landmarks in Paris."]
)

await context_precision.single_turn_ascore(sample)
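To make the non-LLM variant concrete, here is a rough sketch of how string similarity could stand in for the LLM verdict. It is not the library's code: `difflib.SequenceMatcher` is swapped in as the similarity measure, the 0.5 threshold is an assumption, and the helper names are made up for this note.

```python
from difflib import SequenceMatcher
from typing import Sequence

THRESHOLD = 0.5  # assumed cut-off for "similar enough"; not taken from the notes above


def is_relevant(chunk: str, reference_contexts: Sequence[str]) -> bool:
    """A chunk counts as relevant if it is similar enough to any reference context."""
    best = max(
        (SequenceMatcher(None, chunk, ref).ratio() for ref in reference_contexts),
        default=0.0,
    )
    return best >= THRESHOLD


def non_llm_context_precision(
    retrieved_contexts: Sequence[str],
    reference_contexts: Sequence[str],
) -> float:
    hits, total = 0, 0.0
    for k, chunk in enumerate(retrieved_contexts, start=1):
        if is_relevant(chunk, reference_contexts):
            hits += 1
            total += hits / k  # precision@k at a relevant rank
    return total / hits if hits else 0.0


references = ["The Eiffel Tower is located in Paris."]
retrieved = ["xxxx", "The Eiffel Tower is located in Paris."]
print(non_llm_context_precision(retrieved, references))
```

Because no LLM is involved, the score depends entirely on the chosen similarity measure and threshold, so paraphrased but relevant chunks may be scored as irrelevant.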