https://docs.ragas.io/en/stable/
Metrics - https://docs.ragas.io/en/stable/concepts/metrics/available_metrics/
Context Precision measures the proportion of relevant chunks in the retrieved contexts. It is calculated as the mean of the precision@k
for each chunk in the context.
<aside> 💡
Precision@k is the ratio of the number of relevant chunks at rank k to the total number of chunks at rank k.
</aside>
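The definition above can be made concrete with a short calculation. The following is a minimal sketch, assuming relevance is given as a list of 0/1 flags in rank order and that the mean is taken over the ranks that hold a relevant chunk (an average-precision style reading of "mean of the precision@k"). The function names are hypothetical and are not part of the Ragas API.

```python
from typing import List


def precision_at_k(relevance: List[int], k: int) -> float:
    """Fraction of the top-k chunks that are relevant (k is 1-indexed)."""
    return sum(relevance[:k]) / k


def context_precision(relevance: List[int]) -> float:
    """Mean of precision@k over the ranks that hold a relevant chunk.

    Assumption: ranks holding an irrelevant chunk contribute nothing,
    and the sum is divided by the number of relevant chunks.
    """
    total_relevant = sum(relevance)
    if total_relevant == 0:
        return 0.0
    weighted = sum(
        precision_at_k(relevance, k) * relevance[k - 1]
        for k in range(1, len(relevance) + 1)
    )
    return weighted / total_relevant


# Relevant chunks at ranks 1 and 3, irrelevant chunk at rank 2.
score = context_precision([1, 0, 1])
print(score)  # (1/1 + 2/3) / 2 ≈ 0.833
```

Under this reading, moving a relevant chunk higher in the ranking raises the score, which is why the metric rewards retrievers that put relevant chunks first.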
Context Precision can be calculated with or without a reference:
Without a reference: use this variant when you have retrieved contexts and a generated response for a user input, but no reference answer.
from ragas import SingleTurnSample
from ragas.metrics import LLMContextPrecisionWithoutReference

# evaluator_llm is assumed to be a Ragas-compatible evaluator LLM configured beforehand.
context_precision = LLMContextPrecisionWithoutReference(llm=evaluator_llm)

sample = SingleTurnSample(
    user_input="Where is the Eiffel Tower located?",
    response="The Eiffel Tower is located in Paris.",
    retrieved_contexts=["The Eiffel Tower is located in Paris."],
)

# Top-level await works in a notebook or another async context.
await context_precision.single_turn_ascore(sample)
# 0.9999999999
An LLM compares the response against each chunk in retrieved_contexts to judge whether that chunk is relevant.
With a reference: use this variant when you have both retrieved contexts and a reference answer associated with a user input.
from ragas import SingleTurnSample
from ragas.metrics import LLMContextPrecisionWithReference

# evaluator_llm is assumed to be a Ragas-compatible evaluator LLM configured beforehand.
context_precision = LLMContextPrecisionWithReference(llm=evaluator_llm)

sample = SingleTurnSample(
    user_input="Where is the Eiffel Tower located?",
    reference="The Eiffel Tower is located in Paris.",
    retrieved_contexts=["The Eiffel Tower is located in Paris."],
)

# Top-level await works in a notebook or another async context.
await context_precision.single_turn_ascore(sample)
# 0.9999999999
An LLM compares each chunk in retrieved_contexts with the reference to judge whether that chunk is relevant.
Non-LLM variant: use this when both retrieved contexts and reference contexts are available for a user input. The relevance of each retrieved context is determined with a non-LLM similarity measure instead of an LLM judgment.
from ragas import SingleTurnSample
from ragas.metrics import NonLLMContextPrecisionWithReference

# No evaluator LLM is needed for the non-LLM variant.
context_precision = NonLLMContextPrecisionWithReference()

sample = SingleTurnSample(
    retrieved_contexts=["The Eiffel Tower is located in Paris."],
    reference_contexts=[
        "Paris is the capital of France.",
        "The Eiffel Tower is one of the most famous landmarks in Paris.",
    ],
)

await context_precision.single_turn_ascore(sample)
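To illustrate how relevance can be decided without an LLM, here is a minimal sketch that marks a retrieved chunk as relevant when it is similar enough to any reference context. It uses `difflib.SequenceMatcher` from the Python standard library and a threshold of 0.5; both choices, and the function names, are assumptions made for illustration and may differ from what `NonLLMContextPrecisionWithReference` does internally.

```python
from difflib import SequenceMatcher
from typing import List


def is_relevant(
    retrieved: str, reference_contexts: List[str], threshold: float = 0.5
) -> bool:
    """Return True if the chunk is similar enough to any reference context.

    Assumption: similarity is difflib's ratio in [0, 1]; the threshold is
    a hypothetical default, not a value taken from Ragas.
    """
    return any(
        SequenceMatcher(None, retrieved, ref).ratio() >= threshold
        for ref in reference_contexts
    )


def relevance_flags(
    retrieved_contexts: List[str], reference_contexts: List[str]
) -> List[int]:
    """Produce one 0/1 relevance flag per retrieved chunk, in rank order."""
    return [int(is_relevant(chunk, reference_contexts)) for chunk in retrieved_contexts]


flags = relevance_flags(
    ["The Eiffel Tower is located in Paris."],
    [
        "Paris is the capital of France.",
        "The Eiffel Tower is one of the most famous landmarks in Paris.",
    ],
)
print(flags)
```

The resulting flags can then be scored by averaging precision@k over the ranked list, as in the definition of Context Precision.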