add validate metrics #30

Wenshansilvia · 2024-02-05T12:28:20Z

@FBzzh @yuanpcr please list all potential metrics for the validate task in this issue. For more details about the validate task, refer to issue #13.

bugtig6351 · 2024-04-26T13:18:15Z

Here are some metrics related to answer groundedness.

Knowledge F1. A lexical overlap metric used for knowledge-grounded dialogue, which computes the token-level F1 score between the gold passages and the model response.
Knowledge F1++. A variant of K-F1 that discounts tokens in the model response that come from the user question or the conversation history.
Faithfulness (RAGAS). Uses an LLM to extract the statements in the model response, and then determines whether each statement can be inferred from the given contexts.
FActScore. An LLM-based method that breaks the generated text down into a series of atomic facts, and then evaluates whether these facts are supported by the knowledge source.
QUIP-Score. An n-gram overlap measure that quantifies the degree to which a generated passage consists of exact spans found in a text corpus.
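
For the two lexical metrics above, here is a minimal sketch of how they could be computed. It is not taken from any reference implementation: the `tokenize` helper (lowercasing and splitting on non-word characters) and the function names are assumptions made for illustration, and K-F1++ is read here as "drop response tokens that also occur in the question or history before computing F1", which may differ in detail from the original definition.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and keep runs of word characters as tokens.
    return re.findall(r"\w+", text.lower())


def token_f1(pred_tokens, ref_tokens):
    # Multiset overlap between predicted and reference tokens.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def knowledge_f1(response, gold_passage):
    # K-F1: token-level F1 between the response and the gold passage.
    return token_f1(tokenize(response), tokenize(gold_passage))


def knowledge_f1_pp(response, gold_passage, question_and_history):
    # K-F1++ (one plausible reading): ignore response tokens that already
    # appear in the question or conversation history, then compute F1.
    seen = set(tokenize(question_and_history))
    kept = [t for t in tokenize(response) if t not in seen]
    return token_f1(kept, tokenize(gold_passage))


if __name__ == "__main__":
    gold = "Paris is the capital and largest city of France"
    question = "What is the capital of France?"
    response = "Paris is the capital of France"
    print(knowledge_f1(response, gold))
    print(knowledge_f1_pp(response, gold, question))
```

In this toy example the response mostly echoes the question, so K-F1++ comes out much lower than K-F1; that gap is the effect the variant is meant to expose.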

Wenshansilvia mentioned this issue Feb 5, 2024

Initialize Metrics #23

Open

7 tasks

faneshion mentioned this issue Feb 6, 2024

add simple metric for answergroundedness #32

Merged

faneshion added this to the Version 0.1 milestone Feb 6, 2024

faneshion added the enhancement New feature or request label Feb 23, 2024

faneshion changed the title ~~add validator metrics~~ add validate metrics Feb 23, 2024

faneshion assigned FBzzh and yuanpcr Feb 23, 2024
