Evaluation specifics #251

piotrmigdalek · 2024-05-10T12:12:55Z

Hi!

I'm trying to evaluate a Mistral-7B-based model with custom locality and portability data.
For each of the 50 edits I have 6 locality prompts and 2 portability prompts.

How should I arrange the dicts to feed them into the edit function in that case? Will the variable below, passed as portability_inputs, work as intended?

portability_inputs = {
    'english': {
        'prompt': df_port['question_en'].tolist(),
        'ground_truth': df_port['label_en'].tolist()
    },
    'polish': {
        'prompt': df_port['question_pl'].tolist(),
        'ground_truth': df_port['label_pl'].tolist()
    }
}

And a technical question: are the metrics calculated after each edit? If so, is there an option to evaluate everything on the final model after all 50 sequential edits?

Thank you :)


pengzju · 2024-05-10T16:34:45Z

Q1:

Your usage is correct; just make sure that prompt and ground_truth contain the same number of items under each dimension, such as "english" and "polish".
You can also check if the number of metrics recorded in the logs matches the number of input prompts.
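The consistency check described above can be sketched as a small helper to run before calling the edit function. This is only an illustration: check_aligned is a hypothetical name, not part of the library, and the optional comparison against the number of edits is an assumption about how the inputs line up with the edit requests.

```python
from typing import Dict, List, Optional


def check_aligned(inputs: Dict[str, Dict[str, List[str]]],
                  n_edits: Optional[int] = None) -> None:
    """Raise ValueError if 'prompt' and 'ground_truth' differ in length.

    Hypothetical helper, not a library function. `inputs` has the same
    shape as portability_inputs in the question: one entry per dimension.
    """
    for name, fields in inputs.items():
        n_prompts = len(fields["prompt"])
        n_truths = len(fields["ground_truth"])
        if n_prompts != n_truths:
            raise ValueError(
                f"{name}: {n_prompts} prompts vs {n_truths} ground truths"
            )
        # Assumption: each dimension carries one item per edit request.
        if n_edits is not None and n_prompts != n_edits:
            raise ValueError(f"{name}: {n_prompts} prompts vs {n_edits} edits")
```

For the setup in the question, this would be called as check_aligned(portability_inputs, n_edits=50) before editing.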

Q2:

Unified evaluation after all edits have been applied is not implemented yet, but you can refer to the pseudocode in Continual Editing #220. I will add this feature in the next version. Thank you!
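Until that feature exists, the general idea of "edit everything first, evaluate once at the end" can be sketched as below. This is not the pseudocode from #220 and does not use the library's API: edit_then_evaluate, apply_edit, and evaluate are hypothetical names, and the two callables are supplied by the caller.

```python
from typing import Any, Callable, List, Sequence, Tuple


def edit_then_evaluate(
    model: Any,
    requests: Sequence[Any],
    apply_edit: Callable[[Any, Any], Any],
    evaluate: Callable[[Any, Any], Any],
) -> Tuple[Any, List[Any]]:
    """Apply all edits sequentially, then score every request once.

    Hypothetical sketch: apply_edit(model, request) returns the edited
    model, and evaluate(model, request) returns that request's metrics.
    """
    # Phase 1: sequential editing, with no metrics computed in between.
    for request in requests:
        model = apply_edit(model, request)

    # Phase 2: evaluate every request against the final edited model.
    results = [evaluate(model, request) for request in requests]
    return model, results
```

Separating the two phases means earlier edits are scored after later ones have been applied, so the metrics reflect any forgetting caused by the full sequence.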

zxlzr · 2024-05-12T06:25:40Z

Hi, do you have any further questions?

piotrmigdalek · 2024-05-29T16:26:54Z

Nothing as of now, thanks :)

zxlzr added the question Further information is requested label May 10, 2024

zxlzr closed this as completed May 14, 2024

zxlzr reopened this May 14, 2024

zxlzr closed this as completed Jun 2, 2024
