BoolQ for Training and Eval #30

farzadab · 2024-06-14T23:46:24Z

This PR adds:

- A training version of BoolQ (extended with GPT-based "explanations")
- True/False (exact match) evaluations for BoolQ
- Text-only evaluations for BoolQ and AnyInstruct, to show the text-audio gap
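
For context, a True/False exact-match evaluation can be sketched roughly as below. This is an illustrative sketch only, not the code added in `ultravox/evaluation`: `Sample`, `normalize_bool`, and `exact_match_score` are hypothetical names, and scoring only the leading True/False token (so that a trailing explanation is ignored) is an assumption made for the example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    # Hypothetical container; the real eval sample type may differ.
    generated_answer: str
    expected_answer: str


def normalize_bool(text: str) -> Optional[str]:
    """Return 'true' or 'false' if the text starts with that word, else None.

    Assumption: only the leading token counts, so an answer such as
    "True. The passage says ..." is read as "true".
    """
    match = re.match(r"\W*(true|false)\b", text.strip().lower())
    return match.group(1) if match else None


def exact_match_score(samples: List[Sample]) -> float:
    """Fraction of samples whose normalized answer equals the expected one."""
    if not samples:
        return 0.0
    correct = 0
    for sample in samples:
        predicted = normalize_bool(sample.generated_answer)
        expected = normalize_bool(sample.expected_answer)
        # An answer that is neither True nor False is always counted as wrong.
        if predicted is not None and predicted == expected:
            correct += 1
    return correct / len(samples)


if __name__ == "__main__":
    demo = [
        Sample("True. The passage states this directly.", "True"),
        Sample("false", "True"),
        Sample("I think so", "False"),
    ]
    print(exact_match_score(demo))
```

Running the same scorer on text-only and audio inputs for the same questions would give two accuracies whose difference is one way to express the text-audio gap mentioned above.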

ultravox/data/datasets.py

ultravox/evaluation/eval_types.py

ultravox/training/evaluation.py

ultravox/training/configs/stage2_lora.yaml

* set default to include_context=True
* boolq extended dataset for training
* improved evals + boolq T/F eval + text-only

farzadab added 4 commits June 14, 2024 16:40

set default to include_context=True

52bbbe0

boolq extended dataset for training

d5cf38e

improved evals + boolq T/F eval + text-only

d561dea

fix name: boolq_passage -> boolq_extended

334798a

farzadab commented Jun 14, 2024

ultravox/data/datasets.py (resolved)

ultravox/data/datasets.py (resolved)

ultravox/data/datasets.py (resolved)

ultravox/evaluation/eval_types.py (resolved)

ultravox/training/evaluation.py (resolved)

farzadab marked this pull request as ready for review June 14, 2024 23:53

farzadab requested a review from juberti June 14, 2024 23:53

farzadab commented Jun 17, 2024

ultravox/training/configs/stage2_lora.yaml (resolved)

farzadab marked this pull request as draft June 17, 2024 16:10

farzadab added 2 commits June 19, 2024 15:47

set max_tokens for training with BoolQ-extended

5b2e373

Revert "set max_tokens for training with BoolQ-extended"

dcd3b10

This reverts commit 5b2e373.

farzadab marked this pull request as ready for review June 19, 2024 23:44

juberti approved these changes Jun 20, 2024

alphabetize and fix _ vs __

3e79d62

farzadab merged commit 4202b56 into main Jun 21, 2024
1 check passed

farzadab deleted the farzad-boolq-evals branch June 21, 2024 16:00

akshat0311 pushed a commit to jiviai/audio-llm that referenced this pull request Jan 30, 2025

BoolQ for Training and Eval (fixie-ai#30)

e8e3e77

* set default to include_context=True
* boolq extended dataset for training
* improved evals + boolq T/F eval + text-only
