1.31.4 (2025-01-29)
Fix
-
fix: Allow aggregated tasks within benchmarks (#1771)
-
fix: Allow aggregated tasks within benchmarks
Fixes #1231
- feat: Update task filtering, fixing a bug in MTEB
- Updated task filtering, adding exclusive_language_filter and hf_subsets
- Fixed a bug in MTEB where cross-lingual splits were included
- Added missing language filtering to MTEB(europe, beta) and MTEB(indic, beta)
The following code outlines the problems:
import mteb
from mteb.benchmarks import MTEB_ENG_CLASSIC
task = [t for t in MTEB_ENG_CLASSIC.tasks if t.metadata.name == "STS22"][0]
# was previously equivalent to:
task = mteb.get_task("STS22", languages=["eng"])
task.hf_subsets
# which kept every subset that contains English, including cross-lingual ones:
# ['en', 'de-en', 'es-en', 'pl-en', 'zh-en']
# However, it should be:
# ['en']
# With the changes it is:
task = [t for t in MTEB_ENG_CLASSIC.tasks if t.metadata.name == "STS22"][0]
task.hf_subsets
# ['en']
# equivalent to:
task = mteb.get_task("STS22", hf_subsets=["en"])
# which you can also obtain using exclusive_language_filter (though not if there were multiple English splits):
task = mteb.get_task("STS22", languages=["eng"], exclusive_language_filter=True)
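The difference between the two filtering modes can be sketched without mteb. The snippet below is a minimal illustration, not mteb's implementation: `filter_subsets` and both subset mappings are hypothetical, and the language codes are only assumed to follow a "lang-Script" pattern. It shows why a non-exclusive filter keeps cross-lingual subsets, and why the exclusive filter stops being equivalent to `hf_subsets=["en"]` once a task has more than one English-only subset.

```python
def filter_subsets(eval_langs, languages, exclusive=False):
    """Return the subset names whose languages match `languages`.

    eval_langs: dict mapping subset name -> list of codes such as "eng-Latn".
    exclusive=False keeps a subset if ANY of its languages is requested.
    exclusive=True keeps a subset only if ALL of its languages are requested.
    (Hypothetical helper for illustration; not part of mteb.)
    """
    wanted = set(languages)
    kept = []
    for subset, langs in eval_langs.items():
        codes = {lang.split("-")[0] for lang in langs}  # "eng-Latn" -> "eng"
        match = codes <= wanted if exclusive else bool(codes & wanted)
        if match:
            kept.append(subset)
    return kept


# Illustrative mapping modelled on the STS22 subsets listed above (assumed codes).
sts22_like = {
    "en": ["eng-Latn"],
    "de-en": ["deu-Latn", "eng-Latn"],
    "es-en": ["spa-Latn", "eng-Latn"],
    "pl-en": ["pol-Latn", "eng-Latn"],
    "zh-en": ["cmn-Hans", "eng-Latn"],
    "de": ["deu-Latn"],
}

# Illustrative task with two English-only subsets (assumed, for the caveat above).
multi_english = {
    "en": ["eng-Latn"],
    "en-ext": ["eng-Latn"],
    "de": ["deu-Latn"],
}

non_exclusive = filter_subsets(sts22_like, ["eng"])
exclusive = filter_subsets(sts22_like, ["eng"], exclusive=True)
exclusive_multi = filter_subsets(multi_english, ["eng"], exclusive=True)

print(non_exclusive)    # cross-lingual subsets are kept
print(exclusive)        # only the English-only subset remains
print(exclusive_multi)  # both English-only subsets remain
```

In the last case the exclusive filter returns both English-only subsets, so selecting a single one still requires naming it explicitly via hf_subsets.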
-
format
-
remove "en-ext" from AmazonCounterfactualClassification
-
fixed mteb(deu)
-
fix: simplify in a few areas
-
wip
-
tmp
-
sav
-
Allow aggregated tasks within benchmarks
Fixes #1231
-
ensure correct formatting of eval_langs
-
ignore aggregate dataset
-
clean up dummy cases
-
add to mteb(eng, classic)
-
format
-
clean up
-
Allow aggregated tasks within benchmarks
Fixes #1231
-
added fixed from comments
-
fix merge
-
format
-
Updated task type
-
Added minor fix for dummy tasks (8fb59a4)
Unknown
-
Update tasks table (3ee0785)
-
Update tasks table (02f8ad5)
-
Update tasks table (c77c82c)
-
Update tasks table (e8b8ac0)
-
Update tasks table (50f305f)
-
Update tasks table (2689cb8)
-
Update tasks table (24d5373)
-
Update tasks table (e487eff)
-
Update tasks table (8bc101f)
-
Update tasks table (cebf5b6)
-
Update tasks table (1ead72f)
-
Update tasks table (d939627)