fix: remove * imports #1569
Conversation
* add more stat; update statistics
* Bugfixes with data parsing in main figure
* Fixed task result loading from disk
* fix: Removed column wrapping on the table, so that it remains readable; Added disclaimer to figure; fix: Added links to task info table, switched out license with metric
* small fix; fix: fix
* swap touche2020 for parity
* add sum per lang; add sort by sum option; make lint
* feat: add CUREv1 dataset (Co-authored-by: nadshe <nadia.sheikh@clinia.com>, olivierr42 <olivier.rousseau@clinia.com>, Daniel Buades Marcos <daniel@buad.es>); feat: add missing domains to medical tasks; feat: modify benchmark tasks; chore: benchmark naming (Co-authored-by: nadshe <nadia.sheikh@clinia.com>, olivierr42 <olivier.rousseau@clinia.com>)
* check if model attr of model exists; lint; Fix retrieval evaluator
* Filtering for models that have metadata: Made get_scores error tolerant; Added join_revisions, made get_scores failsafe; Fetching metadata fixed for HF models; Added failsafe metadata fetching to leaderboard code; Added revision joining to leaderboard app; fix; Only show models that have metadata when filter_models is called; Ran linting
* align readme with current mteb; align with mieb branch; fix test
* add lang family mapping and map to task table; make lint; add back some unclassified lang codes
* Correction of SICK-R metadata (Co-authored-by: rposwiata <rposwiata@opi.org.pl>)
* …05` and `text-multilingual-embedding-002` (#1562): fix: google_models batching and prompt; feat: add text-embedding-005 and text-multilingual-embedding-002; chore: `make lint` errors; fix: address PR comments
* fix: bm25s implementation; correct library name (Co-authored-by: Daniel Buades Marcos <daniel.buades@clinia.com>)
  Merge conflicts resolved in: docs/create_tasks_table.py, docs/tasks.md, mteb/abstasks/AbsTaskClassification.py, mteb/abstasks/AbsTaskClusteringFast.py, mteb/abstasks/AbsTaskInstructionRetrieval.py, mteb/abstasks/AbsTaskMultilabelClassification.py, mteb/abstasks/AbsTaskPairClassification.py, mteb/abstasks/AbsTaskReranking.py, mteb/abstasks/AbsTaskRetrieval.py, mteb/abstasks/AbsTaskSTS.py, mteb/descriptive_stats/InstructionRetrieval/Core17InstructionRetrieval.json, mteb/descriptive_stats/MultilabelClassification/MultiEURLEXMultilabelClassification.json, mteb/descriptive_stats/Reranking/AskUbuntuDupQuestions.json, mteb/descriptive_stats/Reranking/ESCIReranking.json, mteb/descriptive_stats/Reranking/WikipediaRerankingMultilingual.json, mteb/descriptive_stats/Retrieval/AppsRetrieval.json, mteb/descriptive_stats/Retrieval/BelebeleRetrieval.json, mteb/descriptive_stats/Retrieval/COIRCodeSearchNetRetrieval.json, mteb/descriptive_stats/Retrieval/CodeEditSearchRetrieval.json, mteb/descriptive_stats/Retrieval/CodeFeedbackMT.json, mteb/descriptive_stats/Retrieval/CodeFeedbackST.json, mteb/descriptive_stats/Retrieval/CodeSearchNetCCRetrieval.json, mteb/descriptive_stats/Retrieval/CodeSearchNetRetrieval.json, mteb/descriptive_stats/Retrieval/CodeTransOceanContest.json, mteb/descriptive_stats/Retrieval/CodeTransOceanDL.json, mteb/descriptive_stats/Retrieval/CosQA.json, mteb/descriptive_stats/Retrieval/JaqketRetrieval.json, mteb/descriptive_stats/Retrieval/NFCorpus.json, mteb/descriptive_stats/Retrieval/StackOverflowQA.json, mteb/descriptive_stats/Retrieval/SyntheticText2SQL.json, mteb/descriptive_stats/Retrieval/Touche2020.json, mteb/descriptive_stats/Retrieval/Touche2020Retrieval.v3.json, mteb/descriptive_stats/Retrieval/mFollowIRCrossLingualInstructionRetrieval.json, mteb/descriptive_stats/Retrieval/mFollowIRInstructionRetrieval.json, mteb/evaluation/MTEB.py, mteb/evaluation/evaluators/RetrievalEvaluator.py, mteb/leaderboard/table.py, mteb/model_meta.py, mteb/models/arctic_models.py, mteb/models/e5_models.py, mteb/models/nomic_models.py, mteb/models/sentence_transformers_models.py, mteb/tasks/PairClassification/multilingual/XStance.py, mteb/tasks/Reranking/zho/CMTEBReranking.py, mteb/tasks/STS/por/SickBrSTS.py, tests/test_benchmark/mock_tasks.py
* fix: Add training dataset to model meta (addresses #1556); Added docs; format
* … for visualization (#1564): feat: batch requests to cohere models; fix: use correct task_type; feat: use tqdm with openai; fix: explicitly set `show_progress_bar` to False
  Merge conflicts resolved in: mteb/model_meta.py
Looks great. I'm very happy about this!
It was a big frustration for me in #1567, so I am very happy to see it. Did you do it all manually?
Nice! I was actually wondering if this could be merged to main instead? I don't think there were any major incompatible changes? (as in, everything should run the same as before)
Before, you could e.g. do `from mteb import load_datasets`; I believe this will no longer be possible.
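For context, here is a minimal sketch of the pattern an explicit-import change like this moves to. The package, module, and function names below are hypothetical and are not mteb's actual layout:

```python
# mypkg/__init__.py  (hypothetical package, for illustration only)
#
# Before, a wildcard import re-exported every public name from the submodule,
# so `from mypkg import load_datasets` worked even though load_datasets was
# never mentioned in this file:
#
#     from .loaders import *
#
# After the change, only names listed explicitly here stay importable from the
# package root; everything else must be imported from its defining submodule.

from .loaders import load_tasks, load_results

__all__ = ["load_tasks", "load_results"]
```

Anything a consumer imported only via the implicit re-export (like the `load_datasets` example above) would then need to be imported from its submodule, or added back to the explicit list.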
I see a script being used to generate these. To see the changes from this PR, I was switching between the commits one at a time. It might be nice if the merge from main were separated.
Yes, as @isaac-chung mentioned, I wrote a script to generate the imports for tasks, but for the other directories I did it manually. For future PRs I won't merge
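The script itself is not included in the thread, so purely as an assumption about how such a generator could work: walk the package, collect the task classes each module defines, and emit explicit imports plus `__all__`. The paths and the class-collection rule below are illustrative, not the actual script:

```python
"""Rough sketch of an import generator (assumptions only; not the script used in this PR)."""

import ast
from pathlib import Path


def collect_exports(package_dir: Path) -> dict[str, list[str]]:
    """Map each module in the package to the class names it defines at top level."""
    exports: dict[str, list[str]] = {}
    for path in sorted(package_dir.glob("*.py")):
        if path.name == "__init__.py":
            continue
        tree = ast.parse(path.read_text())
        names = [node.name for node in tree.body if isinstance(node, ast.ClassDef)]
        if names:
            exports[path.stem] = names
    return exports


def render_init(exports: dict[str, list[str]]) -> str:
    """Render explicit `from .module import ...` lines plus __all__ for an __init__.py."""
    lines = [f"from .{module} import {', '.join(names)}" for module, names in exports.items()]
    all_names = sorted(name for names in exports.values() for name in names)
    lines += ["", "__all__ = ["] + [f'    "{name}",' for name in all_names] + ["]", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # Example invocation; the directory is a placeholder.
    print(render_init(collect_exports(Path("mteb/tasks/Retrieval"))))
```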
Thanks for tackling this :)
Might be worth running the same benchmarks in #1463 again as a comparison.
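For reference, re-running one of those benchmarks for a before/after comparison would look roughly like the sketch below; the model and task are placeholders, and the exact entry points should be checked against the current mteb README:

```python
import mteb
from sentence_transformers import SentenceTransformer

# Placeholder model and task; substitute the ones benchmarked in #1463.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
tasks = mteb.get_tasks(tasks=["NFCorpus"])

evaluation = mteb.MTEB(tasks=tasks)
evaluation.run(model, output_folder="results/star-import-comparison")
```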
Checklist
* `make test`
* `make lint`

Ref #1463
Also merged changes from `main` and found 3 datasets that previously were never imported: