fix: remove * imports #1569

Merged
merged 78 commits into v2.0.0 from update_imports
Dec 9, 2024

Conversation

Samoed
Collaborator

@Samoed Samoed commented Dec 8, 2024

Checklist

  • Run tests locally to make sure nothing is broken using make test.
  • Run the formatter to format the code using make lint.

Ref #1463
I also merged the changes from main and found 3 datasets that were previously never imported:

  • Ddisco
  • GeorgianSentimentClassification
  • WongnaiReviewsClassification
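One way such gaps could be caught is to compare the module files in a task directory against the relative imports in its `__init__.py`. The sketch below is illustrative only: the helper `unimported_modules` and the `ExampleTask*` names are hypothetical and are not taken from the PR or from mteb.

```python
import ast


def unimported_modules(init_source: str, module_names: list[str]) -> list[str]:
    """Return module names that no relative import in the __init__ source refers to."""
    tree = ast.parse(init_source)
    imported = {
        node.module.split(".")[0]
        for node in ast.walk(tree)
        # level >= 1 means a relative import such as `from .X import Y`
        if isinstance(node, ast.ImportFrom) and node.level >= 1 and node.module
    }
    return sorted(set(module_names) - imported)


# Hypothetical package contents: three module files, only two of them imported.
init_source = "from .ExampleTaskA import *\nfrom .ExampleTaskB import ExampleTaskB\n"
module_files = ["ExampleTaskA", "ExampleTaskB", "ExampleTaskC"]

missing = unimported_modules(init_source, module_files)
print(missing)  # ['ExampleTaskC']
```

A check like this only looks at import statements, so it would not notice a module that is imported but whose task class is never registered.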

Samoed and others added 30 commits November 14, 2024 11:52
* add more stat

* add more stat

* update statistics
Automatically generated by python-semantic-release
* Fixed task result loading from disk

* Fixed task result loading from disk
Automatically generated by python-semantic-release
Automatically generated by python-semantic-release
* fix: Removed column wrapping on the table, so that it remains readable

* Added disclaimer to figure

* fix: Added links to task info table, switched out license with metric
* small fix

* fix: fix
Automatically generated by python-semantic-release
Automatically generated by python-semantic-release
* add sum per lang

* add sort by sum option

* make lint
Automatically generated by python-semantic-release
* feat: add CUREv1 dataset

---------

Co-authored-by: nadshe <nadia.sheikh@clinia.com>
Co-authored-by: olivierr42 <olivier.rousseau@clinia.com>
Co-authored-by: Daniel Buades Marcos <daniel@buad.es>

* feat: add missing domains to medical tasks

* feat: modify benchmark tasks

* chore: benchmark naming

---------

Co-authored-by: nadshe <nadia.sheikh@clinia.com>
Co-authored-by: olivierr42 <olivier.rousseau@clinia.com>
Automatically generated by python-semantic-release
* check if model attr of model exists

* lint

* Fix retrieval evaluator
Automatically generated by python-semantic-release
* Made get_scores error tolerant

* Added join_revisions, made get_scores failsafe

* Fetching metadata fixed for HF models

* Added failsafe metadata fetching to leaderboard code

* Added revision joining to leaderboard app

* fix

* Only show models that have metadata, when filter_models is called

* Ran linting
Automatically generated by python-semantic-release
Automatically generated by python-semantic-release
* align readme with current mteb

* align with mieb branch

* fix test
Automatically generated by python-semantic-release
* add lang family mapping and map to task table

* make lint

* add back some unclassified lang codes
github-actions and others added 17 commits December 6, 2024 11:18
Automatically generated by python-semantic-release
* Correction of SICK-R metadata
* Correction of SICK-R metadata
---------
Co-authored-by: rposwiata <rposwiata@opi.org.pl>
…05` and `text-multilingual-embedding-002` (#1562)

* fix: google_models batching and prompt
* feat: add text-embedding-005 and text-multilingual-embedding-002
* chore: `make lint` errors
* fix: address PR comments
Automatically generated by python-semantic-release
# Conflicts:
#	docs/create_tasks_table.py
#	docs/tasks.md
#	mteb/abstasks/AbsTaskClassification.py
#	mteb/abstasks/AbsTaskClusteringFast.py
#	mteb/abstasks/AbsTaskInstructionRetrieval.py
#	mteb/abstasks/AbsTaskMultilabelClassification.py
#	mteb/abstasks/AbsTaskPairClassification.py
#	mteb/abstasks/AbsTaskReranking.py
#	mteb/abstasks/AbsTaskRetrieval.py
#	mteb/abstasks/AbsTaskSTS.py
#	mteb/descriptive_stats/InstructionRetrieval/Core17InstructionRetrieval.json
#	mteb/descriptive_stats/MultilabelClassification/MultiEURLEXMultilabelClassification.json
#	mteb/descriptive_stats/Reranking/AskUbuntuDupQuestions.json
#	mteb/descriptive_stats/Reranking/ESCIReranking.json
#	mteb/descriptive_stats/Reranking/WikipediaRerankingMultilingual.json
#	mteb/descriptive_stats/Retrieval/AppsRetrieval.json
#	mteb/descriptive_stats/Retrieval/BelebeleRetrieval.json
#	mteb/descriptive_stats/Retrieval/COIRCodeSearchNetRetrieval.json
#	mteb/descriptive_stats/Retrieval/CodeEditSearchRetrieval.json
#	mteb/descriptive_stats/Retrieval/CodeFeedbackMT.json
#	mteb/descriptive_stats/Retrieval/CodeFeedbackST.json
#	mteb/descriptive_stats/Retrieval/CodeSearchNetCCRetrieval.json
#	mteb/descriptive_stats/Retrieval/CodeSearchNetRetrieval.json
#	mteb/descriptive_stats/Retrieval/CodeTransOceanContest.json
#	mteb/descriptive_stats/Retrieval/CodeTransOceanDL.json
#	mteb/descriptive_stats/Retrieval/CosQA.json
#	mteb/descriptive_stats/Retrieval/JaqketRetrieval.json
#	mteb/descriptive_stats/Retrieval/NFCorpus.json
#	mteb/descriptive_stats/Retrieval/StackOverflowQA.json
#	mteb/descriptive_stats/Retrieval/SyntheticText2SQL.json
#	mteb/descriptive_stats/Retrieval/Touche2020.json
#	mteb/descriptive_stats/Retrieval/Touche2020Retrieval.v3.json
#	mteb/descriptive_stats/Retrieval/mFollowIRCrossLingualInstructionRetrieval.json
#	mteb/descriptive_stats/Retrieval/mFollowIRInstructionRetrieval.json
#	mteb/evaluation/MTEB.py
#	mteb/evaluation/evaluators/RetrievalEvaluator.py
#	mteb/leaderboard/table.py
#	mteb/model_meta.py
#	mteb/models/arctic_models.py
#	mteb/models/e5_models.py
#	mteb/models/nomic_models.py
#	mteb/models/sentence_transformers_models.py
#	mteb/tasks/PairClassification/multilingual/XStance.py
#	mteb/tasks/Reranking/zho/CMTEBReranking.py
#	mteb/tasks/STS/por/SickBrSTS.py
#	tests/test_benchmark/mock_tasks.py
Automatically generated by python-semantic-release
* fix: bm25s implementation

* correct library name

---------

Co-authored-by: Daniel Buades Marcos <daniel.buades@clinia.com>
* fix: Add training dataset to model meta

Addresses #1556

* Added docs

* format
… for visualization (#1564)

* feat: batch requests to cohere models

* fix: use correct task_type

* feat: use tqdm with openai

* fix: explicitly set `show_progress_bar` to False
Automatically generated by python-semantic-release
# Conflicts:
#	mteb/model_meta.py
@isaac-chung
Collaborator

Nice! I was actually wondering whether this could be merged to main instead? I don't think there were any major incompatible changes (as in, everything should run the same as before).

Contributor

@KennethEnevoldsen KennethEnevoldsen left a comment

Looks great. I'm very happy about this!

It was a big frustration for me in #1567 so I am very happy to see it. Did you do it all manually?

Nice! I was actually wondering whether this could be merged to main instead? I don't think there were any major incompatible changes (as in, everything should run the same as before).

Before you could e.g. do

from mteb import load_datasets

I believe this will no longer be possible
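To illustrate how a name like this can end up importable from a package at all, here is a minimal, self-contained sketch of star-import leakage. It builds throwaway in-memory modules; the module names (`demo_tasks`, `demo_star_pkg`, `demo_explicit_pkg`) and the stand-in `load_datasets` helper are assumptions for the demo and say nothing about how mteb is actually laid out.

```python
import sys
import types


def make_module(name: str, source: str) -> types.ModuleType:
    """Create an in-memory module from source and register it in sys.modules."""
    module = types.ModuleType(name)
    sys.modules[name] = module
    exec(source, module.__dict__)
    return module


# Hypothetical submodule that imports a helper for its own use.
make_module(
    "demo_tasks",
    "from json import loads as load_datasets\n"
    "class MyTask:\n"
    "    pass\n",
)

# Namespace built with a star import: every public name of demo_tasks leaks through.
star_pkg = make_module("demo_star_pkg", "from demo_tasks import *\n")

# Namespace built with an explicit import: only the listed name is exposed.
explicit_pkg = make_module(
    "demo_explicit_pkg",
    "from demo_tasks import MyTask\n__all__ = ['MyTask']\n",
)

print(hasattr(star_pkg, "load_datasets"))      # True
print(hasattr(explicit_pkg, "load_datasets"))  # False
```

If external code relied on a leaked name, the package could re-export it explicitly to keep the old import path working.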

Review thread on mteb/__init__.py (outdated, resolved)
@isaac-chung
Collaborator

I see a script being used to generate these. To see changes from this PR, I was switching the commits one at a time.

Might be nice if the merge from main was separated.

@Samoed
Collaborator Author

Samoed commented Dec 9, 2024

Yes, as @isaac-chung mentioned, I wrote a script to generate the imports for tasks, but for the other directories I did it manually. In future PRs I won't merge main and v2 into a feature branch at the same time.
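The script itself is not shown in this thread. As a rough idea of what generating explicit imports can look like, the sketch below parses a module's source with `ast` and emits a `from .module import ...` line for its public top-level classes. The function name `explicit_import_line` and the example classes are hypothetical.

```python
import ast


def explicit_import_line(module_name: str, source: str) -> str:
    """Build a `from .module import A, B` line from a module's public top-level classes."""
    tree = ast.parse(source)
    class_names = sorted(
        node.name
        for node in tree.body  # top-level statements only, nested classes are skipped
        if isinstance(node, ast.ClassDef) and not node.name.startswith("_")
    )
    if not class_names:
        return ""  # nothing to export, so no import line is generated
    return f"from .{module_name} import {', '.join(class_names)}"


# Hypothetical task module source.
example_source = '''
class ExampleTaskB:
    pass

class ExampleTaskA:
    pass

class _PrivateHelper:
    pass

def not_a_class():
    pass
'''

line = explicit_import_line("example_tasks", example_source)
print(line)  # from .example_tasks import ExampleTaskA, ExampleTaskB
```

Working on the parsed source rather than importing each module avoids executing task code while generating the import lines.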

Collaborator

@isaac-chung isaac-chung left a comment

Thanks for tackling this :)
Might be worth running the same benchmarks in #1463 again as a comparison.
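The benchmarks from #1463 are not reproduced here. As one possible way to compare import cost before and after a change like this, the sketch below times a module import in a fresh interpreter. The helper `time_fresh_import` is hypothetical, and the measurement includes interpreter startup, so only differences between runs are meaningful.

```python
import subprocess
import sys
import time


def time_fresh_import(module_name: str, repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds to import a module in a new interpreter."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # A new process guarantees the module is not already cached in sys.modules.
        subprocess.run([sys.executable, "-c", f"import {module_name}"], check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


# A standard-library module keeps the sketch runnable anywhere; swap in the package under test.
elapsed = time_fresh_import("json")
print(f"import json: {elapsed:.3f}s")
```

For a per-module breakdown, CPython's `-X importtime` option prints cumulative import times to stderr.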

@Samoed
Collaborator Author

Samoed commented Dec 9, 2024

Here are the profiling results, but it is hard to tell the difference.

[Image: profile]

@isaac-chung isaac-chung merged commit d0aa3a7 into v2.0.0 Dec 9, 2024
10 checks passed
@isaac-chung isaac-chung deleted the update_imports branch December 9, 2024 20:28
8 participants