Updated replication doc for msmarco-passage and msmarco-doc #1237

Merged May 28, 2020 (2 commits)

Conversation

adamyy (Contributor) commented May 28, 2020

Environment

OS: macOS Catalina 10.15.4
Java: openjdk 11.0.7 2020-04-14
Python: 3.7.5

MS MARCO Doc

  • With default BM25 parameters (k1=0.9, b=0.4)

image

  • With tuned BM25 parameters (k1=3.44, b=0.87)

image

MS MARCO Passage

  • With BM25 parameters k1=0.9, b=0.4

image

image

  • With BM25 parameters k1=0.82, b=0.68

image
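For readers unfamiliar with the two parameters varied above, here is a minimal sketch of how k1 and b enter a BM25 term score. It uses the textbook formula, which is an assumption: the exact variant computed by Anserini/Lucene may differ (for example in the numerator constant). The function name and the collection statistics are hypothetical and chosen only for illustration.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, df, num_docs, k1=0.9, b=0.4):
    """Textbook BM25 contribution of one query term to one document.

    k1 controls how quickly term frequency saturates;
    b controls how strongly document length is normalized.
    """
    idf = math.log(1.0 + (num_docs - df + 0.5) / (df + 0.5))
    length_norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1.0) / (tf + length_norm)


# Hypothetical statistics: a term occurring 3 times in a document
# that is twice as long as the collection average.
stats = dict(tf=3, doc_len=120, avg_doc_len=60, df=1000, num_docs=1_000_000)

default_score = bm25_term_score(**stats, k1=0.9, b=0.4)
tuned_score = bm25_term_score(**stats, k1=0.82, b=0.68)
print(f"k1=0.9,  b=0.4  -> {default_score:.4f}")
print(f"k1=0.82, b=0.68 -> {tuned_score:.4f}")
```

In this made-up case the larger b penalizes the longer-than-average document more heavily, so the second parameter pair gives the lower score. When a document's length equals the average, b drops out of the formula entirely.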

Issues

I hit the same pyserini issue mentioned in #1220 and had to fall back to the Java implementation as well.

lintool (Member) commented May 28, 2020

I think I fixed the issue: #1238

Can you merge, fix conflict, and re-check?

codecov bot commented May 28, 2020

Codecov Report

Merging #1237 into master will decrease coverage by 0.03%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##             master    #1237      +/-   ##
============================================
- Coverage     48.37%   48.33%   -0.04%     
+ Complexity      741      739       -2     
============================================
  Files           147      147              
  Lines          8559     8559              
  Branches       1217     1217              
============================================
- Hits           4140     4137       -3     
- Misses         4078     4082       +4     
+ Partials        341      340       -1     
| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| ...java/io/anserini/ltr/feature/CountBigramPairs.java | 89.61% <0.00%> (-3.90%) | 33.00% <0.00%> (-2.00%) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 94893f1...af7320d. Read the comment docs.
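
The percentages in the report can be reproduced from the raw counts in the coverage diff. The sketch below assumes coverage is computed as hits divided by all tracked lines (hits + misses + partials); the function name is hypothetical. The unrounded difference falls between the 0.03% quoted in the summary sentence and the -0.04% shown in the diff, which suggests the two figures are rounded differently, though that is a guess rather than documented behavior.

```python
def coverage_percent(hits, misses, partials):
    """Line coverage as a percentage: hits over all tracked lines."""
    lines = hits + misses + partials
    return 100.0 * hits / lines


# Counts taken from the coverage diff in this report.
master_cov = coverage_percent(hits=4140, misses=4078, partials=341)
pr_cov = coverage_percent(hits=4137, misses=4082, partials=340)
delta = pr_cov - master_cov

print(f"master={master_cov:.4f}%  pr={pr_cov:.4f}%  delta={delta:.4f}")
```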

adamyy (Contributor, Author) commented May 28, 2020

I re-ran msmarco-passage with the updated Python msmarco retrieve command, and it works fine this time. msmarco-doc does not seem to be affected by #1238, so I did not re-run it.

@lintool lintool merged commit a6a968a into castorini:master May 28, 2020
crystina-z pushed a commit to crystina-z/anserini that referenced this pull request Oct 28, 2022
* Refactor naming convention for files to ensure consistency across indexes, topics and qrels