MS MARCO passage regression errors: BM25prf gives non-deterministic results #774

Closed
lintool opened this issue Aug 11, 2019 · 6 comments · Fixed by #777 or #788

@lintool
Member

lintool commented Aug 11, 2019

Hi @emmileaf, I'm getting these MS MARCO passage regression errors:

This is on tuna:

2019-08-11 03:55:02,107 - regression_test - ERROR - !!!!!{"actual": 0.1518, "collection": "msmarco-passage", "expected": 0.152, "metric": "map", "model": "bm25-default+prf", "topic": "[MS MARCO Passage Ranking: Dev Queries](https://github.com/microsoft/MSMARCO-Passage-Ranking)"}!!!!!
...
2019-08-11 03:56:41,141 - regression_test - ERROR - !!!!!{"actual": 0.1579, "collection": "msmarco-passage", "expected": 0.1582, "metric": "map", "model": "bm25-tuned+prf", "topic": "[MS MARCO Passage Ranking: Dev Queries](https://github.com/microsoft/MSMARCO-Passage-Ranking)"}!!!!!

This is on another machine:

2019-08-11 03:38:45,575 - regression_test - ERROR - !!!!!{"actual": 0.1519, "collection": "msmarco-passage", "expected": 0.152, "metric": "map", "model": "bm25-default+prf", "topic": "[MS MARCO Passage Ranking: Dev Queries](https://github.com/microsoft/MSMARCO-Passage-Ranking)"}!!!!!
...
2019-08-11 03:39:51,630 - regression_test - ERROR - !!!!!{"actual": 0.158, "collection": "msmarco-passage", "expected": 0.1582, "metric": "map", "model": "bm25-tuned+prf", "topic": "[MS MARCO Passage Ranking: Dev Queries](https://github.com/microsoft/MSMARCO-Passage-Ranking)"}!!!!!

It seems like BM25prf gives non-deterministic results?

@matthew-z any ideas?
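
To compare logs like the ones above across machines, the JSON payload between the `!!!!!` markers can be pulled out and the gap between `actual` and `expected` computed. This is only a minimal sketch based on the log lines quoted in this issue; `parse_regression_errors` is a hypothetical helper and is not part of `run_regression.py`.

```python
import json
import re

# Matches the JSON payload wrapped in "!!!!!" markers, as in the log lines above.
PAYLOAD = re.compile(r"!!!!!(\{.*\})!!!!!")


def parse_regression_errors(log_lines):
    """Return one dict per regression error line, with an added 'delta' field."""
    errors = []
    for line in log_lines:
        match = PAYLOAD.search(line)
        if match is None:
            continue  # not a regression error line
        record = json.loads(match.group(1))
        # Negative delta means the run scored below the expected value.
        record["delta"] = round(record["actual"] - record["expected"], 4)
        errors.append(record)
    return errors


# The two bm25-default+prf lines quoted above (tuna, then the other machine).
SAMPLE_LOG = [
    '2019-08-11 03:55:02,107 - regression_test - ERROR - !!!!!{"actual": 0.1518, "collection": "msmarco-passage", "expected": 0.152, "metric": "map", "model": "bm25-default+prf", "topic": "[MS MARCO Passage Ranking: Dev Queries](https://github.com/microsoft/MSMARCO-Passage-Ranking)"}!!!!!',
    '2019-08-11 03:38:45,575 - regression_test - ERROR - !!!!!{"actual": 0.1519, "collection": "msmarco-passage", "expected": 0.152, "metric": "map", "model": "bm25-default+prf", "topic": "[MS MARCO Passage Ranking: Dev Queries](https://github.com/microsoft/MSMARCO-Passage-Ranking)"}!!!!!',
]

errors = parse_regression_errors(SAMPLE_LOG)
for record in errors:
    print(record["model"], record["metric"], record["actual"], record["delta"])
```

Different deltas for the same model and metric on two machines are what point to non-determinism rather than a stale expected value.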

@matthew-z
Contributor

BM25PRF is expected to give deterministic results.

Could you please run BM25PRF again with the -bm25prf.outputQuery argument to log the query expansion? I will also try to run it on my server, but I don't have the collection right now.

@emmileaf
Member

From a quick look at the runs, this might be a score tie handling issue?

rs = searcher.search(newQuery, context.getSearchArgs().hits);

// Figure out how to break the scoring ties.
if (context.getSearchArgs().arbitraryScoreTieBreak) {
    rs = searcher.search(finalQuery, context.getSearchArgs().hits);
} else if (context.getSearchArgs().searchtweets) {
    rs = searcher.search(finalQuery, context.getSearchArgs().hits, BREAK_SCORE_TIES_BY_TWEETID, true);
} else {
    rs = searcher.search(finalQuery, context.getSearchArgs().hits, BREAK_SCORE_TIES_BY_DOCID, true);
}

I'll add in tiebreak handling similar to the other rerankers, and update regression numbers for PRF.
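
The idea behind the tie-break can be illustrated outside Lucene. The sketch below is not Anserini code: it assumes hits are plain `(docid, score)` pairs, that the two documents have exactly equal scores, and it adds a made-up third docid for contrast. `rank_hits` is a hypothetical helper that orders by score and then by docid, in the spirit of `BREAK_SCORE_TIES_BY_DOCID`.

```python
def rank_hits(hits, k):
    """Rank (docid, score) pairs by score descending, then docid ascending."""
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))[:k]


# The same three hits arriving in two different orders.
# The exact tie at 11.7268 and the docid "0000001" are assumptions for illustration.
arrival_a = [("1469568", 11.7268), ("3427981", 11.7268), ("0000001", 12.5)]
arrival_b = [("3427981", 11.7268), ("1469568", 11.7268), ("0000001", 12.5)]

# Sorting on score alone: Python's sort is stable, so tied hits keep their
# arrival order and the two rankings disagree.
score_only_a = sorted(arrival_a, key=lambda hit: -hit[1])
score_only_b = sorted(arrival_b, key=lambda hit: -hit[1])

# Sorting on (score, docid): the ranking no longer depends on arrival order.
tie_broken_a = rank_hits(arrival_a, 3)
tie_broken_b = rank_hits(arrival_b, 3)

print(score_only_a == score_only_b)
print(tie_broken_a == tie_broken_b)
```

With a secondary key, any two runs over the same scored hits produce the same ranking, which is the property the regression test relies on.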

@emmileaf
Member

An update on this issue: tried changing the line mentioned above and re-ran retrieval a couple of times, but results are still sometimes inconsistent.

The discrepancies look like they mostly come from closely scoring documents appearing in a different order, though there's probably something else in the code that I missed in the comment above…

A small example diff:

< 26664 Q0 1469568 841 11.726800 Anserini
< 26664 Q0 3427981 842 11.726799 Anserini
---
> 26664 Q0 3427981 841 11.726800 Anserini
> 26664 Q0 1469568 842 11.726799 Anserini
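
Swaps like this can also be located programmatically instead of reading a raw diff. The following is a rough sketch that assumes the six-column run format shown above (query id, `Q0`, docid, rank, score, tag); `parse_run` and `rank_differences` are hypothetical helpers, not Anserini utilities.

```python
def parse_run(lines):
    """Map query id -> {rank: docid} from six-column TREC run lines."""
    ranking = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        qid, _q0, docid, rank, _score, _tag = fields
        ranking.setdefault(qid, {})[int(rank)] = docid
    return ranking


def rank_differences(run_a, run_b):
    """List (qid, rank, docid_a, docid_b) wherever the two runs disagree."""
    diffs = []
    for qid in sorted(set(run_a) & set(run_b)):
        for rank in sorted(set(run_a[qid]) & set(run_b[qid])):
            if run_a[qid][rank] != run_b[qid][rank]:
                diffs.append((qid, rank, run_a[qid][rank], run_b[qid][rank]))
    return diffs


# The two sides of the diff quoted above.
first = parse_run([
    "26664 Q0 1469568 841 11.726800 Anserini",
    "26664 Q0 3427981 842 11.726799 Anserini",
])
second = parse_run([
    "26664 Q0 3427981 841 11.726800 Anserini",
    "26664 Q0 1469568 842 11.726799 Anserini",
])

diffs = rank_differences(first, second)
for qid, rank, doc_a, doc_b in diffs:
    print(qid, rank, doc_a, doc_b)
```

If every reported difference is a pair of adjacent ranks exchanging the same two docids, tie ordering is the likely cause rather than a change in the expanded query.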

@emmileaf
Member

Turns out I had made a different mistake while verifying/testing the changes earlier 🤦‍♀
Tie-breaking seems to have fixed it with no regression number changes - will follow up with a PR.

@lintool
Member Author

lintool commented Sep 4, 2019

Regression error has cropped up again when running:

python src/main/python/run_regression.py --index --collection msmarco-doc >& log.msmarco-doc

Results on damiano:

2019-09-04 03:02:24,322 - regression_test - ERROR - !!!!!{"actual": 0.1357, "collection": "msmarco-doc", "expected": 0.1359, "metric": "map", "model": "bm25-default+prf", "topic": "[MS MARCO Document Ranking: Dev Queries](https://github.com/microsoft/TREC-2019-Deep-Learning)"}!!!!!
2019-09-04 03:03:14,602 - regression_test - ERROR - !!!!!{"actual": 0.1559, "collection": "msmarco-doc", "expected": 0.1562, "metric": "map", "model": "bm25-tuned+prf", "topic": "[MS MARCO Document Ranking: Dev Queries](https://github.com/microsoft/TREC-2019-Deep-Learning)"}!!!!!

@lintool
Member Author

lintool commented Sep 5, 2019

Two trials on tuna (Java 8) give the same result.
Two trials on my iMac Pro (Java 8) give the same result.

I think it's just the case that we forgot to update the regression values.

See PR #788

crystina-z pushed a commit to crystina-z/anserini that referenced this issue Oct 28, 2022