Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add new test for different articles from the same book in DuplicateCeckTest #9769

Merged
merged 4 commits into from
Sep 4, 2023

Conversation

Luggas4you
Copy link
Contributor

@Luggas4you Luggas4you commented Apr 17, 2023

Adds a new test for #8885 to org.jabref.logic.database.DuplicateCheckTest which checks if different articles from the same book are duplicates.

Compulsory checks

Preview Give feedback

@Luggas4you Luggas4you marked this pull request as ready for review April 17, 2023 07:55
@ThiloteE
Copy link
Member

ThiloteE commented Apr 17, 2023

Do I understand this right that booktitle, editor, publisher, isbn and year have to be similar to count as excluded from duplicate check? If one of those 5 differs, then the merge duplicates dialogue will trigger? If one of these 5 misses, the dialogue will also always be triggered?

@Luggas4you
Copy link
Contributor Author

Luggas4you commented Apr 17, 2023

Do I understand this right that booktitle, editor, publisher, isbn and year have to be similar to count as excluded from duplicate check? If one of those 5 differs, then the merge duplicates dialogue will trigger? If one of these 5 misses, the dialogue will also always be triggered?

I think so, I do not know yet, I was too quick requesting the commit ^^ I will have to do some more work in order to make this work.

@ThiloteE
Copy link
Member

ThiloteE commented Apr 17, 2023

I tested this PR. Following entries trigger duplicate merge dialogue upon copying / pasting them into the same library:

@Article{article1,
  author    = {author1},
  title     = {title1},
  year      = {2000-01-01},
  pages     = {10-11},
  booktitle = {Book},
  chapter   = {1},
  date      = {2000-01-01},
  editor    = {editor},
  isbn      = {978-1-4684-8585-1},
  publisher = {publisher},
}

@Article{article2,
  author    = {author2},
  title     = {title2},
  year      = {2000-01-01},
  pages     = {20-22},
  booktitle = {Book},
  chapter   = {2},
  date      = {2000-01-01},
  editor    = {editor},
  isbn      = {978-1-4684-8585-1},
  publisher = {publisher},
}

@koppor koppor marked this pull request as draft April 18, 2023 07:38
@koppor
Copy link
Member

koppor commented Apr 18, 2023

I think so, I do not know yet, I was too fast requesting the commit ^^

You did a pull request, which is fine!

You are sharing information fast, which is also fine: "Move fast, break things" is the agile way of working. ^^

In the concrete case, the claim was that the issue #8885 was solved meanwhile. In general, all issue fixes should be covered by test cases - to have a machine-repeatable check. A machine-repeatable check frees us humas to click through JabRef. This pull request created a machine-repatable check for the issue. It turned out that the issue was NOT fixed. Thus, we need to really work on a fix.

@Luggas4you You can create an additional test case based on the data provided by @ThiloteE. This way, we also cover his case.

Depending on how we treat the difference between title1 and title2, this could also be treated as duplicate. Reason: If a title differs only in one letter, I would bet, it is a duplicate. Thus, maybe the test by @ThiloteE will really lead to a duplicate.

@Luggas4you used Performance on a Signal and Rest in Treatment as title, which are surely no duplicates.

More test cases:

  • If Rest in Treatment and {Rest in Treatment} were used as the two titles, the entries should be considered duplicate
  • If Rest in Treatment and {Rest} in {Treatment} were used as the two titles, the entries should be considered duplicate
  • If Rest in Treatment and {R}est in {T}reatment were used as the two titles, the entries should be considered duplicate
  • If Rest in Treatment and Rest in treatment were used as the two titles, the entries should be considered duplicate
  • If Rest in Treatment and Rest inn Treatment were used as the two titles, the entries should be considered duplicate
  • If Rest in Treatment and Restin Treatment were used as the two titles, the entries should be considered duplicate

@koppor
Copy link
Member

koppor commented Apr 18, 2023

Coming from #8885 (comment)

Entries may have the same isbn number, but not be the same entries.

Reason: A book has an ISBN number, but a book may contain multiple chapters by different authors. Each chapter can be an entry. This entry is not a duplicate.

@Siedlerchr
Copy link
Member

I think the original reason was: If two entries have the same identifier they are most likely duplicates. e..g two articles with the same doi means you already imported that article.
But we did not check for books, that book chapter can have the same isbn. IMHO this is a use case for crossref. You but the isbn in the book entry and the inbook entrires of the chapter references this and will have their isbn also filled when citing

@koppor
Copy link
Member

koppor commented May 7, 2023

I think the original reason was: If two entries have the same identifier they are most likely duplicates. e..g two articles with the same doi means you already imported that article.

Yes.

But we did not check for books, that book chapter can have the same isbn.

This was overlooked at the implementation. One can even have two @inproceeding entries with the same ISBN but different chapters of a book. Without using crossref. This is, for instance, the case, if one collects two interesting publications of one venue. Such a venue could be ZEUS.

IMHO this is a use case for crossref. You but the isbn in the book entry and the inbook entrires of the chapter references this and will have their isbn also filled when citing

Yes! We currently lack of good support for crossref.

@Siedlerchr
Copy link
Member

What is the status here? I like the idea of having extra tests, but I think the Duplicate manager does not yet take this into account. I think we can annotate the test with @ignore in the meantime and a description why it's disabled

@Luggas4you Luggas4you closed this Aug 19, 2023
@Luggas4you Luggas4you force-pushed the duplicate-check-test branch from 879a96b to 03b3f25 Compare August 19, 2023 12:42
@calixtus calixtus reopened this Aug 19, 2023
@koppor
Copy link
Member

koppor commented Sep 4, 2023

I added the description now using the value parameter of @Disabled. I will add it to the merge queue to move things forward.

@koppor koppor marked this pull request as ready for review September 4, 2023 14:28
@koppor koppor enabled auto-merge September 4, 2023 14:28
@koppor
Copy link
Member

koppor commented Sep 4, 2023

The GitHub workflows just seem to hang. At another occation, this was because the Workflow file was invalid. This is not the case here.

(Screenshot for archiving reasons)

grafik

Maybe, GitHub has issues, because the PR was opened before the merge queue was activated? Therefore, I try to close and reopen the PR.

@koppor koppor closed this Sep 4, 2023
auto-merge was automatically disabled September 4, 2023 17:03

Pull request was closed

@koppor koppor reopened this Sep 4, 2023
@koppor
Copy link
Member

koppor commented Sep 4, 2023

Temporarily removed "Create installer for X" to required checks. They will be checked by the merge queue nevertheless. Trying now.

@koppor
Copy link
Member

koppor commented Sep 4, 2023

Note that the Deployment * checks do not appear in the list if these checks are not required any more. -- This is a hint that the trigger of the deployment workflow is not being activated. Maybe, we got something wrong there? However, I did not find something wrong...

grafik

@koppor koppor added this pull request to the merge queue Sep 4, 2023
@koppor
Copy link
Member

koppor commented Sep 4, 2023

Note: We configured the merge queue checks NOT to fail. Thus, PRs won't be merged if the installer cannot be build:

grafik

@Siedlerchr
Copy link
Member

Mabye because it was a fork of jabref and therefore the build installer does not work/is skipped?

@koppor koppor mentioned this pull request Sep 4, 2023
6 tasks
Merged via the queue into JabRef:main with commit bef6d23 Sep 4, 2023
@koppor
Copy link
Member

koppor commented Sep 4, 2023

(For the interested readers)

Mabye because it was a fork of jabref and therefore the build installer does not work/is skipped?

I thought so too. I wondered, why it worked before the merge. I tried to re-create protection rules. Did not work. Then I came to the idea of triggers. See #10319 for the fix and the linked discussion for an explanation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants