-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Search in bs4_book often missing results #1431
Comments
For the record, this particular search failure bit me in a meeting last night. I literally trusted the search results for this word and stated that the only place where "budget" was mentioned was in Appendix 1. Yeah, I looked like an idiot. Yeah, I thought this got fixed a year ago. |
Sorry for the trouble @jtbayly We did not spend much time on bookdown last year will all the development on Quarto. I understand this is a problem, especially when using the search for exact matching and all result match. Current opiniated design of In your example your two other match have 0.83 score. So not shown. It seems Field norm as a high impact. I don't believe #1319 the right fix for a new default. I understand it fixes your use case, but I believe it could have side effect for other books. I'll enquire with more knowledgable fuse.js people to understand better the scoring. However, I think best approach would be to allow customisation for |
Allowing customization would be great, but I'm struggling to see why a book using Fuse.js would not want to ignore field length. Chapters can vary in their length dramatically. Leaving this as-is means the longer a chapter is the lower every single search result will score from that chapter. If you have a long chapter, you are always going to have trouble finding things in it, even if they are exact matches and should obviously be in the search results. It seems to me that this situation is precisely why Fuse.js offers the ability to ignore field length. If somebody can show me an example of setting "ignoreFieldNorm: true" harming search in a book, or even explain theoretically why they don't want to ignore field length, I'd be grateful. Because if this becomes customizable, I'm going to to change it for our books, and I don't want to do it without understanding why bookdown defaults the other way. It feels like I must be missing a really big downside if people are willing to put up with exact matches of a single word not being found by "search" in a book. |
I have discussed today with other, and also considering your good arguments, I believe we can make this change safely. I merged #1319 now. Thank a lot @jtbayly for your patience, perseverance and explanation. Sorry again for the delay on this one! Just a reminder: the default search in Happy to tweak again if needed. I am still planning to allow changing configuration but that is for another time. |
Glad to hear it. And I do think scoring results makes sense, and wherever the cutoff is it's always going to be a tradeoff. |
This old thread has been automatically locked. If you think you have found something related to this, please open a new issue by following the issue guide (https://yihui.org/issue/), and link to this old issue if necessary. |
Until this change is committed, search is simply broken in bs4book.
Take this book as an example (it's not running the latest version of gitbook, but this hasn't changed):
https://bylaws.evangelpresbytery.com
Do a search for "budget". Search will indicate that there is only one place in the book where the word is found: Appendix 1.
In point of fact, the word "budget" is found in several places in Section 4 and Section 6.
Now change ignoreFieldNorm: true as in the patch linked above, and try your search again. Now you get all three locations as search results.
I'm sure this is biting people when they search for stuff in books and don't realize they aren't seeing all the results. But they don't know what they don't know, so people aren't reporting it.
The text was updated successfully, but these errors were encountered: