Author aliases not always recognised when searching for books #3063

Closed
hughrun opened this issue Oct 24, 2023 · 5 comments · Fixed by #3325
Labels: bug (Something isn't working), feature: search

Comments

@hughrun
Contributor

hughrun commented Oct 24, 2023

Describe the bug
Searching for a name should return books by author/s with that string in the name or alias fields. e.g. https://bookwyrm.social/search?q=刘慈欣

However this appears to be inconsistent. e.g. https://youtu.be/InCOD_6wBP0

I'm splitting this issue out from #3047 because this looks like a bug, whereas the inability to set a particular form of authors' names to match the display language is really an enhancement request.

To Reproduce
Steps to reproduce the behavior:

  1. Go to https://reads.netsphere.pub/search?q=레이코+시미즈
  2. See search result
  3. Go to https://reads.netsphere.pub/search?q=Reiko+Shimizu
  4. See no results

Expected behavior
Both of these searches should return the same result

Screenshots
If applicable, add screenshots to help explain your problem.

Instance
reads.netsphere.pub

Additional context
See #3047

I assume this behaviour is determined by search_title_author in bookwyrm.book_search.py, but I don't understand enough about Django's search functionality to see where this might be going wrong.


@hughrun added the bug label Oct 24, 2023
@mouse-reeve
Member

The way title/author search works is that a hidden Postgres function creates a weighted search vector from the title, author, subtitle, and series name, and stores it in the search_vector field in the database. This function is executed whenever a book or author is edited. A friend and I wrung ourselves absolutely dry figuring out how to write the SQL for this and, as a result, made the deeply questionable decision to let the SQL live in a random migration file: https://github.com/bookwyrm-social/bookwyrm/blob/main/bookwyrm/migrations/0077_auto_20210623_2155.py#L39. This is obviously a horrible place for it to live.

Author aliases aren't included in the search vector, and I think the first step to updating the SQL query would be removing it from that migration so that it can be edited in a remotely normal way. It may be that the best way to delete the vector is to create a migration that drops it from the database (as the reverse migration in 0077 does).

Once it lives in a relatively sane location, it would just be a matter of editing some confusing and complicated PSQL.
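
To make the gap concrete, here is a toy Python model of a weighted search vector. It is only a sketch, not the actual Postgres function: the weight labels and their numeric values, the tokenizer, the class names, and the sample book are all assumptions made up for illustration. It shows why a query for an alias finds nothing when only the primary author name is indexed, and how indexing aliases alongside the name would change that.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Hypothetical numeric values for weight labels A (strongest) to D (weakest).
WEIGHTS = {"A": 1.0, "B": 0.4, "C": 0.2, "D": 0.1}


@dataclass
class Author:
    name: str
    aliases: list[str] = field(default_factory=list)


@dataclass
class Book:
    title: str
    subtitle: str = ""
    series: str = ""
    authors: list[Author] = field(default_factory=list)


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-word characters (a crude stand-in for real text parsing)."""
    return [token for token in re.split(r"\W+", text.lower()) if token]


def build_search_vector(book: Book, include_aliases: bool = False) -> dict[str, str]:
    """Map each token to the strongest weight label under which it appears."""
    vector: dict[str, str] = {}

    def add(text: str, label: str) -> None:
        for token in tokenize(text):
            # "A" sorts before "D", so the smaller label is the stronger one.
            if token not in vector or label < vector[token]:
                vector[token] = label

    add(book.title, "A")
    add(book.subtitle, "B")
    for author in book.authors:
        add(author.name, "C")
        if include_aliases:
            # The missing piece: index aliases at the same weight as the name.
            for alias in author.aliases:
                add(alias, "C")
    add(book.series, "D")
    return vector


def rank(vector: dict[str, str], query: str) -> float:
    """Return 0.0 unless every query token is in the vector; otherwise sum the weights."""
    tokens = tokenize(query)
    if not tokens or any(token not in vector for token in tokens):
        return 0.0
    return sum(WEIGHTS[vector[token]] for token in tokens)


# Invented sample data: which form is the name and which is the alias is an assumption.
book = Book(
    title="Example Title",
    authors=[Author(name="레이코 시미즈", aliases=["Reiko Shimizu"])],
)

without_aliases = build_search_vector(book)
with_aliases = build_search_vector(book, include_aliases=True)

print(rank(without_aliases, "레이코 시미즈") > 0)   # name is indexed
print(rank(without_aliases, "Reiko Shimizu") > 0)  # alias is not indexed
print(rank(with_aliases, "Reiko Shimizu") > 0)     # alias is indexed
```

In the real system the equivalent change would happen in the SQL that builds search_vector, and existing rows would presumably need their vectors rebuilt afterwards.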

@hughrun
Contributor Author

hughrun commented Oct 25, 2023

This sounds like a good idea. But it also raises the question: how come aliases seem to work sometimes in author searches?

@mouse-reeve
Member

I would bet that the author is listed in Chinese on some of the editions, and a different edition than the one that matched the search is being displayed. Once the search happens on the database level, there's some complicated logic that decides which edition of a work should be shown, and it can end up showing you the default edition rather than the one matched by Postgres.
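
A minimal sketch of that hypothesis, in Python. The Work and Edition classes, the substring match, and the sample data are all hypothetical and far simpler than the real edition-selection logic; the point is only that the edition that matches and the edition that gets displayed can be different records.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Edition:
    title: str
    author_name: str  # the author name as listed on this particular edition


@dataclass
class Work:
    editions: list[Edition]
    default_index: int = 0  # hypothetical: which edition is shown by default

    @property
    def default_edition(self) -> Edition:
        return self.editions[self.default_index]


def search(works: list[Work], query: str) -> list[tuple[Edition, Edition]]:
    """Return (matched edition, displayed edition) pairs, one per matching work."""
    results = []
    for work in works:
        for edition in work.editions:
            if query in edition.author_name:
                # The match happens on one edition, but the default one is displayed.
                results.append((edition, work.default_edition))
                break
    return results


# Illustrative sample data only.
works = [
    Work(
        editions=[
            Edition(title="The Three-Body Problem", author_name="Liu Cixin"),
            Edition(title="三体", author_name="刘慈欣"),
        ]
    )
]

matched, displayed = search(works, "刘慈欣")[0]
print(matched.title)    # the edition whose author field matched the query
print(displayed.title)  # the default edition that would be shown instead
```

Under this model a search for the Chinese form of the name appears to "find" an English-language edition, which would look like alias matching even though no alias was consulted.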

@mouse-reeve
Member

Hm well that doesn't seem to be the case for the example you linked.

@dato
Contributor

dato commented Nov 25, 2023

"and I think the first step to updating the SQL query would be removing it from that migration so that it can be edited in a remotely normal way"

Yes, that would unblock any work that needs to touch search_vector... I've prepared a PR for it, #3134.

@Minnozz Minnozz mentioned this issue Mar 20, 2024
5 tasks
@Minnozz Minnozz self-assigned this Mar 20, 2024