Anime recommendation with lexical field and reduced bias by using Bayesian score

Source of the algorithm: https://towardsdatascience.com/bayesian-ranking-system-77818e63b57b

How it was set up

The data was scraped from MyAnimeList using the Jikan API (https://jikan.moe/). The resulting database is stored in the anime_db.csv file and also in .pickle format (anime_db_pickle.pickle) for direct use.

Database format:

anime_ID | show type | anime name | synopsis | number of episodes | rating | number of votes | ranking | popularity | myanimelist direct link
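As a rough illustration of the format above, one row could be modelled in Python as follows. The class name, field names, types, and sample values are all assumptions made for this sketch; they are not taken from anime_db.csv or from the repository's code.

```python
from dataclasses import dataclass


@dataclass
class AnimeRecord:
    """One database row; field names are hypothetical and only mirror the columns above."""
    anime_id: int
    show_type: str
    name: str
    synopsis: str
    episodes: int
    rating: float      # average user rating
    votes: int         # number of users who rated the show
    ranking: int
    popularity: int
    url: str           # direct link to the show's page


# Made-up placeholder row, only to show the shape of the data.
example = AnimeRecord(
    anime_id=1,
    show_type="TV",
    name="Example Show",
    synopsis="A story about love and betrayal.",
    episodes=12,
    rating=8.5,
    votes=403000,
    ranking=100,
    popularity=50,
    url="https://example.invalid/anime/1",
)
```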

How it works

The user provides a word
That word is used to fetch a list of its most closely related words from https://relatedwords.org/api/related
Each word in that list (the lexical field) is pluralized using the pattern3.text.en module (note: pattern3 needs this fix before it can be used: https://stackoverflow.com/questions/52161349/indentationerrorexpected-an-indented-block)
Every word in the list is then searched for in the synopsis and title of every anime; anime that contain any of those words are kept
The matched anime are then scored with a Bayesian model that takes the number of votes into account along with the rating (ex: a show rated 9.7 with 10 votes is not better than a show rated 8.5 with 403000 votes)
Once every matched anime has its Bayesian score, they are sorted from highest score (best; closest to 1) to lowest (worst; furthest from 1)
Only anime with a Bayesian score greater than 0.75 are shown (this can be changed on line 127 of main.py, where you can put "smaller" to keep scores below the threshold instead, and where you can change the threshold value)
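The scoring, sorting, and threshold steps above can be sketched as follows. This is not the code in bayesian.py and not necessarily the formula from the linked article: it swaps in a plain Bayesian average (the rating is pulled toward a prior mean, with fewer votes meaning a stronger pull) and divides by 10 to land on a 0 to 1 scale. The function names, the prior mean of 7.0, and the prior weight of 1000 are assumptions chosen for illustration.

```python
def bayesian_score(rating, votes, prior_mean, prior_weight):
    """Shrink a 0-10 rating toward prior_mean, then scale the result to 0-1."""
    weighted = (votes * rating + prior_weight * prior_mean) / (votes + prior_weight)
    return weighted / 10.0


def rank(shows, prior_mean=7.0, prior_weight=1000, threshold=0.75):
    """Score (name, rating, votes) tuples, keep those above threshold, best first."""
    scored = [
        (name, bayesian_score(rating, votes, prior_mean, prior_weight))
        for name, rating, votes in shows
    ]
    kept = [item for item in scored if item[1] > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# The two shows from the example above (names are placeholders).
shows = [
    ("Few votes", 9.7, 10),
    ("Many votes", 8.5, 403000),
]
ranked = rank(shows)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Under these assumed prior values, the show with 10 votes is pulled almost all the way down to the prior mean and falls below the 0.75 threshold, while the show with 403000 votes keeps a score close to its raw rating and is the only one listed.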

Limitations

The lexical field generated is not always optimal (ex: "love" will generate a lexicon that includes the word "emotion", which is too vague and misleads the search)
Words with more than one meaning can cause issues (ex: "matter" (as in the matter in the universe) will also pick up "it doesn't matter" and "it matters")
I cannot confirm that the way I apply the Bayesian model is legitimate, since I am working with a rating score rather than upvotes/downvotes; my adaptation could be erroneous, although the results seem to align with what is expected
Only one word is supported as input, not a sentence (ex: "love" vs "love story with betrayal")
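The double-meaning limitation can be reproduced with a small sketch of word matching. This is not the repository's matching code: the helper names are hypothetical, appending "s" is a crude stand-in for a real pluralizer, and the related words and synopses are made up.

```python
import re


def naive_plural(word):
    """Stand-in for a real pluralizer: simply appends "s"."""
    return word + "s"


def matches(words, text):
    """Return True if any word (or its naive plural) appears as a whole word in text."""
    candidates = set(words) | {naive_plural(w) for w in words}
    if not candidates:
        return False
    pattern = r"\b(?:" + "|".join(re.escape(c) for c in sorted(candidates)) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


lexical_field = ["matter", "particle"]  # made-up related words for the input "matter"

physics = "Scientists discover a new form of matter."
idiom = "To him, it doesn't matter who wins."
plural = "She knows that friendship matters."
unrelated = "A cooking competition between rival schools."

for synopsis in (physics, idiom, plural, unrelated):
    print(matches(lexical_field, synopsis), "-", synopsis)
```

The first three synopses all match even though only the first one uses "matter" in the intended sense, which is why matching on words alone cannot tell the meanings apart.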

Potential future improvements

Creating an entire GUI with images/link/trailer video for the recommended anime
Generating a better lexical field and fixing double-meaning issues
Supporting more than one word as input
Possibly implementing sentiment analysis to strengthen the search's confidence, and even extracting the theme from the synopsis rather than naively matching words
