Four methodologies for finding similar documents; which one to use depends on the case at hand. The repository can be used to find similar documents among billions of documents.
text-mining bag-of-words locality-sensitive-hashing hashing-algorithm text-analytics mining-massive-datasets dictionary-of-keys
Updated Mar 19, 2017 - Python