OpenSubtitles Hash implementation.
OSHash is designed for speed: unlike most hashing algorithms, it doesn't read the whole file. This makes it well suited to hashing large files.
The latest stable release can be installed from PyPI:
$ pip install oshash
Simply import oshash and call the oshash function with your file path.
import oshash
file_hash = oshash.oshash("/path/to/file")
You can compute OSHash directly from the terminal.
$ oshash <file_path>
For example:
$ oshash /path/to/video.mp4
OSHash (/path/to/video.mp4) = d315edebf53a4af3
The accompanying graphs compare the hashing speed (in seconds) of OSHash with other algorithms for two different files: a 320p video (61.7 MB) and a 1080p video (339.4 MB).
You can create a comparison for any file with the following command:
$ python3 scripts/compare_algorithms.py <file_path>
To view the graphs, make sure you have matplotlib installed.
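For a rough idea of where the speed difference comes from, the sketch below times a full-file hash against reading only the first and last 64 KB. It is not the code in scripts/compare_algorithms.py: the function names are hypothetical, and SHA-256 is picked here only as a stand-in for "other algorithms".

```python
import hashlib
import os
import time


def time_full_sha256(path: str) -> float:
    """Seconds taken to SHA-256 the whole file, read in 1 MiB blocks."""
    start = time.perf_counter()
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(block)
    digest.hexdigest()
    return time.perf_counter() - start


def time_head_tail_read(path: str, chunk: int = 64 * 1024) -> float:
    """Seconds taken to read only the first and last `chunk` bytes."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        f.read(chunk)
        # Jump straight to the tail; clamp to 0 for files shorter than `chunk`.
        f.seek(max(os.path.getsize(path) - chunk, 0))
        f.read(chunk)
    return time.perf_counter() - start
```

The first timing grows with file size, while the second reads at most 128 KB regardless of size. Actual numbers depend on the disk and on whether the file is already in the operating system's cache.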
In pseudo-code, the hash is computed in the following way:
file_buffer = open("/path/to/file")
head_checksum = checksum(file_buffer.head(64 * 1024)) # 64KB
tail_checksum = checksum(file_buffer.tail(64 * 1024)) # 64KB
file_hash = file_buffer.size + head_checksum + tail_checksum
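The pseudo-code above can be sketched in Python as follows. This is an illustration, not the package's actual source. It assumes the usual OpenSubtitles convention: each 64 KB chunk is summed as little-endian unsigned 64-bit integers, all sums wrap at 64 bits, and the result is printed as 16 hex digits. The names `checksum` and `oshash_sketch` and the minimum-size check are assumptions made for this example.

```python
import os
import struct

CHUNK_SIZE = 64 * 1024  # 64 KB read from each end of the file
MASK_64 = 0xFFFFFFFFFFFFFFFF  # keeps sums within an unsigned 64-bit range


def checksum(chunk: bytes) -> int:
    """Sum the chunk as little-endian unsigned 64-bit integers, modulo 2**64."""
    words = struct.unpack(f"<{len(chunk) // 8}Q", chunk)
    return sum(words) & MASK_64


def oshash_sketch(path: str) -> str:
    """Return a 16-digit hex hash built from the file size, head, and tail."""
    size = os.path.getsize(path)
    # Assumption: reject files too small to supply a full chunk.
    if size < CHUNK_SIZE:
        raise ValueError("file is smaller than one 64 KB chunk")
    with open(path, "rb") as f:
        head = f.read(CHUNK_SIZE)
        f.seek(-CHUNK_SIZE, os.SEEK_END)
        tail = f.read(CHUNK_SIZE)
    total = (size + checksum(head) + checksum(tail)) & MASK_64
    return f"{total:016x}"
```

Because only two fixed-size chunks are read, the work done is the same for a 100 MB file as for a 10 GB one. The trade-off is that two files with the same size and identical first and last 64 KB produce the same hash.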
You can read more on the OpenSubtitles.org wiki.
Thanks to the OpenSubtitles.org team for this algorithm.