This project provides a simple implementation of cosine similarity between two feature vectors. Cosine similarity measures how closely two non-zero vectors point in the same direction by taking the cosine of the angle between them: 1 means identical direction, 0 means orthogonal, and -1 means opposite direction.
The project includes:
- A Python script to compute cosine similarity between feature vectors
- Support for reading vectors from text files
- Handling for both space and comma-separated values
Requirements:
- Python 3.x
- NumPy
Install the dependencies using:
uv sync
- Prepare your feature vectors in a text file (e.g., features.txt):
  - One vector per line
  - Values can be space or comma-separated
  - Example:
    0.5 0.8 0.3 0.9 0.1 0.2 0.7 0.4 0.6 0.3
- Run the script:
  uv run python main.py
The script will output:
- The input feature vectors
- The computed cosine similarity between them
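The input format described above (one vector per line, values separated by spaces or commas) could be parsed along the following lines. This is a minimal sketch, not the code in main.py: the function names `parse_vector` and `read_vectors` are hypothetical, and skipping blank lines is an assumption.

```python
import re
from typing import List

import numpy as np


def parse_vector(line: str) -> np.ndarray:
    """Parse one line of space- or comma-separated numbers into a 1-D array."""
    # Split on commas and/or runs of whitespace, then drop empty tokens.
    tokens = [t for t in re.split(r"[,\s]+", line.strip()) if t]
    return np.array([float(t) for t in tokens], dtype=float)


def read_vectors(path: str) -> List[np.ndarray]:
    """Read one vector per non-blank line (assumption: blank lines are ignored)."""
    with open(path, encoding="utf-8") as f:
        return [parse_vector(line) for line in f if line.strip()]
```

Splitting on a single regular expression lets the same code accept `0.5 0.8 0.3`, `0.5,0.8,0.3`, and `0.5, 0.8, 0.3` without a separate format check.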
Cosine similarity is computed using the formula:
cos(θ) = (A·B) / (||A|| * ||B||)
where:
- A·B is the dot product of vectors A and B
- ||A|| and ||B|| are the L2 norms (magnitudes) of vectors A and B
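The formula above maps directly onto NumPy. The sketch below shows one way to write it; the function name `cosine_similarity` is an assumption and may not match what main.py uses, and it deliberately omits input validation to keep the correspondence with the formula visible.

```python
import numpy as np


def cosine_similarity(a, b) -> float:
    """Return cos(theta) = (A·B) / (||A|| * ||B||) for two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    dot = np.dot(a, b)            # A·B
    norm_a = np.linalg.norm(a)    # ||A||, the L2 norm
    norm_b = np.linalg.norm(b)    # ||B||
    return float(dot / (norm_a * norm_b))


# The vectors (1, 0) and (1, 1) are 45 degrees apart, so the result is about 0.7071.
print(cosine_similarity([1.0, 0.0], [1.0, 1.0]))
```

Because both norms appear in the denominator, the result depends only on direction: scaling either vector by a positive constant leaves the similarity unchanged.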
The script includes error handling for:
- Missing input file
- Invalid file format
- Vectors of different dimensions
- Zero vectors (cosine similarity is undefined when either norm is 0)
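The four cases listed above could be checked roughly as follows. This is an illustrative sketch rather than the script's actual code: the function name `load_two_vectors`, the error messages, and the assumption that the file holds exactly two vectors are all hypothetical.

```python
import numpy as np


def load_two_vectors(path):
    """Load two vectors from `path`, raising a descriptive error on bad input."""
    # Missing input file
    try:
        with open(path, encoding="utf-8") as f:
            lines = [line for line in f if line.strip()]
    except FileNotFoundError:
        raise FileNotFoundError(f"Input file not found: {path}") from None

    # Invalid file format (assumption: exactly two non-blank lines are expected)
    if len(lines) != 2:
        raise ValueError(f"Expected 2 vectors, found {len(lines)}")

    vectors = []
    for number, line in enumerate(lines, start=1):
        try:
            # Treat commas as whitespace so both separators are accepted.
            values = [float(tok) for tok in line.replace(",", " ").split()]
        except ValueError:
            raise ValueError(f"Vector {number} contains a non-numeric value") from None
        vectors.append(np.array(values, dtype=float))

    a, b = vectors
    # Vectors of different dimensions
    if a.shape != b.shape:
        raise ValueError(f"Dimension mismatch: {a.size} vs {b.size}")
    # Zero vectors: the norm is 0, so the formula would divide by zero
    if not a.any() or not b.any():
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return a, b
```

Checking for zero vectors before dividing avoids a silent `nan` result, which NumPy would otherwise return with only a runtime warning.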