This is a demo of vector search using MongoDB Atlas and Google Cloud. The dataset is a catalogue of books. The project uses Node.js and Express for the server and Angular for the client.

Prerequisites:

- Node.js LTS

Follow the instructions below to run the demo locally.
- Clone the project.

  ```bash
  git clone https://github.com/mongodb-developer/Google-Cloud-Semantic-Search
  ```
- Navigate to the `prepare-data` directory and install the dependencies.

  ```bash
  npm install
  ```
- Create a free MongoDB Atlas account.
- Deploy a free M0 database cluster in a region of your choice.
- Complete the security quickstart.
- Add your connection string to `prepare-data/.env`. Note that you will have to create the `.env` file in the `prepare-data` folder yourself. Make sure to replace the placeholders with the credentials of the database user you created in the security quickstart.

  ```env
  ATLAS_URI="<your-connection-string>"
  ```
- Run the script for importing the dataset into your database.

  ```bash
  node ./prepare-data/import-data.js
  ```
- Navigate to your MongoDB Atlas database deployment and verify that the data is loaded successfully.
- Create a new Google Cloud project with billing enabled.
- Enable the Vertex AI and Cloud Functions APIs.
- Deploy a public 2nd generation Google Cloud Function with the implementation found in `google-cloud-functions/generate-embeddings/main.py`. Replace the `PROJECT_ID` and `LOCATION` placeholders in the file before deploying the function. Remember to also update the Entry Point to `generate_embeddings`.

  Note: The `LOCATION` parameter defines the region where the cloud function will run. Make sure this region supports the Vertex AI Model Garden; `europe-west1` does not.

  If you have the `gcloud` CLI installed, run the following deployment command.

  ```bash
  gcloud functions deploy generate-embeddings \
    --region=us-central1 \
    --gen2 \
    --runtime=python311 \
    --source=./google-cloud-functions/generate-embeddings/ \
    --entry-point=generate_embeddings \
    --trigger-http \
    --allow-unauthenticated
  ```
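Once the function is deployed, it can help to smoke-test it before wiring it into the scripts. The sketch below assumes a JSON request body with a `text` field; that payload shape is an assumption, so check `google-cloud-functions/generate-embeddings/main.py` for the actual contract. `buildEmbeddingRequest` and `embed` are hypothetical helpers, not part of the repo.

```javascript
// Sketch: smoke-testing the deployed Cloud Function from Node.js (18+, which
// ships a global fetch). The payload shape { text: ... } is an assumption --
// check google-cloud-functions/generate-embeddings/main.py for the real contract.
function buildEmbeddingRequest(text) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }), // assumed request body shape
  };
}

async function embed(functionUrl, text) {
  const res = await fetch(functionUrl, buildEmbeddingRequest(text));
  if (!res.ok) throw new Error(`Embedding request failed: ${res.status}`);
  return res.json(); // expected: a 768-dimensional vector (or an object wrapping one)
}

// Usage:
//   embed('<your-cloud-function-url>', 'a mystery novel set in Venice')
//     .then((vector) => console.log(vector));
```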
- Add the deployed function URL to `prepare-data/.env`.

  ```env
  ATLAS_URI="<your-connection-string>"
  EMBEDDING_ENDPOINT="<your-cloud-function-url>"
  ```
- Run the embeddings generation script.

  ```bash
  node ./prepare-data/create-embeddings.js
  ```

  Note that Vertex AI is limited to generating 600 embeddings per minute. If you're getting 403 errors, wait for a minute and rerun the script. Repeat until all documents are vectorized.
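If you would rather not rerun the script by hand, the quota can also be respected programmatically. This is a minimal sketch under the 600-per-minute assumption above, not the repo's `create-embeddings.js`: `chunk` and `embedAll` are hypothetical helpers, and `embedFn` stands in for whatever calls your embedding endpoint.

```javascript
// Sketch: staying under the ~600-embeddings-per-minute quota by working in
// batches and pausing between them. chunk() and embedAll() are hypothetical
// helpers, not part of the repo's create-embeddings.js.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function embedAll(texts, embedFn) {
  const batches = chunk(texts, 600);
  const results = [];
  for (let i = 0; i < batches.length; i++) {
    results.push(...(await Promise.all(batches[i].map(embedFn))));
    // Pause between batches (not after the last one) to wait out the quota.
    if (i < batches.length - 1) await sleep(60_000);
  }
  return results;
}
```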
- Go back to your MongoDB Atlas project and open the deployed database cluster. Verify that the `bookstore.books` collection has a new `text_embedding` field containing a multi-dimensional vector.
- Navigate to the Atlas Search tab and click on Create Search Index.
- Select JSON Editor under Atlas Vector Search and then click on Next.
- Select the database and collection, insert the following index definition, and click Save.

  ```json
  {
    "fields": [
      {
        "numDimensions": 768,
        "path": "text_embedding",
        "similarity": "euclidean",
        "type": "vector"
      }
    ]
  }
  ```
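Once the index is built, it can be queried with the `$vectorSearch` aggregation stage. The sketch below assumes you saved the index under the name `vector_index` (adjust if you chose a different name) and that `queryVector` is an embedding produced by the same Cloud Function; `buildVectorSearchPipeline` is a hypothetical helper, not part of the repo's `server` code.

```javascript
// Sketch: a $vectorSearch aggregation pipeline for the index defined above.
// Assumes the index was saved as "vector_index"; buildVectorSearchPipeline
// is a hypothetical helper name.
function buildVectorSearchPipeline(queryVector, limit = 10) {
  return [
    {
      $vectorSearch: {
        index: 'vector_index',   // the name you gave the index above
        path: 'text_embedding',  // the field holding the 768-dimensional vectors
        queryVector,             // embedding of the user's search text
        numCandidates: 100,      // candidates to consider before ranking
        limit,                   // number of results to return
      },
    },
    // Surface the relevance score alongside each book.
    { $addFields: { score: { $meta: 'vectorSearchScore' } } },
  ];
}

// Usage with the mongodb driver (connection setup omitted):
//   const books = await db.collection('books')
//     .aggregate(buildVectorSearchPipeline(queryVector)).toArray();
```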
- Navigate to the `server` directory.
- Copy the `prepare-data/.env` file.

  ```bash
  cp ../prepare-data/.env .
  ```
- Install the dependencies and run the application.

  ```bash
  npm install && npm start
  ```
- Open a new terminal window to run the client application.
- In the new window, navigate to the `client` directory.
- Install the dependencies and run the project.

  ```bash
  npm install && npm start
  ```
- Open the browser at `localhost:4200` and find books using the power of vector search!
Use at your own risk; not a supported MongoDB product