wikidex

Fast, local semantic search over a selected body of Wikipedia pages.

Installation

Using pip:

pip install -r requirements.txt

Usage

To construct the search index:

python construct_index.py --save_dir=<directory name> --title=<title of main Wikipedia list>

To query over the index:

python search.py --save_dir=<directory name>

Architecture

I chose prebuilt LLM orchestration tools such as LangChain and LlamaIndex because they are efficient and simple to use. This avoids manually implementing text chunking, embedding, and vector lookup with a library like FAISS.
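
To make the three steps named above concrete, here is a minimal, dependency-free sketch of chunking, embedding, and similarity lookup. It is not the code in this repository: the function names (`chunk_text`, `embed`, `cosine`, `search`) are hypothetical, and the bag-of-words "embedding" is a toy stand-in for a real embedding model and vector store.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of words (hypothetical helper)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    # Stop once the remaining words are already covered by the previous window.
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def embed(text):
    """Toy embedding: a sparse word-count vector, NOT a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query, chunks, top_k=1):
    """Return the top_k chunks most similar to the query (brute-force scan)."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


chunks = [
    "Paris is the capital of France",
    "The mitochondria is the powerhouse of the cell",
]
best = search("capital of France", chunks)[0]
```

A real pipeline would swap `embed` for a dense embedding model and the brute-force scan for an approximate nearest-neighbor index, which is the work the orchestration libraries take over.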

I also chose the open-source FastEmbed embedding model (defaults to BAAI/bge-base-en on Hugging Face). This prioritizes low latency and end-user experience while maintaining lookup accuracy.

A command line interface was chosen for simplicity and ease of implementation.
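
As an illustration of how little code a command line interface of this shape needs, the sketch below parses the two flags shown in the Usage section with Python's standard `argparse` module. It is an assumption about how such a script could be wired, not the actual contents of construct_index.py; `build_parser` and the example values are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown in Usage (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Construct a search index over Wikipedia pages (sketch)."
    )
    parser.add_argument(
        "--save_dir", required=True, help="directory where the index is stored"
    )
    parser.add_argument(
        "--title", required=True, help="title of the main Wikipedia list"
    )
    return parser


# Parse an explicit argument list here so the sketch runs without a shell;
# a real script would call parse_args() with no arguments to read sys.argv.
args = build_parser().parse_args(
    ["--save_dir=my_index", "--title=List of programming languages"]
)
print(args.save_dir, args.title)
```

Both the `--flag=value` form used in Usage and the `--flag value` form are accepted by `argparse` without extra configuration.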

jackhhao/wikidex
