Ehzoahis/SubKeywordSearch

Subjects Keywords Search

Searches sentences from the abstracts of articles in the arXiv dataset and returns those whose grammatical subject matches the keyword. Subjects that contain the keyword are highlighted.

Made by Haozhe Si, 01/10/2021

Dataset

Cornell ArXiv Dataset

Package Usage

SpaCy

Uses a spaCy model to perform dependency parsing. You do not need to install it unless you are modifying myindex. The model is not 100% accurate, which may cause some errors in search results.

pip install spacy
python -m spacy download en_core_web_md
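
As a rough illustration of how dependency parsing can pick out sentence subjects, here is a minimal sketch. It is not the repository's code: the function names `subject_spans` and `sentences_matching_subject` are hypothetical, and it assumes subjects carry the `nsubj` or `nsubjpass` dependency labels, as they do in spaCy's English models.

```python
from spacy.tokens import Doc, Span

# Dependency labels spaCy's English models assign to subjects (assumption of this sketch).
SUBJECT_DEPS = {"nsubj", "nsubjpass"}


def subject_spans(sent: Span) -> list[Span]:
    """Return the full subject phrase for every subject token in one sentence."""
    spans = []
    for token in sent:
        if token.dep_ in SUBJECT_DEPS:
            # left_edge/right_edge bound the token's subtree, i.e. the whole phrase.
            spans.append(sent.doc[token.left_edge.i : token.right_edge.i + 1])
    return spans


def sentences_matching_subject(doc: Doc, keyword: str) -> list[tuple[str, list[str]]]:
    """Return (sentence, matching subjects) for sentences whose subject contains keyword."""
    keyword = keyword.lower()
    hits = []
    for sent in doc.sents:
        matched = [s.text for s in subject_spans(sent) if keyword in s.text.lower()]
        if matched:
            hits.append((sent.text, matched))
    return hits
```

With the model installed, a parsed document comes from `spacy.load("en_core_web_md")(abstract_text)`, and `sentences_matching_subject(doc, "network")` returns only the sentences in which "network" appears inside a subject phrase. Parser mistakes carry straight through to the results.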

Whoosh

Uses the Whoosh module to build the search engine.

pip install whoosh

Flask

Uses the Flask module to build the web interface. Flask serves the app locally at http://127.0.0.1:5000/.

pip install flask
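
To give a sense of what a search endpoint with subject highlighting might look like, here is a small Flask sketch. The `/search` route, the `q` parameter, the `highlight` helper, and the in-memory `SENTENCES` list are all hypothetical stand-ins; arxiv_web.py may be organized quite differently.

```python
from flask import Flask, request
from markupsafe import Markup, escape

app = Flask(__name__)

# Hypothetical stand-in for search results: (sentence, subject) pairs.
SENTENCES = [
    ("Neural networks learn features.", "Neural networks"),
    ("The authors evaluate networks.", "The authors"),
]


def highlight(sentence, subject):
    """Wrap the first occurrence of the subject in <mark>, escaping everything else."""
    before, found, after = sentence.partition(subject)
    if not found:
        return escape(sentence)
    return escape(before) + Markup("<mark>") + escape(found) + Markup("</mark>") + escape(after)


@app.route("/search")
def search():
    """Return sentences whose subject contains the keyword given in ?q=..."""
    keyword = request.args.get("q", "").lower()
    hits = [
        str(highlight(sentence, subject))
        for sentence, subject in SENTENCES
        if keyword and keyword in subject.lower()
    ]
    return "<br>".join(hits) or "No matches."
```

Calling `app.run()` starts Flask's development server, which listens on http://127.0.0.1:5000/ by default, so a request to `/search?q=network` would return the first sentence with its subject wrapped in `<mark>`.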

Usage

Download core.zip, unpack it, and run:

python arxiv_web.py

By default, the corpus size is 10,000. You can change the corpus size in archivesearch.ipynb.
