text evaluation

Evaluation of text embedding techniques.

GETTING STARTED

Create and switch to the virtual environment:

cd text_evaluation
make create_environment
conda activate text_evaluation
make requirements

Then explore the notebooks in the notebooks directory.

Project Organization

LICENSE
Makefile
- Top-level Makefile. Type make for a list of valid commands.
README.md
- this file
data
- Data directory. Often symlinked to a filesystem with lots of space.
- data/raw
  - Raw (immutable) hash-verified downloads
- data/interim
  - Extracted and interim data representations
- data/processed
  - The final, canonical data sets for modeling.
docs
- A default Sphinx project; see sphinx-doc.org for details
models
- Trained and serialized models, model predictions, or model summaries
- models/trained
  - Trained models
- models/output
  - Predictions and transformations from the trained models
notebooks
- Jupyter notebooks. The naming convention is a number (for ordering), the creator's initials, and a short hyphen-delimited description, e.g. 1.0-jqp-initial-data-exploration.
references
- Data dictionaries, manuals, and all other explanatory materials.
reports
- Generated analysis as HTML, PDF, LaTeX, etc.
- reports/figures
  - Generated graphics and figures to be used in reporting
- reports/tables
  - Generated data tables to be used in reporting
- reports/summary
  - Generated summary information to be used in reporting
requirements.txt
- (if using pip+virtualenv) The requirements file for reproducing the analysis environment, e.g. generated with pip freeze > requirements.txt
environment.yml
- (if using conda) The YAML file for reproducing the analysis environment
setup.py
- Turns the contents of src into a pip-installable Python module (pip install -e .) so it can be imported in Python code
src
- Source code for use in this project.
- src/__init__.py
  - Makes src a Python module
- src/data
  - Scripts to fetch or generate data. In particular:
  - src/data/make_dataset.py
    - Run with python -m src.data.make_dataset fetch or python -m src.data.make_dataset process
- src/analysis
  - Scripts to turn datasets into output products
- src/models
  - Scripts to train models and then use the trained models to make predictions, e.g. predict_model.py, train_model.py
tox.ini
- tox file with settings for running tox; see tox.testrun.org
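
The notebook naming convention listed above (a number, the creator's initials, and a hyphen-delimited description) can be checked mechanically. The following is a minimal sketch, not part of this project: the regular expression and the helper name parse_notebook_name are assumptions chosen for illustration, and they assume lowercase initials and descriptions as in the example name.

```python
import re

# Hypothetical pattern for <number>-<initials>-<hyphen-delimited-description>.
# Assumes lowercase initials and a lowercase alphanumeric description.
NOTEBOOK_NAME = re.compile(
    r"^(?P<order>\d+(?:\.\d+)*)"
    r"-(?P<initials>[a-z]+)"
    r"-(?P<description>[a-z0-9]+(?:-[a-z0-9]+)*)$"
)


def parse_notebook_name(name):
    """Return (order, initials, description), or None if the name does not match."""
    # Accept names with or without the .ipynb extension.
    stem = name[: -len(".ipynb")] if name.endswith(".ipynb") else name
    match = NOTEBOOK_NAME.match(stem)
    if match is None:
        return None
    return match.group("order"), match.group("initials"), match.group("description")


if __name__ == "__main__":
    print(parse_notebook_name("1.0-jqp-initial-data-exploration.ipynb"))
    print(parse_notebook_name("Untitled.ipynb"))
```

A check like this could be run over the notebooks directory to flag files that would sort out of order or lack an author.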

This project was built using cookiecutter-easydata, an experimental fork of [cookiecutter-data-science](https://github.com/drivendata/cookiecutter-data-science) aimed at making your data science workflow reproducible.
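
Reproducibility here relies in part on the raw downloads in data/raw being hash-verified. How this project performs that check is not described in this file; the following is only a minimal sketch of the general idea, assuming SHA-256 and using hypothetical helper names (file_sha256, verify_download).

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large downloads need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Return True if the file's digest equals the expected hex string."""
    # Hex digests are compared case-insensitively.
    return file_sha256(path) == expected_hex.strip().lower()
```

Recording the expected digest next to each download URL lets a later fetch detect a changed or corrupted file before any processing starts.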

jc-healy/text_evaluation
