Installation and Testing on Biowulf

To install the active learning framework for natural language processing (ALNLP) of pathology reports:

Log in to Biowulf.

Go to the /data partition of Biowulf. For example:
```
cd /data/$USER/export
```
Export the current working directory to the alnlp_INSTALL variable:
```
export alnlp_INSTALL=$(pwd)
```
Clone this repository. Do this on Biowulf itself (that is, not from a Biowulf compute node, where GitHub access is limited):
```
cd $alnlp_INSTALL
git clone https://github.com/CBIIT/NCI-DOE-Collab-Pilot3-Active_learning_NLP.git
```

Allocate a compute node for the installation process:
```
sinteractive --mem=2g
```
Install the Miniconda package manager. Then, from the directory that contains environment.yml, create and activate an alnlp environment:
```
conda env create -f environment.yml -n alnlp
conda activate alnlp
```
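
Before continuing, it can help to confirm that the earlier steps took effect. The following is a minimal sketch, not part of the repository: `check_setup` is a hypothetical helper, and it assumes that `conda activate` sets the `CONDA_DEFAULT_ENV` environment variable to the environment name.

```python
import os
from pathlib import Path

# Directory name created by the git clone step.
REPO_NAME = "NCI-DOE-Collab-Pilot3-Active_learning_NLP"


def check_setup(environ):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    install_dir = environ.get("alnlp_INSTALL")
    if not install_dir:
        problems.append("alnlp_INSTALL is not set")
    elif not (Path(install_dir) / REPO_NAME).is_dir():
        problems.append(f"repository not found under {install_dir}")
    # Assumption: conda exposes the active environment name in this variable.
    if environ.get("CONDA_DEFAULT_ENV") != "alnlp":
        problems.append("the alnlp conda environment is not active")
    return problems


if __name__ == "__main__":
    for problem in check_setup(os.environ):
        print("WARNING:", problem)
```

Run in the same shell session as the steps above, the script prints nothing when both checks pass.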

In your conda environment, start Python and download the required NLTK data (stopwords and punkt):
```
python
>>> import nltk
>>> nltk.download('stopwords')
>>> nltk.download('punkt')
```
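
The same downloads can also be scripted instead of typed interactively. This is a minimal sketch, not a script shipped with the repository: `download_all` and `main` are hypothetical names, and it relies on `nltk.download` returning `True` on success and `False` on failure.

```python
REQUIRED_PACKAGES = ("stopwords", "punkt")


def download_all(download, packages=REQUIRED_PACKAGES):
    """Call download(name, quiet=True) for each package; return the names that failed."""
    return [name for name in packages if not download(name, quiet=True)]


def main():
    # Imported here so the helper above stays usable without NLTK installed.
    import nltk

    failed = download_all(nltk.download)
    if failed:
        raise SystemExit("NLTK downloads failed: " + ", ".join(failed))
    print("NLTK data ready:", ", ".join(REQUIRED_PACKAGES))
```

Calling `main()` inside the activated alnlp environment performs both downloads and exits with an error message if either one fails.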

Testing the Installation

To test the installation, run the example experiment:
```
cd $alnlp_INSTALL/NCI-DOE-Collab-Pilot3-Active_learning_NLP/experiments
python experiment_001.py
```

This example script runs the active learning loop for four logistic regression models, each using a different acquisition function, on the 20 Newsgroups dataset. In the loop's execute method, you can specify the percentage of the data to use for initial training, the size of the test set, and how many new samples each iteration of the loop selects for labeling. After execution, the script creates a report with all the results and plots in the given output folder. It also creates a subfolder named after the script (experiment_001 in this case) to store the plots in PDF format.
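
To illustrate what an acquisition function does in such a loop, here is a minimal sketch using three common uncertainty measures. These functions are assumptions chosen for illustration and may differ from the ones used in experiment_001.py. The names, the `select_for_labeling` helper, and the probability values are all hypothetical.

```python
import math


def least_confidence(probs):
    """Higher score means a less confident top prediction."""
    return 1.0 - max(probs)


def margin(probs):
    """Higher score means a smaller gap between the two most likely classes."""
    top, second = sorted(probs, reverse=True)[:2]
    return 1.0 - (top - second)


def entropy(probs):
    """Higher score means a flatter predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(pool_probs, batch_size, score):
    """Return the indices of the batch_size highest-scoring unlabeled samples."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: score(pool_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]


# Made-up predicted class probabilities for four unlabeled samples.
pool = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.50, 0.49, 0.01],
    [0.70, 0.20, 0.10],
]

for score in (least_confidence, margin, entropy):
    print(score.__name__, select_for_labeling(pool, 2, score))
```

Here `batch_size` plays the role of the number of new samples selected per iteration. In a full loop, the selected samples would be labeled, added to the training set, and the model retrained before the next selection.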
