This is a short tutorial on installing and compiling CUTEXT. If you are going to use the included executable (Java ARchive, JAR) file, you do not need to install or compile anything; just read the section 'Execution via JAR file' in the README.md file.
You only need the Java Development Kit (JDK) 1.7 or later. No IDE (such as Eclipse or IntelliJ IDEA) or automated build tool is needed.
If you don't have Java 1.7 or later, download the current Java Development Kit (JDK). To check whether a compatible version of Java is installed, use the following command:
java -version
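The version check can also be scripted. The following is a minimal sketch for a POSIX shell under Linux; the helper name java_major is our own and is not part of CUTEXT, and the script assumes the usual 'version "..."' line printed by Oracle and OpenJDK builds.

```shell
#!/bin/sh
# Hypothetical helper (not part of CUTEXT): print the major Java version for a
# version string such as "1.8.0_152" (prints 8) or "17.0.2" (prints 17).
java_major() {
  v=${1#1.}          # drop the legacy "1." prefix used up to Java 8
  v=${v%%[._-]*}     # keep only the leading number
  printf '%s\n' "$v"
}

# Run the check only if a 'java' launcher is found on the PATH.
if command -v java >/dev/null 2>&1; then
  # 'java -version' writes to stderr; take the first quoted version string.
  version=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  if [ -n "$version" ] && [ "$(java_major "$version")" -ge 7 ]; then
    echo "Java $version is compatible"
  else
    echo "Java 1.7 or later is required (found: ${version:-unknown})"
  fi
else
  echo "java was not found on the PATH"
fi
```

Note that this only checks the 'java' launcher; compiling also requires 'javac', which is shipped with the JDK.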
The PATH and CLASSPATH environment variables must include the path to Java and the path to the CUTEXT packages, respectively.
The PATH variable tells the Operating System (OS) where Java is located. How to set it therefore depends on the OS:
- Windows: Generally, installing Java in Windows automatically adds the path to the Java 'bin' directory to the PATH variable. To be sure, we type (cmd terminal):
echo %PATH%
We can also see the paths included in the PATH variable under 'Environment Variables' in 'System Properties'. The absolute path to the 'bin' directory of the installed JDK version must appear there. For example, if we have installed version 1.8.0_152 inside 'Program_Files', we will see something like this:
C:\Program_Files\Java\jdk1.8.0_152\bin
If it does not appear, we must add it. Again, this can be done in 'Environment Variables' by editing PATH and adding that path. It can also be done from the cmd terminal by typing the following (for the previous example), although in that case the change applies only to the current terminal session:
set PATH=C:\Program_Files\Java\jdk1.8.0_152\bin;%PATH%
- Linux: To include it in Linux, we open a terminal and type:
export PATH='route':$PATH
Where 'route' is the path to the Java 'bin' directory. These changes are temporary: they disappear when the terminal is closed. If we want them to be permanent, we can edit the .bashrc file by typing:
gedit /home/user/.bashrc
This will open .bashrc in the 'gedit' text editor. We just have to add the following two lines at the end of the file:
PATH=route:$PATH
export PATH
Once the changes are saved in 'gedit', they are permanent. We can close the terminal, reopen it, and type:
echo $PATH
Now, we will see that the PATH variable contains the path to the Java 'bin' directory.
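Rather than inspecting the output of echo by eye, the check can be scripted. The following is a minimal sketch for Linux; the helper names path_contains and prepend_path are our own, and the comparison is purely textual (a trailing slash makes two entries differ).

```shell
# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated list $2 (defaults to the current PATH).
path_contains() {
  case ":${2-$PATH}:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Hypothetical helper: print the list $2 with directory $1 prepended,
# unless $1 is already one of its entries.
prepend_path() {
  if path_contains "$1" "$2"; then
    printf '%s\n' "$2"
  else
    # ${2:+:$2} avoids a trailing colon when the list is empty.
    printf '%s\n' "$1${2:+:$2}"
  fi
}
```

For example, with 'route' standing for the Java 'bin' directory as above, export PATH=$(prepend_path route "$PATH") adds the directory only once, however many times .bashrc is read.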
The CLASSPATH environment variable is modified with the set command (Windows) or the export command (Linux). The format is:
set CLASSPATH=path1;path2 ... (Windows)
export CLASSPATH=path1:path2 ... (Linux)
The paths should be absolute: they should begin with the letter specifying the drive under Windows (for example, C:\) or with the root directory under Linux (for example, /home/user/). That way, the classes will still be found if you happen to switch to a different drive or working directory.
If your CLASSPATH environment variable has been set to a value that is not correct, or if your startup file or script is setting an incorrect path, you can unset CLASSPATH by using:
C:\> set CLASSPATH= (Windows)
$ export CLASSPATH= (Linux)
This command unsets CLASSPATH for the current command prompt window only.
If you have downloaded CUTEXT from GitHub, the directory structure will have been preserved as follows:
es/cnio/bionlp/cutext/...
The path to add to the CLASSPATH is the parent directory of 'es'. For example, we will assume that in Windows we have downloaded it to 'Software':
C:\Software\es\cnio\bionlp\cutext\...
Then, the CLASSPATH must be set to 'Software':
set CLASSPATH=C:\Software;%CLASSPATH%
In Linux it is very similar. For example, if we have downloaded it to 'Software':
/home/user/Software/es/cnio/bionlp/cutext/...
Then, again, the CLASSPATH must be set to 'Software':
export CLASSPATH=/home/user/Software/:$CLASSPATH
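A common mistake is to point the CLASSPATH at the wrong level of the directory tree (for example, at 'es' itself). The following minimal sketch for Linux checks a candidate directory; the helper names is_cutext_root and find_cutext_root are our own and are not part of CUTEXT.

```shell
# Hypothetical helper: succeed if directory $1 is a valid CLASSPATH root for
# CUTEXT, i.e. it directly contains the es/cnio/bionlp/cutext package tree.
is_cutext_root() {
  [ -d "$1/es/cnio/bionlp/cutext" ]
}

# Hypothetical helper: starting at the absolute path $1, walk up the directory
# tree and print the first directory that is a valid root. Fails if the
# filesystem root is reached without finding one.
find_cutext_root() {
  dir=$1
  while [ "$dir" != "/" ] && [ "$dir" != "." ]; do
    if is_cutext_root "$dir"; then
      printf '%s\n' "$dir"
      return 0
    fi
    dir=$(dirname "$dir")
  done
  return 1
}
```

With the example layout above, find_cutext_root /home/user/Software/es/cnio/bionlp/cutext/util would print /home/user/Software, which is the value to add to the CLASSPATH.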
Once the environment variables are set, we can compile and execute CUTEXT. To compile the .java files, we only have to move, in the terminal, to each directory that contains them and type:
javac *.java
For example, to compile the files in the 'util' directory, we type the following under Windows (we will assume that CUTEXT was downloaded to 'Software', see the previous section):
cd C:\Software\es\cnio\bionlp\cutext\util
C:\Software\es\cnio\bionlp\cutext\util> javac *.java
In Windows, the default file encoding is typically a legacy code page such as windows-1252 rather than UTF-8 (the usual default in Linux). To avoid compilation errors caused by certain characters, it is advisable to compile the Java files with the encoding flag set to utf8, like this:
javac -encoding utf8 *.java
Under Linux it is very similar (as before, we will assume that it was downloaded to 'Software', see the previous section):
$ cd /home/user/Software/es/cnio/bionlp/cutext/util
$ javac *.java
The rest of the directories that contain Java files (extension .java) are compiled in the same way.
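Repeating the compilation step by hand for every directory can be automated. The following is a minimal sketch for Linux, assuming the CLASSPATH has already been set as described earlier; the helper names java_dirs and compile_all are our own and are not part of CUTEXT.

```shell
# Hypothetical helper: list every directory under $1 that holds .java files.
java_dirs() {
  find "$1" -type f -name '*.java' -exec dirname {} \; | sort -u
}

# Hypothetical helper: compile each of those directories in turn, as in the
# manual steps. The body runs in a subshell, so the 'exit' below stops the
# loop at the first directory that fails without closing the terminal.
compile_all() (
  java_dirs "$1" | while IFS= read -r d; do
    echo "Compiling $d"
    (cd "$d" && javac -encoding utf8 *.java) || exit 1
  done
)
```

For the example layout, compile_all /home/user/Software would visit 'util', 'main' and any other directory that contains Java sources.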
Once compiled, we can execute it.
For this, the simplest way is to move (in the terminal) to the 'main' directory and type:
java es.cnio.bionlp.cutext.main.ExecCutext [options]
The options are described in the README.md file. To check that it runs correctly, you can ask it, for example, to show the execution options in the following way:
java es.cnio.bionlp.cutext.main.ExecCutext -help
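As an alternative to the CLASSPATH environment variable, the class path can be passed for a single run with the standard -cp option of the java launcher, which takes precedence over CLASSPATH. The following minimal sketch for Linux wraps this in a function; the name run_cutext and the JAVA_CMD override are our own assumptions and are not part of CUTEXT.

```shell
# Hypothetical wrapper: run CUTEXT with the class path given on the command
# line. $1 is the directory that contains 'es'; the remaining arguments are
# passed to ExecCutext unchanged. JAVA_CMD (assumed name) optionally replaces
# the 'java' launcher, e.g. to select a specific JDK.
run_cutext() {
  root=$1
  shift
  "${JAVA_CMD:-java}" -cp "$root" es.cnio.bionlp.cutext.main.ExecCutext "$@"
}
```

With the example layout, run_cutext /home/user/Software -help should show the execution options from any working directory.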