add md files

bagustris · Mar 7, 2024 · 8baaaaf
1 parent ba20bbf
commit 8baaaaf
Showing 7 changed files with 204 additions and 0 deletions.
diff --git a/docs/README.md b/docs/README.md
@@ -0,0 +1,14 @@
+# Nkululeko documentation  
+
+# How to build the documentation locally
+```bash
+# Install requirements
+$ pip install -r requirements.txt
+$ make html
+```
+
+Then open the built HTML at `build/html/index.html`:
+
+```bash
+$ firefox build/html/index.html
+```
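
A note on the last step of `docs/README.md`: `firefox` is only one way to view the result. As a minimal sketch, Python's standard `webbrowser` module can open the same file in whatever browser is the system default. The helper names `built_index_uri` and `open_built_docs` are hypothetical and not part of Nkululeko; the `build/html/index.html` path is the one given in the README.

```python
import webbrowser
from pathlib import Path


def built_index_uri(docs_dir="."):
    """Return a file:// URI for build/html/index.html under docs_dir."""
    index = Path(docs_dir).resolve() / "build" / "html" / "index.html"
    return index.as_uri()


def open_built_docs(docs_dir="."):
    """Open the built documentation in the system default browser."""
    index = Path(docs_dir).resolve() / "build" / "html" / "index.html"
    if not index.is_file():
        # The HTML only exists after a successful `make html`.
        raise FileNotFoundError(f"{index} not found; run `make html` first")
    return webbrowser.open(index.as_uri())
```

Calling `open_built_docs()` from the `docs` directory after `make html` should have the same effect as the `firefox` command, without depending on a particular browser.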
diff --git a/docs/source/hello_world_aud.md b/docs/source/hello_world_aud.md
@@ -0,0 +1,43 @@
+# Hello world!
+
+This is a hello-world example of using Nkululeko with a dataset in [audformat](https://audeering.github.io/audformat/). It is also
+available on [Google
+Colab](https://colab.research.google.com/drive/1GYNBd5cdZQ1QC3Jm58qoeMaJg3UuPhjw?usp=sharing#scrollTo=4G_SjuF9xeQf)
+and
+[Kaggle](https://www.kaggle.com/felixburk/nkululeko-hello-world-example).
+
+In this setup, we will use the Berlin Emodb dataset. Check the `data/emodb` directory for the dataset, then follow the instructions below.
+First, change the directory to the root of the project:
+
+```bash
+# Download using wget
+wget https://zenodo.org/record/7447302/files/emodb.zip
+# Unzip
+unzip emodb.zip
+# Change to the Nkululeko parent directory
+cd ..
+# Run the Nkululeko experiment
+python -m nkululeko.nkululeko --config tests/exp_emodb_os_xgb.ini
+```
+
+Then, check the results in the `results` directory.
+
+You can experiment by changing some parameters in the INI file. For instance, change the `type` of the model to `xgb` or `svm` and see how the results change.
+<!-- -->
+
+    [EXP]
+    root = ./results
+    name = exp_emodb
+    [DATA]
+    databases = ['emodb']
+    emodb = ./emodb/
+    emodb.split_strategy = specified
+    emodb.train_tables = ['emotion.categories.train.gold_standard']
+    emodb.test_tables = ['emotion.categories.test.gold_standard']
+    target = emotion
+    labels = ['anger', 'boredom', 'disgust', 'fear', 'happiness', 'neutral', 'sadness']
+    [FEATS]
+    type = ['os']
+    [MODEL]
+    type = xgb
+    [PLOT]
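
A note on the INI file in `hello_world_aud.md`: the suggested experiment (switching the model `type` between `xgb` and `svm`) can also be scripted. The sketch below uses Python's standard `configparser` on a copy of the configuration shown in the diff. The function names `switch_model_type` and `read_labels` are hypothetical, and how Nkululeko itself parses the file is not shown here; the use of `ast.literal_eval` assumes the list values are valid Python literals, as they are in this example.

```python
import ast
import configparser
import io

# Copy of the configuration from hello_world_aud.md.
EXAMPLE_INI = """\
[EXP]
root = ./results
name = exp_emodb
[DATA]
databases = ['emodb']
emodb = ./emodb/
emodb.split_strategy = specified
emodb.train_tables = ['emotion.categories.train.gold_standard']
emodb.test_tables = ['emotion.categories.test.gold_standard']
target = emotion
labels = ['anger', 'boredom', 'disgust', 'fear', 'happiness', 'neutral', 'sadness']
[FEATS]
type = ['os']
[MODEL]
type = xgb
[PLOT]
"""


def switch_model_type(ini_text, model_type):
    """Return the INI text with [MODEL] type replaced by model_type."""
    config = configparser.ConfigParser()
    config.read_string(ini_text)
    config["MODEL"]["type"] = model_type
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


def read_labels(ini_text):
    """Parse the [DATA] labels value into a Python list."""
    config = configparser.ConfigParser()
    config.read_string(ini_text)
    return ast.literal_eval(config["DATA"]["labels"])
```

Writing the output of `switch_model_type(EXAMPLE_INI, "svm")` to a new `.ini` file gives a second experiment definition that differs only in the model type, which makes the two result sets easy to compare.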
diff --git a/docs/source/hello_world_csv.md b/docs/source/hello_world_csv.md
@@ -0,0 +1,43 @@
+# Hello World with Pre-processed CSV Dataset
+
+In the previous tutorial, we learned how to use Nkululeko with a dataset in audformat. 
+Since most datasets are not in audformat, we will now learn how to use Nkululeko with a pre-processed CSV dataset.
+
+## Dataset Location 
+
+The dataset is assumed to be located in the `data` directory under the `Nkululeko` root directory. The best practice is to store datasets in the `/data/` or `/home/$USER/data/` directory and then make a symbolic link to each dataset in the Nkululeko `data` directory. 
+Here is an example of downloading a dataset into its location, pre-processing it, and running the experiment. The main idea of the pre-processing is to convert the dataset into a format that Nkululeko can understand. Usually, the pre-processing is done by running the `process_database.py` script. You can learn more about the pre-processing in each dataset directory (`/nkululeko/data`).
+
+Let's start with the ravdess directory.
+
+The `ravdess` folder is used to import the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)
+database into Nkululeko.
+
+We used the version downloadable from [Zenodo](https://zenodo.org/record/1188976).
+
+Download and unzip the file `Audio_Speech_Actors_01-24.zip`:
+```bash
+$ wget https://zenodo.org/record/1188976/files/Audio_Speech_Actors_01-24.zip
+$ unzip Audio_Speech_Actors_01-24.zip
+```
+
+Run the pre-processing script:
+```bash
+python3 process_database.py
+```
+
+Change to the Nkululeko parent directory:
+
+```bash
+cd ../..
+```
+
+Then, as a test, you might run:
+
+```bash
+python3 -m nkululeko.nkululeko --config data/ravdess/exp_ravdess_os_xgb.ini 
+```
+
+Check the results in the `results` folder under the Nkululeko parent directory.
+
+It is as simple as that. Check your results and play with some parameters. If you face any problems, please open an issue on [our GitHub](https://github.com/felixbur/nkululeko/).
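
A note on the "Dataset Location" advice in `hello_world_csv.md`: keeping datasets in a central storage directory and linking each one into the Nkululeko `data` directory can be done with `ln -s`, or scripted as in the sketch below. The function name `link_dataset` and its parameters are hypothetical; only the idea of one symbolic link per dataset comes from the tutorial. Creating symbolic links may require extra privileges on Windows.

```python
from pathlib import Path


def link_dataset(storage_dir, nkululeko_data_dir, name):
    """Link <storage_dir>/<name> into <nkululeko_data_dir>/<name>."""
    source = Path(storage_dir).expanduser() / name
    link = Path(nkululeko_data_dir) / name
    if not source.is_dir():
        raise FileNotFoundError(f"dataset directory {source} does not exist")
    link.parent.mkdir(parents=True, exist_ok=True)
    if link.is_symlink() or link.exists():
        # Leave an existing link or directory untouched.
        return link
    link.symlink_to(source.resolve(), target_is_directory=True)
    return link
```

For example, `link_dataset("/home/$USER/data".replace("$USER", "me"), "data", "ravdess")` would be one way to apply the layout described in the tutorial, with `me` standing in for the actual user name.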
diff --git a/docs/source/how_to.md b/docs/source/how_to.md
@@ -0,0 +1,57 @@
+# How to set up your first nkululeko project
+
+Nkululeko is a framework to build machine learning models that recognize
+speaker characteristics on a very high level of abstraction (i.e.
+starting without programming experience).
+
+This post is meant to help you with setting up your first experiment,
+based on the Berlin Emodb.
+
+1)  Set up Python
+
+Nkululeko is written in Python, so first you have to set up a Python environment. 
+Linux-based systems are recommended for ease of use, but it should work on Windows as well. 
+The current version of nkululeko is tested with Python 3.8.5.
+
+2)  Get a database
+
+Download the Berlin emodb database to some location on your hard drive, as
+discussed in this post. I will refer to the location as "emodb root"
+from now on.
+
+3)  Install nkululeko
+
+Inside your virtual environment, run:
+
+    pip install nkululeko
+
+This should install nkululeko and all required modules. It takes a long
+time and a lot of space when done initially.
+
+4)  Adapt the ini file
+
+Use your favourite editor, e.g., Visual Studio Code, and edit the file
+that defines your experiment. You might start with this demo sample. You
+can find more templates to start here and an overview of all the options
+you can set here.
+
+Set the emodb value to the emodb root folder. For me, this looks like
+this:
+
+    emodb = /home/felix/data/audb/emodb
+
+An overview of all nkululeko options should be here.
+
+5)  Run the experiment
+
+In a shell (or in VS Code), start the process with:
+
+    python -m nkululeko.nkululeko --config exp_emodb.ini
+
+6)  Inspect the results
+
+If all goes well, the program should start by extracting opensmile
+features, and, once it is done, you should be able to inspect the
+results in the folder named like the experiment: `exp_emodb`. There
+should be a subfolder with a confusion matrix named `images`
+and a subfolder for the textual results named `results`.
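
A note on the "Inspect the results" step in `how_to.md`: a quick way to confirm that a run produced output is to list what ended up in the two subfolders. The sketch below assumes the layout described in the text, that is, an experiment folder (here `exp_emodb`) containing `images` and `results`, located under the `root` given in the INI file. The function name `list_experiment_outputs` is hypothetical, and the exact file names Nkululeko writes are not assumed.

```python
from pathlib import Path


def list_experiment_outputs(root, name, subfolders=("images", "results")):
    """Return {subfolder: sorted file names} for <root>/<name>/<subfolder>."""
    exp_dir = Path(root) / name
    outputs = {}
    for sub in subfolders:
        folder = exp_dir / sub
        if folder.is_dir():
            outputs[sub] = sorted(p.name for p in folder.iterdir())
        else:
            # A missing subfolder usually means the run has not finished.
            outputs[sub] = []
    return outputs
```

With the configuration used in these tutorials, `list_experiment_outputs("./results", "exp_emodb")` would be the matching call; an empty list for either key suggests the experiment did not complete.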
diff --git a/docs/source/ini_file.md b/docs/source/ini_file.md
@@ -0,0 +1 @@
+../../ini_file.md
diff --git a/docs/source/tutorial b/docs/source/tutorial
diff --git a/docs/source/usage.md b/docs/source/usage.md
@@ -0,0 +1,46 @@
+# Usage
+
+The main usage of Nkululeko is as follows:
+
+```bash
+python -m nkululeko.[MODULE] --config [CONFIG_FILE.ini]
+# Example to run the experiment
+python -m nkululeko.nkululeko --config INI_FILE.ini
+```
+
+where `INI_FILE.ini` is a configuration file. The only file
+needed by the user is the INI file (after preparing the dataset).
+That is why this tool is intended to be used with little or no coding. An
+example configuration file (`INI_FILE.ini`) is given below. See [INI
+file](ini_file.md) for complete options.
+
+```ini
+[EXP]
+root = ./results
+name = exp_emodb
+[DATA]
+databases = ['emodb']
+emodb = ./emodb/
+emodb.split_strategy = specified
+emodb.train_tables = ['emotion.categories.train.gold_standard']
+emodb.test_tables = ['emotion.categories.test.gold_standard']
+target = emotion
+labels = ['anger', 'boredom', 'disgust', 'fear', 'happiness', 'neutral', 'sadness']
+[FEATS]
+type = ['os']
+[MODEL]
+type = xgb
+```
+
+Besides `nkululeko.nkululeko`, there are other modules. The complete list is:
+
+- **nkululeko.nkululeko**: run experiments
+- **nkululeko.demo**: demo the current best model on the command line
+- **nkululeko.test**: predict a series of files with the current
+  best model
+- **nkululeko.explore**: perform data exploration
+- **nkululeko.augment**: augment the current training data
+- **nkululeko.predict**: predict a series of files with a given
+  model
+
+See the [API documentation](ini_file.md) for more details.
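
A note on the usage pattern in `usage.md`: since every module is started the same way (`python -m nkululeko.[MODULE] --config [CONFIG_FILE.ini]`), the call can be wrapped in a small helper when several modules are run in sequence. This is only a sketch: `build_command` and `run_module` are hypothetical names, the module list is copied from the documentation above, and only the `--config` flag shown there is used.

```python
import subprocess
import sys

# Module names as listed in usage.md.
MODULES = ("nkululeko", "demo", "test", "explore", "augment", "predict")


def build_command(module, config_file):
    """Build the argument list for `python -m nkululeko.<module> --config <file>`."""
    if module not in MODULES:
        raise ValueError(f"unknown module {module!r}; expected one of {MODULES}")
    # sys.executable keeps the call inside the current (virtual) environment.
    return [sys.executable, "-m", f"nkululeko.{module}", "--config", str(config_file)]


def run_module(module, config_file):
    """Run the module and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_command(module, config_file), check=True)
```

For instance, `run_module("explore", "exp_emodb.ini")` followed by `run_module("nkululeko", "exp_emodb.ini")` would explore the data and then run the experiment with the same configuration file.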