add md files
bagustris committed Mar 7, 2024
1 parent ba20bbf commit 8baaaaf
Showing 7 changed files with 204 additions and 0 deletions.
14 changes: 14 additions & 0 deletions docs/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
# Nkululeko documentation

## How to build the documentation locally
```bash
# Install requirements
$ pip install -r requirements.txt
# Build the HTML documentation
$ make html
```

Then open the built HTML page at `build/html/index.html`:

```bash
$ firefox build/html/index.html
```
43 changes: 43 additions & 0 deletions docs/source/hello_world_aud.md
@@ -0,0 +1,43 @@
# Hello world!

This is a hello world example of using Nkululeko with a dataset in [audformat](https://audeering.github.io/audformat/). The example is also
available in [Google
Colab](https://colab.research.google.com/drive/1GYNBd5cdZQ1QC3Jm58qoeMaJg3UuPhjw?usp=sharing#scrollTo=4G_SjuF9xeQf')
and
[Kaggle](https://www.kaggle.com/felixburk/nkululeko-hello-world-example).

In this setup, we will use the Berlin Emodb dataset. Check the `data/emodb` directory for the dataset and follow the instructions below.

0. Change the directory to the root of the project.

```bash
# Download using wget
wget https://zenodo.org/record/7447302/files/emodb.zip
# Unzip
unzip emodb.zip
# change to Nkululeko parent directory
cd ..
# run the nkululeko experiment
python -m nkululeko.nkululeko --config tests/exp_emodb_os_xgb.ini
```

Then, check the results in the `results` directory.

You can experiment by changing some parameters in the INI file. For instance, set the `type` of the model to `xgb` or `svm` and see how the results change.

    [EXP]
    root = ./results
    name = exp_emodb
    [DATA]
    databases = ['emodb']
    emodb = ./emodb/
    emodb.split_strategy = specified
    emodb.train_tables = ['emotion.categories.train.gold_standard']
    emodb.test_tables = ['emotion.categories.test.gold_standard']
    target = emotion
    labels = ['anger', 'boredom', 'disgust', 'fear', 'happiness', 'neutral', 'sadness']
    [FEATS]
    type = ['os']
    [MODEL]
    type = xgb
    [PLOT]
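
To try several model types without editing the file by hand, the configuration can also be rewritten programmatically. The following is a minimal sketch using Python's standard `configparser` module; it is not part of Nkululeko. The helper name `set_model_type` is hypothetical, and `BASE_INI` is an abbreviated copy of the example configuration above.

```python
import configparser
import io

# Abbreviated copy of the example configuration above (illustration only).
BASE_INI = """\
[EXP]
root = ./results
name = exp_emodb
[DATA]
databases = ['emodb']
emodb = ./emodb/
target = emotion
[FEATS]
type = ['os']
[MODEL]
type = xgb
"""


def set_model_type(ini_text: str, model_type: str) -> str:
    """Return a copy of ini_text with the [MODEL] type replaced (hypothetical helper)."""
    # interpolation=None keeps all values verbatim.
    config = configparser.ConfigParser(interpolation=None)
    config.optionxform = str  # keep option names exactly as written
    config.read_string(ini_text)
    config["MODEL"]["type"] = model_type
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


svm_ini = set_model_type(BASE_INI, "svm")
print(svm_ini)
```

The returned text can be saved under a new file name and passed to `--config` in the same way as the original INI file.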
43 changes: 43 additions & 0 deletions docs/source/hello_world_csv.md
@@ -0,0 +1,43 @@
# Hello World with Pre-processed CSV Dataset

In the previous tutorial, we learned how to use Nkululeko with an audformat dataset.
Since most datasets are not in audformat, this tutorial shows how to use Nkululeko with a pre-processed CSV dataset.

## Dataset Location

The dataset is assumed to be located in the `data` directory under the Nkululeko root directory. The best practice is to store datasets in `/data/` or `/home/$USER/data/` and then make a symbolic link to each dataset in the Nkululeko `data` directory.
Here is an example of downloading a dataset into its location, pre-processing it, and running the experiment. The main idea of the pre-processing is to convert the dataset into a format that Nkululeko can understand. Usually, the pre-processing is done by running the `process_database.py` script. You can learn more about the pre-processing in each dataset directory (`/nkululeko/data`).
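
The symbolic-link layout described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper `link_dataset` is hypothetical and not part of Nkululeko, and the demonstration runs in a temporary directory that stands in for the real storage location and the Nkululeko `data` directory. Creating symbolic links may require extra privileges on Windows.

```python
import tempfile
from pathlib import Path


def link_dataset(storage_dir: Path, data_dir: Path, name: str) -> Path:
    """Create data_dir/name as a symbolic link to storage_dir/name (hypothetical helper)."""
    target = storage_dir / name
    link = data_dir / name
    data_dir.mkdir(parents=True, exist_ok=True)
    # Only create the link if nothing is there yet (including a dangling link).
    if not (link.is_symlink() or link.exists()):
        link.symlink_to(target, target_is_directory=True)
    return link


# Demonstration in a throwaway directory so that no real paths are touched.
with tempfile.TemporaryDirectory() as tmp:
    storage = Path(tmp) / "storage"  # stands in for /data/ or /home/$USER/data/
    nkululeko_data = Path(tmp) / "nkululeko" / "data"
    (storage / "ravdess").mkdir(parents=True)
    link = link_dataset(storage, nkululeko_data, "ravdess")
    linked_ok = link.is_symlink() and link.resolve() == (storage / "ravdess").resolve()
    print(linked_ok)
```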

Let's start with the `ravdess` directory.

This `ravdess` folder is used to import the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)
database into Nkululeko.

We used the version downloadable from [Zenodo](https://zenodo.org/record/1188976).

Download and unzip the file `Audio_Speech_Actors_01-24.zip`:
```bash
$ wget https://zenodo.org/record/1188976/files/Audio_Speech_Actors_01-24.zip
$ unzip Audio_Speech_Actors_01-24.zip
```

Run the pre-processing script:
```bash
python3 process_database.py
```

Change to the Nkululeko parent directory:

```bash
cd ../..
```

Then, as a test, you might run:

```bash
python3 -m nkululeko.nkululeko --config data/ravdess/exp_ravdess_os_xgb.ini
```

Check the results in the `results` folder under the Nkululeko parent directory.

It is as simple as that. Check your results and play with some parameters. If you face any problems, please open an issue in [our GitHub repository](https://github.com/felixbur/nkululeko/).
57 changes: 57 additions & 0 deletions docs/source/how_to.md
@@ -0,0 +1,57 @@
# How to set up your first nkululeko project

Nkululeko is a framework to build machine learning models that recognize
speaker characteristics on a very high level of abstraction (i.e.
starting without programming experience).

This post is meant to help you with setting up your first experiment,
based on the Berlin Emodb.

1) Set up python

Nkululeko is written in Python, so first you have to set up a Python environment.
A Linux-based system is recommended for convenience, but it should work on Windows as well.
The current version of nkululeko is tested with Python 3.8.5.

2) Get a database

Download the Berlin Emodb database to some location on your hard drive, as
discussed in this post. I will refer to the location as "emodb root"
from now on.

3) Install nkululeko

Inside your virtual environment, run:

    pip install nkululeko

This should install nkululeko and all required modules. The first
installation takes a long time and a lot of disk space.

5) Adapt the ini file

Use your favourite editor, e.g., Visual Studio Code, to edit the file
that defines your experiment. You might start with this demo sample. You
can find more templates to start here and an overview of all the options
you can set here.

Put the emodb root folder as the emodb value, for me this looks like
this:

    emodb = /home/felix/data/audb/emodb

An overview of all nkululeko options should be here.

6) Run the experiment

In a shell (or in VS Code), start the process with:

    python -m nkululeko.nkululeko --config exp_emodb.ini

7) Inspect the results

If all goes well, the program should start by extracting opensmile
features. Once it has finished, you should be able to inspect the
results in the folder named after the experiment: `exp_emodb`. There
should be a subfolder named `images` that contains a confusion matrix
and a subfolder named `results` for the textual results.
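
To get a quick overview of what an experiment produced, the two subfolders can be listed with a few lines of Python. This is a minimal sketch that only assumes the layout described above (an experiment folder with `images` and `results` subfolders); the helper `list_outputs` is hypothetical, and the file names in the demonstration are placeholders, not actual Nkululeko output names.

```python
import tempfile
from pathlib import Path


def list_outputs(experiment_dir: Path) -> dict:
    """Map each output subfolder name to the sorted file names found in it."""
    outputs = {}
    for sub in ("images", "results"):
        folder = experiment_dir / sub
        # A missing subfolder is reported as an empty list.
        outputs[sub] = sorted(p.name for p in folder.iterdir()) if folder.is_dir() else []
    return outputs


# Demonstration with a fake experiment folder; the file names are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    exp_dir = Path(tmp) / "exp_emodb"
    (exp_dir / "images").mkdir(parents=True)
    (exp_dir / "results").mkdir()
    (exp_dir / "images" / "placeholder_plot.png").touch()
    (exp_dir / "results" / "placeholder_report.txt").touch()
    found = list_outputs(exp_dir)
    print(found)
```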
1 change: 1 addition & 0 deletions docs/source/ini_file.md
Empty file removed docs/source/tutorial
Empty file.
46 changes: 46 additions & 0 deletions docs/source/usage.md
@@ -0,0 +1,46 @@
# Usage

The main usage of Nkululeko is as follows:

```bash
python -m nkululeko.[MODULE] --config [CONFIG_FILE.ini]
# Example to run the experiment
python -m nkululeko.nkululeko --config INI_FILE.ini
```

where `INI_FILE.ini` is a configuration file. The INI file is the only
file the user needs to write (after preparing the dataset), which is
why the tool is intended to be used with little or no coding. An
example configuration file (`INI_FILE.ini`) is given below. See [INI
file](ini_file.md) for the complete options.

```ini
[EXP]
root = ./results
name = exp_emodb
[DATA]
databases = ['emodb']
emodb = ./emodb/
emodb.split_strategy = specified
emodb.train_tables = ['emotion.categories.train.gold_standard']
emodb.test_tables = ['emotion.categories.test.gold_standard']
target = emotion
labels = ['anger', 'boredom', 'disgust', 'fear', 'happiness', 'neutral', 'sadness']
[FEATS]
type = ['os']
[MODEL]
type = xgb
```

Besides `nkululeko.nkululeko`, Nkululeko provides other modules. The complete list of modules is:

- **nkululeko.nkululeko**: run experiments
- **nkululeko.demo**: demo the current best model on the command line
- **nkululeko.test**: predict a series of files with the current best model
- **nkululeko.explore**: perform data exploration
- **nkululeko.augment**: augment the current training data
- **nkululeko.predict**: predict a series of files with a given model
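
Because every module is started with the same command pattern, the command line can be assembled by a small helper. The sketch below is an illustration only: `build_command` is a hypothetical function, not part of Nkululeko, and the module names are simply those from the list above.

```python
import sys

# Module names taken from the list above.
MODULES = ("nkululeko", "demo", "test", "explore", "augment", "predict")


def build_command(module: str, config_file: str) -> list:
    """Build the argument list for `python -m nkululeko.<module> --config <file>`."""
    if module not in MODULES:
        raise ValueError(f"unknown module: {module!r}")
    return [sys.executable, "-m", f"nkululeko.{module}", "--config", config_file]


command = build_command("explore", "INI_FILE.ini")
print(" ".join(command[1:]))
# To actually run it (requires Nkululeko to be installed):
# import subprocess
# subprocess.run(command, check=True)
```

Using `sys.executable` keeps the call inside the currently active Python environment, which matters when Nkululeko is installed in a virtual environment.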

See the [API documentation](ini_file.md) for more details.
