- Download the Kaggle competition files from https://www.kaggle.com/c/text-normalization-challenge-russian-language/data into the /input folder
- Download the additional data set from https://storage.googleapis.com/text-normalization/ru_with_types.tgz into the /input/ru_with_types folder
- Download the missing files from https://drive.google.com/open?id=1eIWHqhc_HSa6IJsFXMuNsSe1eKMXukpU into the /obj folder
- Run rus_base.ipynb
- To get the final result, it is enough to run parts 0 (imports) and 5 (main loop)
- Parts 1-4 prepare the frequency dictionaries, which are all saved as .pkl files in the /obj folder
- Every cell starts with a note giving its running time
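
Before running the notebook, it may help to check that the folders from the steps above are in place and that the .pkl files in /obj can be read. The following is a minimal sketch, not part of rus_base.ipynb: the helper names `missing_dirs` and `load_pickles` are hypothetical, and it assumes the `input` and `obj` folders sit next to the notebook and that the .pkl files are ordinary Python pickles.

```python
import pickle
from pathlib import Path


def missing_dirs(root, names=("input", "input/ru_with_types", "obj")):
    """Return the expected sub-folders that do not exist under root.

    The folder names mirror the setup steps; treating them as relative
    to the notebook directory is an assumption.
    """
    root = Path(root)
    return [name for name in names if not (root / name).is_dir()]


def load_pickles(obj_dir):
    """Load every .pkl file in obj_dir into a dict keyed by file stem."""
    loaded = {}
    for path in sorted(Path(obj_dir).glob("*.pkl")):
        with path.open("rb") as fh:
            loaded[path.stem] = pickle.load(fh)
    return loaded


if __name__ == "__main__":
    absent = missing_dirs(".")
    if absent:
        print("Missing folders:", ", ".join(absent))
    else:
        dictionaries = load_pickles("obj")
        print("Loaded", len(dictionaries), "pickle file(s) from obj")
```

Unpickling executes code stored in the file, so only load .pkl files that come from a source you trust.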