  1. Download the Kaggle files from https://www.kaggle.com/c/text-normalization-challenge-russian-language/data to the /input folder
  2. Download the additional data set from https://storage.googleapis.com/text-normalization/ru_with_types.tgz to the /input/ru_with_types folder
  3. Download the missing files from https://drive.google.com/open?id=1eIWHqhc_HSa6IJsFXMuNsSe1eKMXukpU to the /obj folder
  4. Run rus_base.ipynb
  5. To get the final result, it is enough to run only parts 0 (imports) and 5 (main loop)
  6. Parts 1-4 prepare the frequency dictionaries, which are all saved as .pkl files in the /obj folder
  7. Every cell starts with information about its running time
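
Before opening the notebook, it may help to check that the folders from steps 1-3 are in place and that the .pkl files in /obj can be read. The following is only a sketch: the helper names `missing_dirs` and `load_dictionaries` are hypothetical and are not part of rus_base.ipynb, and it assumes that /input and /obj are resolved relative to the repository root.

```python
import pickle
from pathlib import Path

# Folder names taken from steps 1-3; treating them as relative to the
# repository root is an assumption.
EXPECTED_DIRS = ("input", "input/ru_with_types", "obj")


def missing_dirs(root: Path, names=EXPECTED_DIRS) -> list:
    """Return the expected folders that do not exist under root."""
    return [name for name in names if not (root / name).is_dir()]


def load_dictionaries(obj_dir: Path) -> dict:
    """Load every .pkl file in obj_dir, keyed by file name without extension.

    Hypothetical helper: the notebook may load its dictionaries differently.
    """
    loaded = {}
    for path in sorted(obj_dir.glob("*.pkl")):
        with path.open("rb") as fh:
            loaded[path.stem] = pickle.load(fh)
    return loaded


if __name__ == "__main__":
    repo_root = Path(".")
    absent = missing_dirs(repo_root)
    if absent:
        print("Missing folders:", ", ".join(absent))
    else:
        dictionaries = load_dictionaries(repo_root / "obj")
        print("Loaded dictionaries:", ", ".join(dictionaries) or "none")
```

Unpickling executes code embedded in the file, so `load_dictionaries` should only be pointed at .pkl files from a source you trust.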