GEO-AI Challenge for Landslide Susceptibility Mapping

Submission by Isaac Oluwafemi Ogunniyi (zindi username: isaacOluwafemiOg) for the ITU GeoAI Landslide Susceptibility Mapping challenge

This solution makes use of five files:

4 data files (all from the data provided as part of the competition):
- 'Train.gpkg' which contains geometries classified as either susceptible or not.
- 'geological_faults.gpkg' and 'land_use_land_cover.gpkg' for extracting additional features about any specific land geometry that will aid in the prediction of its susceptibility to landslides.
- 'Test.gpkg' contains the data of locations for testing the performance of the model
A Jupyter notebook named 'GeoLandslide.ipynb' containing the code for handling data, training model and predicting the landslide susceptibility of the test locations.

The environment

The Jupyter notebook runs in a python environment. For a Google Colab CPU runtime type, it lasts under 40 minutes. No paid subscriptions or resources were used

The Jupyter Notebook

This notebook takes input of all 4 data files mentioned previously and outputs 'Submission.csv'.
To run the notebook the 'data_path'' variable must contain the path to the directory containing all the data files mentioned previously.
The notebook contains python code and the whole code runs on Google colab lasting for under 40 minutes.
The General Overview of the Notebook is as follows:

Importing Relevant libraries geopandas (0.13.2), pandas (1.5.3), numpy (1.23.5), scikit-learn (1.2.2), imbalanced-learn (0.10.1)
Reading and Cleaning Data
Engineering Features The features fed into the model are 38 and fall into 3 categories:
- Features from the train and test geometries which include the area ('area') of the geometry under consideration
- Features from the land-use-land-cover categories which make up the geometry for which the susceptibility is being predicted. This includes total area of each of the land-use-land-cover category (eg: 'area_31' indicates the size of lands with land-use-land-cover code of 31 ('cod_31') that fall within the geometry under consideration)
- Features from geological fault lines that lie within the geometry for which susceptibility is being predicted. This includes the cumulative length ('SHAPE_LEN') of the fault lines that falls within the geometry.
Developing Pipelines and training models
- Pipeline involved scaling (standardScaler), selecting best features (SelectKBest) and then fitting a RandomForestClassifier model
- Prior to fitting the model to the training data, balance was introduced to the number of landslide susceptible and landslide-proof observations by the help of the SMOTE technique.
- The model's hyperparameters were tuned using the RandomizedsearchCV
Predicting and preparing submission
- Predictions are made on the data imported from 'Test.gpkg' which has been transformed to include the same features as the train data.
- The final predictions are exported to 'Submission.csv' in the same directory as the jupyter notebook.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

GEO-AI Challenge for Landslide Susceptibility Mapping

The environment

The Jupyter Notebook

Files

README.md

Latest commit

History

README.md

File metadata and controls

GEO-AI Challenge for Landslide Susceptibility Mapping

The environment

The Jupyter Notebook