This is the thesis of my bachelor's degree in Statistics at UPC-UB (Barcelona, Spain).

alvarorgaz/Guide-to-Spark-Machine-Learning-for-credit-scoring

Guide to Spark Machine Learning for credit scoring

Author: I am Álvaro Orgaz Expósito, a Statistics student at UPC-UB (Barcelona, Spain) and this is my bachelor's degree thesis.

Advisors: Ana María Pérez Marín, Catalina Bolancé Losilla

Department: Econometrics, Statistics and Applied Economics

Academic year: 2017-2018

Abstract:

This bachelor’s degree thesis aims to develop a predictive analytics guide for credit fraud detection that combines theory and practice using the Big Data engine Spark.

The theoretical part introduces the credit sector and reviews the theory behind the algorithms most commonly used today in machine learning, artificial intelligence and predictive analytics. The practical part applies these algorithms to predict the probability of default on a dataset of customer characteristics and compares the predictive performance of the resulting models.

In conclusion, this thesis will provide the necessary information for developing a predictive analytics project, from the business problem definition to the application of the optimal model.

Code:

The project was developed in R, mainly with the sparklyr and SparkR libraries.
