Author: I am Álvaro Orgaz Expósito, a Statistics student at UPC-UB (Barcelona, Spain), and this is my bachelor's degree thesis.
Advisors: Ana María Pérez Marín, Catalina Bolancé Losilla
Department: Econometrics, Statistics and Applied Economics
Academic year: 2017-2018
Abstract:
This bachelor’s degree thesis aims to develop a predictive analytics guide for credit fraud detection that combines theory and practice using the Big Data engine Spark.
The main objectives of the theoretical part are to introduce the credit sector and to review the theory behind the most common algorithms currently used in machine learning, artificial intelligence and predictive analytics. The main aims of the practical part are to apply the studied algorithms to predict the probability of default from a dataset of customer characteristics and to compare the predictive performance of the resulting models.
In conclusion, this thesis provides the information needed to develop a predictive analytics project, from the definition of the business problem to the application of the optimal model.
Code:
The project has been developed in R, mainly using the sparklyr and SparkR libraries.