In this study, we analyze patterns in fraudulent transactions using several modelling techniques, and we intend to use the results to predict and prevent similar fraud cases in the future. We performed feature engineering to develop new variables that helped our predictions. We trained three models on our dataset (logistic regression, random forest, and XGBoost) and measured the performance of each model in terms of balanced accuracy, sensitivity, and specificity.
According to the latest report from Javelin Strategy & Research, around 15.4 million consumers were victims of identity theft or fraud last year, and the total cost of this fraud was $16 billion. It is therefore paramount to enhance fraud detection mechanisms. Our objective is to predict future fraudulent activity as accurately as possible using the available data.
Relevance: digitalization in the financial sector has made it more vulnerable to fraud; the number of mobile and online transactions is increasing; the amount of literature.
Questions to be asked: Which sampling method works better for modeling and predicting fraudulent activity? Which modeling technique works best for fraud detection? Do the algorithms for mobile transactions and credit card transactions differ?
Several methods have been used to predict fraudulent transactions, including recognizing customer spending behavior, tracking network data, using advanced modeling techniques, and using biotechnology-based methods. In this experiment, we first downsample the data to create an approximately 50:50 split of fraud and non-fraud transactions. We then developed the three models mentioned above to test and predict fraud in transactions. We obtained an F1 score of 0.99 and an accuracy of 99.9%.
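The downsampling step can be illustrated with a minimal sketch. It assumes random undersampling of the majority class, which is one common way to reach a 50:50 split and may differ from the exact procedure used in this study. The function name `downsample`, the `is_fraud` label key, and the toy data are hypothetical and chosen only for illustration.

```python
import random


def downsample(rows, label_key="is_fraud", seed=0):
    """Randomly drop majority-class rows until both classes are the same size.

    `label_key` is a hypothetical column name; labels are assumed to be 0 or 1.
    """
    rng = random.Random(seed)
    fraud = [r for r in rows if r[label_key] == 1]
    legit = [r for r in rows if r[label_key] == 0]

    # Decide which class is smaller; the other one gets reduced.
    if len(fraud) <= len(legit):
        minority, majority = fraud, legit
    else:
        minority, majority = legit, fraud

    # Sample without replacement so no majority row appears twice.
    kept = rng.sample(majority, k=len(minority))

    balanced = minority + kept
    rng.shuffle(balanced)  # avoid all minority rows sitting at the front
    return balanced


# Toy data (made up): 5 fraud rows among 1,000 transactions.
transactions = [
    {"amount": 10.0 * i, "is_fraud": 1 if i % 200 == 0 else 0}
    for i in range(1000)
]

balanced = downsample(transactions)
n_fraud = sum(r["is_fraud"] for r in balanced)
print(len(balanced), n_fraud)
```

In a sketch like this, the balanced set would typically be used for training only, with evaluation done on data that keeps the original class ratio, since metrics computed on a 50:50 sample may not reflect performance on real traffic.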