A graph-based, semi-supervised, credit card fraud detection system

2016, Studies in Computational Intelligence

https://doi.org/10.1007/978-3-319-50901-3_57

Abstract

Global card fraud losses amounted to 16.31 billion US dollars in 2014 [18]. To recover this huge amount, automated Fraud Detection Systems (FDS) are used to deny a transaction before it is granted. In this paper, we start from a graph-based FDS named APATE [28], which uses collective inference to spread fraudulent influence through a network, starting from a limited set of confirmed fraudulent transactions. We propose several improvements to this approach, drawn from the network data analysis literature [16] and from semi-supervised learning [9]. Furthermore, we redesigned APATE to fit the reality of the e-commerce field. These improvements have a high impact on performance, multiplying Precision@100 by three for both fraudulent card and fraudulent transaction prediction. The new method is assessed on a three-month, real-life e-commerce credit card transaction data set obtained from a large credit card issuer.
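As a rough illustration of the collective inference idea described in the abstract, the sketch below spreads influence from one confirmed fraudulent transaction through a toy transaction/card/merchant graph using a random walk with restart, then scores the ranking with Precision@k. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the restart parameter `alpha`, the toy graph, and the toy labels are all hypothetical, and the exact propagation scheme used by APATE may differ.

```python
import numpy as np


def spread_fraud_influence(adjacency, seed_scores, alpha=0.85, n_iter=200, tol=1e-12):
    """Random walk with restart: propagate influence from seed (fraud) nodes.

    adjacency   : symmetric (n, n) array of non-negative edge weights
    seed_scores : length-n array, non-zero on confirmed fraudulent nodes
    alpha       : probability of following an edge (1 - alpha = restart), assumed value
    """
    A = np.asarray(adjacency, dtype=float)
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # isolated nodes: avoid division by zero
    P = A / col_sums                       # column-stochastic transition matrix

    restart = np.asarray(seed_scores, dtype=float)
    if restart.sum() == 0:
        raise ValueError("at least one seed node is required")
    restart = restart / restart.sum()

    scores = restart.copy()
    for _ in range(n_iter):
        new_scores = alpha * (P @ scores) + (1 - alpha) * restart
        converged = np.abs(new_scores - scores).sum() < tol
        scores = new_scores
        if converged:
            break
    return scores


def precision_at_k(scores, labels, k):
    """Fraction of true frauds among the k highest-scored items."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    top_k = np.argsort(-scores)[:k]
    return float(np.mean(labels[top_k]))


# Hypothetical toy graph. Nodes 0-2 are transactions t0..t2,
# nodes 3-4 are cards c0..c1, nodes 5-6 are merchants m0..m1.
edges = [(0, 3), (0, 5), (1, 3), (1, 6), (2, 4), (2, 6)]
n_nodes = 7
adjacency = np.zeros((n_nodes, n_nodes))
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1.0

seeds = np.zeros(n_nodes)
seeds[0] = 1.0                             # t0 is the confirmed fraud

scores = spread_fraud_influence(adjacency, seeds)

# Rank the two unlabeled transactions; the labels here are made up for the example.
unlabeled = [1, 2]
toy_labels = np.array([1, 0])              # assume t1 is fraud, t2 is genuine
p_at_1 = precision_at_k(scores[unlabeled], toy_labels, k=1)
```

In this toy graph, t1 shares a card with the confirmed fraud t0, while t2 is only connected to t1 through a merchant, so t1 receives the higher score. On real data, k would be 100 and the labels would come from investigated transactions or cards.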

FAQs

How does the proposed graph-based approach improve fraud detection performance?

The findings indicate that the RCTK method enhances performance metrics like Precision@100 threefold in comparison to traditional models, particularly by addressing hub node impacts.

What role does semi-supervised learning (SSL) play in the detection system?

The research establishes that SSL leads to marked improvements in fraud detection accuracy, particularly when combined with techniques that mitigate hub node influence.

How does the system handle concept drift in fraudulent behavior?

The paper employs a dynamic modeling approach that incorporates a time decay factor within the graph structure to adapt to evolving fraudulent activities.
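One common way to build a time decay factor into a graph is to weight each edge by the age of the transaction it represents. The sketch below assumes an exponential decay; the functional form, the `gamma` parameter, and the function names are illustrative assumptions, not values or code taken from the paper.

```python
import math


def decayed_edge_weight(age_days, gamma=0.03):
    """Weight of an edge for a transaction that is `age_days` old.

    Assumes exponential decay exp(-gamma * age); gamma is a hypothetical
    tuning parameter controlling how fast old transactions lose influence.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return math.exp(-gamma * age_days)


def gamma_for_half_life(half_life_days):
    """Decay rate such that the weight halves every `half_life_days`."""
    return math.log(2) / half_life_days


# Example: with a one-week half-life, a week-old transaction counts half as much.
weekly_gamma = gamma_for_half_life(7)
week_old_weight = decayed_edge_weight(7, gamma=weekly_gamma)
```

With such weights, recent fraudulent transactions push more influence through the network than old ones, which is one way a graph-based score can follow changing fraud patterns.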

What specific challenges does the system address regarding large datasets?

The method is designed to process millions of transactions rapidly, adhering to the six-seconds rule by utilizing efficient algorithms and pre-computed graph features.

Why are merchant risk scores excluded in the final model?

The research demonstrates that excluding merchant scores actually enhances performance, as new merchants tend to introduce unreliable risk assessments in the model's predictions.

References (30)

  1. Abdallah, A., Maarof, M.A., Zainal, A.: Fraud detection system. Journal of Network and Computer Applications 68, 90-113 (2016)
  2. Baesens, B., Van Vlasselaer, V., Verbeke, W.: Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection. Wiley Publishing (2015)
  3. Bhattacharyya, S., Jha, S., Tharakunnel, K., Westland, J.C.: Data mining for credit card fraud: A comparative study. Decision Support Systems 50(3), 602-613 (2011)
  4. Bolton, R., Hand, D.: Statistical fraud detection: A review. Statistical science 17, 235-249 (2002)
  5. Bolton, R.J., Hand, D.J.: Unsupervised profiling methods for fraud detection. In: Proceedings of the Credit Scoring and Credit Control VII Conference, pp. 235-255 (2001)
  6. Brandes, U., Erlebach, T.: Network analysis: methodological foundations. Springer-Verlag (2005)
  7. Braun, F., Caelen, O., Smirnov, E., Kelk, S., Lebichot, B.: Improving card fraud detection through suspicious pattern discovery. Submitted for publication (2016)
  8. Chapelle, O., Schölkopf, B., Zien, A.: Semi-supervised learning. MIT Press (2006)
  9. Dal Pozzolo, A.: Adaptive machine learning for credit card fraud detection. Ph.D. thesis, Université Libre de Bruxelles (2015)
  10. Dal Pozzolo, A., Boracchi, G., Caelen, O., Alippi, C., Bontempi, G.: Credit card fraud detection and concept-drift adaptation with delayed supervised information. In: Proceedings of the International Joint Conference on Neural Networks, pp. 1-8. IEEE (2015)
  11. Dal Pozzolo, A., Caelen, O., Le Borgne, Y.A., Waterschoot, S., Bontempi, G.: Learned lessons in credit card fraud detection from a practitioner perspective. Expert Systems with Applications 41(10), 4915-4928 (2014)
  12. Demšar, J.: Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research 7, 1-30 (2006)
  13. Ecommerce Europe: Global B2C e-commerce light report 2015 (2014). URL https://www.ecommerce-europe.eu/facts-figures/free-light-reports
  14. Fawcett, T., Provost, F.: Adaptive fraud detection. Data Mining and Knowledge Discovery 1, 291-316 (1997)
  15. Fouss, F., Françoisse, K., Yen, L., Pirotte, A., Saerens, M.: An experimental investigation of kernels on graphs for collaborative recommendation and semisupervised classification. Neural Networks 31, 53-72 (2012)
  16. Hara, K., Suzuki, I., Shimbo, M., Kobayashi, K., Fukumizu, K., Radovanović, M.: Localized centering: Reducing hubness in large-sample data. In: Proceedings of the Association for the Advancement of Artificial Intelligence Conference, pp. 2645-2651 (2015)
  17. HSN Consultants, Inc.: The Nilson Report (2015). URL https://www.nilsonreport.com/publication_newsletter_archive_issue.php?issue=1068
  18. Kemeny, J.G., Snell, J.L.: Finite Markov Chains. Springer-Verlag (1976)
  19. Lebichot, B., Kivimäki, I., Françoisse, K., Saerens, M.: Semi-supervised classification through the bag-of-paths group betweenness. IEEE Transactions on Neural Networks and Learning Systems 25, 1173-1186 (2014)
  20. Mantrach, A., van Zeebroeck, N., Francq, P., Shimbo, M., Bersini, H., Saerens, M.: Semi-supervised classification and betweenness computation on large, sparse, directed graphs. Pattern Recognition 44(6), 1212-1224 (2011)
  21. Newman, M.: Networks: an introduction. Oxford University Press (2010)
  22. Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: Bringing order to the web. Technical Report 1999-66, Stanford InfoLab (1999). Previous number = SIDL-WP-1999-0120
  23. Phua, C., Lee, V., Smith-Miles, K., Gayler, R.: A comprehensive survey of data mining-based fraud detection research. Computing Research Repository abs/1009.6119 (2010)
  24. Radovanović, M., Nanopoulos, A., Ivanović, M.: Hubs in space: Popular nearest neighbors in high-dimensional data. Journal of Machine Learning Research 11, 2487-2531 (2010)
  25. Radovanović, M., Nanopoulos, A., Ivanović, M.: On the existence of obstinate results in vector space models. In: Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '10, pp. 186-193. ACM (2010)
  26. Theodoridis, S., Koutroumbas, K.: Pattern recognition, 4th ed. Academic Press (2009)
  27. Van Vlasselaer, V., Bravo, C., Caelen, O., Eliassi-Rad, T., Akoglu, L., Snoeck, M., Baesens, B.: Apate: A novel approach for automated credit card transaction fraud detection using network-based extensions. Decision Support Systems 75, 38-48 (2015)
  28. Weston, D.J., Hand, D.J., Adams, N.M., Whitrow, C., Juszczak, P.: Plastic card fraud detection using peer group analysis. Advances in Data Analysis and Classification 2(1), 45-62 (2008)
  29. Zaslavsky, V., Strizhak, A.: Credit card fraud detection using self-organizing maps. Information and Security 18, 48 (2006)
  30. Zhou, D., Bousquet, O., Lal, T., Weston, J., Schölkopf, B.: Learning with local and global consistency. In: Proceedings of the Neural Information Processing Systems Conference (NIPS 2003), pp. 237-244 (2003)