Currently, fraud detection is employed in numerous domains, including banking, finance, insurance, government organizations, and law enforcement. The number of fraud attempts has grown significantly in recent years, making fraud detection critical for protecting personal information and sensitive data. Fraud takes several forms, such as stolen credit cards, forged checks, deceptive accounting practices, and card-not-present (CNP) fraud. This article introduces the credit card-not-present fraud detection and prevention (CCFDP) method for dealing with CNP fraud utilizing big data analytics. In order to deal with suspicious behavior, the proposed CCFDP includes two steps: the fraud detection process (FDP) and the fraud prevention process (FPP). The FDP examines the system to detect harmful behavior, after which the FPP assists in preventing malicious activity. Five cutting-edge methods are used in the FDP step: random undersampling (RU), t-distributed stochastic neighbor embedding (t-SNE), principal component analysis (PCA), singular value decomposition (SVD), and logistic regression learning (LRL). Because the experimental dataset is highly imbalanced, the FDP first balances it using random undersampling. Furthermore, to improve data representation, the FDP reduces the feature dimensionality using the t-SNE, PCA, and SVD algorithms, resulting in a speedier data training process and improved accuracy. The logistic regression learning (LRL) model is used by the FPP to evaluate the success and failure probability of CNP fraud. Python is used to implement the suggested CCFDP mechanism. We validate the efficacy of the proposed CCFDP mechanism based on the testing results.
Keywords: fraud detection; fraud prevention; big data analysis; t-SNE; PCA; SVD; LRL; RU; CNP
Nowadays, e-commerce is a regular and necessary part of everyday life. It enables instant payment for services and goods. Nonetheless, for the vast majority of people, the process of transferring money electronically is a "black box." This condition invites scammers who seek to profit unlawfully [[
Ref. [[
Ref. [[
Ref. [[
The following questions are addressed in this work: (
The main contributions are summarized as follows:
- ▪ The state-of-the-art techniques (RU, t-SNE, PCA, and SVD) are combined to address the persistent problem of card-not-present fraud. These techniques speed up the data training process and increase accuracy, which helps detect fraud successfully.
- ▪ Exploratory data analysis and predictive modeling are carried out to reduce dimensionality by projecting each data point onto only the first few principal components, obtaining lower-dimensional data while retaining as much of the variation in the data as feasible.
- ▪ t-SNE reduces dimensionality by keeping similar instances close together and dissimilar instances apart, further increasing accuracy.
- ▪ LRL is used to evaluate the success and failure probability of CNP fraud. The interaction of predictor variables is modeled to predict the relationship between distinct lawful and illegitimate transactions.
Section 2 presents the salient features of existing approaches. Section 3 presents the proposed plan. Section 4 shows the implementation and experimental results. Section 5 discusses the significance of the result and limitations including suggestions for improvement. Section 6 concludes the entire paper.
The fundamental issue that arises when attempting to create a fraud detection and prevention algorithm is that the number of fraudulent transactions is insignificant in comparison to the number of valid transactions. According to various statistics, credit card fraud accounts for around 0.1% of all card transactions [[
In order to address the issue of credit card fraud, ref. [[
It scores transactions quickly and accurately, and it can detect new fraudulent activities. Principal component analysis gives a more thorough picture of the relationships among distinct features while also being more adaptable. However, the risk remains of reaching a local optimum rather than a global one. This risk can be reduced by repeating the k-means procedure several times with different initial clusters, at the price of increased execution time.
Ref. [[
Ref. [[
The highly imbalanced sample sets are reduced by incorporating a cost-based sampling method in feature space, yielding superior fraud detection performance, and a novel trading feature called trading entropy is proposed to recognize increasingly complex fraud patterns. A characteristic matrix is used to represent many transaction records, on which a convolutional neural network is trained to recognize a set of latent patterns for each sample. Experimental findings on a commercial bank's actual transaction data show that the suggested approach outperforms other state-of-the-art strategies.
The issue of a false sense of security is addressed in [[
The authors of [[
Ref. [[
In [[
This research [[
Comparative research on data mining strategies for credit card fraud detection is undertaken in this work [[
The integration of multiple algorithms to improve card fraud detection is proposed [[
In this section, we describe our CCFDP mechanism. Our algorithm mainly focuses on solving CNP fraud committed through online credit card transactions. The CCFDP provides automatic detection of anomalies in the set of incoming transactions, as depicted in Figure 2. The detection involves two processes:
- ▪ Fraud Detection Process
- ▪ Fraud Prevention Process
In order to detect fraudulent credit card activity, we apply different types of rules, using the logistic regression algorithm to detect fraudulent activity. First of all, we apply the random undersampling (RU) method to balance our dataset. Next, we train our model using the dataset and the log files of the user. Once the model is trained well enough, we apply it to new transactions. It compares the features of a new transaction with the history of the user's transactions, and if it finds anomalies, it invokes the prevention process.
It describes methods for minimizing the number of variables in training data. When working with high-dimensional data, it may be helpful to reduce dimensionality by projecting the data onto a lower-dimensional subspace that captures the core of the data. The term high-dimensional refers to input variables that have hundreds, thousands, or even millions of possible values. Fewer input dimensions may mean fewer parameters, or a simpler structure, in the machine learning model, known as degrees of freedom. A model with many degrees of freedom is prone to overfitting the training dataset and therefore performs badly on new data. Simple models that generalize well, and input data with few input variables, are preferable. This is especially true for linear models, in which the number of inputs and the degrees of freedom are commonly related.
It is difficult to reduce dimensionality because different components have different occurrence probabilities. A common issue is determining how to describe and reduce these variables; thus, dimensionality reduction should be modeled. Let us assume there is a training dataset
(
The result
(
The complexity of the features is determined by the number of variables in the training data. The training data of a single variable are
(
(
The information entropy of the variables can be reduced linearly if they are independent of each other, but this is not the case. Multiple variables' training data are frequently related to one another. As a result, there is a need to use conditional self-information to reduce multiple variables at the same time, as given by:
$$I(x_i \mid x_j) = -\log_2 p(x_i \mid x_j)$$
It entails randomly selecting examples from the majority class and removing them from the training dataset. This process is repeated until the desired class distribution is obtained; for example, an equal number of examples for each class is maintained. RU is appropriate for the dimensionality-reduction features in datasets in which the minority class, despite the imbalance, has an adequate number of examples.
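As an illustration of random undersampling, the following minimal Python sketch (not the paper's implementation; the class labels and sizes are hypothetical) discards majority-class examples at random until both classes are the same size:

```python
import random

def random_undersample(majority, minority, seed=42):
    """Randomly keep only as many majority-class examples as there are
    minority-class examples, yielding a 1:1 class distribution."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    kept = rng.sample(majority, len(minority))
    return kept + minority

# Hypothetical imbalanced data: 1000 legitimate vs. 10 fraudulent records.
legit = [("legit", i) for i in range(1000)]
fraud = [("fraud", i) for i in range(10)]
balanced = random_undersample(legit, fraud)
```

The balanced list then holds an equal number of examples from each class.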
Let us take a hypothesis
Algorithm 1: Fraud Detection Process
1. Initialize the variables
2. Input: dataset Ds, database Db, log L
3. Output: transaction decision
4. Do
5.–11. Apply random undersampling to balance Ds; reduce the high-dimensional dataset to a low-dimensional representation; remove duplicated features
12. Perform exploratory data analysis and predictive modeling
13. Train the model on Ds using logistic regression
14. U ← Db (look up the card owner in the database)
15. If the lookup fails, send an alarm to Algorithm 2; otherwise request the transaction from Db
16. Apply the trained model to the transaction
17. Determine whether the transaction is malicious or non-malicious
18. If anomalous behavior is detected, send an alarm to Algorithm 2
19. Store Tr ⊂ Ds
20. End the process
21. Write L
22. Make Tr
In Algorithm 1, Step 1 shows the initialization of the utilized variables. Steps 2–3 demonstrate the input and output procedures, respectively. Steps 4–11 initiate the random undersampling process to balance the dataset. Furthermore, it is ensured that the high-dimensional dataset is reduced to a low-dimensional representation that retains the majority of the original data. Finally, duplicated features are also removed from the dataset. Step 12 starts both exploratory data analysis and predictive modeling. Step 13 uses the dataset and the logistic regression technique to train our model. Step 14 looks up the card owner in the database. Step 15 demonstrates that if the query result is not successful, an alarm is sent to Algorithm 2; otherwise, the transaction is requested from the database. Steps 16–17 demonstrate the process of applying the proposed model to the given transaction to determine its nature (malicious or non-malicious). Step 18 determines whether the model exhibits anomalous behavior; if so, it sends an alarm to Algorithm 2. Otherwise, Steps 19–20 record the transaction in the dataset and end the process. Step 21 depicts the log-writing procedure. Step 22 depicts the transactional process.
Undersampling removes information, and the precise probability of removing relevant samples cannot be determined. As a result, when using undersampling to drop information, it is best to first collect statistical data on the datasets used in the experiment: the difference between two datasets must be quantified, and the degree of difference must be described. The Levenshtein distance can be used to quantify the degree of separation between two character strings (such as English text). The Levenshtein distance is computed as follows:
$$\operatorname{lev}_{a,b}(i,j)=\begin{cases}\max(i,j) & \text{if } \min(i,j)=0,\\ \min\begin{cases}\operatorname{lev}_{a,b}(i-1,j)+1\\ \operatorname{lev}_{a,b}(i,j-1)+1\\ \operatorname{lev}_{a,b}(i-1,j-1)+\mathbf{1}_{(a_i\neq b_j)}\end{cases} & \text{otherwise}\end{cases}$$
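The Levenshtein recurrence can be evaluated bottom-up with dynamic programming in O(|a|·|b|) time; a minimal illustrative sketch (not the paper's code):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                    # deletion
                           cur[j - 1] + 1,                 # insertion
                           prev[j - 1] + (ca != cb)))      # substitution
        prev = cur
    return prev[len(b)]
```

For example, `levenshtein("kitten", "sitting")` yields 3 (two substitutions and one insertion).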
Another time-based difference characterization is disagreement decay, which is defined as the probability that the value of an attribute s changes within time t; this probability is denoted d(s, t). Mathematical statistics can be used to characterize the probability distribution function of this probability, given by:
(
The absolute value denotes the number of samples needed to generate an agreement decay. The agreement decay is the probability that an entity retains the same value of an attribute s over time t.
It reduces a high-dimensional dataset to a low-dimensional graph that retains the majority of the original data by placing each data point on a two- or three-dimensional map. This method finds data clusters, ensuring that the embedding preserves the structure of the data. In this case, t-SNE reduces dimensionality by seeking to keep similar instances close together and dissimilar instances apart.
Here, t-SNE computes probabilities
For
(
Therefore,
The conditional probability $p_{j\mid i}$ is given by:

$$p_{j\mid i}=\frac{\exp\left(-\lVert x_i-x_j\rVert^{2}/2\sigma_i^{2}\right)}{\sum_{k\neq i}\exp\left(-\lVert x_i-x_k\rVert^{2}/2\sigma_i^{2}\right)}$$
Therefore,
Hence
Where
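As a small illustration of the Gaussian affinities t-SNE computes in the high-dimensional space, the sketch below evaluates the conditional probabilities of a toy point set (the points and the bandwidth sigma are hypothetical; a full t-SNE implementation additionally tunes sigma per point via the perplexity):

```python
import math

def conditional_probs(points, i, sigma):
    """p_{j|i}: Gaussian-kernel similarity of every point j to point i,
    normalized over all j != i."""
    def sq_dist(p, q):
        return sum((pc - qc) ** 2 for pc, qc in zip(p, q))
    weights = [0.0 if j == i
               else math.exp(-sq_dist(points[i], p) / (2 * sigma ** 2))
               for j, p in enumerate(points)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical toy data: two nearby points and one distant outlier.
pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
p = conditional_probs(pts, 0, sigma=1.0)
```

The nearby point receives almost all of the probability mass, which is exactly the behavior that lets t-SNE keep similar instances together.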
It is the process of computing the principal components and then using them to change the basis of the data, frequently using only the first few principal components and discarding the rest. PCA is used for both exploratory data analysis and predictive modeling. It is extensively used to reduce dimensionality by projecting each data point onto only the first few principal components, obtaining lower-dimensional data while retaining as much of the variation in the data as feasible. The first principal component is defined as the direction that maximizes the variance of the projected data, and each subsequent component maximizes the remaining variance. Standardization is especially important before PCA because PCA is very sensitive to the variances of the initial variables. That is, if the ranges of the initial variables differ significantly, those with larger ranges will dominate those with smaller ranges (for example, a variable ranging from 0 to 100 will dominate a variable ranging from 0 to 1), resulting in biased results. Converting the data to comparable scales can therefore help avoid this problem.
For each value of each variable, perform this mathematical operation: subtract the mean and divide by the standard deviation.
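A minimal sketch of this z-score standardization (illustrative only; it uses the population standard deviation, and the sample values are hypothetical):

```python
import math

def standardize(values):
    """Z-score each value: subtract the mean, divide by the
    (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

z = standardize([2.0, 4.0, 6.0, 8.0])
```

After the transformation the values have zero mean and unit variance, so every variable contributes on a comparable scale.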
Let us assume
(
where
The PCA transformation can be expressed in matrix form as:

$$T = XW$$
where
The accuracy of the principal component
(
where
Figure 3a,b demonstrate the accuracy of PCA with 2 and 3 components, respectively. The results show that an accuracy of 97.76% is achieved with 2 components and 99.49% with 3 components.
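A minimal PCA sketch via the eigendecomposition of the covariance matrix (illustrative only, assuming NumPy is available; the synthetic data and component count are hypothetical):

```python
import numpy as np

def pca(X, n_components):
    """Project centered data onto the top principal components
    (eigenvectors of the covariance matrix with the largest variance)."""
    Xc = X - X.mean(axis=0)                 # center each feature
    cov = np.cov(Xc, rowvar=False)          # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # largest variance first
    components = eigvecs[:, order[:n_components]]
    return Xc @ components

# Hypothetical data with one strongly correlated feature pair.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 1] = 3 * X[:, 0]
reduced = pca(X, 2)
```

The first projected column carries at least as much variance as the second, matching the definition of the principal components above.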
It is a matrix factorization method that extends the eigendecomposition of a square matrix (n × n) to any rectangular matrix (n × m). Thus, the SVD can be obtained as:

$$A = U \Sigma V^{T}$$
where,
The left singular matrix
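A short sketch of the factorization on a hypothetical rectangular matrix (assuming NumPy; `full_matrices=False` yields the reduced form, with the singular values returned in descending order):

```python
import numpy as np

# Hypothetical 4 x 3 rectangular matrix A, factored as U @ diag(S) @ Vt.
A = np.array([[3.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 3.0],
              [2.0, 2.0, 2.0]])
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Multiplying the three factors back together recovers A.
A_rebuilt = U @ np.diag(S) @ Vt
```

Truncating S to its largest entries gives the low-rank approximation used for dimensionality reduction.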
Implementing a strategy to prevent fraudulent transactions boosts customer confidence. In the prevention process, the system sends a secret code to the user's telephone; if the code entered by the user does not match the sent code, the transaction is blocked and a secret question is sent. If the answer to the question is also wrong, the system blocks the user. Algorithm 2 describes the fraud prevention process.
Algorithm 2: Fraud Prevention Process
1. Initialize the variables
2. Input: user U, telephone T, secret code C, secret question Q
3. Output: transaction decision
4. Send C to T
5. If the entered code matches C
6. Continue
7. Else
8. Block the transaction
9. Send Q to T
10. If Q is answered correctly
11. Continue
12. Else
13. Block U
14. End if
15. End if
16. Store Tr ⊂ Ds
17. Write L
In Algorithm 2, Step 1 shows the initialization of variables. Steps 2–3 define the input and output processes. Step 4 sends the secret code to the telephone. Steps 5–8 describe the code-verification process, which ensures that the code has reached the legitimate user. If the user is legitimate, they can respond with the correct code to enter the system; otherwise, the transaction is blocked. Step 9 initiates the second process of determining the user's identity by sending the secret question. Steps 10–12 ensure that if the secret question is answered correctly, the user continues the transaction process. Step 13 explains that if the secret question is not answered properly, the transaction is blocked permanently. Steps 16–17 show that all transaction information is securely stored in the database to improve the fraud detection model, and the whole procedure is written to the log.
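The two-step verification at the heart of Algorithm 2 can be sketched as follows (a simplified illustration; the function name and return values are hypothetical, and the real process also sends the code and question to the user's telephone):

```python
def prevention_check(entered_code, sent_code,
                     secret_answer=None, expected_answer=None):
    """Two-step check: first the one-time code sent to the user's phone,
    then the secret question if the code check fails."""
    if entered_code == sent_code:
        return "continue"   # legitimate user, transaction proceeds
    # Wrong code: fall back to the secret question.
    if secret_answer is not None and secret_answer == expected_answer:
        return "continue"   # identity recovered via the secret question
    return "blocked"        # both checks failed: block the user
```

For example, a wrong code followed by a correct secret answer lets the transaction continue, while failing both checks blocks the user.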
The FPP involves logistic regression learning (LRL), which tries to quantify the relationship between a categorical dependent variable and one or more independent variables by plotting the probability scores of the dependent variables.
The FPP applies the LRL model to prevent the fraud transaction. The LRL simulates the interaction of predictor factors and a categorical response variable. We might, for example, utilize logistic regression to predict the link between distinct lawful and illegitimate transactions. Given a collection of variables, logistic regression estimates the likelihood of falling into a specific level of categorical response. It is necessary to calculate the log-odds that are given by:
$$\ell = \log_b \frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \beta_2 x_2$$
where
$$\frac{p}{1-p} = b^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}$$
where b is the base of the logarithm and exponent. From this, fraud prevention can be obtained as:
$$p = \frac{b^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}{b^{\beta_0 + \beta_1 x_1 + \beta_2 x_2} + 1} = \frac{1}{1 + b^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2)}}$$
Thus, when there is a single explanatory variable x, the logistic function can be defined as:

$$p(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}$$
Based on the previous equation for probability of fraud prevention function, we can define the inverse of the logistic function, g, the logit:
$$g(p(x)) = \ln \frac{p(x)}{1 - p(x)} = \beta_0 + \beta_1 x$$
which then is exponentiated and transformed to the following form:
$$\frac{p(x)}{1 - p(x)} = e^{\beta_0 + \beta_1 x} \quad\Longrightarrow\quad p(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$$
The odds of the dependent variable equaling a case, which serves as a link function between the probability and the linear regression expression is defined by:
$$\operatorname{odds} = e^{\beta_0 + \beta_1 x}$$
When there is a continuous independent variable, the odds ratio can be calculated as:
$$\mathrm{OR} = \frac{\operatorname{odds}(x+1)}{\operatorname{odds}(x)} = e^{\beta_1}$$
In addition, for multiple explanatory variables, the prediction can be defined as:

$$\log_b \frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m$$
where m is the number of explanatory variables of the multiple regression, from which we acquire:

$$p = \frac{1}{1 + b^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m)}}$$
After considering the logistic regression function itself, it is important to know how the model for the machine learning algorithm is constructed. The generalized linear model function with parameter θ is:

$$h_\theta(X) = \frac{1}{1 + e^{-\theta^{T} X}} = \Pr(Y = 1 \mid X; \theta)$$
where X is the independent variable and Y is a random variable that may be 1 or 0. From this, we can acquire the conditional probability of the output given the input and the parameter θ:

$$\Pr(y \mid x; \theta) = h_\theta(x)^{y} \left(1 - h_\theta(x)\right)^{1-y}$$
We acquire the likelihood function assuming that all the observations in the sample are independently Bernoulli distributed:
$$L(\theta \mid y; x) = \prod_{i=1}^{N} h_\theta(x_i)^{y_i} \left(1 - h_\theta(x_i)\right)^{1-y_i}$$
The log-likelihood is typically maximized with a normalizing factor 1/N:

$$\frac{1}{N} \log L(\theta \mid y; x) = \frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log h_\theta(x_i) + (1 - y_i) \log\left(1 - h_\theta(x_i)\right) \right]$$
which is maximized using gradient descent, one of the optimization techniques.
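A minimal sketch of this optimization: fitting the coefficients by gradient ascent on the normalized log-likelihood (equivalently, gradient descent on its negative). The 1-D data, learning rate, and epoch count are hypothetical; production code would use an optimized solver:

```python
import math

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit b0, b1 by gradient ascent on the Bernoulli log-likelihood.
    The gradient per sample is (y - p) for b0 and (y - p)*x for b1."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += (y - p)
            g1 += (y - p) * x
        b0 += lr * g0 / n   # ascend the log-likelihood
        b1 += lr * g1 / n
    return b0, b1

# Hypothetical 1-D data: larger x values are mostly labeled 1.
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
b0, b1 = train_logistic(xs, ys)
```

After training, the fitted slope is positive, so the model assigns high probability to the positive class for large x.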
Assuming the (x, y) pairs are drawn uniformly from the underlying distribution, then in the limit of large N,
$$\frac{1}{N} \log L(\theta \mid y; x) \;\longrightarrow\; \mathbb{E}\left[ y \log h_\theta(x) + (1 - y) \log\left(1 - h_\theta(x)\right) \right]$$
Fraudulent transaction initiating rate
(
Suppose the fraud attempting capability
(
This section covers the implementation and results.
The materials utilized during implementation are shown in Table 2. The platform was developed using Ubuntu 18.04 with the Visual Studio IDE Python 3.4.6. We rent clusters with 0.1-s uptime, 16 GB RAM, 256 GB ROM, and Intel Core i7 processors running at 2.4 GHz. MySQL 8.0.31 is the database management system of choice.
Figure 4 illustrates a typical case of credit card fraud. It can be seen that the thief who steals the credit card tries to use the card credentials to make electronic purchases and pay over the Internet. Every purchase is known as one transaction. The payments may be quite small; this is one of the features of fraudulent behavior, as the thief does not know how much money is available. Our proposed CCFDP model analyzes every transaction made with this card and compares it with the owner's previous behavior. When the model detects fraudulent activity, the system sends a verification code to the owner's phone. If the thief fails to verify the code, the transaction is aborted and the system then tries to verify the cardholder by the secret question; if that answer is also wrong, the credit card is blocked until the card owner unlocks it. Furthermore, the system sends an alert message to the administrator.
In order to train our model, we used the "creditcard.csv" dataset from Kaggle. This file consists of 31 features and includes credit card transactions from September 2013. The dataset is unbalanced, with frauds (the positive class) making up 0.172% of all transactions. It only has numeric input variables that have undergone a principal component analysis (PCA) transformation. Unfortunately, we are unable to provide the original features and additional context for the data due to confidentiality concerns. The principal components obtained with PCA are the features
Based on this dataset, interesting results have been obtained using the proposed CCFDP. The Python Plotly library is used to draw the graphical results.
- ▪ Transaction amount distribution
- ▪ Normal vs. fraudulent transactions and transaction time
- ▪ Equally distributed classes (legitimate vs. fraudulent transactions)
- ▪ Detection of balanced and imbalanced correlation matrix
- ▪ Dimensionality feature reduction
- ▪ Validation rate
- ▪ Accuracy
Figure 5 shows the imbalanced distribution of the data in the dataset. This has been the most important problem to solve in order to detect fraudulent activity. The fraudulent cases are very few; because of that, an algorithm can wrongly assume that any transaction requested from the database is normal. However, in reality, this is not the case.
Figure 6a,b show the distribution of transaction amount and distribution of transaction time, respectively.
Figure 7 shows the balanced dataset that is used to train the model. In order to acquire this kind of dataset, we applied the random undersampling method. This process selects data from the majority class at random and removes them from the training dataset; majority-class instances are discarded at random until a more balanced distribution is reached. The main idea of this experiment is to randomly select the same number of legitimate transactions as fraudulent transactions. After choosing them, we create a new data frame based on this information.
Figure 8a,b show the results of the imbalanced and balanced correlation matrices. This result aims to identify all features of the dataset. It has been observed in Figure 8a that the imbalanced correlation coefficients are unnoticeable. On the other hand, Figure 8b shows a subsample of a balanced correlation matrix, which is more noticeable in a balanced dataset and helps to identify outliers and remove redundant data.
Figure 9a–c depict the transaction distribution on the coordinate position. The transactions are based on classification results that have been clustered in the dataset. For dimensionality feature reduction, this experiment employs the t-SNE, PCA, and SVD methods. The main goal of this experiment is to remove unnecessary features that aid in the reduction of complexity. Furthermore, the reduction in dimensionality results in less storage space and thus less computation time. Reduced misleading data can also improve model accuracy.
Figure 10a–d show the training and validation curves for the most advanced prediction classifiers. According to the results, logistic regression produces the best results because the difference between the cross-validation and training scores is the smallest. As a result, we have decided to build our prediction model using this algorithm. Logistic regression produces a cross-validation score of 93.35% and a training score of 94.31%. The other algorithms produce lower results: the k-nearest neighbors algorithm produces a 92.84% cross-validation score and a 94.26% training score, the support vector learning algorithm produces a 93.76% cross-validation score and a 96.92% training score, and the decision tree classifier produces a 91.98% cross-validation score and a 94.82% training score. According to the results, logistic regression has the smallest difference between cross-validation and training scores, found to be 0.96%, whereas the competing algorithms have larger differences; the support vector learning algorithm has the largest difference, at 3.16%.
In the previous results, the dimensionality feature reduction of the dataset, including the balanced and imbalanced correlation analysis, is performed to determine the accuracy of the proposed CCFDP method. The suitability of the algorithms/models integrated in the proposed method confirms their ability to detect normal and fraudulent transactions. The suitability of the proposed CCFDP method is assessed and compared with current state-of-the-art approaches (CATCHM [[
In statistical inference, model selection is typically performed using statistical performance metrics. The model that performs best in terms of a chosen performance criterion is ultimately selected from among a number of models trained with various sets of parameters and hyper-parameters.
It is the square root of the average squared deviation of the forecasts from the real amounts. Therefore, the RMSE can be obtained as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(A_i - F_i\right)^{2}}$$
where
Relative root mean squared error (RRMSE) is frequently expressed as a percentage and is normalized by the mean of the real amounts; smaller RRMSE values are preferred. The RRMSE is obtained as:

$$\mathrm{RRMSE} = \frac{\mathrm{RMSE}}{\frac{1}{N} \sum_{i=1}^{N} A_i} \times 100\%$$
The mean bias error (MBE) measures the average bias in the model; it is typically not used on its own, because large individual errors of opposite sign can still result in a low MBE. The MBE is principally used to determine whether any actions are required to rectify the model bias: a positive bias or error in a variable means the data from the dataset are overestimated, and vice versa. The MBE is obtained as:

$$\mathrm{MBE} = \frac{1}{N} \sum_{i=1}^{N} \left(F_i - A_i\right)$$
Mean directional accuracy (MDA) is a metric that determines the likelihood that the prediction model identifies the correct direction of the time series. Studies in macroeconomics and economics frequently employ this metric. The MDA is obtained as:

$$\mathrm{MDA} = \frac{1}{N-1} \sum_{t=2}^{N} \mathbf{1}\left[\operatorname{sign}\left(A_t - A_{t-1}\right) = \operatorname{sign}\left(F_t - A_{t-1}\right)\right]$$
where
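The four metrics can be sketched in a few lines of Python (illustrative implementations following the prose definitions above; the sign convention here treats over-prediction as a positive MBE):

```python
import math

def rmse(actual, pred):
    """Root mean squared error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / n)

def rrmse(actual, pred):
    """RMSE normalized by the mean of the actual values, as a percentage."""
    mean = sum(actual) / len(actual)
    return 100.0 * rmse(actual, pred) / mean

def mbe(actual, pred):
    """Average signed bias: positive when the model over-predicts."""
    return sum(p - a for a, p in zip(actual, pred)) / len(actual)

def mda(actual, pred):
    """Share of steps where the predicted direction of change
    matches the actual direction of change."""
    hits = sum((a1 - a0) * (p1 - a0) > 0
               for a0, a1, p1 in zip(actual, actual[1:], pred[1:]))
    return hits / (len(actual) - 1)
```

For a perfect forecast the RMSE, RRMSE, and MBE are all zero and the MDA is 1.0.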
The statistical performance metrics (MDA, MBE, RRMSE, RMSE) are used to determine the performance of the proposed CCFDP and also compared with contending approaches CATCHM [[
The proposed LRL is integrated with the proposed CCFDP, which considerably enhances detection accuracy. In order to solve the unbalanced dataset problem, the random undersampling approach is employed to build a new balanced data frame. It has been observed that random undersampling can employ more authentic data and helps to resolve the unbalanced situation. During implementation, fraud proportions of 1%, 2%, and 5% are taken, and the same number of legitimate transactions is randomly selected. The CCFDP method's performance is better than that of other current models from the accuracy perspective, as depicted in Table 4. All of the approaches considered are trained using the same training data and fraud rates. The results demonstrate that the proposed CCFDP detects fraud cases more accurately regardless of the proportion of fraud available in the dataset, and it outperforms all others on diverse sample sets. On the other hand, the proposed CCFDP has the capability to prevent fraud, whereas the contending methods do not, because they are designed only for fraud detection. The results shown in Table 5 validate the fraud prevention capability of the proposed method. The performance of the proposed CCFDP is also better than that of the competing methods (CATCHM, LSTM-RNN, CSLMLE, CCFDM, ESDL, ITCCFD, and BTG) from the perspective of the statistical performance metrics (MDA, MBE, RRMSE, RMSE).
The main reason for the better accuracy is the integration of different modern techniques (RU, t-SNE, PCA, LRL, and SVD). These techniques speed up the data training process and increase accuracy, and together they help to detect fraud successfully. PCA is employed to obtain lower-dimensional data while retaining as much variation in the data as possible: exploratory data analysis and predictive modeling are performed to reduce dimensionality by projecting each data point onto only the first few principal components. In order to further improve accuracy, t-SNE is used to reduce dimensionality by keeping similar instances close together and dissimilar instances apart. LRL is used to assess the success and failure probability of CNP fraud: in order to predict the relationship between various legitimate and illegitimate transactions, the interaction of predictor variables is simulated. However, the integration of modern techniques can increase complexity. This issue has been mitigated by the dimensionality-reduction process, which reduces the number of variables in the training data: when working with high-dimensional data, it is useful to project the data onto a lower-dimensional subspace that captures the core of the data. The main challenge encountered is the process of integrating the different techniques.
The rise in fraudulent activity has massively increased due to e-banking transactions, which puts a burden on fraud control systems. In this research, we present the CCFDP method for credit card fraud detection and prevention. The proposed method involves the FDP and the FPP. The FDP module uses four cutting-edge methods: RU, t-SNE, PCA, and SVD. The FPP uses logistic regression learning. Furthermore, the random undersampling approach is employed to increase detection accuracy by balancing the number of fraudulent samples with authentic ones. Different tests have been conducted on 1%, 2%, and 5% fraud proportions to demonstrate the efficiency of the proposed CCFDP. The results confirm that our proposed method has greater fraud detection accuracy. Furthermore, the accuracy of the proposed CCFDP is compared with the state of the art (CATCHM, LSTM-RNN, CSLMLE, CCFDM, ESDL, ITCCFD, and BTG); based on the comparison results, the proposed CCFDP outperforms the current state-of-the-art methods. Moreover, unlike its counterparts, the proposed CCFDP has the capability to prevent fraud, and the fraud prevention accuracy results also confirm the suitability of the proposed method.
In the future, we will assess the proposed CCFDP's time and space complexity, as well as other quality-of-service factors.
Graph: Figure 1 Credit card-not-present fraud transaction process.
Graph: Figure 2 Credit card fraud detection and prevention mechanism.
Graph: Figure 3 (a) shows the accuracy 97.76% with 2 components, and (b) shows the accuracy 99.49% with 3 components.
Graph: Figure 4 Showing the fraudulent transaction process.
Graph: Figure 5 An imbalanced dataset.
Graph: Figure 6 (a) Legitimate vs. fraudulent transaction and (b) the transactions and transaction time. (a) depicts both legitimate and fraudulent transactions. A maximum of 25,000 transactions were processed, with a maximum fraudulent transaction rate of 0.0178% observed. (b) depicts 18,000 transactions with transaction times for each. The transaction took a maximum of 0.000010 s to complete. The increasing time is due to an attack on the transaction during that time.
Graph: Figure 7 The balanced dataset using the random undersampling method.
Graph: Figure 8 (a) shows imbalanced convolutional correlation matrix, and (b) shows subsample of a balanced convolutional correlation matrix.
Graph: Figure 9 (a) shows the dimensionality feature reduction using t-SNE, while (b) shows the dimensionality feature reduction using PCA and (c) shows the dimensionality feature reduction using SVD.
Graph: Figure 10 (a) Learning and validation curves for logistic regression, (b) learning and validation curves for the k-nearest neighbors algorithm, (c) learning and validation curves for the support vector learning algorithm, and (d) learning and validation curves for the decision tree classifier algorithm.
Graph: Figure 11 (a) Fraud detection accuracy with 1% fraud proportion, (b) fraud detection accuracy with 2% fraud proportion, (c) fraud detection accuracy with 5% fraud proportion, and (d) fraud prevention accuracy with 1–5% fraud proportions.
Graph: Figure 12 (a) Root Mean Square Error for the proposed CCFDP and contending approaches with maximum 45,000 transactions. (b) Relative root mean squared error for the proposed CCFDP and contending approaches with maximum 45,000 transactions. (c) Mean Bias Error for the proposed CCFDP and contending approaches with maximum 45,000 transactions. (d) Mean directional accuracy for the proposed CCFDP and contending approaches with maximum 45,000 transactions.
Table 1 The current contributions for addressing credit card fraud detection.
- ▪ Carcillo et al. [: unsupervised credit card fraud detection. Strengths: improves model accuracy for credit card fraud detection. Deficiencies: increases the execution time, and some risk remains.
- ▪ Itoo and Satwinder [: evaluation of multiple algorithms for credit card fraud detection. Strengths: determines the best fraud detection algorithm for protecting financial assets. Deficiencies: fails to evaluate the analytical algorithms' correctness and effectiveness; the work lacks uniqueness.
- ▪ Staar et al. [: a CNN-based fraud detection paradigm. Strengths: provides innate fraud behavior features; additionally, trade entropy is employed for transaction categorization accuracy. Deficiencies: limited to fraud feature detection; fails to identify complete fraud transactions.
- ▪ Park et al. [: CNN applied after combining trade entropy and feature matrices. Strengths: uses entropy to identify more sophisticated extortion schemes; in addition, a characteristic matrix represents many transactions, on which a CNN is trained to detect the collection of latent patterns for each sample. Deficiencies: sample-only; not intended to identify actual fraud transactions.
- ▪ Balagolla et al. [: addresses the issue of a false sense of security using a deep-learning system. Strengths: highly accurate at spotting new patterns and illegitimate certifications; additionally, reduces the time needed to identify new attacks from risky websites. Deficiencies: not specifically intended to identify fraudulent transactions.
- ▪ Fiore et al. [: a case study involving credit card fraud detection. Strengths: cluster analysis is employed to standardize data; the results of employing cluster analysis and artificial neural networks show that clustering properties can reduce neural inputs. Deficiencies: limited to cluster analysis; fails to detect credit card fraud.
- ▪ Du et al. [: a fundamentally new model-based strategy that isolates anomalies. Strengths: since iForest uses sub-sampling, the algorithm has linear time complexity with a minimal constant and low memory usage. Deficiencies: not specially designed for credit card fraud detection; limited to sub-sampling.
- ▪ Zhang et al. [: a method for authenticating a credit card. Strengths: provides a transaction-validation procedure before a credit card transaction may be approved. Deficiencies: fails to provide better accuracy for credit card fraud detection.
- ▪ West and Maumita [: a computational intelligence-based solution for monetary fraud detection procedures. Strengths: the categorization techniques include critical components such as detection rules for detecting various forms of fraud; a higher success rate is obtained. Deficiencies: numerous aspects of intelligent fraud detection have not yet been thoroughly studied.
- ▪ Razaque et al. [: a paradigm for detecting anomalies. Strengths: introduces a new matrix profile for anomaly detection; applies to huge multivariate data sets and delivers high-quality approximate solutions in a reasonable amount of time with excellent accuracy. Deficiencies: restricted to medical domains.
- ▪ Ghosh and Arijit [: comparative research on data mining strategies for credit card fraud detection. Strengths: investigates the performance of Random Forest and Support Vector Machines in conjunction with logistic regression to address the issue of an unbalanced dataset; furthermore, shows that the Random Forest method outperforms SVM in terms of accuracy. Deficiencies: limited to a comparison of two algorithms; no novelty or originality.
- ▪ Belle et al. [: CATCHM, a novel network-based model for credit card fraud detection. Strengths: an inventive network design is employed for efficient inductive pooling and careful configuration of the downstream classifier. Deficiencies: lower accuracy.
- ▪ Roseline et al. [: an LSTM-RNN method for analyzing credit card fraud. Strengths: LSTM-RNN is introduced for detecting credit card fraud; this method reduces the likelihood of fraud. Deficiencies: sensitive to various random weight initializations and easily overfits.
- ▪ Olowookere and Olumide [: a framework combining the potentials of cost-sensitive learning and meta-learning ensemble techniques for fraud detection. Strengths: lets base classifiers fit conventionally while incorporating cost-sensitive learning into the ensemble learning process. Deficiencies: restricted to fraud rates; lower accuracy.
- ▪ Asha and Suresh Kumar [: integration of multiple algorithms for fraud detection. Strengths: multiple algorithms are integrated to improve card fraud detection. Deficiencies: increases complexity and produces lower accuracy.
- ▪ Proposed CCFDP: a credit card fraud detection and prevention method for dealing with CNP fraud utilizing big data analytics. Strengths: consists of two processes (FDP and FPP) with cutting-edge algorithms (RU, t-SNE, PCA, SVD) incorporated; the detection rate's efficacy is optimized based on the experimental findings. Deficiencies: the integration of the algorithms increases the time complexity, but it considerably increases the accuracy rate.
Table 2. Materials used for conducting the experiments.

Cluster uptime: 0.1 s
Cluster time zone: Almaty, Kazakhstan
Cluster connection URL:
Connection proxy: none
Internal security: False
Platform: Python 3.4.6
Operating system: Ubuntu 18.04
RAM: 16 GB
ROM: 256 GB
Database: MySQL
Processor: 2.4 GHz Intel Core i7
Environment: Visual Studio
Table 3. Comparison of the proposed CCFDP with contending approaches using statistical metrics.

Method | RMSE | RRMSE | MBE | MDA
CATCHM [ ] | 0.068% | 9.48% | 0.294% | 99.05%
LSTM-RNN [ ] | 0.078% | 9.77% | 0.361% | 99.24%
CSLMLE [ ] | 0.076% | 8.61% | 0.122% | 99.11%
CCFDM [ ] | 0.095% | 10.76% | 0.554% | 99.87%
ESDL [ ] | 0.072% | 9.62% | 0.321% | 99.45%
ITCCFD [ ] | 0.059% | 8.96% | 0.192% | 99.18%
BTG [ ] | 0.079% | 10.38% | 0.472% | 99.54%
Proposed CCFDP | 0.056% | 8.39% | 0.078% | 99.92%
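Table 3 reports four error statistics but does not restate their formulas. The sketch below assumes their standard definitions: RRMSE as RMSE normalized by the mean observation, MBE as the average signed deviation, and MDA as the fraction of steps whose direction of change is predicted correctly.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def rrmse(y_true, y_pred):
    """Relative RMSE: RMSE normalized by the mean of the observations."""
    return rmse(y_true, y_pred) / np.mean(y_true)

def mbe(y_true, y_pred):
    """Mean bias error: average signed deviation of predictions."""
    return np.mean(y_pred - y_true)

def mda(y_true, y_pred):
    """Mean directional accuracy: share of correctly predicted
    directions of change between consecutive observations."""
    return np.mean(np.sign(np.diff(y_true)) == np.sign(np.diff(y_pred)))
```

Under these definitions a perfect predictor yields RMSE = MBE = 0 and MDA = 1; lower RMSE, RRMSE, and |MBE| and higher MDA indicate a better method, matching the ordering in Table 3.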
Table 4. Fraud detection accuracy of the proposed CCFDP and contending methods.

Method | Accuracy (1% fraud proportion) | Accuracy (2% fraud proportion) | Accuracy (5% fraud proportion)
CATCHM [ ] | 99.72% | 99.19% | 97.97%
LSTM-RNN [ ] | 98.95% | 98.81% | 97%
CSLMLE [ ] | 99.84% | 99.51% | 98%
CCFDM [ ] | 99.32% | 98.91% | 97.88%
Proposed CCFDP | 99.94% | 99.89% | 99.72%
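The fraud proportions in Table 4 make the class imbalance explicit; the random undersampling (RU) step of the FDP balances such a dataset before training. A minimal NumPy sketch of RU (the function name and synthetic data are illustrative, not taken from the paper's implementation):

```python
import numpy as np

def random_undersample(X, y, seed=None):
    """Random undersampling (RU): keep every fraud row (y == 1) and an
    equal-sized random subset of legitimate rows (y == 0), producing a
    class-balanced training set."""
    rng = np.random.default_rng(seed)
    fraud = np.flatnonzero(y == 1)
    legit = np.flatnonzero(y == 0)
    keep = rng.choice(legit, size=fraud.size, replace=False)
    idx = rng.permutation(np.concatenate([fraud, keep]))
    return X[idx], y[idx]

# Synthetic stand-in: 5000 transactions at a 1% fraud proportion.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 20))
y = (np.arange(5000) < 50).astype(int)

Xb, yb = random_undersample(X, y, seed=1)
print(Xb.shape, yb.mean())  # balanced: 100 rows, half of them fraud
```

After RU, the balanced set would be passed through the dimensionality-reduction stage (t-SNE, PCA, or SVD) and then to the logistic regression learner.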
Table 5. Fraud prevention accuracy of the proposed CCFDP.

Fraud proportion | Fraud prevention accuracy
1% | 99.99%
2% | 99.98%
3% | 99.95%
4% | 99.94%
5% | 99.90%
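The prevention scores above come from the FPP, which uses logistic regression learning (LRL) to weigh the success and failure probability of a CNP attempt. A minimal sketch of that scoring step, with entirely hypothetical weights, features, and threshold (the paper does not publish its fitted coefficients):

```python
import numpy as np

def fraud_probability(x, w, b):
    """Logistic-regression (LRL) score: the sigmoid of the linear score
    gives the failure probability (transaction is fraudulent);
    1 - p is the complementary success probability."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

# Hypothetical fitted weights for three illustrative transaction features
# (amount z-score, transaction velocity, geolocation-mismatch flag).
w = np.array([1.2, 0.8, 2.5])
b = -4.0

x = np.array([0.1, 0.2, 1.0])   # one transaction with a geo mismatch
p = fraud_probability(x, w, b)
# The FPP would block the transaction when p exceeds a prevention
# threshold, e.g. 0.5; here the score stays below it.
print(round(p, 3), p > 0.5)
```

Because the sigmoid output is a calibrated probability, the prevention threshold can be tuned against the fraud proportions in Table 5 to trade false declines against missed fraud.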
A.R., conceptualization, writing, idea proposal, software development, methodology, review, manuscript preparation, visualization, results and submission; M.B.H.F., data curation, software development, and preparation; M.A. (Muder Almiani), F.A. and N.Z.J., review, manuscript preparation, and visualization; G.B., A.A., M.A. (Majid Alshammari) and S.A., review, data curation and editing. All authors have read and agreed to the published version of the manuscript.
The data that support the findings of this research are publicly available, as indicated in the references. The links are given below: https://datahub.io/machine-learning/creditcard#data; https://github.com/nsethi31/Kaggle-Data-Credit-Card-Fraud-Detection; https://
The authors declare no conflict of interest.
The authors gratefully acknowledge the support of Mohsin Ali for providing insights.
By Abdul Razaque; Mohamed Ben Haj Frej; Gulnara Bektemyssova; Fathi Amsaad; Muder Almiani; Aziz Alotaibi; N. Z. Jhanjhi; Saule Amanzholova and Majid Alshammari