BACKGROUND: Developing deep learning networks to classify lung nodules as benign or malignant usually requires many samples, but medical samples are precious and difficult to obtain in large quantities. OBJECTIVE: To investigate and test a DCA-Xception network combined with a new data augmentation method to improve the performance of lung nodule classification. METHODS: First, a Wasserstein Generative Adversarial Network (WGAN) with conditions and five data augmentation methods, such as flipping, rotating, and adding Gaussian noise, are used to expand the samples, addressing class imbalance and the shortage of samples. Then, a DCA-Xception network is designed to classify lung nodules. In this network, an adaptive dual-channel feature extraction module captures information around the target, and a convolutional block attention module helps the network learn features more accurately. The network is trained and validated using 274 lung nodules (154 benign and 120 malignant) and tested using 52 lung nodules (23 benign and 29 malignant). RESULTS: The experiments show that the network achieves an accuracy of 83.46% and an AUC of 0.929. The features extracted by this network achieve an accuracy of 85.24% with the K-nearest neighbor and random forest classifiers. CONCLUSION: This study demonstrates that the DCA-Xception network yields higher performance in classifying lung nodules than classical classification networks and pre-trained networks.
Keywords: Lung nodule classification; Wasserstein Generative Adversarial Networks (WGAN); Xception; convolutional attention module; classifier
Lung cancer is more common and has a higher mortality rate than most cancers, posing a significant threat to people's health and lives [[
For the problem of classifying benign and malignant lung nodules, some researchers have chosen to extract features manually, which are usually size [[
As the research progressed, researchers found that classifying lung nodules using deep neural networks requires many samples. Because medical images are precious and difficult to obtain in large quantities, researchers have proposed the following solutions. Zhao et al. [[
Table 1 Overview of the existing classification methods
Reference | Database | Method | Performance metrics
Liu et al. [ | LIDC-IDRI | CNN+SVM | ACC = 91.94%
Chae et al. [ | Chonbuk National University Hospital | CT-lungNet | AUC = 0.85
Naik et al. [ | LUNA | FractalNet | ACC = 94.7%, TNR = 90.41%, TPR = 96.68%, AUC = 0.98
Zhao et al. [ | LIDC-IDRI | Agile CNN | ACC = 82.2%, AUC = 0.877
Zhao et al. [ | LIDC-IDRI | Transfer learning (ResNet) | AUC = 0.94, TPR = 94%, ACC = 85%
Nobrega et al. [ | LIDC-IDRI | Transfer learning (ResNet50+SVM) | ACC = 88.41%, AUC = 0.932
Xie et al. [ | LIDC-IDRI | SSAC | ACC = 92.53%, AUC = 0.958
Onishi et al. [ | Fujita Health University Hospital | GAN+DCNN | TNR = 66.7%, TPR = 93.9%
Sun et al. [ | LIDC-IDRI | DBN | ACC = 81%
Xie et al. [ | LIDC-IDRI | MV-KBC | ACC = 91.60%, AUC = 0.957
Tran et al. [ | LUNA16 | CNN+Focal loss | ACC = 97.2%, TPR = 96.0%, TNR = 97.3%
Ali et al. [ | LUNGx | Transfer learning | ACC = 90.46±0.25%
Wang et al. [ | JSRT | Transfer learning (InceptionV3) | TPR = 95.41%, TNR = 80.09%
Mastouri et al. [ | LUNA16 | BCNN | ACC = 91.99%, AUC = 0.959
In this paper, to address the problems of insufficient samples and class imbalance, CT slices are first randomly cropped. Then, a Wasserstein Generative Adversarial Network (WGAN) with conditions is used to balance the classes of the samples, and finally five data augmentation methods, such as flipping, rotating, and adding Gaussian noise, are used to extend the dataset. As Table 1 shows, many researchers have used pre-trained networks for lung nodule classification, but pre-trained networks offer poorer scalability, flexibility, and generalization than customized networks. Therefore, a DCA-Xception network is designed, which has higher flexibility and better classification performance than fine-tuned pre-trained networks. The network obtains information around the target through an adaptive dual-channel feature extraction module, and a convolutional block attention module enables it to learn effective features in a targeted manner.
Figure 1 shows the workflow of the DCA-Xception network for lung nodule classification. As seen in the figure, regions of interest are first extracted from the collected data; the data are then expanded using WGAN with conditions and data augmentation, and the patches are input into the network for lung nodule classification. There are two classification ways. The first directly uses the improved network to classify lung nodules, i.e., the fully connected layer acts as the classifier. In the second, the output of the last global average pooling layer of the network is used as the feature representation, which is then fed to six classifiers (logistic, multinomial Bayesian [[
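The second classification way can be sketched with scikit-learn, assuming the GAP-layer features are available as a numpy array. The feature matrix below is random stand-in data (the real features would be 512-dimensional GAP outputs of the trained network); the six classifier choices follow the paper's list.

```python
# Sketch: feeding extracted features into the six classifiers named in the
# paper. X here is random stand-in data, not real DCA-Xception features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.random((120, 64))        # stand-in features (non-negative for NB)
y = rng.integers(0, 2, 120)      # 0 = benign, 1 = malignant

classifiers = {
    "logistic": LogisticRegression(max_iter=1000),
    "multinomial_nb": MultinomialNB(),
    "knn": KNeighborsClassifier(),
    "random_forest": RandomForestClassifier(random_state=0),
    "svm_rbf": SVC(kernel="rbf"),
    "mlp": MLPClassifier(max_iter=500, random_state=0),
}

predictions = {}
for name, clf in classifiers.items():
    clf.fit(X, y)                       # train on the extracted features
    predictions[name] = clf.predict(X)  # benign/malignant decision
```

In practice the classifiers would be fit on training-set features and evaluated on the held-out test-set features.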
Graph: Fig. 1 Workflow based on DCA-Xception for lung nodule classification.
The dataset used in this paper is the public LIDC-IDRI dataset, in which up to four physicians rated the malignancy of each lung nodule. Malignancy is rated in five categories: categories one and two indicate benign, category three indicates indeterminate, and categories four and five indicate malignant. In this paper, nodules identified jointly by three or more physicians are selected, and their malignancy is determined by the average of the physician-labeled categories. A nodule is considered benign when its mean category score is less than or equal to 2.5 and malignant when the mean is greater than or equal to 3.5; nodules with means between 2.5 and 3.5 are removed. A total of 326 nodules are selected, of which 274 (154 benign and 120 malignant) form the training and validation sets and 52 (23 benign and 29 malignant) form the test set.
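The selection rule above can be written as a small sketch (the function name and return convention are illustrative, not from the paper):

```python
# Sketch of the labeling rule: average the radiologists' malignancy
# scores (1-5) and keep only clearly benign or malignant nodules.
def label_nodule(scores):
    """Return 'benign', 'malignant', or None (excluded) from 1-5 scores."""
    if len(scores) < 3:          # require three or more readers
        return None
    mean = sum(scores) / len(scores)
    if mean <= 2.5:
        return "benign"
    if mean >= 3.5:
        return "malignant"
    return None                  # indeterminate nodules in (2.5, 3.5) removed
```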
To make full use of the data, all CT scan slices covering the nodules are used, and each slice is treated as a sample. The total number of generated slices and their distribution are shown in the Initial columns of Table 2. In addition, 64×64 patches containing the pulmonary nodules are extracted after locating the nodules from the XML annotation files and ground-truth labels. The nodules are not centered in the patches but lie at arbitrary positions. Four patches are extracted from each slice of the 274 lung nodules using this method to obtain lung nodules in different contexts. The resulting patches are divided into training and validation sets at a ratio of 9:1, as shown in the Original columns of Table 2. For the test set, patches are extracted once for malignant nodule slices, while benign nodule slices are sampled in multiple rounds until their number approaches that of the malignant patches, as shown in the Test column of Table 2.
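The off-center patch extraction can be sketched as follows; the function and the way the crop origin is sampled are an assumption about the procedure, constrained only so the annotated nodule center stays inside the 64×64 patch:

```python
# Sketch: crop a 64x64 patch containing the nodule at an arbitrary
# (not centered) position. `center` is the nodule center (row, col)
# taken from the XML annotations.
import numpy as np

def random_patch(slice_2d, center, size=64, rng=None):
    rng = rng or np.random.default_rng()
    h, w = slice_2d.shape
    cy, cx = center
    # sample a top-left corner that keeps the nodule center inside the patch
    y0 = rng.integers(max(0, cy - size + 1), min(h - size, cy) + 1)
    x0 = rng.integers(max(0, cx - size + 1), min(w - size, cx) + 1)
    return slice_2d[y0:y0 + size, x0:x0 + size]
```

Calling this four times per slice yields the nodule in four different surrounding contexts.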
Table 2 Specific settings for each stage of the data set
Class | Initial (Train / Val / Test) | Original (Train / Val) | Balance (Train / Val) | Augmentation (Train / Val / Test)
Benign | 598 / 66 / 81 | 2391 / 265 | 3626 (1235) / 402 (137) | 25382 / 2808 / 196
Malignant | 790 / 88 / 197 | 3162 / 350 | 3632 (470) / 400 (50) | 25424 / 2800 / 197
Total | 1388 / 154 / 278 | 5553 / 615 | 7258 (1705) / 802 (187) | 50806 / 5608 / 393
In CT images, a malignant nodule usually spans many more slices than a benign one, so even though the number of benign nodules is higher, the benign sample size is smaller than the malignant one. Balancing the samples requires either more benign nodules or fewer malignant ones. However, because access to medical images is protected by personal privacy and laws, it is difficult to obtain many samples, and reducing the number of malignant nodules would weaken sample diversity. Using GAN [[
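The Wasserstein objective underlying the conditional WGAN can be illustrated numerically; the toy arrays below stand in for critic scores on real and generated patches (this is a sketch of the standard WGAN losses, not the paper's training code):

```python
# Toy illustration of the WGAN objectives: the critic widens the score gap
# between real and generated samples; the generator raises the critic's
# score on generated samples.
import numpy as np

def critic_loss(real_scores, fake_scores):
    # critic minimizes  E[f(fake)] - E[f(real)]
    return np.mean(fake_scores) - np.mean(real_scores)

def generator_loss(fake_scores):
    # generator minimizes  -E[f(fake)]
    return -np.mean(fake_scores)
```

In the conditional variant, the class label (benign/malignant) is fed to both the generator and the critic so that benign patches can be synthesized on demand to balance the dataset.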
Graph: Fig. 2 Structure of WGAN with conditions.
Graph: Fig. 3 Example of patch images.
To support deeper neural network training and to avoid overfitting, this paper uses five data augmentation methods to expand the dataset again: flipping up and down, flipping left and right, rotating counterclockwise (90°, 180°, 270°), randomly changing brightness, and adding Gaussian noise. The number of patches after data augmentation is shown in the Augmentation columns of Table 2. An example of the five data augmentations is shown in Fig. 3(c).
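The five augmentation methods (with rotation applied at three angles, giving seven variants per patch) can be sketched as below; the brightness range and noise standard deviation are illustrative choices, not values from the paper:

```python
# Sketch of the five augmentations applied to a 2-D patch. Brightness and
# noise parameters are illustrative assumptions.
import numpy as np

def augment(patch, rng=None):
    rng = rng or np.random.default_rng()
    return [
        np.flipud(patch),                            # flip up-down
        np.fliplr(patch),                            # flip left-right
        np.rot90(patch, k=1),                        # rotate 90 deg CCW
        np.rot90(patch, k=2),                        # rotate 180 deg
        np.rot90(patch, k=3),                        # rotate 270 deg
        patch * rng.uniform(0.8, 1.2),               # random brightness
        patch + rng.normal(0.0, 0.01, patch.shape),  # Gaussian noise
    ]
```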
Compared with classification networks of the same depth, the Xception network uses depth-wise separable convolutions, giving it fewer parameters with better classification performance. However, it tends to misclassify lung nodules because it lacks the ability to combine information around the target and to extract effective features. Therefore, this paper improves the Xception network to enhance its lung nodule classification performance. The improved network structure is shown in Fig. 4. The main improvements are the addition of an adaptive dual-channel feature extraction module and a convolutional block attention module in the middle flow of the Xception network.
Graph: Fig. 4 DCA-Xception structure.
Only 3×3 depth-wise separable convolutions are used in the middle flow of the Xception network, which yields single-scale feature information and cannot effectively combine the information around the target. Therefore, an adaptive dual-channel feature extraction module is added to the network; its structure is shown in part A of the middle flow in Fig. 4. The module first performs dual-channel feature extraction on the input features with N×N and 3×3 kernels, and multiplies the features extracted by the N×N kernel by an adaptive coefficient α. The dual-channel features are then combined, their information is integrated, and the network parameters are reduced by a 1×1 convolution. Finally, new features are extracted with a 3×3 depth-wise separable convolution. This paper uses two such modules with N of 1 and 5, respectively, so the network can obtain features with different receptive fields and efficiently combine information around the target. A residual structure is also used to prevent vanishing and exploding gradients.
The network learns intricate features during the training process, but not all these features are valid. Therefore, the convolutional block attention module [[
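A simplified sketch of a CBAM-style attention block is given below. It is illustrative, not the paper's implementation: the shared two-layer MLP weights are passed in explicitly, and a simple average of the pooled maps stands in for CBAM's 7×7 convolution in the spatial branch.

```python
# Simplified CBAM-style block: channel attention from global average/max
# pooling, then spatial attention from channel-wise average/max maps.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """x: (H, W, C); w1: (C, C//r); w2: (C//r, C) shared MLP weights."""
    avg = x.mean(axis=(0, 1))            # global average pool, (C,)
    mx = x.max(axis=(0, 1))              # global max pool, (C,)
    mlp = lambda v: (v @ w1) @ w2        # shared two-layer MLP
    return sigmoid(mlp(avg) + mlp(mx))   # per-channel weights in (0, 1)

def spatial_attention(x):
    avg = x.mean(axis=-1, keepdims=True)  # (H, W, 1)
    mx = x.max(axis=-1, keepdims=True)    # (H, W, 1)
    # CBAM uses a 7x7 conv here; a plain mean of the two maps stands in
    return sigmoid((avg + mx) / 2.0)

def cbam(x, w1, w2):
    x = x * channel_attention(x, w1, w2)  # reweight channels
    return x * spatial_attention(x)       # reweight spatial positions
```

Because both attention maps lie in (0, 1), the block can only suppress features, steering the network toward the most informative channels and locations.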
In addition to the above improvements, to retain the advantage of the Xception network's small parameter count, the parameters of the improved middle flow are adjusted: the number of middle-flow repetitions is reduced from 8 to 4, and the number of channels is reduced from 728 to 512.
The input patch size is 64×64, and the data are normalized before input to speed up network convergence. The model is trained with the Adam optimizer, a batch size of 64, and a learning rate of 0.001 on a GPU (NVIDIA Tesla K80, 16 GB). Adaptive decay of the learning rate is used: the validation loss is monitored, and the learning rate is decayed tenfold when it has not decreased for two consecutive epochs (794 iterations per epoch). To save computational resources and prevent overfitting, an early stopping strategy is used: the network stops training when the validation loss has not decreased for five consecutive epochs.
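The decay-and-stop logic can be sketched as a small loop over per-epoch validation losses; the function name and return convention are illustrative, and the way repeated plateaus trigger repeated decays is an assumption about the schedule:

```python
# Sketch of the schedule: decay the learning rate tenfold after 2
# consecutive non-improving epochs, stop after 5 non-improving epochs.
def schedule(val_losses, lr=0.001, decay_patience=2, stop_patience=5):
    best, bad = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, bad = loss, 0          # validation loss improved
        else:
            bad += 1
            if bad % decay_patience == 0:
                lr /= 10.0               # tenfold decay
            if bad >= stop_patience:
                return epoch, lr         # early stop
    return len(val_losses) - 1, lr
```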
Cross-entropy allows the gradient corresponding to each class to be backpropagated stably and effectively alleviates the vanishing-gradient problem during backpropagation. Therefore, in this paper, cross-entropy is used as the loss function:

L = -(1/N) Σi [gi log(pi) + (1 - gi) log(1 - pi)]

In the formula, gi is the true class of sample i, and pi is the network's prediction for sample i.
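The loss can be checked numerically with a small sketch of binary cross-entropy, assuming pi is the predicted probability of the malignant class:

```python
# Numerical check of the cross-entropy loss: g holds true classes (0 or 1),
# p holds the network's predicted probabilities for class 1.
import numpy as np

def cross_entropy(g, p, eps=1e-12):
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return -np.mean(g * np.log(p) + (1 - g) * np.log(1 - p))
```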
The evaluation metrics are accuracy (ACC), true positive rate (TPR), true negative rate (TNR), F-Score, and the area under the ROC curve (AUC), defined as follows:

ACC = (TP + TN) / (TP + TN + FP + FN)

TPR = TP / (TP + FN)

TNR = TN / (TN + FP)

Precision = TP / (TP + FP)

F-Score = 2 × Precision × TPR / (Precision + TPR)

where TP, TN, FP, and FN indicate the numbers of true positives, true negatives, false positives, and false negatives, respectively. The ROC curve plots the true positive rate against the false positive rate at varying classification thresholds, and the AUC is the area under this curve.
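These metric definitions can be computed directly from confusion-matrix counts (the counts in the test are arbitrary illustrative numbers):

```python
# The standard classification metrics computed from confusion-matrix counts.
def metrics(tp, tn, fp, fn):
    acc = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn)                    # sensitivity / recall
    tnr = tn / (tn + fp)                    # specificity
    precision = tp / (tp + fp)
    f_score = 2 * precision * tpr / (precision + tpr)
    return acc, tpr, tnr, precision, f_score
```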
Figure 5 shows the changes in the loss and accuracy of the training and validation sets during training of the proposed model. The figure shows that the accuracy of both the training and validation sets increases smoothly. The training loss decreases smoothly and gradually converges, while the validation loss fluctuates slightly and levels off after the 7th epoch. A lower validation loss indicates better generalization, so the weights from the 5th epoch are chosen to test the model's performance.
Graph: Fig. 5 Curves of loss and ACC.
Table 3 shows the results of the ablation experiments. As can be seen from Table 3, A, the original model on the original dataset, performs poorly in testing, with a TNR below 50%. B uses WGAN with conditions to balance the categories while slightly expanding the sample size. Since the stationary features generated by WGAN make the network easier to train, all metrics in the experimental results improve except TPR; in particular, the TNR improves by 23.47%. C uses the five data augmentation methods, which greatly increase the amount of data, so the network is less likely to overfit and can learn more features; compared with B, the AUC improves by 0.098. D builds on C by cutting the middle flow to 4 repetitions and introducing the adaptive dual-channel feature extraction module.
Table 3 Ablation experiments
Model | Data | ACC (%) | AUC | F-Score (%) | TNR (%) | TPR (%)
A. Xception | Original | 64.38 | 0.722 | 69.43 | 47.96 | 80.71
B. Xception | Balance | 72.01 (+7.63) | 0.790 (+0.068) | 72.22 (+2.79) | 71.43 (+23.47) | 72.59 (-8.12)
C. Xception | Augmentation | 78.88 (+6.87) | 0.888 (+0.098) | 80.56 (+8.34) | 70.41 (-1.02) | 87.31 (+14.72)
D. +Channel | Augmentation | 80.66 (+1.78) | 0.904 (+0.016) | 82.08 (+1.52) | 72.96 (+2.55) | 88.32 (+1.01)
E. +Attention | Augmentation | 83.46 (+2.80) | 0.923 (+0.019) | 84.92 (+2.84) | 73.98 (+1.02) | 92.89 (+4.57)
Thus, the adaptive dual-channel feature extraction module enables the model to combine surrounding information when classifying the target and to classify lung nodules more accurately; D achieves the best value of every metric compared with A, B, and C. E builds on D and introduces the convolutional block attention module, enabling the model to learn features more accurately through the spatial and channel attention modules. This improvement raises the classification performance again; in particular, the TPR improves by 4.57% to 92.89%.
In this paper, the non-pre-trained LeNet [[
Table 4 Comparison with typical classification networks
Model | ACC (%) | AUC | F-Score (%) | TNR (%) | TPR (%)
LeNet | 64.63 | 0.695 | 60.17 | 76.02 | 53.30
AlexNet | 75.06 | 0.837 | 77.00 | 75.06 | 66.84
VGG16 | 75.83 | 0.841 | 77.11 | 70.41 | 81.22
ResNet50 | 78.12 | 0.846 | 78.92 | 74.49 | 81.73
InceptionV3 | 79.39 | 0.890 | 79.70 | — | 80.71
Xception | 78.88 | 0.888 | 80.56 | 70.41 | 87.31
Proposed | 83.46 | 0.923 | 84.92 | 73.98 | 92.89
Graph: Fig. 6 Comparison of the ROC curves of classification way 1 and classical classification network.
Graph: Fig. 7 Confusion matrix.
In this paper, the pre-trained networks VGG19, ResNet101, InceptionV3, MobileNetV2 [[
The experimental results are shown in Table 5, where the best value of each metric across all models is shown in bold. Figure 8 shows the ROC curve for each pre-trained network. Combining Table 5 and Fig. 8 shows that the fine-tuned pre-trained networks achieve better TPR but lower TNR. The proposed method achieves the best ACC, AUC, and F-Score, indicating that it outperforms the above pre-trained models in classification and achieves a better balance between TPR and TNR.
Table 5 Comparison with pre-training network
Model | Classifier | ACC (%) | AUC | F-Score (%) | TNR (%) | TPR (%)
VGG19 | Logistic | 76.34 | 0.834 | 79.47 | 61.22 | 91.37
VGG19 | Multinomial NB | 78.88 | 0.879 | 81.18 | 66.84 | 90.86
VGG19 | KNN | 77.61 | 0.817 | 80.44 | 63.27 | 91.88
VGG19 | Random Forest | 76.34 | 0.859 | 79.37 | 61.73 | 90.86
VGG19 | SVM RBF | 79.13 | 0.883 | 81.61 | 65.31 | 87.31
VGG19 | MLP | 76.08 | 0.776 | 78.92 | 62.76 | 89.34
ResNet101 | Logistic | 79.64 | 0.885 | 81.40 | 70.41 | 88.83
ResNet101 | Multinomial NB | 78.37 | 0.803 | 80.46 | 67.86 | 88.83
ResNet101 | KNN | 81.68 | 0.860 | 81.91 | 80.61 | 82.74
ResNet101 | Random Forest | 80.15 | 0.855 | 81.34 | 73.98 | 86.29
ResNet101 | SVM RBF | 80.15 | 0.877 | 81.52 | 72.96 | 87.31
ResNet101 | MLP | 83.21 | 0.834 | 83.82 | 79.59 | 86.80
InceptionV3 | Logistic | 79.13 | 0.902 | 81.70 | 65.31 | 92.89
InceptionV3 | Multinomial NB | 79.13 | 0.795 | 81.86 | 64.29 | 93.91
InceptionV3 | KNN | 79.64 | 0.815 | 81.90 | 67.35 | 91.88
InceptionV3 | Random Forest | 79.64 | 0.828 | 82.14 | 65.82 | 93.40
InceptionV3 | SVM RBF | 80.15 | 0.873 | 82.74 | 65.31 | 94.92
InceptionV3 | MLP | 79.90 | 0.809 | 81.84 | 69.39 | 90.36
Inception-ResNetV2 | Logistic | 80.15 | 0.895 | 82.66 | 65.82 | 94.42
Inception-ResNetV2 | Multinomial NB | 78.88 | 0.821 | 81.68 | 63.78 | 93.91
Inception-ResNetV2 | KNN | 80.92 | 0.848 | 82.99 | 68.88 | 92.89
Inception-ResNetV2 | Random Forest | 81.42 | 0.886 | 82.28 | 71.43 | 91.37
Inception-ResNetV2 | SVM RBF | 81.17 | 0.868 | 83.37 | 67.35 | 94.92
Inception-ResNetV2 | MLP | 81.93 | 0.816 | 83.75 | 70.92 | 92.89
MobileNetV2 | Logistic | 79.13 | 0.865 | 81.11 | 68.88 | 89.34
MobileNetV2 | Multinomial NB | 79.64 | 0.798 | 81.04 | 72.45 | 86.80
MobileNetV2 | KNN | 79.90 | 0.845 | 81.50 | 71.43 | 88.32
MobileNetV2 | Random Forest | 80.66 | 0.859 | 81.99 | 73.47 | 87.82
MobileNetV2 | SVM RBF | 79.90 | 0.856 | 81.41 | 71.94 | 87.82
MobileNetV2 | MLP | 79.39 | 0.813 | 80.58 | 73.47 | 85.28
Xception | Logistic | 80.15 | 0.903 | 82.02 | 69.90 | 90.36
Xception | Multinomial NB | 80.15 | 0.803 | 82.11 | 69.39 | 90.86
Xception | KNN | 80.66 | 0.889 | 82.73 | 68.88 | 92.39
Xception | Random Forest | 83.21 | 0.861 | 85.07 | 70.92 | —
Xception | SVM RBF | 80.15 | 0.899 | 82.27 | 68.37 | 91.88
Xception | MLP | 80.66 | 0.835 | 82.88 | 67.88 | 93.40
DenseNet121 | Logistic | 80.66 | 0.869 | 81.64 | 75.51 | 85.79
DenseNet121 | Multinomial NB | 78.88 | 0.799 | 80.65 | 69.90 | 87.82
DenseNet121 | KNN | 80.66 | 0.840 | 81.09 | 78.57 | 82.74
DenseNet121 | Random Forest | 81.17 | 0.858 | 82.46 | 73.98 | 88.32
DenseNet121 | SVM RBF | 80.41 | 0.876 | 81.27 | 76.02 | 84.77
DenseNet121 | MLP | 80.92 | 0.852 | 82.52 | 71.94 | 89.85
Proposed | Logistic | 83.72 | — | 85.05 | 75.00 | 92.39
Proposed | Multinomial NB | 83.21 | 0.837 | 84.65 | 73.98 | 92.39
Proposed | KNN | 85.24 | 0.883 | — | 79.59 | 90.86
Proposed | Random Forest | 85.24 | 0.871 | 85.99 | — | 90.35
Proposed | SVM RBF | 84.22 | 0.918 | 85.24 | 77.55 | 90.86
Proposed | MLP | 84.99 | 0.912 | 85.85 | 79.08 | 90.86
Graph: Fig. 8 Comparison of the ROC curves of classification way 2 and pre-trained network.
Table 6 shows the changes in the number of parameters, training speed, and inference speed after adding the adaptive dual-channel feature extraction module and the convolutional block attention module. As the table shows, the improved model raises ACC by 4.58% and AUC by 0.035. Because the number of middle-flow repetitions is reduced, the parameter count and model complexity decrease slightly. However, training and inference become slower owing to the additional element-wise operations. For lung nodule classification, accuracy matters more than inference speed, so trading a small amount of inference speed for higher classification accuracy is acceptable. In addition, this paper explores the effect of varying N in the dual-channel feature extraction module on classification performance. Table 6 shows that when both modules use N = 1 or both use N = 5, the network obtains limited information around the target, which improves classification performance less. When N is 1,7 or 1,5, the network receives more information about the target and classifies lung nodules better. Considering all metrics, the network classifies pulmonary nodules best when N is 1,5.
Table 6 Number of parameters and speed
Model | N | ACC (%) | AUC | Total params | FLOPs | ms/step | FPS
Original | None | 78.88 | 0.888 | 20,890,736 | 20,864,641 | 97 | 18.54
+Channel | 1,5 | 80.66 | 0.904 | 17,857,176 | 17,832,841 | 102 | 14.66
+Attention | 1,5 | 83.46 | 0.923 | 20,258,852 | 20,227,401 | 114 | 13.63
DCA-Xception | 1,1 | 81.93 | 0.915 | 20,209,699 | 20,178,249 | 114 | 13.87
DCA-Xception | 5,5 | 81.93 | 0.911 | 20,308,003 | 20,276,553 | 115 | 13.49
DCA-Xception | 1,7 | 83.72 | 0.911 | 20,308,003 | 20,276,553 | 115 | 13.51
The experiments above show that the proposed method produces satisfactory results on the LIDC-IDRI dataset. Compared with classical classification networks and pre-trained networks on the same dataset, it classifies benign and malignant lung nodules better while keeping a smaller number of parameters. The proposed model also has better structural flexibility: the number and size of the convolutional kernels in the adaptive dual-channel feature extraction module can be changed to suit different datasets. Two classification ways are used in this paper; the main difference is that way 1 uses fully connected layers for classification, while way 2 uses various classifiers. Experiments show that using separate classifiers improves classification accuracy, so optimizing the classifiers could raise accuracy further. One limitation of this paper is the exclusion of easily confused nodules with mean malignancy scores between 2.5 and 3.5, so the classification performance of the proposed method on such nodules needs further investigation. In addition, since pulmonary nodule lesions appear in multiple CT slices, the method targets only the lesion regions in individual slices, ignoring the association between adjacent slices. Using the pixel association between lung nodule lesions in adjacent slices to improve classification accuracy will therefore be the next research focus.
In this paper, to address the problems of insufficient samples and unbalanced categories of lung nodules, five data augmentation methods and WGAN with conditions are used to expand the samples. An adaptive dual-channel feature extraction module and a convolutional block attention module are introduced into the middle flow of Xception to improve the classification performance of the model. The computational complexity, number of parameters, training speed, and inference speed of the DCA-Xception network are studied, and its lung nodule classification performance is compared with that of classical classification networks and pre-trained networks to verify the effectiveness of the improved network. The experimental results show that the network outperforms both the traditional classification networks and the pre-trained networks in classifying lung nodules. In summary, the proposed method performs well on the lung nodule classification task and can provide effective diagnostic decision support for physicians.
This work was supported by the National Natural Science Foundation of China (NO.51975170), Youth Innovation Fund of Heilongjiang Academy of Sciences (NO.CXJQ2020WL01), Basic Applied Technology of Heilongjiang Institutes Research Special Project (NO.ZNJZ2020WL01), Natural Science Foundation of Heilongjiang Province (NO.LH2019F024).
By Dongjie Li; Shanliang Yuan and Gang Yao