The skin is the largest organ of the human body, and many visceral diseases are directly reflected on the skin, so accurate segmentation of skin lesion images is of great clinical significance. To address the complex color, blurred boundaries, and uneven scale information of such images, a skin lesion image segmentation method based on dense atrous spatial pyramid pooling (DenseASPP) and an attention mechanism is proposed. The method is based on the U-shaped network (U-Net). Firstly, the encoder is redesigned to replace ordinary convolutional stacking with a large number of residual connections, which effectively retain key features even as the network depth increases. Secondly, channel attention is fused with spatial attention, and residual connections are added so that the network can adaptively learn channel and spatial features of images. Finally, the DenseASPP module is introduced and redesigned to enlarge the receptive field and obtain multi-scale feature information. The proposed algorithm obtained satisfactory results on the official public dataset of the International Skin Imaging Collaboration (ISIC 2016). The mean Intersection over Union (mIOU), sensitivity (SE), precision (PC), accuracy (ACC), and Dice coefficient (Dice) are 0.9018, 0.9459, 0.9487, 0.9681, and 0.9473, respectively. The experimental results demonstrate that the proposed method improves the segmentation of skin lesion images and is expected to assist professional dermatologists in diagnosis.
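As a minimal sketch of how the five metrics reported in the abstract above are conventionally computed for a binary lesion mask, the following assumes flattened 0/1 pixel labels. The function names are hypothetical and not taken from the paper, and the abstract does not state how the mean in mIOU is taken, so only the per-image foreground IoU is shown.

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two flat sequences of 0/1 pixel labels."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Standard overlap metrics for one predicted mask (assumed definitions)."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return {
        "iou": safe_div(tp, tp + fp + fn),            # foreground IoU
        "sensitivity": safe_div(tp, tp + fn),         # recall on lesion pixels
        "precision": safe_div(tp, tp + fp),
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "dice": safe_div(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    print(segmentation_metrics(predicted, reference))
```

Dice and IoU are monotonically related for a single mask (Dice = 2·IoU / (1 + IoU)), but the relation does not hold exactly once both are averaged over a dataset.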
Lung cancer is the most threatening tumor disease to human health. Early detection is crucial to improving the survival rate and recovery rate of lung cancer patients. Existing methods use a two-dimensional multi-view framework to learn lung nodule features and simply integrate the multi-view features to classify lung nodules as benign or malignant. However, these methods fail to capture spatial features effectively and ignore the variability between views. Therefore, this paper proposes a three-dimensional (3D) multi-view convolutional neural network (MVCNN) framework. To further address the differences between views in the multi-view model, a 3D multi-view squeeze-and-excitation convolutional neural network (MVSECNN) model is constructed by introducing the squeeze-and-excitation (SE) module in the feature fusion stage. Finally, statistical methods are used to analyze model predictions and doctor annotations. On the independent test set, the classification accuracy and sensitivity of the model were 96.04% and 98.59%, respectively, which were higher than those of other state-of-the-art methods. The consistency score between the predictions of the model and the pathological diagnosis results was 0.948, which is significantly higher than that between the doctor annotations and the pathological diagnosis results. The methods presented in this paper can effectively learn the spatial heterogeneity of lung nodules and address the problem of multi-view differences. At the same time, they classify lung nodules as benign or malignant, which is of great significance for assisting doctors in clinical diagnosis.
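The squeeze-and-excitation step mentioned in the abstract above can be illustrated with a small sketch: squeeze each feature group to a scalar, pass the scalars through a two-layer bottleneck, and rescale the features by the resulting gates. Treating each view as one group, and the weight matrices `w_reduce` and `w_expand`, are assumptions made for illustration. In a trained network the weights are learned, and the paper's exact fusion layout is not given in the abstract.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(weights, vec):
    """Multiply a row-major weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def se_reweight(view_features, w_reduce, w_expand):
    """Squeeze-and-excitation over views (hypothetical layout).

    view_features: one flat feature list per view.
    w_reduce:      bottleneck weights, shape (hidden, n_views).
    w_expand:      expansion weights, shape (n_views, hidden).
    Returns the per-view gates and the rescaled features.
    """
    # Squeeze: global average of each view's features.
    squeezed = [sum(f) / len(f) for f in view_features]
    # Excitation: FC -> ReLU -> FC -> sigmoid.
    hidden = [relu(h) for h in matvec(w_reduce, squeezed)]
    gates = [sigmoid(g) for g in matvec(w_expand, hidden)]
    # Scale: multiply every feature of a view by that view's gate.
    scaled = [[g * x for x in f] for g, f in zip(gates, view_features)]
    return gates, scaled


if __name__ == "__main__":
    views = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]]
    w_reduce = [[0.5, 0.5, 0.5]]          # illustrative fixed weights
    w_expand = [[1.0], [0.0], [-1.0]]     # illustrative fixed weights
    gates, scaled = se_reweight(views, w_reduce, w_expand)
    print(gates)
```

Because each gate lies strictly between 0 and 1, the module can only attenuate a view relative to the others; it never discards one outright.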
Glioma is a primary brain tumor with a high incidence rate. High-grade gliomas (HGG) have the highest degree of malignancy and the lowest survival rate. Surgical resection and postoperative adjuvant chemoradiotherapy are often used in clinical treatment, so accurate segmentation of tumor-related areas is of great significance for the treatment of patients. To improve the segmentation accuracy of HGG, this paper proposes a multi-modal glioma semantic segmentation network with multi-scale feature extraction and a multi-attention fusion mechanism. The main contributions are: (1) multi-scale residual structures were used to extract features from multi-modal glioma magnetic resonance imaging (MRI); (2) two types of attention modules were used to aggregate features along the channel and spatial dimensions; (3) to improve the segmentation performance of the whole network, a branch classifier was constructed using an ensemble learning strategy to adjust and correct the classification results of the backbone classifier. The experimental results showed that the Dice coefficients of the proposed segmentation method were 0.9097, 0.8773, and 0.8396 for whole tumor, tumor core, and enhanced tumor, respectively, and the segmentation results had good boundary continuity in the three-dimensional direction. Therefore, the proposed semantic segmentation network has good segmentation performance for high-grade glioma lesions.
Deep learning-based automatic classification of diabetic retinopathy (DR) helps to enhance the accuracy and efficiency of auxiliary diagnosis. This paper presents an improved residual network model for classifying DR into five different severity levels. First, the convolution in the first layer of the residual network was replaced with three smaller convolutions to reduce the computational load of the network. Second, to address the issue of inaccurate classification due to minimal differences between different severity levels, a mixed attention mechanism was introduced to make the model focus more on the crucial features of the lesions. Finally, to better extract the morphological features of the lesions in DR images, cross-layer fusion convolutions were used instead of the conventional residual structure. To validate the effectiveness of the improved model, it was applied to the Kaggle Blindness Detection competition dataset APTOS2019. The experimental results demonstrated that the proposed model achieved a classification accuracy of 97.75% and a Kappa value of 0.9717 for the five DR severity levels. Compared to some existing models, this approach shows significant advantages in classification accuracy and performance.
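The Kappa value reported in the abstract above measures agreement beyond chance between predicted and reference grades. The abstract does not say which variant was used, so the sketch below implements Cohen's kappa with an optional quadratic weighting, a common choice for ordinal severity grades; that choice, and the function name, are assumptions.

```python
def cohen_kappa(y_true, y_pred, n_classes, weighting=None):
    """Cohen's kappa for integer labels in range(n_classes).

    weighting=None gives the unweighted statistic; weighting="quadratic"
    penalises disagreements by the squared distance between grades.
    """
    n = len(y_true)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(n_classes))
                  for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            if weighting == "quadratic":
                w = (i - j) ** 2 / (n_classes - 1) ** 2
            else:
                w = 0.0 if i == j else 1.0
            # Expected count under independent marginals.
            expected = row_totals[i] * col_totals[j] / n
            weighted_observed += w * observed[i][j]
            weighted_expected += w * expected
    if weighted_expected == 0.0:
        raise ValueError("kappa is undefined when only one class is present")
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    reference = [0, 1, 2, 3, 4, 2, 1]
    predicted = [0, 1, 2, 3, 3, 2, 0]
    print(cohen_kappa(reference, predicted, n_classes=5, weighting="quadratic"))
```

With quadratic weighting, confusing adjacent grades costs far less than confusing the mildest grade with the most severe one, which matches how grading errors matter clinically.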
Accurate segmentation of breast ultrasound images is an important precondition for lesion determination. Existing segmentation approaches suffer from massive parameter counts, slow inference, and huge memory consumption. To tackle this problem, we propose T2KD Attention U-Net (dual-Teacher Knowledge Distillation Attention U-Net), a lightweight semantic segmentation method for breast ultrasound images that combines double-path joint distillation. First, we designed two teacher models to learn the fine-grained features of each class of images according to the different feature representations and semantic information of benign and malignant breast lesions. Then we used joint distillation to train a lightweight student model. Finally, we constructed a novel weight balance loss that focuses on the semantic features of small objects, addressing the imbalance between tumor and background. Extensive experiments conducted on Dataset BUSI and Dataset B demonstrated that T2KD Attention U-Net outperformed various knowledge distillation counterparts. Concretely, the accuracy, recall, precision, Dice, and mIoU of the proposed method were 95.26%, 86.23%, 85.09%, 83.59%, and 77.78% on Dataset BUSI, and 97.95%, 92.80%, 88.33%, 88.40%, and 82.42% on Dataset B, respectively. Compared with other models, the performance of this model was significantly improved. Meanwhile, compared with the teacher model, the number of parameters, size, and complexity of the student model were significantly reduced (2.2×10^6 vs. 106.1×10^6, 8.4 MB vs. 414 MB, 16.59 GFLOPs vs. 205.98 GFLOPs, respectively). The proposed model thus maintains performance while greatly decreasing the amount of computation, which provides a new method for deployment in clinical medical scenarios.
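To make the idea of distilling from two teachers in the abstract above concrete, here is a minimal sketch of a temperature-softened distillation loss for a single pixel or sample. It uses the generic KL-divergence formulation of knowledge distillation; the equal mixing weight, the temperature, and all function names are assumptions, since the abstract does not specify how the two teacher paths are combined.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def dual_teacher_distillation_loss(student_logits, teacher_a_logits,
                                   teacher_b_logits, temperature=2.0,
                                   weight_a=0.5):
    """Weighted sum of KL terms pulling the student toward each teacher.

    The T^2 factor is the usual rescaling that keeps gradient magnitudes
    comparable across temperatures.
    """
    student = softmax(student_logits, temperature)
    loss_a = kl_divergence(softmax(teacher_a_logits, temperature), student)
    loss_b = kl_divergence(softmax(teacher_b_logits, temperature), student)
    return temperature ** 2 * (weight_a * loss_a + (1.0 - weight_a) * loss_b)


if __name__ == "__main__":
    student = [0.2, 1.1]       # hypothetical background/tumor logits
    teacher_a = [0.1, 2.0]
    teacher_b = [-0.3, 1.5]
    print(dual_teacher_distillation_loss(student, teacher_a, teacher_b))
```

In practice such a term is added to the ordinary supervised segmentation loss, so the student learns from both the ground-truth masks and the teachers' soft predictions.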
Motor imagery electroencephalogram (EEG) signals are non-stationary time series with a low signal-to-noise ratio, and single-channel EEG analysis methods struggle to describe the interaction characteristics between multi-channel signals. This paper proposes a deep learning network model based on a multi-channel attention mechanism. First, we performed time-frequency sparse decomposition on the pre-processed data, which enhanced the differences in the time-frequency characteristics of the EEG signals. Then we used the attention module to map the data in time and space so that the model could make full use of the characteristics of the different EEG channels. Finally, an improved temporal convolutional network (TCN) was used for feature fusion and classification. The BCI Competition IV-2a dataset was used to verify the proposed algorithm. The experimental results showed that the proposed algorithm effectively improved the classification accuracy of motor imagery EEG signals, achieving an average accuracy of 83.03% across 9 subjects, which is higher than that of existing methods. By enhancing the distinguishing features between different motor imagery EEG data, the proposed method is relevant to research on improving classifier performance.
Early screening based on computed tomography (CT) pulmonary nodule detection is an important means of reducing lung cancer mortality, and in recent years three-dimensional convolutional neural networks (3D CNN) have achieved success and continuous development in the field of lung nodule detection. We propose a pulmonary nodule detection algorithm using a 3D CNN based on a multi-scale attention mechanism. To handle the varying sizes and shapes of lung nodules, we designed a multi-scale feature extraction module that extracts the corresponding features at different scales. Through the attention module, correlation information between the features is mined from both spatial and channel perspectives to strengthen the features. The extracted features are then fed into a pyramid-like fusion mechanism, so that the features contain both deep semantic information and shallow location information, which is more conducive to target localization and bounding box regression. On the representative LUNA16 dataset, compared with other advanced methods, this method significantly improved detection sensitivity, which can provide a theoretical reference for clinical medicine.
The brain-computer interface (BCI) based on motor imagery electroencephalography (MI-EEG) enables direct information interaction between the human brain and external devices. In this paper, a multi-scale EEG feature extraction convolutional neural network model based on time series data enhancement is proposed for decoding MI-EEG signals. First, an EEG signal augmentation method was proposed that increases the information content of training samples without changing the length of the time series, while completely retaining its original features. Then, multiple holistic and detailed features of the EEG data were adaptively extracted by a multi-scale convolution module, and the features were fused and filtered by a parallel residual module and channel attention. Finally, classification results were output by a fully connected network. Experimental results on the BCI Competition IV 2a and 2b datasets showed that the proposed model achieved average classification accuracies of 91.87% and 87.85% for the motor imagery task, respectively, demonstrating high accuracy and strong robustness compared with existing baseline models. The proposed model does not require complex signal pre-processing operations and has the advantage of multi-scale feature extraction, giving it high practical application value.
The conventional fault diagnosis of patient monitors relies heavily on manual experience, resulting in low diagnostic efficiency and ineffective utilization of fault maintenance text data. To address these issues, this paper proposes an intelligent fault diagnosis method for patient monitors based on multi-feature text representation, an improved bidirectional gated recurrent unit (BiGRU), and an attention mechanism. Firstly, the fault text data were preprocessed, and word vectors containing multiple linguistic features were generated by the linguistically-motivated bidirectional encoder representation from Transformer. Then, the bidirectional fault features were extracted by the improved BiGRU and weighted by the attention mechanism. Finally, a weighted loss function was used to reduce the impact of class imbalance on the model. To validate the effectiveness of the proposed method, it was evaluated on a patient monitor fault dataset, where it achieved a macro F1 value of 91.11%. The results show that the model built in this study can automatically classify fault text and may provide decision support for the intelligent fault diagnosis of patient monitors in the future.
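The weighted loss mentioned in the abstract above is not specified further, so the following is only a sketch of one common form: cross-entropy with class weights inversely proportional to class frequency. The weighting formula and function names are assumptions rather than the paper's definition.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in counts.items()}


def weighted_cross_entropy(probabilities, labels, weights):
    """Class-weighted mean negative log-likelihood.

    probabilities: one list of predicted class probabilities per sample.
    labels:        integer class index per sample.
    weights:       mapping from class index to its weight.
    """
    weighted_sum = sum(weights[y] * -math.log(p[y])
                       for p, y in zip(probabilities, labels))
    normaliser = sum(weights[y] for y in labels)
    return weighted_sum / normaliser


if __name__ == "__main__":
    fault_labels = [0, 0, 0, 1]            # class 1 is the rare fault type
    class_weights = inverse_frequency_weights(fault_labels)
    predicted = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
    print(class_weights)
    print(weighted_cross_entropy(predicted, fault_labels, class_weights))
```

With these weights, a mistake on the rare class contributes as much to the loss as several mistakes on the frequent class, which counteracts the tendency to always predict the majority fault type.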
Existing classification methods for myositis ultrasound images suffer from either poor classification performance or high computational cost. To address this difficulty, a lightweight neural network based on a soft threshold attention mechanism is proposed to improve the classification of idiopathic inflammatory myopathies (IIMs). The proposed network was constructed by alternating depthwise separable convolution (DSC) and conventional convolution (CConv). Moreover, a soft threshold attention mechanism was used to enhance the extraction of key features. Compared with the current dual-branch feature fusion myositis classification network, which has the highest classification accuracy among existing methods, the classification accuracy of the proposed network increased by 5.9%, reaching 96.1%, and its computational complexity was only 0.25% of that of the existing method. The results indicate that the proposed method can provide physicians with more accurate classification results at a lower computational cost, thereby greatly assisting them in clinical diagnosis.
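Soft thresholding, the operation underlying the attention mechanism named in the abstract above, shrinks small-magnitude features to zero and reduces the rest by a threshold. The sketch below sets the threshold to a scaling factor times the mean absolute feature value, in the style of shrinkage networks; that rule and the `alpha` parameter are assumptions, since the abstract does not describe how the threshold is obtained.

```python
def soft_threshold(value, tau):
    """Shrink a value toward zero by tau; values within [-tau, tau] become 0."""
    if value > tau:
        return value - tau
    if value < -tau:
        return value + tau
    return 0.0


def soft_threshold_channel(features, alpha):
    """Apply soft thresholding to one channel of features.

    alpha is a hypothetical scaling factor in (0, 1); in an attention
    module it would be predicted by a small sub-network rather than fixed.
    The threshold is alpha times the channel's mean absolute value.
    """
    mean_abs = sum(abs(x) for x in features) / len(features)
    tau = alpha * mean_abs
    return [soft_threshold(x, tau) for x in features]


if __name__ == "__main__":
    channel = [1.0, -4.0, 0.5, 2.5]
    print(soft_threshold_channel(channel, alpha=0.5))
```

Unlike ReLU, soft thresholding keeps strong negative responses while still suppressing weak responses of either sign, which is why it is often described as a denoising step.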