X-ray Diffraction Data Analysis by Machine Learning Methods—a review
applsci-13-09992
Table 2. Accuracy of the class-specific predictive performance for the different classifier algorithms. Data from reference [92].

| Class | SVM | NB | KNN | RF | CNN: Cartesian | CNN: Polar-Min | CNN: Polar-Max |
|---|---|---|---|---|---|---|---|
| Artifact | 0.85 | 0.78 | 0.87 | 0.91 | 0.94 | 0.93 | 0.92 |
| Background Ring | 0.72 | 0.61 | 0.72 | 0.86 | 0.92 | 0.91 | 0.90 |
| Diffuse Scattering | 0.93 | 0.45 | 0.93 | 0.93 | 0.96 | 0.95 | 0.97 |
| Ice Ring | 0.14 | 0.80 | 0.93 | 0.95 | 0.99 | 0.99 | 0.98 |
| Loop Scattering | 0.70 | 0.62 | 0.71 | 0.83 | 0.94 | 0.95 | 0.96 |
| Nonuniform Detector Response | 0.45 | 0.68 | 0.75 | 0.81 | 0.87 | 0.89 | 0.89 |
| Strong Background | 0.90 | 0.87 | 0.89 | 0.93 | 0.94 | 0.91 | 0.93 |

Chakraborty and Sharma [93, 94] compared several algorithms (RF, KNN, decision tree, SVM, and gradient boosting) with a CNN for classifying crystal systems into seven categories: triclinic, monoclinic, orthorhombic, tetragonal, hexagonal, rhombohedral, and cubic. The training dataset consisted of 164 compounds extracted from the Inorganic Crystal Structure Database with a similar composition, expected crystal symmetry, and space group. The CNN outperformed the other algorithms studied, achieving a cross-validation accuracy for crystal system classification of 95.6%, compared with 55% for naïve Bayes, 64.3% for KNN, 68.5% for logistic regression, 56.5% for RF, 45.6% for decision trees, 67.1% for SVM, 62.3% for decision trees, and 65.4% for a deep neural network.

Massuyeau et al. [95] explored RF and CNN models to distinguish between perovskite-type and non-perovskite-type materials in a series of hybrid lead halides. The synthetic (simulated) dataset was based on 998 crystal structures from the Cambridge Structural Database: 375 perovskite-type compounds (50 chlorides, 105 bromides, and 220 iodides) and 623 non-perovskite-type compounds (50 chlorides, 139 bromides, and 426 iodides).
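As context for any accuracy reported on this dataset, the class balance can be worked out from the counts quoted above. The following is a minimal sketch of that arithmetic only; the variable names are illustrative and nothing here comes from the authors' code.

```python
# Class counts as quoted in the text for the simulated dataset.
counts = {"perovskite": 375, "non-perovskite": 623}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

# Accuracy of a trivial classifier that always predicts the larger class.
majority_baseline = max(counts.values()) / total

for name, share in shares.items():
    print(f"{name}: {share:.3f}")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
```

The baseline comes out at roughly 0.62, so a classifier on this dataset has to score clearly above that value to be informative.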
The study also used experimentally measured X-ray powder diffraction data for 23 freshly prepared lead halides: 9 previously published (and reported in the Cambridge Structural Database) and 14 new compounds. The two classification categories were perovskite and non-perovskite. For the RF algorithm, the number of trees was set to 100, with a maximum of 10 levels per tree, a minimum of 2 samples per leaf, a minimum of 10 samples to split a node, and a step size for the XRD patterns of 2.18°. The CNN was designed with 23 layers and took the simulated patterns as 1D input. The mean classification accuracy was 0.92 for the CNN and 0.89 for RF. For the 23 experimentally synthesized samples, the mean accuracy was 0.73 for the CNN and 0.78 for RF. The authors attributed the lower accuracy on the raw experimental patterns to effects such as preferential orientation and differing signal-to-noise ratios [92–95].

In geothermal fields, the classification of rock cuttings is important for understanding the geothermal system and for selecting a promising site [96]. Rock cuttings were obtained from two wells in the Hachimantai geothermal field; according to Ishitsuka et al., the 24 minerals they contain (Table 3) may have formed during hydrothermal alteration. To assess three ML algorithms, namely K-means clustering, a Gaussian mixture model, and agglomerative clustering [96], the authors prepared a dataset of 88 simulated samples with four mineral distributions along a well down to 1000 m with a depth spacing of 10 m. The samples were classified using three labels: quartz index, temperature, and depth.
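To illustrate the kind of unsupervised grouping described for the simulated well samples, the sketch below runs plain K-means (Lloyd's algorithm) on 88 synthetic samples drawn from four hypothetical mineral distributions. Only K-means is shown, not the Gaussian mixture or agglomerative methods. The three-component feature vectors, the zone centres, the noise level, and the farthest-point initialisation are all assumptions made for this example, not values or choices taken from reference [96].

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return [list(c) for c in centroids]

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical data: 88 samples ordered by depth, 22 per zone, each described
# by three made-up mineral fractions plus small bounded noise.
rng = random.Random(0)
zone_centres = [(0.7, 0.2, 0.1), (0.4, 0.5, 0.1), (0.2, 0.3, 0.5), (0.1, 0.1, 0.8)]
samples = [
    [f + rng.uniform(-0.02, 0.02) for f in zone_centres[i // 22]]
    for i in range(88)
]

labels, centroids = kmeans(samples, k=4)
for zone in range(4):
    print(f"zone {zone}: cluster labels {sorted(set(labels[22 * zone:22 * (zone + 1)]))}")
```

Because the synthetic zones are well separated relative to the noise, each zone ends up in its own cluster. Real diffraction-derived features overlap far more, which is one reason to compare several clustering algorithms rather than rely on one.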