Prediction and Analysis of Tensile Properties of Austenitic Stainless Steel Using Artificial Neural Network

Abstract: Predicting mechanical properties of metals from big data is of great importance to materials engineering. The present work applies artificial neural network (ANN) models to predict the tensile properties of austenitic stainless steel, including yield strength (YS) and ultimate tensile strength (UTS), as a function of chemical composition, heat treatment and test temperature. The developed models have good prediction performance for YS and UTS, with R values over 0.93. The models were also tested to verify their reliability and accuracy against metallurgical principles and other data published in the literature. In addition, a mean impact value analysis was conducted to quantitatively examine the relative significance of each input variable for the prediction performance. The trained models can be used as a guideline for the preparation and development of new austenitic stainless steels with the required tensile properties.


Introduction
Metallic materials are widely used in daily life, especially the many varieties of steel, which have a very long history of research. Many variables are known to affect the properties of steels, such as strength [1]. Strength is the ability of a material to resist plastic deformation or fracture, and tensile properties of steels such as strength and plasticity usually depend on chemical composition. In addition, heat treatments such as annealing, tempering and quenching can effectively control the microstructure, grain size and defects, all of which are closely related to the tensile properties of steels. The tensile properties are also affected by service conditions such as working temperature and irradiation environment [2]. To uncover how these variables affect steels, researchers have long had to rely on repeated experiments that change a few variables at a time, which yield only preliminary influence trends. A formula obtained by traditional linear regression generally struggles to capture the exact correlations between the variables and the corresponding properties of steels, because these relationships are often complex and nonlinear [3].
With the vigorous development of computer technology, machine learning has emerged as a powerful method for finding patterns in high-dimensional data [4]. Machine learning is essentially an efficient statistical analysis method that captures linear or nonlinear internal relationships by learning from empirical data [5]. Common machine learning methods include artificial neural network (ANN) [6,7], support vector machine (SVM) [8,9], decision tree (DT) [10] and so on. Machine learning has made data science and analytics a significant tool for finding the desired causal relations in materials research [11], and has given rise in recent years to a new field termed "Materials Informatics" [12,13]. It has been rapidly adopted for metals [14-18] as well as polymers [19] and semiconductors [20,21], which demonstrates its broad applicability.
Meanwhile, to meet the needs of machine learning for big data [22], large amounts of experimental data have been collected into databases such as MatNavi [23], MatWeb [24] and Matmatch [25]. MatNavi contains a large amount of data on the fatigue and creep properties of various steels, which have already been used to establish machine learning models. Agrawal et al. [26] demonstrated the practicality of machine learning for fatigue strength research with the Fatigue Data Sheet. Sourmail et al. [27] correctly captured the important influence trends using ANN models based on the Creep Data Sheet. Besides the fatigue and creep properties, a few machine learning models have been established to correlate the tensile properties of steels with the important variables. Guo et al. [28] used an ANN model to characterize the relationships between the mechanical properties of maraging steels and composition, processing and working conditions. Fragassa et al. [29] chose metallographic factors as the input features and designed three kinds of machine learning methods to model the mechanical properties of cast iron.
However, for austenitic stainless steel (ASS), more attention has been paid to creep, fatigue and corrosion resistance [30-32]. Besides corrosion resistance, mechanical properties of ASS such as strength are also important for application and have drawn much attention in past decades. The tensile properties of ASS have been extensively studied both experimentally and theoretically. Sivaprasad et al. [33] developed an artificial neural network model to correlate alloy composition and test temperature with the tensile properties of 15Cr-15Ni-2.2Mo-Ti modified ASS. Desu et al. [34] used test temperature and strain rate as descriptors to predict the tensile properties of ASS 304L and 316L with an ANN model. However, there is no general machine learning model that correlates chemical composition, heat processes and service conditions with the tensile properties of ASS and clarifies how each variable affects them.
In this work, we propose a machine learning method using ANN to predict the tensile properties of ASS with the chemical composition, solution treatment conditions (heat processes) and test temperature (service condition) as descriptors. The models, established with part of the data in the database, have high predictive accuracy for the remaining data and for some new data outside the database. We also calculate the impact of each variable with the mean impact value (MIV) method and predict the influence trends of several important variables on the tensile properties. Our results conform to established metallurgical theories, and the models can guide further research and development of new ASS with the expected tensile properties.

Information of the Database
Our study is based on tensile test data for some classical types of ASS, including SUS 304, SUS 316, SUS 321, SUS 347 and NCF 800H. The data are taken from the Creep Data Sheet of Steel (No.4B, 5B, 6B, 14B, 15B, 26B, 27B, 28B, 32A, 42 and 45) from NIMS MatNavi and the BSCC High Temperature Data from The British Steelmakers Creep Committee [35], as collected by the Materials Algorithms Project (MAP) of the University of Cambridge [36].
The original data contain 1916 samples, of which 1107 lack necessary information, so the remaining 809 samples are selected for further research. The data have the following characteristics: (1) every sample has two tensile properties as output, yield strength (YS) and ultimate tensile strength (UTS); (2) the data contain 20 variables selected as input, covering chemical composition, solution treatment conditions and test temperature, as shown in Table 1. The chemical composition includes common elements such as carbon (C), nickel (Ni), chromium (Cr) and nitrogen (N), and microalloying additions such as titanium (Ti), vanadium (V) and niobium (Nb). The solution treatment parameters are temperature and time. After the solution treatment, rapid water quenching is generally required to stabilize the austenite structure and prevent carbide precipitation at room temperature. Here, samples that are water-quenched are assigned the label 1, and air-cooled samples without water quenching are assigned the label 0. These features are related to the physical metallurgy, which is important for modeling and property prediction of ASS [37]. The original database contains some other features, such as the type of melting, grain size and product form, but their data are incomplete or they have a lower correlation with the tensile properties, so they are not considered in this study. Further information about the input variables is given in Tables A1 and A2 of Appendix A.

Division of Data and Pre-Processing
The data are randomly divided into two groups. Eighty-three percent of the data (674 samples) are selected as the training set for model establishment, while the remaining 17% (135 samples) are employed as the testing set for verifying the accuracy of the models, which makes the training/testing ratio close to 5/1.
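The random division described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `split_train_test`, the fixed seed and the integer stand-ins for samples are all assumptions made here.

```python
import random

def split_train_test(samples, n_train, seed=0):
    """Shuffle a copy of `samples` and cut it into training and testing sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    return shuffled[:n_train], shuffled[n_train:]

# Integer stand-ins for the 809 usable samples (hypothetical placeholders).
samples = list(range(809))
train, test = split_train_test(samples, n_train=674)
print(len(train), len(test), round(len(train) / len(test), 2))  # 674 135 4.99
```

Fixing the seed is a design choice for reproducibility only; the text does not state how the random division was seeded.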
Understanding the data thoroughly and pre-processing them with appropriate normalization before modeling is one of the most vital steps of effective data mining [26]. The input data generally have more than one dimension and each variable has a different range, so each variable is first normalized to the range from 0 to 1 with the following equation to improve the accuracy and efficiency of calculation and prediction:
$$x_n = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$
where $x_n$ is the normalized value of the corresponding $x$, and $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of $x$, respectively.
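As a small worked example of this min-max normalization, the sketch below scales one input column to the range from 0 to 1. The function name and the temperature values are hypothetical and are not taken from the database; the handling of a constant column is an assumption added to avoid division by zero.

```python
def min_max_normalize(column):
    """Scale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

# Hypothetical test temperatures in K, for illustration only.
temperatures = [298.0, 573.0, 873.0, 1073.0]
normalized = min_max_normalize(temperatures)
print(normalized)
```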

ANN Model Development
Artificial neural network (ANN) is a flexible model for non-linear statistical analysis that can be used for both classification and regression. It can be viewed as a box that links input data and output data together via a set of non-linear functions. More details of this method can be found elsewhere [38]; only a brief introduction to the main features of ANN is given here.
A simple three-layer feedforward network, such as the one shown in Figure 1, is adequate for general tasks. As the name suggests, it consists of three layers: input, hidden and output. The transfer function in the second layer can be any non-linear function as long as it is continuous and differentiable, such as the hyperbolic tangent function tanh (Equation (2)), which can effectively capture the interaction between the inputs and map many functions of practical interest [39]. The transfer function in the third layer is usually linear (Equation (3)):
$$h_i = \tanh\left(\sum_j w_{ij}^{(1)} x_j + \theta_i^{(1)}\right) \tag{2}$$
$$y = \sum_i w_i^{(2)} h_i + \theta^{(2)} \tag{3}$$
where $x_j$ are the inputs, $h_i$ the outputs of the hidden units, $w$ the weights, which determine the strength of the transfer function, and $\theta$ the biases, which are analogous to the constant in linear regression. The number of neurons in the hidden layer determines the complexity of the model and strongly influences the modeling result; how it is determined is described later.
We generally take the following six steps for ANN model development: (1) determine the input and output variables and collect the data; (2) pre-process the data, for example by normalizing; (3) divide the original database into the training set and testing set; (4) use the training set for modeling; (5) test the established model with the testing set; (6) use the model for further simulation and prediction.
There are many types of network models; the one used in this work is based on the back propagation learning algorithm and is called BPNN [40,41]. The traditional training algorithms of BPNN often give inaccurate models and are computationally inefficient. Therefore, several newer algorithms have come into wide use. We select one of the most popular algorithms with good generalization, Bayesian regularization (TRAINBR) [42]. The transfer functions are the hyperbolic tangent sigmoid function (TANSIG) in the second layer and the linear function (PURELIN) in the third layer. The combination of these functions meets the basic modeling requirement. To implement the ANN modeling, we use the Neural Network Toolbox (nntool) of MATLAB (R2018b edition) on a PC; TRAINBR, TANSIG and PURELIN are MATLAB commands of nntool. Detailed information can be found in the manual of MATLAB nntool [43]. To describe the modeling process more intuitively, a schematic diagram of the model is shown in Figure 2.
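The forward pass of such a three-layer network, with a tanh hidden layer and a linear output, can be sketched in plain Python. This is an illustration only: the function name `forward`, the tiny 2-2-1 architecture and all weight values are hypothetical, and the sketch does not reproduce the MATLAB implementation used in this work.

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Evaluate a network with one tanh hidden layer and one linear output unit.

    w_hidden holds one row of input weights per hidden unit;
    b_hidden holds one bias per hidden unit.
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Linear output layer: weighted sum of hidden outputs plus a bias.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Hypothetical 2-2-1 network, far smaller than the 20-input models in the text.
w_hidden = [[0.5, -0.2], [0.1, 0.3]]
b_hidden = [0.0, 0.1]
w_out = [1.0, 2.0]
b_out = 0.5
y = forward([1.0, 2.0], w_hidden, b_hidden, w_out, b_out)
print(y)
```

Adding hidden units adds rows to `w_hidden`, which is the sense in which the hidden-layer size sets the complexity of the model.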
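For orientation, Bayesian regularization is usually described as minimizing a weighted sum of the squared prediction errors and the squared network weights. The form below is a sketch under that assumption; the symbols $F$, $E_D$, $E_W$, $\alpha$ and $\beta$ are introduced here for illustration, with $y_i$ the measured values, $f(x_i)$ the network predictions and $w_k$ the network weights:

```latex
F = \beta E_D + \alpha E_W, \qquad
E_D = \sum_{i=1}^{n} \left( y_i - f(x_i) \right)^2, \qquad
E_W = \sum_{k} w_k^2
```

A larger ratio $\alpha/\beta$ penalizes large weights more strongly and gives a smoother network response, which is how this kind of training is generally understood to limit overfitting.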
As mentioned above, a three-layer network with one hidden layer is found to be sufficient for this study. A suitable neural network needs to have high accuracy and good correlation on both the training and testing sets. However, it is difficult to select the best network architecture: a model may describe the training set very well and yet predict new data in the testing set very poorly. This phenomenon is generally called overfitting.
Since 20 variables have been fixed as the input and 2 properties as the output, the number of units in the hidden layer largely determines the architecture and performance of the networks. We determine the optimal number of hidden units by comparing the predictive accuracy of different networks on the testing set. Here, we use the root mean square error (RMSE) and the correlation coefficient (R) as the error statistics. The RMSE measures the deviation between the original values and the predicted ones, and R indicates the strength of the correlation between them. They are calculated using the following equations:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - f(x_i)\right)^2} \tag{4}$$
$$R = \sqrt{1 - \frac{\sum_{i=1}^{n}\left(y_i - f(x_i)\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}} \tag{5}$$
where $n$ is the total number of data, $y_i$ the original values, $f(x_i)$ the predicted ones, and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ [26]. When the predicted values deviate little from the original ones, the RMSE is small and R is close to 1.
Increasing the number of hidden units makes the model more complex. For both YS and UTS, R increases and RMSE decreases on the training set as units are added, indicating that a more complicated model describes the training set better. However, a high degree of complexity is likely to cause overfitting and make the model unsuitable for the unseen data in the testing set. As shown in Figure 3, the prediction for the testing set first improves and then becomes slightly worse as the number of units in the hidden layer increases. Hence, we choose the optimum number of units from the RMSE and R of the testing set, and Figure 3 shows it to be 8 for YS and 11 for UTS. The architectures of the models used in the following work are [20-8-1] for YS and [20-11-1] for UTS.
The YS/UTS ratio is also an important parameter of tensile properties that determines the reliability of metallic materials [44].
We also establish a network for the YS/UTS ratio and observe similar behavior in Figure 3c. Its optimum architecture is [20-6-1].
To verify the accuracy of the established models, besides the RMSE and R, we use another goodness-of-fit statistic, the mean absolute percentage error (MAPE) [26], which measures the error between the predicted and original values relative to the original values. The smaller the MAPE, the higher the predictive accuracy. The MAPE is calculated with the following equation:
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - f(x_i)}{y_i}\right| \tag{6}$$
Table 2 shows the values of MAPE, RMSE and R of the ANN models for the three tensile properties, YS, UTS and YS/UTS, on the training and testing sets with the optimal hidden units. The predictions for the training set are generally better than those for the testing set, which is expected because a model predicts known data better than unseen data. However, for the testing set, the values of MAPE are less than 6% and the values of R are above 0.86 for YS, UTS and YS/UTS, indicating good prediction performance of the models. It is worth noting that the value of R for UTS is nearly 0.99, indicating that the present model describes and predicts UTS more accurately than YS and YS/UTS. Figure 4 shows the original and predicted values of UTS, YS and YS/UTS for the training and testing sets. Good performance of the models for UTS, YS and YS/UTS is observed for both sets.
To verify the ability of the established models to predict unseen data outside the database, some new tensile data for ASS from the Fatigue Data Sheet of MatNavi are tested. The details of these data are listed in Table A3 of Appendix A. The prediction results are shown in Figure 5 and the statistical parameters are presented in Table 3.
It can be seen from Figure 5 and Table 3 that the models perform well for UTS and YS, with R ≥ 0.95. This means that the models have a good ability to predict unknown data. Moreover, the models predict UTS relatively better than YS for these new data, which is similar to the result for the original data mentioned above.
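The three error statistics used in this section (RMSE, R and MAPE) can be sketched in a few lines of Python. The block assumes that R is computed from the ratio of the residual to the total sum of squares; the function names and the sample strengths are hypothetical and are not taken from the database.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between original and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(y_true, y_pred)) / n)

def correlation_r(y_true, y_pred):
    """R from residual and total sums of squares (assumed definition)."""
    mean_y = sum(y_true) / len(y_true)
    sse = sum((y - f) ** 2 for y, f in zip(y_true, y_pred))
    sst = sum((y - mean_y) ** 2 for y in y_true)
    # Clamp at zero so a model worse than the mean does not raise a math error.
    return math.sqrt(max(0.0, 1.0 - sse / sst))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    n = len(y_true)
    return 100.0 / n * sum(abs((y - f) / y) for y, f in zip(y_true, y_pred))

# Hypothetical strengths in MPa, for illustration only.
measured = [200.0, 300.0, 400.0]
predicted = [210.0, 290.0, 400.0]
print(rmse(measured, predicted))           # about 8.165
print(correlation_r(measured, predicted))  # about 0.995
print(mape(measured, predicted))           # about 2.78
```

Because MAPE divides by the original value, it weights an error of 10 MPa more heavily at low strength than at high strength, which is why it complements RMSE.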

Mean Impact Value
Machine learning can not only make accurate predictions but also be used to analyze the effect of each single variable on the corresponding properties. The mean impact value (MIV) method has been widely used for quantitative feature analysis in machine learning applications to explore the relative importance of each input variable for the improvement of the prediction performance [45]. The MIV algorithm proceeds as follows:
1. Build two new datasets by varying the magnitude of one variable by ±10% in the original training set;
2. Input the two new datasets to the model as simulation samples and obtain two sets of predicted results;
3. Calculate the difference between the two predicted results, called the impact value (IV);
4. Average the IV over all samples in the original training set to obtain the MIV of that variable;
5. Repeat the above steps in turn to obtain the MIV of each independent variable.
Note that the sign of the MIV indicates a positive or negative effect, and its magnitude indicates the intensity of the influence.
The MIV of each variable is calculated and shown in Figure 6. The results show that test temperature and Ni content are the two most important factors for both YS and UTS. The tensile properties of steels generally depend strongly on the test temperature, and in practice they often need to be measured at several test temperatures. Moreover, for UTS, besides the test temperature, the importance of Ni and Cr contents is in good agreement with traditional metallurgical theories and engineering practice: Ni and Cr are intentionally added to ASS in large quantities to improve the tensile properties and high-temperature oxidation resistance. In addition, Ti and Mo contents and solution treatment temperature are also strong indicators of YS, while Cr and Mn contents are highly related to UTS. Note that the MIV of test temperature is much greater for UTS than for YS, which means that the test temperature has a much stronger effect on UTS than on YS. The sign of the correlation between the properties and each feature is also shown in Figure 6. The elements Cr, V, Ti, Mo, Nb, C and Mn appear positively correlated with YS and UTS, which is consistent with the theories of precipitation strengthening and solid solution strengthening [2,46]. Cr, V, Ti, Mo and Nb can form strong carbides or nitrides that provide second-phase precipitation strengthening. Elements such as C, N and Mn can dissolve in the matrix and lead to solid solution strengthening. In addition, S content, test temperature and solution treatment time all exhibit negative influences on YS and UTS.
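The MIV steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions: `model` is any callable that maps one list of input values to a single prediction, and the toy linear model and sample values are hypothetical stand-ins for a trained network and the training set.

```python
def mean_impact_values(model, samples, delta=0.10):
    """Return one MIV per input variable for `model` over `samples`."""
    n_features = len(samples[0])
    mivs = []
    for j in range(n_features):
        total = 0.0
        for row in samples:
            increased = list(row)
            decreased = list(row)
            increased[j] = row[j] * (1.0 + delta)   # variable j raised by 10%
            decreased[j] = row[j] * (1.0 - delta)   # variable j lowered by 10%
            total += model(increased) - model(decreased)  # impact value (IV)
        mivs.append(total / len(samples))           # mean over all samples
    return mivs

def toy_model(x):
    """Hypothetical stand-in for a trained network."""
    return 2.0 * x[0] - 3.0 * x[1]

samples = [[1.0, 2.0], [3.0, 4.0]]
mivs = mean_impact_values(toy_model, samples)
print(mivs)  # roughly [0.8, -1.8]
```

In this toy case the second variable has the larger absolute MIV and a negative sign, so it would be read as the more influential variable with a negative effect.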

Influence Trends of Variables
As discussed above, the present models show good prediction performance for YS and UTS, so they can be used to predict the influence trends of some important variables on these properties. Here, the effects of some typical elements and treatment conditions are examined. To investigate the effect of a variable, a new dataset is built in which this variable is varied between its minimum and maximum values while the other variables are fixed at their averages over the original database, which are listed in Table A1 of Appendix A. Figure 7 shows the effect of the typical elements C, Cr, Ni, Ti, Nb and V on UTS and YS. Both UTS and YS generally show positive correlations with the contents of C, Cr, Ti, Nb and V, while they exhibit a negative correlation with the Ni content. The results are in good agreement with the above MIV analysis: C and Cr are added to stabilize the austenite and to form interstitial solid solution or carbides dispersed in the matrix [2,46,47]. Note that YS initially decreases when the C content is below 0.05%. This is probably due to insufficient data, which limits the accuracy of the model. Moreover, the decrease is small and the overall tendency is still increasing, so the model gives a reasonable prediction of the effect of C content on YS within a certain error tolerance. The microalloying elements Ti, Nb and V can form stable second-phase carbides that prevent austenite grain coarsening and hinder dislocation motion, thereby strengthening ASS. Previous experimental results show that Nb and Ti can improve the tensile properties of ASS [48]. As shown in Figure 7c, Ni, a deliberately added element, has a strong negative correlation with the tensile strengths and gives ASS good plasticity and ductility for subsequent processing [49].
It is known that solution treatment can disperse second-phase carbonitrides uniformly in the matrix, which increases second-phase strengthening. However, continued solution treatment causes grain coarsening and a reduction of crystal defects in ASS, so the effect of second-phase strengthening is gradually offset and the strength of ASS ultimately decreases [50-52].
In Figure 8c, both UTS and YS decrease as the test temperature (Tt) increases. When the temperature is close to 1300 K, the predicted UTS falls to nearly 0 and even below YS. This is clearly contrary to established theory; as with the anomalous low-carbon trend in YS noted above, the probable reason is a lack of sufficient data. Comparing the predicted results in Figures 7 and 8, the test temperature is the most influential variable for YS and UTS, producing the largest variations in tensile properties, especially for UTS. This is also consistent with the conclusion of the MIV analysis above.
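The single-variable sweep used for these influence trends (one variable varied across its range, all others held at their averages) can be sketched as below. The function name `sweep_variable`, the toy model and the numbers are hypothetical; a trained network would take the place of `toy_model`.

```python
def sweep_variable(model, means, index, lo, hi, steps=5):
    """Vary input `index` from lo to hi while other inputs stay at `means`."""
    results = []
    for k in range(steps):
        value = lo + (hi - lo) * k / (steps - 1)   # evenly spaced values
        row = list(means)                          # other variables at their averages
        row[index] = value
        results.append((value, model(row)))
    return results

def toy_model(x):
    """Hypothetical stand-in for a trained network."""
    return x[0] + 10.0 * x[1]

means = [1.0, 2.0]   # hypothetical averages of two input variables
trend = sweep_variable(toy_model, means, index=1, lo=0.0, hi=4.0, steps=5)
for value, prediction in trend:
    print(value, prediction)
```

Holding the other inputs at their averages isolates one variable, but it also means the curve is evaluated at input combinations that may be sparsely represented in the data, which is consistent with the caution about insufficient data noted in the text.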

Conclusions
In this work, we have developed two BPNN models capable of studying and predicting two tensile properties of austenitic stainless steel, YS and UTS, as a function of 20 variables including chemical composition, heat treatment conditions and test temperature. The accuracy of the established models is evaluated with three statistical parameters, RMSE, R and MAPE. The results indicate that the models not only have high predictive accuracy on both the training and testing sets, but also perform well on some unknown data. To analyze the effect of each variable on YS and UTS, we use the MIV method and predict the influence trends of several important variables with the established models. The results correctly reflect the positive and negative correlations between the tensile properties and the features, which are consistent with established metallurgical theories.
Compared with experiments and other models, the present models are able to accurately predict the tensile properties of ASS when all features are known. Based on the models, the test temperature and Ni content are found to be the two most important factors for both YS and UTS. This work is helpful for the preparation and development of new ASS in the future.

Conflicts of Interest:
The authors declare no conflict of interest.
Appendix A