Aircraft Engine Prognostics Based on Informative Sensor Selection and Adaptive Degradation Modeling with Functional Principal Component Analysis

Engine prognostics are critical to improving the safety, reliability, and operational efficiency of an aircraft. With the development of sensor technology, multiple sensors are embedded or deployed to monitor the health condition of the aircraft engine. Thus, the challenge of engine prognostics lies in how to model and predict future health by appropriately utilizing this sensor information. In this paper, a prognostic approach is developed based on informative sensor selection and adaptive degradation modeling with functional data analysis. The presented approach selects sensors based on metrics and constructs a health index (HI) to characterize engine degradation by fusing the selected informative sensors. Next, the engine degradation is adaptively modeled with the functional principal component analysis (FPCA) method, and future health is prognosticated using Bayesian inference. The prognostic approach is applied to run-to-failure data sets from the C-MAPSS test-bed developed by NASA. Results show that the proposed method can effectively select the informative sensors and accurately predict the complex degradation of the aircraft engine.


Introduction
As the heart of an aircraft, the engine consists of several subsystems with millions of parts, and its health condition directly impacts the operation and safety of the whole aircraft system. While the reliability of aero-engines has improved over the years, factors such as fatigue and wear will inevitably cause the health condition of an engine to degrade with usage, reducing the overall performance of the powered aircraft [1]. Therefore, a number of sensors of varying types have been widely used to monitor the health degradation of aircraft engines, which creates a multi-sensor environment for operational health analysis and maintenance decision-making. Over the past decades, prognostics, which utilizes monitoring sensor information to predict future health and estimate the remaining useful life (RUL) before failure/end-of-life, has attracted increasing attention from academic researchers and industrial operators [2]. For its potential in enhancing reliability and maintenance efficiency while [...] based on logistic regression with penalization regularization for aircraft engine health prognostics. In [25], Liu et al. proposed an improved permutation entropy method for the selection of informative sensors. To identify multiple stages before failure, Chehade et al. studied informative sensor selection and fusion using statistical hypothesis testing [26]. In addition, the task of informative sensor selection was integrated with statistical modeling in [23] and [27] for engine prognostics.
Through fusion of the selected informative sensors into a composite health index (HI), many statistical methods have been proposed for prognostics of complex systems, including aircraft engines. However, most of the statistical approaches are based on parametric models, such as a linear model [16,23,28], an exponential model [17,27], or a quadratic model [24]. These methods are effective when the engine degradation pattern can be described with a specific model, but there are numerous cases where the degradation trajectory may not be appropriately fitted by a parametric model [1,3,16]. Therefore, a more flexible degradation modeling and prognosis technique is needed for practical applications. Furthermore, the existing literature on the selection of informative sensors focuses mainly on the relevance of a sensor to the degradation process. While the metrics of monotonicity [10,25], correlation, and robustness [2,29] have been commonly utilized to evaluate the relevance of sensors for prognostics, few efforts consider the degree to which the sensors of a population of systems have the same underlying shape [24]. Specifically, for a fleet of engines operating under one operational condition and with one failure mode, it is highly desirable to select as the informative sensor one that demonstrates a consistently increasing or decreasing trend for all the engines. To address these limitations, a statistical method with informative sensor selection is proposed for prognostics of the aircraft engine in this paper. In the presented method, a new metric that evaluates the degree of consistent trend of a sensor for a population of systems is proposed and used to select informative sensors to construct the engine HI. Then, the functional principal component analysis (FPCA) method is applied to adaptively model the engine HI and is combined with Bayesian inference for future prognostics.
The remainder of this paper is organized as follows. Section 2 briefly introduces the related theories. Section 3 presents a detailed description of the proposed prognostic approach. Section 4 demonstrates the developed method with a case study on the degradation of a turbofan engine. Section 5 discusses some key issues of the proposed approach. Finally, Section 6 summarizes the research and outlines future work.

Related Theories
In this section, the theories utilized in this study are briefly introduced.

Basics of FPCA
Functional data analysis concerns the statistical analysis of data that take the form of functions. Functional data are intrinsically infinite-dimensional, and thus dimension reduction based on some kind of basis function is crucial for the analysis of such data. Unlike parametric basis functions such as B-spline, Fourier, and wavelet functions, FPCA derives the basis functions directly from the data and has been a useful tool for functional data analysis [30-32].
In functional data analysis, the given data $Y_i(t)$ ($i = 1, 2, \ldots, n$) are assumed to be independent and identically distributed (i.i.d.) realizations of a stochastic process $Y(t)$ that is in $L^2$ and defined on the interval $[0, \Gamma]$. The expectation and covariance function of $Y(t)$ are respectively $\mu(t) = E(Y(t))$ and $G(s, t) = E((Y(s) - \mu(s))(Y(t) - \mu(t)))$, where $E(\cdot)$ is the expectation operator. Under mild assumptions, Mercer's theorem implies that the spectral decomposition of $G(s, t)$ leads to:
$$G(s, t) = \sum_{k=1}^{\infty} \lambda_k \phi_k(s) \phi_k(t) \tag{1}$$
where $\lambda_1 \geq \lambda_2 \geq \ldots$ are the ordered eigenvalues and $\phi_k(t)$ are the corresponding eigenfunctions or functional principal components (FPCs). These terms are obtained by solving the following eigen-decomposition problem:
$$\int_0^{\Gamma} G(s, t)\, \phi_k(s)\, ds = \lambda_k \phi_k(t) \tag{2}$$
Then the Karhunen-Loève theorem states that the functional data $Y_i(t)$ ($i = 1, 2, \ldots, n$) are represented by the FPCs of the covariance function of the underlying stochastic process as in Equation (3). In the statistical literature, this method has been coined FPCA:
$$Y_i(t) = \mu(t) + \sum_{k=1}^{\infty} \varepsilon_{ik} \phi_k(t) \tag{3}$$
where $\varepsilon_{ik}$ are the FPC scores calculated as in Equation (4), and $\varepsilon_{ik}$ are random variables that are independent across $i$ and uncorrelated across $k$ with $E(\varepsilon_{ik}) = 0$ and $E(\varepsilon_{ik}^2) = \lambda_k$:
$$\varepsilon_{ik} = \int_0^{\Gamma} \left( Y_i(t) - \mu(t) \right) \phi_k(t)\, dt \tag{4}$$
Based on the above analysis, FPCA expresses the functional data as an infinite sum of FPC terms. Each FPC independently depicts one mode of variation in the functional data, thus the variance (Var) of the data $Y_i(t)$ ($i = 1, 2, \ldots, n$) is the sum of all the eigenvalues of the covariance function as follows:
$$\mathrm{Var} = \int_0^{\Gamma} E\left( (Y(t) - \mu(t))^2 \right) dt = \sum_{k=1}^{\infty} \lambda_k \tag{5}$$
However, in practice only a small number of eigenvalues are significantly nonzero. For eigenvalues that are approximately zero, the corresponding FPC scores will also be approximately zero. Therefore, $Y_i(t)$ ($i = 1, 2, \ldots, n$) can be approximated by the first $K$ terms of the expansion as:
$$Y_i(t) \approx \mu(t) + \sum_{k=1}^{K} \varepsilon_{ik} \phi_k(t) \tag{6}$$
To ensure that sufficient variance of the functional data is explained by the truncated expansion, $K$ is usually chosen using the fraction of variance explained (FVE) method as in Equation (7), and the threshold $\eta$ is often a value higher than 85% [30]:
$$\mathrm{FVE}(K) = \frac{\sum_{k=1}^{K} \lambda_k}{\sum_{k=1}^{\infty} \lambda_k} \geq \eta \tag{7}$$
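The FPCA steps above (eigen-decomposition of the covariance function, score computation, and truncation by FVE) can be illustrated numerically. The following is only a minimal sketch under simplifying assumptions that are not part of the paper: all curves are observed without noise on a common, equally spaced grid, the covariance is the plain sample covariance, and integrals are replaced by Riemann sums. The function name `fpca` and the synthetic data are hypothetical.

```python
import numpy as np

def fpca(Y, t, fve_threshold=0.95):
    """FPCA sketch for n curves sampled on a common, equally spaced grid.

    Y: (n, m) array of curves, t: (m,) grid (assumed equally spaced).
    Returns the mean, the K leading eigenvalues and eigenfunctions,
    the FPC scores, and K chosen by the FVE rule.
    """
    Y = np.asarray(Y, dtype=float)
    dt = t[1] - t[0]                      # grid spacing used as quadrature weight
    mu = Y.mean(axis=0)                   # sample mean function
    Yc = Y - mu
    G = Yc.T @ Yc / Y.shape[0]            # sample covariance surface G(s, t)

    # Discretized eigenproblem: integral of G(s, t) phi(s) ds = lambda phi(t)
    evals, evecs = np.linalg.eigh(G * dt)
    order = np.argsort(evals)[::-1]       # eigenvalues in descending order
    evals = np.clip(evals[order], 0.0, None)
    phis = evecs[:, order] / np.sqrt(dt)  # rescale so that sum(phi_k**2) * dt = 1

    # Smallest K whose fraction of variance explained reaches the threshold
    fve = np.cumsum(evals) / evals.sum()
    K = min(int(np.searchsorted(fve, fve_threshold)) + 1, len(evals))

    scores = Yc @ phis[:, :K] * dt        # Riemann-sum version of the score integral
    return mu, evals[:K], phis[:, :K], scores, K

# Synthetic check: two modes of variation with score variances near 4 and 1
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
modes = np.vstack([np.sqrt(2) * np.sin(2 * np.pi * t),
                   np.sqrt(2) * np.cos(2 * np.pi * t)])
true_scores = rng.normal(size=(200, 2)) * np.array([2.0, 1.0])
Y = 1.0 - t + true_scores @ modes
mu, lam, phis, scores, K = fpca(Y, t)
Y_hat = mu + scores @ phis.T              # truncated reconstruction
```

In this synthetic case the first mode alone explains roughly 80% of the variance, so a 95% threshold keeps two components and the truncated expansion reproduces the curves. Sparse or noisy observations would instead need the smoothing-based estimators described in Section 3.3.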

Bayesian Inference
Bayesian inference is based on Bayes' theorem, which allows one to formally incorporate prior knowledge or experience into computing statistical probabilities. In this way, the inference takes the accumulated information as a prior, and then uses the observed data to update the prior to a posterior. Currently, Bayesian inference has been widely applied in reliability engineering [8,16,21]. For a continuous random variable $y$ and related random parameters $\theta$, Bayes' theorem states that their joint probability can be written in two ways as [33]:
$$p(\theta, y) = p(y \mid \theta)\, p(\theta) = p(\theta \mid y)\, p(y) \tag{8}$$
Eliminating the joint probability $p(\theta, y)$ and rearranging, we obtain Bayesian inference for the parameters $\theta$:
$$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)} \propto p(y \mid \theta)\, p(\theta) \tag{9}$$
In the literature, $p(\theta \mid y)$ and $p(\theta)$ are respectively defined as the posterior and prior distributions of the random parameters $\theta$, and $p(y \mid \theta)$ is denoted as the likelihood function. The likelihood $p(y \mid \theta)$ can be understood as the probability of observing samples of the random variable $y$ conditioned on the parameters $\theta$. Based on Equation (9), Bayesian inference updates the distribution of the parameters from the prior to the posterior with the likelihood function of the observed data. Thus, the uncertainty in the parameters can be reduced and the accuracy of further inference using the parameters can be improved.

The Proposed Method
This section is devoted to introducing the proposed method. The prognostic approach consists of informative sensor selection, HI construction, degradation modeling, and health prognostics. Flowchart of the method is shown in Figure 1.

Informative Sensor Selection
In this subsection, four metrics are presented to select informative sensors for complex systems such as aircraft engines with a number of monitoring sensors. In practice, system health deterioration is inherently a stochastic process. Metrics for informative sensor selection should be defined after separating a sensor measurement into its trend and its residual. This decomposition can be carried out locally with smoothing methods or globally with parametric or nonparametric methods. To make full use of the sensor measurements for modeling the whole degradation process and to preserve local characteristics of the degradation, local smoothing methods may be a better option. In this paper, the trend of a sensor measurement is obtained using the moving average method. With the sensor trend and residual, three metrics for degradation feature selection were defined in [29] to improve prognostics effectiveness and efficiency. For the sake of completeness and clarity, the three metrics are reformulated for sensor selection as:
$$\begin{aligned}
\mathrm{Corr}(S) &= \frac{\left| N \sum_{j} s_T(t_j)\, t_j - \sum_{j} s_T(t_j) \sum_{j} t_j \right|}{\sqrt{\left[ N \sum_{j} s_T(t_j)^2 - \left( \sum_{j} s_T(t_j) \right)^2 \right] \left[ N \sum_{j} t_j^2 - \left( \sum_{j} t_j \right)^2 \right]}} \\
\mathrm{Mon}(S) &= \frac{1}{N - 1} \left| \sum_{j=1}^{N-1} \delta\left( s_T(t_{j+1}) - s_T(t_j) \right) - \sum_{j=1}^{N-1} \delta\left( s_T(t_j) - s_T(t_{j+1}) \right) \right| \\
\mathrm{Rob}(S) &= \frac{1}{N} \sum_{j} \exp\left( - \left| \frac{s_R(t_j)}{s(t_j)} \right| \right)
\end{aligned} \tag{10}$$
where $s(t_j)$ is the measurement of the sensor $S$ at the time $t_j$ ($j = 1, 2, \ldots, N$) with the trend $s_T(t_j)$ and residual $s_R(t_j)$, and $\delta(\cdot)$ is the simple unit step function. Among the three metrics, correlation (Corr) measures the linearity between the sensor of interest and the usage time, monotonicity (Mon) assesses the consistently increasing or decreasing trend of the sensor, and robustness (Rob) reflects the tolerance of the sensor to noise and outliers. Sensors with higher scores should be selected as informative sensors, because the three metrics are all positively correlated with the relevance of a sensor to the system degradation.
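To make the trend/residual decomposition and the three single-engine metrics concrete, the sketch below computes them for one sensor series. It is an illustration under assumptions rather than the authors' implementation: Corr is taken as the absolute Pearson correlation between the trend and time, Mon as the absolute difference between the numbers of positive and negative trend increments divided by N - 1, and Rob as the mean of exp(-|residual / measurement|). The window length, the edge padding, and the function names are hypothetical choices.

```python
import numpy as np

def moving_average(s, window=5):
    """Centered moving average (odd window) with edge padding; same length as s."""
    pad = window // 2
    padded = np.pad(s, pad, mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")

def sensor_metrics(s, t, window=5):
    """Return (Corr, Mon, Rob) for one sensor series s observed at times t."""
    s = np.asarray(s, dtype=float)
    t = np.asarray(t, dtype=float)
    trend = moving_average(s, window)     # s_T
    resid = s - trend                     # s_R
    # Corr: absolute linear correlation between the trend and the usage time
    corr = abs(np.corrcoef(trend, t)[0, 1])
    # Mon: |#positive increments - #negative increments| / (N - 1)
    d = np.diff(trend)
    mon = abs(np.sum(d > 0) - np.sum(d < 0)) / (len(s) - 1)
    # Rob: average of exp(-|residual / measurement|); assumes nonzero measurements
    rob = np.mean(np.exp(-np.abs(resid / s)))
    return corr, mon, rob

# Synthetic sensor: slow linear drift plus small noise
rng = np.random.default_rng(1)
t = np.arange(1.0, 201.0)
s = 100.0 + 0.05 * t + rng.normal(scale=0.05, size=t.size)
corr, mon, rob = sensor_metrics(s, t, window=11)
```

For a population of engines, these scores would be computed per engine and then summarized, for example by their mean and standard deviation.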
From Equation (10), it can be concluded that these metrics are defined on the sensor from one individual system, and the same sensor from other systems of the population is not accounted for. For a population of systems (such as a group of aircraft engines) that operate under the same condition and with one failure mode, it is highly desirable that the informative sensor has a consistently increasing or decreasing trend for all the systems. Additionally, a sensor with a small variation of measurements upon system failure and a large range until failure is also desired for prognostics. Thus, a new metric of predictability (Pre) is formulated for sensor selection as:
$$\mathrm{Pre}(S) = \exp\left( - \frac{\mathrm{std}(s_f)}{\left| \mathrm{mean}(s_f) - \mathrm{mean}(s_s) \right|} \right) \tag{11}$$
where $s_f$ and $s_s$ are respectively the measurements at the failure and start instants of the sensor $S$ for a population of systems.
The new metric Pre takes the failure and start values of a sensor from the entire population into account, and thus the trend consistency of a sensor among a population is considered. Also, Pre takes values in the range [0, 1], is positively related to the performance of the sensor, and favors sensors with well-clustered failure values and a large range. That is, a sensor with a high Pre score should be selected as an informative sensor.
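A small numerical sketch may help to show how such a population-level score behaves. The form used below, exp(-std(failure values) / |mean(failure values) - mean(start values)|), is an assumption inferred from the description of Pre given here, and the population standard deviation (NumPy's default) is likewise an assumed choice. The function name and the example values are hypothetical.

```python
import numpy as np

def predictability(start_values, failure_values):
    """Assumed form of Pre: exp(-std(s_f) / |mean(s_f) - mean(s_s)|)."""
    s_s = np.asarray(start_values, dtype=float)
    s_f = np.asarray(failure_values, dtype=float)
    spread = np.std(s_f)                          # clustering of failure values
    value_range = abs(np.mean(s_f) - np.mean(s_s))  # population-averaged range
    return float(np.exp(-spread / value_range))

# Sensor A: tight failure values and a large start-to-failure range
pre_a = predictability([0.0, 0.1, -0.1], [10.0, 10.1, 9.9])
# Sensor B: scattered failure values and almost no range
pre_b = predictability([0.0, 0.1, -0.1], [0.2, -0.3, 0.4])
```

Under this assumed form, sensor A scores close to 1 while sensor B scores close to 0, which matches the stated preference for well-clustered failure values and a large range.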
For prognostic parameter selection, Coble et al. presented three metrics in [24], and the prognosability (Pro) metric is quite similar to the proposed Pre. However, the two metrics differ in how they measure the range of a sensor. In the metric Pro, the range is the mean range from start to failure for a population of systems, so sensors with well-clustered failure values and a large range are encouraged by this measure. In Pre, as in Equation (11), the sensor measurements at the start and failure instants of a population are respectively averaged to give the range of the sensor. The metric Pre thus selects sensors with both well-clustered failure and start values as well as a large range.
To select the informative sensors for aircraft engines, two steps are carried out using the presented goodness metrics. In the first step, binary-value and constant-value sensors are eliminated by simple visual inspection, because these sensors carry no information for prognosis. In the second step, the remaining sensors are evaluated by the four metrics simultaneously, because, as can be seen from the definitions, any single goodness metric only partially measures the suitability of a sensor for prediction, and sensor selection based on only one metric would be biased. The informative sensors selected with the drop-out strategy are then fused to construct an HI for health prognosis of aircraft engines.

Health Index Construction
For data-driven prognostics, extraction of the health signatures and background knowledge from massive training/testing sensor data is required. Based on the selected informative sensors, the HI construction is considered in this subsection.
The transformation of the selected informative sensors into a one-dimensional HI is a process of information fusion. It provides a general measure to characterize the health condition of a system and makes the degradation of different systems from the same population similar. The HI can be constructed using many information fusion techniques, such as the linear data transformation method [13,17,22], PCA and the Euclidean distance measure [20], and logistic regression [6,11]. For its effectiveness and ease of use, the linear data transformation method is employed to construct the HI for aero-engines in this paper.
Suppose the selected informative sensors are d-dimensional, and the two sensor data sets that represent the failed and healthy states of the aircraft engine are the $M_0 \times d$ matrix $Q_0$ and the $M_1 \times d$ matrix $Q_1$, where $M_0$ and $M_1$ are, respectively, the data sizes for the failed and healthy states. Generally, data sets for the healthy and failed states can be collected from the training data sets at the beginning and at the end of the run-to-failure tests. With these two data sets, a transformation matrix $T$ of size $d \times 1$ is obtained, and then the one-dimensional HI $y$ for any d-dimensional matrix $Q$ is:
$$y = Q \cdot T = Q \left( Q_{\mathrm{off}}^{\top} Q_{\mathrm{off}} \right)^{-1} Q_{\mathrm{off}}^{\top} S_{\mathrm{off}} \tag{12}$$
where $Q_{\mathrm{off}} = [Q_0; Q_1]$ is the $(M_0 + M_1) \times d$ matrix that stacks the two data sets row-wise, $S_{\mathrm{off}} = [S_0, S_1]^{\top}$, $S_0$ is a $1 \times M_0$ zero vector, and $S_1$ is a $1 \times M_1$ unity vector. As can be easily derived, the HI obtained by the linear transformation method in Equation (12) varies approximately between 1 and 0. With the usage of the aircraft engine, the HI has a decreasing trend. This HI contains health condition information extracted from the multi-dimensional informative sensors of the monitored aircraft engine. It can be used to construct background health knowledge in the offline process and to further conduct the online prediction process. Although the linear transformation is discussed here and will be employed for the case studies, other information fusion methods can also be used to construct the HI with the selected informative sensors.
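The linear transformation described above amounts to a least-squares regression of the labels 0 (failed) and 1 (healthy) on the sensor vectors. The sketch below illustrates this under assumptions: the least-squares problem is solved with `numpy.linalg.lstsq` rather than by forming the normal equations explicitly, no intercept term is used, and the three-sensor synthetic data and the function name are hypothetical.

```python
import numpy as np

def fit_hi_transform(Q_failed, Q_healthy):
    """Least-squares transformation T mapping failed rows to 0 and healthy rows to 1."""
    Q_off = np.vstack([Q_failed, Q_healthy])               # (M0 + M1, d)
    S_off = np.concatenate([np.zeros(len(Q_failed)),       # labels for failed state
                            np.ones(len(Q_healthy))])      # labels for healthy state
    T, *_ = np.linalg.lstsq(Q_off, S_off, rcond=None)      # solves min ||Q_off T - S_off||
    return T

# Synthetic example with d = 3 sensors and small measurement noise
rng = np.random.default_rng(2)
healthy = np.array([1.0, 2.0, 0.5]) + rng.normal(scale=0.01, size=(50, 3))
failed = np.array([1.5, 1.0, 0.7]) + rng.normal(scale=0.01, size=(50, 3))
T = fit_hi_transform(failed, healthy)
hi_healthy = healthy @ T      # HI values for healthy-state measurements
hi_failed = failed @ T        # HI values for failed-state measurements
```

Applying `Q @ T` to the full sensor history of an engine would then give its HI trajectory, which is expected to drift from about 1 toward about 0 as the engine degrades.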

Degradation Modeling by FPCA
Given that health degradation is a stochastic process with uncertainties from physical degradation dynamics, usage variations, and other effects, the degradation pattern of an aircraft engine is complex and unknown. To adaptively study the aircraft engine degradation, the HI that describes the health deterioration of an engine is treated as discretely sampled values of functional data and is modeled with the FPCA method in this subsection.
Suppose there is a population of aircraft engines; then the degradation pattern of one aircraft engine is an independent realization of the degradation process of the whole population. Based on the FPCA method briefly introduced in Section 2, the degradation of one engine can be represented by the mean function and FPCs derived adaptively from the population stochastic degradation process $X(t)$ ($t \in I$), where $I$ is the service time interval of the population of engines (the observation interval for the engine/system with the longest possible lifetime). Further, because the HI only indirectly reflects the health degradation of the engine, the measurement error is an important aspect that should be accounted for. To be specific, for a population of $n$ aircraft engines with $N_i$ measurements for the $i$-th engine, the degradation of the $i$-th engine is modeled with its HI $y_i(t_{ij})$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, N_i$) by FPCA as:
$$y_i(t_{ij}) = x_i(t_{ij}) + e_{ij} = \mu(t_{ij}) + \sum_{k=1}^{\infty} \varepsilon_{ik} \phi_k(t_{ij}) + e_{ij} \tag{13}$$
where $x_i(t_{ij})$ and $e_{ij}$ are the unobservable degradation and the measurement error of the $i$-th engine at time $t_{ij}$, $x_i(t)$ is the underlying functional data with the mean function $\mu(t)$ and the FPCs $\phi_k(t)$ for the $i$-th engine, and $\varepsilon_{ik}$ are the related FPC scores.
With the FPCA-based degradation modeling in Equation (13), the degradation characteristics of the whole engine population are represented by the mean function and the first $K$ FPCs, while the degradation peculiarity of an engine is captured by its specific FPC scores. The mean function $\mu(t)$ describes the common degrading trend of all the engines in a population, and the first few FPCs reflect the main modes of variation of the population degradation process. As in [27,34], the measurement errors $e_{ij}$ are assumed to be i.i.d. with the normal distribution $N(0, \sigma^2)$ and independent of the FPC scores in this study. Also, note that no pre-specified parametric form needs to be assumed for $\mu(t)$ and the $\phi_k(t)$; they are adaptively derived from the engine HIs as follows.
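For estimation of the $\phi_k(t)$, the covariance function $G(s, t) = E((X(s) - \mu(s))(X(t) - \mu(t)))$ of the engine degradation process $X(t)$ should first be estimated. For the $n$ engines, the following holds:
$$\mathrm{cov}\left( y_i(t_{ij}), y_i(t_{il}) \right) = G(t_{ij}, t_{il}) + \sigma^2 \delta_{jl} \tag{14}$$
where $\delta_{jl}$ is the Kronecker delta function, equal to 1 if $j = l$ and 0 otherwise.
It follows that the diagonal (i.e., $j = l$) of the raw covariance function $V(t_{ij}, t_{il}) = (y_i(t_{ij}) - \hat{\mu}(t_{ij}))(y_i(t_{il}) - \hat{\mu}(t_{il}))$, calculated from the degradation observations with the estimated mean function $\hat{\mu}(t)$, is contaminated with measurement errors. Estimation of the covariance function $G(s, t)$ using local linear surface smoothing should therefore exclude the diagonal:
$$\hat{G}(s, t) = \hat{\beta}_0, \quad (\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2) = \arg\min_{\beta_0, \beta_1, \beta_2} \sum_{i=1}^{n} \sum_{1 \leq j \neq l \leq N_i} \kappa_2\left( \frac{t_{ij} - s}{h_G}, \frac{t_{il} - t}{h_G} \right) \left[ V(t_{ij}, t_{il}) - \beta_0 - \beta_1 (s - t_{ij}) - \beta_2 (t - t_{il}) \right]^2 \tag{15}$$
where $\kappa_2(\cdot)$ is a bivariate kernel function and $h_G$ is the smoothing bandwidth.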
Then eigen-decomposition following Equation (2) is performed on the discretized $\hat{G}(s, t)$ to obtain the eigenvalues $\hat{\lambda}_k$ ($k = 1, 2, \ldots$) and the corresponding FPCs. Spline interpolation is further utilized to obtain the continuous $\hat{\phi}_k(t)$.
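The variance $\sigma^2$ of the measurement errors is related to the diagonal values of $V(t_{ij}, t_{il})$ and $G(s, t)$. To mitigate boundary effects, its estimation is:
$$\hat{\sigma}^2 = \frac{1}{|I_1|} \int_{I_1} \left( \hat{V}(t) - \hat{G}(t) \right) dt \tag{16}$$
where $\hat{V}(t)$ is the local linear smoother of the diagonal values of $V(t_{ij}, t_{il})$, $\hat{G}(t)$ is the diagonal values of $\hat{G}(s, t)$, and the interval $I_1$ is a sub-interval of the observation interval that excludes the boundary regions.
Considering the measurement errors in the degradation model, estimating the FPC scores with Equation (4) will lead to bias. To remedy that, the conditional expectation method proposed by Yao et al. [35] is utilized here to estimate the $\varepsilon_{ik}$:
$$\hat{\varepsilon}_{ik} = \hat{E}(\varepsilon_{ik} \mid y_i) = \hat{\lambda}_k \hat{\phi}_{ik}^{\top} \hat{\Sigma}_{y_i}^{-1} (y_i - \hat{\mu}_i) \tag{17}$$
where $y_i = (y_{i1}, y_{i2}, \ldots, y_{iN_i})^{\top}$ is the engine HI observation vector, $\hat{\mu}_i = (\hat{\mu}(t_{i1}), \ldots, \hat{\mu}(t_{iN_i}))^{\top}$ and $\hat{\phi}_{ik} = (\hat{\phi}_k(t_{i1}), \ldots, \hat{\phi}_k(t_{iN_i}))^{\top}$ are the mean and FPC vectors interpolated from the mean function and FPCs, and
$$\hat{\Sigma}_{y_i} = \left[ \hat{G}(t_{ij}, t_{il}) \right]_{j,l=1}^{N_i} + \hat{\sigma}^2 I_{N_i} \tag{18}$$
With the estimated parameters, the uncontaminated degradation of the $i$-th engine is modeled by FPCA as:
$$\hat{x}_i(t) = \hat{\mu}(t) + \sum_{k=1}^{K} \hat{\varepsilon}_{ik} \hat{\phi}_k(t) \tag{19}$$
Based on the FPCA modeling analysis, the degradation of an engine is statistically represented by the common trend and a few varying terms that reflect the main dynamics of the degradation process. For estimation of the model parameters, all degradation observations of the engine population are pooled and local linear smoothing is needed. In this study, the Gaussian kernel function is applied for both the curve and surface smoothers, and the related smoothing bandwidth is determined by the leave-one-out cross-validation strategy.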
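The conditional expectation estimate of the FPC scores can be sketched in a few lines. This is an illustration under assumptions, not the authors' code: the mean function, FPCs, eigenvalues, and noise variance are taken as already estimated, and the covariance of the observation vector is built from the truncated expansion (Phi Lambda Phi^T + sigma^2 I) instead of a smoothed covariance surface. The function name and the synthetic basis are hypothetical.

```python
import numpy as np

def conditional_scores(y_i, mu_i, Phi_i, lam, sigma2):
    """Conditional expectation of FPC scores given noisy observations.

    y_i: (N,) observed HI, mu_i: (N,) mean at the observation times,
    Phi_i: (N, K) FPCs at the observation times, lam: (K,) eigenvalues,
    sigma2: measurement error variance.
    """
    Lam = np.diag(lam)
    # Covariance of the observation vector (truncated-expansion assumption)
    Sigma_y = Phi_i @ Lam @ Phi_i.T + sigma2 * np.eye(len(y_i))
    # E[eps | y] = Lambda Phi^T Sigma_y^{-1} (y - mu)
    return Lam @ Phi_i.T @ np.linalg.solve(Sigma_y, y_i - mu_i)

# Synthetic engine: two FPCs, known scores, small measurement noise
rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 50)
Phi = np.column_stack([np.sqrt(2) * np.sin(2 * np.pi * t),
                       np.sqrt(2) * np.cos(2 * np.pi * t)])
mu = 1.0 - t
lam = np.array([4.0, 1.0])
sigma2 = 1e-4
eps_true = np.array([1.5, -0.8])
y = mu + Phi @ eps_true + rng.normal(scale=np.sqrt(sigma2), size=t.size)
eps_hat = conditional_scores(y, mu, Phi, lam, sigma2)
```

Because the estimate shrinks the scores toward zero in proportion to the noise level, it remains stable when an engine has only a few noisy observations, which is the situation where direct numerical integration of the score integral is biased.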

Prognostics with Bayesian Inference
For an aero-engine that is still operating, its future health is of significance for condition-based maintenance and health management. Therefore, Bayesian inference is combined with the degradation model for engine prognostics in this subsection.
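With the degradation modeling by FPCA as detailed in the previous subsection, the degradation trajectory of one in-service engine can be described as:
$$x(t) = \mu(t) + \sum_{k=1}^{K} \varepsilon_k \phi_k(t) \tag{20}$$
Assume that we have monitored the in-service engine at a vector of times $t = (t_1, t_2, \ldots, t_H)$ with the observed HI $y_{1:H} = (y_1, y_2, \ldots, y_H)^{\top}$, where $t_H$ denotes the latest monitoring time. Once the mean function $\mu(t)$ and the FPCs $\phi_k(t)$ are estimated from the HI of historical run-to-failure engines of the population as elaborated in Section 3.3, the trend and the main variations of the in-service engine degradation can be obtained by interpolation at its monitoring times. However, the FPC scores $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_K)^{\top}$ remain to be inferred. In practice, system health deterioration is usually a gradual process under the effects of various internal and external environmental factors, and the degradation process can be assumed to be a Gaussian process. Then the uncorrelated FPC scores $\varepsilon_k$ ($k = 1, 2, \ldots, K$) will respectively follow the normal distributions $N(0, \lambda_k)$, with $\lambda_k$ being the ordered eigenvalues of the covariance function of the degradation process. Taking the FPC scores estimated from the historical run-to-failure engines as the prior distribution $p_0(\varepsilon)$ and considering the likelihood function $p(y_{1:H} \mid \varepsilon)$ of observing $y_{1:H}$ [36], we have:
$$p_0(\varepsilon) = N(0, \hat{\Lambda}), \quad \hat{\Lambda} = \mathrm{diag}(\hat{\lambda}_1, \ldots, \hat{\lambda}_K); \qquad p(y_{1:H} \mid \varepsilon) = N\left( \hat{\mu}_{1:H} + \hat{\Phi} \varepsilon, \; \hat{\sigma}^2 I_H \right) \tag{21}$$
where $\hat{\mu}_{1:H}$ and $\hat{\Phi}$, with entry $(h, k)$ equal to $\hat{\phi}_k(t_h)$ ($h = 1, 2, \ldots, H$; $k = 1, 2, \ldots, K$), are respectively interpolated at the monitoring time vector $t = (t_1, t_2, \ldots, t_H)$ from the $\hat{\mu}(t)$ and $\hat{\phi}_k(t)$ estimated from the run-to-failure engines, and $\hat{\sigma}^2$ is the estimated measurement error variance. Based on the Bayesian inference outlined in Section 2.2, the posterior distribution $p(\varepsilon \mid y_{1:H})$ of the FPC scores $\varepsilon$ of the in-service engine can be updated as:
$$p(\varepsilon \mid y_{1:H}) \propto p(y_{1:H} \mid \varepsilon)\, p_0(\varepsilon) \tag{22}$$
From Equation (21), $p_0(\varepsilon)$ and $p(y_{1:H} \mid \varepsilon)$ are conjugate multivariate normal distributions. The posterior $p(\varepsilon \mid y_{1:H})$ can be analytically derived to also follow a multivariate normal distribution as follows:
$$p(\varepsilon \mid y_{1:H}) = N(\tilde{\varepsilon}, \tilde{\Sigma}), \quad \tilde{\Sigma} = \left( \hat{\Lambda}^{-1} + \frac{\hat{\Phi}^{\top} \hat{\Phi}}{\hat{\sigma}^2} \right)^{-1}, \quad \tilde{\varepsilon} = \frac{1}{\hat{\sigma}^2} \tilde{\Sigma} \hat{\Phi}^{\top} \left( y_{1:H} - \hat{\mu}_{1:H} \right) \tag{23}$$
Further inserting the posterior distribution of the FPC scores $\varepsilon$ into Equation (20), a real-time prognosis for the in-service engine based on the latest observations $y_{1:H}$ is:
$$\hat{x}(t) \mid y_{1:H} \sim N\left( \hat{\mu}(t) + \hat{\phi}(t)^{\top} \tilde{\varepsilon}, \; \hat{\phi}(t)^{\top} \tilde{\Sigma}\, \hat{\phi}(t) \right), \quad t > t_H \tag{24}$$
where $\hat{\mu}(t)$ and $\hat{\phi}(t) = (\hat{\phi}_1(t), \ldots, \hat{\phi}_K(t))^{\top}$ are respectively the mean value and FPC vector interpolated from the mean function and the $K$ FPCs at a future time $t$.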
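The conjugate normal update and the resulting prognosis can be sketched as follows. The code assumes the standard result for a zero-mean normal prior with covariance diag(lambda_k) and a normal likelihood with variance sigma^2: the posterior covariance is (Lambda^{-1} + Phi^T Phi / sigma^2)^{-1} and the posterior mean is that covariance times Phi^T (y - mu) / sigma^2. The function names, the analytic basis, and the synthetic engine are hypothetical, and the mean function and FPCs are taken as known rather than estimated.

```python
import numpy as np

def posterior_scores(y_obs, mu_obs, Phi_obs, lam, sigma2):
    """Posterior mean and covariance of FPC scores (conjugate normal update)."""
    precision = np.diag(1.0 / lam) + Phi_obs.T @ Phi_obs / sigma2
    cov = np.linalg.inv(precision)
    mean = cov @ Phi_obs.T @ (y_obs - mu_obs) / sigma2
    return mean, cov

def predict_degradation(mu_t, phi_t, post_mean, post_cov):
    """Predictive mean and variance of the degradation at one future time."""
    return mu_t + phi_t @ post_mean, phi_t @ post_cov @ phi_t

def basis(x):
    """Hypothetical FPCs evaluated at times x; returns an array of shape (len(x), 2)."""
    x = np.atleast_1d(x)
    return np.column_stack([np.sqrt(2) * np.sin(2 * np.pi * x),
                            np.sqrt(2) * np.cos(2 * np.pi * x)])

def mean_fn(x):
    """Hypothetical population mean function."""
    return 1.0 - np.asarray(x, dtype=float)

# In-service engine observed at the first 30 of 50 time points
rng = np.random.default_rng(4)
t = np.linspace(0.0, 1.0, 50)
t_obs = t[:30]
lam = np.array([4.0, 1.0])
sigma2 = 1e-4
eps_true = np.array([1.5, -0.8])
y_obs = (mean_fn(t_obs) + basis(t_obs) @ eps_true
         + rng.normal(scale=np.sqrt(sigma2), size=t_obs.size))

post_mean, post_cov = posterior_scores(y_obs, mean_fn(t_obs), basis(t_obs), lam, sigma2)

# Prognosis at a future time beyond the last observation
t_future = 0.9
phi_f = basis(t_future)[0]
pred_mean, pred_var = predict_degradation(mean_fn(t_future), phi_f, post_mean, post_cov)
truth = mean_fn(t_future) + phi_f @ eps_true
```

As more observations arrive, the data term in the posterior precision grows and the predictive variance shrinks, so the update can simply be repeated with the extended observation vector at each new monitoring time.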

Case Studies and Results Analysis
The main purpose of this section is to demonstrate the validity and performance of the proposed prognostics approach with the case study on an aircraft gas turbine engine.

Sensor Data of Aircraft Engine
The sensor data of the aircraft gas turbine engine degradation are generated from the commercial modular aero-propulsion system simulation (C-MAPSS) developed at NASA [37] and published online for research investigations. Each time series signal represents a different degradation instance of the dynamic simulation of the same engine population and consists of multi-sensor measurements. For each cycle of a degradation instance, 21 sensor measurements, as listed in Table 1, were recorded. The multi-sensor data were contaminated with noise, and each engine started with a different initial health condition and manufacturing variation, which were unknown. With limited knowledge about the true physical model, we relied solely on the multi-sensor data from the training and testing engines to understand the engine degradation process. The data can be downloaded from the NASA data repository [38].
In particular, the data set FD001, which contains 100 training engines and 100 testing engines, is used in this study. The multi-sensor measurements for each training engine were collected until failure, whereas the multi-sensor measurements for each in-service (testing) unit were truncated at some random point before its failure. Although these engines were simulated under one operational condition and one failure mode, engines with multiple usage conditions and failure modes may also be analyzed using the proposed method with the necessary processing as in [20,39].

Results and Analysis on Informative Sensor Selection
Before evaluation and selection, a rough screening of the 21 sensors tells that the sensors T2, P2, P15, epr, farB, htBleed, Nf_dmd, and PCNfR_dmd are of binary or constant value and are excluded. With the remaining 13 sensors of the 100 training engines, the discussed four metrics and the two metrics of Pro and trendability (Tre) in [24] are calculated as in Table 2. For metrics of Mon, Corr, and Rob that are defined on sensors from one engine, statistics of the mean and standard deviation (std) are obtained based on results of the 100 engines. From the results, four sensors Nf, Nc, NRf, and NRc were found to be of significant difference from the other nine sensors. For the two sensors Nf and NRf, they have high variation in metric Corr and relative low value in metric Pre. Thus, these two sensors should not be selected as the informative sensors based on the analysis in Section 3.1. In addition, the two sensors score low values of metric Pro proposed in [24], verifying the effectiveness of the proposed metric Pre. Nevertheless, the metric Tre proposed also in [24] to characterize the trendability of a population of sensors scores both high for the two sensors. To make sense of interpretation, signals of the two sensors for the 100 training engines are shown in Figure 2. It shows that the two sensors are not correlated well with the usage of the engine and their ranges are both very small (actually less than 0.5) from start to failure, which validates these sensors have high variation in metric Corr and score low value in metric Pre and Pro. In previous studies using the C-MAPSS data, including [9,20], the above two sensors were also wiped out as non-informative sensors.  From the results in Table 2, the sensors Nc and NRc have high variations in both metric Mon and Corr, and their Pre scores are even lower. Also, the Pro and Tre scores of the two sensors are relatively lower as compared to other sensors. 
Thus, the two sensors Nc and NRc also should not be selected as informative sensors. To interpret the evaluation results, signals of the two sensors for the 100 engines are given in Figure 3. The two sensor signals show a very inconsistent degrading trend across the population of training engines, even though these engines operate under the same condition and with one failure mode. The signals exhibit increasing, decreasing, and oscillating patterns, which explains the high variations in the metrics Mon and Corr, as well as the low scores in the metrics Pre, Pro, and Tre.

For the three metrics Pre, Pro, and Tre, which consider the trend consistency of a sensor over a population operating under the same condition and with one failure mode, some comparisons can be made from Table 2. Although the metric Tre in [24] effectively sifts out sensors without a consistent trend for the population of engines by assigning them a lower score, it cannot sift out the small-range sensor, whose correlation with the degradation is poor. The proposed metric Pre and the metric Pro in [24] are both effective in eliminating these two kinds of sensors by evidently lower scores. Furthermore, Pre penalizes the first kind of sensors with a lower score than Pro does, so it is much easier to identify non-informative sensors with the proposed Pre. Sensors for the training engines from FD003 with the same failure mode as FD001 (i.e., HPC failure) are also evaluated and the results are in Table 3. Similar conclusions can be drawn as for FD001, which further supports the validity of the proposed metric Pre.

According to Table 2, there is no significant difference among the evaluation results of the other nine sensors, so they are selected as the informative sensors for the HI construction of the engines. To contrast with the non-informative sensors, signals of two selected sensors are illustrated in Figure 4.
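To make the notion of trend consistency concrete, the following minimal Python sketch computes two per-engine trend measures and summarizes them over a population by their mean and standard deviation. The definitions used here (a sign-based monotonicity and the Pearson correlation with the cycle index) are common choices and are assumptions; they are not necessarily identical to the Mon and Corr metrics of this paper, and the synthetic signals are purely illustrative.

```python
import numpy as np

def monotonicity(signal):
    """Sign-based monotonicity: |#increases - #decreases| / (n - 1).
    Usually applied to smoothed signals, since raw noise drives it toward 0."""
    diffs = np.diff(np.asarray(signal, dtype=float))
    return abs(np.sum(diffs > 0) - np.sum(diffs < 0)) / len(diffs)

def time_correlation(signal):
    """Signed Pearson correlation between a signal and its cycle index."""
    signal = np.asarray(signal, dtype=float)
    cycles = np.arange(len(signal))
    return np.corrcoef(cycles, signal)[0, 1]

def population_summary(signals, metric):
    """Mean and standard deviation of a per-engine metric over a population."""
    values = np.array([metric(s) for s in signals])
    return values.mean(), values.std()

# Hypothetical populations of 20 engines each (not C-MAPSS data)
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
consistent = [t**2 + 0.05 * rng.standard_normal(t.size) for _ in range(20)]
inconsistent = [sign * t**2 + 0.05 * rng.standard_normal(t.size)
                for sign in [1] * 10 + [-1] * 10]

corr_consistent = population_summary(consistent, time_correlation)
corr_inconsistent = population_summary(inconsistent, time_correlation)
print("consistent sensor   (mean, std):", corr_consistent)
print("inconsistent sensor (mean, std):", corr_inconsistent)
```

In this toy setting, a sensor whose engines trend in opposite directions yields a near-zero mean and a large spread of the signed correlation, which mirrors the high variation observed for Nc and NRc.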

Results and Analysis on Degradation Modeling
With the nine selected informative sensors, the HI that describes the degradation of the engine is constructed as in Equation (12). Data matrices for the healthy and failed states of the engine are created using the nine-sensor measurements of the initial five cycles and of the failure cycle of all the engines, respectively. The HI of one engine and of all the training engines are displayed in Figure 5. The HI of all the engines follows the same decreasing trend and varies roughly between 1 and 0, which means that a smaller HI corresponds to a less healthy condition of the engine.

Based on the HI of the 100 training engines, the parameters of the FPCA-based degradation model can be estimated as stated in Section 3.3, and the main ones are shown in Figure 6. From Figure 6a, it is observed that the estimated mean function adaptively captures the general degrading trend of the population of 100 engines. To ensure that sufficient variance is retained in the truncated expansion in Equation (19), the threshold η is set to 0.95 and the number of retained terms K is 7 using Equation (7). That is, the first seven FPCs are needed to explain more than 95% of the variation of the engine degradation process, and each FPC reflects one distinct mode of variation of the engine deterioration process. After modeling by the FPCA method, the background health information of the 100 engines is abstracted into a few parameters, i.e., the estimated mean function and the first seven FPCs, which are the common characteristics shared by engines from the training population, while the corresponding FPC scores specify the degradation peculiarity of one specific engine.

To demonstrate the performance of the proposed method, the true parametric exponential model (Exp) y(t) = a × exp(bt) + c [37], the power law model (Pow) as in [17], and the third-order polynomial model (Poly3) as in [20] are also used to fit the HI of the simulated engines. Degradation modeling results for one engine are shown in Figure 7. It can be observed that the FPCA method and the three parametric models fit the degradation pattern well, but the proposed degradation model in Equation (19) reveals more local degradation dynamics of the engine. For quantitative comparison, the sum of squared errors (SSE), the coefficient of determination (R-square) [40], and the root mean square error (RMSE) are calculated for the 100 engines and summarized in Table 4.

The results in Table 4 show that the proposed method performs better than the Pow and Poly3 models, and demonstrates comparable performance to the true Exp model. However, the proposed method models the degradation adaptively without any assumption on the parametric form of the degradation pattern. This is more significant when little is known about the latent degradation trend. Also, all the 100 engine degradation observations from the training set are pooled to estimate the parameters of the proposed degradation model, whereas in the parametric models each engine is fitted independently, so common information about the degradation process of the engine population is prone to be lost. In the following, the FPCA method is combined with Bayesian inference to predict the aero-engine health, and comparisons are made with the true Exp degradation model.
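The truncation rule used above (keep the smallest K whose cumulative fraction of variance reaches η) can be illustrated with a discretized FPCA. The sketch assumes that all HI curves have been resampled onto a shared grid and ignores measurement noise and grid-spacing weights, which is a simplification of the estimation procedure in Section 3.3. The function name `fpca_on_grid` and the synthetic curves are hypothetical.

```python
import numpy as np

def fpca_on_grid(curves, eta=0.95):
    """Discretized FPCA for curves sampled on a shared grid.

    curves: array of shape (n_curves, n_points).
    Returns the mean function, the first K components (as columns), their
    eigenvalues, the per-curve scores, and the cumulative variance fractions.
    """
    curves = np.asarray(curves, dtype=float)
    mean_function = curves.mean(axis=0)
    centered = curves - mean_function
    covariance = centered.T @ centered / (curves.shape[0] - 1)

    # eigh returns eigenvalues in ascending order; reorder to descending
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = np.clip(eigenvalues[order], 0.0, None)
    eigenvectors = eigenvectors[:, order]

    explained = np.cumsum(eigenvalues) / eigenvalues.sum()
    # Smallest K whose cumulative variance fraction reaches eta
    k = min(int(np.searchsorted(explained, eta)) + 1, len(eigenvalues))
    components = eigenvectors[:, :k]
    scores = centered @ components
    return mean_function, components, eigenvalues[:k], scores, explained

# Hypothetical HI-like curves: shared decay plus two random modes of variation
rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 50)
curves = (1.0 - grid**2
          + rng.normal(0.0, 0.10, (100, 1)) * grid
          + rng.normal(0.0, 0.03, (100, 1)) * np.sin(2 * np.pi * grid))

mean_function, components, eigenvalues, scores, explained = fpca_on_grid(curves)
K = components.shape[1]
print("retained components K =", K)
print("variance explained by K components:", explained[K - 1])
```

Each curve is then approximated by `mean_function + components @ scores[i]`, so the population-level information sits in the mean function and components, while the scores describe an individual engine.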

Results and Analysis on Health Prognostics
For validation of the presented method for long-term health prognostics, the multiple sensor data of the 100 testing engines from FD001 is processed following the same procedures of the training engines as detailed above.
Through linear transformation of the nine selected informative sensors with the coefficients learned from the training engines, the HI of the testing engines is obtained; the HI of all the testing engines and of two engines are shown in Figure 8. As an engine operates, its health condition degrades and the HI declines in the same pattern as that of the training engines.
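One common way to learn such a linear transformation is a least-squares regression that maps healthy-state measurements to 1 and failed-state measurements to 0. The sketch below follows that idea as an assumption; it is not guaranteed to coincide with Equation (12), and the three synthetic sensors and the function names are hypothetical.

```python
import numpy as np

def fit_hi_coefficients(healthy, failed):
    """Least-squares weights mapping healthy rows to HI = 1 and failed rows to HI = 0.
    Both inputs have shape (n_samples, n_sensors); an intercept is included."""
    features = np.vstack([healthy, failed])
    design = np.column_stack([np.ones(len(features)), features])
    targets = np.concatenate([np.ones(len(healthy)), np.zeros(len(failed))])
    coefficients, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coefficients

def health_index(sensors, coefficients):
    """Apply the learned linear transformation to sensor measurements."""
    sensors = np.atleast_2d(np.asarray(sensors, dtype=float))
    return coefficients[0] + sensors @ coefficients[1:]

# Hypothetical training data: 3 sensors, healthy and failed operating points
rng = np.random.default_rng(3)
healthy = rng.normal([0.0, 0.0, 0.0], 0.1, size=(50, 3))
failed = rng.normal([1.0, -1.0, 2.0], 0.1, size=(20, 3))

coefficients = fit_hi_coefficients(healthy, failed)
hi_healthy = health_index(healthy, coefficients)
hi_failed = health_index(failed, coefficients)
print("mean HI (healthy):", hi_healthy.mean())
print("mean HI (failed): ", hi_failed.mean())
```

The same `coefficients` would then be applied unchanged to the sensor measurements of a testing engine, which matches the use of coefficients learned from the training engines.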
With the estimated results from the training engines as the prior distribution p_0(ε) for the FPC scores ε = (ε_1, ε_2, ..., ε_7)^T, the ε of a testing engine is updated to the posterior distribution p(ε | y_{1:H}) with its latest HI observations y_{1:H} using Equation (21), as discussed in Section 3.4. Across the 100 testing engines, the observation histories are truncated at about 20% to 96% of their true useful lives. In the following, testing engines with enough observations are chosen to demonstrate the prognostic inference.
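Under a Gaussian prior on the FPC scores and independent Gaussian observation noise, the posterior is again Gaussian and available in closed form. The sketch below is written under that assumption and may differ in detail from Equation (21); the two components, the prior variances, the noise level, and the function name `update_fpc_scores` are hypothetical.

```python
import numpy as np

def update_fpc_scores(y_obs, mean_obs, phi_obs, prior_var, noise_var):
    """Gaussian posterior of FPC scores given the first H observations.

    Assumed model: y = mean + phi @ scores + noise,
    scores ~ N(0, diag(prior_var)), noise ~ N(0, noise_var * I).
    """
    precision = np.diag(1.0 / prior_var) + phi_obs.T @ phi_obs / noise_var
    post_cov = np.linalg.inv(precision)
    post_mean = post_cov @ phi_obs.T @ (y_obs - mean_obs) / noise_var
    return post_mean, post_cov

# Hypothetical degradation model on a grid of 100 cycles
rng = np.random.default_rng(2)
grid = np.linspace(0.0, 1.0, 100)
mean_function = 1.0 - grid**2
# Two orthonormal components obtained by QR (stand-ins for estimated FPCs)
phi, _ = np.linalg.qr(np.column_stack([grid, np.sin(2 * np.pi * grid)]))
prior_var = np.array([1.0, 0.25])
true_scores = np.array([0.8, -0.3])
noise_std = 0.005

truth = mean_function + phi @ true_scores
observed = truth + noise_std * rng.standard_normal(grid.size)

H = 40  # number of historical observations available
post_mean, post_cov = update_fpc_scores(
    observed[:H], mean_function[:H], phi[:H], prior_var, noise_std**2)

# Long-term prediction over the unobserved cycles
prediction = mean_function + phi @ post_mean
rmse_updated = np.sqrt(np.mean((prediction[H:] - truth[H:]) ** 2))
rmse_prior = np.sqrt(np.mean((mean_function[H:] - truth[H:]) ** 2))
print("RMSE with updated scores:", rmse_updated)
print("RMSE with prior mean only:", rmse_prior)
```

With few observations the posterior stays close to the population prior, and it concentrates around the engine-specific scores as H grows, which is the mechanism behind the early-stage robustness discussed in this section.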
The testing engine #49 consumed about 96% of its true useful life (i.e., 324 cycles) before being truncated, and there are 303 observations in total. Prognostics of the engine is performed at 35%, 50%, 80%, and 100% of its health monitoring history with H = 106, 152, 242, and 303 historical observations, respectively. The health prediction results are shown in Figure 9. When there are not enough historical observations (the 35% and 50% cases), the parametric Exp model is poorly fitted and the long-term health of the engine is over- or underestimated, whereas the proposed method predicts the long-term health more accurately by utilizing the mean function and the first seven FPCs learned from the training engines. As the engine approaches failure (the 80% and 100% cases), both methods track and predict the health degradation with high accuracy, since sufficient historical observations have accumulated to fit or update the model parameters. Also note that the proposed method reveals more local dynamics of the engine degradation, which is beneficial for explaining the degradation variation.

Besides the testing engine #49, the health of the other nine testing engines that are observed to consume more than 90% of their true useful lives is also predicted at four monitoring phases (i.e., 35%, 50%, 80%, and 90%) of their health monitoring histories. The HI prediction RMSEs for all the 10 engines are compared in Table 5. It can be observed that the proposed method outperforms the true Exp model in the very long-term prediction, when the engines are in the early phase of service and only 35% of the observations are used. With such limited historical HI, the Exp model is even wrongly fitted for some engines, such as #20 and #34. For the case of 50% of the observations, the proposed method also predicts better, except for engines #20 and #81. When the engines run into their later stages (the 80% and 90% cases), the performance of the proposed method is comparable to that of the true Exp model, as there are sufficient HI observations to fit the parametric model.
To further illustrate the RUL prediction ability of the proposed approach, the HI of the 100 testing engines is predicted with FPC scores updated by the last recorded observations and extrapolated to the pre-defined failure threshold (i.e., HI_f = 0.3) to estimate the final failure cycles. Results for the 100 testing engines are plotted in Figure 10. For the proposed statistical method, both the mean and median of 100 simulations are given. For an engine whose HI cannot be properly fitted by the Exp model, its life is estimated as 206 cycles, which is the mean life of the 100 training engines. The RULs predicted by the proposed method and by the Exp model are close to the actual values in the region where the RUL is small. That is because when the engine is working closer to failure, the degradation is more pronounced and can be captured for better prognostics. However, when the engines are far from failure, the mean and median predictions of the proposed method are more accurate than those of the parametric Exp model.

The common score used to evaluate the performance of prognostic methods on the C-MAPSS data set, given in Equation (25), and the prediction RMSE are also calculated and listed in Table 6. The far lower score of the proposed method (both mean and median) as well as the smaller RMSE indicate that the proposed prognostic method performs better than the Exp model. Also, a narrower range of prediction errors and a higher number of correct predictions are achieved using the proposed method. From the above analysis, the proposed adaptive method is more robust than the parametric Exp method, especially when an engine is in an incipient service stage of its life.
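The extrapolation of the predicted HI to the failure threshold can be sketched as follows. FPC scores are drawn from a Gaussian posterior, each draw gives one predicted HI curve, and the RUL of that draw is the number of cycles until the curve first falls to the threshold. The mean function, the single component, and the posterior used here are hypothetical stand-ins, and the sampling scheme is an assumption about how the 100 simulations are generated.

```python
import numpy as np

def first_crossing_cycle(hi_curve, cycles, threshold=0.3):
    """First cycle at which the HI curve is at or below the threshold, else None."""
    below = np.flatnonzero(hi_curve <= threshold)
    return cycles[below[0]] if below.size else None

def simulate_rul(post_mean, post_cov, mean_function, components, cycles,
                 current_cycle, threshold=0.3, n_draws=100, seed=0):
    """Mean and median RUL over posterior draws of the FPC scores."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(post_mean, post_cov, size=n_draws)
    future = cycles > current_cycle
    ruls = []
    for scores in draws:
        curve = mean_function + components @ scores
        failure = first_crossing_cycle(curve[future], cycles[future], threshold)
        if failure is not None:  # skip draws that never reach the threshold
            ruls.append(failure - current_cycle)
    if not ruls:
        return None, None
    ruls = np.array(ruls)
    return float(ruls.mean()), float(np.median(ruls))

# Hypothetical model: HI decays from 1 and reaches 0.3 at cycle 210
cycles = np.arange(1, 301)
mean_function = 1.0 - (cycles / 250.0) ** 2
components = (0.1 * cycles / cycles.max())[:, None]

mean_rul, median_rul = simulate_rul(
    np.array([0.0]), np.array([[0.01]]), mean_function, components,
    cycles, current_cycle=150)
print("mean RUL:", mean_rul, "median RUL:", median_rul)
```

Reporting both the mean and the median of the simulated RULs, as in Figure 10, makes the estimate less sensitive to a few draws that cross the threshold unusually late or early.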
where d_i = (predicted RUL of the i-th engine) - (true RUL of the i-th engine), for i = 1, 2, ..., n.

Table 6. Comparison of RUL prediction results for the 100 testing engines.

Method  Score  RMSE
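The two quantities reported in Table 6 can be computed as sketched below. The score function is written under the assumption that Equation (25) is the asymmetric scoring function commonly used with the C-MAPSS data (time constants 13 for early and 10 for late predictions); the RUL values in the example are made up for illustration and are not results from this study.

```python
import numpy as np

def cmapss_score(predicted_rul, true_rul):
    """Asymmetric score: late predictions (d > 0) are penalized more than early ones."""
    d = np.asarray(predicted_rul, dtype=float) - np.asarray(true_rul, dtype=float)
    penalties = np.where(d < 0, np.exp(-d / 13.0) - 1.0, np.exp(d / 10.0) - 1.0)
    return float(penalties.sum())

def rul_rmse(predicted_rul, true_rul):
    """Root mean square error of the RUL predictions."""
    d = np.asarray(predicted_rul, dtype=float) - np.asarray(true_rul, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))

# Hypothetical predictions for three engines
true_rul = [100, 50, 20]
predicted_rul = [90, 55, 20]
print("score:", cmapss_score(predicted_rul, true_rul))
print("RMSE: ", rul_rmse(predicted_rul, true_rul))
```

Because the exponent grows faster for positive d_i, overestimating the RUL by a given number of cycles costs more than underestimating it by the same amount, which reflects the higher risk of late maintenance.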

Discussion
In this paper, a data-driven statistical prognostic approach for informative sensor selection and adaptive degradation modeling based on FPCA is introduced. Effectiveness of the proposed method is validated by the case study on an aircraft engine simulation experiment. Nevertheless, some key issues should be tackled for application of the presented method to real engines and other systems.
For the proposed approach to be effective for real engines, enough run-to-failure sensor data should be provided, which is a common requirement for data-driven approaches. In real aircraft engine operation, the degradation of an engine usually begins after a long normal working stage, the so-called delay-time phenomenon. Modeling of the potential-to-failure stage is of more significance; thus, it is better to apply the proposed method upon the detection of incipient faults. When only a few instances of real engine failures are available, which is often the case, the jointly Gaussian assumption may no longer hold for the parameter estimation. To partially tackle this problem, bootstrap methods can be useful. As for extending the proposed method, especially the adaptive degradation modeling by FPCA, to monitor other systems, effort should be devoted to the construction of an HI that relates to the degradation of these systems, as the linear data transformation method used in this paper may not be applicable.

Conclusions
To model and track the complex degradation pattern of aircraft engines for accurate and efficient prognosis, a novel method for informative sensor selection and adaptive degradation tracking is proposed in this paper. Deterioration-sensitive sensors can be selected by the presented metrics and fused to construct a health index that describes the degradation of an aircraft engine. Taking the health index of one engine as functional data, the degradation process of the engine population is then adaptively modeled by FPCA, and future health is predicted with Bayesian inference. Experimental studies are performed on the sensor data set of aircraft gas turbine engines, and the results verify that the proposed method can effectively select the informative sensors to model and predict the complex degradation process of the aircraft engine. The failure threshold is set to a fixed value in this study; however, the large variation of the failure value calls for a random failure threshold, which will be pursued in future work.

Conflicts of Interest:
The authors declare no conflicts of interest.