Sensor Selection via Maximizing Hybrid Bayesian Fisher Information and Mutual Information in Unreliable Sensor Networks

Abstract: The sensor selection problem is addressed for unreliable sensor networks. The Bayesian Fisher information (BFI) matrix, mutual information (MI), and their relationship are investigated under Gaussian mixture noise conditions. Sensor selection methods based on either the BFI matrix or MI alone do not provide coincident results. To overcome this flaw, a multiple objective optimal (MOP)-based sensor selection approach is developed that minimizes the number of selected sensors while maximizing the corresponding BFI matrix and MI. The variable weight decision making (VWDM) and technique for order of preference by similarity to ideal solution (TOPSIS) approaches are then used to find the candidate that best trades off the cost and the two performance metrics. Comparison results demonstrate that the proposed method finds a more informative sensor group and that its overall localization performance exceeds that of the sensor selection methods based on BFI or MI alone.


Introduction
Advances in sensor technology have made it possible to use a large number of sensors in various applications, such as environmental monitoring, battlefield surveillance, and target localization and tracking [1,2]. Sensor selection is critical for saving energy and prolonging the lifetime of sensor networks. A good sensor selection strategy selects the most informative sensors to balance localization accuracy against cost. In this paper, the sensor selection problem for angle of arrival (AOA)-based source localization is addressed, as AOA localization technology has been applied in many areas, such as radar, sonar, wireless communications, and indoor acoustic localization.
The sensor selection problem has attracted much attention in recent decades [3–13]. Entropy and its variant, mutual information (MI), are two popular performance metrics for designing sensor selection methods. MI is a standard quantity from the information-theoretic point of view. In [5,6], the MI between the predicted sensor observation and the current target location distribution was proposed to evaluate the expected information gain about the target location attributable to a sensor. A simple entropy-based heuristic for sensor selection was introduced in [7]. This method is computationally simpler than the MI-based method [5], but it works well only when the measurement noise is small. Maximum entropy fuzzy clustering was applied to sensor selection for target tracking in [8].
The Cramer-Rao lower bound (CRLB), or the Bayesian CRLB if the prior distribution is known, provides a theoretical performance limit for an unbiased or asymptotically unbiased estimator. It is therefore another attractive metric for developing sensor selection methods. For single target tracking, a subset of sensors was selected in a bearing-only sensor network to minimize the posterior CRLB [9]. For time difference of arrival (TDOA)-based localization, sensor selection under non-line-of-sight (NLOS) conditions was investigated in [10]. A global sensor selection method for AOA-based localization was proposed that minimizes the trace of the CRLB [11]. In [12,13], sensor selection methods for linear dynamical systems were proposed under correlated measurement noise, and sensor selection approaches for non-linear measurement models were developed in [14,15].
All of the sensor selection works mentioned above are derived from either the CRLB or MI. It has been demonstrated that the two metrics are consistent when the estimation error of each individual sensor follows a Gaussian distribution. However, sensor failures, data loss, NLOS propagation, or unexpected interference may impose uncertainty on sensor networks, which results in unreliable measurements [16,17].
In recent decades, much attention has been devoted to using non-Gaussian noise models for sensor networks with uncertain observations. Gaussian mixture noise has been applied to model ambient noise in various applications (see, e.g., [18,19] and the references therein). As illustrated in [20] for received signal strength (RSS)-based target localization and tracking, the selection results based on MI and the CRLB differ. MI is more easily influenced by the uncertain probability of a sensor, since its observation then carries insufficient information about the target, whereas CRLB-based methods tend to select sensors that are close to the source even if some of them have large uncertainties. In this work, we investigate the selection results for AOA-based source localization.
This paper focuses on the sensor selection problem for AOA-based source localization in unreliable sensor networks. To select more informative sensors, we propose to incorporate both MI and the Bayesian Fisher information (BFI) matrix, which is the inverse of the Bayesian CRLB, into the selection scheme. In addition, as the number of sensors to select is usually unknown in practical applications, the best approach is to select sensors that trade off the localization performance and the cost. For this purpose, the number of selected sensors is also formulated as an objective to optimize. Thus, we have three objective functions: minimizing the number of selected sensors, maximizing the BFI of the selected sensors, and maximizing their MI. The first objective conflicts with the other two. The multi-objective evolutionary algorithm based on decomposition (MOEA/D) [21] is used to find the optimal solutions that trade off these conflicting objective functions. Then, two decision-making methods, variable weight decision making (VWDM) [22,23] and the technique for order of preference by similarity to ideal solution (TOPSIS) [24,25], are used to find the final solution.
The rest of the paper is organized as follows. The measurement model is described in Section 2. The relationship between Fisher information (FI) and MI is derived in Section 3. The sensor selection method based on MOEA/D is proposed in Section 4. Section 5 provides simulation results, and the conclusions are summarized in Section 6.

System Model
In this section, we review recursive Bayesian estimation for target localization. Recursive Bayesian estimation uses the expected posterior distribution to predict what the posterior distribution would look like if a simulated measurement from a new sensor were incorporated.
In recursive Bayesian estimation for target localization and tracking [4], both the target location and the sensor observations are modeled as stochastic processes, and the posterior target location distribution conditioned on the sensor observations is computed recursively, one additional sensor observation at a time. Let $\mathbf{x} = [x, y]^T$ denote the target location; the same symbol is used for the random variable and its realization. Let $z_j$ denote the observation of the $j$-th sensor. Applying recursive Bayesian estimation to the target location gives [4]:

$$p(\mathbf{x} \mid z_1, \ldots, z_j) = C\, p(z_j \mid \mathbf{x}, z_1, \ldots, z_{j-1})\, p(\mathbf{x} \mid z_1, \ldots, z_{j-1}),$$

where $C$ is a normalization constant. When $z_1, \ldots, z_j$ are conditionally independent of one another given $\mathbf{x}$, the above equation simplifies to:

$$p(\mathbf{x} \mid z_1, \ldots, z_j) = C\, p(z_j \mid \mathbf{x})\, p(\mathbf{x} \mid z_1, \ldots, z_{j-1}).$$

For AOA sensors, the observation is the angle estimated by each sensor, which is given by:

$$z_i = \theta_i(\mathbf{x}) + \eta_i,$$

where $\theta_i(\mathbf{x}) = \tan^{-1}\frac{y - y_i}{x - x_i}$ is the true angle and $\tan^{-1}(\cdot)$ stands for the 4-quadrant arctangent. $\mathbf{s}_i = [x_i, y_i]^T$, $i = 1, \ldots, N$, denotes the positions of the $N$ sensors collecting angle measurements, and $\eta_i$ denotes the angle estimation error. As mentioned above, an adverse environment may bring uncertainty to sensor networks. We follow [14,16] to model the uncertain scenario, in which the probability density function (PDF) of the angle estimation error $\eta_i$ is:

$$p(\eta_i) = p\,\mathcal{N}(\eta_i; 0, \sigma^2) + (1 - p)\,\mathcal{N}(\eta_i; \mu_0, \sigma_0^2),$$

where $p$ is the reliable probability, $\mu_0 \gg 0$, and $\sigma_0^2 \gg \sigma^2$.
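As a minimal sketch of the measurement model and the recursive update described above, the following Python code draws a bearing with Gaussian-mixture error and applies one Bayesian step on a discretized location grid. The grid discretization, the angle wrapping, and all function names are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def normal_pdf(e, mu, s):
    """Gaussian density with mean mu and standard deviation s."""
    return np.exp(-0.5 * ((e - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def mixture_pdf(eta, p, sigma, mu0, sigma0):
    """Angle-error density: reliable component with probability p, wide one otherwise."""
    return p * normal_pdf(eta, 0.0, sigma) + (1.0 - p) * normal_pdf(eta, mu0, sigma0)

def wrap(angle):
    """Wrap angles to [-pi, pi) so bearing differences stay comparable (assumption)."""
    return (angle + np.pi) % (2.0 * np.pi) - np.pi

def bearing(sensor, points):
    """4-quadrant arctangent from a sensor to one or many (x, y) points."""
    points = np.atleast_2d(points)
    return np.arctan2(points[:, 1] - sensor[1], points[:, 0] - sensor[0])

def sample_bearing(rng, sensor, target, p, sigma, mu0, sigma0):
    """Draw z = theta(x) + eta, with eta from the two-component mixture."""
    if rng.random() < p:
        eta = rng.normal(0.0, sigma)      # reliable measurement
    else:
        eta = rng.normal(mu0, sigma0)     # unreliable measurement
    return bearing(sensor, target)[0] + eta

def bayes_update(prior, grid, sensor, z, p, sigma, mu0, sigma0):
    """One recursive step on a grid: posterior is proportional to likelihood times prior."""
    likelihood = mixture_pdf(wrap(z - bearing(sensor, grid)), p, sigma, mu0, sigma0)
    posterior = likelihood * prior
    return posterior / posterior.sum()
```

Calling `bayes_update` once per sensor, each time passing the previous posterior as the new prior, mirrors the conditionally independent form of the recursion.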

BFI for Gaussian Mixture Noise
There are several different measures of the estimation error of the posterior target location distribution. One estimation error measure is the Bayesian CRLB of the target location which is the inverse of BFI.
In this section, the BFI is derived under Gaussian mixture noise. Let $\hat{x}$ be an unbiased estimate of x. The BFI satisfies the well-known inequality:

$$E\left[(\hat{x} - x)(\hat{x} - x)^T\right] \succeq J^{-1},$$

where J is the BFI. It has been shown in [13] that the BFI matrix consists of two parts: the information matrix obtained from the sensor measurements and the prior information matrix. Furthermore, under the assumption that the sensor measurements are conditionally independent given the target information, the BFI matrix can be written as [13]: where $J_{prior}$ is the FI matrix of the prior information about the target, which typically comes from previous measurements or from other available measurements. Let f(x) denote the prior PDF of the target position; $J_{prior}$ can be expressed as: Let $J_i$ denote the standard FI of sensor $s_i$; it can be formulated as: where: For AOA sensors, we have: where $r_i$ is the distance between $s_i$ and x. Let $p_1 = p$, $p_2 = 1 - p$, $\mu_1 = 0$, $\mu_2 = \mu_0$, $\sigma_1 = \sigma$, and $\sigma_2^2 = \sigma_0^2$. Substituting Equations (10) and (11) into Equation (9), we get:
When prior information is ignored, it is well known that it is desirable for the nodes to be both close to the target and to provide good angular diversity by surrounding the target [11]. By inspection of Equation (12), it is clear that the 2 × 2 BFI is positive semi-definite, and: there exist two eigenvalues of J, represented as: Equation (16) indicates that the range from the source to the sensors, the angular diversity, and the bearing scaling factor determine the BFI upper bound (BFIU). Given the upper bound in Equation (16), the selection approach is expected to select nodes that are both close to the target and have a high sensing probability. In general, when prior information is available and is skewed to favor a certain direction, node selection methods will select sensors that reduce the error in the direction where it is high.
For simplicity, we can use the maximum likelihood estimate of x as the actual target position to calculate the upper bound of Equation (16) and select nodes.
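To make the role of range and noise in the FI concrete, the sketch below evaluates the scalar FI of the Gaussian-mixture angle error by numerical integration and maps it to a 2 × 2 position FI for a single bearing sensor at a fixed target position (for example, an estimate of x). It is an illustration under stated assumptions, not the paper's closed-form expressions: the translation-family integral J[q] = ∫ q'(η)² / q(η) dη, the integration grid, and the function names are choices made here, and the prior term and the expectation over x are omitted.

```python
import numpy as np

def mixture_and_derivative(eta, p, sigma, mu0, sigma0):
    """Return q(eta) and q'(eta) for the two-component Gaussian mixture."""
    def normal(e, mu, s):
        return np.exp(-0.5 * ((e - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
    q1 = p * normal(eta, 0.0, sigma)
    q2 = (1.0 - p) * normal(eta, mu0, sigma0)
    dq = -q1 * eta / sigma**2 - q2 * (eta - mu0) / sigma0**2
    return q1 + q2, dq

def noise_fisher_information(p, sigma, mu0, sigma0, n=200001):
    """Trapezoid-rule estimate of J[q] = int q'(eta)^2 / q(eta) d eta."""
    lo = min(-8.0 * sigma, mu0 - 8.0 * sigma0)
    hi = max(8.0 * sigma, mu0 + 8.0 * sigma0)
    eta = np.linspace(lo, hi, n)
    q, dq = mixture_and_derivative(eta, p, sigma, mu0, sigma0)
    f = dq**2 / q
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(eta)))

def aoa_fim(sensor, target, j_noise):
    """2x2 FI of one bearing sensor, evaluated at a fixed target position."""
    dx, dy = target[0] - sensor[0], target[1] - sensor[1]
    r2 = dx * dx + dy * dy
    grad = np.array([-dy, dx]) / r2   # gradient of atan2(dy, dx) w.r.t. (x, y)
    return j_noise * np.outer(grad, grad)
```

The trace of `aoa_fim` equals `j_noise / r**2`, so a sensor contributes more information when it is closer to the target or when its noise FI is larger, and summing the matrices of sensors at different bearings adds angular diversity.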

Mutual Information
Electronics 2020, 9, 283

Another estimation error measure is the Shannon entropy, which measures the uncertainty of the posterior target location distribution. From the information-theoretic point of view, sensors are tasked with observing the target in order to reduce the uncertainty about the target location distribution. One way to express the contribution of a sensor is MI. The greedy sensor selection method gradually reduces the uncertainty of the target location distribution by repeatedly selecting the currently unused sensor with maximal MI. Given the distribution of the target state and the likelihood function of the sensor measurements, the MI of sensor $s_i$ can be written as [5]:

$$MI(x, z_i) = \iint p(x, z_i) \log \frac{p(x, z_i)}{p(x)\, p(z_i)}\, dx\, dz_i.$$
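The MI of a single sensor can be approximated numerically once the target location distribution is discretized. The sketch below is one such approximation under assumptions made here: the location prior is a set of weighted grid points, `thetas` holds the true bearing from the sensor to each grid point, `noise_pdf` is a hypothetical callable for the angle-error density, and angle wrapping is ignored (the measurement grid `z_grid` is assumed to cover all relevant bearings). The result is in nats.

```python
import numpy as np

def sensor_mutual_information(prior, thetas, noise_pdf, z_grid):
    """Approximate MI(x, z) = sum_k p(x_k) * int p(z|x_k) log[p(z|x_k) / p(z)] dz."""
    tiny = np.finfo(float).tiny
    lik = noise_pdf(z_grid[None, :] - thetas[:, None])   # p(z | x_k), shape (K, Z)
    pz = prior @ lik                                      # marginal p(z), shape (Z,)
    # Clamp before taking logs so underflowed tails contribute exactly zero.
    log_ratio = np.log(np.maximum(lik, tiny)) - np.log(np.maximum(pz, tiny))[None, :]
    dz = z_grid[1] - z_grid[0]                            # uniform grid spacing
    return float(np.sum(prior[:, None] * lik * log_ratio) * dz)
```

A greedy selector in the spirit of the text would evaluate this quantity for every unused sensor, pick the largest, update the location distribution with that sensor's observation, and repeat.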

The Relationship between Fisher Information and Mutual Information
In this section, we demonstrate the relationship between FI and MI under Gaussian and non-Gaussian noise, similar to the work in [26]. The FI with respect to θ(x) is given as [20]: Here, we assume additive noise with density q(·), so we can write $p(z \mid \theta) = q(z - \theta)$. In this case, J(θ) becomes independent of θ and can thus be rewritten as: This quantity is referred to as the FI of a random variable with respect to a scalar translation parameter, and Equation (18) is a constant. Conceptually, the constant J[q] summarizes the total local dispersion of a distribution. Similarly, the Shannon entropy H(z|θ) (H(z|x)) is also independent of θ (x) and identical to the noise entropy:
Stam's inequality specifies the relation between FI and Shannon entropy as follows: for a given amount of FI, the Shannon entropy of a continuous random variable is minimized if and only if the variable is Gaussian. When the variance of a Gaussian random variable is 1/J[q], Stam's inequality implies that: Define: Note that $D_0 > 0$ for distributions with lighter tails than a Gaussian, as well as for asymmetric distributions. Because the transfer function is invertible, MI(x, z) = MI(θ, z) even though H(x) ≠ H(θ) [26]. Thus we can write MI as: As H(z|θ) = H[q], Equation (25) can be written as: From Equations (24) and (26), we get: Equation (27) can be rewritten as: Using the formulas for a change of variables, it is straightforward to verify that: Then, Equation (28) can be rewritten as: where:
This illustrates that the degree to which MI is well approximated by FI ($I_{Fisher}$) depends on the values of $C_0$ and $D_0$. Both terms are nonnegative and quantify two very different aspects of the noise: $C_0 = H(\theta + \eta) - H(\theta)$ is monotonic in the magnitude of the noise, whereas $D_0$ represents the non-Gaussianity of the noise. From Equation (30), we obtain the following conclusions:
• If the noise is Gaussian, $D_0 = 0$, and $C_0 \geq 0$ because additive noise increases the entropy; thus MI(x, z) ≥ $I_{Fisher}$. That is, $I_{Fisher}$ is guaranteed to be a lower bound on MI if and only if the noise is Gaussian.
• As Stam's inequality tells us that $D_0 \geq 0$, we get MI(x, z) ≤ $I_{Fisher}$ + $C_0$. In particular, in the case of vanishing noise, $C_0 \to 0$, and it follows that MI(x, z) ≤ $I_{Fisher}$. Thus, $I_{Fisher}$ generally represents an upper bound on MI in the small noise regime.
• Only when the noise entropy goes to zero and the noise converges to a Gaussian at the same time does MI(x, z) coincide with $I_{Fisher}$.
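The non-Gaussianity term can be checked numerically. The sketch below reads $D_0$ as the gap in Stam's inequality, $D_0 = H[q] - \frac{1}{2}\ln\frac{2\pi e}{J[q]}$; this reading is our assumption, inferred from the statement that the entropy is minimized by a Gaussian for a given FI, since the defining equation is not reproduced in this text. The function names, the integration grid, and the example noise parameters are illustrative choices.

```python
import numpy as np

def gaussian(e, mu, s):
    """Gaussian density with mean mu and standard deviation s."""
    return np.exp(-0.5 * ((e - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def stam_gap(components, n=400001):
    """Assumed D0 = H[q] - 0.5*ln(2*pi*e / J[q]) for a Gaussian mixture q.

    components: list of (weight, mean, std) tuples whose weights sum to one.
    """
    lo = min(mu - 10.0 * s for _, mu, s in components)
    hi = max(mu + 10.0 * s for _, mu, s in components)
    eta = np.linspace(lo, hi, n)
    q = sum(w * gaussian(eta, mu, s) for w, mu, s in components)
    dq = sum(-w * gaussian(eta, mu, s) * (eta - mu) / s**2 for w, mu, s in components)
    q = np.maximum(q, np.finfo(float).tiny)   # guard against underflow in the tails
    step = eta[1] - eta[0]

    def trapezoid(f):
        return float(np.sum(0.5 * (f[1:] + f[:-1])) * step)

    entropy = -trapezoid(q * np.log(q))       # Shannon entropy H[q] in nats
    fisher = trapezoid(dq**2 / q)             # translation FI J[q]
    return entropy - 0.5 * np.log(2.0 * np.pi * np.e / fisher)
```

For a single Gaussian component the gap is numerically zero, while a mixture such as a narrow reliable component plus a wide, shifted unreliable component gives a strictly positive gap, which is the regime in which $I_{Fisher}$ is no longer guaranteed to lower-bound MI.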

The Proposed Sensor Selection Method
Considering the problem mentioned above, we would like to take all performance evaluation metrics into account. In addition, the number of selected sensors is unknown in practice, which requires one to weigh the localization performance against the cost for different applications. In this paper, we optimize multiple conflicting objectives at the same time: minimizing the number of selected sensors and maximizing the BFI matrix and MI of these sensors. In order to formulate the problem in a multiple objective optimal (MOP) framework, the first objective function is transformed into a maximization problem of the gap between the number of available sensors and the number of sensors to be selected, which gives Equation (31), where Equations (33) and (34) represent the normalized MI and BFI of the selected sensors.
As the sensor selection policy in Equation (31) consists of three conflicting objective functions, no single solution can optimize them all at the same time. MOP methods instead find a set of solutions that trade off the objectives. Many multi-objective methods have been proposed in recent decades, such as the non-dominated sorting genetic algorithm II (NSGA-II) [27] and MOEA/D [21]. The MOEA/D method has lower computational complexity than NSGA-II, and it outperforms or performs similarly to NSGA-II. In this paper, we use MOEA/D to solve the optimization problem of Equation (31). The generalized MOP can be formulated as [21,27]:

$$\max\; F(\alpha) = \left(f_1(\alpha), \ldots, f_m(\alpha)\right)^T \quad \text{subject to } \alpha \in \Omega,$$

where Ω is the decision space. Assume $\alpha_1$ and $\alpha_2$ are two solutions of Equation (35). $\alpha_1$ is said to dominate $\alpha_2$ if $f_i(\alpha_1) \geq f_i(\alpha_2)$ for every $i \in \{1, \ldots, m\}$ and $f_j(\alpha_1) > f_j(\alpha_2)$ for at least one $j$. If no solution in Ω dominates $\alpha^*$, then $\alpha^*$ is called a Pareto-optimal point and $F(\alpha^*)$ is a Pareto-optimal objective vector. That is to say, any improvement in one objective at a Pareto-optimal point must lead to deterioration in at least one other objective. The set of all Pareto-optimal points is called the Pareto set, and the set of all Pareto-optimal objective vectors is the Pareto front (PF) [21–25].
MOEA/D decomposes the MOP into scalar optimization subproblems and solves these subproblems in a collaborative way. Any decomposition approach developed in the area of mathematical programming can be incorporated into the framework of MOEA/D. In this paper, the scalar optimization subproblem based on the classical Tchebycheff approach is given by [21]:

$$\min\; g^{te}(\alpha \mid \lambda, z^*) = \max_{1 \leq i \leq m} \left\{ \lambda_i \left| f_i(\alpha) - z_i^* \right| \right\} \quad \text{subject to } \alpha \in \Omega,$$

where $\lambda = (\lambda_1, \ldots, \lambda_m)^T$ is a weight vector and $z^* = (z_1^*, \ldots, z_m^*)^T$ is the reference point. For each Pareto-optimal point $\alpha^*$ there exists a weight vector λ such that $\alpha^*$ is the optimal solution of Equation (36), and each optimal solution of Equation (36) is a Pareto-optimal solution of Equation (31). Therefore, one is able to obtain different Pareto-optimal solutions by altering the weight vector. The details of MOEA/D can be found in [21].
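The two building blocks named above, Pareto dominance and the Tchebycheff scalarization, can be illustrated on a small fixed set of candidate objective vectors. This is only a sketch of the definitions, not an implementation of MOEA/D: there is no population, neighborhood, or evolutionary search, and the function names and the choice of the per-objective maximum as the reference point are assumptions made here for a maximization problem.

```python
import numpy as np

def dominates(fa, fb):
    """True if objective vector fa dominates fb (all objectives maximized)."""
    fa, fb = np.asarray(fa), np.asarray(fb)
    return bool(np.all(fa >= fb) and np.any(fa > fb))

def pareto_indices(F):
    """Indices of the non-dominated rows of the objective matrix F."""
    return [i for i in range(len(F))
            if not any(dominates(F[j], F[i]) for j in range(len(F)) if j != i)]

def tchebycheff(f, lam, z_star):
    """Scalar subproblem value max_i lam_i * |f_i - z_i*| (to be minimized)."""
    return float(np.max(np.asarray(lam) * np.abs(np.asarray(f) - np.asarray(z_star))))

def best_for_weight(F, lam):
    """Candidate that minimizes the Tchebycheff value for one weight vector."""
    F = np.asarray(F, dtype=float)
    z_star = F.max(axis=0)   # reference point: best value seen per objective
    return int(np.argmin([tchebycheff(f, lam, z_star) for f in F]))
```

Evaluating `best_for_weight` for several weight vectors returns different non-dominated candidates, which is the mechanism by which altering the weight vector traces out different Pareto-optimal solutions.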

Performance Metrics for MOEA/D
The hypervolume indicator ($I_H$) [21,25] is used in our study to illustrate the efficiency of MOEA/D in the sensor selection problem. Let $y^* = (y_1^*, \ldots, y_m^*)$ be a point in the objective space that is dominated by every Pareto-optimal objective vector. Let P be the obtained approximation to the PF in the objective space. Then, the $I_H$ value of P (with regard to $y^*$) is the volume of the region that is dominated by P and dominates $y^*$. The higher the hypervolume, the better the approximation. In our experiments, $y^* = (f_1^{min}, f_2^{min}, f_3^{min})$ for the three-objective problem, where $f_i^{min}$ indicates the minimum value of the i-th objective in the obtained non-dominated set.
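For small solution sets, the hypervolume can be computed exactly by splitting the objective space into axis-aligned cells. The sketch below does this for maximized objectives; it is a simple illustrative method chosen here (its cost grows quickly with the number of points), not necessarily the procedure used in the experiments, and the function name is hypothetical.

```python
import itertools
import numpy as np

def hypervolume(points, ref):
    """Volume dominated by `points` and dominating `ref` (all objectives maximized)."""
    points = np.asarray(points, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # A point below the reference in any objective encloses no volume above it.
    points = points[np.all(points >= ref, axis=1)]
    m = len(ref)
    # Cell boundaries per objective: the reference plus every point coordinate.
    edges = [np.unique(np.append(points[:, d], ref[d])) for d in range(m)]
    volume = 0.0
    for cell in itertools.product(*(range(len(e) - 1) for e in edges)):
        lower = np.array([edges[d][i] for d, i in enumerate(cell)])
        upper = np.array([edges[d][i + 1] for d, i in enumerate(cell)])
        # The cell is covered if some point is at least as good as its upper corner.
        if np.any(np.all(points >= upper, axis=1)):
            volume += float(np.prod(upper - lower))
    return volume
```

Note that with the reference point set to the per-objective minimum of the set, any solution attaining one of those minima contributes no volume on its own.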

Select Solution from the Pareto-Optimal Solution
It is necessary to emphasize that the final aim of sensor selection is to obtain a single optimal solution. Since the optimization result of a MOP algorithm is a set of non-dominated solutions, the proper solution should be selected based on the specific application. There are many methods that one can employ to select a single solution from the Pareto-optimal front. For the sensor selection problem proposed above, we need to evaluate three attributes to select the better candidate.
Let $w = (w_1, w_2, \ldots, w_m)$ and $x = (x_1, x_2, \ldots, x_m)$ be a constant weight vector and a state value vector, respectively. A common decision making function evaluates the alternatives by $A = \sum_{j=1}^{m} w_j x_j$. However, a constant weight vector does not work well in some cases. For example, if all factors are equally important, i.e., $w = (w_1, w_2, w_3) = (1/3, 1/3, 1/3)$, the weighted average synthesis expression is $A = (x_1 + x_2 + x_3)/3$. The decision making function then assigns the same score, $A(x^1) = A(x^2) = A(x^3)$, to alternatives with the same average state value, even when their individual state values differ greatly. This result contradicts expectation. Thus, decision making based on a constant weight vector has its limitations. To overcome this problem, Wang [22] proposed the variable weight method. Since then, a lot of work has been done on variable weight decision making [23]. It emphasizes that the weights should change with the state values of the factors. According to the change trend of the weights, variable weight decision making mechanisms can be divided into punishment mechanisms, incentive mechanisms, and mixed mechanisms. The basic definition of variable weight theory is summarized as follows [22,23]:
(2) the function $w_j(x_1, x_2, \ldots, x_m)$ is continuous with respect to every variable $x_j$;
(3) the function $w_j(x_1, x_2, \ldots, x_m)$ is monotonically decreasing (for the punishment mechanism) or increasing (for the incentive mechanism) with respect to the variable $x_j$.
Since the result of the MOP is a set of Pareto solutions, several candidates with the same value of the first state will be found. Thus, the first step in finding the final solution is to select a state vector from the candidates with the same first state value. The procedure can be summarized as follows [22,23]:
Step 1: Let the constant weight vector be $w = (w_1, w_2, \ldots, w_m)$ (m = 3 for the sensor selection problem). Without any prior knowledge, one can assume $w = (w_1, w_2, w_3) = (1/3, 1/3, 1/3)$.
Step 2: Construct the expression of the state variable weight vector S. Analyzing the meaning of the three objective functions, we find that a better combination of selected sensors should have higher MI and BFI. We prefer solutions with large objective function values and neglect those with very small attribute values. Thus, the weights need to be adjusted according to the attribute values of the indicators: the index weights with low attribute values are punished, and the index weights with high attribute values are encouraged. The state variable weight vector can be expressed as: where α ≥ 0 denotes the penalty factor; the larger α is, the larger the punishment range. According to [23], α can be determined by: A ∈ [0, 1) is the adjustment level, which is usually set to 0.5. $\bar{x}_j$ is the mean of the j-th attribute, so all factors less than $\bar{x}_j$ are punished.
Step 3: Calculate the state variable weight matrix for all candidates:
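The three steps can be sketched as follows. Since the expressions for the state variable weight vector and for α are not reproduced in this text, the exponential state function $S_j(x) = \exp(\alpha (x_j - \bar{x}_j))$ used below is purely an assumption chosen to match the verbal description (attributes below the mean are punished, those above it are encouraged); it should not be read as the formula of [22,23]. The normalized product form $w_j(x) = w_j S_j(x) / \sum_k w_k S_k(x)$ is the usual way a state vector is turned into variable weights, and the function name is hypothetical.

```python
import numpy as np

def variable_weights(X, w, alpha):
    """Row-wise variable weights for M candidates (rows) and m attributes (columns).

    ASSUMPTION: the state function exp(alpha * (x_j - mean_j)) is illustrative only.
    With alpha = 0 the constant weights w are returned for every candidate.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    state = np.exp(alpha * (X - X.mean(axis=0)))   # Step 2: state variable weight vector
    unnormalized = w * state                        # Hadamard product with constant weights
    # Step 3: normalize each row so the weights of one candidate sum to one.
    return unnormalized / unnormalized.sum(axis=1, keepdims=True)
```

With a positive α, a candidate's weight shifts toward the attributes on which it scores above the column mean, so two candidates with the same plain average no longer receive identical scores.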

TOPSIS
In this paper, TOPSIS is used to select the final solution. According to this technique, the chosen optimal solution should have the smallest Euclidean distance from the ideal solution and the largest Euclidean distance from the negative-ideal solution. The ideal solution is a combination of the best value of each objective. In contrast, the negative-ideal solution is a combination of the worst value of each objective.
Before introducing the method, common symbols are defined as follows: $f_{ij}$ is the value of the j-th objective for the i-th solution in the objective matrix, $F_{ij}$ is the normalized value of $f_{ij}$, $v_{ij}$ is the weighted value of $F_{ij}$, and $w_{ij}$ is the weight obtained from Equation (40). Assume M solutions have been found by the MOP. The TOPSIS algorithm can be described as below [24,25]:
Step 1. Construct the normalized objective matrix with M rows and three columns by:
Step 2. Construct the weighted normalized objective matrix by multiplying each column with its weight: $v_{ij} = F_{ij} w_{ij}$ (42)
Step 3. Calculate the Euclidean distance between each solution and the ideal and negative-ideal solutions:
Step 4. Calculate the closeness of each optimal solution: The optimal solution with the largest $C_i$ is the final solution.
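The four steps can be sketched compactly in code. The sketch assumes the classical TOPSIS choices, which may differ in detail from the equations of the paper: vector normalization $F_{ij} = f_{ij} / \sqrt{\sum_i f_{ij}^2}$, all objectives treated as benefits (larger is better), and closeness $C_i = d_i^- / (d_i^+ + d_i^-)$. The function name is hypothetical, and the code assumes that no objective column is all zeros and that the candidates are not all identical.

```python
import numpy as np

def topsis_closeness(f, w):
    """Closeness of each of the M candidates (rows of f) to the ideal solution.

    w may be one weight per objective or an M x m matrix of per-candidate weights.
    """
    f = np.asarray(f, dtype=float)
    w = np.broadcast_to(np.asarray(w, dtype=float), f.shape)
    F = f / np.sqrt((f ** 2).sum(axis=0))             # Step 1: normalize each column
    v = F * w                                          # Step 2: v_ij = F_ij * w_ij
    ideal, negative = v.max(axis=0), v.min(axis=0)     # best and worst value per objective
    d_plus = np.sqrt(((v - ideal) ** 2).sum(axis=1))   # Step 3: distance to ideal
    d_minus = np.sqrt(((v - negative) ** 2).sum(axis=1))
    return d_minus / (d_plus + d_minus)                # Step 4: closeness in [0, 1]
```

The final solution is then `int(np.argmax(topsis_closeness(f, w)))`, where `w` can be the matrix of variable weights computed for the candidates.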

Simulations
In this section, we use simulation results to illustrate the effectiveness of the proposed sensor selection method. N sensors are deployed in the area of interest to estimate the source location. Consider 25 sensors with known reliable probabilities uniformly deployed in a 100 × 100 m² detection area, as shown in Figure 1. In the current work, we assume that the sensing probabilities of the sensors are already known to the fusion center. The estimation of the detection probabilities is studied in [28]. Also, as indicated in [29], the sensing probabilities can be derived from historical data. Since these probabilities are context and scenario dependent, we do not study their estimation specifically in this paper and leave it as a future research topic. Generally, if the sensors around the source have higher reliable probabilities than the other sensors, it is highly likely that the algorithm will select those sensors, owing to both the higher accuracy of their estimated angles and the shorter distances between the source and the sensors. Our interest is in more challenging cases to test the performance of our algorithm. Similar to [20], we assume that the sensors around the source have relatively low reliable probabilities, as shown in Figure 1. We set $\sigma^2 = 1^\circ$ and $\sigma_0^2 = 30^\circ$. The source appears randomly in the area. The prior state of the target follows a uniform distribution limited to a square area with side length H.
As shown in [21], the MOEA/D method runs much faster than NSGA-II under the same conditions. In this section, we use the hypervolume indicator ($I_H$) to observe the convergence and distribution of the PF. In MOEA/D, T is set to 25. The population size N in both MOEA/D and NSGA-II is set to 500. Both algorithms are run 30 times independently, and each run stops after 500 generations. For both methods, the genetic operators are the one-point crossover operator and the standard mutation operator; the crossover probability is 1 and the mutation probability is 1/N.
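The two genetic operators mentioned above can be sketched for a binary selection vector, in which entry i is 1 if sensor i is selected. Two points are assumptions made here rather than statements from the paper: the binary encoding itself, and the reading of the "standard mutation operator" as independent bit flips. The function names are hypothetical.

```python
import numpy as np

def one_point_crossover(rng, parent_a, parent_b):
    """Swap the tails of two parents at a random cut point (crossover probability 1)."""
    cut = rng.integers(1, len(parent_a))   # cut in 1..len-1, so both parents contribute
    child_a = np.concatenate((parent_a[:cut], parent_b[cut:]))
    child_b = np.concatenate((parent_b[:cut], parent_a[cut:]))
    return child_a, child_b

def bit_flip_mutation(rng, selection, rate):
    """Flip each bit of a 0/1 selection vector independently with probability `rate`."""
    flip = rng.random(len(selection)) < rate
    return np.where(flip, 1 - selection, selection)
```

A random generator created with `np.random.default_rng(seed)` can be passed as `rng` to make runs reproducible.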
The evolution of the average $I_H$ values with the number of generations is plotted in Figure 2. It is evident that MOEA/D outperforms NSGA-II in both convergence speed and the quality of the final solution set.
We compare the sensor selection results of different methods when nine sensors are selected. The convex optimization method proposed in [14] and the greedy heuristic approach developed in [5] are applied to select sensors. Figure 3 plots the selection results for two different source locations. It can be seen that the sensors selected by the convex optimization method are closer to the target location even when they have low reliable probabilities, while the MI-based method always selects sensors with large reliable probabilities. This is consistent with the result in [20], which was observed for RSS-based target localization. The proposed method, however, selects sensors that have relatively high reliable probabilities but are not very far away from the source location.
To illustrate the advantage of combining the two metrics (BFI and MI), we use MOEA/D to solve the three-objective optimal problem (MOP3) and the two-objective optimal problem (MOP2), respectively. The MOP2 can be formulated as: From now on, we label Equations (31), (46) and (47) as MOP3-BFI, MOP2-BFI and MOP2-MI, respectively. MOP3-BFIU denotes the multiple objective problem when BFIU replaces BFI in Equation (31). The sequential importance resampling (SIR) particle filter [30,31] is then used to achieve source localization. The initial Ns = 1000 particles are drawn from $f(x_0)$. The root mean square error (RMSE) over 1000 Monte Carlo runs is used to measure the errors between the true source location and the estimates. Figure 4 plots the RMSEs of the four compared methods as the number of selected sensors increases. We can observe that the MOP3-based sensor selection methods improve localization performance compared to the MOP2-based methods. That is to say, using both BFI and MI as objectives has advantages over using only one of them. In addition, the RMSE decreases markedly as the number of sensors increases in Figure 4a, while this behavior does not appear in Figure 4b. In Figure 4b, the RMSE first decreases and then goes up as the number of selected sensors increases. This is mainly because the second source is close to the edge of the sensor network, so sensors with large distances and low reliable probabilities have a negative effect on localization accuracy.
As discussed in Section 2, $\sigma_0$ represents the interference, which directly influences the localization performance. Figure 5 plots the RMSE as $\sigma_0$ changes when nine sensors are selected. We can see that the proposed method improves the localization performance when a fixed number of sensors is selected.
As stated in Section 4.2, the constant weights of the different objectives directly determine the selection result. In this section, we illustrate how the constant weights influence the selection results. We set the weight of the first objective function to be the same for the MOP3- and MOP2-based methods. For MOP3, the remaining two objective functions have the same weight.
As shown in Figure 6, the larger $w_1$ is, the fewer sensors are selected. This is because a large $w_1$ means that the cost is more important than the performance. Thus, the performance is sacrificed to save cost. We can further observe that the MOP3-based methods usually select more sensors than the MOP2-based methods under the same weight. As a result, the localization error of MOP3 is lower than that of MOP2. Furthermore, MOP3-BFIU performs better than MOP3-BFI in localization accuracy.
For the different source locations shown in Figure 7, Figure 8 plots the number of selected sensors and the corresponding RMSE. We can observe that MOP3-BFI selects more sensors than MOP3-BFIU but has similar localization performance. For MOP2-BFI and MOP2-MI, it is hard to say which one is the better performance metric for selecting sensors.
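The localization step used in these experiments, an SIR particle filter followed by an RMSE computation, can be sketched as below for a static source. This is a simplified illustration under assumptions made here: particles are weighted by the product of the selected sensors' bearing likelihoods, multinomial resampling is applied after every update, no motion model or particle jitter is included, and `noise_pdf` is a hypothetical callable for the angle-error density. It is not the exact filter configuration of the paper.

```python
import numpy as np

def sir_step(rng, particles, weights, sensors, bearings, noise_pdf):
    """Weight particles by the bearing likelihoods, then resample (SIR)."""
    for s, z in zip(sensors, bearings):
        theta = np.arctan2(particles[:, 1] - s[1], particles[:, 0] - s[0])
        err = (z - theta + np.pi) % (2.0 * np.pi) - np.pi   # wrapped angle error
        weights = weights * noise_pdf(err)
    weights = weights / weights.sum()
    # Multinomial resampling: draw particle indices in proportion to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    uniform = np.full(len(particles), 1.0 / len(particles))
    return particles[idx], uniform

def rmse(estimates, truth):
    """Root mean square position error over Monte Carlo runs."""
    diff = np.asarray(estimates, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))
```

The location estimate of one run is the mean of the resampled particles; collecting these estimates over the Monte Carlo runs and passing them to `rmse` gives the error measure plotted against the number of selected sensors.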

Conclusions
In this paper, we propose a novel sensor selection scheme for AOA-based source localization in unreliable sensor networks. The relationship between FI and MI is investigated; it reveals that the two metrics are consistent only for Gaussian noise. By transforming multiple performance metrics into scale-equivariant functions, the MOEA/D method is used to find a set of Pareto-optimal solutions. Then, VWDM and TOPSIS are used to choose the final selection result. The simulation results show that the MOP3-based method selects more informative sensors, which provide better localization accuracy than those chosen by the MOP2-based methods, and that different numbers of sensors can be selected by allocating different weight vectors according to preference.

Author Contributions: Q.Y. and J.C. proposed the algorithm, and Q.Y. performed the simulations and wrote the paper. J.C. improved this paper. All authors have read and agreed to the published version of the manuscript.