LSRR-LA: An Anisotropy-Tolerant Localization Algorithm Based on Least Square Regularized Regression for Multi-Hop Wireless Sensor Networks

As is well known, multi-hop range-free localization algorithms perform well in isotropic networks in which sensor nodes are distributed evenly and densely. However, these algorithms are easily affected by network topology, which can cause a significant decrease in positioning accuracy. To improve the localization performance in anisotropic networks, this paper presents a multi-hop range-free localization algorithm based on Least Square Regularized Regression (LSRR). By building a mapping relationship between hop counts and real distances, we can regard the process of localization as a regularized regression. Firstly, the proximity information of the given network is measured. Then, a mapping model between the geographical distances and the hop distances is constructed by LSRR. Finally, each sensor node finds its own position via this mapping. The Average Localization Error (ALE) metric is used to evaluate the proposed method in our experiments, and the results show that, compared with similar methods, our approach can effectively decrease the effect of anisotropy, thus considerably improving the positioning accuracy.


Introduction
Obtaining accurate location information is a precondition for many applications that are based on localization technology. Among the existing positioning technologies, satellite positioning systems such as GPS offer global coverage and high accuracy, and have been widely used all over the world [1,2]. However, these satellite positioning systems also have some significant shortcomings, including high energy consumption, high cost, reliance on fixed auxiliary facilities, and usability only in open spaces. Since the 1990s, Wireless Sensor Networks (WSNs) have attracted the attention of many researchers. As a complement to satellite positioning systems, WSNs offer localization at low cost and low power consumption, without additional auxiliary facilities. In some applications, WSN-based localization systems still work well where satellite positioning systems may be out of service [3][4][5]. For example, WSN localization approaches can provide a relatively accurate location in areas where satellite signals are blocked, such as primitive forests and buildings. Some localization methods for WSNs are based on the characteristics of network connectivity and multi-hop information.
With the help of nodes whose positions are known, which are called anchors, the unknown nodes can estimate their locations. Such localization technologies are also called multi-hop range-free localization methods [6,7]. Localization approaches in WSNs can be divided into two types: range-based algorithms [8,9] and range-free algorithms [10][11][12]. The range-based (e.g., RSSI-based, Received Signal Strength Indication) localization algorithms estimate the locations of sensor nodes by measuring their distances or angles from the anchor nodes. The range-based localization algorithms can deal with anisotropic networks and ensure a relatively accurate estimation of sensor locations; however, these algorithms are easily affected by environmental factors, such as noise and obstacles. In addition, their complexity and cost also increase with the accuracy requirement. For a large-scale sensor network with many sensor nodes, it is not reasonable to equip them all with ranging devices. In contrast, range-free localization algorithms use only the connectivity information among nodes, i.e., they do not require expensive ranging devices. In large-scale WSN localization applications, where sensor nodes are provided with only a small amount of resources, the range-free algorithms are preferred. Since the multi-hop range-free positioning technique only needs the hop information between nodes, it can position unknown nodes without additional devices and thus has been widely used.
The multi-hop range-free localization methods perform rather well in an even and dense network, which frequently means that a large number of nodes are deployed in a limited area. They depend heavily on the network topology. Taking Figure 1 as an example, the dotted lines indicate the geometric distance between nodes A and B, while the red solid lines denote the corresponding hop distances between them. If the sensor nodes are distributed evenly, the geometric distances will roughly match the hop distances (refer to Figure 1a). However, this matching relationship is undermined if obstacles bend the paths between nodes A and B, leading to a mismatch between the geometric distances and the hop distances (refer to Figure 1b). We usually call a network like that shown in Figure 1b an anisotropic network, which has different propagation models and irregular localization areas. A large deviation between the geometric distances and hop distances may cause a sharp drop in positioning accuracy, which is exactly the reason that the multi-hop range-free localization methods perform so poorly in anisotropic networks.
To improve the positioning accuracy in such anisotropic networks, we propose a multi-hop range-free localization method based on Least Square Regularized Regression, i.e., LSRR-LA (Least Square Regularized Regression based multi-hop range-free Localization Algorithm). In LSRR-LA, we build a mapping model between geometric distances and hop counts by using Least Square Regularized Regression, and then we estimate the locations of unknown nodes by means of this model. While this mapping model is being established, structural risk minimization is achieved by tuning the weights between the empirical risk and a penalty term, thus ensuring that the mapping model is sufficiently stable and accurate.
The rest of this paper is organized as follows. Related works are briefly introduced in Section 2. Section 3 describes the proposed LSRR-LA method in detail. Extensive simulation experiments are conducted to evaluate the performance of LSRR-LA and make comparisons with typical localization solutions, and the results are reported in Section 4. Finally, Section 5 gives some conclusions.

The Proposed Localization Algorithm
In this section, we will describe LSRR-LA in detail. Firstly, we present the localization problem in Section 3.1 and then deduce a formulated mapping model. Secondly, we will demonstrate the three steps of LSRR-LA in Section 3.2. Finally, the pseudo-code of LSRR-LA is illustrated.

Problem Statement
Consider a two-dimensional region containing a sensor network given by a set of nodes $N = \{N_1, N_2, \ldots, N_{m+n}\}$, which consists of $m$ anchor nodes and $n$ sensor nodes. The coordinates of these nodes are described by definition (1):

$$\mathrm{cor}(N_p) = (x_p, y_p)^T, \quad \text{for } p = 1, \ldots, m+n. \tag{1}$$

In WSN $N$, the positions of the $m$ anchor nodes $N_i \in A$, $A = \{N_i \mid i = 1, \ldots, m\}$, are known, while the positions of the $n$ sensor nodes $N_j \in B$, $B = \{N_j \mid j = m+1, \ldots, m+n\}$, are unknown. In the case of multi-hop networks, the hop counts between pairs of nodes are already known. The Euclidean distance from node $N_i$ to $N_j$ ($i \neq j$) is given by Equation (2):

$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}. \tag{2}$$

The hop count from node $N_i$ to $N_j$ ($i \neq j$) is denoted by $h_{ij}$. Taking Figure 1a as an example, in such a multi-hop network $d_{ij}$ is roughly proportional to $h_{ij}$, i.e., $d_{ij} \propto h_{ij}$. Thus, the localization problem can be formulated as:

Estimate $\mathrm{cor}(N_k)$, given $\mathrm{cor}(N_i)$, $d_{ij}$, and $h_{kt}$,

where $N_i, N_j \in A$, $N_k \in B$, and $h_{kt}$ denotes the hop count between node $N_k$ and node $N_t$ ($N_t \in A \cup B$).

Assume that the distances and hop counts from $N_i$ to the other anchors are represented by the vectors $y_i$ and $x_i$, respectively. Our aim is to learn a function $f: X \to Y$, where $x_i \in X$ and $y_i \in Y$. The input space $X$ (hops) holds the independent variable, and the output space $Y$ (distances) holds the dependent variable. The function $f$ is represented by a linear model

$$f(x) = x^T w, \tag{4}$$

where $w$ is an $m \times 1$ vector that contains the coefficients of the linear function. Usually, linear least squares (LLS) is used to determine $w$. It uses the vertical distance between the observed values $y_i$ and the predictions $f(x_i)$, which are known as the residuals:

$$r_i = y_i - f(x_i).$$

In the LLS case, the sum of the squared residuals is minimized, which in matrix form is

$$\sum_{i=1}^{m} r_i^2 = \| y - Xw \|^2, \tag{5}$$

with $X$ being the $m \times m$ design matrix that combines all of the hop-count vectors, and the $m \times 1$ vector $y$ combining the distance values:

$$X = \begin{pmatrix} x_1^T \\ \vdots \\ x_m^T \end{pmatrix}, \quad y = \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix}, \tag{6}$$

where each row corresponds to one input/output example. The coefficients are obtained by minimizing Equation (5) as follows:

$$\hat{w} = \arg\min_{w} \| y - Xw \|^2. \tag{7}$$

The minimizer of problem (7) is

$$\hat{w} = (X^T X)^{-1} X^T y. \tag{8}$$

Thus, to make a prediction for a novel input $x_{new}$, we only need to know the model and its parameters $\hat{w}$, which can be represented as

$$\hat{y}_{new} = f(x_{new}) = x_{new}^T \hat{w}. \tag{9}$$
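The linear least squares fit and prediction described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fit_lls` and `predict`, the toy hop-count matrix, and the coefficient values are all hypothetical. The fit uses `np.linalg.lstsq`, which minimizes the same squared residual as the closed-form minimizer but avoids forming the inverse of $X^T X$ explicitly.

```python
import numpy as np


def fit_lls(X, y):
    """Return the coefficient vector w that minimizes ||y - X w||^2."""
    # lstsq solves the least squares problem without an explicit inverse.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def predict(x_new, w):
    """Predict the output of the linear model f(x) = x^T w."""
    return x_new @ w


# Hypothetical toy data: 3 anchors, row i holds the hop counts from
# anchor i to every anchor. The target values are generated from
# made-up coefficients so that the fit can be checked exactly.
X = np.array([[0.0, 2.0, 4.0],
              [2.0, 0.0, 3.0],
              [4.0, 3.0, 0.0]])
w_true = np.array([10.0, 20.0, 30.0])
y = X @ w_true

w_hat = fit_lls(X, y)
y_new = predict(np.array([1.0, 1.0, 2.0]), w_hat)
```

Because the toy design matrix is invertible and the targets are noise-free, the fit recovers the made-up coefficients; with real hop counts the residual will generally be non-zero.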

Localization Algorithm
The proposed LSRR-LA contains the following three parts: the measurement part, the training part, and the localization part.
Part A (Measurement): Referring to Formula (10), we express the hop-count matrix $H$ (and, likewise, the distance matrix $D$) by splitting it into four blocks:

$$H = \begin{pmatrix} H_1 & H_2 \\ H_3 & H_4 \end{pmatrix}, \quad D = \begin{pmatrix} D_1 & D_2 \\ D_3 & D_4 \end{pmatrix}. \tag{10}$$

We denote by $H_1$ (resp. $D_1$) the $m \times m$ matrix for the anchors versus themselves, $H_3$ (resp. $D_3$) the $n \times m$ matrix for the sensors versus the anchors, and $H_4$ (resp. $D_4$) the $n \times n$ matrix for the sensors versus themselves. It follows that $H_2 = H_3^T$ and $D_2 = D_3^T$.
Note that $H_1$, $H_3$, and $H_4$ are known, while $D_3$ and $D_4$ are unknown. The goal is to predict $D_3$ (or $D_2$) from $H$ and $D_1$.
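To make the measurement part concrete, the sketch below builds an all-pairs hop-count matrix and splits it into the four blocks. It rests on assumptions that are not taken from the paper: connectivity follows a simple unit-disk model (two nodes are neighbors if they lie within the communication range), hop counts are obtained centrally by breadth-first search rather than by flooding, and the first $m$ rows are the anchors. The names `hop_count_matrix` and `split_blocks` are hypothetical.

```python
from collections import deque

import numpy as np


def hop_count_matrix(coords, comm_range):
    """All-pairs hop counts via BFS on a unit-disk connectivity graph."""
    n = len(coords)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Two distinct nodes are neighbors if they are within range.
    adj = (dist <= comm_range) & ~np.eye(n, dtype=bool)
    hops = np.full((n, n), np.inf)  # inf marks unreachable pairs
    for s in range(n):
        hops[s, s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in np.flatnonzero(adj[u]):
                if hops[s, v] == np.inf:
                    hops[s, v] = hops[s, u] + 1
                    queue.append(v)
    return hops


def split_blocks(M, m):
    """Split M into (M1, M2, M3, M4), assuming the first m nodes are anchors."""
    return M[:m, :m], M[:m, m:], M[m:, :m], M[m:, m:]


# Hypothetical example: five nodes on a line, 10 m apart, range 12 m.
coords = np.array([[0, 0], [10, 0], [20, 0], [30, 0], [40, 0]], dtype=float)
H = hop_count_matrix(coords, comm_range=12.0)
H1, H2, H3, H4 = split_blocks(H, m=2)
```

In this example the two end nodes are four hops apart, and `H2` equals the transpose of `H3` because the connectivity graph is undirected.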
Part B (Training): We now have the input matrix $H_1$ and the output matrix $D_1$, and can train the linear model proposed in Formula (4). In contrast to (4), however, the coefficient vector $w$ becomes an $m \times m$ matrix $W$, and the outputs are stored in the matrix $D_1$. Thus, the LLS solution becomes

$$\hat{W} = (H_1^T H_1)^{-1} H_1^T D_1. \tag{11}$$

In Formula (11), the variation range of the distance matrix $D_1$ is different from that of the hop-count matrix $H_1$. If the raw data are used directly, the calculation results will be affected by this difference in scale. We can eliminate this effect by preprocessing operations, such as normalization, so that $D_1$ and $H_1$ have the same scale and are equally emphasized. In this paper, the normalized matrices of $H_1$ and $D_1$ are denoted by $\tilde{H}_1$ and $\tilde{D}_1$, respectively.

When calculating the model parameter $W$, we use the data of the anchor nodes. If we include more anchor nodes, we will fit the training data more accurately, which means that our model is capable of minimizing the training errors more effectively. However, the minimization of training errors is not our goal. We want the model to estimate the locations of sensor nodes accurately, that is, the model should have a generalization capability. With many anchors, fitting the full model without penalization will result in large prediction intervals, and an LLS regression estimator may not uniquely exist. In order to avoid over-fitting, it is necessary to add a regularization term (a penalty term) to the model [21]. In addition, because the LLS estimates depend upon $(\tilde{H}_1^T \tilde{H}_1)^{-1}$, there can be problems in computing $W$ if $\tilde{H}_1^T \tilde{H}_1$ is singular or nearly singular. In this paper, a term $kI$ ($k > 0$) is added to the matrix $\tilde{H}_1^T \tilde{H}_1$ to remedy this problem. This method is known as Tikhonov regularization, which is the most commonly used method for the regularization of ill-posed problems. The estimated $W$ can then be described as

$$\hat{W} = (\tilde{H}_1^T \tilde{H}_1 + kI)^{-1} \tilde{H}_1^T \tilde{D}_1. \tag{12}$$

Here, $I$ is the identity matrix. With the regularization term, the model can effectively avoid over-fitting. The $k$ in the formula is a hyper-parameter, which is used to balance the training error against the regularization term [22]. Considering that $\tilde{H}_1^T \tilde{H}_1$ will be an ill-conditioned matrix if $\tilde{H}_1^T \tilde{H}_1 < 0.01$, we simply set the value of the parameter $k$ to 0.01 in the later simulation part.

Part C (Localization): Now, the normalized hop-count matrix $\tilde{H}_3$ and the coefficient matrix $\hat{W}$ are used to estimate the normalized distance matrix $\tilde{D}_3$:

$$\tilde{D}_3 = \tilde{H}_3 \hat{W}. \tag{13}$$

After obtaining the result of Equation (13), we can derive $D_3$ by reversing the normalization of $\tilde{D}_3$, and then the trilateration method or the maximum likelihood method is used to estimate the coordinates of the unknown nodes [23].
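The training and prediction steps can be sketched as follows. This is an illustrative sketch under assumptions, not the authors' code: the text does not say which normalization is used, so dividing each matrix by its maximum entry is an assumption made here, and the names `fit_lsrr` and `predict_distances` as well as the toy matrices are hypothetical. The regularized system is solved with `np.linalg.solve` rather than by forming the inverse.

```python
import numpy as np


def fit_lsrr(H1, D1, k=0.01):
    """Fit W by Tikhonov-regularized least squares on normalized data.

    Normalization by the maximum entry is an assumption of this sketch.
    """
    h_scale, d_scale = H1.max(), D1.max()
    Hn, Dn = H1 / h_scale, D1 / d_scale
    m = Hn.shape[1]
    # Solve (Hn^T Hn + k I) W = Hn^T Dn instead of inverting explicitly.
    W = np.linalg.solve(Hn.T @ Hn + k * np.eye(m), Hn.T @ Dn)
    return W, h_scale, d_scale


def predict_distances(H3, W, h_scale, d_scale):
    """Map sensor-to-anchor hop counts to distances and undo the scaling."""
    return (H3 / h_scale) @ W * d_scale


# Hypothetical toy network: 3 anchors, every hop spans exactly 25 m.
H1 = np.array([[0.0, 1.0, 2.0],
               [1.0, 0.0, 1.0],
               [2.0, 1.0, 0.0]])
D1 = 25.0 * H1
H3 = np.array([[1.0, 1.0, 1.0]])  # one sensor, one hop from each anchor

W, hs, ds = fit_lsrr(H1, D1, k=0.01)
D3_hat = predict_distances(H3, W, hs, ds)
```

With a positive $k$ the estimates are shrunk slightly relative to the unregularized fit, which is the price paid for a well-conditioned system; as $k$ approaches zero the toy example returns 25 m for each anchor.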
The pseudo-code of LSRR-LA is illustrated in Algorithm 1. With respect to step 6 in Algorithm 1, for specific details the reader can refer to [23].
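For the final position-estimation step, a common linearized least squares form of multilateration is sketched below. It is offered as an assumption-laden illustration and is not necessarily the exact procedure of [23]: the function name `multilaterate` and the example coordinates are hypothetical. Subtracting the range equation of a reference anchor from the others removes the quadratic terms and leaves a linear system in the unknown position.

```python
import numpy as np


def multilaterate(anchors, dists):
    """Estimate a 2D position from anchor coordinates and distances.

    The last anchor is used as the reference. Subtracting its range
    equation from the others gives the linear system
    2 (a_i - a_ref) . p = |a_i|^2 - |a_ref|^2 - d_i^2 + d_ref^2.
    """
    ref, d_ref = anchors[-1], dists[-1]
    A = 2.0 * (anchors[:-1] - ref)
    b = (np.sum(anchors[:-1] ** 2, axis=1) - np.sum(ref ** 2)
         - dists[:-1] ** 2 + d_ref ** 2)
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos


# Hypothetical example: four anchors at the corners of a 100 m square.
anchors = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0], [100.0, 100.0]])
true_pos = np.array([30.0, 40.0])
dists = np.linalg.norm(anchors - true_pos, axis=1)  # noise-free distances
est_pos = multilaterate(anchors, dists)
```

With exact distances the estimate coincides with the true position; with distances predicted from hop counts the system is inconsistent and the least squares solution spreads the error across anchors.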

Performance Evaluations
In this section, we evaluate the performance of LSRR-LA through simulations and an experiment. First, in Section 4.1, we analyze the complexity of LSRR-LA and of other classical algorithms, such as DV-Hop, Amorphous, PDM, LSVR, and MSVR. Then, in Section 4.2, we use the performance metric ALE (Average Localization Error) to evaluate these algorithms in C-shaped, L-shaped, and D-shaped network topologies. Finally, in Section 4.3, the positioning accuracy of LSRR-LA is verified in an outdoor experiment.

Complexity Comparison
The complexity mainly includes two aspects, i.e., communication complexity and computation complexity. The LSRR-LA method is similar to the DV-Hop, Amorphous, MDS-MAP, PDM, and LSVR methods in that each node needs to obtain its hop counts to other nodes by flooding, so their communication cost is equal, which is about $O(n^2 m)$. Here, $n$ denotes the number of sensor nodes while $m$ indicates the number of anchor nodes. DV-Hop and Amorphous use the Least Squares (LS) method to estimate the locations of sensor nodes, so they require a computation complexity of about $O(m^3)$ [24]. MDS-MAP is a centralized localization method, and its location process can be divided into three steps: constructing the global shortest paths, executing the MDS algorithm, and converting the relative coordinates to absolute coordinates. Accordingly, the computation complexity of these steps is $O(n^3)$, $O(n^3)$, and $O(m^3 + n)$, respectively. The PDM method uses TSVD to process data in advance, which has a computation cost of $O(m^3)$ [25]; it then uses the LS method for the remaining calculation, so the PDM method costs more than DV-Hop and Amorphous do. LSVR and MSVR use regression methods based on SVM, which need to solve a quadratic programming problem [26]; thus, their computational complexity lies between $O(m^2)$ and $O(m^3)$. LSRR-LA uses the Least Square Regularized Regression method, so its computation cost is $O(m^3)$. Furthermore, this cost can be reduced to $O(m^2 \log m)$ by using the method introduced in [27].

Simulation Results
We conduct simulations in MATLAB (Jinling Institute of Technology, Nanjing, Jiangsu, China, version R2016b) and compare the performance of our proposed method with that of the previous methods, including DV-Hop, Amorphous, MDS-MAP, PDM, and LSVR. The communication range of all of the sensor and anchor nodes is identical, and these nodes are deployed in a 1000 m × 1000 m square region for all simulations. We assume two types of distribution: regular distribution and random distribution. For each type of distribution, we consider three types of networks: C-shaped, L-shaped, and D-shaped. The networks used in the following simulations are presented in Table 1. Specifically, we use the symbol '*' to indicate the anchor nodes and 'o' to denote sensor nodes.

Table 1. Network topologies used in the simulations: C-shaped, L-shaped, and D-shaped networks, each under random distribution and regular distribution.

Sensors 2018, 18, 3974

All of the reported results are averaged over 100 trials. We use the ALE to evaluate the performance of all of the compared methods. The ALE is defined as

$$\mathrm{ALE} = \frac{1}{N} \sum_{i=1}^{N} e_i,$$

where $N$ is the number of unknown nodes. The location error of sensor node $i$ can be formulated as

$$e_i = \frac{\| x_i - \hat{x}_i \|}{R} \times 100\%,$$

where $x_i$ and $\hat{x}_i$ denote the real and estimated location of node $i$, respectively, $R$ is the communication range of the nodes, and $\| \cdot \|$ represents the Euclidean distance. The assumptions and simulation parameters are listed in Table 2.

In the case of regular distribution, the sensor nodes and anchors are deployed uniformly on the grid points within the network. The size of the grid is set to 45 m × 45 m. We consider five levels of communication range R: 50 m, 100 m, 150 m, 200 m, and 250 m. Table 3 shows the ALE results of the C-shaped network with 44 training anchor nodes. For example, the result of DV-Hop is 283.8/92.6% in the case of R = 100 m, which means that the ALE of DV-Hop is 283.8 and that LSRR-LA improves the ALE by 92.6%. The ALE value 283.8 is a percentage of the communication range; converted to an absolute value (i.e., R × ALE/100), it corresponds to 283.8 m.
By contrast, the ALE value of LSRR-LA is 21.0, which means that the absolute localization error is 21.0 m. We can see from the simulation results in Table 3 that, as the communication range becomes larger, all of the competing methods demonstrate a substantial performance improvement, whereas the proposed LSRR-LA method changes little because it keeps a low ALE as the communication range R increases. It is clear that the proposed LSRR-LA method outperforms the previous methods, for it is rather stable and not sensitive to the communication range.
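The ALE metric and the reported improvement percentage can be computed as in the sketch below. It assumes, following the conversion R × ALE/100 given in the text, that the ALE is the mean Euclidean error expressed as a percentage of the communication range; the function names `average_localization_error` and `improvement` and the toy coordinates are hypothetical.

```python
import numpy as np


def average_localization_error(true_pos, est_pos, comm_range):
    """Mean Euclidean error as a percentage of the communication range."""
    err = np.linalg.norm(true_pos - est_pos, axis=1)
    return 100.0 * err.mean() / comm_range


def improvement(ale_baseline, ale_proposed):
    """Relative ALE reduction of the proposed method, in percent."""
    return 100.0 * (ale_baseline - ale_proposed) / ale_baseline


# Hypothetical positions for two unknown nodes (errors of 5 m and 10 m).
true_pos = np.array([[0.0, 0.0], [10.0, 0.0]])
est_pos = np.array([[3.0, 4.0], [10.0, 10.0]])
ale = average_localization_error(true_pos, est_pos, comm_range=50.0)

# Values quoted in the text for R = 100 m: DV-Hop 283.8, LSRR-LA 21.0.
gain = improvement(283.8, 21.0)
```

Applied to the two values quoted for R = 100 m, the improvement formula gives about 92.6%, which agrees with the figure reported for DV-Hop.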
We  Table 3 gives the ALE results of the C-shaped topology when the communication radius R = 100 m. Table 4, it is obvious that the proposed LSRR-LA greatly outperforms the previous methods for the C-shaped anisotropic network. All of the competing methods demonstrate a performance improvement as the network gets denser; however, only the proposed LSRR-LA exhibits a substantial performance improvement while the previous methods do not exhibit satisfactory improvement.  Figure 2 shows a plot of Tables 3 and 4. Figure 2a depicts the ALE results in terms of Table 2, where the communication radius varies from 50 m to 250 m. All of them generally conform to the trend that ALE decreases as the communication radius increases. When the communication radius R is smaller (R = 50 m), the positioning accuracy of DV-Hop, Amophos, and MDS-MAP becomes very poor. As the R increases, their localization errors begin to decrease. The LSRR-LA method always maintains the best positioning accuracy, whether R is smaller (R = 50 m) or R is larger (R = 250 m). Figure 2b shows the ALE results according to Table 4 where the number of anchor nodes increases from 30 to 70. It can be seen that the ALE change ranges of these algorithms are not violent, and the trend of ALE decreases gradually as the number of anchor nodes increases. As a result, LSRR-LA exhibits the best performance. From the Table 4, it is obvious that the proposed LSRR-LA greatly outperforms the previous methods for the C-shaped anisotropic network. All of the competing methods demonstrate a performance improvement as the network gets denser; however, only the proposed LSRR-LA exhibits a substantial performance improvement while the previous methods do not exhibit satisfactory improvement.   Table 2, where the communication radius varies from 50m to 250m. All of them generally conform to the trend that ALE decreases as the communication radius increases. 
When the communication radius R is smaller (R = 50m), the positioning accuracy of DV-Hop, Amophos, and MDS-MAP becomes very poor. As the R increases, their localization errors begin to decrease. The LSRR-LA method always maintains the best positioning accuracy, whether R is smaller (R = 50m) or R is larger (R = 250m). Figure 2b shows the ALE results according to Table 4 where the number of anchor nodes increases from 30 to 70. It can be seen that the ALE change ranges of these algorithms are not violent, and the trend of ALE decreases gradually as the number of anchor nodes increases. As a result, LSRR-LA exhibits the best performance.      In addition, three-dimensional graphics are used to visualize the performance in order to show the localization results more clearly. In Figure 5, the three-dimensional (3D) localization results of LSRR-LA are compared with those of the classic DV-Hop method. The results prove that LSRR-LA outperforms DV-Hop, especially at the edge of these regions.    In addition, three-dimensional graphics are used to visualize the performance in order to show the localization results more clearly. In Figure 5, the three-dimensional (3D) localization results of LSRR-LA are compared with those of the classic DV-Hop method. The results prove that LSRR-LA outperforms DV-Hop, especially at the edge of these regions. In addition, three-dimensional graphics are used to visualize the performance in order to show the localization results more clearly. In Figure 5, the three-dimensional (3D) localization results of LSRR-LA are compared with those of the classic DV-Hop method. The results prove that LSRR-LA outperforms DV-Hop, especially at the edge of these regions.  In addition, three-dimensional graphics are used to visualize the performance in order to show the localization results more clearly. In Figure 5, the three-dimensional (3D) localization results of LSRR-LA are compared with those of the classic DV-Hop method. 
The results prove that LSRR-LA outperforms DV-Hop, especially at the edge of these regions.
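The ALE values quoted in this section can be reproduced from per-node position estimates. The following is a minimal sketch, assuming that ALE is the mean Euclidean distance between estimated and true positions, which is consistent with ALE being read directly in metres above. The function name and the sample coordinates are hypothetical and are not taken from the paper.

```python
import math


def average_localization_error(estimated, actual):
    """Mean Euclidean distance between estimated and true positions.

    Assumption: this is the ALE definition; the result has the same unit
    as the input coordinates (metres here).
    """
    if len(estimated) != len(actual) or len(estimated) == 0:
        raise ValueError("need two non-empty position lists of equal length")
    # One Euclidean error per node, then the average over all nodes.
    errors = [math.dist(e, a) for e, a in zip(estimated, actual)]
    return sum(errors) / len(errors)


# Hypothetical positions in metres, for illustration only.
true_xy = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
est_xy = [(3.0, 4.0), (100.0, 5.0), (6.0, 92.0)]
ale = average_localization_error(est_xy, true_xy)
```

Because the metric is an unnormalized average, a few badly localized nodes (for instance at the edges of a region) can dominate it.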

Random Deployment
Sometimes, in randomly deployed networks, some nodes cannot connect to their neighbors because the geometric distance between them exceeds their communication range. A localization method may then fail to estimate the locations of these isolated nodes, so we increase the total number of nodes and the communication radius to avoid this. Approximately 350 nodes (including anchor nodes and sensor nodes) are randomly distributed in an area of 1000 m × 1000 m. Table 5 shows the ALE results of the six methods when the total number of anchor nodes is 70 (20%) and the communication radius R varies from 100 m to 300 m. The simulation results in Table 5 show that, under random deployment, the proposed LSRR-LA method achieves the best positioning accuracy. For example, when the communication radius R = 200 m and the anchor ratio M = 20%, LSRR-LA achieves an ALE of 38.9, which is much better than the others.
As shown in Table 6, similar to regular deployment, five different anchor populations are also considered in random deployment. Here, we show only the ALE results of the C-shaped network when the communication radius R = 150 m and the number of anchors M is 65, 75, 85, 95, and 105, respectively. All of the methods improve as the network becomes denser, but the improvements are not satisfactory. Taking LSRR-LA as an example, its ALE is 38.9 with M = 65 and decreases only to 34.8 with M = 105. This suggests that the anchor population has only a small impact on positioning accuracy in randomly deployed networks, unlike in regularly deployed networks (refer to Table 4).
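The isolated-node problem mentioned above can be checked before running any multi-hop method. The sketch below is an illustration only, assuming a unit-disk connectivity model (two nodes are neighbors when their distance is at most R) and breadth-first search for hop counts. The function names and the four-node layout are hypothetical.

```python
import math
from collections import deque


def build_neighbors(positions, radius):
    """Adjacency lists: nodes are neighbors if their distance is at most `radius`."""
    n = len(positions)
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= radius:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


def hop_counts_from(source, neighbors):
    """Breadth-first search; returns hop counts, with None for unreachable nodes."""
    hops = [None] * len(neighbors)
    hops[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if hops[v] is None:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops


def unreachable_nodes(positions, radius, anchors):
    """Indices of nodes that no anchor can reach over any multi-hop path."""
    neighbors = build_neighbors(positions, radius)
    reachable = set()
    for anchor in anchors:
        hops = hop_counts_from(anchor, neighbors)
        reachable.update(i for i, h in enumerate(hops) if h is not None)
    return sorted(set(range(len(positions))) - reachable)


# Hypothetical layout in metres: three nodes in a chain and one far away.
positions = [(0, 0), (80, 0), (160, 0), (900, 900)]
isolated_small = unreachable_nodes(positions, radius=100, anchors=[0])
isolated_large = unreachable_nodes(positions, radius=1200, anchors=[0])
```

In this toy layout the distant node is unreachable at R = 100 m and becomes reachable once R is large enough. This mirrors the reason given above for increasing the node count and the communication radius.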

Figure 6 summarizes Tables 5 and 6. Figure 6a depicts the ALE results of the six algorithms in randomly deployed networks when the communication radius varies from 100 m to 300 m. The positioning accuracy of DV-Hop, Amorphous, and MDS-MAP is very poor when the communication radius is small (R = 100 m). LSRR-LA maintains the highest positioning accuracy both when R is small (R = 100 m) and when R is large (R = 300 m). Figure 6b shows the ALE results of the six algorithms when the number of anchor nodes increases from 65 to 105; LSRR-LA is still the best of these algorithms.
Figure 7 also summarizes the six competing methods in randomly deployed networks when the communication radius R is 150 m and the number of anchor nodes M is 70; LSRR-LA again demonstrates the best positioning performance. Figure 8 gives the localization results of the six algorithms when the communication radius R is 150 m and the number of anchors M is 70. Under random deployment, LSRR-LA performs stably across the C-shaped, L-shaped, and D-shaped networks. This differs from regular deployment (Figure 4), where it performs best in the C-shaped network and worst in the D-shaped network. Similar to Figure 5, Figure 9 shows the 3D localization errors for a better comparison of positioning accuracy between LSRR-LA and DV-Hop. LSRR-LA still outperforms DV-Hop in randomly deployed networks.
The distribution of nodes is shown in Figure 11a. We assign a number to each node. The transmission power is set to 0 dBm; thus, in this environment, each node can communicate over a distance of 8 m. The connected graph of the entire network is shown in Figure 11b.
In the experiments, the anchor ratio is varied from 10% to 30%. The localization performance is compared using box plots in Figure 12. On each box, the central mark indicates the median, the bottom and top marks represent the minimum and maximum values, and the bottom and top edges of the box indicate the 25th and 75th percentiles, respectively. From the figures, it is evident that the positioning accuracy of the compared localization algorithms (DV-Hop, Amorphous, MDS-MAP, PDM, LSVR, and LSRR-LA) improves as the number of anchor nodes increases. They also show that the proposed method performs much better than the compared methods.
We also use the cumulative distribution function (CDF) of the localization error to represent the performance of the proposed algorithm, as shown in Figure 13. When the anchor ratio is 10%, Figure 13a shows that the CDF is 63.8%, 85.5%, and 98.3% at location errors of 0.5 m, 0.8 m, and 1.5 m, respectively. When the anchor ratio rises to 30%, Figure 13b shows that the CDF is 67.1%, 92.8%, and 100.0% at location errors of 0.5 m, 0.8 m, and 1.5 m, respectively. In this comparison, LSRR-LA performs best, owing to the advantage of its regression model.
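The box-plot marks and the CDF values discussed above are both simple statistics of the per-node error list. The sketch below shows one way to compute them. The error values are hypothetical, and the quartiles use the inclusive interpolation of Python's `statistics.quantiles`, which is an assumption: plotting tools may use a slightly different percentile convention.

```python
import statistics


def empirical_cdf(errors, threshold):
    """Fraction of nodes whose localization error is at most `threshold`."""
    if not errors:
        raise ValueError("errors must not be empty")
    return sum(1 for e in errors if e <= threshold) / len(errors)


def box_plot_summary(errors):
    """Return (minimum, 25th percentile, median, 75th percentile, maximum)."""
    q1, median, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    return min(errors), q1, median, q3, max(errors)


# Hypothetical per-node localization errors in metres.
errors_m = [0.2, 0.4, 0.5, 0.7, 0.9, 1.0, 1.2, 1.6]

# CDF evaluated at the same error thresholds used in the text.
cdf_values = {t: empirical_cdf(errors_m, t) for t in (0.5, 0.8, 1.5)}
summary = box_plot_summary(errors_m)
```

Read this way, a CDF of 98.3% at 1.5 m means that 98.3% of the nodes are localized with an error of at most 1.5 m.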

Conclusions
In this paper, to improve the positioning accuracy in anisotropic networks, we propose a localization method based on Least Square Regularized Regression, i.e., LSRR-LA, which treats localization as a regression problem and constructs a mapping model between hop counts and Euclidean distances. Using this mapping model, we can estimate the locations of unknown nodes more accurately and more effectively. In our experiments, LSRR-LA clearly outperforms the localization methods DV-Hop, Amorphous, MDS-MAP, PDM, and LSVR, especially in anisotropic networks.
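To make the idea of a regularized hop-to-distance mapping concrete, the following is a deliberately simplified sketch and not the model used by LSRR-LA: it fits a single weight w so that distance ≈ w × hops, by minimizing the squared error plus a penalty λw². The function name, the parameter `lam`, and the training pairs are all hypothetical.

```python
def fit_hop_to_distance(hop_counts, distances, lam=1.0):
    """One-parameter regularized least squares for distance ≈ w * hops.

    Minimizes sum((d - w * h) ** 2) + lam * w ** 2, whose closed-form
    solution is w = sum(h * d) / (sum(h * h) + lam).
    """
    if len(hop_counts) != len(distances) or len(hop_counts) == 0:
        raise ValueError("need two non-empty lists of equal length")
    if lam < 0:
        raise ValueError("lam must be non-negative")
    numerator = sum(h * d for h, d in zip(hop_counts, distances))
    denominator = sum(h * h for h in hop_counts) + lam
    if denominator == 0:
        raise ValueError("all hop counts are zero and lam is zero")
    return numerator / denominator


# Hypothetical anchor-to-anchor training pairs (hop count, distance in metres).
hops = [1, 2, 3, 4]
dists = [90.0, 190.0, 270.0, 380.0]

w_plain = fit_hop_to_distance(hops, dists, lam=0.0)  # ordinary least squares
w_reg = fit_hop_to_distance(hops, dists, lam=5.0)    # shrunk toward zero
estimated_distance = w_reg * 3                       # estimate for a 3-hop path
```

The regularization term shrinks the fitted weight, trading a little bias for robustness when the anchor-to-anchor hop counts are noisy, as they tend to be in anisotropic networks.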