Variance Reduction of Sequential Monte Carlo Approach for GNSS Phase Bias Estimation

: Global navigation satellite systems (GNSS) are an important tool for positioning, navigation, and timing (PNT) services. The fast and high-precision GNSS data processing relies on reliable integer ambiguity ﬁxing, whose performance depends on phase bias estimation. However, the mathematic model of GNSS phase bias estimation encounters the rank-deﬁciency problem, making bias estimation a di ﬃ cult task. Combining the Monte-Carlo-based methods and GNSS data processing procedure can overcome the problem and provide fast-converging bias estimates. The variance reduction of the estimation algorithm has the potential to improve the accuracy of the estimates and is meaningful for precise and e ﬃ cient PNT services. In this paper, ﬁrstly, we present the di ﬃ culty in phase bias estimation and introduce the sequential quasi-Monte Carlo (SQMC) method, then develop the SQMC-based GNSS phase bias estimation algorithm, and investigate the e ﬀ ects of the low-discrepancy sequence on variance reduction. Experiments with practical data show that the low-discrepancy sequence in the algorithm can signiﬁcantly reduce the standard deviation of the estimates and shorten the convergence time of the ﬁltering.


Introduction
Global navigation satellite systems (GNSS) are widely used in positioning, navigation, and timing (PNT) services. The accuracy of the precise positioning can reach the level of centimeters and satisfy a pervasive use in civil and military applications. GNSS is being developed at a fast pace, and the systems in full operation at present include the United States of America (USA)'s Global Positioning System (GPS) and Russia's Global Navigation Satellite System (GLONASS). The European Union plans to complete the construction of the Galileo system, while China is going to fully operate the BeiDou Navigation Satellite System (BDS) by 2020 [1,2]. The basic principle of GNSS data processing is to mathematically solve the interesting PNT parameters in the observation models with measurements of the distances between GNSS satellites and receivers. However, the biases in the signal measurements lead to errors in the models and degrade the accuracy of the solutions. Consequently, the bias estimation plays an important role in the quality of the final PNT services [3][4][5]. Reducing the variance of the bias estimates can more precisely recover the measurements and improve the service quality.
Fast and precise GNSS data processing uses the carrier-wave phase measurement by the receivers. The phase measurement only records the fractional part of the carrier phase plus the cumulated numbers. Therefore, the phase measurements from GNSS receivers are not directly the satellite-receiver distance, However, the random sequence used in the Monte Carlo method has possible gaps and clusters. Quasi-Monte Carlo (QMC) replaces the random sequence with a low-discrepancy sequence which can reduce the variance and has better performance [22][23][24][25]. Until now, the QMC-based variance reduction method in GNSS data processing was not addressed. This study aims to combine the GNSS data processing procedure and the sequential QMC (SQMC) methods to obtain precise GNSS phase bias estimates. The paper firstly gives an overview of the mathematical problem in GNSS bias estimation and then provides a renewed algorithm introducing the variance reduction method based on QMC to precisely estimate GNSS phase bias.
The remainder of this article is structured as follows: Section 2 presents the procedure and mathematical models of the GNSS data processing and introduce the difficulties in phase bias estimation. Section 3 gives an overview of the QMC theory. Section 4 describes the proposed GNSS phase bias estimation algorithm based on the SQMC method. Section 5 gives the results of phase bias estimation with practical data, and Section 6 draws the key research conclusions.

Mathematic Models of GNSS Precise Data Processing and the Dilemma
The code and phase measurements are usually used to derive the interesting parameters for GNSS services. The code measurement is the range obtained by multiplying the traveling time of the signal when it propagates from the satellite to the receiver at the speed of light. The phase measurement is much more precise than the code measurement but is ambiguous by an unknown integer number of wavelengths when used as range, and the ambiguities are different every time the receiver relocks the satellite signals [7].
In the measurement models, the unknowns include not only the coordinate parameters, but also the time delays caused by the atmosphere and device hardware, as well as the ambiguities for phase measurement. In relative positioning, the hardware delays can be nonzero values and should be considered in multi-GNSS and GLONASS data processing, i.e., inter-system bias (ISB) [9,10,26] and inter-frequency bias (IFB) [27], respectively. The ISB and IFB of the measurements are correlated with the ambiguities and are the key problems to be solved.
The double difference (DD) measurement models are usually constructed to mitigate the common errors of two non-difference measurements. For short baselines, the DD measurement mathematical models including the interesting parameters for GNSS PNT services, such as coordinates for positioning, the unknown ambiguities, and the ISB or IFB parameters, can be written in the form of matrices as where v denotes the vector of observation residuals; b is composed of unknown single difference (SD) ambiguities N i1 ab , N i2 ab , . . . , N in ab , where i is the reference satellite and n is the number of the DD-equations, and a and b are the stations; y includes the ISB and IFB rate; vector x contains the unknown station coordinate and the other interesting parameters; l is the measurements from the receiver; A is the design matrix of the elements in x; D is the design matrix with elements of zero and the corresponding carrier wavelength. Matrix D transforms SD ambiguities to DD ambiguities; C is the design matrix of y with elements of zero and the SD of the channel numbers for phase IFB rate parameter, with elements of zero and one for the phase ISB parameter.
GNSS data processing such as for precise positioning is used to precisely determine the elements in x. Denoting the weight matrix of the DD measurements [7] by P, the normal equation of the least-squares method is For simplification, the notation in Equation (3) is used.
If the bias vector y is precisely known, the estimation of x can be realized by following four steps.
Step 1: Derive the solution of x and b with float SD-ambiguities by least-squares method.
After the float SD ambiguities in b are estimated, the SD ambiguities and their VC matrix are transformed into DD ambiguities bDD and the corresponding VC matrix Qbb by differencing.
Step 2: Fix the integer ambiguities. The elements ofb DD are intrinsically integer values but the values calculated are floats. Resolving the float values to integers can improve the accuracy to sub-centimeter level with fewer observations [28]. The ambiguity resolution can be expressed by where function F() maps the ambiguities from float values to integers. This process can be implemented by the LAMBDA method [6,8] which can efficiently mechanize the integer least square (ILS) procedure [29]. This method is to solve the ILS problem described by where b denotes the vector of integer ambiguity candidates. The LAMBDA procedure contains mainly two steps, the reduction step and the search step. The reduction step decorrelates the elements inb and orders the diagonal entries by Z-transformations to shrink the search space. The search step is a searching process finding the optimal ambiguity candidates in a hyper-ellipsoidal space.
Step 3: Validate the integer ambiguities. The obtained ambiguity combinationb is not guaranteed to be the correct integer ambiguity vector and it requires to be validated. The R-ratio test [29,30] can be employed. This test tries to ensure that the best ambiguity combination, which is the optimal solution of Equation (6), is statistically better than the second best one. The ratio value is calculated by whereb is the second best ambiguity vector according to Equation (6). The integer ambiguity vectorb will be accepted if the ratio value is equal to or larger than a threshold, and it will be refused if the ratio value is smaller than the threshold.
Step 4: Derive the fixed baseline solution. After the integer ambiguity vector passes the validation test,b is used to adjust the float solution of other parameters, leading to the corresponding fixed solution. This process can be expressed by wherex denotes the fixed solution of x; Qxx is the VC matrix of the fixed solutionx; Qbx is the VC matrix ofb andx;b refers to the float ambiguity solution;x is the float solution of x. The fixed solutionx can reach sub-centimeter level. If errors in the observation models are removed, the successful ambiguity fixing requires observations of only a few epochs, even a single epoch.
From Equations (5) and (6), the successful ambiguity resolution requires accurate float ambiguity estimates and the corresponding VC matrix. If the bias in y is unknown, the bias cannot be separated by Equation (4) but stays with the ambiguity parameter. As a result, the obtained ambiguity parameters include bias and the ambiguity resolution will fail. When both the bias and the ambiguity are parameterized simultaneously, the bias parameter is correlated with the ambiguity parameter and, thus, it is impossible to separate the bias values and get precise float ambiguities estimates. Mathematically, the normal equation set (Equation (2)) will be rank-deficient and cannot be solved.

Bayes Filtering
In a state-space model, the transition function and the measurement model can be expressed by where y k is the measurement vector at epoch k; f k () is the transition function; h k () is the measurement function; k and e k are the state noise and the measurement noise, respectively. This model indicates a first-order Markov process because the estimated state vector is only related to the states of the previous epoch k − 1, but not to other states before.
Considering the Chapman-Kolmogorov equation [31], the posterior density p x k |y 1:k can be estimated according to the Bayes's theorem by p x k |y 1:k = p y k |x k p x k |y 1:k−1 /p y k |y 1:k−1 .
The expectation of x can, hence, be expressed bŷ Combining Equations (12) and (13), the estimates of x on each epoch can be calculated theoretically. The optimal analytical expression for p x k |y 1:k can be derived for a linear Gauss-Markov model as a Kalman filter but cannot be obtained for most cases. Fortunately, the suboptimal solutions by the Monte Carlo method are usually available.

Importance Sampling
In Monte Carlo methods, the probability density function (PDF) p x k |y 1:k is represented by N where δ() is the Dirac delta function. The posterior density is not precisely known at the beginning of epoch k. Assuming a prior PDF q(x) is known from which the samples can be generated and p(x) is the posterior density to be estimated, after the samples are generated, they can be used to load the information provided by the where w(x) = p(x)/q(x); q(x) is the importance density function.
As the PDFs are represented via the Monte Carlo method and according to the Bayes's theorem, the expectation can be written asx where

Sequential Monte Carlo Algorithm
A sequential importance sampling (SIS) procedure can be implemented to get estimates of x.
is generated with equal weights according to the initial distribution q(x 0 ). Then, the samples are reweighted according to the likelihood of the measurements and the estimatesx are calculated. Afterward, the samples are transited to the next epoch. In practice, the SIS quickly degenerates during the filtering as more and more particles get negligible weights. Fortunately, the degeneracy problem can be solved by resampling which duplicates the samples with large weights and deletes the samples with small weights.
The resampling step is implemented as follows: is assigned equal weights. It is not necessary to resample the samples each epoch, and a condition can be set by comparing the effective number with a threshold. This resampling procedure adequately solves the degeneracy problem in SIS in practice. Considering the resampling step, the SMC procedure can be implemented as Algorithm 1.
Update the weights according to likelihood function p y k |x i k of measurements with where N e f f is the effective number of samples which is calculated by and N th is a threshold which can be set to the value of 2/3N.

Sequential Quasi-Monte Carlo Algorithm
The pseudo random numbers usually used in the Monte Carlo method encounter possible gaps and clusters in the sampling fields. This can be avoided by a QMC method which replaces the random sequences with low-discrepancy sequences such as the Sobol sequence, Halten sequence, and so on [32,33].
The QMC sequence is a deterministic sequence and is uniformly distributed first belonging to . The discrepancy of the sequence is defined by where sup refers to the supremum. A sequence with small discrepancy defined by the above formula is named a low-discrepancy sequence. Koksma-Hlawka theorem indicates that the approximation error of a real function represented by discrete numbers, i.e., the left side of Equation (19), is bounded by the product of two independent factors, the variation of the real function, and the discrepancy of the discrete numbers. Therefore, we have the following inequality: where V( f ) is the variation of f in the sense of Hardy and Krause [34]. A low-discrepancy sequence It is difficult to analyze the accuracy of the approximation by QMC in practice as the points are regular. Therefore, randomized QMC (RQMC) can be used so that every element of the sequence is uniformly distributed over the unit cub but still has a low-discrepancy property [35][36][37]. Figure 1 shows the first 200 samples of the sequences for RQMC sampling, pseudo-random sampling, and the corresponding Gaussian sampling with a Sobol sequence. The SQMC algorithm is presented in Algorithm 2.
Update the weights according to likelihood function p y k |x i k of measurements with w i k = w i k−1 p y k |x i k . Normalize the weights by w i where N e f f is the effective number of samples which is calculated by and N th is a threshold which can be set to the value of 2/3N.
Repeat steps (b) and (c) for the following epochs.

SQMC-Based Algorithm for Phase Bias Estimation
The ratio for integer ambiguity validation in step 3 of Section 2 reflects the closeness of the float ambiguity vector to the integer ambiguity vector and, thus, shows the quality of the ambiguity fixing performance. If the ambiguity is successfully fixed to the integer ambiguities with high probability, the phase bias can be precisely derived. Although the searching and validation step cannot be linearized to satisfy the conditions for using the linear least-squares methods, we can count on the Monte Carlo-based method to develop algorithms for precise phase bias estimation.
The ratio value used in the ambiguity validation reflects the quality of the integer ambiguity fixing performance as used in the ambiguity validation. If b k is the correct ambiguity vector and x k represents the phase bias parameters at epoch k, we can have the assumption that the conditional probability density p(b k |x k ) has the proportional relationship p(b k |x k ) ∝ ratio(x k ), and simply let The PDF p(b k |x k ) is then used as the likelihood function in the Monte Carlo-based estimation method to update the weights. This is expressed as This designed likelihood function works for the estimation of the phase biases which affect the ratio values in ambiguity fixing.
The following procedure is implemented to calculate ratio x i k at epoch k for each element in sample set x i k N i=1 : (a) x i k is used as known bias values to calibrate the measurement model by Equation (4) and solve the equation set to get float SD ambiguities and the corresponding VC matrix; (b) the DD ambiguities and the VC matrix are calculated, and the integer ambiguities are fixed using the LAMBDA method; (c) ratio x i k is calculated using Equation (7). Moreover, the phase bias can be regarded as constant between epochs, and the transition function which transports samples from epoch k − 1 to epoch k is where v is the normal distributed noise with each element v ∼ N(0, σ). The flowchart of the SQMC procedure for phase bias estimation is plotted in Figure 2, and the corresponding algorithm is presented as Algorithm 3.
Moreover, the phase bias can be regarded as constant between epochs, and the transition function which transports samples from epoch − 1 to epoch is where is the normal distributed noise with each element ~(0, ). The flowchart of the SQMC procedure for phase bias estimation is plotted in Figure 2, and the corresponding algorithm is presented as Algorithm 3.
Update the weights according to likelihood function p y k |x i k of measurements with where N e f f is the effective number of samples which is calculated by and N th is a threshold which can be set to the value of 2 3 N. prediction Generate a QMC or RQMC points Repeat steps 2 and 3 for the following epochs.
Algorithm 3 combines the QMC method and the GNSS ambiguity fixing procedures together to estimate the GNSS phase bias. The low-discrepancy sequences of QMC are included for variance reduction. Section 5 shows the applications of the approach with practical GNSS data.

Experiments with Practical GNSS Data
The GLONASS phase IFB estimation of baseline GOP6_GOP7 in networks of international GNSS service (IGS) (ftp://ftp.cddis.eosdis.nasa.gov/pub/gnss) was taken as an example to demonstrate the variance reduction by QMC. The baseline was in Europe with the location in Figure 3, and the two GNSS stations were equipped with LEICA GRX1200+GNSS 9.20 and TRIMBLE NETR9 5.01 receivers, respectively. The measurement data were collected at GPS time (GPST) 9:00-10:00 a.m. on day of year (DOY) 180 of 2018 with an epoch interval of 30 seconds. Six GLONASS satellites were observed during the time span, and the satellite slot numbers are shown at the beginning of the satellite trajectories in Figure 4. The baseline had a post-processed GLONASS phase IFB around −29.5 mm/frequency number (FN) which can be regarded as the true values of both L1 and L2 frequencies.
GNSS stations were equipped with LEICA GRX1200+GNSS 9.20 and TRIMBLE NETR9 5.01 receivers, respectively. The measurement data were collected at GPS time (GPST) 9:00-10:00 a.m. on day of year (DOY) 180 of 2018 with an epoch interval of 30 seconds. Six GLONASS satellites were observed during the time span, and the satellite slot numbers are shown at the beginning of the satellite trajectories in Figure 4. The baseline had a post-processed GLONASS phase IFB around −29.5 mm/frequency number (FN) which can be regarded as the true values of both L1 and L2 frequencies.   Both the code and carrier phase measurements of frequency L1 and L2 were used to form Equation (1) in the experiment. Only one IFB parameter was included because the IFB values for both frequency L1 and L2 were regarded as the same. Algorithm 3 for SQMC-based phase-bias estimation was implemented, and the IFB estimate was derived at each epoch. Furthermore, the IFB estimates were also calculated using the SMC-based approach for comparison. When the bias was estimated many times, the estimates of each time were different because the RQMC sequence was used in the SQMC-based procedure, and the pseudo-random sequence was used in the SMC-based procedure. The standard deviation (SD) of the estimates was calculated to evaluate the performance.
Firstly, the IFB was estimated 1000 times with the SMC-based approach. The sigma of the transition noise in Equation (22) was set to 1 mm/FN, and the sample number was fixed to a value of 100. The 1000 estimates of IFB for the first 60 epochs are drawn in Figure 5 as yellow lines. Afterward, the SQMC-based approach with a Sobol sequence for IFB estimation was implemented. The SQMC strategy also had a sigma of the transition noise as 1 mm/FN and the number of samples as 100. The IFB was also estimated 1000 times, and the results for the first 60 epochs are plotted in Figure 5 as blue lines. The SDs of the 1000 estimates of SMC and SQMC approaches are calculated and presented in Figure 5 as a yellow line and blue line, respectively.
Obviously, both approaches could successfully estimate the IFB values. After convergence, the estimates of the two algorithms were very similar, such as the results after epoch 40th. However, the estimates of the SQMC-based algorithm converged faster than those of the SMC approach, and the corresponding SD was much lower at the beginning. In the worst case of the 1000 estimates, the results converged, i.e., became close to the true value with a difference smaller than 1.5 mm/FN at the seventh epoch for SQMC and at the 40th epoch for SMC. Both the code and carrier phase measurements of frequency L1 and L2 were used to form Equation (1) in the experiment. Only one IFB parameter was included because the IFB values for both frequency L1 and L2 were regarded as the same. Algorithm 3 for SQMC-based phase-bias estimation was implemented, and the IFB estimate was derived at each epoch. Furthermore, the IFB estimates were also calculated using the SMC-based approach for comparison. When the bias was estimated many times, the estimates of each time were different because the RQMC sequence was used in the SQMC-based procedure, and the pseudo-random sequence was used in the SMC-based procedure. The standard deviation (SD) of the estimates was calculated to evaluate the performance.
Firstly, the IFB was estimated 1000 times with the SMC-based approach. The sigma of the transition noise in Equation (22) was set to 1 mm/FN, and the sample number was fixed to a value of 100. The 1000 estimates of IFB for the first 60 epochs are drawn in Figure 5 as yellow lines. Afterward, the SQMC-based approach with a Sobol sequence for IFB estimation was implemented. The SQMC strategy also had a sigma of the transition noise as 1 mm/FN and the number of samples as 100. The IFB was also estimated 1000 times, and the results for the first 60 epochs are plotted in Figure 5 as blue lines. The SDs of the 1000 estimates of SMC and SQMC approaches are calculated and presented in Figure 5 as a yellow line and blue line, respectively. The variance of the estimates for the SQMC-based algorithm with variation in the sample numbers and the transition noise was also analyzed.
Firstly, the phase IFB was estimated 100 times using the SMC-based and SQMC-based algorithms, separately, with the number of particles varied from 30 to 200. The sigma of transition noise was fixed to 1 mm/FN for each estimation. The SDs of the 100 estimates at epochs 1, 2, 5, and 10 are presented in Figure 6, where we can see that the SD decreased for both SMC and SQMC as the number of particles increased. The SD for SQMC had much smaller values compared with SQMC. This indicates that the SQMC-based algorithm could achieve estimates with smaller SD than the SMC-based algorithm using even smaller sample numbers. This is very meaningful in GNSS data processing, because the main time-consuming step is the ambiguity fixing procedure in step 2 in Section 2 for each sample. Fewer samples result in a lighter computation load. Obviously, both approaches could successfully estimate the IFB values. After convergence, the estimates of the two algorithms were very similar, such as the results after epoch 40th. However, the estimates of the SQMC-based algorithm converged faster than those of the SMC approach, and the corresponding SD was much lower at the beginning. In the worst case of the 1000 estimates, the results converged, i.e., became close to the true value with a difference smaller than 1.5 mm/FN at the seventh epoch for SQMC and at the 40th epoch for SMC.
The variance of the estimates for the SQMC-based algorithm with variation in the sample numbers and the transition noise was also analyzed.
Firstly, the phase IFB was estimated 100 times using the SMC-based and SQMC-based algorithms, separately, with the number of particles varied from 30 to 200. The sigma of transition noise was fixed to 1 mm/FN for each estimation. The SDs of the 100 estimates at epochs 1, 2, 5, and 10 are presented in Figure 6, where we can see that the SD decreased for both SMC and SQMC as the number of particles increased. The SD for SQMC had much smaller values compared with SQMC. This indicates that the SQMC-based algorithm could achieve estimates with smaller SD than the SMC-based algorithm using even smaller sample numbers. This is very meaningful in GNSS data processing, because the main time-consuming step is the ambiguity fixing procedure in step 2 in Section 2 for each sample. Then, the effects of the transition noise were evaluated. The phase IFB was estimated another 100 times with the sigma of the transition noise from 10 −6 to 10 −2 m/FN, and the number of the samples was fixed to 100. The SDs of the 100 estimates were calculated at each epoch and the values at epochs 1, 2, 5, and 10 are plotted in Figure 7. Obviously, the SDs of the SQMC-based algorithm were much smaller than those of the SMC-based algorithm at all the four epochs. The SDs at epoch 10 showed a curve near 10 −3 m/FN and were larger than the STDs corresponding to other nearby sigma values. This indicates that the transition noise level in the transition model needs to be set carefully. Too high a transition noise will increase the SDs; however, if the sigma is too small, the samples cannot evolve to the proper field and the prior density cannot be well represented by the samples, also leading to large STD values.  Then, the effects of the transition noise were evaluated. The phase IFB was estimated another 100 times with the sigma of the transition noise from 10 −6 to 10 −2 m/FN, and the number of the samples was fixed to 100. The SDs of the 100 estimates were calculated at each epoch and the values at epochs 1, 2, 5, and 10 are plotted in Figure 7. Obviously, the SDs of the SQMC-based algorithm were much smaller than those of the SMC-based algorithm at all the four epochs. The SDs at epoch 10 showed a curve near 10 −3 m/FN and were larger than the STDs corresponding to other nearby sigma values. This indicates that the transition noise level in the transition model needs to be set carefully. Too high a transition noise will increase the SDs; however, if the sigma is too small, the samples cannot evolve to the proper field and the prior density cannot be well represented by the samples, also leading to large STD values. Then, the effects of the transition noise were evaluated. The phase IFB was estimated another 100 times with the sigma of the transition noise from 10 −6 to 10 −2 m/FN, and the number of the samples was fixed to 100. The SDs of the 100 estimates were calculated at each epoch and the values at epochs 1, 2, 5, and 10 are plotted in Figure 7. Obviously, the SDs of the SQMC-based algorithm were much smaller than those of the SMC-based algorithm at all the four epochs. The SDs at epoch 10 showed a curve near 10 −3 m/FN and were larger than the STDs corresponding to other nearby sigma values. This indicates that the transition noise level in the transition model needs to be set carefully. Too high a transition noise will increase the SDs; however, if the sigma is too small, the samples cannot evolve to the proper field and the prior density cannot be well represented by the samples, also leading to large STD values.

Conclusions
This article presented the problem of solving the mathematical models in GNSS phase bias estimation, which is essential for fast and precise GNSS data processing; it then developed a fast and efficient algorithm by combining the GNSS data processing procedure and the SQMC method.
The proposed SQMC-based GNSS phase bias estimation algorithm introduces the QMC technique to the SMC-based algorithm for variance reduction. The random sequences in SMC approach are replaced by low-discrepancy sequences in the sampling of the steps of initialization and prediction. We performed the experiments of phase IFB estimation for GLONASS data processing with practical data. The results show that the GNSS phase bias estimates have much smaller variance based on the low-discrepancy sequence. The estimation converges faster, and the results are more precise even with a much smaller number of samples. This can largely save the convergence time and the computation load of GNSS PNT services.
In addition, this study introduces the difficulty in GNSS data processing to the mathematic community, and it has the potential to boost new numerical algorithms for GNSS research and applications.