Characterization and Correction of the Geometric Errors in Using Confocal Microscope for Extended Topography Measurement. Part I: Models, Algorithms Development and Validation

This work presents a method for characterizing and correcting the geometric errors of the lateral stage movement of an Imaging Confocal Microscope (CM) in extended topography measurement. For an extended topography measurement, a defined number of 2D images are taken and stitched by correlation methods. Inaccuracies due to linear displacement errors, vertical and horizontal straightness errors, angular errors, and squareness errors, based on the assumption of rigid body kinematics, are described. A mathematical model for the scale calibration of the X- and Y-coordinates is derived from the system kinematics, the axis chain vector of the CM, and the geometric error functions and their approximations by Legendre polynomials. The correction coefficients of the kinematic model are determined from the measured and certified data of a dot grid target standard artefact. To process the measurement data, algorithms for data partitioning, fitting of cylinder centers, and determination of coefficients are developed and validated; these implement methods such as form removal, K-means clustering, and linear and non-linear least squares. Results of the correction coefficients, based on the experimental studies, are presented in Part II. The mean residual is reduced by 29.6% after the correction of the lateral stage errors.


Introduction
The calibration of measuring machines is important both for acceptance testing and for error compensation [1]. Software techniques have been used from the very beginning for the correction of the systematic errors of measurement instruments, as mechanical accuracy is expensive while repeatability costs little [2]. Over the last several decades, measurement accuracy and error compensation have been an area of intensive investigation [3][4][5].
Lateral calibration of the X- and Y-coordinates serves as a calibration of the magnification of the X- and Y-axis scales [6]. Lateral calibration/correction of a confocal microscope (CM) can be classified mainly into two groups: imaging system calibration and machine system calibration [7]. Calibration of the imaging system studies aberrations of the refractive systems of objective lenses [8][9][10], axial distortion by the point spread function or refractive-index mismatches [11,12], etc. Machine system calibration investigates geometric errors generated by the movement of probes [2,13,14]. The first study using kinematic geometric errors for error compensation of coordinate measurement traces back to the work of G. Zhang et al. [15]. Since then, characterization and correction of kinematic geometric errors has mainly focused on coordinate measuring machines (CMM) [16,17], and has seldom been applied to optical measurement instruments.
However, areal measurement by CM is realized by both the imaging system and the machine system. A CM scans in the lateral and vertical directions using different physical mechanisms [18]. To obtain a surface topography, a CM captures a series of two-dimensional images by stepping either the specimen holding base or the objectives along the Z-stage [19]. For a single topography measurement, the 2D images are acquired in the lateral directions either all at once by a CCD array (Imaging CM) or by a raster scan (Laser Scanning CM) [20,21]. For an extended topography measurement, a defined number of 2D images are taken and stitched by correlation methods. Accordingly, calibrating the imaging system only is not sufficient for a CM, as its measurement is realized by both the imaging and machine systems. Lateral calibration, or machine system calibration, is therefore necessary and important. It is worth mentioning that some works have been carried out on the characterization of the metrological characteristics defined in ISO 25178-600 [22], such as the amplification coefficient, linearity deviation, x-y perpendicularity deviation, etc. [23][24][25][26]. These metrological characteristics reflect the whole instrumentation system's characteristics to some extent.
In this work, we present a method to characterize and correct the geometric errors of the lateral stage of the CM, using a standard dot grid target artefact. Extended topography measurements are influenced by kinematic errors of the measurand holding base, which moves horizontally in the X- and Y-directions. This work first introduces, in Section 2, the theory and mathematics of the 21 rigid body geometric errors applied generally in the study of error compensation for CMMs and machine tools. Afterwards, in the same section, we develop our own kinematic model based on the theory of kinematic and geometric errors for the lateral stage movement of the CM. Section 3 introduces the methodology of the study. Mathematical models and algorithms for measurement data processing are presented in Section 4. Section 5 validates our developed models and algorithms using synthetic data. Section 6 draws the conclusions. Experimental studies of the calibration and the uncertainty evaluation are presented in Part II, published as another paper in the same journal.

Mathematical Model for the X- and Y-Scale Calibration
For each of the three axes there exist three translational deviations, i.e., the linearity deviation, the straightness deviation and the orthogonality deviation. The straightness deviation is defined as the displacement orthogonal to the axis of motion, with two straightness deviation functions, one for each of the orthogonal directions. The rotational movements of stages are roll, yaw and pitch, with roll denoting a screw-like movement, and pitch and yaw denoting angular deviation functions that describe movements within a plane [1,15,27]. Some parameters of mechanical systems, such as bearing spacing and guideway geometric errors, determine the axis motion errors of machine axes [28]. T. O. Ekinci et al. categorized machine errors into three levels, i.e., geometric errors, joint kinematic errors, and volumetric errors [29]. That work clarifies the terminologies and highlights the usefulness of a machine error modeling approach.
For the measurement of a specimen using the 3D Imaging CM, the workpiece is placed on the X- and Y-stage above the supporting base, which is usually made of granite. As shown in Figure 1a, to focus the laser beam onto the measurand surface, two movements are carried out, i.e., movement along the X- and Y-axis, controlled by the lateral stage, and movement along the Z-axis, controlled by the vertical stage. For example, to measure the point P on the measurand surface, its position can be denoted as →P(Px, Py, Pz) with respect to the coordinate origin →O(0, 0, 0). To bring the probe focus to the point of the workpiece, the first movement includes the translation of the X carriage followed by the translation of the Y carriage; the second movement is the translation of the Z carriage. Let the vectors X, Y and Z represent the translations of the X, Y and Z carriages, let the rotation matrices R(X), R(Y) and R(Z) denote the angular motion errors caused by the translations of the X, Y and Z carriages, and let the vector T represent the X, Y and Z Abbe offsets of the probe with respect to the carriage to which the probe is attached. For our Imaging CM, the probe is attached to the Z-stage. Therefore, the translation of the X and Y carriages, with an offset of the position →P, stops at the same point reached by the translation of the Z carriage with a probe offset T, which can be understood as the focus of the probe on the measurand. According to this conclusion, a diagram of the axis chain vector of the 3D imaging microscope is plotted and shown in Figure 1b.
In each kinematical path, the actual movement of an axis is affected by the rotational errors of its predecessors. The movement of T is affected by the rotational error of the movement of Z; Y is affected by the rotational error of X; →P is affected by the rotational errors of X and Y. Therefore, the chain vector shown in Figure 1b for the two kinematical paths, from the same origin reaching the same laser beam focus, can be expressed by Equation (1):

X + R(X)·Y + R(X)·R(Y)·→P = Z + R(Z)·T. (1)
After rearrangement of the above equation, vector → P can be expressed by all the other vectors and matrices, as the same result presented by G. Zhang et al. [1].
The rotation can be expressed by the infinitesimal rotation matrix. The vectors X, Y, Z and T can be substituted by their position matrix. This has been widely accepted and implemented by many former investigators [1,15,27,30].
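The effect of an infinitesimal (first-order) rotation matrix in the kinematic chain can be illustrated with a short sketch. This is a Python/numpy illustration with hypothetical angular error values, not the instrument's actual errors:

```python
import numpy as np

def infinitesimal_rotation(eps_x, eps_y, eps_z):
    """First-order (small-angle) rotation matrix, I + skew(eps)."""
    return np.array([
        [1.0,    -eps_z,  eps_y],
        [eps_z,   1.0,   -eps_x],
        [-eps_y,  eps_x,  1.0],
    ])

# Hypothetical angular errors (rad) caused by the X-carriage translation
R_X = infinitesimal_rotation(1e-5, -2e-5, 5e-6)

# A nominal Y-carriage translation, perturbed by the rotational error
# of its predecessor X, as in the kinematic chain of Equation (1)
Y = np.array([0.0, 10.0, 0.0])   # mm
print(R_X @ Y - Y)               # lateral deviation introduced by R(X)
```

At first order the matrix is orthogonal up to O(ε²), which is why the infinitesimal form is widely used in rigid-body error models.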
In the 3D Imaging CM, there are no real X-and Y-carriages, the workpiece is carried by a lateral stage mounted on a fixed base [13,27], the Z-stage is mounted on another fixed base. In addition to that, the beam focus is considered to be the probe. As the focus is always on the workpiece surface and the information is acquired exactly at that point, there is no Abbe offset on x, y and z, nor is there an angular term. This means T = 0 and R(Z) = 0.
In this work, only geometric errors along the X- and Y-directions are studied. After dropping the Z-stage variables, the geometric error functions for the X- and Y-stages are given in Equations (2) and (3). These errors are functions, which can be approximated by Legendre polynomials. The implementation of Legendre polynomials is computationally simpler and yields an error reduction around 2% higher than that of Chebyshev polynomials [31]. After substituting the error functions in Equations (2) and (3) with their Legendre polynomial approximations, the final mathematical models for the beam focus point →P(Px, Py, 0) with error corrections, Equations (4) and (5), are acquired.
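The Legendre approximation of an error function can be illustrated as follows. This is a Python sketch on a synthetic error function over a normalized travel; the coefficient values are illustrative, not calibration results:

```python
import numpy as np
from numpy.polynomial import legendre as L

# Hypothetical straightness-error samples along a 1 mm travel,
# mapped to the Legendre domain [-1, 1]
x = np.linspace(0.0, 1.0, 50)            # stage position (mm)
u = 2.0 * x - 1.0                        # normalized coordinate
err = 0.05 * u + 0.02 * (3*u**2 - 1)/2   # synthetic error: P1 and P2 terms (um)

coef = L.legfit(u, err, deg=2)           # Legendre coefficients, cf. Eqs. (4)-(5)
approx = L.legval(u, coef)
print(np.max(np.abs(err - approx)))      # fit residual
```

Because the synthetic error is an exact combination of the Legendre polynomials P1 and P2, the fitted coefficients recover the generating values to numerical precision.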


Methodology and a Brief Introduction of Experimental Design
To calibrate the kinematic geometric errors of the lateral stage of a dual core 3D CM, a standard artefact with a dot grid target on glass is measured, and a series of algorithms is developed for processing the raw measurement data, including the separation of cylinders and flat, the fitting of cylinder centers, the determination of coefficients, etc. Figure 2 illustrates the standard, which was provided by Max Levy Autograph, Inc. The dots are deposited on the substrate, which is glass with a thickness of 1.5 mm. Table 1 gives an indication of the size, the dot spacing, the diameter, the accuracy, etc., of the dot pattern. This work aims at characterizing and correcting the kinematic geometric errors of the movement of the lateral stage by calculating and applying the coefficients of Equations (4) and (5). To calculate the coefficients, the introduced dot grid target standard is measured; the measurement data are processed to obtain the coordinate values (xi, yi) of the dots' centers; the coordinate values (xi, yi) and their corresponding certified values are put into Equations (4) and (5); and the coefficients are obtained by solving the equations using the non-linear least squares method. The obtained coefficients and Equations (4) and (5) are then used for correcting new measurement data: a different area of the introduced dot grid target standard is measured, and the measurement data are corrected. By comparing the residuals before and after correction, the significance and practicality of our model for kinematic geometric error correction can be observed.
The algorithms used for measurement data processing and for calculating the coefficients are introduced in Section 4. Validations of those algorithms are introduced in Section 5. The details of the experimental procedures, parameters, and analysis are presented in Part II, which is published in another paper in the same journal.

Algorithm and Procedure for the Separation of Flats and Cylinders
The raw measurement data usually contain defects, outliers, unmeasured points, etc. [32]. The measured surface is often inclined, as it is impossible to position the measurand surface perfectly perpendicular to the Z-axis. Figure 3 gives an example of the measurement of the dot grid target, using an objective with a magnification of 50× and a numerical aperture of 0.90. The acquisition of the measurement area was defined as a topography stitching measurement, with 8 × 8 extended topographies covering an area of 1.78 × 1.33 mm². The overlapping area is 25%, and the correlation uses the XYZ option. The level of resolution is 2, and the measured extended topographies contain 2673 × 2003 pixels.
This measurement has many unmeasured points in the top-right corner due to the Z-range parameter setting. The first step is to choose an area of this raw surface with an appropriate size and location, aiming to minimize the influence of defects and to remove outliers [32] by selecting valid grid positions, at the cost of the uniformity of the grid. Figure 4 shows a surface reconstruction obtained by trimming 10% of the edges of the raw data along the X- and Y-directions. As there are too many measurement defects where y > 800 µm, this part is also discarded.

After obtaining an appropriate area for analysis, it is necessary to eliminate its form. The principal axes of the distribution of data points of the measured point cloud are determined by solving the eigenvalue problem. The data are rotated by a principal axis transformation to align them with the directions of the eigenvectors. Figure 5 gives an example of the rotated surface, obtained from the surface shown in Figure 4. Figure 5a was plotted with the Matlab® function 'plot3', while Figure 5b was plotted with the function 'surf'. After rotation, the range of the Z-axis is much smaller, and many details of the surface can be observed. The surface is constructed from point clouds, which form a flat plane and many cylinders perpendicular to this flat. The rotated surface is now ready for the separation of flats and cylinders.

Figure 6 gives an indication of the distribution of the heights of the surface. Figures 7 and 8 give examples of the separations of the rotated surface shown in Figure 5. After separating the cylinders from the flat, the next step is to obtain the three coordinates of each cylinder point cloud. Figure 9 gives an example of the separation of the cylinders into individual clusters, with the cylinders marked in different colors. Here the squared Euclidean distance metric is applied for the distance calculation, as shown in Equation (6). Moreover, each centroid of a point cloud is the mean of the coordinate values of its points.

d(x, c) = ‖x − c‖² (6)
where x is an initial observation and c is the centroid.
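The form-removal step described in this section, i.e., solving the eigenvalue problem of the point cloud's covariance and rotating the data onto its principal axes, can be sketched as follows. This is a minimal numpy illustration on a synthetic tilted plane, not the measured artefact data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic tilted flat: z = 0.05x + 0.02y plus measurement noise
xy = rng.uniform(0.0, 100.0, size=(2000, 2))
z = 0.05 * xy[:, 0] + 0.02 * xy[:, 1] + rng.normal(0.0, 1e-3, 2000)
pts = np.column_stack([xy, z])

# Principal axes from the eigenvalue problem of the covariance matrix
centered = pts - pts.mean(axis=0)
eigval, eigvec = np.linalg.eigh(np.cov(centered.T))
order = np.argsort(eigval)[::-1]        # largest variance first
rotated = centered @ eigvec[:, order]   # smallest-variance axis becomes Z

print(np.ptp(rotated[:, 2]))            # residual Z-range after levelling
```

After the rotation, the Z-range collapses from the tilt of the plane down to the noise level, which is the effect visible in Figure 5.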
According to previous investigations, there are several efficient algorithms for data partitioning, such as the K-means clustering algorithm [33,34] and Lloyd's algorithm [35]. These are iterative data-partitioning algorithms, which assign the initial observations to k clusters according to the squared Euclidean distance metric and the centroids. The number of clusters k is defined before the data partition. In our work, Lloyd's algorithm is used for finding the centroids of the k clusters, while the K-means clustering algorithm is used for partitioning the initial observations into k clusters according to the squared Euclidean distance metric with respect to the centroid of each cluster. The K-means clustering algorithm requires as input a matrix of M points in N dimensions as well as a matrix of K initial cluster centers in N dimensions.
Denoting the number of points in cluster L as NC(L), and the Euclidean distance between point I and cluster L as D(I, L), the general procedure of the K-means clustering algorithm is to find a K-partition of clusters with a locally optimal within-cluster sum of squares by moving the initial observations from one cluster to another [33], as indicated by Equation (7).
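The clustering iteration described above, assigning points by squared Euclidean distance and then recomputing each centroid as its cluster mean, can be sketched as a plain Lloyd-style loop. This is a minimal numpy illustration on two synthetic, well-separated dot clusters with hypothetical positions:

```python
import numpy as np

def kmeans_lloyd(X, centers, n_iter=50):
    """Plain Lloyd iteration: assign each point to the nearest centroid
    by squared Euclidean distance, then move each centroid to the mean
    of its assigned points."""
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        centers = np.array([X[labels == k].mean(axis=0)
                            for k in range(len(centers))])
    return labels, centers

# Two well-separated synthetic dot clusters in the x-y plane
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([0.0, 0.0], 0.1, (50, 2)),
               rng.normal([5.0, 5.0], 0.1, (50, 2))])
labels, centers = kmeans_lloyd(X, X[[0, -1]].copy())
print(centers)
```

With the dot grid artefact, the separation is equally benign because the dot spacing is much larger than the dot diameter, so the partition converges in very few iterations.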
The procedures of the algorithm for partitioning the filtered cylinders into individuals are shown in Figure 10, which consists of 6 steps. It is worth mentioning that the data are subjected to dimensionality reduction before data partitioning. These cylinders are a point cloud of three dimensions. The aim of dimensionality reduction is to produce a compact low-dimensional encoding of a given high-dimensional data set [36]. The dimensionality can be reduced in two ways [37]. The first is by only keeping the most relevant variables from the original dataset, which is called feature selection. The second way is to exploit the redundancy of the input data and to find a smaller set of new variables, each being a combination of the input variables containing basically the same information as the input variables. In this work, the dimensionality of the data is reduced to 2D from 3D by projection along Z-axis.
Since the equations characterizing the lateral linearity deviations, Equations (4) and (5), are independent of the vertical axis z, the dimension of the 3D point cloud is reduced to the two dimensions of the x-y plane.


Algorithm for Determination of Coefficients
This section aims at the determination of the coefficients a1, a2, a3, b1, ..., f3 defined in Equations (4) and (5). The objective is to fit the 18 coefficients so as to obtain the least squared residuals between the corrected X- and Y-coordinates and the nominal positions. That is, the residual for the i-th dot is P_i^nom(x, y) − P_i^corr(x, y), where P_i^nom(x, y) represents the i-th nominal position defined by the X- and Y-coordinates, and P_i^corr(x, y) denotes the i-th corrected position obtained from the measured data and Equations (4) and (5). Therefore, the objective function is

min F(x, y) over a1, a2, ..., f3, with F(x, y) = Σ_i ‖P_i^nom(x, y) − P_i^corr(x, y)‖².

The minimum value of F is reached when the gradient is zero [38,39], i.e., ∂F/∂β_i = 0, where F is the objective function and β_i represents the coefficient parameters to be determined. There are 18 coefficients in this part of the work, and hence there are 18 partial derivatives. These equations can first be solved for initial solutions. The initial solutions might be far from the correct solutions, as there are many more constraints than unknown coefficients. The initial solutions can then be used as the starting point for iteratively finding the non-linear least squares solutions. Extensive work has been done on non-linear least squares algorithms [40,41]. Let the model for data fitting be [41]

E(y) = f(x_1, x_2, ..., x_m; b_1, b_2, ..., b_k),

where x_1, x_2, ..., x_m are independent variables, b_1, b_2, ..., b_k are the k coefficient parameters to be determined, and E(y) is the expected value of the dependent variable y. Denote the data points as (Y_i, X_1i, X_2i, ..., X_mi), where i = 1, 2, ..., n. The objective is to calculate the k coefficient parameters which minimize the sum of the squared residuals Σ_i (Y_i − Ŷ_i)². This problem can be written as an objective function which optimizes the coefficients of each function Y_i − Ŷ_i. Denoting the vector of the parameters to be optimized as t, where t = [a1, a2, ..., f3] in this work, the objective function is

F(t) = Σ_i f_i(t)², (16)

where f_i(t) are functions of the to-be-optimized vector t.
When the f_i(t) are non-linear functions with respect to t, this is a non-linear optimization problem. The solution is to use a Taylor expansion to convert the f_i(t) into linear functions. As only the first order of the Taylor series is linear, we approximate by the first order:

f_i(t) ≈ f_i(t^(k)) + ∇f_i(t^(k))^T (t − t^(k)), (17)

where ∇f_i(t^(k)) is the value of the first derivative of f_i(t) with respect to t evaluated at the point t^(k). Substituting f_i(t) in Equation (16) with its Taylor expansion approximation indicated by Equation (17), the approximation of F(t) is:

F(t) ≈ Σ_i [f_i(t^(k)) + ∇f_i(t^(k))^T (t − t^(k))]². (18)

To simplify the above equation, use A_k to represent the matrix whose rows are the ∇f_i(t^(k))^T, and b to represent the vector of the f_i(t^(k)); Equation (18) is simplified into

F(t) ≈ ‖A_k (t − t^(k)) + b‖².

Setting the gradient with respect to t to zero, it can be obtained that

A_k^T A_k (t − t^(k)) = −A_k^T b.

The solution of this equation is

t^(k+1) = t^(k) − (A_k^T A_k)^(−1) A_k^T b.

This solution can be simplified as

t^(k+1) = t^(k) − H_k^(−1) A_k^T b, (25)

where H_k is the Hessian matrix, H_k = A_k^T A_k. Equation (25) can be solved iteratively by starting with an initial solution.
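The iterative update of Equation (25) is the classical Gauss-Newton scheme. It can be sketched as follows; this is a minimal Python illustration on a toy exponential model rather than the 18-coefficient model of Equations (4) and (5):

```python
import numpy as np

def gauss_newton(residual, jacobian, t0, n_iter=30):
    """Iterate t_{k+1} = t_k - (A_k^T A_k)^{-1} A_k^T b, where A_k is the
    Jacobian of the residuals and b the residual vector at t_k."""
    t = np.asarray(t0, dtype=float)
    for _ in range(n_iter):
        A = jacobian(t)
        b = residual(t)
        t = t - np.linalg.solve(A.T @ A, A.T @ b)
    return t

# Toy non-linear model y = c0 * exp(c1 * x) with exact synthetic data
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(1.5 * x)
residual = lambda t: t[0] * np.exp(t[1] * x) - y
jacobian = lambda t: np.column_stack([np.exp(t[1] * x),
                                      t[0] * x * np.exp(t[1] * x)])
t_hat = gauss_newton(residual, jacobian, [1.5, 1.2])
print(t_hat)
```

Starting from a reasonable initial solution, the iteration converges to the generating parameters; in practice a damped variant (e.g., Levenberg-Marquardt) is often preferred when the start point is poor.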

Validation of the Algorithm for Determination of Dots' Centers and Distance
To validate the algorithm for determination of dots' centers, synthetic data with known cylinder centers and point cloud distributions is generated. This generated data contains 35 cylinders of point cloud, distributed as 5 × 7 along the X-axis and Y-axis, as shown in Figure 11.
To validate the algorithm for determination of dots' centers, synthetic data with known cylinder centers and point cloud distributions is generated. This generated data contains 35 cylinders of point cloud, distributed as 5 × 7 along the X-axis and Y-axis, as shown in Figure 11. The generated data are processed using our developed algorithm for the determination of dots' centers. The first step is to separate the data into individual point clouds. The results of the point cloud separation are shown in Figure 12, with each individual being indicated by numbers and different colors. In this step, the algorithm not only separates the point cloud into individuals, but also gives an indication of the initial centroids of each cloud. The generated data are processed using our developed algorithm for the determination of dots' centers. The first step is to separate the data into individual point clouds. The results of the point cloud separation are shown in Figure 12, with each individual being indicated by numbers and different colors. In this step, the algorithm not only separates the point cloud into individuals, but also gives an indication of the initial centroids of each cloud. After the separation, each individual point cloud is processed by our developed algorithm in order to fit the circle centers. The calculated centers are exactly the same with the generated ones. Figure 13 makes a plot of the calculated centers of the synthetic data generated for algorithm validation. The results indicate that the algorithm developed in this work for the determination of the centers of cylinder point cloud is valid.
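A minimal sketch of this separate-then-fit pipeline, assuming plain NumPy, a hand-rolled K-means loop, and an algebraic (Kasa) least-squares circle fit; the paper's own implementation details (form removal, exact initialization) are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic point clouds on unit circles around a small grid of known centers.
true_centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0], [5.0, 5.0]])
theta = rng.uniform(0, 2 * np.pi, size=(4, 200))
pts = np.concatenate([
    c + np.column_stack([np.cos(t), np.sin(t)])
    for c, t in zip(true_centers, theta)
])

# Step 1: K-means clustering to separate the cloud into individuals,
# starting from rough initial centroids.
centroids = true_centers + rng.normal(scale=0.5, size=true_centers.shape)
for _ in range(20):
    labels = np.argmin(np.linalg.norm(pts[:, None] - centroids, axis=2), axis=1)
    centroids = np.array([pts[labels == k].mean(axis=0) for k in range(4)])

# Step 2: algebraic (Kasa) least-squares circle fit per cluster:
# ||p - c||^2 = r^2  is linear in (cx, cy, r^2 - |c|^2).
def fit_circle(p):
    A = np.column_stack([2 * p, np.ones(len(p))])
    b = (p ** 2).sum(axis=1)
    (cx, cy, _), *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.array([cx, cy])

fitted = np.array([fit_circle(pts[labels == k]) for k in range(4)])
print(np.allclose(fitted, true_centers, atol=1e-6))
```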

Validation of the Algorithm for Determination of Coefficients
Here, we use two methods for the validation of the algorithm for the determination of the coefficients.
The first method creates a dataset containing points at intervals of 125 µm as certified positions, and then applies a small displacement to each point. The correction coefficients are calculated and the displaced points are corrected using the algorithm defined in Section 4.2. The displaced points represent the points measured in the experiments. Therefore, two residuals can be obtained: the residual between the certified points and the displaced points, and the residual between the certified points and the corrected points. Comparing these two residuals shows whether the correction is meaningful.
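The first validation method can be sketched as follows. The correction step here is a deliberately crude stand-in (it simply halves the displacement) so that the two residuals can be compared; the real correction applies the fitted coefficients of Section 4.2.

```python
import numpy as np

rng = np.random.default_rng(2)
pitch = 125.0                                    # certified interval, in um

# Certified grid positions (5 x 7 dots, as in the synthetic data).
gx, gy = np.meshgrid(np.arange(7) * pitch, np.arange(5) * pitch)
certified = np.column_stack([gx.ravel(), gy.ravel()])

# "Measured" points: certified positions plus a small random displacement.
measured = certified + rng.normal(scale=5.0, size=certified.shape)

# Stand-in correction: shrink the displacement by half to mimic a partially
# effective correction (hypothetical; not the paper's coefficient model).
corrected = certified + 0.5 * (measured - certified)

res_meas = np.linalg.norm(certified - measured, axis=1)
res_corr = np.linalg.norm(certified - corrected, axis=1)
print(res_corr.mean() < res_meas.mean())         # correction is meaningful
```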
The algorithm first creates the two datasets described above. It then rotates the displaced points onto the certified points, aiming at making the axes parallel. This arises from the methodology of this work, which requires the standard artefact to be located parallel to the axes. As it is impossible to realize this manually, the coordinate axes are adjusted mathematically. After that, the error coefficients are calculated by the algorithm. With those coefficients, the corrected X- and Y-coordinates of the points can be obtained. Figure 14 compares the positions of the certified, measured, and corrected points. The contour of the errors between the certified and measured points is displayed in Figure 15, and the contour of the errors between the certified and corrected points in Figure 16. Almost all of the errors are smaller after correction.
The contour of the errors between the certified and corrected points is shown in Figure 16. It is obvious that almost all of the errors are smaller after correction.    The error is calculated by the Euclidean distance between the certified point and its corresponding point with displacement. The same goes with the error between the certified point and the corrected point. Table 2 makes a comparison of the error before correction and after correction. The difference is calculated as: Di f f (i) = Err c (i) − Err m (i) (28) where, Di f f (i) is the difference between the error after correction and before correction. When Di f f (i) is negative, it means the error is smaller after correction. Err c (i) is the Euclidean distance between the certified point and corrected point. Err m (i) is the Euclidean distance between the certified point and the measured point. The mean residual after correction is 5.60 µm, while the mean error between the measured and the certified point is 9.65 µm. The square residual of all the 35 corrected points is 1587.99 µm 2 , while it was 3684.85 µm 2 before correction. It can be found that some points are more distorted after correction. This is because the fitting method used is the nonlinear least squares, the objective function of which aims at finding the sum of least squares of the residuals of all points, but does not ensure that every residual is smaller after fitting. The second method creates another 10 data by varying the displacement from the certified positions based on this data. In these 10 simulations, the certified positions are all the same with those certified positions introduced above. The measured points' positions are assigned with displacements, the values of which are shown in Table 3. The 10 data are processed by this algorithm. The results of each simulation are compared and analyzed. Figure 17 indicates the positions of the certified, measured, and corrected points of those 10 simulations. 
It is obvious that almost all of the corrected points are closer than the measured points to the certified points. Table 4 lists two mean residuals and two squared residuals. One mean residual arises from the displacements of the measured points from the certified points, while the other arises from the displacements of the corrected points from the certified points. The mean residuals are calculated by Equations (29) and (30):

Res_M^{meas} = \frac{1}{n} \sum_{i=1}^{n} \| P_{cert}(i) - P_{meas}(i) \|   (29)

Res_M^{corr} = \frac{1}{n} \sum_{i=1}^{n} \| P_{cert}(i) - P_{corr}(i) \|   (30)

where Res_M^{meas} and Res_M^{corr} represent the mean residual of the measured points and the mean residual of the corrected points, respectively, P_{cert}(i) represents the position of the i-th certified point, P_{meas}(i) the i-th measured point, and P_{corr}(i) the i-th corrected point. The two squared residuals are the squared residual of the measured points to the certified points and the squared residual of the corrected points to the certified points, calculated by:

Res_S^{meas} = \sum_{i=1}^{n} \| P_{cert}(i) - P_{meas}(i) \|^2   (31)

Res_S^{corr} = \sum_{i=1}^{n} \| P_{cert}(i) - P_{corr}(i) \|^2   (32)
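Equations (29)-(32) reduce to the mean and the sum of squares of the per-point Euclidean distances; a minimal sketch with illustrative values (not the paper's simulation data):

```python
import numpy as np

def residuals(p_cert, p_other):
    """Mean residual (Eqs. 29-30) and squared residual (Eqs. 31-32):
    the mean and the sum of squares of the per-point Euclidean distances."""
    d = np.linalg.norm(p_cert - p_other, axis=1)
    return d.mean(), (d ** 2).sum()

# Tiny illustrative data: three points, each displaced by distance 5.
p_cert = np.array([[0.0, 0.0], [125.0, 0.0], [0.0, 125.0]])
p_meas = p_cert + np.array([[3.0, 4.0], [0.0, 5.0], [5.0, 0.0]])

res_m, res_s = residuals(p_cert, p_meas)
print(res_m, res_s)   # 5.0 75.0
```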
Figure 18 and Figure 19 plot the mean residuals and squared residuals of the 10 simulations. It can be observed that both the mean residual curve and the squared residual curve of the measured points are concave in the middle and convex at the two sides. The two residuals of the corrected points are much smoother. The larger the original residuals, the more the correction can be realized.
The results of Simulation No. 10 are shown below. Figure 20 compares the positions of the certified, measured, and corrected points of Simulation No. 10. Figures 21 and 22 present contours of the mean residual and the squared residual between the certified and measured points of Simulation No. 10. Figure 23 shows the error vectors between the measured and corrected points of Simulation No. 10.

Figure 18. Comparison of the mean residuals before correction and after correction.
Figure 19. Comparison of the squared residuals before correction and after correction.
As the results above show, all the measured values are improved after the geometric error correction. It can be concluded that our developed algorithm for the determination of the correction coefficients is valid.


Conclusions
The work presented herein is Part I of a two-part work, describing our methods for the characterization and correction of lateral stage geometric errors. By implementing the 21 parametric errors, based on the assumption of rigid-body kinematics, for calculating the volumetric errors of a three-axis machine, a mathematical model for correcting the lateral stage deviations of the CM was developed.
According to the measurement principle of the CM, a diagram of the axis chain vector was drawn. Equation (1) was developed according to the kinematic path of the chain vector. After rearrangement of Equation (1) and its simplification by dropping the Z- and T-terms, the equation was further simplified according to the study specifications and the characteristics of the CM. The error functions were approximated by Legendre polynomials. Finally, Equations (4) and (5) for the correction of the kinematic geometric errors were obtained. The methodology for calculating and applying the correction coefficients defined in Equations (4) and (5) is introduced afterwards. Measurements of a dot grid target standard artefact will be presented and studied in Part II, published in the same journal.
The algorithms that will be implemented for the measurement data processing are presented and validated. These algorithms employ several methods, including form removal, K-means clustering, and linear and nonlinear least squares. Their mathematical models were also introduced. The results of the validation based on synthetic data imply that our mathematical models and algorithms are reliable.

Conflicts of Interest:
The authors declare no conflict of interest.