Data Fusion Based on Subspace Decomposition for Distributed State Estimation in Multi-Hop Networks

This paper deals with the problem of distributedly estimating the state of a plant using a set of interconnected agents. Each agent must monitor the plant state in real time, relying on measurements of local plant outputs and on the exchange of information with the rest of the network. These inter-agent communications take place over a multi-hop network, so the transmitted information is subject to a delay that depends on the positions of the sender and the receiver in the communication graph. The transmission rate and the plant sampling rate are assumed to be identical. The paper presents a novel data-fusion-based observer structure built on subspace decomposition, and addresses two main subproblems: the design of the observer to stabilize the estimation error, and an optimal observer design to minimize the estimation uncertainty when plant disturbances and measurement noises come into play. The performance of the proposed design is tested in simulation.


Introduction
In the last decade, the development of intelligent sensors and actuators, endowed with communication and computational capabilities, has fostered their integration into a variety of large-scale plants, especially in cyber-physical systems and cyber-physical systems of systems [1]. These intelligent devices are entangled with physical systems and interact with them in real time, receiving measurements, performing computations, and making decisions in a continuous manner. To perform these tasks, the sensors need to exchange measurements of the system outputs, which are very often geographically distributed and constitute only a small subset or linear combination of the whole system state. This exchange of information makes it possible to reconstruct the complete state distributedly using so-called distributed estimation algorithms, which have recently attracted much attention from the control community.
Although distributed estimation architectures offer several advantages, such as scalability, flexibility, fault tolerance, and robustness [2], the complexity associated with these problems also makes them harder to solve. This is due to the fact that none of the nodes manages all the state measurements of the plant, so the nodes require an exchange of state information, sometimes experiencing delays and communication dropouts that complicate maintaining the observer's stability. Consequently, a remarkable research effort has been made in this field, and important contributions can be found in the literature adapted to different considerations, for example the system modeling or the communication setup. Regarding the former, several approaches have been published for the case of noiseless/unperturbed models, where typically the main objective is the asymptotic stabilization of the estimation errors. For instance, in [3,4], reduced-order observers are designed based on coordinate transformations. However, although

• Unlike conventional data fusion approaches [17,18], the information is not required to spread through the network in a single sample time, thereby relaxing the requirements on the network.
• The observer design reduces the exchange of information with respect to other data fusion algorithms, since the agents are not required to collect the information of every agent to reconstruct the whole state.
• In case of duplicated information, the proposed subspace decomposition allows the observer to be selective when deciding which agents will act as the source of the required data.
The paper is organized as follows. In Section 2, the problem is formally stated, together with some definitions and assumptions. Section 3 presents the observer structure. In Section 4, some properties useful for the subsequent developments are introduced. Sections 5 and 6 present methods to design the observer so as to guarantee stability of the estimates and to minimize the expected estimation error when the system and the agents' measurements are affected by noise, respectively. In Section 7, simulation examples are presented. Finally, conclusions are drawn in Section 8.

Notation 1.
A graph is a pair G = (V, E) comprising a set V = {1, 2, . . . , p} of vertices or agents, and a set E ⊂ V × V of edges or links. A directed graph is a graph in which edges have directions, so that if (j, i) ∈ E, then agent i obtains information from agent j. A directed path from node i_1 to node i_k is a sequence of edges (i_1, i_2), (i_2, i_3), . . . , (i_{k−1}, i_k) in a directed graph. The neighborhood of i, N_i := {j : (j, i) ∈ E}, is defined as the set of nodes with edges incoming to node i. Given ρ ∈ Z_+, the ρ-hop reachable set of i, N_{i,ρ}, is defined as the set of nodes with a directed path to i involving ρ edges. Loosely speaking, the ρ-hop reachable set of agent i comprises those agents whose transmitted information takes ρ hops, or sampling instants, to reach agent i. Note that the 1-hop reachable set of i corresponds to the neighborhood of i, and the 0-hop reachable set of i is {i} itself. The operator col(·, ·) stacks subsequent matrices into a column, e.g., for A ∈ R^{m_1×n} and B ∈ R^{m_2×n}, col(A, B) = [Aᵀ Bᵀ]ᵀ ∈ R^{(m_1+m_2)×n}. Let I_n be the identity matrix of dimension n, and let E{·} denote the expected value.
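For illustration, the ρ-hop reachable set can be computed by walking the reversed edges ρ times. The following Python sketch (the function name and data layout are our own choices, not part of the paper) follows the definition above, with edges stored as (sender, receiver) pairs.

```python
from collections import defaultdict

def rho_hop_reachable(edges, i, rho):
    """rho-hop reachable set of node i: the nodes whose information takes
    exactly rho hops to reach i, with edges given as (sender, receiver)."""
    if rho == 0:
        return {i}
    predecessors = defaultdict(set)   # predecessors[v] = neighborhood N_v
    for j, v in edges:
        predecessors[v].add(j)
    frontier = {i}
    for _ in range(rho):
        # move one hop back toward the senders
        frontier = set().union(*(predecessors[v] for v in frontier))
    return frontier
```

For the chain 3 → 2 → 1, agent 1 has N_{1,1} = {2} and N_{1,2} = {3}, matching the "one hop per sampling instant" interpretation.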

Problem Formulation
Consider a set of agents V = {1, 2, . . . , p} connected according to a given graph G = (V, E), intended to distributedly estimate the state of the following discrete-time linear time-invariant system:

x(k + 1) = A x(k) + w(k),        (1)
y_i(k) = C_i x(k) + n_i(k),   i ∈ V,        (2)

where x ∈ R^n is the state vector, A ∈ R^{n×n} is the system matrix, y_i ∈ R^{m_i} is the output locally measured by each agent i, C_i ∈ R^{m_i×n} is the corresponding output matrix, and w ∈ R^n and n_i ∈ R^{m_i} are mutually independent Gaussian state and measurement noises, respectively, with covariance matrices M ∈ R^{n×n} and R_i ∈ R^{m_i×m_i}. We assume that information flows slowly through the network, consuming one sample time to travel from any agent to its neighbors. This is the case, for instance, of networks based on Zigbee [22].
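A minimal Python sketch of this plant model may help fix ideas; all numerical values below (a two-state plant observed by two agents) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative instance of the model
#   x(k+1) = A x(k) + w(k),   y_i(k) = C_i x(k) + n_i(k)
A = np.array([[0.9, 0.1],
              [0.0, 1.0]])
C = {1: np.array([[1.0, 0.0]]),          # agent 1 measures the first state
     2: np.array([[0.0, 1.0]])}          # agent 2 measures the second state
M = 0.01 * np.eye(2)                     # state-noise covariance
R = {i: 0.01 * np.eye(1) for i in C}     # measurement-noise covariances

def plant_step(x):
    """One sampling period: propagate the state and produce local outputs."""
    w = rng.multivariate_normal(np.zeros(2), M)
    y = {i: Ci @ x + rng.multivariate_normal(np.zeros(1), R[i])
         for i, Ci in C.items()}
    return A @ x + w, y
```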
The observation structure proposed in the next section relies on system transformations to the observability staircase form (see for instance Theorem 16.2 in [23]). Prior to introducing this structure, the following definitions are needed.
Definition 1. The ρ-hop output matrix of agent i, C_{i,ρ}, is the matrix that stacks the (ρ − 1)-hop output matrix of agent i and the (ρ − 1)-hop output matrices of its neighborhood N_i, that is, C_{i,ρ} := col(C_{i,ρ−1}, C_{j_1,ρ−1}, . . . , C_{j_q,ρ−1}), where {j_1, . . . , j_q} = N_i and C_{i,0} := C_i. Intuitively speaking, the ρ-hop output matrix of agent i is composed of its own output matrix C_i and the output matrices of all the agents j with a directed path to i involving ρ or fewer edges.
Definition 2. System (1) and (2) is collectively detectable if every agent is able to detect the whole system with the information provided by the network; that is, for each agent i ∈ V there exists a finite number of hops ℓ_i ∈ Z_+ such that the pair (C_{i,ℓ_i}, A) is detectable.
There always exists a coordinate transformation matrix [V_{i,ρ} V̄_{i,ρ}] ∈ R^{n×n} associated with the pair (C_{i,ρ}, A) such that the change of variable ξ_{i,ρ} := [V_{i,ρ} V̄_{i,ρ}]ᵀ x ∈ R^n transforms the original state-space representation into the observability staircase form. Note that V̄_{i,ρ} ∈ R^{n×n̄_{o,i,ρ}} is composed of n̄_{o,i,ρ} column vectors in R^n that form an orthogonal basis of the unobservable subspace of the pair (C_{i,ρ}, A), while V_{i,ρ} is an orthogonal basis of its orthogonal complement.
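The bases V_{i,ρ} and V̄_{i,ρ} can be computed numerically from the observability matrix of the pair (C_{i,ρ}, A). The following sketch (helper names are our own) obtains V̄ as a basis of the null space of the observability matrix and V as a basis of its orthogonal complement.

```python
import numpy as np

def null_basis(M, tol=1e-10):
    """Orthonormal basis of the null space of M, via the SVD."""
    _, s, vh = np.linalg.svd(M)
    rank = int((s > tol).sum())
    return vh[rank:].T

def staircase_bases(C, A):
    """Orthonormal bases (V, Vbar) for the pair (C, A): Vbar spans the
    unobservable subspace (null space of the observability matrix),
    V spans its orthogonal complement."""
    n = A.shape[0]
    # Observability matrix [C; CA; ...; CA^(n-1)]
    O = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])
    Vbar = null_basis(O)
    V = null_basis(Vbar.T) if Vbar.size else np.eye(n)
    return V, Vbar
```

For a diagonal A with C measuring only the first mode, V̄ recovers the span of the unmeasured mode, and A maps Im(V̄) into itself, as the staircase form requires.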

Definition 3.
The ρ-hop unobservable subspace from agent i, denoted Ō_{i,ρ}, is composed of all the system modes that cannot be observed from the output locally measured by agent i nor from those measured by the agents belonging to the s-hop reachable set of i, for all s ∈ {0, . . . , ρ}. Equivalently, the ρ-hop unobservable subspace from agent i is the unobservable subspace associated with the pair (C_{i,ρ}, A) under the above coordinate transformation: Ō_{i,ρ} := Im(V̄_{i,ρ}). The orthogonal complement of Ō_{i,ρ}, with some abuse of notation, is denoted the ρ-hop observable subspace from agent i, O_{i,ρ} := Im(V_{i,ρ}).

According to Definition 3, it is clear that O_{i,ρ−1} ⊆ O_{i,ρ}, where we consider O_{i,−1} = ∅. Then, the vectors of the "innovation" basis that generates O_{i,ρ} ∩ (O_{i,ρ−1})^⊥ can be stacked into a matrix W_{i,ρ} ∈ R^{n×n_{i,ρ}}, where n_{i,ρ} = n_{o,i,ρ} − n_{o,i,ρ−1}, in such a way that Im(W_{i,ρ}) = O_{i,ρ} ∩ (O_{i,ρ−1})^⊥. Let us define ℓ_i ∈ Z_{>0}, to be selected later, as an arbitrary number of hops. From these definitions it is clear that, for all ρ ∈ {0, . . . , ℓ_i} and all i ∈ V, it holds that O_{i,ρ} = Im(W_{i,0}) ⊕ Im(W_{i,1}) ⊕ · · · ⊕ Im(W_{i,ρ}). It is worth pointing out that Im(W_{i,ρ}) corresponds to the innovation introduced by the ρ-hop reachable set N_{i,ρ} of agent i, that is, the modes that are observable for agent i at hop ρ but not at hop ρ − 1. Accordingly, the transformation matrix T_i, defined as T_i = [V_{i,ℓ_i} V̄_{i,ℓ_i}], can be divided using the innovations at each hop:

T_i = [W_{i,0} W_{i,1} · · · W_{i,ℓ_i} V̄_{i,ℓ_i}],        (7)

with V_{i,ρ} = [W_{i,0} · · · W_{i,ρ}] for all ρ ∈ {0, . . . , ℓ_i}, where it is easy to identify the observable and unobservable subspaces of the system for agent i at hop ρ.
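The innovation basis W_{i,ρ} can be obtained numerically by removing from the hop-ρ observable basis the directions already spanned at hop ρ − 1. A sketch under these assumptions (the function name is our own):

```python
import numpy as np

def innovation_basis(V_prev, V_curr, tol=1e-10):
    """Orthonormal basis W of span(V_curr) ∩ span(V_prev)^⊥: the directions
    that become observable at this hop but were not at the previous one.
    Both inputs have orthonormal columns, with span(V_prev) ⊆ span(V_curr)."""
    n = V_curr.shape[0]
    # Projector onto the orthogonal complement of span(V_prev)
    P = np.eye(n) - V_prev @ V_prev.T if V_prev.size else np.eye(n)
    residual = P @ V_curr                 # strip the already-observable part
    u, s, _ = np.linalg.svd(residual, full_matrices=False)
    return u[:, s > tol]                  # orthonormal basis of the residual
```

With V_{i,0} spanning the first coordinate and V_{i,1} spanning the first two, the innovation at hop 1 is the second coordinate direction only.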
The following lemma, previously presented in [21], introduces some important properties that are central for the subsequent derivations.
Lemma 1. For any agent i ∈ V, the following properties hold for all ρ, ρ′ ∈ {1, . . . , ℓ_i} with ρ ≠ ρ′:

Remark 1. The subspace decomposition can be deliberately altered in case one agent does not want to consider some measurements (see for instance the example in Figure 1).

Figure 1.
Assume that agent 1 does not want to consider y_3. Then, it can design W_{1,1} to include only the basis vector associated with y_2, and W_{1,2} to include the corresponding basis of y_3.
Once the above lemma has been introduced, the following proposition can be established. This proposition is key for the observer to be presented in the next section.
Proposition 1. For each agent i, the orthogonal similarity transformation given by T_i in (7) transforms the system matrix A into a block upper-triangular matrix [21].

The following lemma introduces an additional property to be used later.

Lemma 2. For every agent i ∈ V, it holds that C_j W_{i,r} = 0 for all j ∈ N_{i,ρ} and all r > ρ.

Proof. From Definition 1 it is easy to see that Im(C_jᵀ) ⊆ Im(C_{i,ρ}ᵀ) for all j ∈ N_{i,ρ}. By Definition 3, the pair (C_{i,ρ}, A) generates the subspace O_{i,ρ}, which directly implies that Im(C_{i,ρ}ᵀ) ⊆ Im(V_{i,ρ}) and, consequently, Im(C_jᵀ) ⊆ Im(V_{i,ρ}). Finally, the orthogonality between V_{i,ρ} and W_{i,r} for every r > ρ proves the lemma.

Observer Structure and Design Goal
This section presents the observer structure, Equation (9), which makes use of the notions previously introduced. In (9), ρ is the distance from agent i to the agent that constitutes the source of information, and N_{i,ρ}(k) ∈ R^{n_{i,ρ}×n_{i,ρ}} are gain matrices to be designed. Recall that, since we assume slow communication networks in which information consumes one sample time to travel from one agent to its neighbors, this information reaches the destination agent with a fixed delay of ρ sample times. (The fixed delay for the communication from agent j to agent i depends on their relative position in the graph; we could therefore have written ρ_{i,j} instead of ρ, but we keep the simpler ρ for notational convenience.) In other words, there is a match between the total delay a packet suffers between sender and receiver and the hop at which the information enters the observer. (It is straightforward to modify the communication delay to values higher than one, but this would burden the notation.) The observer structure proposed in (9) decomposes the observer dynamics into two terms. The first is the open-loop prediction based on the system model. The second is a correction term: the agents belonging to the ρ-hop reachable set of i, N_{i,ρ}, communicate their measurements to agent i. Since transmitted measurements flow through the graph at a rate of one hop per sampling time, agent i receives the measurements from its ρ-hop reachable set with a constant delay of ρ sampling times. The correction made with these measurements is projected onto Im(W_{i,ρ}) and multiplied by the gain matrix N_{i,ρ}(k); the result is used as weights to perform linear combinations of the columns of W_{i,ρ}. Thus, these corrections only affect the observable subspace generated by the innovations of the neighborhood of agent i at hop ρ.

Remark 2.
It is worth pointing out that if ℓ_i = 0, the pair (C_i, A) is observable and consequently W_{i,0} is a full-rank matrix. In this case, Equation (9) reduces to the well-known Luenberger observer structure.
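A quick numerical sanity check of this special case, on an assumed unstable scalar plant with a hand-picked stabilizing gain (both are illustrative choices, not from the paper):

```python
import numpy as np

# When ell_i = 0 the structure reduces to the classical Luenberger observer
#   x_hat(k+1) = A x_hat(k) + L (y(k) - C x_hat(k)),
# whose error obeys e(k+1) = (A - L C) e(k).
A = np.array([[1.1]])
C = np.array([[1.0]])
L = np.array([[0.6]])                 # chosen so that A - L C = 0.5 is Schur

x, x_hat = np.array([1.0]), np.array([0.0])
errors = []
for _ in range(20):
    y = C @ x
    x_hat = A @ x_hat + L @ (y - C @ x_hat)   # predict + correct
    x = A @ x                                 # plant propagates
    errors.append(float(abs(x - x_hat)[0]))
```

Even though the plant is unstable, the estimation error halves at every step and converges to zero.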
For each agent i ∈ V, let us define the estimation error as e_i := x − x̂_i and, similarly, the transformed estimation error as ε_i := ξ_i − ξ̂_i = T_iᵀ e_i, which can be decomposed into the transformed estimation error of agent i at each hop ρ:

ε_{i,ρ} := W_{i,ρ}ᵀ e_i,   ρ ∈ {0, . . . , ℓ_i},   ε_{i,ℓ_i+1} := V̄_{i,ℓ_i}ᵀ e_i,        (10)

and thus, due to the fact that T_i is an orthogonal matrix (and therefore T_i T_iᵀ = I_n), the expression of the estimation error in ε_{i,ρ} coordinates yields:

e_i = Σ_{ρ=0}^{ℓ_i} W_{i,ρ} ε_{i,ρ} + V̄_{i,ℓ_i} ε_{i,ℓ_i+1}.        (11)

The goal of this paper is to design the gain matrices N_{i,ρ}(k) in structure (9), for every agent i and every hop ρ ∈ {0, . . . , ℓ_i}, to solve the following problems:

Problem 1. (Distributed data fusion) In the absence of plant and measurement noises, i.e., w(k) = 0 and n_i(k) = 0, given plant (1) and (2) and the interconnection graph G = (V, E), design gains N_{i,ρ}(k) in (9) such that all the estimates x̂_i asymptotically converge to the actual plant state x.

Problem 2. (Distributed optimal filtering) Given plant (1) and (2) and the interconnection graph G = (V, E), design gains N_{i,ρ}(k) in (9) in order to minimize E{‖ε_{i,ρ}(k + 1)‖²}.
The solutions for Problems 1 and 2 are introduced in Sections 5 and 6, respectively. However, prior to that, it is necessary to introduce some properties on which subsequent developments are supported. Additionally, the setup procedure of the distributed observer is presented next.

Distributed Observer Setup
It is worth mentioning that observer structure (9) needs some neighboring information before starting the estimation phase: the construction of matrices W_{i,ρ} for every ρ ∈ {0, . . . , ℓ_i} and the value of ℓ_i for every agent i ∈ V are needed before starting the estimation procedure.
This section presents an algorithm to design the aforementioned matrices and parameters. Note that although the gain matrices N_{i,ρ}(k) are also required before the estimation phase, their design is tackled in the following sections. The pseudocode of the setup algorithm, executed by each agent i, is given in Algorithm 1.

Algorithm 1: Distributed observer setup (agent i)
1. Set ρ = 0 and C_{i,0} = C_i.
2. If the pair (C_{i,0}, A) is detectable, stop and fix ℓ_i = 0.
3. Exchange matrices C_{i,ρ} with the neighborhood N_i.
4. Compute C_{i,ρ+1} and matrix W_{i,ρ}; include the path in the routing tables of the interconnection nodes.
5. If the pair (C_{i,ρ+1}, A) is detectable, stop and fix ℓ_i = ρ + 1; otherwise, increment ρ and go to step 3.

The routing tables of the interconnection nodes include the information that the agents must know in order to route data from a source agent to the destination agent along a directed path.
After this initialization phase, which finishes in a finite number of steps, the agents are ready to execute their estimation of the state in a distributed way.

Remark 3.
By letting this algorithm be executed repeatedly, and not just at the initialization phase, the agents can detect changes in the topology and then redesign their observer gains. Hence, the proposed observer can be made resilient to time-varying topologies.

Remark 4.
Consider a situation in which the destination agent i is selective about which agent will act as the source of information to reconstruct some part of the state, as introduced in Remark 1. In this situation, agent i must take this fact into consideration when computing C_{i,ρ+1} in the setup algorithm, modifying matrices C_{i,ρ} and, consequently, matrices W_{i,ρ} (which denote the innovation bases of the observable subspace of agent i at hop ρ).

Estimation Error Dynamics
This section presents the evolution of the transformed estimation error ε_{i,ρ}(k). After introducing these dynamics, it will become clear that delayed versions of this error need to be defined, together with their associated dynamics.

Proposition 2.
Consider the network of agents described by the graph G = (V, E), where every agent i implements the observer structure (9) to estimate the state of system (1). Then, the transformed estimation error dynamics at every hop ρ is given by Equation (12).

Proof. Let us first write the evolution of the estimation error for system (1) under the observation structure (9). Using (10), we can write the transformed estimation error dynamics for agent i at hop ρ, where W_{i,ρ}ᵀ W_{i,ρ} = I_{n_{i,ρ}} and Lemma 1 (i) have been used. Next, relying on Equation (11), we know the expression of e_i in ε_{i,ρ} coordinates, so the previous equation can be modified accordingly. From Lemma 1 (iv) it holds that W_{i,ρ}ᵀ A (V̄_{i,ℓ_i} ε_{i,ℓ_i+1} + Σ_{r=ρ+1}^{ℓ_i} W_{i,r} ε_{i,r}) = 0, so the equation can be rewritten once more. Finally, from Lemma 2 we know that C_j W_{i,ρ′} = 0 for all j ∈ N_{i,ρ} with ρ′ > ρ, which yields the desired expression (12).

Now, let us define the delayed transformed estimation error of agent i at hop ρ, ε̄_{i,ρ}, as the vector that stacks the transformed estimation error of agent i at hop ρ from time k to k − ℓ_i, that is,

ε̄_{i,ρ}(k) := col(ε_{i,ρ}(k), ε_{i,ρ}(k − 1), . . . , ε_{i,ρ}(k − ℓ_i)),        (13)

and let us define ε̄_i as the vector that stacks the delayed transformed estimation error of agent i at every hop,

ε̄_i(k) := col(ε̄_{i,0}(k), ε̄_{i,1}(k), . . . , ε̄_{i,ℓ_i}(k)).        (14)

Observe that ε_{i,ρ}(k) = S_{i,ρ} ε̄_{i,ρ}(k), with S_{i,ρ} = [I_{n_{i,ρ}} 0 · · · 0] ∈ R^{n_{i,ρ}×(ℓ_i+1)n_{i,ρ}}; analogous selection matrices relate ε̄_{i,ρ}(k) and ε̄_i(k). Based on these error vectors, we can obtain an expression for the dynamics of the delayed transformed estimation error ε̄_{i,ρ} that will be useful later.

Proposition 3.
Consider the network of agents described by the graph G = (V, E), where every agent i implements the observer structure (9) to estimate the state of system (1). Then, the delayed transformed estimation error dynamics at hop ρ is given by Equation (15), for r ≠ ρ and r, ρ ∈ {0, . . . , ℓ_i}, where the terms of the form −N_{i,ρ}(k) W_{i,ρ}ᵀ Σ(·) are placed in the block component (1, ρ + 1) of the matrices Λ_{i,(ρ,ρ)} and Λ_{i,(ρ,r)}, respectively.
The proof is immediate by substituting Equation (12) into Equation (13).

Distributed Data Fusion
This section tackles Problem 1. In particular, it presents necessary and sufficient stability conditions for the observer structure (9) in the absence of perturbations. Moreover, a linear matrix inequality (LMI) method (see [24] for details) for the design of gain matrices N_{i,ρ}(k) that guarantees stability is introduced. This design is independent of time, so for simplicity of notation we write N_{i,ρ} = N_{i,ρ}(k).

Theorem 1. Consider the network of agents described by the graph G = (V, E), where every agent i implements the observer structure (9) to estimate the state of system (1). Under the unperturbed scenario, the estimates of all the agents tend asymptotically to the actual plant state if and only if the matrices Λ_{i,(ρ,ρ)} in (16) are Schur for every ρ ∈ {0, . . . , ℓ_i}.
Proof. The dynamics of the delayed transformed estimation error of agent i at hop ρ are given in Proposition 3. By setting w(k) = 0 and n_i(k) = 0, ∀i ∈ V, in (15), expression (18) can be obtained.
Please note that (18) reveals a cascade structure in which the delayed transformed estimation error at each hop ρ depends on the errors at the previous hops. Thus, the eigenvalues of the block upper-triangular matrix in (18) are given by the eigenvalues of the matrices placed on its diagonal, which are the matrices Λ_{i,(ρ,ρ)} defined in (16), establishing the proof.

It is worth pointing out that the stability of the distributed observer can thus be seen as the stability of several constant-time-delay discrete-time systems. Next, a design method for gain matrices N_{i,ρ} that meets the conditions of Theorem 1 is presented.
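The cascade argument can be checked numerically: assemble a block upper-triangular matrix from assumed diagonal blocks and verify that its spectral radius is determined by the blocks alone (all values below are illustrative).

```python
import numpy as np

def spectral_radius(M):
    return max(abs(np.linalg.eigvals(M)))

def cascade_is_stable(diagonal_blocks):
    """A block upper-triangular matrix is Schur iff every diagonal block is,
    since its spectrum is the union of the blocks' spectra."""
    return all(spectral_radius(B) < 1 for B in diagonal_blocks)

# Illustrative diagonal blocks and an arbitrary (even large) coupling term
B0 = np.array([[0.5]])
B1 = np.array([[0.9, 1.0],
               [0.0, 0.8]])
coupling = np.array([[2.0, 3.0]])
cascade = np.block([[B0, coupling],
                    [np.zeros((2, 1)), B1]])
```

The off-diagonal coupling, no matter how large, does not affect the eigenvalues, mirroring the hop-by-hop cascade of (18).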

Theorem 2.
(Design for stability) If there exist matrices R_{i,ρ} ∈ R^{n_{i,ρ}×n_{i,ρ}} and X_{i,ρ} ∈ R^{n_{i,ρ}×n_{i,ρ}} such that LMI (19) is satisfied, then the gain N_{i,ρ} = R_{i,ρ} X_{i,ρ}^{−1} stabilizes the matrices in (16), and therefore the estimates of every agent tend asymptotically to the actual plant state.
Proof. From Lyapunov theory [24], Theorem 1 is fulfilled if and only if there exists a positive definite matrix P_{i,ρ} ∈ R^{(ℓ_i+1)n_{i,ρ}×(ℓ_i+1)n_{i,ρ}} such that

Λ_{i,(ρ,ρ)}ᵀ P_{i,ρ} Λ_{i,(ρ,ρ)} − P_{i,ρ} < 0.

With some mathematical manipulations, and by adding the first two matrices and applying the Schur complement [24], the previous condition can be rewritten as an equivalent matrix inequality. Note that this matrix inequality is not an LMI, because it contains terms in both X_{i,ρ} and X_{i,ρ}^{−1}. If the matrix inequality is pre-multiplied and post-multiplied by the symmetric, non-singular matrix diag(I_{(ℓ_i+1)n_{i,ρ}}, X_{i,ρ}), and the change of variables R_{i,ρ} = N_{i,ρ} X_{i,ρ} is used, LMI (19) is obtained, establishing the proof.
Recall that LMI (19) can be solved locally by every agent, requiring just the information available after the setup described in Section 3. The choice of a block-diagonal Lyapunov matrix introduces conservatism, but it is required to transform the matrix inequality into an LMI. Otherwise, the previous condition would be necessary and sufficient for stabilization.

Distributed Optimal Filtering
This section deals with Problem 2; that is, the gains N_{i,ρ}(k) must be designed in such a way that the expected value of the squared norm of the transformed estimation error of agent i at hop ρ, E{‖ε_{i,ρ}(k + 1)‖²}, is minimized. Note that E{‖ε_{i,ρ}(k + 1)‖²} = E{ε_{i,ρ}(k + 1)ᵀ ε_{i,ρ}(k + 1)} = tr(E{ε_{i,ρ}(k + 1) ε_{i,ρ}(k + 1)ᵀ}) = tr(E{Q_{i,ρ}(k + 1)}), where Q_{i,ρ} is the covariance matrix of the transformed estimation error of agent i at hop ρ. The dynamics of this covariance matrix will be studied and, then, a method will be presented to find the gains N_{i,ρ}(k) that minimize tr(E{Q_{i,ρ}(k + 1)}).
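The trace identity above is easy to verify by Monte Carlo for an assumed error covariance (the numbers below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Check of E{||e||^2} = tr(E{e e^T}) for a zero-mean Gaussian error
# with an assumed covariance Sigma (so both sides should equal tr(Sigma)).
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
samples = rng.multivariate_normal(np.zeros(2), Sigma, size=200_000)
mean_sq_norm = (samples ** 2).sum(axis=1).mean()   # estimates E{||e||^2}
trace_cov = np.trace(np.cov(samples.T))            # estimates tr(E{e e^T})
```

Both quantities approach tr(Sigma) = 3, which is why minimizing the expected squared error norm is equivalent to minimizing the trace of the error covariance.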
Let Q̄_{i,(ρ,r)}(k) = E{ε̄_{i,ρ}(k) ε̄_{i,r}(k)ᵀ} be the cross-covariance matrix of the delayed transformed estimation error of agent i between hops ρ and r. Thus, it is possible to assemble the covariance matrix Q̄_i(k) of the delayed transformed estimation error of agent i for every hop ρ. It is worth pointing out that the diagonal terms of the covariance matrix Q̄_i(k) can be related through the selection matrices defined in Section 4. Note also that the matrices Λ_{i,(ρ,r)} and Λ_{i,(ρ,ρ)} in (16) and (17), for r, ρ ∈ {0, . . . , ℓ_i} with r ≠ ρ, can be decomposed conveniently. Using this decomposition, the next proposition studies the dynamics of the complete covariance matrix Q̄_i. Its proof requires just some simple mathematical manipulations and is therefore omitted.
Proposition 4. The evolution of the covariance matrix Q̄_i is given by Equation (20). Based on this proposition, we present a design method for the gains N_{i,ρ}(k) that solves Problem 2.

Theorem 3. (Optimal design) The gains N_{i,ρ}(k) that minimize tr(Q_{i,ρ}(k + 1)) are given by the solution of the system of linear equations (21).
Proof. The proof is divided into several steps. First, the dynamics of the matrix Q_{i,ρ} are derived from the results in Proposition 4. Then, its trace is differentiated with respect to N_{i,ρ}(k) in order to find the value of the gain that minimizes the trace of Q_{i,ρ}(k + 1).
Observe that Q_{i,ρ}(k) = S̄_{i,ρ} Q̄_i(k) S̄_{i,ρ}ᵀ, from which the dynamics of Q_{i,ρ}(k + 1) follow. Now, with the purpose of minimizing the trace of the covariance matrix, tr(Q_{i,ρ}(k + 1)) is partially differentiated with respect to the gain matrix N_{i,ρ}(k). Setting the resulting expression equal to zero, we find the condition that the gain N_{i,ρ}(k) minimizing the trace of Q_{i,ρ}(k + 1) must satisfy and, after some manipulations, expression (21) is found.
The preceding theorem proposes a design method for the gain matrices N_{i,ρ}(k) that minimizes the expected value of the squared norm of the estimation error. Additionally, the design only requires solving a system of linear equations that depends solely on local information of the agent, allowing its resolution in a distributed way.
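As a point of comparison, the centralized analogue of this hop-wise design is the familiar gain that minimizes the trace of a one-step covariance recursion; the sketch below (all numerical values are assumptions) iterates that recursion to a steady state and checks the first-order optimality of the gain. It is not the paper's Equation (21), only the single-agent, zero-delay counterpart.

```python
import numpy as np

# For the covariance recursion
#   P(k+1) = (A - K C) P(k) (A - K C)^T + K R K^T + M,
# the gain minimizing tr(P(k+1)) follows from d tr(P(k+1))/dK = 0:
#   K = A P C^T (C P C^T + R)^{-1}.
A = np.array([[1.05, 0.1],
              [0.0,  0.9]])
C = np.array([[1.0, 0.0]])
M = 0.01 * np.eye(2)            # process-noise covariance (assumed)
R = np.array([[0.1]])           # measurement-noise covariance (assumed)

def one_step(P, K):
    return (A - K @ C) @ P @ (A - K @ C).T + K @ R @ K.T + M

P = np.eye(2)
for _ in range(200):            # iterate to (near) steady state
    K = A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + R)
    P = one_step(P, K)
```

At the fixed point, perturbing K in any direction can only increase tr(P(k+1)), which is the same first-order argument used in the proof above.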
The algorithm that every agent i ∈ V must follow to initialize the distributed estimation procedure and run the estimation phase is summarized in Algorithm 2.

Simulations Results
In order to show the effectiveness and optimality of the distributed observer, some simulations are presented in this section. Consider a system with one stable state, a pair of conjugate imaginary poles, and one unstable state, observed by a set of three agents with the network topology depicted in Figure 2.

Example 1. In this example, the performance of the observer is tested under the perturbed scenario in which the system model and the agents' measurements are affected by Gaussian noises with given covariance matrices. Figure 3b depicts the evolution of the system state and the estimates of agent 1 (in dashed lines). The estimate of the stable pole that is locally measured by agent 1 is the first to converge to the real value. Next, the estimates of states 2 and 3, measured by agent 2, reach the actual values. Finally, the estimator converges to the value of the unstable pole, measured by agent 3 (which belongs to N_{1,2}). This simple example makes it clear that the decay rate of the estimation error decreases with ρ. Figure 3a shows the evolution of tr(Q_{i,ρ}) for every i ∈ V.

To conclude this section, it is shown that the gain matrices obtained in this example in steady state (through the use of Theorem 3) fulfill the LMI stated in Theorem 2 and, consequently, meet the stability conditions required in Theorem 1. Considering for example agent 1 with the gain matrices N_{1,ρ} obtained through Theorem 3, it is possible to find a solution of LMI (19) in the variables X_{1,ρ} and Y_{1,ρ}, and therefore to guarantee the stability of the estimation errors. The values obtained for N_{1,ρ}, X_{1,ρ}, and Y_{1,ρ} are given in Table 1.

Conclusions
In this paper, we present a data-fusion-based algorithm to solve the problem of distributedly estimating the state of a plant through an interconnected network of agents. Based on a coordinate transformation matrix, it is possible to transform the original state-space representation of every agent's estimate into the observability staircase form, in which the modes observable by each agent at every hop ρ are distinguished. Relying on this transformation, a novel observer structure is presented whose gains can be designed distributedly, guaranteeing stability and minimizing the expected value of the estimation error.
Furthermore, the agents can conveniently modify the construction of the transformation matrix that allows them to perform the subspace decomposition. Thus, they can be selective about which agents act as the source of information for a specific portion of the state.
As future work, it will be interesting to explore the performance of the distributed observer when dealing with random network events such as packet losses or time-varying delays.

Conflicts of Interest:
The authors declare no conflict of interest.