A Self-Organized Reciprocal Decision Approach for Sensing Coverage with Multi-UAV Swarms

This paper tackles the problem of sensing coverage for multiple Unmanned Aerial Vehicles (UAVs) with an approach that takes into account the reciprocity between neighboring UAVs to reduce the oscillation of their trajectories. The proposed reciprocal decision approach, which is performed in three steps, is self-organized, distributed and autonomous. First, in contrast to traditional methods, which model and optimize in configuration space, the sensing coverage problem is formulated directly in velocity space as an optimal reciprocal coverage velocity (ORCV), which is concise and effective. Second, the ORCV is determined by adjusting the action velocity out of the weak coverage velocity relative to neighboring UAVs, and it is demonstrated that the ORCV supports collision-avoiding assembly. Third, a corresponding random probability method is proposed for determining the optimal velocity in the ORCV. The simulation results indicate that the proposed method has a high coverage rate, a rapid convergence rate and low deadweight loss. In addition, for swarms of up to $10^3$ UAVs, the proposed method has excellent scalability and collision-avoiding ability.


Introduction
Sensing coverage with a UAV swarm concerns how to cover an accessible region of interest (ROI) with multiple UAVs carrying specified sensors in an optimal manner, i.e., achieving optimal performance in terms of low coverage time, high coverage rate and so on. It has many applications, for instance, mapping, search and rescue, forest fire monitoring and fighting, and flood and earthquake response. Although the ROI may vary in shape and size and may be cluttered with obstacles, sensing coverage mainly comprises the following technical processes after the surrounding information has been obtained. First, the area is divided using an area decomposition method, after which the UAV makes an action decision [1,2]. Next, the UAV conducts task planning and path planning [3,4]. Last, the plan is executed by the UAV's controller and actuator [5]. Among these technical processes, area decomposition and action decision are the most fundamental and vital; both are, in nature, coverage decision problems.
Earlier works on the coverage decision problem focused on the methods by which a single UAV covers the ROI, such as sweep manner [6,7], area decomposition [8,9] and process occasion [10]. Subsequently, researchers turned to multi-UAV cooperative coverage because of its better coverage performance compared with the single-UAV mode. Two approaches to multi-UAV cooperative coverage are used: centralized decision and distributed decision. The former can achieve optimal deployment and action of the UAVs based on global information; however, its scalability is limited by its exponentially increasing computation [11,12]. The latter offers more flexibility and scalability and suits various situations, although optimal coverage may be difficult to achieve [13,14,15]. For the

Basic Idea
Swarm coverage is an essential technology that aims to cover a selected region with a fleet of UAVs. It needs a decision process that is self-organized, decentralized and autonomous. Self-organization means that the total mission is assigned to the team with no need to decompose it into a series of subtasks and assign each UAV a specific task. Decentralization signifies the absence of a leader, with each UAV able to join or quit the team without any influence on the completion of tasks. Autonomy indicates that global behavior emerges naturally, though it is unknown to individual UAVs.
In general, the problem of swarm coverage is simplified to the design of a distributed algorithm by which an individual UAV cooperates with other UAVs. The general procedure of each UAV can be divided into three parts (perception, decision and action), as shown in Figure 1. The first part is acquiring information, such as position and velocity, via sensors or communication; this includes the UAV's own information and information on its surroundings, such as static/dynamic obstacles and neighboring UAVs. The second part is the decision, which can be divided into two subparts: modeling and optimizing. Finally, the optimal decision is executed by the actuator. The process is continuously repeated. The decision part is the core of the aforementioned technique, and the present work focuses on this aspect. The reciprocal coverage method is designed to model the swarm coverage problem as a distributed optimization in velocity space by constructing the optimal region that is coverage-beneficial and collision-free. Moreover, an optimization technique is proposed to select the optimal velocity in the ORCV. The processes of modeling and optimizing are presented in detail in Sections 3 and 4, respectively.

Reciprocal Decision Approach
In this section, the reciprocal decision (RD) approach for sensing coverage, which is coverage-beneficial and collision-free, is described in detail. Section 3.1 describes the coordination between two UAVs; the coordination is extended to swarm cooperation in Section 3.2. Collision-free constraints and the relevant proof are shown in Section 3.3.

Two-UAV Cooperative Coverage
The region of interest (ROI) to be covered is denoted by $\Omega$, which is a convex compact set in $\mathbb{R}^2$. A set of $n$ UAVs share the environment, each having its own shape and limited coverage. Without loss of generality, the paper assumes for simplicity that the UAVs moving in the plane $\mathbb{R}^2$ are disc-shaped with radius $r$; the ranges of coverage and communication are discs with radius $R$ and $CR$, respectively. Moreover, each UAV $A$ has a maximum speed $v_A^{\max}$, a maximum calculation disc-range radius $R_A^{\max}$, a maximum number of neighboring UAVs considered in the calculation $n_A^{\max}$, and a predicted time interval $\tau$. The position of a UAV is $m$, and its velocity is $v$. The UAVs are randomly distributed in the rectangular region $\Omega_e$, whose sides both have length $l_e$, and all of them are initially static.
Consider two UAVs $A$ and $B$ with limited coverage ability that are initially close to each other, with initial velocities $v_A^{\mathrm{cur}} = v_B^{\mathrm{cur}} = 0$. They should move by selecting new velocities $v_A^{\mathrm{new}}$ and $v_B^{\mathrm{new}}$ that maximize their own coverage, which also contributes to the total area coverage. If UAV $A$ adopts a velocity relative to $B$ during the time interval $\tau$ that works against the increase of its own coverage and of the total coverage, this velocity is called a "weak coverage velocity". The set $\mathrm{WCV}^{\tau}_{A|B}$ contains all weak coverage velocities for UAV $A$ relative to UAV $B$ and is formally defined in the following. Let $C(m, R)$ be an open disc of radius $R$ centered at $m$: $$C(m, R) = \{\, q \mid \| q - m \| < R \,\}$$ Thus, the weak coverage velocity set $\mathrm{WCV}^{\tau}_{A|B}$ is formally defined as follows.
The corresponding optimal coverage velocity set $\mathrm{OCV}^{\tau}_{A|B}$ for UAV $A$ relative to $B$, which benefits the increase of UAV $A$'s coverage and the total coverage in the time interval $\tau$, is defined as follows:
The geometric interpretation of the weak coverage velocity set $\mathrm{WCV}^{\tau}_{A|B}$ for UAV $A$ is exhibited in Figure 2; $\mathrm{WCV}^{\tau}_{A|B}$ and $\mathrm{WCV}^{\tau}_{B|A}$ are symmetric about the origin. Figure 2a shows the two UAVs $A$ and $B$ in configuration space, centered at $m_A$ and $m_B$, with different radii of shape ($r_A$ and $r_B$) and communication ($R_A$ and $R_B$). In Figure 2b, the weak coverage velocity set $\mathrm{WCV}^{\tau}_{A|B}$ (gray) for UAV $A$ is presented in velocity space as a disc of radius $R_\tau = \| m_B - m_A \| / \tau$ centered at $(m_B - m_A)/\tau$, where $\tau$ is the predicted time interval; here, $\tau = 1$ and $\tau = 2$. $\mathrm{WCVL}^{\tau}_{A|B}$ is the line that separates the weak coverage velocity set $\mathrm{WCV}^{\tau}_{A|B}$ from the optimal coverage velocity set $\mathrm{OCV}^{\tau}_{A|B}$.
It is difficult for a UAV to keep moving at a fixed velocity $v_B$ during the time interval $\tau$. Thus, if $v_B \in V_B$ (where $V_B$ is the set of all possible velocities $v_B$ of UAV $B$ in the time interval $\tau$), then UAV $A$ should select a velocity outside the Minkowski sum [31] $\mathrm{WCV}^{\tau}_{A|B} \oplus V_B$ to increase its area coverage and the total coverage, as shown in Figure 2c. The optimal coverage velocity set $\mathrm{OCV}^{\tau}_{A|B}(V_B)$ is defined as follows:
Considering the reciprocity of the UAVs, if a pair of velocity sets $V_A$, $V_B$ for UAVs $A$ and $B$, respectively, satisfy the constraints $V_A \subseteq \mathrm{OCV}^{\tau}_{A|B}(V_B)$ and $V_B \subseteq \mathrm{OCV}^{\tau}_{B|A}(V_A)$, and for all radii $r$, it holds that:
In other words, $\mathrm{ORCV}^{\tau}_{A|B}$ and $\mathrm{ORCV}^{\tau}_{B|A}$ contain the most velocities close to the current velocities $v_A^{\mathrm{cur}}$ and $v_B^{\mathrm{cur}}$ of UAVs $A$ and $B$. The difference between the selected velocities and $v_A^{\mathrm{cur}}$ and $v_B^{\mathrm{cur}}$ is equal for $A$ and $B$. The establishment of $\mathrm{ORCV}^{\tau}_{A|B}$ and $\mathrm{ORCV}^{\tau}_{B|A}$ is shown geometrically in Figure 3.
If the relative velocity $v_A^{\mathrm{cur}} - v_B^{\mathrm{cur}}$ is adjusted to lie out of $\mathrm{WCV}^{\tau}_{A|B}$, then the total coverage area will be increased. Let $u$ be the vector from $v_A^{\mathrm{cur}} - v_B^{\mathrm{cur}}$ to the closest point on the boundary of $\mathrm{WCV}^{\tau}_{A|B}$:
Alternatively, let $u$ be the vector from $v^{\mathrm{cur}}$ to the closest point on the boundary of $\mathrm{WCVL}^{\tau}_{A|B}$:
$n$ is the outward normal vector of the boundary of $\mathrm{WCV}^{\tau}_{A|B}$ at the point $(v_A^{\mathrm{cur}} - v_B^{\mathrm{cur}}) + u$. Because $u$ is the smallest change required in the relative velocity of UAVs $A$ and $B$ to avert weak coverage within time $\tau$, and both UAVs share the responsibility of avoiding weak coverage, UAV $A$ adapts its velocity by at least $\tfrac{1}{2}u$, and $B$ is responsible for the other half:
Clearly, the set $\mathrm{ORCV}^{\tau}_{B|A}$ for $B$ is defined symmetrically. Moreover, the above method is also applicable when $v_A^{\mathrm{cur}} - v_B^{\mathrm{cur}} \notin \mathrm{WCV}^{\tau}_{A|B}$, which indicates that $A$ and $B$ will not cause weak coverage if they keep their current velocities $v_A^{\mathrm{cur}}$ and $v_B^{\mathrm{cur}}$, respectively. In this situation, both UAVs can still use the abovementioned method to maintain coverage-beneficial movement.
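The half-plane construction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the WCVL variant of $u$, and it further assumes that WCVL passes through the origin of relative-velocity space perpendicular to $m_B - m_A$ (which is what a disc centered at $(m_B - m_A)/\tau$ with radius $\|m_B - m_A\|/\tau$ would suggest). The names `reciprocal_half_plane`, `in_half_plane` and the parameter `share` are hypothetical.

```python
import math


def reciprocal_half_plane(m_a, m_b, v_a, v_b, share=0.5):
    """Return (point, normal) of the half-plane {v : (v - point) . normal >= 0}
    that UAV A would use relative to UAV B.

    Assumption: WCVL is the line through the origin of relative-velocity
    space that is perpendicular to m_b - m_a.
    """
    dx, dy = m_b[0] - m_a[0], m_b[1] - m_a[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        raise ValueError("UAV positions coincide")
    # Unit normal pointing out of the weak-coverage side, i.e. away from B.
    n = (-dx / dist, -dy / dist)
    # Current velocity of A relative to B.
    w = (v_a[0] - v_b[0], v_a[1] - v_b[1])
    # Signed offset of w from WCVL along n (negative on B's side).
    s = w[0] * n[0] + w[1] * n[1]
    # u: vector from w to its closest point on WCVL.
    u = (-s * n[0], -s * n[1])
    # A takes `share` of the change (one half in the reciprocal case).
    point = (v_a[0] + share * u[0], v_a[1] + share * u[1])
    return point, n


def in_half_plane(v, point, normal):
    """True if velocity v satisfies the half-plane constraint."""
    return (v[0] - point[0]) * normal[0] + (v[1] - point[1]) * normal[1] >= 0.0
```

Under these assumptions, two initially static UAVs each obtain a half-plane of velocities pointing away from the other, and when both adopt the boundary velocity the resulting relative velocity lies exactly on WCVL.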

Multi-UAV Swarm Coverage
The overall method is as follows. UAV $A$ executes a continuous cycle of sensing and acting with time step $\Delta t$. In each cycle, UAV $A$ acquires the coverage radius, current position and current velocity of itself and of its neighboring UAVs. Let $D_A^{\max}$ be the maximal calculation distance of UAV $A$ with respect to its neighboring UAVs, and let $n_A^{\max}$ ($n_A^{\max} \in \mathbb{N}$) be a positive constant giving the maximum number of neighbors considered by UAV $A$. Hence, among the UAVs $B$ that satisfy $\| m_A - m_B \| \le D_A^{\max}$, UAV $A$ considers only the $n_A^{\max}$ that are nearest in Euclidean distance. A KD-tree is used for UAV $A$ to search for neighboring UAVs in this paper. UAV $A$ deduces the optimal half-plane of velocities $\mathrm{ORCV}^{\tau}_{A|B}$ relative to each neighboring UAV $B$. $\mathrm{ORCV}^{\tau}_{A}$ is the set of velocities that are optimal for UAV $A$ relative to all of its $n_A^{\max}$ neighboring UAVs, i.e., the intersection of the half-planes of optimal velocities induced by its neighbors. UAV $A$ is also constrained by its own maximum speed $v_A^{\max}$. Therefore, the optimal velocity set $\mathrm{ORCV}^{\tau}_{A}$ for UAV $A$ is defined as follows, and its geometrical expression is shown in Figure 4:
Next, UAV $A$ selects a new velocity $v_A^{\mathrm{new}}$ for itself that is optimal (the center velocity of the optimal space in this paper) among all velocities within the optimal velocity space:
Finally, UAV $A$ reaches its new position:
For Equations (9) and (10), the constraints are added in random order in the efficient algorithm [33]. Therefore, the running time of the algorithm still depends on the number of constraints $n$, which is equal to $n_A^{\max}$ here. It has an expected running time of $O(n)$.
$\mathrm{ORCV}^{\tau}_{A}$ may be available or vacant (as shown in Figure 5), and different optimization strategies are adopted in the two cases. In Section 5, the random probability method will be utilized for searching the optimal velocity.
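The per-cycle procedure above (neighbor selection, membership in the intersection of half-planes within the speed limit, and the position update) can be sketched as follows. This is an illustrative sketch only: it replaces the KD-tree search used in the paper with a brute-force scan for brevity, it assumes the position update is a plain Euler step over $\Delta t$, and the function names are hypothetical.

```python
import math


def select_neighbors(idx, positions, d_max, n_max):
    """Indices of at most n_max UAVs within d_max of UAV idx, nearest first.

    Brute-force stand-in for the KD-tree search mentioned in the text.
    """
    ax, ay = positions[idx]
    candidates = []
    for j, (x, y) in enumerate(positions):
        if j == idx:
            continue
        d = math.hypot(x - ax, y - ay)
        if d <= d_max:
            candidates.append((d, j))
    candidates.sort()
    return [j for _, j in candidates[:n_max]]


def in_orcv(v, half_planes, v_max):
    """True if v respects the speed limit and every (point, normal) half-plane."""
    if math.hypot(v[0], v[1]) > v_max:
        return False
    return all(
        (v[0] - p[0]) * n[0] + (v[1] - p[1]) * n[1] >= 0.0
        for p, n in half_planes
    )


def step(position, v_new, dt):
    """Assumed Euler update: new position after moving at v_new for dt."""
    return (position[0] + v_new[0] * dt, position[1] + v_new[1] * dt)
```

A brute-force scan costs $O(n)$ per UAV, so for large swarms a spatial index such as the KD-tree named in the text would be the natural replacement for `select_neighbors`.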

Collision Avoidance between UAVs
The ORCV described by the RD method as noted above satisfies the constraints of avoiding collision with other UAVs, as will be proved as below.
It is clear that UAVs $A$ and $B$ will collide within time $\tau$ if UAV $A$ selects a velocity relative to $B$ within $\mathrm{VO}^{\tau}_{A|B}$ [34], defined as follows:
Although the UAVs remain in a crowded environment, they are assumed to be initially collision-free. The UAVs can then move without colliding with other UAVs by employing the RD method. The proof is as follows.
Corollary 1. For any time interval $\tau$, it holds that: $$\mathrm{VO}^{\tau}_{A|B} \subseteq \mathrm{WCV}^{\tau}_{A|B}$$
Proof. For any time interval $\tau$, UAVs $A$ and $B$ remain in a crowded environment but without collision (see Figure 2a) at the beginning, which satisfies:
$\mathrm{VO}^{\tau}_{A|B} \subseteq \mathrm{WCV}^{\tau}_{A|B}$ □
Corollary 1 is directly shown in the geometry in Figure 6; the space of $\mathrm{WCV}^{\tau}_{A|B}$ always contains $\mathrm{VO}^{\tau}_{A|B}$ at any time $\tau$, and $\mathrm{VO}^{\tau}_{A|B}$ is always to the left of $\mathrm{WCVL}^{\tau}_{A|B}$, which indicates that a coverage-beneficial velocity will not lead to a collision within the time window $\tau$.
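The containment stated in Corollary 1 can be probed numerically with a small sketch. Two assumptions are made here that are not confirmed by the text: the velocity obstacle is taken in its usual truncated form, i.e. the relative velocities $v$ for which $t v$ enters the open disc $C(m_B - m_A, r_A + r_B)$ for some $t \in [0, \tau]$, and the weak-coverage region is treated as the open half-plane on UAV $B$'s side of a WCVL passing through the origin. The function names are hypothetical.

```python
import math


def in_velocity_obstacle(v, d, r_sum, tau):
    """True if t*v enters the open disc C(d, r_sum) for some t in [0, tau].

    d is the relative position m_B - m_A; r_sum is r_A + r_B.
    The closest approach of the segment {t*v} to d is found in closed form.
    """
    vv = v[0] * v[0] + v[1] * v[1]
    if vv == 0.0:
        t = 0.0
    else:
        t = min(max((v[0] * d[0] + v[1] * d[1]) / vv, 0.0), tau)
    return math.hypot(t * v[0] - d[0], t * v[1] - d[1]) < r_sum


def on_weak_coverage_side(v, d):
    """Assumed weak-coverage test: v has a positive component towards B."""
    return v[0] * d[0] + v[1] * d[1] > 0.0
```

When the UAVs start collision-free ($\|m_B - m_A\| \ge r_A + r_B$), every sampled velocity that falls inside the velocity obstacle also lies on the weak-coverage side under these assumptions, which is consistent with the statement that leaving the weak-coverage region also avoids collision.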

Avoiding Collision with Obstacles
The coverage environment contains not only UAVs but also static obstacles and possibly unknown dynamic objects, which are regarded as dynamic obstacles. The reciprocal coverage method is flexible and extensible: collision-free constraints can easily be added to shrink the ORCV as needed.

Static Obstacle
UAVs should take full responsibility for coverage-beneficial motion when faced with static obstacles, because an obstacle cannot cooperate.
In this paper, the obstacles are modeled as a collection of line segments. Let $O$ be one of these line segments, and let $A$ be a UAV with shape radius $r_A$ and coverage radius $R_A$ positioned at $m_A$. The weak coverage velocity set $\mathrm{WCV}^{\tau}_{A|O}$ generated by obstacle $O$ is defined as follows ($o$ is a selected point, as shown in Figure 7):
When the distance between UAV $A$ and obstacle $O$ is less than a constant $\lambda_A$, UAV $A$ should consider the obstacle $O$. If the UAVs are allowed to be less sensitive to the obstacle, $\lambda_A$ can be less than the coverage radius $R_A$; otherwise, $\lambda_A$ should be equal to $R_A$. If UAV $A$'s velocity $v_A$ is within $\mathrm{WCV}^{\tau}_{A|O}$, then weak coverage relative to obstacle $O$ occurs during the time interval $\tau$. $\mathrm{ORCV}^{\tau}_{A|O}$ is defined as the set of optimal velocities that realize coverage-beneficial motion relative to obstacle $O$; it is the intersection of the half-plane bounded by $\mathrm{WCVL}^{\tau}_{A|O}$ and $C(0, v^{\max})$. $\mathrm{WCVL}^{\tau}_{A|O}$ is determined by the selected point $o$ on segment $O$, which is the weakest coverage point, as shown in Figure 7a,b. The geometric construction of the coverage-beneficial and collision-free space $\mathrm{ORCV}^{\tau}_{A|O}$ with the limit of maximum speed $v^{\max}$ is also shown in Figure 7. In this paper, UAV $A$ will not consider an obstacle $O$ that is out of range, i.e., when the distance between UAV $A$ and obstacle $O$ is greater than their collision distance value $R_A$.
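The range test for a line-segment obstacle can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that the relevant point on the segment is the one closest to the UAV; the paper instead selects the weakest coverage point $o$ as shown in Figure 7, which may differ. The function names are hypothetical.

```python
import math


def closest_point_on_segment(p, a, b):
    """Point of segment [a, b] closest to p (a degenerate segment returns a)."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    denom = abx * abx + aby * aby
    if denom == 0.0:
        return a
    # Parameter of the orthogonal projection of p, clamped to the segment.
    t = ((p[0] - a[0]) * abx + (p[1] - a[1]) * aby) / denom
    t = min(max(t, 0.0), 1.0)
    return (a[0] + t * abx, a[1] + t * aby)


def considers_obstacle(m_a, seg_a, seg_b, lam):
    """True if the segment obstacle lies closer to UAV A than the threshold lam."""
    o = closest_point_on_segment(m_a, seg_a, seg_b)
    return math.hypot(o[0] - m_a[0], o[1] - m_a[1]) < lam
```

With `lam` set to the coverage radius $R_A$, the UAV reacts as soon as the obstacle enters its coverage disc; a smaller `lam` makes it less sensitive, as described above.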

Dynamic Obstacle
The crux is that dynamic obstacles do not coordinate and even interfere with the coverage, in contrast with the UAVs. Therefore, the UAVs should take full responsibility for coverage-beneficial motion.
As discussed in Section 3.1, $u$ is the smallest change required in the relative velocity of $A$ and $B$ to avoid weak coverage within time $\tau$. In contrast to Section 3.1, however, UAV $A$ should take full responsibility for collision-free motion, or even more: UAV $A$ adapts its velocity by $\alpha u$ ($\alpha \ge 1$). The constraints are as follows:
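The effect of the responsibility factor $\alpha$ can be illustrated with a short sketch. It assumes that the constraint is a half-plane of the same form as in the two-UAV case, with the boundary shifted by $\alpha u$ instead of $\tfrac{1}{2}u$; the function names are hypothetical, and $u$ and $n$ are taken as given inputs.

```python
def obstacle_half_plane(v_a, u, n, alpha=1.0):
    """Half-plane {v : (v - point) . n >= 0} with UAV A absorbing alpha * u.

    Assumption: same half-plane form as the reciprocal case, but A takes
    the full change (alpha = 1) or more (alpha > 1).
    """
    if alpha < 1.0:
        raise ValueError("alpha must be at least 1 for a non-cooperating obstacle")
    point = (v_a[0] + alpha * u[0], v_a[1] + alpha * u[1])
    return point, n


def satisfies(v, point, n):
    """True if velocity v lies in the half-plane."""
    return (v[0] - point[0]) * n[0] + (v[1] - point[1]) * n[1] >= 0.0
```

For example, with $v_A = (1, 0)$, $u = (-1, 0)$ and $n = (-1, 0)$, the choice $\alpha = 1$ admits $v = (0, 0)$, whereas $\alpha = 2$ pushes the boundary further and admits only velocities with $v_x \le -1$.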

Optimal Velocity Decision
The ORCV for UAV $A$ is constructed in Section 4. In this section, a technique for searching for the optimal velocity in the ORCV is formally presented; it is more efficient than traditional traversal methods.

Random Probability Method
The traditional traversal method must determine the exact region of the search space; however, the exact space is typically difficult to obtain because its shape is uncertain, and it is inefficient to traverse all the possible values in the ORCV. Therefore, a random probability method inspired by the Monte Carlo method is proposed for identifying the optimal velocity within the confirmed optimal space.
The random probability method utilizes the concept of convergence in probability: the mean of a large number of random optimal velocities approaches the center of the optimal velocity space, even though the specific shape of the ORCV is unknown. The core of the random probability method is as follows. For the set $G$ composed of a large number of random velocities $v_{\mathrm{rand}}$, $v_{\mathrm{rand}} \in PR$, it holds that:
Proof. Assuming all the velocities of $PR$ are traversed, the following is obtained:
When $n$ is sufficiently large, according to the Bernoulli law of large numbers:
The symbols used in this paper are defined in Table 1.
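The convergence claim can be illustrated on a region whose center is known in closed form. The sketch below is an assumption-laden example rather than the paper's procedure: it uses rejection sampling from a bounding square, and it picks the upper half of the unit disc as a test region because its centroid is $(0, 4/(3\pi))$. The names `estimate_center` and `upper_half_disc` are hypothetical.

```python
import random


def estimate_center(inside, half_width, n_samples, rng):
    """Mean of uniform samples from [-half_width, half_width]^2 that satisfy
    inside(x, y); returns None if no sample is accepted."""
    sx = sy = 0.0
    count = 0
    for _ in range(n_samples):
        x = rng.uniform(-half_width, half_width)
        y = rng.uniform(-half_width, half_width)
        if inside(x, y):
            sx += x
            sy += y
            count += 1
    if count == 0:
        return None
    return (sx / count, sy / count)


def upper_half_disc(x, y):
    """Test region with known centroid (0, 4 / (3 * pi))."""
    return y >= 0.0 and x * x + y * y <= 1.0
```

Calling `estimate_center(upper_half_disc, 1.0, 50000, random.Random(0))` yields a point close to the analytic centroid, and the estimate tightens as the number of samples grows, in line with the law of large numbers invoked above.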

Table 1. Symbols used in this paper.

| Symbol | Description |
| --- | --- |
| $v_{\mathrm{opt}}$ | The optimal velocity decision. |
| PR | The permitted region with unknown shape. |
| Square(x) | Square centered at 0 whose edge length is twice x. |
| RV(S) | Random velocity in the set S. |
| AVE(G) | The center of the set G in Euclidean space. |
| IdleVel() | The velocity of 0. |
| NUM(S) | The number of points in the set S. |

Optimum Available
When the optimal region is available, the search for the optimal velocity follows Algorithm 1, which is an effective technique for determining a velocity that is very close to the center of the ORCV. Algorithm 1 is described below.
if $v_{\mathrm{rand}} \subset \mathrm{ORCV}^{\tau}_{A|*}$ then
8: $v_{\mathrm{rand}} \rightarrow$
The optimal velocity has been explored.
15: return.
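Because only fragments of Algorithm 1 are available here, the following Python sketch is a reconstruction under stated assumptions rather than the published algorithm. It assumes the ORCV is given as a list of half-planes plus a speed limit, draws candidates as RV(Square($v^{\max}$)), collects accepted candidates into G, and returns AVE(G); when no candidate is accepted, it falls back to IdleVel(), which is an assumed handling of the vacant case. The function name and its return convention are hypothetical.

```python
import math
import random


def random_probability_velocity(half_planes, v_max, num=1000, rng=None):
    """Return (velocity, accepted_count) for an ORCV described by
    (point, normal) half-planes and the speed limit v_max."""
    rng = rng or random.Random()
    gx = gy = 0.0
    accepted = 0
    for _ in range(num):
        # RV(Square(v_max)): uniform sample in the square of edge 2 * v_max.
        vx = rng.uniform(-v_max, v_max)
        vy = rng.uniform(-v_max, v_max)
        if math.hypot(vx, vy) > v_max:
            continue  # violates the speed limit
        if all((vx - p[0]) * n[0] + (vy - p[1]) * n[1] >= 0.0
               for p, n in half_planes):
            gx += vx
            gy += vy
            accepted += 1
    if accepted == 0:
        # Assumed fallback for a vacant set: IdleVel().
        return (0.0, 0.0), 0
    # AVE(G): mean of the accepted random velocities.
    return (gx / accepted, gy / accepted), accepted
```

Since an intersection of half-planes with a disc is convex, the mean of the accepted samples is itself inside the ORCV whenever at least one sample is accepted, so the returned velocity never violates the constraints it was sampled from.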

Vacant Optimal Velocity Space
When the ORCV is empty, the exploration of optimal velocity follows Algorithm 2 as below. Algorithm 2 is an effective technique to confirm the likelihood that the ORCV is vacant.

Numerical Test
The ORCV constructed via the RD method may be available or null. Therefore, the numerical test is conducted in these two situations to verify the feasibility and rationality of the technique and the value adopted in this paper.

Available Set
When the ORCV is available, it may present various situations, such as circular sector, regular polygon and irregular shape (see Figure 8).
In this paper, the center of the ORCV for each UAV is regarded as the optimal choice because it balances the benefit relative to every neighbor. However, it is difficult to compute this optimum exactly because the shape of the ORCV is difficult to determine. Thus, a technique for selecting a velocity as close as possible to the optimal velocity is proposed.
The error between the velocity generated by the proposed technique and the optimal velocity is shown in Figure 9. The error decreases as the number of random points increases, and it is very close to zero with a large number of random points.

Null Set
A null set may occur in the construction of the ORCV, for example in the case of overspeed or excess constraints (see Figure 10). However, it is also difficult to determine whether the ORCV is empty. Therefore, the proposed technique is suitable for such a case. The error of evaluating the set as null while the ORCV is actually non-empty is shown in Figure 11. As expected, the error declines as the number of random points increases. The error nearly reaches zero after 200 random points, which confirms the validity of the proposed technique for evaluating the null set when the number of random points is set to 1000.
The optimal velocity can be improved by increasing the number of random points at a small penalty. However, it is sufficient to set the number of random points to 1000 in this paper.

Simulation and Results
To validate the effectiveness and performance of the proposed method with and without static obstacles, small-scale coverage is simulated in Section 5.1 and extended to large scale in Section 5.2; the evaluation covers several performance measures, such as coverage rate, deadweight loss, trajectory smoothness and convergence speed. In addition, a Robot Operating System (ROS) simulation is conducted to further improve the reliability of the proposed method. The simulation is programmed in C++ using OpenMP to parallelize key computations across eight Intel(R) 2.60 GHz cores. The simulation parameters are shown in Table 2.
The algorithm terminates when $\sup \| a^{*}_{i+1} - a^{*}_{i} \|_2 \le \zeta$. The UAVs are randomly distributed in $\Omega_a$ at the beginning.
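The termination test can be sketched as follows. The interpretation is an assumption: $a^{*}_{i}$ is taken to be the collection of UAV positions at step $i$, and the supremum is taken over the UAVs, so the run stops once no UAV moves more than $\zeta$ between consecutive steps. The function name is hypothetical.

```python
import math


def has_converged(prev_positions, new_positions, zeta):
    """True when the largest per-UAV displacement between two consecutive
    steps is at most zeta (assumed reading of the stopping criterion)."""
    largest = max(
        math.hypot(q[0] - p[0], q[1] - p[1])
        for p, q in zip(prev_positions, new_positions)
    )
    return largest <= zeta
```

A simulation loop would call this once per step with the positions before and after the update and stop as soon as it returns `True`.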

Small-Scale
A case in a closed environment $\Omega_e$ without obstacles is shown in Figure 12. First, the UAVs are randomly distributed in the crowded region $\Omega_a$, as shown in Figure 12a. Then, according to their local information, they begin to disperse to improve coverage without collision. The UAVs' trajectories are recorded in Figure 12b, where the smoothness of the trajectories is noticeable. The algorithm converges at simulation step $k = 574$, and the optimal coverage position of each UAV is shown in Figure 12c.
During the movement of the UAVs, collisions with other UAVs are avoided. The minimum distance between UAVs (green thick line) and the collision critical value (red thin dashed line) at each simulation step $k$ are shown in Figure 13, which demonstrates the collision-free movement of the UAVs intuitively.
A case with a rectangular static obstacle $\Omega_o$, whose sides are both 10 m long, is considered in Figure 14, while the other conditions are the same as before. The UAVs' initial positions are shown in Figure 14a. Next, the UAVs begin to disperse to improve coverage, balancing the avoidance of collision with other UAVs and with the static obstacle. The obstacle is considered only when it is within the range of UAV $A$, which is equal to the coverage radius $R$ in this paper. The UAVs' trajectories are recorded in Figure 14b, where circumnavigation around the obstacle is noticeable. The algorithm converges at simulation step $k = 1248$, and the optimal coverage position of each UAV is shown in Figure 14c.
During the movement of the UAVs, collisions with other UAVs and with obstacles are avoided. If the minimal distance between two UAVs is less than the collision critical value (the sum of the two UAVs' shape radii), then a collision between these two UAVs would occur at that moment. In Figure 15, the minimum distance between UAVs (green thick line) and the collision critical value (red thin dashed line) at each simulation step $k$ are shown, demonstrating the effectiveness of averting collision with other UAVs intuitively.
In Figure 16, the distance between the UAVs and the obstacle is shown only when the obstacle is within the range of the UAV, demonstrating the effectiveness of averting collision with an obstacle intuitively. In this case, only UAV 1, UAV 8 and UAV 16 are assumed to consider the obstacle, while the other UAVs only need to consider their neighboring UAVs and the boundary of $\Omega_e$.
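The collision check plotted in Figures 13 and 15 compares the minimum inter-UAV distance with the collision critical value (the sum of the two shape radii). A minimal sketch of that comparison for one simulation step is given below; the function name and the idea of reporting a signed clearance are illustrative choices, not taken from the paper.

```python
import math
from itertools import combinations


def min_clearance(positions, radii):
    """Smallest value of distance(i, j) - (r_i + r_j) over all UAV pairs.

    A negative result means at least one pair of disc-shaped UAVs overlaps.
    """
    return min(
        math.hypot(positions[i][0] - positions[j][0],
                   positions[i][1] - positions[j][1]) - (radii[i] + radii[j])
        for i, j in combinations(range(len(positions)), 2)
    )
```

Evaluating this at every step and checking that the result stays non-negative corresponds to the green curve staying above the red dashed line in the figures.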
To quantify and objectively appraise the performance of the RD method proposed in this paper, comparisons with the traditional V-based method [20] and the VFA method [30] in an environment without obstacles are presented as follows. The traditional V-based method and the VFA method use the same parameters as the RD method.

The coverage situations of the V-based method and the VFA method are shown in Figure 17, of which (a-c) are belong to the V-based method and (d-f) are belong to the VFA method. The initial positions of each UAV are shown in Figure 17a,d, the recorded trajectories of UAVs in simulation step k = 0 ∼ 574 are shown in Figure 17b,e, and the situation at k = 574 are displayed in Figure 17c,f, where the RD method (shown in Figure 12c) is superior to the V-based method and the VFA method in the field of convergence speed and the coverage rate can be easily found visually. During the simulation step  0~574 k , the trajectories of UAVs generated by the RD method, the V-based method and the VFA method are shown in Figure 18, where Figure 18a is the RD method's trajectories, Figure 18b is the V-based method's trajectories and Figure 18c is the VFA method's trajectories. A feature of the trajectory of UAV 23 reveals the improved smoothness and lower oscillation of the RD method than the V-based method and the VFA method. The reason is that RD considers the reciprocal of UAVs but the other methods ignore it. During the simulation step k = 0 ∼ 574, the trajectories of UAVs generated by the RD method, the V-based method and the VFA method are shown in Figure 18, where Figure 18a is the RD method's trajectories, Figure 18b is the V-based method's trajectories and Figure 18c is the VFA method's trajectories. A feature of the trajectory of UAV 23 reveals the improved smoothness and lower oscillation of the RD method than the V-based method and the VFA method. The reason is that RD considers the reciprocal of UAVs but the other methods ignore it. During the simulation step  0~574 k , the trajectories of UAVs generated by the RD method, the V-based method and the VFA method are shown in Figure 18, where Figure 18a is the RD method's trajectories, Figure 18b is the V-based method's trajectories and Figure 18c is the VFA method's trajectories. 
A feature of the trajectory of UAV 23 reveals the improved smoothness and lower oscillation of the RD method than the V-based method and the VFA method. The reason is that RD considers the reciprocal of UAVs but the other methods ignore it. Next, the comparison of the coverage rate and deadweight loss among the RD, V-based and VFA methods are exhibited in Figure 19a,b respectively, which shows that the RD method has a higher coverage rate and less deadweight loss than the other two methods at the same time. The advantage of RD in coverage rate and deadweight loss also owes to its consideration of the UAVs' reciprocity. With the increasing scale of UAVs, the difference in convergence speed among these three methods is shown in Figure 20, which indicates that the RD method is more scalable and adaptable than the other two methods.
The calculation speed in various environments but with  25 n UAVs is shown in Table 3. For each UAV, it takes 14.807 ms in average to optimize coverage decision while utilizing the RD method. For each UAV, more than 500 ms is required to make a decision using the V-based method and the VFA method needs about 42.798 ms in average. This is because RD is direct optimized in velocity space while V-based method spends a lot of time in Voronoi partition in configuration space. Next, the comparison of the coverage rate and deadweight loss among the RD, V-based and VFA methods are exhibited in Figure 19a,b respectively, which shows that the RD method has a higher coverage rate and less deadweight loss than the other two methods at the same time. The advantage of RD in coverage rate and deadweight loss also owes to its consideration of the UAVs' reciprocity. During the simulation step  0~574 k , the trajectories of UAVs generated by the RD method, the V-based method and the VFA method are shown in Figure 18, where Figure 18a is the RD method's trajectories, Figure 18b is the V-based method's trajectories and Figure 18c is the VFA method's trajectories. A feature of the trajectory of UAV 23 reveals the improved smoothness and lower oscillation of the RD method than the V-based method and the VFA method. The reason is that RD considers the reciprocal of UAVs but the other methods ignore it. Next, the comparison of the coverage rate and deadweight loss among the RD, V-based and VFA methods are exhibited in Figure 19a,b respectively, which shows that the RD method has a higher coverage rate and less deadweight loss than the other two methods at the same time. The advantage of RD in coverage rate and deadweight loss also owes to its consideration of the UAVs' reciprocity. 
With the increasing scale of UAVs, the difference in convergence speed among these three methods is shown in Figure 20, which indicates that the RD method is more scalable and adaptable than the other two methods.
The calculation speed in various environments but with  25 n UAVs is shown in Table 3. For each UAV, it takes 14.807 ms in average to optimize coverage decision while utilizing the RD method. For each UAV, more than 500 ms is required to make a decision using the V-based method and the VFA method needs about 42.798 ms in average. This is because RD is direct optimized in velocity space while V-based method spends a lot of time in Voronoi partition in configuration space. With the increasing scale of UAVs, the difference in convergence speed among these three methods is shown in Figure 20, which indicates that the RD method is more scalable and adaptable than the other two methods.
The calculation speed in various environments but with n = 25 UAVs is shown in Table 3. For each UAV, it takes 14.807 ms in average to optimize coverage decision while utilizing the RD method. For each UAV, more than 500 ms is required to make a decision using the V-based method and the VFA method needs about 42.798 ms in average. This is because RD is direct optimized in velocity space while V-based method spends a lot of time in Voronoi partition in configuration space.
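To put the per-UAV decision times quoted above into perspective, the following short sketch computes their ratios. The 500 ms figure is only a lower bound for the V-based method, so the corresponding ratio is a lower bound as well; the constant and function names are hypothetical.

```python
def speedup(baseline_ms, method_ms):
    """Ratio of a baseline decision time to the compared method's time."""
    return baseline_ms / method_ms


# Average per-UAV decision times quoted in the text (n = 25 UAVs).
RD_MS = 14.807
VFA_MS = 42.798
V_BASED_MS_LOWER_BOUND = 500.0  # the text states "more than 500 ms"

print(f"VFA / RD: {speedup(VFA_MS, RD_MS):.2f}x")  # VFA / RD: 2.89x
print(f"V-based / RD: more than {speedup(V_BASED_MS_LOWER_BOUND, RD_MS):.1f}x")  # V-based / RD: more than 33.8x
```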

Large-Scale
To verify the scalability of the RD method, swarm coverage by 1000 UAVs is simulated in a 2000 m × 1000 m rectangular region Ω_C with static obstacles, as shown in Figure 21. The parameters of each UAV are the same as in Section 5.1. First, the UAVs are static and randomly distributed within a 1000 m × 500 m rectangular region Ω_a, as shown in Figure 21a. During the collision-free interaction among the UAVs, the covered area increases, as shown in Figure 21b. Finally, the algorithm converges to an extremum solution, as shown in Figure 21c.
From the simulations and data above, the advantages of RD can be summarized as follows. First, RD, which is distributed, asynchronous and self-organized, achieves a higher coverage rate and less deadweight loss while converging quickly. Additionally, RD leads to smoother trajectories and faster decisions. Finally, the RD method adapts better to various scenes, such as situations with obstacles or large-scale coverage, and provides collision avoidance, scalability and flexibility.


Robot Operating System (ROS) Simulation
To further verify the effectiveness of the proposed method, a simulation of multi-UAV sensing coverage is conducted using ROS Jade and Gazebo 5.0 on an Intel (x86) PC running Ubuntu 14.04.
Limited by the PC's performance, a miniature multi-UAV sensing coverage scenario is set up, in which 16 UAVs cooperatively cover a mountainous region. The region has an area of 16,384 square meters, with both length and width equal to 128 m. Each UAV flies at a height of 50 m with a maximum velocity of 20 m/s and a sensing scope of 30 m × 30 m. In addition, each UAV is instantiated as an independent ROS node, so the simulation runs in a distributed way. A screenshot of the simulation in Gazebo is shown in Figure 22.
In Figure 23, three typical moments are captured. Figure 23a shows the 16 UAVs assembled in the center of the mountainous region at t = 0. The UAVs then begin to scatter to maximize the sensing coverage, as shown in Figure 23b. Finally, the UAVs reach a steady state at 10 s, having attained their maximum coverage. As can be seen from the simulation, the proposed method has potential and practical value.
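As a rough sanity check on the scenario parameters above (16 UAVs, 30 m × 30 m sensing scope, 128 m × 128 m region), the sketch below estimates how much of the region can be covered at best. It assumes that the coverage rate is the covered area divided by the region area and that the sensing footprints are axis-aligned squares; the function names and the 4 × 4 lattice layout are hypothetical and do not represent the final configuration reached in the ROS simulation.

```python
def coverage_upper_bound(n_uavs, footprint_side, region_side):
    """Best-case coverage rate when no two footprints overlap."""
    covered_area = n_uavs * footprint_side ** 2
    return min(1.0, covered_area / region_side ** 2)


def grid_coverage_rate(centers, footprint_side, region_side, cell=1.0):
    """Fraction of grid cells whose center lies inside at least one
    axis-aligned square footprint centered on a UAV."""
    half = footprint_side / 2.0
    n = int(region_side / cell)
    covered = 0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * cell, (j + 0.5) * cell
            if any(abs(x - cx) <= half and abs(y - cy) <= half
                   for cx, cy in centers):
                covered += 1
    return covered / (n * n)


# All 16 UAVs stacked at the region center, as at t = 0.
stacked = [(64.0, 64.0)] * 16
# Hypothetical 4 x 4 lattice with 32 m spacing (non-overlapping footprints).
lattice = [(16.0 + 32.0 * i, 16.0 + 32.0 * j)
           for i in range(4) for j in range(4)]

print(coverage_upper_bound(16, 30.0, 128.0))       # 0.87890625
print(grid_coverage_rate(stacked, 30.0, 128.0))    # 0.054931640625
print(grid_coverage_rate(lattice, 30.0, 128.0))    # 0.87890625
```

Under these assumptions, the 16 footprints can cover at most about 87.9% of the region, compared with about 5.5% when all UAVs are assembled at the center.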

Conclusions and Future Work
In this paper, a reciprocal decision approach is proposed for sensing coverage with multi-UAV swarms. The approach is self-organized, distributed and autonomous, and it does not require optimal parameters to be determined through repeated experiments. This makes it more suitable for heterogeneous sensing coverage, especially in multi-UAV swarms where each UAV's sensing capability changes with its flight height. In contrast to the traditional methods in configuration space, the coverage problem is optimized directly in velocity space, which is more concise and efficient. First, the reciprocity between UAVs is considered to reduce the oscillation of the UAVs' trajectories. Second, the coverage-beneficial and collision-free set ORCV is determined by adjusting the velocity out of the WCV relative to neighboring UAVs. Furthermore, a corresponding random probability method is proposed for selecting the optimal velocity in the ORCV. Finally, the simulation results corroborate that the proposed method performs better than the V-based and VFA methods in terms of coverage rate, convergence rate, trajectory smoothness and scalability. In addition, a ROS simulation is conducted to validate the availability and practicability of the RD method.
In future work, the UAV model can be made more specific in terms of kinematics, dynamics and coverage capacity by adding constraints to the velocity space. Moreover, only the 2-D environment is demonstrated in this paper; the method can be further extended to the 3-D situation.

Conflicts of Interest:
The authors declare no conflict of interest.