Case Study of a Retrieval Method of 3D Proxy Reflectivity from FY-4A Lightning Data and Its Impact on the Assimilation and Forecasting for Severe Rainfall Storms

: As the first Geostationary Satellite with the LMI (Lightning Mapping Imager) instrument aboard running over the eastern hemisphere, FY-4A (Feng-Yun-4A) can better indicate severe convection and compensate for the limitations of radar observation in temporal and spatial resolution. In order to realize the application of FY-4A lightning data in numerical weather prediction (NWP) models, a logarithmic relationship between FY-4A lightning density and maximum radar reflectivity is presented to convert FY-4A lightning data into maximum FY-4A proxy reflectivity. Then, according to the profiles of radar reflectivity, the maximum FY-4A proxy reflectivity is extended to 3D FY-4A proxy reflectivity. Finally, the 3D FY-4A proxy reflectivity is assimilated in RMAPS-ST (Rapid-refresh Multi-scale Analysis and Prediction System — Short Term) to compare with radar assimilation. Four groups of continuous cycling data assimilation and forecasting experiments are carried out for a severe rainfall case. The results demonstrate that cycling assimilation of 3D FY-4A proxy reflectivity can adjust the moisture condition effectively, and indirectly affects the temperature and wind fields, then makes the thermal and dynamic analysis more reasonable. The Fractions Skill Scores (FSSs) show that the rainfall forecasts are improved significantly within 6 hours by assimilating 3D FY-4A proxy reflectivity, which is similar to the parallel experiments in assimilating radar reflectivity. In addition, other cycling data assimilation experiments are carried out in mountainous areas without radar data. The improvement of precipitation forecasts in mountainous areas further proves that the application of assimilating 3D FY-4A proxy reflectivity can be considered a useful substitute where observed radar data are missing. Through the two severe rainfall cases, this method could be framed as an example of how to use lightning for data assimilation.


Introduction
Lightning is a common discharge phenomenon in convective weather that exists in clouds or cloud to ground [1,2]. With the information of time, position, quantity and polarity in lightning events, lightning data can identify complex thermal or dynamic processes in thunderstorm clouds [3][4][5][6]. Therefore, lightning data can be used to monitor the occurrence and evolution of mesoscale convective systems [7]. Moreover, lightning detection has the advantages of high-precision, long-distance coverage, and is relatively unaffected by terrain [8][9][10][11]. With the accumulation of high-quality lightning data, the application of lightning assimilation techniques in NWP models, aiming to improve the forecasting skills of convective precipitation, has been a major scientific topic recently [12][13][14][15].
Although it has been proved that the assimilation of lightning data can improve the initialization of NWP, the difficulty still exists that lightning is not a modeled or prognostic variable in most existing NWP models. Hence, the aim of assimilating lightning data is to find a suitable observation operator to transfer model or prognostic variables to lightning data, or to convert lightning data into some other observations which can be assimilated in the existing data assimilation (DA) systems [16,17].
As lightning data has been widely recognized as a valuable proxy variable for identifying the occurrence of severe convection, some researchers converted lightning data into humidity or hydrometeors, and these retrieved observations were then used to initialize NWP by using some objective analysis techniques. For instance, rainfall rates were derived from lightning data for assimilation in the meteorological model, which has a positive feedback on 12-24h precipitation forecasts [18,19]. Humidity profiles are nudged towards empirical profiles using lightning data to force deep moist convection in a mesoscale model [8,9]. Water vapor mixing ratio was adjusted by a nudging function where lightning was observed but deep convections were not simulated, and by means of modifying the microphysics parameterization scheme, the locations of the convective systems were improved on 0-3h different model forecasts (WRF [12][13][14]; RAMS [20,21]). Through substituting the water mass by mixing ratios of selected ice species in nudging function, lightning data were also used to adjust the density of ice-phase particles in the mixed-phase region in WRF, and the convective precipitation was improved clearly in active lightning regions [22,23]. Besides, lightning data were used to discern the graupel-dominant regions, and its assimilation helped improve the short-term lightning and precipitation forecasts [24]. Some other researchers tried to use lightning data for controlling the trigger of parameterization schemes in NWP models. Lightning data are used as a proxy for the presence or absence of deep convection to control the triggering of the Kain-Fritsch convective parameterization scheme in different models (COAMPS [10]; MM5 [15]; WRF [25]).
In addition, the location and density of lightning were found to be highly correlated to that of radar reflectivity observations, so some researchers tried to use lightning data as a supplement to radar reflectivity, and results show that assimilating radar reflectivity and lightning data together has a significant and positive impact on the precipitation forecast [14,21,28]. Some other researchers converted lightning data into proxy radar reflectivity and then assimilated the proxy reflectivity [27][28][29][30][31][32]. This scheme has been evaluated via different assimilation methods (DDFI [33]; EnKF [16]) in DA systems, and the results demonstrated that it can improve the accuracy of reflectivity prediction as well as short-term rainfall forecasts, especially in areas with missing radar data [34].
As the first Geostationary Satellite with an LMI (Lightning Mapping Imager) instrument aboard running over the Eastern Hemisphere, FY-4A is able to detect the real-time total lightning over China, the Western Pacific and the India Ocean via optical detection techniques [36]. Compared with Doppler radar network and land-based lightning detection systems, FY-4A is capable of detecting lightning over broader regions. With higher temporal and spatial resolution, the detection efficiency of the total flash is theoretically 90% [36]. The performance characteristic of LMI [35] is similar to that of the GOES-R GLM (Geostationary Operational Environmental Satellites-R Series; Geostationary Lightning Mapper) [37], which has been applied in GLM lightning assimilation [38]. Although FY-4A products are relatively mature and LMI demonstrates outstanding advantages in lightning detection, few lightning data assimilation methods of FY-4A LMI products have been applied in NWP models.
In order to realize the application of FY-4A lightning data in NWP, the quantitative relationship between radar reflectivity and FY-4A lightning data is discussed, and its impact on the assimilation and forecasting for a rainfall case is explored. This article is organized as follows. In Section 2, the FY-4A lightning data are described. Section 3 presents the retrieval method to convert FY-4A lightning data into 3D proxy radar reflectivity. The assimilation experiments comparing with radar assimilation are presented in Section 4. The assimilation experiments in mountainous areas without radar data are discussed in Section 5. The discussions and conclusions are given in Sections 6 and 7.

Lightning Data Detected by FY-4A LMI
The FY-4A LMI is the first lightning detection sensor on Chinese satellites, which is able to detect total lightning activity (in-cloud and cloud-to-ground lightning) continuously day and night. The FY-4A LMI provides 500 frames per second and has a  400 600 pixel chargecouple device (CCD) camera to operate at 777.4 nm to count flashes and measure their intensity. As used for Tropical Rainfall Measuring Mission (TRMM) Lightning Imaging Sensor (LIS) and GOES-16 GLM, a tree-structured algorithm is used to cluster optical events into groups and groups into flashes, therefore, three types of products including 'event', 'group' and 'flash' are available [35,36]. In this study, the product of 'flash' is used, which records the information of lightning occurrence, including latitude, longitude, time and other information [35].
The comparison of 1-hour accumulated precipitation, 1-hour accumulated lightning strikes observed by FY-4A LMI and by ADTD (Advanced Time of Arrival and Direction) lightning observation system at different times are presented in Figure 1. It is shown that the lightning strikes observed by FY-4A LMI correspond well with those by ADTD. When comparing with ADTD, more lightning strikes have been observed by FY-4A LMI because FY-4A LMI is capable of observing the lightning strikes inside clouds and among clouds, which account for a large proportion of total lightning strikes [36,39]. The distributions of the 1-hour accumulated lightning strikes observed by FY-4A and ADTD are both in good agreement with the 1-hour accumulated precipitation. The 1-hour accumulated precipitation in red boxes corresponds to the lightning strikes observed by FY-4A, while these lightning strikes failed to be detected by ADTD. It might be because FY-4A is less affected by geographical constraints than ADTD, and the lightning strikes detected in clouds by FY-4A are more likely to trigger precipitation over the next few hours. Compared with ADTD lightning observation system, FY-4A LMI can observe more lightning information and better indicate precipitation.

Figure 1.
The comparison of 1-hour accumulated precipitation (the first column), 1-hour accumulated lightning strikes observed by FY-4A LMI (the second column) and lightning strikes observed by ADTD (the third column) at different times; (a,b,c) 0800 UTC to 0900 UTC, 9 June 2018; (d,e,f) 0900 UTC to 1000 UTC, 20 June 2018; (g,h,i) 1000 UTC to 1100 UTC, 21 June 2018; (j,k,l) 0800 UTC to 0900 UTC, 25 June 2018. The red box represents the precipitation which corresponds to the lightning strikes observed by FY-4A, while failed to be detected by ADTD.

Retrieval Methods
In this section, the lightning density is first compared to the observed radar reflectivity. Then, the lightning density is converted to maximum FY-4A proxy reflectivity using a logarithmic relationship. Finally, using the real-time profiles of reflectivity, the maximum FY-4A proxy reflectivity is expanded within the vertical grid-column to obtain 3D FY-4A proxy reflectivity ( Figure 2).

Lightning Density
Lightning density is the number of lightning strikes in a given grid box summed over a period of time around the analysis hour [16,17,34,40]. In this study, the lightning density at the analysis time (T) was defined as the accumulated lightning strikes over a 1-h period (T-20 to T + 40 min) within a 0.1° 0.1° grid box. The distributions of maximum radar reflectivity and FY-4A lightning density at 0900 UTC, 1200 UTC, 1500 UTC and 1800 UTC on 8 August 2017 are shown in Figure 3. It can be seen that FY-4A lightning density and maximum radar reflectivity have similar patterns. The locations of FY-4A lightning density centers and maximum radar reflectivity centers are close to each other. At all the analysis time of this case study, the maximum FY-4A lightning density reaches 25 strikes and the maximum radar reflectivity reaches 55 dBZ, on 8 August 2017.

Proxy Radar Reflectivity Retrieval
The FY-4A lightning density is converted into the 3D FY-4A proxy reflectivity in two steps. The first step is to convert the FY-4A lightning density into the maximum FY-4A proxy reflectivity according to their logarithmic relationship. The second step is extending the maximum FY-4A proxy reflectivity into the vertical distribution based on the profiles of radar reflectivity.

Proxy Maximum Radar Reflectivity
Based on the high correlation between lightning and convective rainfall, lightning strikes occurring within a time window around the PacNet (Pacific Lightning Detection Network) overpass time were counted over the grid cells and compared with convective rainfall rate derived from TRMM's data. The results of curve fitting present an obvious logarithmic lightning-convective rainfall relationship [40]. As the previous section shows the high correlation between lightning density and maximum radar reflectivity, referring to the method of curve fitting in lightning-convective rainfall relationship, in this section we present a logarithmic relationship between lightning density and maximum radar reflectivity. To present the quantitative relationship between FY-4A lightning density and maximum radar reflectivity, 6 severe convective cases from the Beijing area in summer 2017 were selected (Table 1).
Using the radar reflectivity and FY-4A lightning data for the 6 severe convective cases, the mean and median of maximum radar reflectivity corresponding to the same lightning density are taken to determine the logarithmic relationship between FY-4A lightning density and maximum radar reflectivity. It can be seen from Figure 4 that the reflectivity fitted by the Simple-Linear and Non-Linear curves in the GSI (Gridpoint Statistical Interpolation) assimilation system are underestimated comparing with the observation, which might be because the empirical formulas in GSI counted the observations in north America, which might not be consistent with FY-4A lightning data in China.
Both the mean and median are representative. The median is not affected by the maximum or minimum value of the data sequence, which is different from the mean. However, as determined by its position in an array, the median cannot make full use of the characteristics of all data. Since it is unknown whether the maximum or minimum value is an outlier or has other negative effects on analysis, both the mean and median of maximum radar reflectivity corresponding to the same lightning density are taken in this study.
Two sets of coordinate points, (lightning density, the mean of maximum radar reflectivity) and (lightning density, the median of maximum radar reflectivity), are drawn on the same plot. All the 60 coordinate points are used at the same time for curve-fitting, where X values are lightning density from 1 to 30 twice, and the Y values are the mean and median of radar reflectivity corresponding to the same lightning density, to determine the logarithmic curvefitting between FY-4A lightning density and maximum radar reflectivity, where X is the FY-4A lightning density, Y is the maximum radar reflectivity. According to the results of curve-fitting, a and b are set to 14.133 and 26.533.
The logarithmic fitting curve is shown in Figure 4. The FY-4A logarithmic curve fitted from FY-4A observations passes significance test with 95% confidence interval, the correlation coefficient between the reflectivity fitted by the logarithmic curve with the observations is 0.767. It is obvious that the FY-4A logarithmic fitting curve calculated in this section is consistent with the observed maximum radar reflectivity.  The maximum FY-4A proxy reflectivity is calculated by using the formula at the grids where lightning density is more than 3 strikes. From the comparison of maximum radar reflectivity above 35 dBZ (Figure 5a,c,e,g) and maximum FY-4A proxy reflectivity at the grids where lightning density is more than 3 strikes (Figure 5b,d,f,h), it indicates that, in general, the intensity and distribution of the maximum FY-4A proxy reflectivity are analogous to that of maximum radar reflectivity, but at some times there is a slight underestimation of the proxy maximum radar reflectivity in intensity (1500 UTC, 1800 UTC) and a little overestimation of proxy maximum radar reflectivity in distribution (0900 UTC, 1200 UTC), which might be for the time lag difference between lightning frequency and radar reflectivity as respectively determined by accumulation and instant.

D FY-4A Proxy Reflectivity
The maximum FY-4A proxy reflectivity needs to be extended within the vertical gridcolumn by a profile to obtain the 3D FY-4A reflectivity. The profile used in GSI is a function of the maximum proxy reflectivity, which varies as maximum proxy reflectivity increases from 20-25 dBZ to 45-50 dBZ (in 5dBZ bins [15,17,34]). Referring to the classification of profiles for expansion in GSI, we calculated real-time profiles for the different maximum radar reflectivity thresholds respectively. Considering the radar reflectivity used as statistical data are from realtime background fields output, and there are relatively few radar data within the threshold range of 5dBZ, the profiles calculated in 5dBZ bins are excessively oscillating, and is unable to represent the overall trend of the vertical reflectivity distribution. Therefore, the maximum radar reflectivity is divided into three thresholds (30 to 40 dBZ, 40 to 50 dBZ, and 50 to 60 dBZ) with the threshold range of 10dBZ, and the radar reflectivity data from the ground to the height of 14 km at each analysis time is divided into vertical layers in 200m bins. For each maximum radar reflectivity threshold, the radar reflectivity are averaged in each layer to obtain a set of coordinate points (mean of reflectivity, height), which constitutes the profiles for expansion ( Figure 6). Based on the height of the maximum radar reflectivity in the profiles, the maximum FY-4A proxy reflectivity is extended to other layers and 3D FY-4A proxy reflectivity are finally obtained.   To quantitatively evaluate the 3D FY-4A proxy reflectivity converted from FY-4A lightning data, verification scores are calculated in this section. Referring to the 2×2 contingency table resulted from a dichotomous (yes-no) system [41], the POD (the probability of detection) and Bias (for 0-35 dBZ and 35-80 dBZ) at different layers are calculated at each analysis time from 0900 UTC to 1800 UTC with a time interval of 1 hour. The 10-hour average of the verification scores for 3D FY-4A proxy reflectivity at different heights can be seen in Table 2.
For reflectivity below 35 dBZ, the POD scores at layers below 3000 m and above 6000 m are higher than the layers between 3000 m and 6000 m. For reflectivity above 35 dBZ, the POD scores at layers between 3000 m and 6000 m are higher than the layers below 3000 m and above 6000 m. This distribution of POD scores might be for most of the maximum radar reflectivity above 35dBZ are concentrated at the middle layers around 4000m, and the reflectivity of the upper and lower layers are relatively low. The results are relatively consistent with the distribution shown in the Figure 7. Besides, bias scores more than 1 means the area of 3D FY-4A proxy reflectivity is bigger than the area of radar reflectivity. For reflectivity below 35 dBZ, the bias scores are more than 1 for the layers below 3000 m and less than 1 for the layers above 3000 m, which means there's a little overestimation of 3D FY-4A proxy reflectivity in distribution for layers below 3000 m and a little underestimation for layers above 3000 m. On contrary, for reflectivity above 35 dBZ, there's a little underestimation of 3D FY-4A proxy reflectivity in distribution for layers below 3000 m and a little overestimation for layers above 3000 m. The difference of the bias behavior for 0-35 dBZ and 35-80 dBZ indicates that underestimation in intensity exists for reflectivity below 3000m and overestimation exists for reflectivity above 3000m. Maximum FY-4A proxy reflectivity concentrated around 4000 m might be one of the reasons for the overestimation of the layers between 3000 m and 6000 m.

Experiments Set Up
In this study, the Rapid update Multi-scale observation data Assimilation analysis and short-term Prediction (0-12h) System, henceforth referred to as RMAPS-ST, is used in numerical experiments. RMSPS-ST, the operational system of the Institute of Urban Meteorological, CMA, Beijing [42][43][44], is a regional rapid-updating system based on the Weather Research and Forecasting (WRF) model and the WRFDA data assimilation system that started its operational analysis and forecasting in May 2017. The operational domain of this system is shown in Figure 8 and the configuration is shown in Table 3.
RMAPS-ST starts eight times a day at intervals of 3 hours by a partial cycle strategy [45], which means the background field at 0000 UTC is the 6-hour forecast of 1800 UTC on the previous day, and the cold start-up at 1800 UTC on the previous day is driven by the 6-hour forecast of 0.125° ECMWF (European Centre for Medium-Range Weather Forecasts) global forecast fields at 1200 UTC. Other background fields after 0000 UTC are driven by the 3-hour forecast of the previous start-up. Besides, at 1800 UTC on the previous day, the boundary conditions are made long enough for cycling assimilation of the following whole day, using the 3-hourly ECMWF global forecasts.
The conventional data are assimilated in D01, the conventional data and radar observations are assimilated in D02 with a two-step strategy [48]. Observations of seven Doppler weather radars in the Beijing-Tianjin-Hebei region of North China (red points in Figure 8) are assimilated in RMAPS-ST. The radial velocity observations are assimilated by employing the U/V control variables [47,48], and the reflectivity observations are assimilated using an indirect scheme which assimilates the retrieved hydrometeors and estimated water vapor [49].  The experiments start four times using the initial fields of 0600 UTC, 0900 UTC, 1200 UTC, 1500 UTC provided by RMAPS-ST as background fields. At intervals of 3 hours, four groups of hourly cycling data assimilation experiments are carried out on each background field in D02. After assimilation, 6-hour forecasting experiments are carried out (Figure 9). No additional observation is assimilated in CTRL. The radial velocity observations are assimilated hourly in RADAR_RV. In RADAR_RF_RV, the radial velocity and reflectivity observations are both assimilated hourly. In FLASH_RF_RV, the radial velocity and 3D FY-4A proxy reflectivity are both assimilated hourly (Table 4).

The Rainfall Case
In this study, a severe rainfall in Beijing and Tianjin (38°N-41°N, 115°E-118°E) occurring on 8 August 2017 is selected as the case of interest. In this case, lightning, strong wind and rainstorms occurred in most regions of Beijing and Tianjin. The 3-h accumulated precipitation from different time in this case is shown in Figure 10. At 1200 UTC, the convective system formed over the central of Beijing (39.2 °N-41.0 °N, 115 °E-117.5 °E) (Figure 10c). Over the next few hours, the regions of severe rainfall became larger and the convective system moved gradually towards South-East, sweeping across the whole region of Beijing and Tianjin ( Figure  10d,e). The convective weather system began to weaken from 2100 UTC and ended at 2400 UTC (Figure 10f). The severe rainfall lasts about 12 hours. In this case, the maximum 12-hour accumulated precipitation intensity reaches 214.0 mm. The averaged 12-hour accumulated

Rainfall Forecast Skill Scores
The Fractions Skill Scores (FSSs) is used as the verification metric to objectively analyze the hourly rainfall forecast of each experimental group [50]. FSS varies from 0 to 1, with 0 meaning there is no overlap between the prediction and observation fields (zero skill), and 1 meaning complete overlap (perfect skill). The influence radius used in this study is from 12 km to 48 km and the scoring area covers the entirety of Beijing and Tianjin (38°N-41°N, 115°E-118°E). The FSSs of hourly accumulated precipitation for different thresholds (1.5, 7.0, 15.0 and 50 mm) for each experiment averaged from the four groups with different influence radius are presented in Figure 11. It is still feasible to achieve useful scores, and the results hold-up when the radius is increased incrementally. It can be seen that the FSSs of the three assimilation experiments are substantially better than the CTRL experiment for all thresholds, and the FSSs of the FLASH_RF_RV experiment are close to those of RADAR_RF_RV experiment. Assimilating radar radial velocity (RADAR_RV) adjusts the dynamic condition for the severe rainfall, thus the positive effects can be seen for most forecast time. Based on the better dynamic condition, assimilating radar reflectivity and 3D FY-4A proxy reflectivity experiments further improves the moisture and microphysical conditions, leading to an improvement in precipitation forecasts. It is shown that both RADAR_RF_RV and FLASH_RF_RV perform better than CTRL and RADAR_RV, and the FSSs of FLASH_RF_RV are lower than RADAR_RF_RV, which might due to the time lag and the difference in quantity between lightning and radar reflectivity. The T significance test for FSS score with influence radius of 48 km can be seen in Table 5. Results indicate that comparing with RADAR_RV, the improvement of RADAR_RF_RV and FLASH_RF_RV is more significant. Although the scores of RADAR_RF_RV and FLASH_RF_RV in the first three hours are relatively higher than the last three hours for the thresholds of 15.0mm and 50.0mm (Figure 12), the improvement is not as significant as the last three hours. This might be for the overestimation of precipitation caused by the strong effect in adjusting moisture condition by cycling assimilation of radar reflectivity and 3D FY-4A proxy reflectivity. Table 5. T significance test for FSS score. Statistically significant changes are highlighted in asterisks. One asterisk denotes α < 0.10 (t > 1.53), Two asterisks denote α < 0.05 (t > 2.13)).

Precipitation Distribution
The 3-hour accumulated precipitation of the observation and each experiment from 1800 UTC on 8 August 2017 are presented in Figure 12. The observed precipitation is centered around Tianjin (black point in Figure 12), and the maximum precipitation occurred at 39°N with a central rainfall of 50 mm. The CTRL experiment fails to simulate the rain belt along 39°N and forecasts a spurious rainfall center near 38°N (Figure 12a,b). After assimilating radar radial velocity observations, the RADAR_RV experiment decreases the spurious rain belt near 38°N and simulates a small rain belt between 38°N and 39°N (Figure 12c). Based on the assimilation of radar radial velocity, assimilating radar reflectivity in RADAR_RF_RV gives the better reconstruction of the strong rain belt which is missing in CTRL along 39°N (Figure 12d). Compared with RADAR_RF_RV, assimilating the 3D FY-4A proxy radar reflectivity converted from FY-4A lightning data in FLASH_RF_RV can well-capture the rain belt at 39°N as well (Figure 12e). However, there's a little western bias of the rain belt simulated in FLASH_RF_RV compared with RADAR_RF_RV, which might be for the small bias in location between 3D FY-4A proxy reflectivity and maximum radar reflectivity. Besides, cycling assimilation of reflectivity has a strong effect in adjusting moisture condition, which might be the reason for the stronger intensity of simulated rainfall.

Analysis Increments of Hydrometers
The cross sections of analysis increments of rainwater mixing ratio, water vapor mixing ratio, graupel mixing ratio from RADAR_RF_RV and FLASH_RF_RV, along 39°N, at 1800 UTC, 8 August 2017 are shown in Figure 13. It can be seen that assimilating radar reflectivity and 3D proxy reflectivity in RADAR_RF_RV and FLASH_RF_RV produces moister condition between 116°E and 118°E, which may be one of the reasons for the occurrence of strong precipitation near 39°N (Figure 12d,e). As rain water and water vapor both locate at lower layers and graupel locates at higher layers, it can be seen that the cycling assimilation of reflectivity adjusts the rain water mixing ratio below 4km and adjusts the graupel mixing ratio above 4km (Figure 13a,d,c,f); the increment of hydrometers might be one of the reasons for the reconstruction of rainfall center which is missing in CTRL. Besides, the analysis increments of the rain water mixing ratio, water vapor mixing ratio, and graupel mixing ratio in FLASH_RF_RV are similar to those in RADAR_RF_RV, which means that, compared with observed radar reflectivity, assimilating 3D FY-4A proxy reflectivity has an analogous effect on improving the moisture condition, which indicates that the FY-4A lightning data could be a substitute for radar reflectivity when radar data are missing ( Figure 13).  Figure 14. It can be seen from Figure 14 that the dew point temperature (blue lines) are closer to the ambient temperature lines (black lines) after assimilating radar reflectivity and 3D FY-4A proxy reflectivity, which means the moisture conditions are favorable at Tianjin (39.0°N, 117.1°E). In CTRL and RADAR_RV experiment, the CAPEs are only 63 J and 204 J, which are small and unfavorable to the development of convection. In contrast, the CAPEs of RADAR_RF_RV and FLASH_RF_RV experiments are both larger than 700 J. The CAPE of FLASH_RF_RV reaches 998 J, which is enough to support the severe convection. After cycling assimilation, lightning and radar data are better assimilated into the NWP models, and the effect of lightning and radar data in the hydrometeors adjust to the moisture condition, making the analysis of water vapor tend to saturation, and releasing the latent heat of condensation to adjust to temperature. The adjustment in temperature affects the updraft and horizontal airflow, which makes the thermal and dynamic analysis more reasonable. After cycling assimilation, the adjustment of temperature and updraft results in the increment of CAPEs. This might be one of the reasons why the precipitation simulations in RADAR_RF_RV and FLASH_RF_RV are closer to the observation.

Experiments Set Up
Affected by the complex mountainous terrain, it is difficult to obtain high-quality radar observations in Hubei. In order to further explore the effect of the lightning data assimilation scheme in areas where radar data are missing, we selected the Hubei region with complex terrain conditions as the research area ( Figure 15). Three groups of experiments were performed for a severe rainfall case in Hubei on 14 July, 2017.  (Table 6). After each assimilation, 6-hour forecasting experiments are carried out.

The Rainfall Case in Mountainous Areas
In this study, a severe rainfall in D03 occurring on 14 July 2017 is selected as the case of interest. In this case, rainstorms along with lightning occurred in the mountainous area of D03 ( Figure 15). The 3-h accumulated precipitation from different times in this case is shown in Figure 16. At 1500 UTC, the convective rainfall belt formed over south of Hubei, (29.7°N-31.0°N, 108.2°E-111.0°E) (Figure 16c). Over the next few hours, the intensity of the severe rainfall became larger and the convective system moved gradually towards the South-East, sweeping across the whole mountainous area of D03 (Figure 16d,e). The convective weather system began to weaken from 0300 UTC and ended at 0600 UTC, 15 July 2017 (Figure 16f). In this case, the intensity of the maximum 3-hour accumulated precipitation reached over 50 mm, which directly caused a flash flood disaster in the Qingjiang river around the mountainous area.

Precipitation Distribution
The 3-hour accumulated precipitation of the observation and each experiment from 2100 UTC on 14 July 2017 are presented in Figure 17. The observed precipitation belt is located around south of 30°N. The two precipitation centers are on the west (the red box) and east (the black box) respectively, and the maximum precipitation occurred on the east (110.2°E, 29.7°N) with a central rainfall over 50 mm. The CTRL experiment fails to simulate the precipitation center on the west and forecasts a spurious rainfall center near 109.2°E (Figure 17a,b). The CONV experiment assimilates conventional observations and decreases the spurious rain belt near 109.2°E, but there is a south bias of the east precipitation center and it failed to simulate the west precipitation center (Figure 17a,c). Based on the assimilation of conventional observations, assimilating 3D FY-4A proxy reflectivity in CONV_FLASH gives the better reconstruction of the west precipitation center which is missing in CTRL (Figure 17a,d). Compared with CONV, assimilating the 3D FY-4A proxy reflectivity converted from FY-4A lightning data in CONV_FLASH can well-capture the east precipitation center as well ( Figure  17a,d). The rain belt of CONV_FLASH is closer to the observation. However, the strong effect of cycling assimilation in adjusting moisture condition might cause the stronger intensity of the east precipitation center than observations (Figure 17a,d).

Diagnosis
The comparison of FY-4A lightning density, maximum FY-4A proxy reflectivity and the analysis increment of rain water mixing ratio, at 2100 UTC, on 14 July 2017 is shown in Figure  18. It can be seen that the main center of lightning density is between 109°E and 110°E, and there are two minor centers on the west. The distribution of maximum FY-4A proxy reflectivity is the same as the lightning density. Assimilating 3D FY-4A proxy reflectivity in CONV_FLASH causes the increment of rain water and produces moister condition between 109°E and 110°E, which might be one of the reasons for the occurrence of strong precipitation in the black box ( Figure 17 and 18c). Besides, the two minor centers on the west adjust the rain water as well, which might cause the reconstruction of the west precipitation center in the red box ( Figure 17 and 18c). The distribution of the analysis increment of rain water mixing ratio benefits the distribution of the rain belt to the north, which makes the precipitation forecasts closer to the observation.

Rainfall Forecast Skill Scores
The influence radius used in Hubei case is from 12 km to 48km and the scoring area covers the d03 domain (28.5°N-31.9°N, 107.4°E-111.6°E). The FSSs of hourly accumulated precipitation for different thresholds (1.5, 7.0, 15.0 and 50 mm) for each experiment are presented in Figure  19. It is still feasible to achieve useful scores and the results hold-up when the radius is increased incrementally. It can be seen that the FSSs of the CONV_FLASH experiment perform better than the CTRL experiment for all thresholds at most forecast times, which indicates that assimilating 3D FY-4A proxy reflectivity can effectively improve the short-term precipitation forecast within 6 hours and 3D FY-4A proxy reflectivity can be considered a useful substitute where the observed radar data are missing. Besides, compared with the FSSs of CONV experiments, it is more obvious that CONV_FLASH experiments improve the distribution of forecasts. On the basis of the 6h cycling assimilation of conventional observations, the 3h highfrequency cycling assimilation of 3D FY-4A proxy reflectivity is applied in CONV_FLASH. The more frequent data assimilation for CONV_FLASH (3h) compared to CONV (6h) can better adjust the distribution of hydrometeors and further extend the improvement of thermal condition to dynamic condition by multiple assimilation. However, the improvement of the assimilation in Hubei is not as obvious as that in the Beijing and Tianjin. It might be because the assimilation in Hubei is without radar velocity and failed to trigger convection directly by adjusting the dynamic condition.

Discussion
In order to realize the application of FY-4A lightning data in NWP models, we tried to convert FY-4A lightning data into 3D FY-4A proxy reflectivity by a set of quantitative relationship. A logarithmic relationship between FY-4A lightning data and maximum radar reflectivity is studied first. Compared with the observation, the reflectivity fitted by the Simple-Linear and Non-Linear curves in GSI using FY-4A lightning data are underestimated, which indicates that the empirical formula counted by statistics in North America in GSI are not applicable to China. Therefore, radar reflectivity and FY-4A lightning data of six severe rainfall cases in China are used to recount the relationship of FY-4A lightning data and maximum radar reflectivity. Based on the high correlation of intensity and distribution between FY-4A lightning data and maximum radar reflectivity, with the previous research on the logarithmic lightningconvective rainfall relationship [40], the study mainly focuses on the logarithmic curve fitting methods and presents the logarithmic relationship between FY-4A lightning density and maximum radar reflectivity in retrieval, which indicates that log10 might not work best in all the fitting methods, and it is possible for other better fitting methods to exist. Besides, the realtime radar reflectivity profiles are counted using the output of the background fields. Maximum FY-4A proxy reflectivity are converted into 3D FY-4A proxy reflectivity by the realtime radar reflectivity profiles. The similarity of the 3D FY-4A proxy reflectivity and observations indicates that it is relatively reasonable to use the relationship to convert FY-4A lightning density into 3D FY-4A proxy reflectivity, which provides a reliable premise for its assimilation.
The 3D FY-4A proxy radar reflectivity are assimilated by RMAPS-ST through four groups of cycling data assimilation and forecasting experiments for a severe rainfall case in Beijing and Tianjin, on 8 August 2017 to compare the lightning assimilation scheme with radar assimilation. Detailed diagnostic analysis results reveal that the cycling assimilation of 3D FY-4A proxy reflectivity can adjust the moisture condition effectively, and indirectly affect the temperature and wind fields, which makes the thermal and dynamic analysis more reasonable.
Another cycling assimilation experiments performed in mountainous areas without radar data further indicates that FY-4A lightning data could be anticipated as a reasonable substitute where radar data are missing. However, some central precipitation is overestimated after assimilating lightning. Thus, it is necessary to do future research on how to suppress spurious severe convection when assimilating 3D FY-4A proxy reflectivity [19,51]. Besides, the same case study in Beijing and Tianjin was used to set up the relationship between lightning density and maximum FY-4A proxy reflectivity. Therefore, the application of the method to the analyzed case in Beijing and Tianjin is not completely independent. The case in mountainous areas further evaluated the method, which is completely independent of the six severe rainfall cases. These assimilation and forecasting results are only for the two rainfall cases, and the method could be framed as an example of how to use lightning for data assimilation, which needs more tests to be validated.
The study shows that FY-4A lightning data can be used as a significant supplement to regional high-resolution data assimilation. However, as FY-4A is the first geostationary meteorological satellite abroad LMI for test in China, the data access to us is limited. Therefore, the FY-4A lightning data lacks a detailed comparative evaluation with lightning data detected by other satellites in this study. It is necessary to conduct a more detailed assessment of the FY-4A LMI lightning data in the future [33]. Besides, in this lightning data assimilation method, a wet bias might be created or increased to cause the spurious severe convection produced, which indicates the lightning assimilation method could combine with other methods to adjust the hydrometeors towards smaller values at spurious regions to limit spurious convection effectively [19,51]. In addition, lightning is generated in strong updraft regions [52], the strong relationship among lightning and cloud dynamics means that we can consider using lightning data to adjust the dynamic fields of numerical models directly to improve precipitation in the future.

Conclusions
As the first Geostationary Satellite with the LMI instrument aboard, FY-4A is able to implement total lightning real-time detection over the eastern hemisphere and identify complex thermal or dynamic processes in thunderstorm clouds more successfully. In order to realize the application of FY-4A lightning data in NWP models, a set of relationships is present which convert the FY-4A lightning density into 3D FY-4A proxy reflectivity. The benefits of assimilating 3D FY-4A proxy reflectivity by WRF-3DVAR are demonstrated in a series of experiments. The conclusions of this study are summarized as follows: The lightning strikes observed by FY-4A LMI correspond well with those by ADTD. Since FY-4A LMI is capable of observing the lightning strikes inside clouds and among clouds, some lightning strikes that were not detected by ADTD could be observed by FY-4A LMI in rainfall areas.
A logarithmic relationship between FY-4A lightning density and maximum radar reflectivity is presented, which better fits with the FY-4A observations. By the logarithmic relationship and real-time radar reflectivity profile, FY-4A lightning data were converted into 3D FY-4A proxy reflectivity. The intensity and distribution of the 3D FY-4A proxy reflectivity are in good agreement with the observed radar reflectivity.
The results of cycling data assimilation and forecasting experiments for the first severe rainfall case show that assimilating 3D FY-4A proxy reflectivity achieves an analogous effect with assimilating observed reflectivity, which gives a better reconstruction of the strong rain belt which is missing in CTRL experiments. Forecast precipitation improves considerably and the positive effect was sustained for 6 hours. Other cycling assimilation experiments in mountainous areas without radar data further indicate that FY-4A lightning data could be anticipated as a reasonable substitute where radar data are missing.