Inter-Comparison of Rain-Gauge, Radar, and Satellite (IMERG GPM) Precipitation Estimates Performance for Rainfall-Runoff Modeling in a Mountainous Catchment in Poland

Precipitation is one of the essential variables in rainfall-runoff modeling. For hydrological purposes, the most commonly used sources of precipitation data are rain gauges and weather radars. Recently, multi-satellite precipitation estimates have gained importance thanks to the emergence of the Integrated Multisatellite Retrievals for Global Precipitation Measurement (IMERG GPM), a successor of the very successful Tropical Rainfall Measuring Mission (TRMM), which has been providing high-quality precipitation estimates for almost two decades. Hydrological modeling of a mountainous catchment requires precipitation inputs that are reliable in both time and space, as the hydrological response of such a catchment is very quick. This paper presents an inter-comparison of event-based rainfall-runoff simulations using precipitation data originating from three different sources. For semi-distributed modeling of discharge in the mountainous river, the Hydrologic Engineering Center-Hydrologic Modelling System (HEC-HMS) is applied. The model was calibrated and validated for the period 2014–2016 using measurement data from the Upper Skawa catchment, a small mountainous catchment in southern Poland. The performance of the model was assessed using the Nash–Sutcliffe efficiency coefficient (NSE), Pearson's correlation coefficient (r), percent bias (PBias) and relative peak flow difference (rPFD). The results show that for event-based modeling, adjusted radar rainfall estimates and IMERG GPM satellite precipitation estimates are the most reliable precipitation data sources. The model was calibrated separately for each source of precipitation data, as the spatial and temporal distributions of rainfall significantly affect the estimated values of the model parameters. The applied Soil Conservation Service (SCS) Curve Number loss method was found to perform best for flood events with a unimodal time distribution. The analysis of the simulation time-steps indicates that time aggregation of precipitation data from 1 to 2 h (not exceeding the response time of the catchment) provides a significant improvement of flow simulation results for all the models, while further aggregation, up to 4 h, seems to be valuable only for the model based on rain gauge precipitation data.


Introduction
Precipitation, being one of the key variables of the water cycle, plays a vital role in rainfall-runoff modeling in hydrology [1][2][3]. The principal instruments used for measuring precipitation are rain gauges, weather radars and satellite sensors. Nowadays, rain gauge and weather radar data are considered the best precipitation data sources for catchment modeling [1], whereas satellite data [...] software allows one to perform both continuous and event-based simulations, of which the latter is the subject of this study.
The objective of this paper is to analyze and inter-compare the performance of rain gauge, radar and satellite precipitation estimates for event-based rainfall-runoff modeling in a small mountainous catchment. In addition, a simulation time-step analysis is performed to assess the impact of time-step aggregation on the simulated hydrographs.

The Study Area
The Upper Skawa catchment is a small mountainous catchment in southern Poland. It is predominantly covered by non-irrigated arable land and by coniferous and mixed forests. The catchment, with a total area of 240.4 km², can be sub-divided into six sub-catchments (Figure 1). The annual rainfall of the sub-catchments ranges from 700 mm to 1200 mm. The annual mean temperature varies from 4 °C to 6 °C up to 700 m a.s.l., from 4 °C to 6 °C between 700 and 1100 m a.s.l., and is below 4 °C above 1100 m a.s.l. There are four rain gauges in the catchment area, one of which is located directly on-site. The discharge data are available from the river gauging station in Osielec. The closest meteorological radar is located in Ramża, around 100 km north-west of the research area.

Data Collection and Processing
Precipitation and discharge data used in this study were collected between 2014 and 2016.
The starting year of the analysis, 2014, was chosen because of the availability of the Global Precipitation Measurement (GPM) mission products. R software was used for statistical analysis and data processing. For calibration and validation of the HEC-HMS hydrological model, the discharge data from the gauging station in Osielec were used. The runoff data (2014-2016) were provided by the Institute of Meteorology and Water Management-National Research Institute in Poland. During this period there were six flash flood events caused by excessive rainfall, which were chosen for further analysis (Table 1). Three aspects need to be highlighted when analyzing these flood events. Firstly, Event 1 is characterized by a significantly higher maximum discharge than the other events. Secondly, Events 1, 3 and 4 exhibit a bimodal time distribution, while the other events show a unimodal distribution. Thirdly, Event 6 has the longest recession time among the analyzed events. Therefore, to provide a reliable and multi-aspect calibration of the model, Events 1, 2 and 6 were used in the calibration phase, whereas Events 3-5 served for model validation. Additionally, it should be noted that the maximum discharge of Event 1 is 8 to 10 times that of the other events, and the precipitation rate plays a key role during this flood event.

Rain Gauges
The rain gauge data for the period 2014 to 2016 were provided by the Institute of Meteorology and Water Management-National Research Institute in Poland. The 10-min time-step data were collected at four rain gauges located in the catchment area (Table 2). The rain gauges used in this study are part of the telemetric rain gauge network in Poland, which consists of 491 gauges. The measurements are subject to automatic quality control comprising a range check against climatological values and an analysis of spatial and temporal consistency [34]. In this research, the collected data were aggregated to 1-h intervals. Afterwards, the spatial distribution of precipitation was interpolated from the point data using the inverse distance weighting (IDW) interpolation method. Like other interpolation algorithms, it affects the spatial variability of the approximated rainfall distribution and therefore has an impact on discharge simulation. The IDW method is one of the most commonly used deterministic methods of spatial interpolation. However, recent studies (e.g., [35]) indicate that the Radial Basis Function method can estimate precipitation more precisely than IDW. Among geostatistical interpolation methods, Ordinary Kriging and Co-Kriging are the most popular [36]. Some studies indicate that the geostatistical approach to interpolation provides better results than the deterministic one [37,38], but other studies report better performance of IDW than Kriging (e.g., [39]). Regardless of the method, interpolation over mountainous areas is always challenging due to complex orography. In this study, the IDW interpolation method was used for several reasons. Firstly, HEC-HMS has a built-in component that performs IDW interpolation based on the rain gauge measurements provided by the user. Secondly, despite its limitations, the IDW method is often used to create the spatial distribution of rainfall (e.g., [34,40,41]). Thirdly, even if other interpolation methods may provide a better spatial representation of the precipitation field, a semi-distributed hydrological model is used in this study, and the interpolated rainfall rates are averaged over sub-catchments. Finally, a hyetograph representing the mean precipitation in all the IDW interpolation cells within the boundaries of a particular sub-catchment was drawn for each of the sub-catchments.
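The gauge-to-sub-catchment step described above can be illustrated with a short sketch. It assumes inverse-distance-squared weights and uses hypothetical gauge coordinates, rainfall values and cell centres; it is only an illustration of the IDW idea and not the HEC-HMS implementation.

```python
import math


def idw(gauges, x, y, power=2.0):
    """Inverse-distance-weighted rainfall (mm) at point (x, y).

    gauges: list of (gx, gy, rainfall_mm) tuples (hypothetical layout).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for gx, gy, rain in gauges:
        distance = math.hypot(x - gx, y - gy)
        if distance == 0.0:
            return rain  # the point coincides with a gauge
        weight = 1.0 / distance ** power
        weighted_sum += weight * rain
        weight_total += weight
    return weighted_sum / weight_total


def subcatchment_mean(gauges, cell_centres, power=2.0):
    """Mean of the IDW values over the cells lying inside one sub-catchment."""
    values = [idw(gauges, cx, cy, power) for cx, cy in cell_centres]
    return sum(values) / len(values)


# Hypothetical 1-h rainfall (mm) at four gauges; coordinates in km
gauges = [(0.0, 0.0, 2.0), (10.0, 0.0, 4.0), (0.0, 10.0, 6.0), (10.0, 10.0, 8.0)]
cells = [(2.5, 2.5), (7.5, 2.5), (2.5, 7.5), (7.5, 7.5)]
mean_rain = subcatchment_mean(gauges, cells)
```

Repeating `subcatchment_mean` for every hour of an event would give the hyetograph of one sub-catchment.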

Radar Rainfall Estimates
The radar estimates were collected from the Ramża weather radar, which is part of the Polish weather network POLRAD. The meteorological radar in Ramża operates in dual-polarization mode at a frequency of 5600-5650 MHz (C-band). These data (2014-2016) were also provided by the Institute of Meteorology and Water Management-National Research Institute in Poland. The product used in the study is called PAC (Precipitation Accumulation) and is retrieved using the formula which is expressed as follows [42]:
where Z is the radar reflectivity (mm⁶/m³) and R is the radar-derived rain amount (mm/h). Before being released, the radar estimates are subject to quality control, performed on raw data, regarding attenuation in heavy rain, anomalous propagation of the radar beam, beam blockage and hardware instability [34]. The PAC data were delivered as 10-min accumulations at a spatial resolution of 1 × 1 km.
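As an illustration of how a reflectivity-rainfall relationship is typically inverted, the sketch below assumes a power law Z = a·R^b with the widely quoted Marshall-Palmer coefficients a = 200 and b = 1.6. These coefficients are an assumption made for the example and are not necessarily those of the formula in [42]; the function names are hypothetical.

```python
import math


def rain_rate_mm_h(dbz, a=200.0, b=1.6):
    """Invert the assumed power law Z = a * R**b for R (mm/h).

    dbz is the reflectivity in dBZ; a and b are assumed coefficients.
    """
    z = 10.0 ** (dbz / 10.0)  # linear reflectivity, mm^6/m^3
    return (z / a) ** (1.0 / b)


def accumulate_mm(dbz_scans, scan_minutes=10.0):
    """Rain depth (mm) from a sequence of scans, each valid for scan_minutes."""
    return sum(rain_rate_mm_h(d) * scan_minutes / 60.0 for d in dbz_scans)


# Reflectivity that corresponds to 1 mm/h under the assumed coefficients
dbz_for_1_mm_h = 10.0 * math.log10(200.0)
hourly_depth = accumulate_mm([dbz_for_1_mm_h] * 6)  # six 10-min scans
```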
Radar-based rainfall estimates are affected by various sources of uncertainty. These include, among others, radar calibration, attenuation of the radar signal, variability in the relationship between reflectivity (Z) and rainfall (R), and radar beam blockage. Calibration of the radar is associated with the estimation of the radar constant, which is related to the radar's technical components and thermal effects. The radar constant should be estimated to within 1 dB accuracy [43]; otherwise the radar is considered miscalibrated, which may result in systematic over- or underestimation of rainfall. Attenuation of the radar signal is the reduction of electromagnetic radiation power when passing through a medium of any density [14], such as clouds or rainfall. This problem mostly affects radars with short wavelengths (such as C-band or X-band) [44], but can also be noticed for longer wavelengths (such as S-band) [45]. The Z-R formula expresses the relation between reflectivity and rainfall rate for a selected drop size distribution (DSD). For hydrological purposes, this relationship is frequently assumed to be constant in time and space even though it varies with rainfall intensity [46,47]. Complete or partial blockage of the radar beam by terrain or obstacles such as buildings results in shielding of the radar beam. Over complex terrain, such as mountainous environments, the beam shielding effect must be corrected before further application.
It is generally agreed that as the integration period of radar estimates increases, the mean difference between radar rainfall estimates and ground measurements made by rain gauges (considered as 'real' precipitation) decreases [15,48,49]. The accuracy of radar rainfall estimates depends primarily on the applied adjustment method. This has been the subject of many studies, which demonstrated that without an adjustment procedure the errors in radar estimates are excessive [48]. For example, Steiner et al. [50] used high-quality gauge data and a storm-based bias adjustment method and achieved root-mean-square errors of radar estimates of approximately 10% for rainfall accumulations greater than 30 mm. In other work, Harrold et al. [48] achieved a mean percentage difference of 15% close to the calibration site and up to 20% within a distance of 20 km when the calibration was performed using rain gauge measurements and the horizontal drift of rain in the wind between the radar beam and the calibration site was allowed for.

Adjustment of Radar Rainfall Estimates Using Weighted Multiple Regression (WMR) Method
Before being applied in hydrological modeling, the radar rainfall estimates should be adjusted (normalized) to reduce the measurement uncertainty. To mitigate the impact of orography and distance from the radar on measurement performance, the rainfall data were adjusted using the weighted multiple regression (WMR) method, which, in this case, is expressed by a multiple-linear relationship [18,54]:
where R is the radar-derived rain amount (mm), G is the time-accumulated rain gauge amount (mm), DR is the distance between the radar and the gauge (km), MH is the minimum height that the radar can target above the gauge (m), HG is the height of the gauge (m a.s.l.), and $a_1$-$a_4$ are regression coefficients (-). The WMR adjustment method aims to define the ratio between the time-accumulated precipitation measured by a rain gauge and the corresponding radar estimate, taking into account the orographic characteristics (parameters DR, MH, and HG in Equation (2)). With growing distance between the radar and the gauge (DR), the altitude of the radar beam increases and the beam broadens. To some extent, this is responsible for attenuation of the radar signal, which can result in under- or overestimation of precipitation at longer ranges [54]. The beam-shielding effect, which can be significant in mountainous areas, is reflected by the MH parameter, which indicates the minimum height that must be reached by the radar beam to perform a proper measurement. The MH parameter corresponds to the measurement angle of the radar. Therefore, the lower the MH value, the better. Especially in mountainous areas, elevation has a significant influence on the estimated precipitation, as it may lead to orographic enhancement of precipitation. The HG parameter in Equation (2) accounts for this phenomenon. Equation (2) does not consider terrain slope, which also has an impact on rainfall amount [55].
To minimize the uncertainty related to the mismatch of measurements in time, 1-h accumulations of rainfall were used (instead of 10-min accumulations) for both rain gauge rainfall and radar estimates.
After all radar-gauge data pairs are substituted into the regression Equation (2), the coefficients $a_1$-$a_4$ can be estimated using, for instance, the least squares method. Ultimately, the corrected radar estimate of precipitation at any location can be calculated from Equation (2) for any known value of R, i.e., from the radar-derived rain amount. The corrected R data serve as an input to the hydrological model of flow in the Upper Skawa catchment.
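The least-squares estimation mentioned above can be sketched as follows. Because the exact form of Equation (2) is not reproduced in this text, the sketch assumes, purely for illustration, that the gauge-to-radar ratio is linear in the three site parameters, G/R = a1 + a2·DR + a3·MH + a4·HG, and it uses ordinary (unweighted) least squares. All function names, site values and coefficients below are hypothetical and synthetic.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ratio_model(samples):
    """Least-squares fit of the ASSUMED model G/R = a1 + a2*DR + a3*MH + a4*HG.

    samples: list of (R, G, DR, MH, HG) tuples with R > 0.
    """
    rows = [[1.0, dr, mh, hg] for _, _, dr, mh, hg in samples]
    ratios = [g / r for r, g, _, _, _ in samples]
    k = 4
    # Normal equations (A^T A) a = A^T y
    ata = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    aty = [sum(row[i] * y for row, y in zip(rows, ratios)) for i in range(k)]
    return solve_linear(ata, aty)


def adjust(r, dr, mh, hg, coef):
    """Adjusted radar rainfall under the assumed ratio model."""
    a1, a2, a3, a4 = coef
    return r * (a1 + a2 * dr + a3 * mh + a4 * hg)


# Synthetic radar-gauge pairs generated from made-up coefficients
true_coef = [0.1, 0.001, 0.0001, 0.0002]
sites = [(95.0, 900.0, 400.0), (100.0, 1200.0, 500.0), (105.0, 1100.0, 650.0),
         (98.0, 1500.0, 700.0), (102.0, 1000.0, 450.0), (97.0, 1300.0, 600.0)]
radar = [4.0, 6.5, 3.2, 8.1, 5.0, 7.3]
samples = [(r, adjust(r, dr, mh, hg, true_coef), dr, mh, hg)
           for r, (dr, mh, hg) in zip(radar, sites)]
coef = fit_ratio_model(samples)
```

Once fitted at the gauge locations, `adjust` would be evaluated with the DR, MH and HG values of each sub-catchment to correct its mean radar rainfall.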
For the purpose of further analysis, the following assumptions are made:
1. The spatial distribution of radar rainfall estimates corresponds to the rain gauge-based point measurements.
2. Radar and rain gauge instruments perform the measurements at different heights, but the estimated rainfall from these instruments is assumed to be measured at the same level.
3. The data analysis in each year is limited to the period from April to October to minimize the risk that radar would measure solid hydrometeors instead of liquid particles.
4. Only simultaneous rainfall observations of rain gauge and radar are taken into further consideration; cases where only rain gauge or only radar registered rainfall were available are neglected.
5. A semi-distributed hydrological model for each sub-catchment is assumed; mean value from all the radar estimates over the sub-catchment is assigned to its area. This mean value is accordingly adjusted and applied in the hydrological model.

IMERG GPM Satellite Rainfall Estimates
Since March 2014, the Global Precipitation Measurement (GPM) mission, led by the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency (JAXA), has provided quasi-global precipitation estimates. The GPM mission is a successor of the Tropical Rainfall Measuring Mission (TRMM) and aims at continuing satellite-based rainfall observations. GPM provides a wide variety of products, including the Integrated Multisatellite Retrievals for GPM (IMERG), rainfall estimates that combine data from active and passive instruments in the GPM constellation.
In this study, IMERG version 4 (V04a) GPM Level 3 Final Run products were used. They are derived from the IMERG algorithm and constitute a post-real-time research product released with a latency of 2.5 months. The IMERG algorithm is intended to intercalibrate, merge, and interpolate different precipitation observations: satellite microwave precipitation estimates, satellite microwave-calibrated infrared precipitation estimates, monthly rain gauge measurements and other precipitation estimates [56]. These products provide gridded rainfall at a spatial resolution of 0.1° × 0.1° and a temporal resolution of 30 min. For the purpose of this study, the data were aggregated to 1-h accumulations. For each sub-catchment, a hyetograph was created that represents a weighted mean accounting for the area of the sub-catchment covered by each grid cell.
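The two processing steps described above (aggregation to 1-h accumulations and area weighting) can be sketched as follows. The sketch assumes that the half-hourly values are already depths in mm; if they are rates in mm/h, each value would first have to be multiplied by 0.5 h. The cell areas and rainfall values are hypothetical.

```python
def to_hourly(half_hourly_mm):
    """Sum consecutive pairs of 30-min depths (mm) into 1-h depths."""
    if len(half_hourly_mm) % 2:
        raise ValueError("expected an even number of 30-min values")
    return [half_hourly_mm[i] + half_hourly_mm[i + 1]
            for i in range(0, len(half_hourly_mm), 2)]


def area_weighted_hyetograph(cell_series, covered_areas_km2):
    """Weighted mean over grid cells for one sub-catchment.

    cell_series: one hourly series per grid cell (equal lengths).
    covered_areas_km2: sub-catchment area covered by each of those cells.
    """
    total_area = sum(covered_areas_km2)
    steps = len(cell_series[0])
    return [sum(series[t] * area
                for series, area in zip(cell_series, covered_areas_km2)) / total_area
            for t in range(steps)]


# Two hypothetical grid cells covering 30 km2 and 10 km2 of a sub-catchment
cell_a = to_hourly([1.0, 1.0, 2.0, 2.0])
cell_b = to_hourly([0.0, 0.0, 4.0, 4.0])
hyetograph = area_weighted_hyetograph([cell_a, cell_b], [30.0, 10.0])
```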
IMERG data were provided by the NASA/Goddard Space Flight Center's PMM and PSS teams from http://pmm.nasa.gov/data-access/.

Digital Elevation Model and Land-Cover
The Digital Elevation Model (DEM) of 100 m resolution was acquired from the Central Centre for Geodetic and Cartographic Documentation (currently the Head Office of Geodesy and Cartography in Poland). Terrain complexity and DEM resolution have a significant impact on the estimation of DEM hydrological derivatives (e.g., slopes) [57]. Using a low-resolution DEM in hydrological models might result in predicting lower peaks and higher baseflow compared to using a high-resolution grid [58]. Therefore, a coarser DEM resolution can generate worse results [57]. However, in this study a semi-distributed hydrological model is applied, and the slope information is used only in the routing model as a parameter of the river bed. A greater impact of DEM resolution on the flow model results should be expected when using fully distributed models. For the adjustment of radar rainfall estimates (Section 3.1), the DEM was upscaled to the resolution of the radar precipitation field.
CORINE Land Cover Project CLC2012 v.18.5.1 was used as a source of land-cover information within the study area. A comprehensive description of land-cover delimitation for the study area can be found in Gilewski et al. [33].

HEC-HMS Hydrological Model
To simulate basin runoff, HEC-HMS (Hydrologic Engineering Center-Hydrologic Modelling System) version 4.2.1, developed by the US Army Corps of Engineers, was used. The HEC-HMS model is designed for both continuous and event-based modeling. In this study, methods that are primarily dedicated to event-based modeling were used.

Model Set-up
Two major input components of the HEC-HMS model are the catchment model and the meteorological model. The parameters and methods selected for the catchment model are listed in Table 3. A detailed description of the modeling concepts and equations behind all the HEC-HMS sub-models and methods can be found in the technical reference manual by Feldman [59]. The initial parameter values for the selected catchment modeling methods are provided in Section 3.3.1, along with the parameter values adjusted during the calibration process. The Soil Conservation Service (SCS) Curve Number (CN) method was used to estimate water losses. The SCS-CN method applies the curve number methodology to incremental losses. The precipitation excess is estimated as a function of cumulative precipitation, soil cover, land use, and antecedent moisture content. Land-use information was retrieved from CLC2012. For every sub-catchment, a weighted CN value based on land use was calculated and the fraction of impervious area was specified. Curve numbers were taken from standard tables [60]. The initial abstraction is the precipitation depth that must be exceeded before precipitation excess can occur. For each sub-catchment, the following parameters are to be specified: Initial Abstraction (mm), Curve Number (-) and Impervious Area (%).
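The curve number calculation can be sketched with the standard SCS relations in metric units: the potential maximum retention is S = 25400/CN − 254 (mm), and the cumulative excess is Pe = (P − Ia)² / (P − Ia + S) once the cumulative precipitation P exceeds the initial abstraction Ia. The sketch below is a minimal illustration with hypothetical inputs; it ignores the impervious-area fraction, and the function names are not taken from HEC-HMS.

```python
def potential_retention_mm(cn):
    """Potential maximum retention S (mm) for a given curve number."""
    return 25400.0 / cn - 254.0


def cumulative_excess_mm(p_cum, cn, ia):
    """Cumulative precipitation excess (mm) for cumulative precipitation p_cum (mm)."""
    if p_cum <= ia:
        return 0.0  # everything is still absorbed by the initial abstraction
    s = potential_retention_mm(cn)
    return (p_cum - ia) ** 2 / (p_cum - ia + s)


def incremental_excess(rain_mm, cn, ia):
    """Excess per time step, as the difference of successive cumulative excesses."""
    excess = []
    p_cum = 0.0
    previous = 0.0
    for p in rain_mm:
        p_cum += p
        current = cumulative_excess_mm(p_cum, cn, ia)
        excess.append(current - previous)
        previous = current
    return excess


# Hypothetical hourly rainfall (mm) with CN = 75 and Ia = 10 mm
hourly_excess = incremental_excess([5.0, 12.0, 20.0, 8.0, 2.0], cn=75.0, ia=10.0)
```

A higher CN gives a smaller S and therefore more excess for the same rainfall, which is why the weighted CN of each sub-catchment matters for the simulated peaks.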
A synthetic unit hydrograph (Snyder Unit Hydrograph) was chosen as the transform method to calculate the actual surface runoff. In this method, the unit hydrograph parameters are fitted for a sub-catchment using observed precipitation and discharge data. Two parameters were estimated for every sub-catchment: Standard Lag (h), defined as the length of time between the centroid of precipitation mass and the peak flow of the resulting hydrograph; and Peaking Coefficient (-), a measure of the steepness of the hydrograph resulting from a unit of precipitation.
Subsurface flow is calculated by the Recession Baseflow method, which is primarily designed for event-based simulations. The method approximates the typical exponential shape of the flow curve observed in the catchment when channel flow recedes after a flood event. For the sub-catchments, the following parameters were specified: Initial Discharge (m³/s); Recession Constant (-), the rate at which the baseflow recedes between flood events; and Threshold Flow (m³/s), the flow value at which the baseflow is reset when the receding limb of the hydrograph falls to that value.
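The exponential recession can be illustrated with a few lines of code. The sketch assumes the form Q(t) = Q0 · k^t with t expressed in days, so that k is the ratio of the baseflow to the baseflow one day earlier; this convention and the example values are assumptions, and the threshold-flow reset is not modelled.

```python
def recession_baseflow(q0, k, hours, step_h=1.0):
    """Baseflow series (m3/s) decaying as q0 * k**t, with t in days (assumed convention)."""
    n_steps = int(hours / step_h) + 1
    return [q0 * k ** (i * step_h / 24.0) for i in range(n_steps)]


# Hypothetical initial discharge of 10 m3/s and recession constant of 0.5
baseflow = recession_baseflow(q0=10.0, k=0.5, hours=24)
```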
For water routing in the river bed the Muskingum-Cunge Routing (MCRM) method was applied. It is based on the combination of the conservation of mass and the diffusion representation of the conservation of momentum.
In the meteorological models considered in this research, the Inverse Distance method was used to interpolate the precipitation data from rain gauges across the sub-catchments using inverse-distance-squared weighting. For radar and satellite data, the Specified Hyetograph method was used to specify the hyetograph time series in the sub-catchments. Figure 2 presents a schematic of the HEC-HMS hydrological model of the Upper Skawa catchment. As a semi-distributed model was used, the catchment was sub-divided into six sub-catchments.

Calibration and Validation
Calibration and validation of the river flow model were carried out by comparing the flow simulated by the model with the flow observed (at hourly time-steps) at the gauging station. Three of the six analyzed events were used for the calibration process. The peak-weighted RMSE metric was used as the objective function during the automatic calibration process for every event. Since precipitation intensity and its time distribution have a significant impact on the values of the model parameters estimated during the calibration phase, the flow model was calibrated and validated separately for each of the three considered methods of measuring precipitation.
The selection of evaluation metrics should allow a multi-aspect analysis of the models' simulation results. Many diverse criteria are used to assess the performance of hydrological models [61]. Based on the literature [27,62,63], the following metrics were used to compare the performance of the flow model in relation to the observed flows: NSE, broadly used for calibration and validation of hydrological models regarding discharge; r, primarily used for evaluating the timing of simulated and observed time series; PBias, used to investigate the tendency of the simulated flow toward over- or underestimation; and rPFD, an important criterion in terms of flood risk.
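The Nash-Sutcliffe efficiency coefficient (NSE) assesses the predictive power of the model. It is defined as [64]:

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} \left(Q_{obs,i} - Q_{sim,i}\right)^2}{\sum_{i=1}^{n} \left(Q_{obs,i} - \overline{Q}_{obs}\right)^2} \qquad (3)$$

where $Q_{sim}$ and $Q_{obs}$ are the simulated and observed river flow, $\overline{Q}_{obs}$ is the mean of the observed values and n is the number of observations. NSE values vary from −∞ to 1, where NSE = 1 means that the modeled discharge perfectly matches the observed data, NSE = 0 indicates that the model predictions are as accurate as the mean of the observations, and NSE < 0 means that the mean of the observed flow is a better predictor than the model. Pearson's correlation coefficient [65] was used to measure the degree of linear association between simulated and observed flow. It is expressed as:

$$r = \frac{\sum_{i=1}^{n} \left(Q_{sim,i} - \overline{Q}_{sim}\right)\left(Q_{obs,i} - \overline{Q}_{obs}\right)}{\sqrt{\sum_{i=1}^{n} \left(Q_{sim,i} - \overline{Q}_{sim}\right)^2}\,\sqrt{\sum_{i=1}^{n} \left(Q_{obs,i} - \overline{Q}_{obs}\right)^2}} \qquad (4)$$

where $Q_{sim}$ and $Q_{obs}$ are the simulated and observed river flow, $\overline{Q}_{sim}$ is the mean of the simulated values, $\overline{Q}_{obs}$ is the mean of the observed values and n is the number of observations. Pearson's correlation coefficient varies from −1 to 1.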
To investigate the tendency of the simulated flow to over- or underestimate the observations, the percent bias was used: where $Q_{sim}$ and $Q_{obs}$ are the simulated and observed river flow. The ideal PBias value is equal to 0. As the peak flow values are of particular interest regarding flood risk, the relative peak flow difference metric was used: where $Q_{p,sim}$ and $Q_{p,obs}$ are the peak values of the simulated and observed river flow. The ideal value of rPFD is equal to 0. Accurate prediction of the peak flow is essential for flood risk forecasting along the river. Table 4 provides a performance classification for the metrics used for model evaluation.
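The four metrics can be computed with a few short functions, sketched below on hypothetical flow series. NSE and Pearson's r follow their standard definitions. For PBias and rPFD the sign convention is an assumption of this sketch: both are computed as simulated minus observed, so that positive values indicate overestimation; the convention used in the study may differ.

```python
import math


def nse(sim, obs):
    """Nash-Sutcliffe efficiency coefficient."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for s, o in zip(sim, obs))
    variance_obs = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / variance_obs


def pearson_r(sim, obs):
    """Pearson's correlation coefficient between simulated and observed flow."""
    mean_sim = sum(sim) / len(sim)
    mean_obs = sum(obs) / len(obs)
    cov = sum((s - mean_sim) * (o - mean_obs) for s, o in zip(sim, obs))
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    return cov / math.sqrt(var_sim * var_obs)


def pbias(sim, obs):
    """Percent bias; ASSUMED sign: positive means the model overestimates."""
    return 100.0 * sum(s - o for s, o in zip(sim, obs)) / sum(obs)


def rpfd(sim, obs):
    """Relative peak flow difference; ASSUMED sign: positive means a too-high peak."""
    return (max(sim) - max(obs)) / max(obs)


# Hypothetical hourly flows (m3/s)
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 4.0, 6.0, 8.0]
scores = {
    "NSE": nse(simulated, observed),
    "r": pearson_r(simulated, observed),
    "PBias": pbias(simulated, observed),
    "rPFD": rpfd(simulated, observed),
}
```

In this example the timing is perfect (r equals 1) while the volumes are doubled, which is why looking at r alone would overstate the quality of the simulation.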

Simulation Time-Step Analysis
The initial time-step of the rainfall-runoff simulations was set to a 1-h interval. Generally, it can be observed that time aggregation of precipitation data reduces the bias between the observed and simulated data. With an increase in the simulation time-step, the probability of a mismatch in time and space between rainfall measurements and rainfall estimates decreases. Therefore, one of the research goals was to check how the hydrological model performance changes for different time-step aggregations. All the simulated flows for the validation period were computed using 2-, 3-, 4-, and 6-h time intervals. The performance of the simulated hydrographs with aggregated time-steps was evaluated using the performance metrics described in Section 2.3.2.
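Time-step aggregation of an hourly series can be sketched as below. The sketch assumes that precipitation depths are summed and flows are averaged within each block, and that an incomplete trailing block is dropped; the text does not state how these details were handled, so they are assumptions of the example.

```python
def aggregate(series, k, how="sum"):
    """Aggregate a 1-h series into k-h blocks; an incomplete trailing block is dropped."""
    usable = len(series) - len(series) % k
    blocks = [series[i:i + k] for i in range(0, usable, k)]
    if how == "sum":   # e.g. precipitation depth (mm)
        return [sum(block) for block in blocks]
    if how == "mean":  # e.g. discharge (m3/s)
        return [sum(block) / k for block in blocks]
    raise ValueError("how must be 'sum' or 'mean'")


# Hypothetical hourly rainfall (mm) and flow (m3/s)
rain_1h = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
flow_1h = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rain_by_step = {k: aggregate(rain_1h, k, "sum") for k in (2, 3, 4, 6)}
flow_3h = aggregate(flow_1h, 3, "mean")
```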

Adjustment of Radar Rainfall Estimates
Table 5 shows the values of the parameters DR, HG, and MH at the locations of the rain gauges, which were used to obtain the WMR coefficients, and the values of the adjusted parameters for the sub-catchments, which were used for the adjustment of radar rainfall estimates. It can be noticed that the minimum height targeted by the radar increases with increasing distance from the radar. Moreover, complex topography in the mountainous region increases the minimum height of the radar beam because the beam is blocked by terrain. Using the parameter values for the rain gauges from Table 5, along with the rainfall measured at the rain gauges and the corresponding radar rainfall estimates, the WMR coefficients of the relationship between R, the radar-derived rain amount (mm), and R', the time-accumulated adjusted radar-derived rain amount (mm), have been estimated:
To retrieve the adjusted radar rainfall estimate R' from Equation (7), the values of the radar-derived rain amount R and the parameters DR, MH, and HG for the sub-catchments from Table 5 were used.
Estimates of R' have been distributed over the sub-catchments and compared to the raw radar rainfall values (Figure 3). After the adjustment process, the radar rainfall estimates are significantly reduced compared to the raw data (around 40%). This may indicate that the raw radar rainfall estimates are overestimated. The same pattern was observed by Kawka et al. [52] when using a simple mean field bias radar data adjustment.

Intercomparison of Precipitation Products
As a result of the measurement methods used, the three precipitation products applied in this research vary in terms of spatial and temporal resolution. Figure 4 shows the spatial distribution of precipitation assessed by these products. Temporal and spatial upscaling or downscaling of gridded precipitation data affects their accuracy and makes them difficult to compare with each other. An interpolated precipitation field for a sparsely gauged catchment might have a high degree of uncertainty and a coarser spatiotemporal resolution than the satellite products. That may result in missing data when matching precipitation products of finer and coarser resolution [4]. To solve that problem, Omranian et al. [67] suggested that the application of a precipitation product of higher spatiotemporal resolution may lead to a better assessment of satellite products, and successfully compared radar and IMERG GPM rainfall estimates by referring to the radar grid cell nearest to the satellite product. Some of the satellite grid cells cover areas outside the sub-catchments or are shared by several sub-catchments. Therefore, rain observed by the satellite may fall outside the sub-catchments or partly into several of them. To take these aspects into account, a hyetograph was created for each sub-catchment, representing a weighted mean that accounts for the sub-catchment area covered by each grid cell. For the rain gauge and radar precipitation data, the distribution of rainfall rate for the sub-catchments was created by taking the mean value of all the grid cells within the sub-catchment boundaries. Figure 5 presents the inter-comparison of precipitation products for flood events 1-6. The information on total precipitation registered during the flood events is provided in Table 6.
Water 2018, 10, x FOR PEER REVIEW 13 of 23
Figure 5. Comparison of the temporal distribution of 1-h rainfall accumulation (for the entire catchment) for rain gauges, adjusted radar, and IMERG GPM data during the analyzed events 1-6 (a-f).
Table 6. Comparison of total precipitation for analyzed flood events.
Total Precipitation Accumulation (mm)

| Event | Rain Gauges | Raw Radar | Adjusted Radar | IMERG GPM |
|---|---|---|---|---|
| Event 1 | 397 | 1680 | 692 | 678 |
| Event 2 | 212 | 437 | 179 | 223 |
| Event 3 | 227 | 577 | 236 | 293 |
| Event 4 | 129 | 452 | 184 | 149 |
| Event 5 | 175 | 424 | 175 | 190 |
| Event 6 | 448 | 1108 | 456 | 783 |

According to Table 6, during some of the flood events the total accumulated precipitation calculated from the different data sources is quite similar, even though the temporal distribution of rainfall (Figure 5) during these events differs. The analyzed precipitation data sources perform differently under extreme rainfall conditions. In most cases, the rain gauge rainfall seems to be underestimated and sensitive to the registration of outliers. After adjustment, the radar rainfall estimates seem to have a smoother rainfall distribution than the other precipitation products and do not contain outlying values. The IMERG GPM rainfall estimates seem to be overestimated under extreme rainfall conditions. The same pattern was observed, for instance, by Omranian et al. [67] when analyzing the performance of IMERG GPM rainfall data for Hurricane Harvey and by Prakash et al. [68] over India, a monsoon-dominated region.
Besides other factors such as model parameterization or the temporal integration of water balance dynamics [69], inaccuracy of the input data (in this case precipitation) is one of the main reasons for inaccuracy of a hydrological model. Values in the input precipitation that are probably outliers may have a significant impact on the calibration of the hydrological model. Taking into account the characteristics of the analyzed precipitation datasets, the results shown in Figure 5 and Table 6, and the topography of the study area, one can expect that the adjusted radar rainfall estimates will perform best and that the worst results will be obtained with the rain gauge data. The performance of the IMERG GPM data should fall between the other two sources of precipitation data.
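The area weighting used in this section to build sub-catchment hyetographs from satellite grid cells can be illustrated with a minimal Python sketch. The function name, the array layout, and the example numbers are hypothetical and chosen only for illustration; the overlap areas are assumed to come from a prior GIS intersection of the grid with the sub-catchment polygon, which is not shown.

```python
import numpy as np

def area_weighted_hyetograph(cell_rain, overlap_area):
    """Area-weighted mean rainfall series for one sub-catchment.

    cell_rain    : array (n_cells, n_steps), rainfall depth per grid cell and time step
    overlap_area : array (n_cells,), area of each grid cell lying inside the sub-catchment
    """
    cell_rain = np.asarray(cell_rain, dtype=float)
    overlap_area = np.asarray(overlap_area, dtype=float)
    total = overlap_area.sum()
    if total <= 0:
        raise ValueError("no grid cell overlaps the sub-catchment")
    weights = overlap_area / total      # fraction of the sub-catchment covered by each cell
    return weights @ cell_rain          # weighted mean for every time step

# Hypothetical example: two satellite cells, three hourly steps (mm)
rain = [[2.0, 4.0, 0.0],
        [1.0, 0.0, 6.0]]
area = [30.0, 10.0]                     # km^2 of each cell inside the sub-catchment
hyeto = area_weighted_hyetograph(rain, area)
print(hyeto)
```

Cells that lie entirely outside a sub-catchment receive zero overlap area and therefore do not contribute to its hyetograph.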

Calibration and Validation of the Model
The results of the evaluation criteria described in Section 2.3.2 for the three events used for calibration of the models are shown in Table 7. Figure 6 presents the comparison of observed and simulated hydrographs for the calibration events. The outflow simulations were performed at an hourly time step.
Unsatisfactory values of all performance metrics are observed for the simulation of Event 1 with rain gauges as the precipitation data source. Good or very good values of the Nash-Sutcliffe efficiency coefficient (NSE), Pearson's correlation coefficient (r), and Percent bias (PBias) are observed for the rest of the analyzed cases. However, the Relative peak flow difference (rPFD) indicates that for all Event 1 simulations the results are not acceptable. Therefore, even though the other metrics performed well for that event, the simulation results cannot be considered acceptable. It is worth noting that there are cases (e.g., Event 6 simulated with the IMERG GPM precipitation data) in which some of the metrics (r and PBias) perform well while the others (NSE and rPFD) give unsatisfactory results. Therefore, the selected simulations must be analyzed with respect to all performance metrics at the same time.
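The need to read the four metrics together can be illustrated with a short Python sketch. The formulas below are commonly used textbook forms and are an assumption here; in particular, the sign convention of PBias (positive meaning overestimation) and the normalization of rPFD by the observed peak may differ from the definitions in Section 2.3.2. The discharge values are hypothetical.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, values <= 0 are no better than the observed mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))

def pearson_r(obs, sim):
    """Pearson's correlation coefficient between observed and simulated flow."""
    return float(np.corrcoef(obs, sim)[0, 1])

def pbias(obs, sim):
    """Percent bias; assumed convention: positive values mean overestimated volume."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(100.0 * np.sum(sim - obs) / np.sum(obs))

def rpfd(obs, sim):
    """Relative peak flow difference; assumed to be normalized by the observed peak."""
    return float((np.max(sim) - np.max(obs)) / np.max(obs))

# Hypothetical hydrographs (m^3/s): volume is matched, peak is not
obs = [1.0, 2.0, 4.0, 2.0, 1.0]
sim = [1.0, 2.0, 3.0, 2.0, 2.0]
print(nse(obs, sim), pearson_r(obs, sim), pbias(obs, sim), rpfd(obs, sim))
```

In this constructed case PBias is zero because the simulated volume matches the observed one, yet the peak is underestimated by 25%, which is the kind of situation in which a single metric would be misleading.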
According to Figure 6, the computed hydrographs agree well with the observed hydrographs, particularly for the hydrological model run with adjusted radar precipitation estimates. The results of the hydrological model for Event 1, using adjusted radar and IMERG GPM precipitation estimates, show that an artificial second peak is produced. That may be a result of the chosen loss method (SCS Curve Number), which predicts an average trend of rainfall losses rather than the response to an individual storm (particularly during events of intense rainfall) [70]. All the hydrological models underestimated the peak flow values for Event 1. In the case of Event 2, all three hydrological models performed well, but the simulated peak appeared earlier than the observed one. For Event 3, the hydrological model run with IMERG GPM precipitation as input significantly overestimated the observed flow. Simulations with precipitation measured by rain gauges and with adjusted radar estimates gave similar results for the first peak, whereas the second peak was over- and underestimated, respectively.
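The behaviour of the SCS Curve Number loss method mentioned above can be illustrated with a minimal Python sketch of the standard relation in metric units, Q = (P - Ia)^2 / (P - Ia + S) with S = 25400/CN - 254 (mm). This is not the HEC-HMS implementation; the function name, the initial abstraction ratio of 0.2, the curve number, and the rainfall values are assumptions made for illustration.

```python
import numpy as np

def scs_cn_excess(rain_mm, cn, ia_ratio=0.2):
    """Incremental rainfall excess (mm) per time step from the SCS Curve Number relation."""
    if not 0.0 < cn <= 100.0:
        raise ValueError("cn must be in (0, 100]")
    rain = np.asarray(rain_mm, dtype=float)
    s = 25400.0 / cn - 254.0            # potential maximum retention (mm)
    ia = ia_ratio * s                   # initial abstraction (mm), assumed ratio
    p_cum = np.cumsum(rain)             # cumulative rainfall since the start of the event
    over = np.maximum(p_cum - ia, 0.0)  # rainfall remaining after the initial abstraction
    q_cum = np.zeros_like(over)
    wet = over > 0.0
    q_cum[wet] = over[wet] ** 2 / (over[wet] + s)   # cumulative excess
    return np.diff(q_cum, prepend=0.0)  # excess generated within each time step

# Hypothetical hourly rainfall (mm) and curve number
rain = [5.0, 10.0, 20.0, 5.0]
excess = scs_cn_excess(rain, cn=80.0)
print(excess)
```

Because the excess depends only on cumulative rainfall since the start of the event, losses decrease monotonically through the event and do not recover between rainfall bursts, which is one possible reason why multi-peak events are harder to reproduce with this method.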
Tables 8-10 provide the calibrated model parameter values (each the mean over the three calibration events) for the loss, transform, and base flow methods assumed in the hydrological models run with precipitation data from rain gauges, adjusted radar, and IMERG GPM. The calibrated model parameters from Tables 8-10 were then used in the validation phase. Preliminary results indicated that model performance is unsatisfactory when the model is run with constant values of the optimized parameters. Therefore, a standard deviation interval for each parameter was implemented in the hydrological models and used when validating them.
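One possible reading of the standard deviation interval described above is the mean of a calibrated parameter over the calibration events plus and minus one standard deviation. The following Python sketch shows that computation under this assumption; whether a sample or population standard deviation was used is not stated, so the sample form (ddof=1) is an assumption, and the parameter values are hypothetical.

```python
import numpy as np

def parameter_interval(values):
    """Return (lower, mean, upper) for a parameter calibrated on several events.

    The interval is mean +/- one sample standard deviation (ddof=1, an assumption).
    """
    values = np.asarray(values, dtype=float)
    if values.size < 2:
        raise ValueError("at least two calibration events are needed")
    mean = values.mean()
    std = values.std(ddof=1)
    return float(mean - std), float(mean), float(mean + std)

# Hypothetical time-to-peak values (h) from three calibration events
lower, mean, upper = parameter_interval([2.0, 2.5, 3.0])
print(lower, mean, upper)
```

During validation, a parameter would then be allowed to vary between the lower and upper bounds instead of being fixed at the mean.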
The results of the evaluation criteria described in Section 2.3.2 for the events used for validation of the models are shown in Table 11. Figure 7 contains the comparison of observed and simulated hydrographs for the validation events. As in the calibration stage, the outflow simulations were performed at an hourly time step.
In the validation stage, the rain gauge-based simulations resulted in the least satisfactory values of the evaluation metrics. Good or very good performance is observed for all the validation events when adjusted radar rainfall estimates are used. Except for Event 3, similar results are obtained for simulations with IMERG GPM data. However, the rPFD criterion indicates that, among the analyzed simulations, the adjusted radar rainfall estimates perform best. As in the analysis of the calibration events, the overall evaluation of a simulation must consider all the performance metrics. For instance, taking only PBias into account may lead to the misleading conclusion that almost all simulations are acceptable.
Visual inspection of the simulated hydrographs (Figure 6) shows that they fit the observed ones well for the simulations based on adjusted radar estimates and relatively well for Event 4 and Event 5 when IMERG GPM data are used. For Event 3, in which river flow has a bimodal distribution, only the adjusted radar estimates provide satisfactory results. The other simulations do not reproduce two peaks within one simulation. Most of the simulations give satisfactory results for the simulated peak flow value. However, for the rain gauge and IMERG GPM simulations there is a mismatch in the timing of the peak. Figures 8-10 show the values of the performance metrics (NSE, r, PBias, rPFD) for the validation events for hydrological model time steps of 1, 2, 3, 4, and 6 h.

Simulation Time-Step Analysis
According to Figures 8-10, for simulations using rain gauge precipitation the best results are obtained not for the initial time step (1 h) but for the aggregated ones. In particular, aggregation beyond 2 h provides better results. This may indicate that the uncertainty of the precipitation field created by interpolation from the rain gauge stations is quite significant. For the adjusted radar rainfall-based simulations, the results for the 1- and 2-h time steps are similar but generally slightly better for the 2-h time step. Further aggregation in time does not improve the results in these cases. Simulations using IMERG GPM data give the best results for the 1-2 h time steps, and longer time steps usually decrease model performance. Therefore, the optimal simulation time step for the hydrological model using radar or IMERG GPM rainfall estimates is 1-2 h, whereas for rain gauges the time step should be extended. The worst results are obtained for aggregation up to 6 h, which seems to be too long a time step, particularly for short event periods.
The observed deterioration of performance with increasing aggregation time step is primarily related to the response time of the catchment. The average time to peak of the catchment is around 2.5 h (Table 9). A time step aggregated beyond the catchment response time implies a loss of information on the dynamics of the hydrological processes and leads to a decrease of the maximum discharge values. The aggregation of the time step also affects the estimation of time-dependent model parameters.
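The loss of information caused by temporal aggregation can be illustrated on the rainfall input alone with a minimal Python sketch. The hourly series below is hypothetical, and the function only sums rainfall into coarser steps; it is not the aggregation procedure of the hydrological model itself.

```python
import numpy as np

def aggregate_rainfall(hourly_mm, step_h):
    """Sum an hourly rainfall series (mm) into consecutive blocks of step_h hours."""
    hourly = np.asarray(hourly_mm, dtype=float)
    pad = (-len(hourly)) % step_h           # zero-pad so the length is a multiple of step_h
    hourly = np.pad(hourly, (0, pad))
    return hourly.reshape(-1, step_h).sum(axis=1)

# Hypothetical 12-h event with two rainfall bursts (mm per hour)
hourly = [0, 1, 8, 3, 0, 0, 2, 6, 1, 0, 0, 0]
peak_intensity = {}
for step in (1, 2, 3, 4, 6):
    agg = aggregate_rainfall(hourly, step)
    peak_intensity[step] = agg.max() / step  # peak mean intensity (mm/h) at this time step
    print(step, agg, peak_intensity[step])
```

The event total is preserved at every time step, but the peak mean intensity falls from 8 mm/h at 1 h to 2 mm/h at 6 h in this example, and the timing of the bursts can only be resolved to the length of the time step.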


Conclusions
This paper presents a case study of rainfall-runoff modeling for a small mountainous catchment in southern Poland using three different precipitation data sets: rain gauge data, adjusted radar rainfall estimates, and satellite rainfall (IMERG GPM). A semi-distributed HEC-HMS hydrological model was used for river flow simulation. From the analysis of the outcomes, the following conclusions can be drawn:
(1) Good or very good model performance was obtained for most of the simulations during the calibration phase, but for the validation period the best results were obtained using the adjusted radar rainfall estimates and IMERG GPM data as the precipitation data source.
(2) Spatial and temporal distributions of rainfall estimated from different data sources vary significantly. As the rainfall distribution in both time and space has a substantial impact on the estimated values of model parameters, a separate hydrological model should be applied for each source of precipitation data.
(3) Radar-estimated precipitation seems to be the most reliable source of information on the 'real' precipitation field. Precipitation interpolated from the rain gauge data seems to have a high degree of uncertainty, whereas IMERG GPM provides precipitation estimates of low spatial resolution.
(4) Raw radar rainfall estimates seem to overestimate the observed rainfall significantly. Therefore, the radar data should be adjusted to minimize the bias between rain gauge measurement and radar estimation. When applied, the adjustment method for the radar rainfall estimates performed very well for event-based rainfall-runoff simulations in the mountainous area, and it can be easily adapted to other areas as it requires relatively little data.
(5) The short latency of IMERG GPM rainfall estimates makes them a valuable data source for near-real-time flood monitoring, but this advantage is offset by their rather coarse spatial resolution. Application of IMERG GPM rainfall estimates is challenging for small catchments, as the satellite grid cells may cover areas outside a sub-catchment or be shared by several sub-catchments. In that case, the rainfall should be weighted to account for the area of each sub-catchment covered by each grid cell.
(6) An adequate choice of performance metrics is essential to evaluate the simulation results thoroughly. The evaluation criteria should allow the performance of the flow model to be judged with respect to various flow characteristics (for event-based modeling, these are the predictive power of the model, the timing of simulated and observed time series, the tendency to over- or underestimate simulated flow, and the accuracy of peak flow estimation). The applied evaluation metrics (Nash-Sutcliffe efficiency coefficient, Pearson's correlation coefficient, percent bias, and relative peak flow difference) allowed a comprehensive assessment of the simulation results with respect to these characteristics.