Influence of Host and Environmental Factors on the Distribution of the Japanese Encephalitis Vector Culex tritaeniorhynchus in China

Culex tritaeniorhynchus is an important vector that transmits a variety of human and animal diseases. Japanese encephalitis (JE), an endemic disease in the Asia-Pacific region, is primarily transmitted by Cx. tritaeniorhynchus. Insufficient monitoring of vector mosquitoes has led to a poor understanding of the distribution of Cx. tritaeniorhynchus in China. To delineate the habitat of Cx. tritaeniorhynchus and any host and environmental factors that affect its distribution, we used a maximum entropy modeling method to predict its distribution in China. Our models provided high resolution predictions on the potential distribution of Cx. tritaeniorhynchus. The predicted suitable habitats of the JE vector were correlated with areas of high JE incidence in parts of China. Factors driving the distribution of Cx. tritaeniorhynchus in China were also revealed by our models. Furthermore, human population density and the maximum NDVI were the most important predictors in our models. Bioclimate factors and elevation also significantly impacted the distribution of Cx. tritaeniorhynchus. Our findings may serve as a reference for vector and disease control.


Introduction
Japanese encephalitis (JE) is a mosquito-borne zoonosis caused by infection with Japanese encephalitis virus (JEV). JEV is a member of the genus Flavivirus, family Flaviviridae [1]. JE usually manifests as mild central nervous symptoms, primarily in children and adolescents [2,3]. A small number of cases cause serious viral encephalitis and the disease has a mortality rate of approximately 20% to 40% [4,5]. Permanent neuropsychiatric sequelae may occur in up to 50% of JE survivors with encephalitis symptoms [6]. JE is prevalent in most parts of Asia and the Western Pacific, and has been estimated to result in approximately 67,900 cases annually [7]. JEV also causes reproductive losses and encephalitis in animals, such as swine, equine and cattle [8,9]. Unlike other animal hosts, swine play an important role in the spread of JEV as the major amplifying host [10].
All provinces of China, except Xinjiang and Qinghai, have reported cases of JE. Since 1951, JE has been included on the list of notifiable infectious diseases class B in China. The first confirmed JE case in China was in 1949, and the highest morbidity (20.92/100,000) occurred in 1971 [11]. A national vaccination program against JE was implemented in the 1970s [12]. The attenuated vaccine SA 14-14-2 developed in China is now widely used in JE-endemic countries [13].
In Asia, the principle vector of JE is Culex tritaeniorhynchus [14]. Cx. tritaeniorhynchus is a mosquito species that is widespread in Asia, the Mediterranean and Afrotropical region [15]. In addition to JE, Cx. tritaeniorhynchus also has the ability to transmit some other human and animal viral diseases such as Rift Valley fever, West Nile fever, Dengue fever and Tembusu virus infection [16][17][18][19]. It is widely accepted that Cx. tritaeniorhynchus prefers low lying water bodies with abundant plants, so paddy fields are a major habitat of the larvae [20,21]. As the area for rice planting has expanded in the last 40 years, Cx. tritaeniorhynchus has gained a broader habitat in countries and regions with developed levels of agriculture [22,23]. According to FAO Statistical databases, China has the second biggest rice planting area (30,449,860 hectares in 2016) in the world and is facing a growing public health threat.
In recent years, ecological niche modeling techniques have demonstrated their outstanding ability to predict the spatial distribution of various species and epidemic diseases, such as snails in South Africa, Toxoplasma gondii oocysts in China, Bacillus anthracis in Zimbabwe and plague in the U.S [24][25][26][27]. This modeling relies on high quality and high resolution information about environmental layers for the research area, which can help to provide good predictions with limited occurrence data.
The factors that influence the distribution of Cx. tritaeniorhynchus in China remain unclear. In this study, we applied a maximum entropy (Maxent) niche modelling method to predict the potential suitable habitats of Cx. tritaeniorhynchus in China. By modeling the spatial distribution of this insect vector, we can identify the main areas under threat of JE proliferation and other vector-borne diseases. It is also possible to understand the specific role of environmental factors that drive the distribution of the JE vector in China.

Mosquito Presence Data
The Cx. tritaeniorhynchus larvae and adult collections in China, used in our research, were obtained by literature retrieval. Literatures which provided sufficient information on collection locations to confirm coordinates were adopted. The records were taken from the period between 1970 and 2015. Detailed sampling time and literature sources are given in Table S1. Google Earth software was used to acquire the coordinates of mosquito collection points in the provided locations, when coordinates were not given in the literature. To match the resolution (1 km × 1 km) of environmental layers used in this study, presence records within 1 km 2 were considered as one point.

Environmental Variables and Data Processing
We obtained twenty-seven environmental variable layers in total, including bioclimate data, elevation data, vegetation index and host population densities (Table 1). Bioclimate variables were downloaded from the WorldClim dataset. Nineteen variables that reflect temperature and precipitation conditions for 1970-2000 were the latest available data. Elevation data was obtained from the NASA's Shuttle Radar Topography Mission (SRTM) Digital Elevation Data. In addition to altitude, slope and aspect were also considered as geographical factors affecting mosquito habitats [28]. Slope represents the flow velocity and runoff rate of surface and subsurface water, which affects the abundance of surface water and the soil moisture content [29]. Aspect affects the exposure to sunlight, which has a direct effect on the growth of plants [30]. Slope and aspect layers were extracted from the digital elevation model (DEM) using ArcGIS 10.2 (ESRI Inc., Redlands, CA, USA). We considered the Normalized Difference Vegetation Index (NDVI) as a predictor that reflects vegetation coverage. The NDVI value has been widely accepted as an important predictor for the presence of mosquitoes. It has been referenced in mosquito predictive studies, such as identifying mosquito clusters in urban areas and predicting the peak production of mosquitoes in rice fields [31,32]. The maximum, average and minimum NDVI were derived from Moderate Resolution Imaging Spectroradiometer (MODIS) images captured by satellite Terra. Human population density data for China was provided by the Chinese Academy of Sciences. Pig density data for China was obtained from the Livestock Geo-Wiki [33]. All environmental layers were treated as follows: (1) resampled to the resolution of 1 km × 1 km; (2) defined projection to GCS_WGS_1984; (3) clipped to the geographical area of China; (4) converted to ASCII format. All operations above were accomplished in ArcGIS 10.2. A strong correlation among bioclimate variables usually exists in the modeling, which may unjustifiably affect results [34]. The possible collinearity amongst the variables was investigated by calculating the variance inflation factor (VIF), using the car package in R [35][36][37]. The principle we selected the variables was that the VIF value is less than 10 [38].

Establishment of Maxent Models
Maxent 3.4.1 (http://biodiversityinformatics.amnh.org/open_source/maxent/) was used to establish the distribution models of Cx. tritaeniorhynchus. Due to the lack of absence data, a presence-only modeling approach was adopted in our study. Compared with other modeling methods using presence-only data, Maxent showed outstanding predictive performance in species niches and distribution modeling, even when the sample size was very small [39][40][41].
We initially hypothesized that host population densities (human and pigs) may be strong predictors that may obscure the effect of other variables [42]. We ran the program with two different combinations of layers. Layers including host population densities were used in the first round of modeling, and layers excluding host population densities were used for the second round of modeling. We set the random test percentage as 25%, meaning that 75% of the points would be used for training and 25% for testing [43,44]. And we ran Maxent with 10 cross-validation replicates [45].
Sampling bias is a general problem in the modeling of species distribution [46]. Sampling locations are usually conveniently accessed and these locations do not reflect the real distribution of the target species. In the default setting, background points are randomly selected by Maxent software in the entire research area. In order to counteract this effect in our models, we selected background points with the same spatial bias as the presence points [47]. Based on presence points, a surface of sampling intensity was created using the Gaussian kernel density estimator tool in ArcGIS [48,49]. This bias file was input into Maxent, and the software weighted the bias file to create 10,000 background points.

Model Evaluation and Interpretation
Model accuracy can be evaluated by area under the curve (AUC). The receiver operating characteristic curve (ROC) was generated by the Maxent software. A completely random prediction leads to an AUC value of 0.5. If the AUC value is more than 0.5, it means that the prediction is better than random. The closer the AUC value is to 1, the better the prediction performance will be.
A jackknife test, built in Maxent software, was selected to assess the importance of each variable in the model [50]. The jackknife test is to drop each variable in turn in each run, and then use only each variable in turn in each run. Whether the training gain decreased when the variable was dropped or increased when used alone, this variable can be thought as containing information that no other variables have. This shows the importance of each variable to the model. We can also refer to the "Percent contribution" given by Maxent to judge the relative contribution of each variable to the model. Response curves provided by Maxent represent different models, which were created by using only the corresponding variable. Each curve shows the trend of predictive suitability as the variable varies. This can be interpreted as the specific impact of variables on predictive suitability.

Selection of Variables
By adopting a one-by-one elimination method, variables with VIF > 10 were removed out of the final model. Six variables (Bio 3, Bio 5, Bio 11, Bio 14, Bio 15 and Bio 18) were selected from nineteen bioclimate variables.

Cx. tritaeniorhynchus Habitat Suitability
A total of 173 Cx. tritaeniorhynchus collection points were counted in our models ( Figure 1). The distribution prediction maps for Cx. tritaeniorhynchus are shown in Figures 2 and 3.
We found that both models predicted similar geographical areas as having a medium to high probability for the presence of Cx. tritaeniorhynchus. In central China, southern Shaanxi and Shanxi showed high habitat suitability of Cx. tritaeniorhynchus. In eastern China, extensive areas in Henan and Anhui, and southeast Shandong showed high probabilities of Cx. tritaeniorhynchus presence, as well as the Yangtze River Delta area. In the southwest of China, Sichuan, Chongqing, Yunnan and Guizhou provinces were the most likely areas for the presence of Cx. tritaeniorhynchus. Southern Guangxi and Guangdong, and coastal areas of Hainan, Fujian and Taiwan can also be regarded as suitable habitats. It should be noted that both models gave a high prediction for a limited border area in southeast Tibet. The two models showed some disagreements concerning the central and eastern region of China. The model including host population densities gave a prediction of medium probability for eastern Gansu, northern Shaanxi, eastern Hubei and Hunan, whereas the "Without Host" model gave these areas a higher probability. The prediction of "Without Host" model for southeast Shandong, northern Jiangsu and Jiangxi was broader and higher than that of the "Host" model. In general, the "Host" model provided a prediction of more conservative in extent and lower suitability of Cx. tritaeniorhynchus habitats in China.

Model Evaluation
The model evaluation indicators are shown in Table 2. Both models showed good performance. The model including host population densities gave higher training AUC and test AUC. Compared to the "Host" model, the "Without Host" model predicted a significantly broader exposed area, but a slightly larger exposed population. This indicated that more sparsely populated area was predicted as suitable habitat by the "Without Host" model. Sensitivity equals specificity threshold was applied to generate binary models and calculate test omission rate [41,47,51,52]. Binary models representing suitable/unsuitable were shown in Supplementary File S1.

Importance and Contribution of Variables
According to the jackknife test (Figures 4 and 5), it was obvious that human population density shows the greatest importance to the "Host" model. The increase in training gain when population density was used alone and the reduction when it was discarded were both the largest. In addition to population density, Bio 5, Bio 11, Bio 14, Bio 18, NDVI max, pig density and elevation also showed significant importance to the prediction.  The percent contribution of each variable in the two models are shown in Table 3. Bio 5, Bio11, Bio 14, Bio 18, NDVI max and elevation were regarded as variables with greater contributions in both models. In the "Host" model, human population density had the greatest relative contribution with 66.20%, whereas pig density was 0.41%. Maximum NDVI had the highest contribution (35.28%) in the "Without Host" model, followed by Bio 14 (32.21%).

Response Curves of Variables
According to their contribution and importance to the model, response curves of representative variables are shown in Figure 6.

Discussion
The two models in our study gave similar predictions of the distribution of the JE vector Cx. tritaeniorhynchus in China. The highly suitable habitats were mainly in the southwestern, central, eastern, and coastal areas of China. This coincides with the spatial distribution of JE in China. Since the 21st century, the southwestern region has experienced the most serious JE epidemics in China. According to the China CDC, the average incidence of JE in Guizhou (2.36/100,000), Chongqing (1.28/100,000), Sichuan (1.02/100,000), and Yunnan (0.91/100,000) ranked in the top four in China, far above the national average (0.37/100,000). Shaanxi (0.61/100,000), Henan (0.52/100,000) and Anhui (0.40/100,000) also had a high incidence of JE, ranking fifth to seventh. In recent years, due to advanced economic levels and higher immunization coverage in the coastal areas of China, the incidence of JE has been relatively low. In the 1970s, areas such as Shandong, Jiangsu, Zhejiang, Guangdong and Guangxi were once the most serious epidemic areas of JE in China [53]. There are certainly abundant vectors in areas that experience JE epidemics, which shows that our predictions are relatively accurate.
The predictions were also supported by reports of JE epidemics and the isolation of JEV in recent years. Human JE cases were obtained from literature reports and ProMED mail (http://www. promedmail.org/) [2,5,[54][55][56][57][58][59]. The reported locations were mostly located in the high-risk areas predicted by our models (Table S2). In addition, there were several reports of JEV infection in pigs. JEV infections in swine were reported in Jiangsu (2008-2009), Shanxi (2009) and Sichuan (2012), places that were also predicted as high suitability habitats [60][61][62]. However, due to the limited numbers of JE cases obtained, this is not sufficient for demonstrating the predictive power of the model. Furthermore, some additional mosquito species within the genus Culex and Aedes can also transmit JEV in some regions [63][64][65]. Therefore, some of the observed JE cases may not be due to Cx. tritaeniorhynchus solely. More detailed information on JE cases and future mosquito surveys for suspected high-risk areas are required to make a thorough validation of our model's predictive accuracy.
The "Host" model was more conservative whereas the "Without Host" model predicted a more extensive distribution of high-risk areas. Unsurprisingly, human activities led to such a difference. Habitat suitability may be affected by human interventions. According to the response curve, the habitat suitability rose sharply with the increase of population density ( Figure 6). The most obvious impact of human activities on the distribution of mosquitoes is that mosquitoes have an abundant food source (blood) in densely populated areas. Furthermore, manmade structures not only provide shelter for humans, but also provide mosquitoes with a suitable living environment. As expected, the density of pigs also has an impact on the distribution of Cx. tritaeniorhynchus, but was not found to be as high as that of human influence.
Bioclimate variables played an important role in the modeling. The curve of Bio 5 (maximum temperature of the warmest month) rose first and then dropped, and the peak was at 32.5 • C. This shows that mosquitoes are benefited from higher temperatures, however, excessive temperatures have a negative effect on their survival. In another study, the upper limit of the suitable temperature for JE vectors was considered to be 34.5 • C in India [66]. At higher temperatures, mosquito bites become inactive and the mortality rate increases [67]. Bio 11 (mean temperature of the coldest month) represents the resistance of Cx. tritaeniorhynchus to low temperatures. Although there is evidence that adult Cx. tritaeniorhynchus can overwinter [68], their activity and number in winter were significantly lower. Only 10 female Cx. tritaeniorhynchus were collected from January to April in 2008 in a park in Tokyo, while 14,069 females were collected from April to November in 2007 [69]. The cold winter in northeast China is challenging for the mosquito survival and activity. Bio 14 (precipitation of the driest month) showed the adaptability of mosquitoes to drought. It is known that adequate precipitation is strictly required by Cx. tritaeniorhynchus eggs to maintain activity and hatch [70]. However, Xinjiang is a typically arid and semiarid province in China [71]. Two vast deserts, the Taklimakan Desert and Gurbantunggut Desert, are located in Xinjiang and occupy most of Xinjiang. The lack of precipitation makes the survival of mosquitoes difficult and this explains why there have never been JE cases in Xinjiang. Bio 18 (precipitation of the warmest quarter) demonstrated that summers with abundant rainfall promote the reproduction of mosquitoes.
The maximum NDVI was the most powerful predictor in the "Without Host" model. Vegetation coverage helps to avoid high temperatures and rapid evaporation of surface water caused by direct exposure to strong sunlight. Blood is sucked by female mosquitoes for reproduction, while both males and females obtain energy by feeding on sugar [72]. Mosquitoes get sugar supplements mainly from fruits and nectar [73]. Therefore, vegetation coverage provides not only suitable resting places, but also a sufficient source of sugar for mosquitoes. However, according to the response curve, when the NDVI value was too high (0.55), the suitability began to decline. As observed in a previous study, lush rice led to a dramatic decrease in the number of mosquito larvae [74]. Vegetation that is too dense will limit the sunlight received and cause temperatures to be too low.
Elevation was also considered to be an important factor that affects the distribution of Cx. tritaeniorhynchus in both models. The response curve dropped rapidly as elevation rose. Low temperatures and thin air at high altitudes lead to sparse blood meal hosts and plants [75]. Low lying terrain is also beneficial for the accumulation of water. In a mosquito survey in Yunnan province, the number of Cx. tritaeniorhynchus decreased rapidly with elevation, and at an altitude of more than 3000 m no mosquitoes were captured [76]. The high predictive suitability of low altitude regions was reflected in two typical basins in our models, the Sichuan Basin and the Ordos Basin. In fact, the incidence of JE in these two areas is indeed high. On the contrary, Qinghai (free of JE) and Tibet (few JE cases) benefit from the existence of the Qinghai-Tibet Plateau, the highest plateau in the world.
Two areas require particular attention. A small border area in southeast Tibet was predicted to have a high habitat suitability. In this area, JEV isolates were reported in 2009, indicating that local vector and virus surveillance is an important requirement [77]. The surrounding area of the Taklimakan Desert in southern Xinjiang and the border areas of northern Xinjiang have the possibility of establishing habitat for Cx. tritaeniorhynchus.
Our research is novel in our focus on modeling the distribution of Cx. tritaeniorhynchus for the entire territory of China, with consideration for vegetation and host factors. A previous work mapped the distribution of Cx. tritaeniorhynchus within JE risk areas [78]. However, Tibet, Xinjiang and Qinghai province were not taken into consideration in this study. Human JE cases have been reported in Tibet since 2006, according to the China CDC (http://www.Chinacdc.cn/en/). And JEV isolates (from mosquitoes, pigs and humans) have been described in Tibet [77,79,80]. This demonstrated that JEV is currently circulating in Tibet, so we think it is necessary to include Tibet in the modeling. Xinjiang and Qinghai should also be considered to provide early warning of possible future JE invasion. Our research also reveals the relationship between habitat preference of Cx. tritaeniorhynchus and environmental factors. And the jackknife test and response curves are helpful to understand the conditions that affect the survival of mosquitoes. Compared with the variables used in the previous research (land surface temperature and tasselled cap wetness), we believe that the variables adopted in our research (Bioclimate variables) can better reflect the climate seasonality and complexity, especially for a country with a large territory like China, which crossed several climate zones. In addition, the prediction of the previous model may not be accurate in some areas. For example, there were records of Cx. tritaeniorhynchus in Hainan (Figure 1) but the previous prediction for Hainan was extreme low. Sampling bias was also treated reasonably in our model. Based on these efforts, we obtained a more reliable and precise prediction result for China and discovered potential habitats in Xinjiang and Tibet. Furthermore, our study is unique in its purpose of exploring the effects of hosts and environment on Cx. tritaeniorhynchus distribution. Host factors (densities of humans and pigs) were introduced into our models. By establishing the "Host" model, we investigated the effects of the existence of hosts on mosquitoes. Similarly, by establishing the "Without Host" model, we obtained the true impact of environmental variables on the distribution of Cx. tritaeniorhynchus and defined all possible suitable mosquito habitats for breeding, resting, and foraging activities. The results have important significance for guiding the targeted prevention and control of JE in China. Immunization should be promoted for susceptible human and livestock populations in the high-risk areas predicted by our models. The "Without Host" model gave a broader range of suitable areas for the Cx. tritaeniorhynchus survival, including areas that are sparsely populated at present. Possible future population migration needs to take this into account. At the same time, Cx. tritaeniorhynchus was also proved to be the vector for other zoonoses, such as Rift Valley fever, West Nile fever and Dengue fever. Pathogen surveillance in vectors of these diseases is necessary, especially as RVF and WNV have not yet been reported in China. These two exotic animal diseases have been listed in the Medium and Long-term State Strategy of Animal Disease Prevention and Control of China.
A mark-release-recapture study has shown the ability of Cx. tritaeniorhynchus to disperse more than 5 km, which may cause additional errors with predictions at the 1 km × 1 km resolution [81]. Due to many factors, it is difficult to completely randomize the experimental sampling sites. As with other presence-only modeling approaches, sampling bias is an unavoidable problem. Although we have performed a selection of background points to reduce the influence of sampling bias on the prediction results, the predictions should be interpreted with caution and need to be corroborated by mosquito surveys [40,82].
Our current research on the impact of environmental factors on mosquito distribution is limited, as we only used data on overall climate and vegetation status. If more detailed data were to be introduced in a further study, such as monthly meteorological data, there would be a deeper understanding of the impact at different time-points.

Conclusions
In our current study, we modelled the spatial distribution of Cx. tritaeniorhynchus in China, based on the Maxent niche modelling method. The central, eastern, southwestern and coastal area of China were predicted to be suitable habitats for Cx. tritaeniorhynchus. The most powerful predictors were human population density and maximum NDVI. Several bioclimate factors (Bio 5, Bio 11, Bio 14 and Bio 18) and elevation also had significant impacts on the distribution of Cx. tritaeniorhynchus. Our research reveals the relationship between habitat preference of Cx. tritaeniorhynchus and environmental factors. The results have important implications for the prevention and control of JE and other vector-borne infectious diseases.