The Impacts of Transportation Infrastructure on Sustainable Development: Emerging Trends and Challenges

Transportation infrastructure has an enormous impact on sustainable development. To identify multiple impacts of transportation infrastructure and show emerging trends and challenges, this paper presents a scientometric review based on 2543 published articles from 2000 to 2017 through co-author, co-occurring and co-citation analysis. In addition, the hierarchy of key concepts was analyzed to show emerging research objects, methods and levels according to the clustering information, which includes title, keyword and abstract. The results expressed by visual graphs compared high-impact authors, collaborative relationships among institutions in developed and developing countries. In addition, representative research issues related to the economy, society and environment were identified such as cost overrun, spatial economy, prioritizing structure, local development and land value. Additionally, two future directions, integrated research of various effects and structure analysis of transportation network, are recommended. The findings of this study provide researchers and practitioners with an in-depth understanding of transportation infrastructure’s impacts on sustainable development by visual expression.


Introduction
Transportation infrastructure, as a complex network, connects cities and accommodates human activities coupling the social, economic and environmental systems with the urbanization and population growth. Additionally, the transportation network contributes to the socioeconomic development and the increased quality of life through generating inter-or intra-city connections during urbanization [1,2]. In addition, goals such as low-carbon, resilient and sustainable development should not be ignored when the transportation network is expanded [3]. In detail, transportation infrastructure among cities leads to urban aggregation and diffusion, greatly boosting the regional and national economic development [4,5]. However, the irrational planning of transportation infrastructure also generates negative effects, such as the ecological destruction, increased traffic accidents, climate change, CO 2 emissions and lower transport efficiency [6][7][8][9][10][11]. Therefore, it is necessary to identify multiple impacts of transportation infrastructure from existing studies.
Recently, the impact of transportation infrastructure has been a hot topic, and the economic effect of transportation infrastructure has been receiving more attention and debate [12] because of the pursuit to direct economic growth of both regions and sectors [13]. To review multiple impacts of transportation infrastructure, scientometric studies have been used to analyze the literature and

Method
In the field of transportation, many reviews have been published to identify the research status, while most reviews have mainly depended on researchers' backgrounds and experiences. To build an overview of existing studies with a relatively complete literature, the scientometrics method was used to find out the scientific regularity related to the effects of transportation research based on mathematical statistics and computing techniques [45]. In addition, scientometric analysis mainly depends on bibliographic data to identify the research trends and literature relationships [46]. The scientometric method was used in this study to build a visual information graph for further data mining using the software Citespace (http://cluster.cis.drexel.edu/~cchen/citespace/). The visualization process of the bibliography is meaningful for discovering the potential information based on the graphical representation of data using shapes, colors and images [47]. This method reduces the difficulty in analyzing a large literature, and effectively finds the regularity and the hidden information in existing studies. In this section, the data overview and research path of the scientometric method are presented.

Data Overview
The Web of Science (WOS) database was used to collect published literature data related to the transportation infrastructure. Apart from WOS, Google Scholar's database is extensive, but its citation information is incomplete and inconsistent [48]. Therefore, it is difficult to use for scientometric analysis. In addition, WOS contains the most important and influential journals in the world [49,50]. The impact of transportation infrastructure includes many categories, such as human, economic and environmental. Therefore, in this section, a comprehensive data overview is presented to show the trend of existing studies. In addition, "transport infrastructure" and "transportation infrastructure" were used as keywords to collect the data, initially.
This paper analyzed all collected literature in the WOS core database from 2000 to 2017. The search code SS = (transport* infrastructure*) was used in the WOS core collection. Here, "*" denotes a fuzzy search and "SS" means an article subject search. A total of 2543 bibliographic records were collected in October, 2017, and there are 14 related records filtered by being highly cited in the field, as shown in Table 1. Highly cited papers are the top one percent in each of the 22 Essential Science Indicators (ESI) subject areas per year, which indicates scientific excellence. We can see that this literature is distributed over recent years, and almost all records are related to the environmental dimensions. It is notable that the highest cited article was published in 2012 and is about biofuel application in transportation vehicles. Additionally, Figure 1 shows the top 20 research fields related to transportation infrastructure, including engineering, transportation, business economics, environmental sciences, computer science, geography, public administration, urban studies, and so on. This means the studies related to transportation infrastructure range from the technological level to the management level, providing more challenges and opportunities to interdisciplinary research.
The data overview above shows the overall research trends and fields. According to the research scope and objects, some keywords are chosen to filter the results that are more related to the spillover effects of the transportation infrastructure network. Then the words (SS' = effect* or affect* or influence* or impact*) were selected to refine the results, and a total of 1568 bibliographic records were searched. This step refined the records referring to the impact of transportation infrastructures or other effects on transportation infrastructure. Finally, the main keywords (SS" = railway* or rail* or road* or highway* or expressway* or freeway*) were used to further refine these results in accordance with the specific research objects of this paper, and got 764 records. Figure 2 shows the distribution of 1568 bibliographic records related to the two-way influence of transportation infrastructure and the records related to the influence of railway or road. In addition, the final 531 papers were used for further review analysis. Multi-step data filtering benefits a narrow data range, promoting study depth and guaranteeing the data integrity. It is clear that the distribution trend of is similar between the original data and the filtered data, which means the impact of railway and road could follow the path of the development of transportation infrastructure. In addition, research about the impact of railways and roads accounts for around 20-40% of research on transportation infrastructure during the timespan. In other words, the analysis of railways and roads partially represents the transportation infrastructure.

Scientometric Method
Scientometric analysis is a systematic method to identify and analyze the published literature, and it has become increasingly frequently used to obtain a deeper understanding of a research area [51]. In addition, this analysis has been recognized as an efficient method to identify the hidden information in published bibliographies [52]. In the field of transportation, scientometric analysis has been used as a quantitative approach to identify research phenomena and trends [14,15], but these previous studies did not systematically analyze the research network or recognize the hidden research trends and relations. The software CiteSpace can visualize the emerging trends, transient patterns, substantial theoretical and methodological contributions in scientific literature from the perspective of a social network [53,54]. The accessible graphs based on network analysis and clustering algorithms are able to show the knowledge more logically and systematically [55]. Therefore, CiteSpace was used to identify and analyze the main effects of transportation infrastructure on sustainable development based on the literature. In this study, some scientometric techniques were used, such as fundamental information analysis (author, institution and country) and network analysis (subject, keywords and co-citation). According to these analysis results, the research challenges and trends were further systematically analyzed.
In detail, the research procedure of this study includes three main parts, according to the collected bibliographic data, as shown in Figure 3. Firstly, 2543 records were collected to perform the data overview, including the highly cited analysis and the top 20 research fields. After the filtering, 2056 records were analyzed by CiteSpace software to show representative people, institutions, countries and relationships among them. Then the dual-map overlay and keyword network of the literature were analyzed to show representative research subjects and issues. Additionally, references in the collected literature were analyzed to build the co-citation network, which generates the clustering information to expand the data source. Finally, according to the clustered information, the research status and trend were summarized systematically to generate the hierarchy of key concepts. All of these steps reviewed the bibliographic information from different dimensions to find the respective research issues. 16

Scientometric Method
Scientometric analysis is a systematic method to identify and analyze the published literature, and it has become increasingly frequently used to obtain a deeper understanding of a research area [51]. In addition, this analysis has been recognized as an efficient method to identify the hidden information in published bibliographies [52]. In the field of transportation, scientometric analysis has been used as a quantitative approach to identify research phenomena and trends [14,15], but these previous studies did not systematically analyze the research network or recognize the hidden research trends and relations. The software CiteSpace can visualize the emerging trends, transient patterns, substantial theoretical and methodological contributions in scientific literature from the perspective of a social network [53,54]. The accessible graphs based on network analysis and clustering algorithms are able to show the knowledge more logically and systematically [55]. Therefore, CiteSpace was used to identify and analyze the main effects of transportation infrastructure on sustainable development based on the literature. In this study, some scientometric techniques were used, such as fundamental information analysis (author, institution and country) and network analysis (subject, keywords and co-citation). According to these analysis results, the research challenges and trends were further systematically analyzed.
In detail, the research procedure of this study includes three main parts, according to the collected bibliographic data, as shown in Figure 3. Firstly, 2543 records were collected to perform the data overview, including the highly cited analysis and the top 20 research fields. After the filtering, 2056 records were analyzed by CiteSpace software to show representative people, institutions, countries and relationships among them. Then the dual-map overlay and keyword network of the literature were analyzed to show representative research subjects and issues. Additionally, references in the collected literature were analyzed to build the co-citation network, which generates the clustering information to expand the data source. Finally, according to the clustered information, the research status and trend were summarized systematically to generate the hierarchy of key concepts. All of these steps reviewed the bibliographic information from different dimensions to find the respective research issues.

Co-Authorship Analysis
According to the author collaboration analysis, the domain authors have a relatively large number of links to other authors in the network, which means the domain authors have higher academic relevance [56]. In this study, 2056 valid bibliographic records were collected from 2000 to 2017. The co-authorship network is shown in Figure 4, where each node represents an author and links between authors denote collaboration established through co-authorship of articles. In this network, excessive links were removed by Pathfinder using network pruning [57], and eventually 189 nodes and 173 links were identified. In addition, the node size represents the frequency of published references and the node color accounts for different collaboration modularity.
According to the cluster information from 2000 to 2017, the network density is 0.0097 and the cluster modularity Q is 0.8626, which means the network of co-authors is fragmented. Only a few closed circuits exist in the network, such as the Mulley C group, Flyvbjerg B group and Ogilvie D group. As shown in the left-bottom graph of Figure 4, many authors collaborated with one or two productive authors. For example, Flyvbjerg Band Van Wee B were both productive and central authors in the community. All centralities of these groups are small, which indicates that in the timespan 2000-2010, important collaboration groups were not formed by author centrality. In order to determine the timeliness of the study, the research period was limited to 2010-2017. In addition, the right-bottom graph of Figure 4 shows the author collaboration network during this period. In this network, 1899 valid records are included and there are 185 nodes and 172 links. The network density is 0.0101, which is similar to the left graph. The network modularity and network structure only change slightly. It is notable that the author clusters change slightly, which means these central authors play significant roles in this field. Overall, from the perspective of timespan, author collaboration groups remained stable and relatively separate from the increased cumulative number of works in the published literature.

Co-Authorship Analysis
According to the author collaboration analysis, the domain authors have a relatively large number of links to other authors in the network, which means the domain authors have higher academic relevance [56]. In this study, 2056 valid bibliographic records were collected from 2000 to 2017. The co-authorship network is shown in Figure 4, where each node represents an author and links between authors denote collaboration established through co-authorship of articles. In this network, excessive links were removed by Pathfinder using network pruning [57], and eventually 189 nodes and 173 links were identified. In addition, the node size represents the frequency of published references and the node color accounts for different collaboration modularity.
According to the cluster information from 2000 to 2017, the network density is 0.0097 and the cluster modularity Q is 0.8626, which means the network of co-authors is fragmented. Only a few closed circuits exist in the network, such as the Mulley C group, Flyvbjerg B group and Ogilvie D group. As shown in the left-bottom graph of Figure 4, many authors collaborated with one or two productive authors. For example, Flyvbjerg Band Van Wee B were both productive and central authors in the community. All centralities of these groups are small, which indicates that in the timespan 2000-2010, important collaboration groups were not formed by author centrality. In order to determine the timeliness of the study, the research period was limited to 2010-2017. In addition, the right-bottom graph of Figure 4 shows the author collaboration network during this period. In this network, 1899 valid records are included and there are 185 nodes and 172 links. The network density is 0.0101, which is similar to the left graph. The network modularity and network structure only change slightly. It is notable that the author clusters change slightly, which means these central authors play significant roles in this field. Overall, from the perspective of timespan, author collaboration groups remained stable and relatively separate from the increased cumulative number of works in the published literature.

Co-Author's Institution and Country Analysis
As shown in Figure 5, the institution network includes 280 nodes and 236 links from 2000 to 2017. The node size represents the amount of published literature from one institution. According to the collected information, the studies related to transportation infrastructure were rich at institutions such as Delft University of Technology (50 records), University of Sydney (33 records), Universidad Politécnica de Madrid (29 records), University College London (28 records), University of Oxford (28 records) and Chinese Academy of Sciences (25 records). This indicates that transportation infrastructure research was active and advanced. In addition, institution nodes with high betweenness centrality are shown in Figure 5. The size of the colored circle represents the amount of published literature in one institution, and different colors show the number in different years. Institutions with high centrality play important roles in the institution network, such as Delft University of Technology (centrality = 0.24), University of Illinois (centrality = 0.17), Georgia Institute of Technology (centrality = 0.14), Shanghai Jiao Tong University (centrality = 0.12) and University of Florida (centrality = 0.10), and they drove the research collaborations among different institutions. Apart from the Delft University of Technology, the top productive institutions did not have higher relative centrality. This means that institutions that published more articles did not play an equally important role in the collaboration network. The institutions with higher centrality would have greater potential.   Japan (centrality = 0.14), which means that they occupied key positions in the collaboration network. During 2010 to 2017, as shown in the right graph, the top 5 countries with the highest centrality include USA (centrality = 0.53), England (centrality = 0.31), Germany (centrality = 0.17), Australia (centrality = 0.15) and the People's Republic of China (centrality = 0.1). We can see the centralities of USA and England experienced a decrease and the roles of Germany, Australia and the People's Republic of China became increasingly significant. Additionally, apart from the top central countries, Spain, Netherlands and Canada had higher published frequencies, which indicates their higher relative potentials. According to the clustering results, we can see the change of research interests. The labels of clusters were generated by log-likelihood ratio method in the software. It is notable that during 2000-2010, an overview of popular topics included infrastructure surveillance, local development and evidence; during 2010-2017 those consisted of transportation decision, regional development and infrastructure surveillance. In addition, the clustering members experienced an increase and transfer.

Discipline Analysis
Every citation and cited work was assigned to a specific research discipline according to the journals in a global map of science generated from over 10,000 journals indexed in the WOS [58]. Therefore, this study built an overlay map to show the dual-map of the science sketch database that perfectly described the interdisciplinary research. Figure 7 shows the main disciplines of collected citing articles and cited articles. The left part of the graph shows the distributed disciplines of citing articles and the right part describes that of cited articles. In addition, the color curves represent the fluctuant relations. It is clear that the journals of citing articles related to transportation infrastructure are mainly distributed in disciplines such as mathematics, systems, economics and physics. Cited articles' journals are mainly distributed in the areas of ecology, computer, social education and economics. The distribution of cited articles indicates the application fields and research foundations. More importantly, transportation infrastructure papers are published in almost all major disciplines, which means transportation infrastructure studies play important roles in multidisciplinary research. Additionally, the dual-map overlay shows the information about the field studies more macro compared with article clustering analysis. Thus, Figure 8 shows the interdisciplinary co-occurring network of the literature based on the WOS discipline categories. We can see that the top frequent disciplines include Engineering, Transportation and Business & Economics. The links among different nodes mean the existence of collaboration among different disciplines. Interdisciplinary research is quite obvious in the field of transportation infrastructure.

Discipline Analysis
Every citation and cited work was assigned to a specific research discipline according to the journals in a global map of science generated from over 10,000 journals indexed in the WOS [58]. Therefore, this study built an overlay map to show the dual-map of the science sketch database that perfectly described the interdisciplinary research. Figure 7 shows the main disciplines of collected citing articles and cited articles. The left part of the graph shows the distributed disciplines of citing articles and the right part describes that of cited articles. In addition, the color curves represent the fluctuant relations. It is clear that the journals of citing articles related to transportation infrastructure are mainly distributed in disciplines such as mathematics, systems, economics and physics. Cited articles' journals are mainly distributed in the areas of ecology, computer, social education and economics. The distribution of cited articles indicates the application fields and research foundations. More importantly, transportation infrastructure papers are published in almost all major disciplines, which means transportation infrastructure studies play important roles in multidisciplinary research. Additionally, the dual-map overlay shows the information about the field studies more macro compared with article clustering analysis. Thus, Figure 8 shows the interdisciplinary co-occurring network of the literature based on the WOS discipline categories. We can see that the top frequent disciplines include Engineering, Transportation and Business & Economics. The links among different nodes mean the existence of collaboration among different disciplines. Interdisciplinary research is quite obvious in the field of transportation infrastructure.

Co-Occurring Keyword Analysis
Keywords catch the core content of a paper, and in this section, the collected keywords show the situation and development of research using the software CiteSpace. According to the 2056 valid records collected, the keyword co-occurring network includes 225 nodes and 1092 links shown in Figure 9. The node size represents the frequency of a keyword in all records and links among nodes indicate different keywords occurring in the same record. The t-SNE view was used to lay out the keyword map. The t-SNE technique is a perfect visual method for this map, and gave a complete and clear description. Among the top 50 hot keywords, the most frequent keywords include model (frequency = 176), impact (frequency = 120), system (frequency = 109), growth (frequency = 94), investment (frequency = 86), network (frequency = 85), accessibility (frequency = 80), city (frequency = 74) and policy (frequency = 72), in addition to transportation infrastructure (frequency = 225). More importantly, China (frequency = 86) and the United States (frequency = 52) are two representative country keywords, which means that in these two countries, studies related to transportation infrastructure attracted more attention. In addition, some keywords with high frequency had a relatively high centrality, such as city (centrality = 0.15), network (centrality = 0.14), investment (centrality = 0.12) and impact (centrality = 0.09). To indicate the change of hot topics, we divided the timespan into 2010-2010 and 2010-2017, as shown in Figure 9. The top three keywords are model, infrastructure and impact. The related keywords experienced a significant increase; in particular, keyword impact-related topics included climate, urban studies, land use, resilience and accessibility, which indicated this role. However, this network only shows information based on the collected records, and its difference from the co-citation network is the limitation of this relatively incomplete data. Therefore, the co-citation analysis further solves the data incompleteness in the next section.

Co-Occurring Keyword Analysis
Keywords catch the core content of a paper, and in this section, the collected keywords show the situation and development of research using the software CiteSpace. According to the 2056 valid records collected, the keyword co-occurring network includes 225 nodes and 1092 links shown in Figure 9. The node size represents the frequency of a keyword in all records and links among nodes indicate different keywords occurring in the same record. The t-SNE view was used to lay out the keyword map. The t-SNE technique is a perfect visual method for this map, and gave a complete and clear description. Among the top 50 hot keywords, the most frequent keywords include model (frequency = 176), impact (frequency = 120), system (frequency = 109), growth (frequency = 94), investment (frequency = 86), network (frequency = 85), accessibility (frequency = 80), city (frequency = 74) and policy (frequency = 72), in addition to transportation infrastructure (frequency = 225). More importantly, China (frequency = 86) and the United States (frequency = 52) are two representative country keywords, which means that in these two countries, studies related to transportation infrastructure attracted more attention. In addition, some keywords with high frequency had a relatively high centrality, such as city (centrality = 0.15), network (centrality = 0.14), investment (centrality = 0.12) and impact (centrality = 0.09). To indicate the change of hot topics, we divided the timespan into 2010-2010 and 2010-2017, as shown in Figure 9. The top three keywords are model, infrastructure and impact. The related keywords experienced a significant increase; in particular, keyword impact-related topics included climate, urban studies, land use, resilience and accessibility, which indicated this role. However, this network only shows information based on the collected records, and its difference from the co-citation network is the limitation of this relatively incomplete data. Therefore, the co-citation analysis further solves the data incompleteness in the next section.

Co-Citation Analysis
Co-citation analysis has been defined as the frequency with which two articles are cited together in another article [59]. In this section, co-citation analysis identifies the underlying intellectual structures of the knowledge in the field of transportation infrastructure according to references. The co-citation network was generated based on 2047 valid records between 2000 and 2017, and the top 50 most cited publications in each year were used to construct a network of references cited in that year. As shown in Figure 10, the synthesized network contains 879 references and 174 co-citation clusters after the clustering process. This network has a modularity of 0.8934, which is considered to be very high, suggesting that the specialties in science mapping are clearly defined in terms of cocitation clusters. The mean silhouette is 0.3855, which is relatively low, mainly because of the numerous small clusters. The major clusters that we focus on in this paper were sufficiently high. The areas in different colors indicate the time at which co-citation links in those areas appeared for the first time. Areas in green were generated earlier than areas in yellow. Each cluster can be labeled by title terms, keywords, and abstract terms of articles citing the cluster. We can see that studies related to new application, cost overruns and case study appeared earlier, and urban transportation and public-private partnerships appeared more recently. In addition, cluster areas of new transport infrastructure, cost overruns and evidence study are relatively bigger, which means that these studies received more attention. According to the LLR, labels of the largest 62 clusters were summarized as shown in Appendix A and the most active citer can be checked in Appendix B.

Co-Citation Analysis
Co-citation analysis has been defined as the frequency with which two articles are cited together in another article [59]. In this section, co-citation analysis identifies the underlying intellectual structures of the knowledge in the field of transportation infrastructure according to references. The co-citation network was generated based on 2047 valid records between 2000 and 2017, and the top 50 most cited publications in each year were used to construct a network of references cited in that year. As shown in Figure 10, the synthesized network contains 879 references and 174 co-citation clusters after the clustering process. This network has a modularity of 0.8934, which is considered to be very high, suggesting that the specialties in science mapping are clearly defined in terms of co-citation clusters. The mean silhouette is 0.3855, which is relatively low, mainly because of the numerous small clusters. The major clusters that we focus on in this paper were sufficiently high. The areas in different colors indicate the time at which co-citation links in those areas appeared for the first time. Areas in green were generated earlier than areas in yellow. Each cluster can be labeled by title terms, keywords, and abstract terms of articles citing the cluster. We can see that studies related to new application, cost overruns and case study appeared earlier, and urban transportation and public-private partnerships appeared more recently. In addition, cluster areas of new transport infrastructure, cost overruns and evidence study are relatively bigger, which means that these studies received more attention. According to the LLR, labels of the largest 62 clusters were summarized as shown in Appendix A and the most active citer can be checked in Appendix B. In addition, the timeline visualization in CiteSpace depicted clusters along horizontal timelines. As shown in Figure 11, each cluster was displayed from left to right and clusters were arranged vertically in descending order of their size. The colored curves represent co-citation links added in the year of the corresponding color. Large-sized nodes or nodes with red tree rings received particular attention because they were either highly cited or had citation bursts, or both. We can see that the three most-cited references in a particular year are displayed. The labels of these references were placed in the lowest position. The cluster labels were generated based on terms identified by Latent Semantic Indexing (LSI) [60]. Figure 11 shows the top 2 largest clusters, listed as cluster#0 and cluster#1. The periods in which the clusters were sustained were different, which means that the difference of topic activity. For example, topic #0 (cost overrun) was active during the period from 2008 to 2017 and most of the top active topics were active about 20 years. Furthermore, the top ten largest clusters include cost overrun, quantitative spatial economics, prioritizing highway defragmentation location, local development, land value, regional economic growth, new transportation infrastructure, public-private partnerships, infrastructure change region, recent laboratory research and microbial engineering. All of these clusters have relative network substructures and research status, and trends hide in these references. For example, for the cluster around spatial economics, 2011 to 2012 was the most active timespan for citers.
The analysis above shows the research base and fronts that mine the potential research challenges and trends. In addition, main research topics were further analyzed according to the selected and filtered data above. Table 3 shows the temporal properties of major clusters. We can see that most of the representative references are related to the spillover effect of the transportation infrastructure. For example, Cluster #0 (cost overrun) is the largest cluster, containing 94 references from 2011 to 2017. The mean year of all references is 2008 and the year of the most representative cited articles in this cluster is 2008, too. The timeline visualization reveals the top three cited references from the period of 2000 to 2017. As shown in Figure 11, the three most representative cited references (Priemus Hugo, Banister David and Khadaroo Jameel) occur in 2008. We can see that the period 2008 to 2016 was full of high-impact contributions-large colored citation circles and red citation bursts. We chose the top three cited circles and nine references to analyze the main research topics. Similarly, in the other five clusters, the top three circles and nine representative references were chosen to further analyze the hot research status and research trends. Appendix C shows the high-impact members of the other clusters. These authors may be not the most highly cited authors, but they play important roles in the corresponding fields. In addition, the timeline visualization in CiteSpace depicted clusters along horizontal timelines. As shown in Figure 11, each cluster was displayed from left to right and clusters were arranged vertically in descending order of their size. The colored curves represent co-citation links added in the year of the corresponding color. Large-sized nodes or nodes with red tree rings received particular attention because they were either highly cited or had citation bursts, or both. We can see that the three most-cited references in a particular year are displayed. The labels of these references were placed in the lowest position. The cluster labels were generated based on terms identified by Latent Semantic Indexing (LSI) [60]. Figure 11 shows the top 2 largest clusters, listed as cluster#0 and cluster#1. The periods in which the clusters were sustained were different, which means that the difference of topic activity. For example, topic #0 (cost overrun) was active during the period from 2008 to 2017 and most of the top active topics were active about 20 years. Furthermore, the top ten largest clusters include cost overrun, quantitative spatial economics, prioritizing highway defragmentation location, local development, land value, regional economic growth, new transportation infrastructure, public-private partnerships, infrastructure change region, recent laboratory research and microbial engineering. All of these clusters have relative network sub-structures and research status, and trends hide in these references. For example, for the cluster around spatial economics, 2011 to 2012 was the most active timespan for citers.
The analysis above shows the research base and fronts that mine the potential research challenges and trends. In addition, main research topics were further analyzed according to the selected and filtered data above. Table 3 shows the temporal properties of major clusters. We can see that most of the representative references are related to the spillover effect of the transportation infrastructure. For example, Cluster #0 (cost overrun) is the largest cluster, containing 94 references from 2011 to 2017.
The mean year of all references is 2008 and the year of the most representative cited articles in this cluster is 2008, too. The timeline visualization reveals the top three cited references from the period of 2000 to 2017. As shown in Figure 11, the three most representative cited references (Priemus Hugo, Banister David and Khadaroo Jameel) occur in 2008. We can see that the period 2008 to 2016 was full of high-impact contributions-large colored citation circles and red citation bursts. We chose the top three cited circles and nine references to analyze the main research topics. Similarly, in the other five clusters, the top three circles and nine representative references were chosen to further analyze the hot research status and research trends. Appendix C shows the high-impact members of the other clusters. These authors may be not the most highly cited authors, but they play important roles in the corresponding fields.  Figure 11. A timeline visualization of the largest clusters.

Hierarchy Analysis of Key Concepts
The co-citation network above was divided into 174 co-citation clusters. These clusters were labeled by index terms from their own citers. These keywords show the most representative research topics related to transportation infrastructure. The left part of Figure 12 shows the word cloud based on cluster labels filtered by the same or similar labels of clusters. In this figure, the keyword size represents the frequency of cluster labels. It is clear that the main research topics include economic, region or urban development and spatial effect analysis. However, the cluster data only analyzed the label information, and did not identify other potentially relevant information. Therefore, a report of automatically generated narratives was used to analyze the word cloud distribution further, as shown in the right part of Figure 12. The narratives include the main subjects in the titles and abstracts of the top references in the top 62 clusters that are relatively complete. We can see that hot research topics consist of urban development, project, economic, cost and policies. In particular, we identified Figure 11. A timeline visualization of the largest clusters.

Hierarchy Analysis of Key Concepts
The co-citation network above was divided into 174 co-citation clusters. These clusters were labeled by index terms from their own citers. These keywords show the most representative research topics related to transportation infrastructure. The left part of Figure 12 shows the word cloud based on cluster labels filtered by the same or similar labels of clusters. In this figure, the keyword size represents the frequency of cluster labels. It is clear that the main research topics include economic, region or urban development and spatial effect analysis. However, the cluster data only analyzed the label information, and did not identify other potentially relevant information. Therefore, a report of automatically generated narratives was used to analyze the word cloud distribution further, as shown in the right part of Figure 12. The narratives include the main subjects in the titles and abstracts of the top references in the top 62 clusters that are relatively complete. We can see that hot research topics consist of urban development, project, economic, cost and policies. In particular, we identified some potential topics that were excluded in the left graph, such as land, risk, panel data and policies. By means of the two-step summary, potential keywords could be easily identified. some potential topics that were excluded in the left graph, such as land, risk, panel data and policies. By means of the two-step summary, potential keywords could be easily identified. Additionally, key concepts identified from the titles of citing articles in Cluster #0 were algorithmically organized according to hierarchical relations derived from co-occurring concepts. Figure 13 shows the main concept tree of Cluster #0. The largest branch of such a hierarchy typically reflects the main concepts of scholarly publications produced by the specialty behind the cluster. The main logical categories include transport infrastructure, projects, cost overruns and impact. The category "Transportation Infrastructure" mainly focuses on improving the project performance separately from the traditional and important problem "Cost Overrun". In particular, the category "impact" emerged gradually as an independent branch, which was driven by the increasing quantity and complexity of transportation infrastructure. It is notable that the transportation infrastructure branch highlights the characteristics (large, resilience, spatial and complexity), research methods (modeling, econometric and network mapping) and research questions (quality, risk, performance and PPP). In other words, sub-categories in this figure indicate the characteristics, questions, objects, dimensions and methods related to transportation infrastructure. The identity of category labels based on the title data obeys the logical tree algorithm of the software Citespace. This figure not only shows the main research topics but the logical relationships among these topics. To understand the hierarchy better, the key concepts in the top 7 Clusters (#0-#6) were identified in one hierarchy. Figure 14 summarizes the concept tree generated by Citespace according to the reference titles in Cluster #0-#6. The categories colored blue were identified automatically. For the systematic expression of the hierarchy, some branches colored green are used to conclude the fragmented research questions. In addition, this hierarchy filtered the repeating keywords and Additionally, key concepts identified from the titles of citing articles in Cluster #0 were algorithmically organized according to hierarchical relations derived from co-occurring concepts. Figure 13 shows the main concept tree of Cluster #0. The largest branch of such a hierarchy typically reflects the main concepts of scholarly publications produced by the specialty behind the cluster. The main logical categories include transport infrastructure, projects, cost overruns and impact. The category "Transportation Infrastructure" mainly focuses on improving the project performance separately from the traditional and important problem "Cost Overrun". In particular, the category "impact" emerged gradually as an independent branch, which was driven by the increasing quantity and complexity of transportation infrastructure. It is notable that the transportation infrastructure branch highlights the characteristics (large, resilience, spatial and complexity), research methods (modeling, econometric and network mapping) and research questions (quality, risk, performance and PPP). In other words, sub-categories in this figure indicate the characteristics, questions, objects, dimensions and methods related to transportation infrastructure. The identity of category labels based on the title data obeys the logical tree algorithm of the software Citespace. This figure not only shows the main research topics but the logical relationships among these topics. some potential topics that were excluded in the left graph, such as land, risk, panel data and policies. By means of the two-step summary, potential keywords could be easily identified. Additionally, key concepts identified from the titles of citing articles in Cluster #0 were algorithmically organized according to hierarchical relations derived from co-occurring concepts. Figure 13 shows the main concept tree of Cluster #0. The largest branch of such a hierarchy typically reflects the main concepts of scholarly publications produced by the specialty behind the cluster. The main logical categories include transport infrastructure, projects, cost overruns and impact. The category "Transportation Infrastructure" mainly focuses on improving the project performance separately from the traditional and important problem "Cost Overrun". In particular, the category "impact" emerged gradually as an independent branch, which was driven by the increasing quantity and complexity of transportation infrastructure. It is notable that the transportation infrastructure branch highlights the characteristics (large, resilience, spatial and complexity), research methods (modeling, econometric and network mapping) and research questions (quality, risk, performance and PPP). In other words, sub-categories in this figure indicate the characteristics, questions, objects, dimensions and methods related to transportation infrastructure. The identity of category labels based on the title data obeys the logical tree algorithm of the software Citespace. This figure not only shows the main research topics but the logical relationships among these topics. To understand the hierarchy better, the key concepts in the top 7 Clusters (#0-#6) were identified in one hierarchy. Figure 14 summarizes the concept tree generated by Citespace according to the reference titles in Cluster #0-#6. The categories colored blue were identified automatically. For the systematic expression of the hierarchy, some branches colored green are used to conclude the fragmented research questions. In addition, this hierarchy filtered the repeating keywords and To understand the hierarchy better, the key concepts in the top 7 Clusters (#0-#6) were identified in one hierarchy. Figure 14 summarizes the concept tree generated by Citespace according to the reference titles in Cluster #0-#6. The categories colored blue were identified automatically. For the systematic expression of the hierarchy, some branches colored green are used to conclude the fragmented research questions. In addition, this hierarchy filtered the repeating keywords and deleted words which cannot indicate the main research questions, such as "analysis". However, the sub-category concepts of the branch "analysis" were distributed in other branches. Meanwhile, some sub-categories of the branch "transportation infrastructure" were distributed in the summative branches such as the "objects" and "methods". We can see from Figure 14 that the main branch is "impacts", in which the "spillover effects" and "countries" are listed separately. This means the topic of spillover effect is the intensive research issue, and there are many countries analyzing the impacts of transportation infrastructure on the national scale. In addition, the "impact" category summarized some detailed topics such as land use, urban development and spatial effect. Compared with Figure 13, this hierarchy identified more specific topics, such as rail and road research. Although the amount of data in Figure 14 is about seven times greater than in Figure 13, the hierarchy framework becomes more clear and systematic after filtering out repeated data. More importantly, this systematic hierarchy can help to identify the hottest and most representative research issues quickly. deleted words which cannot indicate the main research questions, such as "analysis". However, the sub-category concepts of the branch "analysis" were distributed in other branches. Meanwhile, some sub-categories of the branch "transportation infrastructure" were distributed in the summative branches such as the "objects" and "methods". We can see from Figure 14 that the main branch is "impacts", in which the "spillover effects" and "countries" are listed separately. This means the topic of spillover effect is the intensive research issue, and there are many countries analyzing the impacts of transportation infrastructure on the national scale. In addition, the "impact" category summarized some detailed topics such as land use, urban development and spatial effect. Compared with Figure  13, this hierarchy identified more specific topics, such as rail and road research. Although the amount of data in Figure 14 is about seven times greater than in Figure 13, the hierarchy framework becomes more clear and systematic after filtering out repeated data. More importantly, this systematic hierarchy can help to identify the hottest and most representative research issues quickly.

Conclusions
This scientometric review based on over 2500 publications from 2000 to 2017 presented the systematic knowledge structure related to impacts of transportation infrastructure on sustainable development. Due to the complex impact mechanism, the identification process needs an in-depth understanding and clear expression. Although reviews related to transportation infrastructure have received attention, the scientometric review with visual expression provides a better way to explore the potential information hidden in knowledge network compared with the traditional review. In this paper, the presentation of scientometric and systematic reviews includes four main steps. Firstly, co-author analysis was used to identify the highly productive authors, institutions and countries to show the overall research status. Then, the co-occurring analysis was used to identify and visualize the overall research trends based on discipline and keyword information. Next, citing articles and cited references were modulated to find co-citation relationships and modularity labels by timeline visualization. Finally, after the modularity, the cluster information was analyzed further to conclude

Conclusions
This scientometric review based on over 2500 publications from 2000 to 2017 presented the systematic knowledge structure related to impacts of transportation infrastructure on sustainable development. Due to the complex impact mechanism, the identification process needs an in-depth understanding and clear expression. Although reviews related to transportation infrastructure have received attention, the scientometric review with visual expression provides a better way to explore the potential information hidden in knowledge network compared with the traditional review. In this paper, the presentation of scientometric and systematic reviews includes four main steps. Firstly, co-author analysis was used to identify the highly productive authors, institutions and countries to show the overall research status. Then, the co-occurring analysis was used to identify and visualize the overall research trends based on discipline and keyword information. Next, citing articles and cited references were modulated to find co-citation relationships and modularity labels by timeline visualization. Finally, after the modularity, the cluster information was analyzed further to conclude the hierarchy concepts of the main clusters, which accurately identified key points. These four steps analyzed the research status and emerging trends of transportation infrastructure's impacts on sustainable development from multiple perspectives, such as author information, collaborative relationships and reference relationships. In addition, compared with the traditional literature review, this scientometric analysis shows the representative information clearly based on a visual map. Importantly, this visual expression provides an easier way to understand the complex collaboration network of literature.
The main research findings are as follows. First, collaboration links among the most productive authors were more frequent than other authors. Moreover, the productive authors generally led to modularity. Second, institutions with high centrality play important roles in the institution network, such as Delft University of Technology, University of Illinois and Georgia Institute of Technology. In addition, countries occupying key positions include USA, England and China. Third, the hot topics related to transportation infrastructure include cost, performance, quality and investment issues from the project level. In addition, from a more macro perspective, economic, social and environmental effects of transportation infrastructure were all caught. Fourth, according to the hierarchy analysis, specific research objects, methods and multiple effects of transportation infrastructure were identified. It is noticeable that spillover effects of transportation infrastructure include some dependent sub-categories, such as spatial, regional, economic and environmental effects. These more macro keywords indicate the complexity of impact mechanisms. In addition, transportation infrastructure has huge impacts on land, urban development, human life and city networks.
However, there are also some limitations that need further improvement in this study. A limitation of using bibliographic databases is that the WOS lacks the information of books and reports in public sources, thus necessitating the integration of multiple data resources. In addition, due to the limitations of the analysis tool, our study could not analyze the information hidden in the references' context. Additionally, the determination of search keywords mainly relies on the subjective judgment of the authors, which might lead to data being missing or incomplete. Given these limitations, multiple analysis was exerted in this work to make up for the data limitations. In this study, the titles, keywords and abstracts could be credibly representative of the main context. Our findings not only reveal research trends, but future research directions. In the future, two directions-integrated study of various spillover effects and network effects of mega transportation infrastructures such as railway and road-will be valuable research issues. By conducting further research in these directions, an improved understanding of the significance of the transportation infrastructure will be obtained, and the planning of transport networks will be conducted under proper advice. In conclusion, this study provides valuable information for both researchers and practitioners to understand the significant and complex impact of transportation infrastructure. It is clear that in both technological issues and management issues, the impact assessment is the key step to justifying the research in the field of transportation infrastructure. This scientometric review will lead to the construction of a theoretical framework to guide this practice.