Quantifying Bicycle Network Connectivity in Lisbon Using Open Data

Abstract: Stimulating non-motorized transport has been a key point on sustainable mobility agendas for cities around the world. Lisbon is no exception, as it invests in the implementation of new bike infrastructure. Quantifying the connectivity of such a bicycle network can help evaluate its current state and highlight specific challenges that should be addressed. Therefore, the aim of this study is to develop an exploratory score that allows a quantification of the bicycle network connectivity in Lisbon based on open data. For each part of the city, a score was computed based on how many common destinations (e.g., schools, universities, supermarkets, hospitals) were located within an acceptable biking distance when using only bicycle lanes and roads with low traffic stress for cyclists. Taking a weighted average of these scores resulted in an overall score for the city of Lisbon of only 8.6 out of 100 points. This shows, at a glance, that the city still has a long way to go before achieving its objectives regarding bicycle use.


Introduction
Stimulating non-motorized transport has been one of the many strategies adopted by cities to tackle climate change [1] and boost their inhabitants' living conditions. Introducing sustainable mobility into an urban planning agenda has beneficial effects on public health, as it decreases air pollution and stimulates physical activity [2][3][4][5][6]. Additionally, it has proven to reduce traffic jams [7] and reactivate public spaces [8]. Therefore, investing in cycling and pedestrian infrastructure is a critical part of transportation policies in metropolitan areas.
The city of Lisbon has undertaken this investment by establishing new cycle ways and public bike-sharing stations, i.e., Gira Bicicletas [9]. The municipality hopes to turn the bicycle into a means of transport that, along with public transportation, will enable inhabitants to make safe and efficient short distance journeys, up to six kilometers, without the use of motorized private transport [10,11]. This is happening in a city where 89% of commuters use a private vehicle [12], with a car occupancy of 1.2 passengers per vehicle [13]. The municipality originally presented a plan which would introduce more than 150 km of new cycle ways (Figure 1) into the city's infrastructure by 2018 [14]. However, by early 2018, only 80 km of the proposed routes had actually been built. The planned completion date has been pushed back to 2020 [15].  [16] In the meantime, Lisbon's cyclists have increased in number [17], which intensifies the need to invest in well-connected and safe infrastructure, not only for regular commuter trips, but also for trips to common destinations (e.g., schools, universities, supermarkets, hospitals). Quantifying the bicycle network's connectivity can help city officials to evaluate its current state and identify specific challenges that should be addressed. The use of publicly available data sources should be favored for such inquiries, given their ease of access and knowledge sharing, constant updates, and reproducibility [18,19]. Therefore, the aim of this study is to develop an exploratory score that allows a quantification of the bicycle network connectivity in Lisbon based on open data, as a first attempt to apply such an index to a European city with a developing biking infrastructure.
The remainder of this paper will be structured as follows: Section 2 provides an overview of work related to bicycle network connectivity. Section 3 describes the specific approach on which this study is based. Section 4 details the adapted methodology applied to address the particular case of the Lisbon study area. Section 5 presents the main results, discusses them within the context of Lisbon, and compares them to other cities' examples. Section 6 suggests directions for future research and development and, finally, Section 7 concludes the report and summarizes its limitations.

Background
Bicycle network analyses have been undertaken from several angles, making use of GIS tools [20,21]. Connectivity analyses and measures have been developed and applied to understand how accessible a city's bike infrastructure really is (see Chapter 3 of [22] for an overview of these methods).
A recurrent approach within the literature is the quantification of cyclists' perceived traffic stress [23,24]. Mekuria, Furth and Nixon [25] proposed a scheme to classify road segments into four levels of traffic stress (LTS), ranging from LTS 1, the level suitable for children; to LTS 4, the level tolerated by "strong and fearless" cyclists [26]. Their approach was tested via a case study in San Diego, California, in which every street and crossing was classified by LTS. In later work, Furth, Mekuria and Nixon [27] developed a measure for the connectivity of low stress cycling networks, using origin-destination data from home-to-work trips. Applying their methodology once again in San Diego, California, they concluded that the road network is divided into disconnected islands of low-stress segments. Investments in new bicycle infrastructure between such islands could drastically improve the connectivity of low-stress networks.
Building on the findings of [25], Lowry et al. [28] created a tool for transportation planners to rank more than 750 bicycle improvement projects from the Seattle, Washington's Bicycle Master Plan, based on their potential to connect homes and important destinations over a low-stress network. The same master plan was evaluated in [29], by comparing the bicycle network connectivity for different types of cyclists and different neighborhoods in Seattle, in both the existing and proposed situation. The results showed that connectivity differs between neighborhoods and types of cyclists, and emphasized the importance of policies and programs that increase the confidence of cyclists. Both [28,29] focused solely on utilitarian travel, and did not distinguish between destinations of varying importance.
Boettge et al. [30] validated the LTS concept in a qualitative way, gathering information from individual cyclists using surveys. Their findings in St. Louis, Missouri, confirmed a positive correlation between cyclists' levels of stress on a road and the speed limit, number of lanes and functional class of the road. No relationship was found between specific bicycle facilities and LTS.

PeopleForBikes "Bike Network Analysis Score" Approach
Yet another example of the use of LTS to quantify bicycle network connectivity comes from the organization PeopleForBikes (PfB). They developed a scoring system called the Bike Network Analysis score (BNA score). The score applies a slightly modified version of LTS whose methodology is described by [25]. The modifications add bicycle facility types that were not considered by the initial LTS approach. Besides this, the method is the same, classifying the four original LTS levels into Low Stress (LTS 1 and LTS 2) and High Stress (LTS 3 and LTS 4).
The BNA score determines how people in a city can get to common destinations on a comfortable and connected bike network, i.e., a low-stress network. The way these destinations are picked is similar to how Lowry et al. define their so-called basket of destination types, following the theory that "it is possible that some bicyclists would not need certain destinations in the basket, and it is also possible that different bicyclists would have unique preferences for particular destination types, e.g., preference for a particular restaurant. Nevertheless (...) the concept of a basket provides a means to calculate a meaningful metric with objectivity" ([28], p. 130).
The score is computed by counting the total number of common destination points accessible within 10 min on a low-stress bike network from an origin census block. A score between 0 and 100 is assigned to each destination type, in a stepped manner. Finally, the scores for all census blocks are aggregated as a weighted average to obtain a single score for the whole city [31].
The PfB analysis is unique because the methodology is based on OpenStreetMap (OSM), a crowd-sourced project to create a free and editable map of the entire world. The use of OSM, although complemented by open governmental data from the 2011 US Census, still makes the tool easier to implement in other areas where street network and points of interest information might not be available to the public. Therefore, this paper used the PeopleForBikes open source methodology as a guide to develop a BNA score for the city of Lisbon [32].

Data Acquisition and Preprocessing
The computation of the BNA score for Lisbon in this paper is based solely on open data. The main sources used were OpenStreetMap (OSM), Lisbon Municipal Council's Spatial Open Data and Open Data Portal, and the National Statistics Institute.
The OSM data were downloaded in February 2018 through the built-in QGIS tool and loaded into a PostgreSQL database using two tools: osm2pgrouting, with the bicycle map configuration .XML file, to generate the edge/segment and node/intersection tables; and osm2pgsql, with the .STYLE file from PeopleForBikes, to generate the point, line, and polygon base data.
The Lisbon Municipal Council (Câmara Municipal de Lisboa, CML) data consisted of Lisbon's municipal boundary, the parishes belonging to the Council, and the existing cycleways. The parish data were merged with the corresponding population data obtained from the 2011 Portuguese Census.
The PeopleForBikes methodology uses census blocks as the unit of analysis. However, information at this administrative level was not available for the study area. Therefore, a hexagonal grid was laid over Lisbon's municipal area to perform the analysis at a higher spatial resolution and evaluate the results locally. A hexagonal grid was selected over a rectangular one because it reduces edge-effect bias and generates a more symmetrical neighborhood, which makes it more convenient for connectivity analyses [33]. Each hexagon in the grid had an x-spacing of 200 m. This value was chosen arbitrarily, but with the underlying idea that all destinations in a single cell should be within a reasonable walking distance of each other. The population of each cell was estimated from the parish population data by dividing the total population of a parish by the number of grid cells that belong predominantly to that parish.
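The population estimate described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name, the cell and parish identifiers, and the population figures are hypothetical, and the mapping of each cell to its predominant parish is assumed to have been computed beforehand.

```python
from collections import Counter


def estimate_cell_population(cell_to_parish, parish_population):
    """Split each parish's population evenly over the cells assigned to it.

    cell_to_parish: {cell_id: parish_id}, the parish each cell predominantly
        belongs to (assumed to be precomputed).
    parish_population: {parish_id: total population of the parish}.
    """
    cells_per_parish = Counter(cell_to_parish.values())
    return {
        cell: parish_population[parish] / cells_per_parish[parish]
        for cell, parish in cell_to_parish.items()
    }


# Hypothetical toy input: three cells fall mostly in parish "A", one in "B".
cell_to_parish = {"c1": "A", "c2": "A", "c3": "A", "c4": "B"}
parish_population = {"A": 3000, "B": 500}
cell_population = estimate_cell_population(cell_to_parish, parish_population)
```

With this rule, every cell of a parish receives the same population, which is the uniform-distribution assumption the text makes explicit later on.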
For the pre-processing, all the data were projected to ETRS89/Portugal TM06 and clipped to the study area outline. Next, the point and polygon data were combined into one homogeneous layer of common destination points. The centroids of the polygons were calculated and appended to the point data. For polygons that covered large areas (nature reserves and parks), a spatial join with the hexagonal grid was performed to assign a destination to each cell centroid that intersects these polygons.
Finally, the segment data were built from attributes such as maximum speed, number of lanes, and road type, taken from OSM tags. To ensure that the official cycleways in the CML data were included in the OSM segments, an intersection between the two was performed. An additional variable, the mean slope, was added for this analysis, given the challenging terrain of the study area. The slope was calculated for each segment from the Digital Terrain Model raster (49 m resolution), obtained from the Portuguese Open Data Portal.
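The text does not state how the mean slope of a segment was derived from the raster. One plausible reading, sketched here purely as an assumption, is to sample elevations at points along the segment and divide the accumulated absolute rise by the horizontal length. The function name and the sample coordinates are hypothetical; coordinates are assumed to be in metres, as in a projected system such as ETRS89/Portugal TM06.

```python
import math


def mean_slope_percent(points):
    """Mean absolute slope (in %) along an ordered list of (x, y, z) points.

    Assumption: x, y are projected coordinates in metres and z is the
    elevation in metres sampled from a terrain model at each point.
    """
    rise = 0.0  # accumulated absolute elevation change
    run = 0.0   # accumulated horizontal distance
    for (x0, y0, z0), (x1, y1, z1) in zip(points, points[1:]):
        run += math.hypot(x1 - x0, y1 - y0)
        rise += abs(z1 - z0)
    if run == 0:
        return 0.0
    return 100.0 * rise / run


# Hypothetical segment: climbs 3 m over 50 m, then descends 1 m over 50 m.
segment = [(0.0, 0.0, 10.0), (50.0, 0.0, 13.0), (100.0, 0.0, 12.0)]
slope = mean_slope_percent(segment)  # 4 m of rise over 100 m of run
```

Using the absolute rise treats uphill and downhill equally, which suits an undirected network; a directed analysis would keep the sign.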

Biking Network Stress Levels Classification
The segment data were categorized into two classes: low and high stress. This classification follows the PeopleForBikes simplification of the LTS mentioned in Section 3. Whether a segment is of high or low stress is determined from five variables: the maximum speed, whether or not it lies in a residential area, the number of lanes, the slope, and the presence of a bicycle tag in the OSM information. These variables are used to classify each type of segment, as identified by its OSM tag. The criteria followed can be found in Table 1.

Bike Network Analysis (BNA)
A network analysis was performed to count, for each cell separately, the number of destinations reachable within a biking distance of six kilometers on the low-stress network. First, the hexagonal grid and destinations were spatially joined, so that the number of destinations per type was known for each cell. A spatial join was also performed between the grid and the nodes of the low-stress network, to select only those cells that are reachable with the low-stress network. For each of these, the centroid was computed, after which the nearest node to each centroid was found (Figure 2). By doing this, each reachable cell was represented by one single node.
Table 1. Segment types, by OSM tag: motorized road network (road, primary, secondary and tertiary segments and links); residential roads (unclassified, residential, living street); pedestrian segments and footways; service lanes (public transport).
Shortest paths were computed between all the nearest nodes, using the Dijkstra algorithm. Only those that were shorter than six kilometers were kept. Since each node uniquely represents a grid cell, it was now known for each cell which other cells were reachable by the low-stress network within the six-kilometer limit. Combining this information with the destination counts, the number of reachable destinations of each type could be determined for each cell. These data were used as input for the calculation of the BNA scores.
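The reachability step can be illustrated with a small, self-contained Dijkstra search that stops expanding beyond six kilometers. This is a sketch under stated assumptions rather than the study's database workflow: the adjacency-list graph, the node and cell identifiers, and the destination counts are hypothetical, and the origin cell's own destinations are assumed to count as reachable.

```python
import heapq

MAX_DISTANCE_M = 6000  # six-kilometer biking distance used in the text


def reachable_nodes(graph, source, cutoff=MAX_DISTANCE_M):
    """Dijkstra from `source`; returns {node: distance} for distance <= cutoff.

    graph: {node: [(neighbour, length_in_metres), ...]} for the low-stress
    network only (hypothetical adjacency-list representation).
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, length in graph.get(node, []):
            nd = d + length
            if nd <= cutoff and nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return dist


def reachable_destinations(graph, node_to_cell, cell_destinations, source):
    """Sum destination counts per type over all cells reachable from `source`."""
    counts = {}
    for node in reachable_nodes(graph, source):
        cell = node_to_cell.get(node)
        if cell is None:
            continue  # node does not represent a grid cell
        for dest_type, n in cell_destinations.get(cell, {}).items():
            counts[dest_type] = counts.get(dest_type, 0) + n
    return counts


# Hypothetical toy network: a chain n1 - n2 - n3 - n4 (lengths in metres).
graph = {
    "n1": [("n2", 2500)],
    "n2": [("n1", 2500), ("n3", 3000)],
    "n3": [("n2", 3000), ("n4", 1000)],
    "n4": [("n3", 1000)],
}
node_to_cell = {"n1": "c1", "n2": "c2", "n3": "c3", "n4": "c4"}
cell_destinations = {
    "c2": {"park": 1},
    "c3": {"park": 2, "school": 1},
    "c4": {"hospital": 1},
}
counts_from_c1 = reachable_destinations(graph, node_to_cell, cell_destinations, "n1")
```

In this toy case the node n4 lies 6500 m from n1, so the hospital in cell c4 is not counted, while the parks and the school in cells c2 and c3 are.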

BNA Scoring
Each cell on the hexagonal grid was evaluated according to its capacity to reach the destinations within the study area, with a maximum score of 100 points. These destinations were not equally weighted, given that their number and importance differ from one another. For example, universities are not as common as parks; therefore, the ability of the network to reach one park would be rewarded with 30 points, whereas reaching a university campus or building would award the cell 70 points. Hence, different scoring processes were established for each type of destination, which can be observed in Table 2. The scores for each type of destination, following the scoring process described in Table 2, were then categorized into three distinct groups: Opportunities for those destinations related to education; Core Services for health, consumer goods, and social services; and Recreation, including parks and nature reserves. Within each of these categories, the scores for the different types of destinations were given a weight (Table 3) that allows the calculation of a weighted average per category. Finally, these averaged scores were aggregated again using the weights assigned to the categories, and the BNA score per cell was obtained. To calculate the overall score for the whole study area, a final weighted average was performed. This average is the sum of the scores of all cells in the grid, each weighted by the fraction of the population it contains, assuming that the population is evenly distributed within each parish.
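The two levels of aggregation described above, categories into a cell score and cells into a city score, both reduce to a weighted average. The sketch below is illustrative only: the 20% weight for Recreation is taken from the worked example in the text, whereas the weights for Opportunities and Core Services, the category scores, and the cell populations are hypothetical placeholders, since Table 3 is not reproduced here.

```python
def weighted_average(scores, weights):
    """Weighted average of `scores`, using the matching entries of `weights`."""
    total_weight = sum(weights[key] for key in scores)
    return sum(scores[key] * weights[key] for key in scores) / total_weight


# Category weights: recreation = 0.2 follows the text; the other two values
# are hypothetical placeholders, not the weights of Table 3.
category_weights = {"opportunities": 0.4, "core_services": 0.4, "recreation": 0.2}

# Hypothetical category scores for one cell.
category_scores = {"opportunities": 10.0, "core_services": 20.0, "recreation": 57.5}
cell_bna = weighted_average(category_scores, category_weights)

# City-wide score: cell scores weighted by the estimated cell population
# (hypothetical values).
cell_scores = {"c1": 20.0, "c2": 5.0}
cell_population = {"c1": 100.0, "c2": 300.0}
city_bna = weighted_average(cell_scores, cell_population)
```

Dividing by the sum of the weights means the same helper works whether the weights are fractions that sum to one or raw population counts.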
The complete scoring process can best be summarized using a short example. Assume that within a radius of 6 km around a cell there are five parks and two nature reserves. On the low-stress network, one can reach four of these parks and one nature reserve. Both parks and nature reserves follow scoring process A, so the first reachable park earns the cell 30 points, as does the reachable nature reserve. Reaching a second and a third park adds 20 points each, so for three reachable parks the cell has 70 points. There are two more parks in the radius, of which one can be reached, so the cell gets half of the remaining 30 points (100-70), i.e., 15 points. The 85 points for the four reachable parks and the 30 points for the nature reserve each count for 50% in the Recreation category, so the cell will have a Recreation score of $0.5 \times 85 + 0.5 \times 30 = 57.5$. This score counts for 20% in the computation of the final BNA score of the cell. How much weight this final score receives when the overall score is computed for the whole city depends on the fraction of the total population of Lisbon assigned to the cell.
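The worked example can be reproduced with a small function for scoring process A. The step values (30, 20, 20) and the proportional sharing of the remaining 30 points come from the example itself; the function name and the handling of edge cases, such as cells with fewer than three destinations in range, are assumptions made for this sketch.

```python
FIRST_STEPS = (30, 20, 20)  # points for the 1st, 2nd and 3rd reachable destination


def score_process_a(reachable, total):
    """Stepped score (0-100) for one destination type under scoring process A.

    reachable: destinations reachable on the low-stress network.
    total: all destinations of that type within the distance limit.
    """
    if total <= 0 or reachable <= 0:
        return 0.0
    reachable = min(reachable, total)
    stepped = min(reachable, len(FIRST_STEPS))
    score = float(sum(FIRST_STEPS[:stepped]))
    extra_total = total - len(FIRST_STEPS)        # destinations beyond the third
    extra_reached = reachable - len(FIRST_STEPS)  # of which reachable
    if extra_total > 0 and extra_reached > 0:
        remaining = 100 - sum(FIRST_STEPS)        # 30 points left to share
        score += remaining * extra_reached / extra_total
    return score


parks = score_process_a(reachable=4, total=5)         # 30 + 20 + 20 + 15
reserves = score_process_a(reachable=1, total=2)      # first step only
recreation = 0.5 * parks + 0.5 * reserves             # equal weights, as in the example
```

Under these assumptions the parks score 85, the nature reserves 30, and the Recreation category 57.5, matching the figures in the example.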

Low-Stress Biking Network
A low-stress set of segments was identified from the bike network classification, shown in cyan in Figure 3. The resulting low-stress network is very limited according to the classification criteria: only 9% of the 42,294 analyzed segments are classified as low-stress. The majority of these are existing biking facilities, with only a few additional roads considered suitable for commuter cycling. Although the results are tightly linked to the quality of the OSM data, certain factors might explain the limited low-stress network within the city context. The first is the maximum speed allowed on the segments, which is closely tied to the number of lanes and to residential areas. At least 75% of the segments allow a maximum speed equal to or higher than 50 km/h, with 105 km/h being the highest. Even if these are not the official values that the municipality handles, as they were generated by the osm2pgrouting tool, they give an idea of how the actual street network in the city is structured. According to a sociological study on the pedestrian and cycling practices in Portugal ([34], p. 298), "the automobiles' excessive speed is referred to [as an obstacle] specially by the (...) people who commute in Lisbon"; such segments are hence classified as high-stress.
Another decisive variable, especially within Lisbon's context, is the slope. Twenty-five percent of the segments in the analyzed network have a slope higher than 9%, with a mean slope of 6.6% and a maximum reaching 49%. In commuter cycling, it is known that, given the choice between a steep, shorter route and a flat, longer one, a person would choose the latter [35], as "the slope increases the amount of effort that cyclists need to make" ([36], p. 67).
Studies [35][36][37][38] do not talk about a threshold for the maximum percentage that an average commuter-cyclist would consider as too physically demanding. Therefore, the 10% chosen for this study was based on the authors' experience, considering that even this slope could be challenging for some people but that a lower threshold would imply an even more limited network. It should be noted in this context that the influence of slope on cyclists' stress level could become of far less importance, as the share of electric bikes is rising sharply in Northern Europe, opening up sustainable mobility to new markets that did not exist before [39]. As of 2017, Lisbon has a public bike sharing program consisting mainly of electric bikes. Eventually, this system should be expanded to 1410 bikes, of which two-thirds will be electrical [40]. Such initiatives, in combination with e-bike friendly infrastructure, lower the barrier of slope and could potentially increase the extent of the low-stress network.
Intersections were not evaluated as part of the network. The main reason for omitting them was the lack of sufficient information in the OSM data for all the nodes generated during the topology creation. Including intersections will require complementary open data and further assumptions regarding the road network. They are important when evaluating the status of a biking network, especially a high-traffic one, as the presence of stop signs and traffic signals increases perceived safety [35]. The PeopleForBikes approach does take them into consideration; therefore, their strategy will be closely followed to integrate intersections into the network analysis in future work.
The behavior of fellow road users is another factor that can influence the amount of stress that cyclists perceive. In Lisbon in particular, it is still hard to create a culture in which drivers of motorized vehicles acknowledge cyclists' right of way. The main reason is that only in 2013 did the road traffic regulations move from a motorized-centered perspective to one that includes cyclists' rights as well [34]. Since this change is so recent, it has yet to be embraced by daily commuters. Besides motorized traffic, pedestrian behavior can also turn the low-stress network into a highly stressful experience. One of the most noticeable examples is the use of cycleways as footways. In addition, pedestrians do not appear to have a cycling-city culture and therefore react unpredictably when they suddenly encounter a cyclist. These limitations raise the question of whether the resulting low-stress network for Lisbon can really be considered as such. At the same time, however, these cultural factors can hardly be accounted for in a quantitative index based on open data.

Lisbon's BNA Score
The connectivity of the low-stress network can be quantified by the scoring mechanism, which awards the city of Lisbon a score of 8.6 out of 100 possible points. Each grid cell was scored in order to obtain this overall score; the cell scores can be visualized in Figure 4. At first glance, the figure shows that the cells with high scores spatially correlate with the areas where the municipality's bike infrastructure is located; however, this is not always true. The cells corresponding to the central and northwestern axes score highly, whereas the cells along the riverside show lower scores even though acceptable bike infrastructure can already be found there. This is mainly because the destinations considered in the analysis cluster predominantly around the central areas and not the riverside (Figure A1 in the Appendix). As [41] indicate in their analysis, the current cycling network in Lisbon does not satisfy commuters' daily needs, as it was designed as a network for leisure trips, providing infrastructure for parks, gardens, and touristic areas. Furthermore, as also mentioned in Section 5.1, the hilly areas of the city limit the extent of the low-stress network, so that steep cells score low (Figure A2 in the Appendix).
The resulting score shows an extremely low performance of the bike network, which does not come as a surprise given the factors already discussed in Section 5.1. Obtaining a score of 100 points is difficult in a city where the biking culture and policies are not yet fully established and where the shift in mindset is still ongoing; however, it is important to recognize that there is considerable room for improvement, and that this process can follow the steps of cities where commuting by bike is part of daily life. This does not necessarily mean that there are cities around the world with a perfect BNA score. For example, Groningen in the Netherlands is one of the world's leading cities regarding bike infrastructure, and yet, when the PeopleForBikes organization applied their methodology to compute a BNA score for Groningen, it was awarded 75 points [42]. Among the highest rated cities in the USA, where the original tool was implemented, some places reach scores as high as 88 (Crested Butte, CO) and 85 (Provincetown, MA) (values taken from the PeopleForBikes BNA score visualizer), but it is important to note that these are small cities with fewer than 3000 inhabitants.
Comparing the value obtained for Lisbon with the PeopleForBikes BNA scores of US cities is not completely accurate, as the methodology differs. One of the biggest challenges of applying such a methodology outside of the USA is that the open data available in Europe is not the same as in the USA. As already mentioned, the PeopleForBikes BNA approach includes census block information with exact population counts, which was not easily and openly available for the Lisbon case, as far as the authors were aware. Additionally, their approach includes workplace data, which is also an important factor to consider, but which could not be obtained as open data from the European data portal or similar local portals for Lisbon. Nevertheless, seeing how other cities score, and analyzing the strengths and weaknesses of their biking policies, can guide Lisbon on its way to becoming a more sustainable city.

Future Research
Future research will focus on analyzing the set up of street networks to include segments and intersections altogether, as the PeopleForBikes approach considers. This will be an effort to translate the BNA score for any city in Europe, considering the data limitations and possible replacements or, even further, modifications to the way the BNA score is calculated, given the available data.
A validation procedure is also within the future research scope, so that the score is considered not only an exploratory index but a trustworthy tool for urban planners. The tool could serve to optimize the current bike network connectivity in cities, by including scenarios of potential low-stress segment locations within the bike network and observing how the BNA score improves as new bike infrastructure is planned.

Conclusions
The BNA score of Lisbon is 8.6 out of 100 points. The exploratory tool developed has shown how the use of open data, even with its limitations, can provide a quantification index of sustainable mobility for a city. For the Lisbon case study, the score per cell can be used by urban planners to thoroughly analyze the existing biking network, identifying specific areas where low-stress infrastructure should either be introduced or better connected. Additionally, the methodology can be used to evaluate proposals for new cycling infrastructure, and guarantee the introduction of a well-connected low-stress network for bicycle commuters.
The current overall score can be considered an extremely low performance and shows that there is still a long way to go until the bicycle can be considered a fully-fledged means of transport, able to compete with the car, as the municipality aims for. Of course, radical changes like this are slow, and it takes a lot of time for both inhabitants and policy makers to adapt to them, but, with a clear vision and a strong belief, significant improvements can be expected in the coming years.

Limitations
The major limitation of the BNA approach is the level of completeness and accuracy of the OpenStreetMap data for the study area. PfB is constantly encouraging current and potential users of their tool to also contribute to the mapping of their OSM area so that the score becomes more reliable.
Another issue is the weighted influence that destinations have on the overall score, as different weights can have a high impact on the obtained score. It is therefore important to always keep the same scoring criteria when comparing different study areas.
Finally, as seen in the background section, validation schemes have been proposed for this type of connectivity measure. However, the lack of open data and the labor intensity of qualitative methods could make it difficult for some cities to validate the computed index. Therefore, validation approaches that are automated, affordable and accessible should be explored to extend this tool to a broader audience. In addition, a good validation procedure should include an evaluation of the destination weights to ensure that their values match the actual importance given to them by commuters.