Research on Generating an Indoor Landmark Salience Model for Self-Location and Spatial Orientation from Eye-Tracking Data

Landmarks play an essential role in wayfinding and are closely related to cognitive processes. Eye-tracking data contain massive amounts of information that can be applied to discover the cognitive behaviors during wayfinding; however, little attention has been paid to applying such data to calculating landmark salience models. This study proposes a method for constructing an indoor landmark salience model based on eye-tracking data. First, eye-tracking data are taken to calculate landmark salience for self-location and spatial orientation tasks through partial least squares regression (PLSR). Then, indoor landmark salience attractiveness (visual, semantic and structural) is selected and trained by landmark salience based on the eye-tracking data. Lastly, the indoor landmark salience model is generated by landmark salience attractiveness. Recruiting 32 participants, we designed a laboratory eye-tracking experiment to construct and test the model. Finding 1 proves that our eye-tracking data-based modelling method is more accurate than current weighting methods. Finding 2 shows that significant differences in landmark salience occur between two tasks; thus, it is necessary to generate a landmark salience model for different tasks. Our results can contribute to providing indoor maps for different tasks.


Introduction
Wayfinding to a destination through an indoor or outdoor environment is a purposive, directed, and motivated behavior for efficiently finding one's way [1,2]. Wayfinding also involves a series of challenging behaviors that require participants to be aware of their self-location and to orient themselves [3] with the assistance of representative sensory cues from the external environment. Landmarks play an important role in providing guiding information for wayfinding in the physical environment [4,5] and can accelerate decision-making processes, especially at decision points for changing direction. Albrecht [6] discovered that landmarks had a strong relationship with participants' spatial cognition and memory. Clearly, landmarks play an essential role as wayfinding enhancers and navigational error reducers and can affect wayfinding tactics and strategies.
Given the importance of landmarks, it is necessary to measure the salience of different kinds of landmark. Raubal and Winter [7] proposed the first approach to automatically identifying landmarks and calculating landmark salience. They defined three different kinds of landmark salience: visual, semantic, and structural. For example, geographic objects are visually attractive if they are in
1. Can eye-tracking data be used to construct an indoor landmark salience model? If so, how can the accuracy of the salience results be ensured?
2. Are there any differences in landmark salience between self-location and orientation in indoor wayfinding? If differences occur, how can an indoor landmark salience model be built for self-location and orientation?
This study makes two main contributions. On the one hand, our feature selection method and weighting algorithm are beneficial for understanding the relationship between eye movement metrics and indoor landmark salience and can extend the calculation method for indoor landmark salience. On the other hand, comparing the differences in landmark salience between self-location and orientation is also helpful for researchers to redesign different indoor landmarks on navigation maps for various wayfinding tasks.
The rest of this article is organized as follows. The related work is presented in Section 2. Section 3 presents the method used to construct the indoor landmark salience model. Section 4, a case study, is designed to test the model and compare it with the differences in landmark salience in two tasks. Section 5 discusses the important factors for the construction of the landmark salience model and compares it with previous studies. Section 6 ends this report with a conclusion and directions for future research.

Indoor Landmark Salience Models
Landmarks are important features in route directions during wayfinding. Sorrows and Hirtle [16] defined landmarks as prominent objects that individuals use as a reference point to help them in memorizing and recognizing routes, as well as locating themselves in terms of their ultimate destination. The aim of landmark identification is to find all the geographic objects in a given region that may in principle serve as a landmark [17]. To quantitatively compute landmarks, the concept of landmark salience has been proposed. Landmark salience is based on the concept of attractiveness, which reflects the importance of each landmark. The generation of a landmark salience model includes two major components, landmark salience attractiveness and weighting methods.
On the one hand, Raubal and Winter [7] presented the first approach to classifying landmark salience attractiveness, dividing landmark salience into three types of attractiveness (visual, semantic and structural salience) to identify landmarks. Based on this finding, Elias [18] introduced building labels, building density and road orientation to describe the salience of geographic objects. Richter and Winter [17] defined the formal model for landmark salience, which includes four measures of visual attractiveness: the façade area, shape, color, and visibility. Zhu [19] proposed the façade area, the board size and design features to calculate the salience of indoor landmarks. However, there is no consensus regarding the salience attractiveness classification of geographic objects.
ISPRS Int. J. Geo-Inf. 2020, 9, 97
On the other hand, it is necessary to weight salience attractiveness. Currently, weighting methods, such as questionnaires, documentary sources and expert knowledge, are used to measure landmark salience. Mummidi and Krumm [20] calculated salience by comparing the number of times a specific n-gram appears in a cluster (term frequency) with the number of times the same n-gram appears in all clusters combined (document frequency). Wang [21] combined expert knowledge and proposed some definitions from cognitive and computational perspectives to evaluate indoor landmark salience. However, such methods are cumbersome and labor intensive. Furthermore, their results depend on the available data, which are often scarce, so they fail to cover landmarks outside those data.
In addition, recent years have seen rapid advances in indoor spatial data modelling and an increasing availability of indoor geographic information system (GIS) data [22]. As a result of these rapid advances in indoor data modelling, many innovative indoor location-based service (LBS) applications have been developed, such as indoor wayfinding [23]. Thus, the indoor landmark salience models have been researched in recent years. Researchers [14,19] have proposed indoor landmark salience models based on visual, semantic and structural attractiveness, which is similar to the formal outdoor salience model. However, these attractiveness parameters (visual, semantic and structural) in the outdoor salience model cannot be directly applied to the indoor salience model. On the one hand, the landmark attractiveness factors in indoor spaces differ from those in outdoor spaces. Although Li L [24] has proposed that outdoor landmark attractiveness (shape factor, color and size) can be applied to describe indoor landmark salience, the cultural and historical importance in outdoor landmark attractiveness cannot be directly taken to describe landmarks in indoor environments, such as malls or airports. On the other hand, the spatial arrangement of indoor spaces differs from that of outdoor spaces. For example, multiple kinds of object can be regarded as landmarks in outdoor environments [25], such as churches, shopping malls, and bridges. However, these objects cannot be regarded as landmarks in indoor environments [26]. Lyu [27] mentioned that landmarks can be classified into four types: architecture (pillars and fronts), function (doors, stairs, and elevators), information (signs and posters) and furniture (tables, chairs, benches and vending machines). Thus, it is essential to propose a landmark salience model for indoor environments.

Differences in Landmark Salience during Wayfinding
The current progress in the cognitive sciences relevant to wayfinding investigates how to identify relevant landmarks, how to improve route instructions, and how to compute a better route [3]. Landmarks at decision points are important features in route directions during wayfinding. However, there are a large number of possible landmarks that can be included in route instructions in different situations and for different travelers [28]. Different travelers will find different landmarks to be most useful in a given situation.
There are three important dimensions (personal, navigation system-related and environmental) that impact the differences in landmark salience in wayfinding [29]. Among these dimensions, the personal dimension plays the most important role in person-centric navigation, and it is in this dimension that the most differences in wayfinding occur. For instance, Nuhn [8] identified the personal dimension and its attributes by taking into account five dimensions: personal knowledge, personal interests, personal goals, personal background and individual traits. This author proposed a personal landmark salience model based on these dimensions. In addition, the dimensions of a wayfinding task emerge based on landmark differences. Although task inference has been widely researched in pedestrian navigation, few researchers have further investigated the task dimensions in landmark salience, especially the landmark differences in person-centric wayfinding.
Wayfinding is a cognitive behavior for finding a distal destination with a series of tactical and strategic tasks [30], including reading a map, remembering the route, finding one's location and maintaining one's orientation with external features or landmarks. Various tasks result in different forms of visual attention and cognitive behaviors with regard to landmarks. Thus, landmark salience keeps changing as participants accomplish different tasks. Two crucial behaviors during wayfinding are self-location and spatial orientation [31]. In self-location, one identifies his or her position in a spatial setting, and it includes several sub-processes, such as map orientation, feature matching, and configuration matching [17]. Spatial orientation is closely related to self-location; it involves determining the direction that one is facing when given an external instruction (cognitive or real maps) [32]. Wiener [33] reported a gaze bias between free exploration and pre-set route tasks. Participants displayed a significant tendency to choose the path leg offering the longest line of sight during free exploration, but that trend did not occur in the chosen route group. Wang [21] reported that although males and females had similar levels of effectiveness and efficiency in self-location, route memorization, and route following, there was a significant difference between them in map reading and indoor wayfinding tasks. However, little research has measured the differences in landmark salience between the self-location and orientation tasks in indoor environments.

Eye-Tracking for Task Differences in Landmark Salience
There are two important factors in comparing the task differences in landmark salience models: landmark salience calculation and statistical analyses of task differences. Researchers have adopted multiple methods [34][35][36] to calculate landmarks during wayfinding, such as questionnaires, pose estimation, and eye-tracking methods. Among these methods, eye-tracking can directly capture visual attention to landmarks in a quantitative way. Both quantitative and qualitative analyses of eye-tracking data can be applied to determine whether significant differences in salience occur during wayfinding.
On the one hand, in recent years, the eye-tracking method has increasingly been used to analyze spatial cognitive performance in landmark identification because the user's gaze can provide an easy, fast and natural way to capture visual behaviors with regard to landmarks [37]. In addition, eye-tracking data assist researchers in analyzing gaze performance in quantitative ways [38]. For instance, eye-tracking data have been used as a rich data source for mining visual attention to landmarks during wayfinding [39,40]. Only recently have eye-tracking data been used by Jia [14] to calculate the visual attractiveness of landmarks. However, this author only resolves the problem of calculating the visual salience model. The use of eye-tracking data to generate a landmark salience model (visual, semantic and structural attractiveness) has not been tested. Researchers have applied the eye-tracking method to analyze semantic and spatial information [21,40]. For example, Raubal [7] proposed that city maps and street graphs are complemented with images and content databases, which could provide visual data as well as semantic and structural data. Wang [21] extracted areas of interest (AOIs) on indoor maps to analyze the semantic information of landmarks. Kiefer [41] proposed that the eye-tracking method could be used to analyze route information and landmarks at decision positions, reflecting the structural attractiveness of landmarks. Thus, it is theoretically possible to calculate the semantic and structural salience of landmarks from eye-tracking data.
On the other hand, participants produce different eye movement patterns with regard to landmarks as their tasks change [41,42]. Kiefer [43] applied machine learning methods to detect six common map activities from eye-tracking data, proving that eye-tracking data can be applied to distinguish user tasks. Liao [15] demonstrated that wayfinding tasks (self-location, orientation, route remembering) can be inferred by eye-tracking data in outdoor environments, opening the door to potential indoor wayfinding applications that can provide task-related information depending on the task that a user is performing. However, little research has measured the differences in landmark salience between self-location and orientation based on eye-tracking data.

Indoor Landmark Salience Model
Based on the discussion of the related work, eye-tracking data can be applied to measure landmark salience. The core of this calculation method consists of regarding eye-tracking data as a mediator that is taken to measure the coefficient of landmark salience attractiveness (visual, semantic, structural). For example, Jia [14] proposed a visual salience model of landmarks based on eye-tracking experiments.
The key to this author's method consisted of using eye-tracking data to represent a salience model with controllable and computable visual attractiveness, which is essential to improve the accuracy of the salience model and to provide a solution that addresses landmark discrimination in interactions between humans and environments. However, the author did not consider structural or semantic attractiveness, and the landmark differences between various tasks were not discussed. In this section, we propose a method to generate an indoor landmark salience model based on eye-tracking data that considers the differences in self-location and orientation tasks.

Landmark Salience Based on Eye-Tracking Data
We defined landmark salience based on eye-tracking data as the salience results of objects calculated by eye-tracking data. Landmark salience based on eye-tracking data ($S_{eye}$) was calculated based on the stimulated landmark salience ($S_{sti}$), while the accuracy of $S_{eye}$ was determined in two ways: by selecting the eye-tracking data and by calculating the coefficients of the eye-tracking data.

Stimulated Landmark Salience
The concept of stimulated landmark salience $S_{sti}$ was introduced by Jia [14]. $S_{sti}$ is measured by the percentage of participants who selected the object as a landmark in one specific setting [14]. In this paper, we selected eight indoor scene images as the specific settings. Participants were required to observe these images and select their favourite landmark in each of these images. The most attractive landmarks in each of these images were selected to calculate the stimulated landmark salience ($S_{sti}$). The percentage of participants who chose the most attractive landmark was measured as the result of $S_{sti}$. $S_{sti}$ played an important role in calculating $S_{eye}$. On the one hand, eye-tracking data that had no statistically significant relationship with $S_{sti}$ were deleted; on the other hand, $S_{sti}$ was taken to measure the coefficients of the eye-tracking data.

Data classification
Based on previous studies, eye-tracking data (fixations, saccades and pupil) have been widely adopted in eye-tracking studies [15,44]. The quantitative analysis of visual search strategies was closely related to eye-tracking data in predetermined areas of interest (AOIs) [45]. Thus, eye-tracking data are collected in both AOIs and the total area. The description of each example of eye-tracking data is provided in Table 1.

Data selection
Eye-tracking data that were significantly different from the stimulated landmark salience $S_{sti}$ were selected to calculate landmark salience.

• Normalization
Normalization can transform a dimensional expression into a dimensionless expression so that the indexes of different units or scales can be compared and weighted. The features are converted to a decimal value ranging from 0 to 1 through min-max normalization [46].
• Selection process
To avoid the uncertainty errors associated with landmark salience based on eye-tracking data, the normalized features should have a statistically significant relationship with the salience results. Stimulated landmark salience ($S_{sti}$) was regarded as the landmark salience result. Then, one-way ANOVA was used to measure the significant differences between eye-tracking data and stimulated landmark salience ($S_{sti}$). Only significant features (p < 0.05) were selected in this research.
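The two steps above can be illustrated with a minimal sketch. The metric names, the raw values and the p-values are hypothetical placeholders rather than data from the experiment, and the p-values are assumed to come from a separate one-way ANOVA run in a statistics package.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the range [0, 1] (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature cannot be rescaled; map it to 0.0 (assumption).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_significant(p_values, alpha=0.05):
    """Keep the names of features whose ANOVA p-value is below alpha."""
    return [name for name, p in p_values.items() if p < alpha]


# Hypothetical fixation durations (ms) for four landmarks.
fixation_duration_ms = [200.0, 450.0, 300.0, 700.0]
normalized = min_max_normalize(fixation_duration_ms)

# Hypothetical p-values for three eye-tracking metrics.
p_values = {
    "aoi_fixation_duration": 0.012,
    "saccade_count": 0.310,
    "pupil_diameter": 0.048,
}
selected = select_significant(p_values)
```

Only the metrics in `selected` would then be passed on to the weighting step.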

Algorithm selection
The weighting algorithm is an essential method to calculate the correlation. To guarantee the reliability of the correlation results, five weighting methods were used in our research. These methods include partial least squares regression (PLSR), the Analytic Hierarchy Process (AHP), the entropy weight method (EWM), the standard deviation method (SDM) and the CRITIC method. These five weighting methods are classic and commonly used. The weighting results were calculated by SPSS 11.0. The most accurate algorithm was selected in this paper.
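To make two of these methods concrete, the following sketch shows the textbook forms of the standard deviation method and the entropy weight method. It is an illustration only: the weights in this study were computed in SPSS, and the function names and the example matrix are assumptions. Each column holds one normalized, non-negative feature observed over several landmarks.

```python
import math
import statistics


def standard_deviation_weights(columns):
    """SDM: weight each feature by its share of the total standard deviation."""
    sds = [statistics.pstdev(col) for col in columns]
    total = sum(sds)  # assumed non-zero, i.e. at least one feature varies
    return [sd / total for sd in sds]


def entropy_weights(columns):
    """EWM: features whose values are spread less evenly get larger weights."""
    m = len(columns[0])  # number of landmarks (samples)
    divergences = []
    for col in columns:
        col_sum = sum(col)
        shares = [v / col_sum for v in col]
        # Zero shares contribute nothing, since p * ln(p) tends to 0.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(m)
        divergences.append(1.0 - entropy)
    total = sum(divergences)  # assumed non-zero
    return [d / total for d in divergences]


# Hypothetical normalized features for four landmarks (one list per feature).
columns = [
    [0.1, 0.2, 0.3, 0.4],
    [0.0, 0.0, 0.5, 1.0],
]
sdm = standard_deviation_weights(columns)
ewm = entropy_weights(columns)
```

Both methods derive weights from the spread of the data alone, whereas PLSR also uses the response variable, which is one reason the methods can disagree.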

Accuracy test
The precision of the weighting method was tested by the absolute difference between the stimulated salience ($S_{sti}$) and the landmark salience calculated by eye-tracking data ($S_{eye}$). A smaller difference indicates a more accurate weighting method.

Calculating Process
Based on a previous finding [14], $S_{eye}$ was measured by the product of the eye-tracking data and their coefficients, and $S_{sti}$ was taken to calculate the coefficients of the eye-tracking data. Thus, the formula of landmark salience based on eye-tracking data $S_{eye}$ is proposed as follows:
$$S_{eye}(x) = \sum_{i=1}^{n} e_i \lambda_i(x)$$
where x is the name of a landmark, n represents the number of eye-tracking data types, $\lambda_i$ denotes the i-th type of eye-tracking data, and $e_i$ is the coefficient of $\lambda_i$. The larger the value of $e_i$ is, the greater the importance of $\lambda_i$.
The calculation process of landmark salience based on eye-tracking data is presented in Table 2.
Table 2. Calculation method for landmark salience based on eye-tracking data.
Input: eye-tracking data λ and stimulated landmark salience S_sti
Output: landmark salience based on eye-tracking data S_eye
for each p-value β_i calculated by one-way ANOVA for λ_i and S_sti do
    if β_i < 0.05 then
        keep λ_i
    else
        delete λ_i
    end
end /* feature selection */
for each of the five weighting algorithms do
    calculate S_eye and the absolute difference between the stimulated salience (S_sti) and the landmark salience based on eye-tracking data (S_eye);
end
select the weighting algorithm that results in the lowest difference value as the most accurate algorithm /* weighting algorithm comparison */
for the selected weighting algorithm do
    calculate the coefficients of the eye-tracking data (e) and establish the landmark salience based on eye-tracking data (S_eye)
end /* landmark salience based on eye-tracking data */
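The comparison step of Table 2 can be sketched in a few lines. This is a simplified illustration, not the implementation used in the study: the coefficient sets are assumed to have been produced beforehand by each weighting method, the per-landmark differences are averaged (the averaging is an assumption), and all names and values are hypothetical.

```python
def salience_from_eye_data(features, coefficients):
    """S_eye(x): sum of coefficient * feature value for one landmark x."""
    return sum(coefficients[name] * value for name, value in features.items())


def mean_absolute_difference(s_sti, s_eye):
    """Average |S_sti(x) - S_eye(x)| over all landmarks."""
    return sum(abs(s_sti[x] - s_eye[x]) for x in s_sti) / len(s_sti)


def pick_weighting_method(landmark_features, s_sti, coefficient_sets):
    """Return (best_method, errors), where errors maps method -> mean difference."""
    errors = {}
    for method, coefficients in coefficient_sets.items():
        s_eye = {
            x: salience_from_eye_data(feats, coefficients)
            for x, feats in landmark_features.items()
        }
        errors[method] = mean_absolute_difference(s_sti, s_eye)
    best = min(errors, key=errors.get)
    return best, errors


# Hypothetical selected, normalized features for two landmarks.
landmark_features = {
    "store": {"fixation": 0.8, "saccade": 0.4},
    "sign": {"fixation": 0.2, "saccade": 0.6},
}
s_sti = {"store": 0.7, "sign": 0.3}
coefficient_sets = {
    "method_a": {"fixation": 0.75, "saccade": 0.25},
    "method_b": {"fixation": 0.50, "saccade": 0.50},
}
best_method, errors = pick_weighting_method(landmark_features, s_sti, coefficient_sets)
```

In this toy example `method_a` reproduces the stimulated salience almost exactly, so it would be the method retained for the final coefficients.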

Shape features
An outstanding shape is an essential salient attribute. According to the definition by Richter and Winter [17], the shape factor and deviation were selected. Put simply, the shape factor is the ratio of height to width. Deviation is the ratio of the area of the minimum-bounding rectangle (mbr) of the object's façade to its façade area [26]. Unusual shapes and deviations, especially among more regular, box-like objects, are highly remarkable.
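As a worked example of these two measures, the sketch below computes them for a rectangular store front and for an L-shaped one. The dimensions are invented for illustration, and the function names are not taken from the cited work.

```python
def shape_factor(height, width):
    """Ratio of façade height to façade width."""
    return height / width


def deviation(mbr_area, facade_area):
    """Ratio of the minimum-bounding-rectangle area to the façade area."""
    return mbr_area / facade_area


# Rectangular façade, 3 m high and 6 m wide: the mbr equals the façade.
rect_shape = shape_factor(3.0, 6.0)
rect_deviation = deviation(3.0 * 6.0, 3.0 * 6.0)

# L-shaped façade inside the same 3 m x 6 m bounding rectangle,
# with a 1.5 m x 3 m corner missing.
l_shape_area = 3.0 * 6.0 - 1.5 * 3.0
l_deviation = deviation(3.0 * 6.0, l_shape_area)
```

A deviation of 1 indicates a box-like façade, and larger values indicate an increasingly irregular outline.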

Colour features
A landmark is salient if its colour or lightness contrasts with the surrounding objects. We used the hue error (∆h) and lightness to measure landmark colour. The hue error can be used to compare the hue value difference between a landmark and an indoor environment. The RGB values of the landmark and floor were converted to LAB by Photoshop, and the hue errors were calculated by ColorTell (www.colortell.com). Lightness was measured by whether the landmark contained a bright section (window, door, pictures). If there was a bright section in the landmark, then the value of lightness was 1.
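For readers without access to these tools, the sketch below shows one way such a hue difference could be computed. It is an assumption, not the procedure used by Photoshop or ColorTell: it applies the standard sRGB to CIELAB conversion (D65 white point) and takes the hue error as the smallest angular difference between the two CIELAB hue angles.

```python
import math


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (D65 white point)."""
    def to_linear(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (to_linear(c) for c in rgb)
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


def hue_angle(lab):
    """CIELAB hue angle in degrees, in [0, 360); unreliable for near-grey colours."""
    _, a, b = lab
    return math.degrees(math.atan2(b, a)) % 360.0


def hue_error(rgb_landmark, rgb_background):
    """Smallest angular difference between two hue angles, in [0, 180] (assumed definition)."""
    diff = abs(hue_angle(srgb_to_lab(rgb_landmark)) - hue_angle(srgb_to_lab(rgb_background)))
    return min(diff, 360.0 - diff)


# Hypothetical colours: a red shop sign against a blue floor.
delta_h = hue_error((255, 0, 0), (0, 0, 255))
```

Because hue is undefined for achromatic colours, a grey floor would need separate handling, for example a comparison of lightness instead.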

Façade area
The façade area is used to calculate the size of a landmark [26]. If the façade area of an object is significantly larger or smaller than the façade areas of the surrounding objects, then this object becomes highly noticeable. The façade area was measured as height multiplied by width.

Visibility features
Visual distance is used to measure visibility. Clearly, if the visual distance of a landmark is shorter than that of other objects, then the landmark will be more noticeable. Visual distance was measured as the shortest distance between the participant's location and the landmark.
The detailed information on visual attractiveness is shown in Table 3.

Semantic importance
This property reflects whether an object has an important meaning. Semantic importance represents the proportion of AOI fixation duration and total fixation duration during map reading tasks. The AOI includes the name and point symbol of an object on the map.
Just and Carpenter [47] mentioned that a longer fixation duration either means difficulty in understanding information or represents that the participants show more interest. However, the former explanation is rejected in this paper because participants were educated and can interpret the semantic information on the map. In addition, the participants were driven by the tasks to find and remember important landmarks on maps without a time limit. Thus, the longer the amount of time of visual attention is given to an AOI, the greater the semantic importance of the meaning of the object. Based on previous research [14,48], the AOIs include the name and symbol of objects with a buffer due to the imprecision in eye movements. The AOIs are shown in Figure 1.
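The ratio described above can be sketched as follows. The rectangular AOI, the buffer width in pixels and the fixation list are hypothetical, since the exact AOI geometry used in Tobii software is not reproduced here.

```python
def in_aoi(x, y, aoi, buffer_px=0.0):
    """True if point (x, y) lies inside the AOI rectangle grown by buffer_px."""
    left, top, right, bottom = aoi
    return (left - buffer_px <= x <= right + buffer_px
            and top - buffer_px <= y <= bottom + buffer_px)


def semantic_importance(fixations, aoi, buffer_px=0.0):
    """Share of total fixation duration that falls inside the buffered AOI."""
    total = sum(duration for _, _, duration in fixations)
    if total == 0:
        return 0.0
    inside = sum(d for x, y, d in fixations if in_aoi(x, y, aoi, buffer_px))
    return inside / total


# Hypothetical fixations on a map: (x, y, duration in ms).
fixations = [(100, 100, 300), (105, 118, 200), (400, 300, 500)]
# Hypothetical AOI around a store name and symbol: (left, top, right, bottom).
store_aoi = (90, 90, 120, 110)

without_buffer = semantic_importance(fixations, store_aoi)
with_buffer = semantic_importance(fixations, store_aoi, buffer_px=10)
```

The second fixation lands just below the label and is only counted once the buffer is applied, which is the situation the buffer is meant to handle.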

Explicit marks
An object may have explicit marks, such as signs on the front of a store. These signs explicitly label an object, communicating its semantics. This property was assessed by a Boolean value.

Degree of familiarity
This property indicates whether participants are familiar with a mark. First, the object must have a mark; otherwise, the value is 0. Then, the degree of familiarity was calculated by the proportion of participants who were familiar with the mark (Table 4).

Number of adjacent routes
Objects located at intersections are more important for route instructions than objects located along routes. If an object is adjacent to more than one route, then it is located at a street intersection and is therefore more suitable. To assess landmark salience, the number of edges adjacent to the object was stored.

Number of adjacent objects
Freestanding objects are more attractive than objects with many neighbours. This attribute is mainly important for signs and elevators because other objects, such as stores, are normally connected to other structures. The number of adjacent objects was stored to assess structural salience.

Location importance
Location importance indicates the attractiveness of objects caused by different locations. Location importance can be calculated by the distances between one object and the nearest nodes. Nodes are the intersections in a network [17]. In this paper, intersections in the David Mall indoor map (map.baidu.com) were selected as nodes.
where x is the node and y is an object, and d(y, x) denotes the distance between the node x and the object y. Since no two rooms occupy the same space, the distance between any x and y will not be 0. If the distance between x and y is less than 1 m, then L(x) is 1 (Table 5).
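The formula itself is not reproduced in this copy of the text. Purely as an illustration, the sketch below assumes an inverse-distance form, which is consistent with the remarks that the distance is never 0 and that the value is 1 below 1 m; the actual definition in the original may differ. Coordinates are hypothetical and given in metres.

```python
import math


def location_importance(object_xy, nodes_xy):
    """Assumed form: 1 / distance to the nearest node, capped at 1 below 1 m."""
    nearest = min(math.dist(object_xy, node) for node in nodes_xy)
    if nearest < 1.0:
        return 1.0
    return 1.0 / nearest


# Hypothetical intersections (nodes) on an indoor map.
nodes = [(3.0, 4.0), (10.0, 0.0)]
far_object = location_importance((0.0, 0.0), nodes)   # nearest node is 5 m away
near_object = location_importance((3.5, 4.0), nodes)  # nearest node is 0.5 m away
```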

Comparing the differences in landmark salience between self-location and orientation
It is essential to determine whether differences in landmark salience occur between self-location and orientation tasks before generating a landmark salience model for the two tasks. Landmark salience based on eye-tracking data $S_{eye}$ was calculated for both self-location ($S_{eye}^{\text{self-location}}$) and orientation ($S_{eye}^{\text{orientation}}$). One-way ANOVA was used to test for statistically significant differences between $S_{eye}^{\text{self-location}}$ and $S_{eye}^{\text{orientation}}$ using SPSS 11.0. If the result is significant (p < 0.05), landmark salience differs between self-location and orientation, and a landmark salience model can be generated for each task. Otherwise, it is meaningless to construct separate salience models for the two tasks.
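For readers who want to see what the test computes, the following sketch derives the one-way ANOVA F statistic and its degrees of freedom from first principles. It stops short of the p-value, which requires the F distribution and is usually obtained from a statistics package (SPSS in this study; `scipy.stats.f_oneway` is one open-source alternative). The salience values are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical S_eye values of the same landmarks under the two tasks.
s_eye_self_location = [0.62, 0.48, 0.71, 0.55]
s_eye_orientation = [0.35, 0.41, 0.30, 0.44]
f_stat, df_b, df_w = one_way_anova_f([s_eye_self_location, s_eye_orientation])
```

With two groups, this F statistic is the square of the t statistic from an independent two-sample t-test with pooled variance.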

Indoor landmark salience model for two tasks
Jia [14] proposed that $S_{eye}$ could be used to construct the coefficient of landmark salience attractiveness. If a significant difference occurs between $S_{eye}^{\text{self-location}}$ and $S_{eye}^{\text{orientation}}$, the coefficients differ between the two tasks. Thus, $S_{landmark}^{tasks}$ represents the landmark salience model for different tasks.
The formula of the landmark salience model is as follows:
$$S_{landmark}^{tasks}(x) = \frac{1}{3}\left(S_{visual}^{tasks}(x) + S_{semantic}^{tasks}(x) + S_{structural}^{tasks}(x)\right) \quad (8)$$
where f denotes landmark attractiveness and w is the coefficient of landmark attractiveness. The tasks include self-location and orientation.
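A minimal sketch of how Equation (8) might be evaluated is given below. It assumes that each component salience is a weighted sum of its attractiveness values f with task-specific coefficients w; the component formulas are not reproduced in this copy, so that form, the feature names and all numbers are illustrative assumptions.

```python
def component_salience(attractiveness, weights):
    """Assumed form: sum of w * f over the features of one component."""
    return sum(weights[name] * value for name, value in attractiveness.items())


def landmark_salience(landmark, task_weights):
    """Equation (8): mean of the visual, semantic and structural components."""
    parts = [
        component_salience(landmark[component], task_weights[component])
        for component in ("visual", "semantic", "structural")
    ]
    return sum(parts) / 3.0


# Hypothetical normalized attractiveness values f for one landmark.
store = {
    "visual": {"facade_area": 0.8, "hue_error": 0.4},
    "semantic": {"semantic_importance": 0.6},
    "structural": {"adjacent_routes": 0.5, "location_importance": 1.0},
}

# Hypothetical coefficients w, one set per task.
weights = {
    "self_location": {
        "visual": {"facade_area": 0.5, "hue_error": 0.5},
        "semantic": {"semantic_importance": 1.0},
        "structural": {"adjacent_routes": 0.4, "location_importance": 0.6},
    },
    "orientation": {
        "visual": {"facade_area": 0.25, "hue_error": 0.75},
        "semantic": {"semantic_importance": 1.0},
        "structural": {"adjacent_routes": 0.8, "location_importance": 0.2},
    },
}

salience_by_task = {task: landmark_salience(store, w) for task, w in weights.items()}
```

The same landmark receives a different score for each task because only the coefficients change, which mirrors the motivation for task-specific models.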

Apparatus
A Tobii X120 (Tobii AB, Sweden, www.tobii.com) eye tracker with a sampling rate of 120 Hz and a Samsung 22-inch monitor were selected. The X120 had a recording accuracy of 0.5° and might have a deviation of 0.1°. The spatial resolution was 0.3°, and the head movement error was within 0.2°. The visual tracking distance was between 50 and 80 cm. The monitor displayed the stimuli with a screen resolution of 1680 × 1050 pixels. Tobii Pro Analyzer software was used to manage and analyse the eye-tracking data. The research was capable of obtaining deep insights into visual saliency in regard to indoor pictures and maps by analysing eye-tracking data.

Procedure
The experiment was conducted in a quiet and well-lit room (Figure 2a). In the pre-test training, the participants were welcomed and were required to provide their personal information (gender, age, familiarity with the David Mall and experience using a computer in everyday life) and complete two skill tests. Previous research has reported that wayfinding tasks were examined in relation to spatial skills and self-reported skills [50]. The Mental Rotation Test (MRT) and the Santa Barbara Sense of Direction Scale (SBSOD) could be used to test spatial skills and self-reported skills, respectively. The participants were asked to complete two tests before the experiment to ensure they had similar skills. In addition, as the Tobii X120 did not allow the participants to view the stimuli again, the participants were told to remember the experimental stimuli during the experimental procedure. In the formal experiment, the participants were asked to complete three tasks (Figure 2b). The instructions for the participants are described below:
• Task #1 (landmark selection): Assume that you are shopping in a mall. When the experiment begins, you will view eight indoor scene images one at a time. Please select the most attractive landmark (store, bench, elevator or signs) in each image. When you find the result, click on the landmark to proceed. There is no time limit for you to find the landmark.
• Task #2 (self-localization): You will find your location on the map. First, you should observe an indoor scene image carefully and try to memorize the necessary landmark information as much as possible. You are not allowed to look at the image again. Then, you should find the location and click on it on the indoor map. Two locations should be found in this phase.
• Task #3 (orientation): You will find your orientation in the indoor scene image with the assistance of landmarks. You should remember the landmark information related to the route from A to B on the indoor map. Then, you will point out the correct orientation to get to B and click on it on the image. Two orientations need to be noted in this task. After that, the experiment ends.

Stimuli
Twelve panoramas were created as indoor scene images for the eye-tracking experiment. The panoramas were photographed using a Canon 800D camera with an 18-55 mm lens in Zhengzhou David Mall, China. The camera was fixed on a 1.5-m tripod. We took 60 pictures at each location. The computer-generated panoramas were made using PTGui (www.ptgui.com). However, 360° panoramas were not used because image distortions occurred as the 360° panoramas were dragged. In the meantime, it was difficult for the participants to recognize detailed information in the 360° panorama shown in one picture due to the limitation of screen size. Thus, we selected half of the panorama with a 180° visual angle (Figure 2c,d).
Based on the experimental procedure, eight indoor scene images were selected in task 1 ( Figure  3a-h). When the participants observed one image, they clicked on their favorite landmark in the stimuli. After the experiment, the most attractive landmarks, highlighted in Figure 3a-h, were defined as AOIs for analyzing the eye-tracking data. The AOIs were divided by research though Tobii Analyzer, which could be used to collect and calculate eye-tracking data within or without the AOIs. The display of these eight images followed the Latin square principle.
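The text states only that the display order of the eight task 1 images followed the Latin square principle; it does not say which square was used. As a minimal sketch, assuming a simple cyclic construction (the function name `cyclic_latin_square` is hypothetical), such an ordering could be generated as follows, with each row giving one presentation order:

```python
def cyclic_latin_square(n):
    """Return an n x n cyclic Latin square as a list of rows.

    Row i is the sequence 0..n-1 rotated left by i positions, so every
    stimulus index appears exactly once in each row and each column.
    """
    return [[(row + col) % n for col in range(n)] for row in range(n)]


# Eight stimuli, labelled with the panel letters of Figure 3 (a-h).
labels = "abcdefgh"
square = cyclic_latin_square(len(labels))
orders = [[labels[i] for i in row] for row in square]

# Assumption: presentation orders are assigned to participants row by row.
for row_number, order in enumerate(orders, start=1):
    print(row_number, "-".join(order))
```

A cyclic square balances the position of each image across orders, but it does not balance which image precedes which; a design that also controls carry-over effects would need a different construction.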
In task 2, four Baidu indoor maps (map.baidu.com) were selected as indoor 2D maps. Baidu indoor maps are widely used by the general public in China, which ensures that the participants have similar levels of familiarity. Photoshop was applied to re-mark all of the landmarks and to redesign the point symbols in the same pattern (Figure 4(a2,b2,c2,d2)). To observe the orientation behaviors in task 3, navigation routes from A to B were drawn on these redesigned maps (Figure 4(a3,b3,c3,d3)).
The participants were required to view images to find their locations in task 2 and to remember the navigation routes to find their orientations in task 3. To prevent the participants from observing the same indoor scene images in tasks 2 and 3, they were divided into two groups, and the order of the experimental stimuli differed between Groups 1 and 2. For the participants in Group 1, the order was Figure 4 a2-a1-b2-b1 in task 2 and Figure 4 c1-c3-d1-d3 in task 3. For Group 2, the order was Figure 4 c2-c1-d2-d1 in task 2 and Figure 4 a1-a3-b1-b3 in task 3.

Participants
A total of forty-six young male students majoring in cartography were recruited to join our pilot experiment as an experimental lesson. The results of five participants were omitted because their sample rates (calculated by Tobii Analyser) were below 80% [51]. Four participants were omitted because they did not pass SBSOD and MRT. Five participants did not continue to conduct the experiment because they were familiar with the David Mall. Thus, thirty-two participants continued to conduct the formal experiments. According to the experimental stimuli, the participants were divided into two groups. The sixteen participants in Group 1 were aged between 18 and 29 years old (mean age = 23.97, SD = 1.54). In Group 2, the sixteen participants were aged between 18 and 27 years old (mean age = 22.63, SD = 1.67).
All of the participants were familiar with computing technology. They all had normal or corrected-to-normal vision and could complete the experiment independently. The experiment was reviewed and approved by the local institutional review board (IRB). All of the participants provided their written informed consent to participate in the experiment.

Landmark Salience Based on Eye-Tracking Data
To answer question 1, with regard to whether eye-tracking data can be used to calculate landmark salience, eye-tracking data were selected and weighted using five algorithms. According to the participants' selection, the stimulated landmark salience ($S_{sti}$) results were 0.594, 0.906, 0.750, 0.594, 0.875, 0.813, 0.688 and 0.875 in Figure 3a-h, respectively.
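The reported values are consistent with each stimulated salience value being the share of the 32 participants who selected the highlighted landmark in an image (for example, 19/32 ≈ 0.594). That reading is an inference from the numbers, not a definition given in the text. The sketch below checks it by converting each reported value to the nearest whole number of participants and measuring how far the resulting proportion is from the reported value:

```python
N_PARTICIPANTS = 32  # participants in the formal experiment

# Stimulated landmark salience reported for Figure 3a-h.
s_sti = {"a": 0.594, "b": 0.906, "c": 0.750, "d": 0.594,
         "e": 0.875, "f": 0.813, "g": 0.688, "h": 0.875}

# Assumption: S_sti = (participants selecting the landmark) / 32.
implied_counts = {img: round(v * N_PARTICIPANTS) for img, v in s_sti.items()}

# Recompute the proportion from each implied count and compare it with
# the reported value; a gap within rounding error supports the assumption.
max_gap = max(abs(implied_counts[img] / N_PARTICIPANTS - v)
              for img, v in s_sti.items())

print(implied_counts)
print(f"largest gap: {max_gap:.4f}")
```

If the largest gap stays within three-decimal rounding error, the proportion reading cannot be ruled out; the counts themselves remain implied values, not data taken from the paper.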

Feature selection
All of the images (Figure 3a-h) in task 1 were used for feature selection. One-way ANOVA was used to test the significance of the relationships between stimulated landmark salience ($S_{sti}$) and the eye-tracking data ($S_{eye}$); the results are shown in Table 6. Seven features show significant differences with stimulated landmark salience. These features (total fixation duration, total fixation counts, gaze duration, fixation counts, total saccade duration, saccade duration and pupil difference) were selected to calculate the visual salience based on eye-tracking data. They cover the fixation, saccade and pupil types, ensuring that all types of eye metric were considered in our research.

Feature weighting
To build the visual salience formula based on eye-tracking data, the eye-tracking data for Figure 3a-f in task 1 were used to estimate the coefficients. Both statistical regression and weighting methods can be used to calculate the coefficients. To choose the best method, SPSS 11.0 was applied to compare these weighting algorithms. The results are shown in Table 7.

Results accuracy
The eye-tracking data for Figure 3g,h were used to test the accuracy of the weighting algorithms. Figure 5 shows the difference value (dv) results for different numbers of participants. The difference value of the SDM is higher than that of the other algorithms from five participants (dv = 5.86) to 32 participants (dv = 3.99), indicating that the SDM is the least accurate method for calculating the coefficients. PLSR yields the lowest difference values throughout, which shows that PLSR is the best weighting method in this experiment. This finding is consistent with previous evidence that PLSR is the most accurate method for visual salience based on eye-tracking data, as proposed by Jia [14].

Landmark salience based on eye-tracking data
Figure 5 shows that, for all of the algorithms, the difference values decrease as the number of participants increases. The average difference between 30 and 32 participants is 0.01, which suggests that the number of participants is sufficient for this research. The formula for landmark salience based on eye-tracking data is as follows:
$$S_{eye}(x) = 0.005\lambda_1 + 0.007\lambda_2 - 0.034\lambda_3 + 0.003\lambda_4 - 0.212\lambda_5 + 1.631\lambda_6 + 1.349\lambda_7 + 0.128$$
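Because the formula is a linear combination plus an intercept, it can be evaluated directly once the seven feature values for a landmark are known. The sketch below is illustrative only: it assumes that λ1 to λ7 correspond to the seven selected features in the order in which they are listed under Feature selection, which this section does not state explicitly, and the input vector is hypothetical.

```python
# Coefficients and intercept copied from the S_eye formula above.
COEFFICIENTS = (0.005, 0.007, -0.034, 0.003, -0.212, 1.631, 1.349)
INTERCEPT = 0.128


def s_eye(features):
    """Evaluate S_eye for one landmark from its seven feature values."""
    if len(features) != len(COEFFICIENTS):
        raise ValueError("expected exactly seven feature values")
    return INTERCEPT + sum(c * x for c, x in zip(COEFFICIENTS, features))


# Hypothetical feature vector (lambda_1 .. lambda_7), for illustration only.
example = (10.0, 20.0, 1.5, 5.0, 1.0, 0.3, 0.1)
print(round(s_eye(example), 4))
```

The units and any scaling of the inputs must match those used when the coefficients were fitted; they are not given here, so the printed number has no empirical meaning.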

Differences in Landmark Salience between Self-Location and Orientation
To answer question 2, with regard to whether differences in landmark salience occur between the self-location and orientation tasks, the visual salience of nineteen landmarks in tasks 2 and 3 was calculated based on eye-tracking data (formula in Section 3.1), and one-way ANOVA was applied to analyse the significant differences. The results are shown in Table 8.
Task 2 (self-location) differed significantly from task 3 (orientation) in landmark salience (F = 4.156, p = 0.048 < 0.05), indicating that the participants' visual behaviour towards the AOIs differed between tasks 2 and 3.
Differences in landmark salience between tasks 2 and 3 are present in each AOI. According to Table 8, nineteen AOIs show significant differences in landmark salience between the self-location and orientation tasks. The participants in task 3 paid significantly greater visual attention to store AOIs (AOI1 = 1.105, AOI5 = 0.789, AOI6 = 1.287, AOI11 = 1.074, AOI14 = 1.148, AOI16 = 1.085 and AOI19 = 0.723) than did those in task 2 (AOI1 = 0.851, AOI5 = 0.413, AOI11 = 0.663, AOI14 = 0.835, AOI16 = 0.817 and AOI19 = 0.339). The elevator AOIs show similar significant differences; the landmark salience of the elevator in task 3 is significantly higher than that in task 2, indicating that the participants paid more attention to store and elevator landmarks in the orientation task. However, it is difficult to determine the landmark salience of benches. Although the landmark salience of the bench AOI4 differs significantly between task 3 (AOI4 = 0.481) and task 2 (AOI4 = 0.346), AOI9 does not show the same tendency.
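For readers who want to see what the one-way ANOVA comparison computes, the following is a minimal standard-library sketch of the F statistic. The function name `one_way_anova_f` is hypothetical. The sample input uses only the six store AOIs for which both task values are quoted above, so its result is not expected to reproduce the reported F = 4.156, which is based on all nineteen AOIs.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: group size times the squared distance
    # of the group mean from the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Within-group sum of squares: squared deviations from each group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Store AOIs 1, 5, 11, 14, 16 and 19 (values quoted in the text).
task2 = [0.851, 0.413, 0.663, 0.835, 0.817, 0.339]
task3 = [1.105, 0.789, 1.074, 1.148, 1.085, 0.723]
f_stat, df_b, df_w = one_way_anova_f([task2, task3])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

With only two groups, F equals the square of the pooled-variance t statistic. Obtaining a p-value additionally requires the F distribution, which is omitted here to keep the sketch dependency-free.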

Landmark Salience Model for Self-Location and Orientation
The previous results in Section 4.2.2 show that differences in landmark salience occur between tasks 2 and 3. Thus, a landmark salience model can be constructed for each of the self-location and orientation tasks. Based on the modelling process, the landmark salience attractiveness was normalized, and the results were compared with the landmark salience based on eye-tracking data in tasks 2 and 3 using one-way ANOVA. Then, the landmark salience attractiveness factors with a significant difference were selected for regression through PLSR. The results are shown in Table 9.
Table 9 shows that the landmark salience model includes nine factors for both tasks 2 and 3. The brightness, explicit marks and adjacent object factors were not selected for landmark salience modelling. In addition, the coefficients of the selected factors were not the same. In task 2, the visual distance was significantly negatively correlated with $S_{eye}$. In task 3, the visual distance and degree of familiarity factors were negatively correlated with $S_{eye}$. According to Table 9, the landmark salience models are generated as follows:
The landmark salience model for self-location:

Discussion
In this section, we first analyse the important factors that influence the generation of the indoor landmark salience model. We then discuss the differences in landmark salience between the self-location and orientation tasks from the perspective of the participants and indoor environments. Finally, we compare the accuracy of our model with previous weighting methods and propose improvements to our model.

Important Factors for the Landmark Salience Model
As indicated by the previous findings in Sections 4.2.1 and 4.2.2, question 1 with regard to whether landmark salience can be calculated by eye-tracking data has been answered. In this part, three important factors for the construction of the landmark salience model are shown.
The first factor is the type of eye-tracking data. To test the reliability of the selected eye-tracking data (combined features), seven types of feature (combined features, fixation, saccade, pupil, fixation+saccade, fixation+pupil and saccade+pupil) were collected for the images of task 1, and PLSR was applied to calculate the coefficients. The difference values between the stimulated visual salience and the salience predicted from each of the seven feature types are shown in Figure 6. The combined feature has the lowest difference value (mean = 0.0022, SD = 0.0004), which indicates that combining fixation features with other types of eye-tracking data can improve the accuracy of visual salience.
The second factor is the weighting algorithm. The previous results in Section 5.1 prove that PLSR is the best algorithm in this research. There are two reasons for this conclusion. On the one hand, the selected eye-tracking data are significantly different from the stimulated visual salience (Table 8), and the ANOVA results based on SPSS show that the eye-tracking data follow a normal distribution, which means that the selected features can be used to establish multiple linear regression equations [14]. On the other hand, in PLSR, the stimulated visual salience and the eye-tracking data are treated as the dependent and independent variables, respectively, while the other algorithms consider only the variability of the eye-tracking data.
The last factor is the significance of landmark salience attractiveness. Table 9 reveals that the factor coefficients for the self-location and orientation tasks are different. For instance, the degree of familiarity is significantly positively correlated with landmark salience based on eye-tracking data in the self-location group, but it is negatively correlated in the orientation group.

Differences in Landmark Salience between Self-Location and Orientation
Section 4.2 answered question 2 with regard to whether differences in landmark salience occurred between the self-location and orientation tasks. To explain this problem, the participants' visual behaviours and indoor environments are analysed in this section.

Differences in the Participants' Visual Behaviours
A t-test was applied to test whether the participants' visual behaviours differed significantly between self-location and orientation (Table 10). Significant differences between the two tasks were found in four types of eye-tracking data (total fixation duration, gaze duration, total saccade duration and saccade duration). For instance, the mean gaze duration in the self-location task was 1.124 s (SD = 0.535), which is significantly shorter than that in the orientation task (mean = 1.818 s, SD = 0.789).
Similar differences in total saccade features occur. The total saccade duration of the groups in tasks 2 and 3 was 1.059 s (SD = 0.176) and 1.597 s (SD = 0.126), respectively, which indicates that the participants in the self-location group had a significantly shorter saccade duration than did the orientation group (t = −20.207, p < 0.001). However, there are no significant differences in pupil features (pupil size, AOI pupil size and pupil difference) between the self-location and orientation groups, which indicates that the participants have similar pupil behaviours with regard to the AOIs in tasks 2 and 3.
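To illustrate how a t statistic follows from summary statistics of this kind, the sketch below implements Welch's unequal-variance t-test from means, standard deviations and sample sizes. This is an assumption on two counts: the text does not say which t-test variant was used, and the number of observations behind each mean is not reported here, so `N_PER_TASK` is a placeholder and the printed value is not expected to match Table 10.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's unequal-variance t-test from summary data."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Gaze duration means and SDs quoted above; N_PER_TASK is a placeholder
# because the number of observations per task is not reported here.
N_PER_TASK = 16
t_stat, df = welch_t(1.124, 0.535, N_PER_TASK, 1.818, 0.789, N_PER_TASK)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

A negative t here simply reflects that the first mean (self-location) is smaller than the second (orientation), which matches the direction of the reported statistics.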

Differences in Indoor Environments
In this section, we select gaze duration and saccade duration. The t-test results show that the participants pay significantly different amounts of attention to indoor landmark types between self-location and orientation (Figure 7).
The participants in the orientation group had a significantly longer gaze duration (mean = 2.242 s, SD = 0.547) and saccade duration (mean = 0.401 s, SD = 0.141) on store landmarks than did the participants in the self-location group (mean = 1.179 s, SD = 0.442; mean = 0.207 s, SD = 0.107). A similar trend also occurs for the elevator landmark. The mean gaze duration and saccade duration of the participants in the orientation group were 1.725 s (SD = 0.858) and 0.255 s (SD = 0.121), respectively, which were significantly higher than those in the self-location group, indicating that the store and elevator landmarks were significantly more attractive to the orientation group than to the self-location group. Although the participants in the orientation group had a significantly shorter gaze duration on signs (t = −2.698, p = 0.004), the saccade duration on signs did not show a significant difference between the self-location and orientation groups (t = −0.521, p = 0.309).
Figure 7. Comparison of the differences in visual salience among stores, elevators, signs and benches between tasks 2 and 3. * means p < 0.05. (a) Gaze duration differences between tasks 2 and 3; (b) Saccade duration differences between tasks 2 and 3.
There are two reasons for this phenomenon. First, we selected an indoor mall as the experimental environment; stores are the most important landmarks in a shopping centre, and participants are prone to observe store landmarks. Second, according to the landmark salience model, explicit marks play an important role in semantic salience, which indicates that it is easier for participants to find this factor attractive. Store landmarks have explicit marks, but the others do not.

Compared with Previous Research
Previous research has mainly calculated landmark salience by equal weighting, expert knowledge and instance-based scoring method. For example, the original salience model sets all weights as equal [7], and this method has been adapted by different salience models [17,26]. Nuhn [8] proposed a salience model that weights results using expert knowledge based on formal research. Zhu [19] constructed an instance-based scoring system to evaluate indoor landmark salience. Thus, the PLSR, equal weighting, expert knowledge and instance-based scoring methods were selected to compare the accuracy of landmark salience evaluation methods.
The landmark salience weighting results calculated by PLSR are mentioned in Section 4.2. The weighting results evaluated by the equal weighting method are shown in Table 11. As for the expert knowledge method, seven researchers with a PhD in cartography were invited to weight the importance of indoor landmark salience, and the results are shown in Table 11. Regarding the instance-based scoring method, the landmark salience attractiveness and weighting results are adapted from Zhu [19]. The landmark salience measured by the three algorithms was compared with the landmark salience based on eye-tracking data in Section 4.2.2. The difference value results are shown in Figure 8.
In Figure 8, the circle size represents the landmark salience difference between the eye-tracking data and each of the four weighting methods. The larger the circle, the bigger the difference value. The difference values of PLSR are lower than those of the other three weighting methods in both tasks 2 and 3. The highest difference value, shown in AOI18, was calculated by equal weighting in both tasks 2 and 3. However, the difference value calculated by equal weighting is lower than those calculated by the expert knowledge and instance-based scoring methods in AOI13 in task 3. Thus, the most accurate weighting method in this study is PLSR, whereas the accuracy of the other three weighting methods varies across landmarks and tasks.

Improvements for Current Studies
Prior studies have considered landmark salience differences in personal dimensions, time dimensions (day and night) and environmental dimensions (indoor and outdoor), but few researchers have determined the differences in salience among various wayfinding tasks. For example, Nuhn and Timpf [8] defined the personal dimensions of landmarks, including spatial knowledge, interests, goals and backgrounds. Additionally, the salience of an object is different across people with various personal dimensions. Duckham, Winter, and Robinson [4] introduced nighttime vs. daytime factors into computing the salience of individual landmarks. Regarding the environmental dimension, researchers have proposed a salience model for both indoor and outdoor settings [24,27,28]. However, they could not determine the differences in salience in various tasks. This article provides a method to calculate landmark salience based on participants' eye-tracking data, and the landmark salience models are different between self-location and orientation tasks.
The indoor landmark salience model proposed in this article could be applied to design indoor maps for different tasks. For instance, we calculated landmark salience results for the first floor of the David Mall. Landmarks with high salience results could be regarded as areas of interest (AOIs) [26]. Thus, we selected landmarks with salience results higher than the mean result as AOIs. The indoor maps for self-location and orientation were designed in ArcMap 10.1 (Figure 9). Figure 9 shows that the AOIs differ between the indoor maps for self-location and orientation. For instance, Coach and BOTTEGA are AOIs in the indoor map for orientation, but they are not AOIs in the indoor map for self-location.
Figure 9. Comparison of the indoor map between tasks 2 and 3. (a) Indoor map based on task 2; (b) Indoor map based on task 3.
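The AOI selection rule described above (keep the landmarks whose salience exceeds the mean salience for a task) can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used to produce Figure 9: the function name `select_aois`, the landmark names other than Coach and BOTTEGA, and all salience values are hypothetical placeholders, not results from the experiment.

```python
from statistics import mean


def select_aois(salience):
    """Return the landmarks whose salience is strictly above the mean salience.

    `salience` maps a landmark name to its salience score for one task.
    """
    threshold = mean(salience.values())
    return sorted(name for name, score in salience.items() if score > threshold)


# Hypothetical salience scores for illustration only (not values from the study).
orientation = {"Coach": 0.8, "BOTTEGA": 0.7, "Elevator": 0.6, "Exit": 0.3, "Kiosk": 0.1}
self_location = {"Coach": 0.2, "BOTTEGA": 0.3, "Elevator": 0.7, "Exit": 0.9, "Kiosk": 0.4}

print(select_aois(orientation))
print(select_aois(self_location))
```

Because the threshold is computed separately for each task, the same landmark can be an AOI on one map and not on the other, which is the pattern reported for Coach and BOTTEGA.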
In addition, existing research has applied visual input technologies to realize human-computer interaction for navigation. Richter [17] pointed out that interaction between humans and computers has a specific focus on wayfinding, and landmark references have been shown to be important and beneficial for this interaction. Landmark salience based on eye-tracking data could, in theory, be used as input to real-time indoor navigation. With traditional navigation applications, participants can only passively receive landmark information. In the future, it may be possible to construct a real-time gaze-aware navigation assistant that actively detects the participant's eye movements [15], calculates the visual salience and recommends attractive landmarks to the participant.

Conclusions and Future Research
This study aimed to establish an indoor landmark salience model based on eye-tracking data and to compare the differences in salience between self-location and orientation. The results show two findings. Finding 1 shows that eye-tracking data can be used to measure indoor landmark salience. For instance, seven types of eye movement could be applied to analyse salience, and the salience result of the combined eye-tracking data was more accurate than that of any single type of eye movement. In addition, the PLSR weighting algorithm was more accurate than the other current weighting methods. Finding 2 shows that significant differences in landmark salience occurred between self-location and orientation. The participants paid more attention to landmarks that were stores and elevators in the orientation task. Thus, it is necessary to generate separate indoor landmark salience models for different tasks. This study can contribute to the design of indoor navigation maps in cartography and GIScience. For instance, landmarks with higher salience could be highlighted in indoor navigation maps, and cartographers could redesign indoor maps for different wayfinding tasks.
However, since the experimental materials were static images, we did not discuss visual performance in real-world environments. The first challenge of indoor real-world experiments is how to calculate landmark salience attractiveness. For example, shape factors, visual distance and façade area keep changing as the participants walk, so it is difficult to define the exact visual factors of different landmarks. The second challenge is the distraction caused by customers in a mall. Participants might be attracted by other people in real-world environments, and their concentration on the experiment may decrease.
As discussed above, future research can focus on the definition and calculation of landmark salience attractiveness in changing scenes. Moreover, future research can conduct experiments to measure landmark salience attractiveness in various indoor scenes, such as airports, hospitals or conference centres.