Location-Based Social Network’s Data Analysis and Spatio-Temporal Modeling for the Mega City of Shanghai, China

Abstract: The aim of the current study is to analyze and extract the useful patterns from Location-Based Social Network (LBSN) data in Shanghai, China, using different temporal and spatial analysis techniques, along with specific check-in venue categories. This article explores the applications of LBSN data by examining the association between time, frequency of check-ins, and venue classes, based on users’ check-in behavior and the city’s characteristics. The information regarding venue classes is created and categorized by using the nature of physical locations. We acquired the geo-location information from one of the most famous Chinese microblogs called Sina-Weibo (Weibo). The extracted data are translated into the Geographical Information Systems (GIS) format, and after analysis the results are presented in the form of statistical graphs, tables, and spatial heatmaps. SPSS is used for temporal analysis, and Kernel Density Estimation (KDE) is applied based on users’ check-ins with the help of ArcMap and OpenStreetMap for spatial analysis. The findings show various patterns, including more frequent use of LBSN while visiting entertainment and shopping locations, a substantial number of check-ins from educational institutions, and that the density extends to suburban areas mainly because of educational institutions and residential areas. Through analytical results, the usage patterns based on hours of the day, days of the week, and for an entire six months, including by gender, venue category, and frequency distribution of the classes, as well as check-in density all over Shanghai city, are thoroughly demonstrated.


Introduction
The study of extracting valuable information and gaining useful insights from spatio-temporal data has become very important in recent years. Due to the popularity of Location-Based Social Networks (LBSNs) in the modern era, the availability of huge amounts of information generated by users has become invaluable, especially from a practical point of view. The information extracted from this data can be used in many application areas, such as analysis of public transit flows, location recommendations, population density estimation, route planning, disaster management, and many more [1]. Online services encourage different users to share their activities and interests with their social friends, and moreover generate enormous amounts of data, enabling researchers to understand users' activities, patterns, and preferences more accurately. These online services provide and store the information of users by considering their real-time locations. The data collected through such

Related Work
In the last few decades, the interest of researchers in big data has increased exponentially and research into big data, compared to other fields of computer science, has attracted a tremendous amount of attention. The term itself and articles such as "Big data is opening doors, but maybe too many" [20] and "Big data: the greater good or invasion of privacy?" [21] suggest a perception of volume; however, there are more features to be considered regarding big data, such as complexity, structure, behavior, and the tools, techniques, and technologies used to process and analyze it [22]. Dumbill [23] discussed three different dimensions of big data: volume, velocity, and variety of contents. Mayer-Schönberger and Cukier [24] highlighted the three main challenges of big data: populations instead of samples, messy instead of clean data, and correlation instead of causality. Additionally, Miller and Goodchild [25] defined big data as data that cannot be analyzed using traditional tools. In 2013, Ovadia and Librarian [26] emphasized the importance of big data for social scientists and librarians, and suggested that it is much too important to be ignored, as most social science research is based on huge amounts of data and enormous datasets.
As a central focus of many study fields, including time and space geography, urban functionalities, and human mobility, big data analysis is a vast research field that was initially studied using statistical data from surveys, interviews, travel diaries, questionnaires, and other manual collections of datasets [27–29]. Statistical data collection may not be an efficient way to determine patterns in said fields and related studies; therefore, data from mobile devices, smart cards, global positioning system (GPS) navigators, and location-based and online applications containing users' activities with geo-locations are widely used, and have been found to be more efficient for such studies in recent years [24,30–34]. With the advancement of mobile technologies and widespread use of mobile devices, it is easy to track users' locations from their devices and activities. For example, Gonzalez et al. [35] introduced a dataset that contained data from 100,000 users over six months. Although the data only contained the location of the mobile phone tower nearest to where each phone call originated, it still proved to be very helpful in estimating the approximate locations of users with a certain margin of time, and was subsequently used in the prediction of human movement [36]. Various properties of Geographical Information System (GIS) functionalities and their potential role in urban mining studies were reviewed by Zhu [37] through a discussion on how GIS data can be utilized to analyze, visualize, report, and mine the temporal or spatial features of recyclable waste and its collection and recovery systems. The modern digitized world allows researchers to conduct quantitative analyses of user activity patterns and related factors, such as living area, social contacts, and personal references [38–40]. Fan et al. [41] categorized user activity research into three different classes, namely location prediction, trajectory mining, and location recommendations. The authors also emphasized its role in our understanding of user activity patterns, and how it can be beneficial in many areas, such as traffic control, disaster relief, mobile marketing, city planning, and public health.
One of the most significant sources of big data is online social networks because of their widespread and ever-growing use in almost every part of the world [16]. LBSNs allow users to share their current locations, activities, and interests, and generate data that provide us with the opportunity to conduct different kinds of studies in various fields. The analytical methods for, and the studies conducted on, human activities from mobile data are discussed in various works [42–44]. The use of LBSNs was investigated by Lindqvist et al. [45], followed by a number of studies on human activity patterns based on LBSN data [16,46–48]. Zhang and Chow [49] presented personalized geo-social recommendations based on LBSNs by using two different datasets (Foursquare and Gowalla), and observed similar patterns in both datasets. Preoţiuc-Pietro and Cohn [16] also investigated 10,000 Foursquare users for a better understanding of human activity patterns across different venue categories. They further divided the users into various clusters based on their behavior, and predicted their movements based on frequency. Colombo et al. [50] used similar data from two different cities in the United Kingdom to improve recommendation systems, by collecting more frequent check-ins at various venues. Li et al. [51] conducted a broader study by using Foursquare data from 14 different counties and 2.4 million venues to uncover the reasons for venue popularity. It was concluded that there are three main reasons influencing the popularity of a venue: (1) venue profile information, as venues with complete profile information are undoubtedly more popular; (2) venue age, as people tend to visit known and famous places; and (3) venue category, as venues under the 'Food' category were found to have the highest number of check-ins. Alrumayyan et al. [2] studied people's patterns related to various venue categories in the capital city of Saudi Arabia, Riyadh, with the support of Foursquare data. The study was more focused on the 'Food' category because people are more interested in sharing their experience and leaving comments while visiting food venues.
LBSN data have been used in a number of critical fields. Graham et al. [52] studied the importance of LBSNs in assisting local governments by conducting a survey of more than 300 local government officials from the United States. They discussed the contribution of social media tools to the management of a crisis, resulting in positive relationships with the ability of users to control the crisis situation. Other similar studies highlighting the use and ability of social media in crises include articles on the wildfires in California [53], Hurricane Sandy, and the earthquake in Haiti [54]. A study by Lin et al. [55] in New York City, San Francisco, and Hong Kong, based on check-in data from more than 19,000 users of Swarm (an app by Foursquare), discussed the user preferences and associations between different venue categories at different times of the day. Loo et al. [12] used LBSN data and the kernel density method to study the spatial distribution of road crashes in Shanghai.
Much research has been done in the last few years to uncover different features in LBSN data. Most researchers studied information from LBSNs such as Twitter and Foursquare to investigate a variety of patterns, including a user's activity and mobility, urban planning, and venue categorization. Weibo is a famous LBSN in China, and has been proven to be an efficient source of data for this type of analysis. A case study of Shenzhen, China, introduced an approach that uses LBSN check-in data to analyze the attraction features of tourism venues, based on Weibo data from the period 2012-2014 [14]. Long et al. [13] used human mobility and activity patterns based on Weibo data and proposed a framework to analyze the growth of urban boundaries for the city of Beijing. Another study by Shi et al. [56] used Weibo data for mining the tourism crowd in Shanghai. This study initially analyzed check-in data from Weibo to determine the popularity of tourism venues and afterward used spatial pattern analysis to find the association between these venues, followed by a sentiment analysis of tourist opinions from Weibo contextual information. Rizwan et al. [57] used Weibo data from early 2016 to observe check-in behavior and gender differences.
However, to our knowledge, there is no comprehensive study for the area of Shanghai that mines both the temporal and spatial characteristics of check-ins, and associates the Weibo check-in features with different venue classes within the city. Our goal is to study the volatility and patterns of users' check-ins at different time scales (e.g., time of the day, day of the week, over six months) in association with the type of venues, prove their effectiveness under the consideration of venue categories, and finally show the locations of different venues and the density of check-ins from Weibo for the time period of January 2017 to July 2017.

Data Source
The data used in the current study are from one of the most popular Chinese microblogs, Weibo. Facebook and Twitter are the most popular LBSNs in the world. In China, Weibo, a hybrid of Facebook and Twitter, is one of the most dominant LBSNs [56]. It has become a major platform, enabling users to share their activities, opinions, preferences, and locations along with audio, images, and videos through checking and writing posts, alongside communicating with their friends. Since Weibo was launched on 14 August 2009, the number of users, check-ins, and activities has increased rapidly. Weibo provides different types of geo-spatial resources; three of the main resources include user-profile locations, places mentioned in posts, and sharing real-time locations through check-ins. By the end of 2018, the total number of users increased to over 500 million, reaching 462 million monthly active and 200 million daily active users. Weibo launched an international version in March 2017 and claims to have users in more than 190 countries [57,58]. This study mined check-in patterns through different classes and further estimated the check-in density on a real map using SPSS, ArcMap, and OpenStreetMap via socially generated spatio-temporal data from Weibo in the famous city of Shanghai, China for a period of six months, from 1st January 2017 to 30th June 2017.

Study Area
This study was conducted on Weibo data taken for Shanghai, China, which is situated on the eastern edge of the Yangtze River Delta between 30°40′–31°53′ N and 120°52′–122°12′ E, with a total area of 8359 square kilometers, as shown in Figure 1. In 2016, Shanghai was divided into 16 districts and one county, namely Baoshan, Changning, Fengxian, Hongkou, Huangpu, Jiading, Jingan, Jinshan, Minhang, Pudong New Area, Putuo, Qingpu, Songjiang, Yangpu, Xuhui, and Chongming (which was not included as it is rarely visited by people) [57].
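The geographic extent quoted above can be turned into a coarse bounding-box test for screening check-in coordinates. The sketch below is illustrative only: the helper names `dm_to_decimal` and `in_shanghai_bbox` are hypothetical, and a rectangle is a rough stand-in for the actual administrative boundary of the city.

```python
def dm_to_decimal(degrees: int, minutes: int) -> float:
    """Convert degrees and arc-minutes to decimal degrees."""
    return degrees + minutes / 60.0


# Extent quoted in the text: 30 deg 40 min to 31 deg 53 min N,
# and 120 deg 52 min to 122 deg 12 min E.
LAT_MIN, LAT_MAX = dm_to_decimal(30, 40), dm_to_decimal(31, 53)
LON_MIN, LON_MAX = dm_to_decimal(120, 52), dm_to_decimal(122, 12)


def in_shanghai_bbox(lat: float, lon: float) -> bool:
    """Return True if (lat, lon) falls inside the rectangular extent."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX
```

A point near the city center, such as (31.23, 121.47), passes the test, whereas a point in Beijing does not.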

Methodology
This section describes the data acquisition and preparation of the dataset used in our research, and provides an overview of our descriptive and analytical methods. Our methodology consists of the following: data acquisition and preparation, descriptive analysis, and spatial analysis.

Data Acquisition and Preparation
The primary inspiration for the use of LBSNs is to share interests and activities and thereby build new and close social relationships, enabling researchers to discover patterns in users' activities and preferences from the big data generated by the LBSN. The data source for this research is Weibo, which is regarded as one of the most popular microblogs in China. We used a Python-based Weibo API (Application Programming Interface) to collect data in specific regions of China, specifically Shanghai city. We collected our data during 2017, and initially there were approximately 3.5 million check-ins from about 2 million users. The data acquired from Weibo were in the standard API output format, JavaScript Object Notation (JSON). The JSON format was converted into CSV (Comma-Separated Values) format using MongoDB for further analysis.
The initial dataset included several attributes, such as User_ID, Gender, Check-in Date/Time, account creation Date/Time, Location_ID, and text messages. The dataset was first filtered for anomalies, missing attributes, and attributes irrelevant to our study. In order to make it more significant and to consider only the important venues, we included venues with more than 100 check-ins within the study period of six months. The final dataset included 166,898 users with 222,525 check-ins at 722 different venues. A sample of the final dataset is shown in Table 1.
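The preparation steps described above (dropping incomplete records, keeping only venues with more than 100 check-ins, and exporting to CSV) can be sketched in plain Python. This is a minimal illustration rather than the pipeline used in the study: the record keys in `REQUIRED` and the function names are assumptions, and the conversion here uses the standard library instead of MongoDB.

```python
import csv
import io
import json
from collections import Counter

# Assumed record keys; the real Weibo API field names may differ.
REQUIRED = ("user_id", "gender", "checkin_time", "location_id")


def prepare_checkins(json_lines, min_checkins=100):
    """Parse one JSON record per line, drop incomplete records, and keep
    only venues with more than `min_checkins` check-ins."""
    records = []
    for line in json_lines:
        rec = json.loads(line)
        if all(rec.get(key) not in (None, "") for key in REQUIRED):
            records.append({key: rec[key] for key in REQUIRED})
    per_venue = Counter(rec["location_id"] for rec in records)
    return [rec for rec in records if per_venue[rec["location_id"]] > min_checkins]


def to_csv(records):
    """Serialize the filtered records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=REQUIRED)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Counting venues only after incomplete records are removed means that a venue must reach the threshold with usable check-ins; whether the study applied the threshold before or after cleaning is not stated, so this ordering is a design choice of the sketch.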

Temporal Methods
We performed a descriptive statistical analysis using IBM SPSS 25 on the dataset to reveal various patterns in the check-ins of users based on check-in frequencies at different hours of the day, different days of the week, and for all individual days throughout the study period of six months. Various check-in venue categories were examined to investigate from where people used LBSNs more frequently. All of the descriptive results include gender, in order to show the frequency patterns of both males and females.
The venue categorization was completed by comparing latitude/longitude and location names from the dataset with real locations all over the city. This study includes famous and frequently visited locations, and therefore highlights venue categories with the maximum number of check-ins. Each check-in was assigned to the venue class that best suits the venue at which it was made. The overall flow of our research methodology is shown in Figure 2.
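The frequency tables behind the temporal analysis (by hour of the day, day of the week, and calendar date, each split by gender) can be illustrated with a short tally. The study itself used IBM SPSS 25; the Python sketch below is only an equivalent illustration, and the input layout (a timestamp string in `YYYY-MM-DD HH:MM:SS` form paired with a gender label) is an assumption.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


def temporal_counts(checkins):
    """Tally check-ins by (hour, gender), (weekday, gender) and (date, gender).

    `checkins` is an iterable of (timestamp_string, gender) pairs.
    """
    by_hour, by_weekday, by_date = Counter(), Counter(), Counter()
    for timestamp, gender in checkins:
        moment = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        by_hour[(moment.hour, gender)] += 1
        by_weekday[(DAY_NAMES[moment.weekday()], gender)] += 1
        by_date[(moment.date().isoformat(), gender)] += 1
    return by_hour, by_weekday, by_date
```

Each returned counter maps a (time unit, gender) pair to a number of check-ins, which is the form needed to plot separate male and female frequency curves.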

Spatial Methods
To observe the geo-data on a map, we collected the map attributes from OpenStreetMap and used Shape files in ArcMap with a built-in Python programming platform to show the actual locations of the venues and density of check-ins within the study area of Shanghai. OpenStreetMap is a geo-information platform providing real-time and user-generated content related to the global map, including various attributes of maps such as roads, canals, streets, and districts, and is available free of cost. It is widely used by researchers to analyze and visualize geo-spatial data [61].
In order to obtain a more accurate and smooth density, we used KDE. KDE is a multivariate method that uses a random sample of data to estimate the density. We can calculate the density as shown in Equation (1), where l_j is a two-dimensional location containing x and y, and D is the set of data. Using the bandwidth h for both spatial dimensions and the Gaussian kernel function K(·) provides an efficient way of estimating the density [62].
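For illustration, a conventional form of the two-dimensional Gaussian kernel density estimator that is consistent with the symbols defined above is f(l | D, h) = (1/n) Σ_{j=1..n} K_h(l, l_j), with K_h(l, l_j) = exp(−‖l − l_j‖² / (2h²)) / (2πh²) and n the number of locations in D. This normalization is the standard textbook one and is an assumption here, not a transcription of Equation (1). The sketch below evaluates it directly; the function name is hypothetical, coordinates are treated as planar, and the study itself computed densities in ArcMap.

```python
import math


def gaussian_kde_2d(point, data, h):
    """Estimate the density at `point` from the sample `data`.

    `point` is an (x, y) pair, `data` is a list of (x, y) pairs, and `h`
    is a single bandwidth shared by both spatial dimensions.
    """
    if h <= 0 or not data:
        raise ValueError("need h > 0 and at least one sample")
    x, y = point
    norm = 1.0 / (2.0 * math.pi * h * h)
    total = 0.0
    for xj, yj in data:
        sq_dist = (x - xj) ** 2 + (y - yj) ** 2
        total += norm * math.exp(-sq_dist / (2.0 * h * h))
    return total / len(data)
```

Evaluating the function on a regular grid of points yields the kind of surface that a density heatmap displays; a larger h gives a smoother but less detailed surface.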

Results
With the advancements in online services, wireless communication, mobile devices, and location-sharing technologies, LBSNs such as Facebook, Foursquare, Twitter, and Weibo are attracting researchers' attention due to the huge amount of data generated by these LBSNs. The data can be used to extract very useful information for urban planning, crisis and disaster management, and for other fields of study involving big data with a high spatio-temporal resolution. The current study had three different aspects of analysis: temporal, check-in venue classification, and spatial analysis of the Weibo data for Shanghai. This section includes the results and discussion of these three aspects.

Temporal Patterns
The temporal check-in analysis further consists of three parts: daily patterns, weekly patterns, and check-in patterns for 180 days, from 1st January to 30th June 2017 (the study period of our research). All of these results also highlight the frequency of both male and female users.

Daily Patterns (Hours)
To investigate the check-in frequency pattern of Weibo users, we observed the distribution of check-ins over the 24 hours of the day, as shown in Figure 3. It can be observed that routine activities have a profound impact on the number and time of check-ins. For instance, the number of check-ins starts rising in the early morning, is considerable after 10 a.m., and is highest after 12 p.m., while the check-ins start declining after midnight. The peak of check-ins was from 10 p.m. to 12 a.m., a typical time frame for the social activities of many people.
It can be seen in Figure 3 that on the time scale from midnight to midnight (00 to 24 hours), the check-in frequency is weighted towards the right-hand side of the axis, showing more check-ins in the afternoon, evening, and before midnight. The kurtosis is close to 3, the value of a normal distribution, which suggests that the data are approximately normally distributed. There are fewer check-ins after midnight and in the early morning because of the sleeping routine of Shanghai residents. In Shanghai, one of the most developed cities of China, the check-in frequency pattern of males and females is almost the same, but the number of check-ins differs because of the different numbers of male and female users in our dataset. The frequency is steady until the afternoon because people are mostly at work; it increases as they finish work and meet their friends and families or visit places, before eventually decreasing for the night period.
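The shape statistics mentioned for the hourly distribution can be computed from the check-in hours with simple moment formulas. The sketch below is an assumption-laden illustration: it uses population (uncorrected) moments, treats the hour of day as a linear rather than circular variable, and reports kurtosis on the scale where a normal distribution equals 3. Statistical packages may apply small-sample corrections or report excess kurtosis (kurtosis minus 3), so their output can differ.

```python
def skewness_kurtosis(values):
    """Return (skewness, kurtosis) from population moments.

    Kurtosis is not reduced by 3, so a normal distribution gives about 3.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("values have zero variance")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2
```

With this convention, a sample whose mass lies late in the day with a tail towards the early hours produces a negative skewness value.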

Weekly Patterns (Days)
This section analyzes the weekly rhythm of check-ins. Weekly patterns suggest that users' check-ins are more frequent on weekends than on weekdays. The full view of the total number of check-ins for each day of the week can be seen in Figure 4. It can be observed from Figure 4 that most of the check-ins took place on Saturday and Sunday, suggesting that people use LBSNs more on their days off. Users tend to increase their social activities after work on Friday, Saturday, and Sunday, and this increase sometimes lasts overnight on Sunday; therefore, more activities occur from Friday night until Monday morning. The figure illustrates that the frequency of check-ins on Saturdays and Sundays is the highest, followed by Friday and Monday. Tuesday, Wednesday, and Thursday show the minimum number of social activities throughout the week.

Patterns by Date (180 Days)
This section represents the daily trends of the total number of Weibo users for 180 days (1st January 2017 to 30th June 2017) in Shanghai. Figure 5 shows the variations of check-in frequencies for both males and females during the study period.
[...] Week, the Shanghai Formula 1 Grand Prix, the Shanghai Ballet Company's 'Swan Lake', and Easter, all from the 'Entertainment' category, which had the highest impact, and was also due to the number of venues belonging to this category in the dataset. The fewest check-ins occurred in the last two weeks of January 2017 and the first two weeks of February 2017 because of the periodic mass migration of people around the Chinese New Year, or Chinese Spring Festival [63], wherein a massive number of Shanghai residents, accounting for 39% (in 2010) of the total population of Shanghai, move back to their hometowns on vacation. The results also reveal that people tend to share their activities using LBSNs more on occasions such as visiting places and meeting with friends than when they are at home or at work.

Check-In Venue Categories
One of the primary advantages of using LBSN data is the ability to identify the location of the check-in activity, along with its purpose. Each check-in recorded by the LBSN (e.g., Weibo) provides the latitude and longitude of the original location [64]. When plotted on a geo-referenced map, the latitude and longitude give a specific location. This location can be used to obtain information about the visited venue. We classify these venues based on their type and the activities performed at them.
The venue classes are described in Table 2. According to these criteria, the locations are distributed across the categories as shown in Table 3. The check-in venue class 'Entertainment' contains 136 distinct locations, and 88, 27, 55, 30, 51, 172, 90, 26, and 47 locations are found in 'Education', 'Food', 'General Location', 'Hotel', 'Professional', 'Residential', 'Shopping & Services', 'Sports', and 'Travel', respectively. We investigated more interesting patterns by applying the same category distribution to the whole dataset through different characteristics of the prescribed categories. The most common characteristics of the 10 venue categories are given below. First, we looked at the number of locations of each category in our dataset. We can see that the categories with the highest number of users and check-ins are 'Entertainment' and 'Shopping & Services'. 'Residential' and 'Education' also have a significant number of users and check-ins as compared to the other categories. This shows the regular day-to-day behavior of people; as expected, people in entertainment and shopping places tend to use LBSNs more than people working in their offices. Another insight is that students, and people spending their free time at home, use LBSN services more frequently. Hence, the results are in line with expectations, with one additional trend in the check-in data: the number of check-ins in the 'Residential' category is lower than in 'Entertainment' and 'Shopping & Services', even though 'Residential' has the greatest number of locations (172) in the dataset.
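The per-category location counts quoted above can be checked against the 722 venues of the final dataset with a short tally. The counts are taken from the text; the dictionary and function names are hypothetical.

```python
# Number of distinct locations per venue class, as quoted in the text.
VENUE_COUNTS = {
    "Entertainment": 136,
    "Education": 88,
    "Food": 27,
    "General Location": 55,
    "Hotel": 30,
    "Professional": 51,
    "Residential": 172,
    "Shopping & Services": 90,
    "Sports": 26,
    "Travel": 47,
}


def category_shares(counts):
    """Return the total number of locations and each class's share of it."""
    total = sum(counts.values())
    shares = {name: count / total for name, count in counts.items()}
    return total, shares
```

The counts sum to 722, matching the number of venues in the final dataset, and 'Residential' is the class with the most locations.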

Spatial Patterns
In this section, we investigate the spatial analysis by visualizing the location of check-in venue categories and the density of total check-ins by using the geo-location data from Weibo on a map of Shanghai. For this purpose, we used a map including features from OpenStreetMap, because it contains the most recent updates of the map features [65]. We can observe features such as city boundaries, districts and district boundaries, Shanghai Metro lines, and the road structure. With the help of these features, it is easy to evaluate and recognize the different locations on the map. For spatial analysis, we first plotted the locations of all the famous venues in Shanghai, as shown in Figure 6.
ISPRS Int. J. Geo-Inf. 2020, 9, 76
It can be observed from Figure 6 that, in keeping with the planning of a major Chinese city and with ease of access, most of these locations lie either in the city center or near the Shanghai Metro. The seven districts, namely Changning, Huangpu, Putuo, Hongkou, Xuhui, Jingan, and Yangpu, situated in Puxi (Huangpu West) are collectively called the downtown area or the city center of Shanghai [66]. The downtown has a higher concentration of famous places, as would be expected in any major city; however, the Educational and Residential venues are relatively dispersed within the city.
Although plotting the venues gives us an abstract idea about the distribution of check-in locations, we need further investigation in order to analyze the spatial patterns in our dataset. Therefore, we used KDE to find the density of check-ins using ArcMap. We calculated the density based on the check-ins by all users, providing us with more accurate results for further analysis, as shown in Figure 7.
Although plotting the venues gives a general idea of the distribution of check-in locations, further investigation is needed to analyze the spatial patterns in our dataset. Therefore, we used KDE in ArcMap to estimate the density of check-ins. We calculated the density from the check-ins of all users, which provides more accurate results for further analysis, as shown in Figure 7. Red represents the highest density and white represents the average, which eventually dissolves into the base color of the map according to the type of data [67]. Check-ins that did not satisfy the minimum criteria in our dataset were not considered; thus, such data do not appear on the map. The figure clearly indicates that, as expected, check-ins are denser in the city center than in the regions away from it. Hongkou, Huangpu, and Jingan are the densest districts.
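The density surface in Figure 7 was produced with ArcMap, but the idea behind KDE can be illustrated with a minimal Python sketch. The sketch assumes a quartic kernel, projected coordinates in meters, and a hand-picked bandwidth. The function name `quartic_kde`, the sample coordinates, and the check-in weights are hypothetical and are not taken from the dataset or from the ArcMap configuration used in the study.

```python
import math


def quartic_kde(points, grid_x, grid_y, bandwidth):
    """Evaluate a weighted 2D kernel density on a regular grid.

    points    : iterable of (x, y, weight); weight is the check-in count
    grid_x    : x coordinates of the grid columns
    grid_y    : y coordinates of the grid rows
    bandwidth : search radius, in the same units as the coordinates
    Returns density[j][i] for the cell at (grid_x[i], grid_y[j]).
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    # Normalizing constant of the 2D quartic kernel with radius `bandwidth`.
    norm = 3.0 / (math.pi * bandwidth ** 2)
    density = []
    for gy in grid_y:
        row = []
        for gx in grid_x:
            total = 0.0
            for px, py, weight in points:
                # Squared distance scaled by the bandwidth.
                u2 = ((gx - px) ** 2 + (gy - py) ** 2) / bandwidth ** 2
                if u2 < 1.0:  # points outside the radius contribute nothing
                    total += weight * (1.0 - u2) ** 2
            row.append(norm * total)
        density.append(row)
    return density


# Hypothetical venues: (x in m, y in m, number of check-ins).
venues = [(0.0, 0.0, 120), (300.0, 0.0, 250), (2000.0, 1500.0, 100)]
xs = [0.0, 500.0, 1000.0, 1500.0, 2000.0]
ys = [0.0, 500.0, 1000.0, 1500.0]
surface = quartic_kde(venues, xs, ys, bandwidth=600.0)
```

A larger bandwidth smooths the surface and merges neighboring clusters, while a smaller one isolates individual venues, so the choice of radius strongly affects how dense the downtown area appears relative to the suburbs.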
The spatio-temporal analysis was conducted by comparing the weekly density for the first two weeks of April (which had the maximum number of check-ins) with that for the last week of January and the first week of February (which had the minimum number of check-ins), as shown in Figure 8. Although the density varies across the city, the downtown area remains the densest area even with the smaller number of check-ins during the weeks of January and February; however, the overall check-in distribution covers a larger area during different periods of time. The downtown area is regarded as the commercial center of Shanghai and therefore has more facilities of almost every kind, including transportation, food, shopping malls, government offices, and nightspots. Nevertheless, as Shanghai is a highly developed and modern city with many parks and diverse Educational and Residential venues, check-in clusters can be observed in different places throughout the city.
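The weekly comparison relies on grouping check-ins by week and identifying the weeks with the most and the fewest check-ins. A minimal sketch of that grouping is given here. It assumes ISO calendar weeks, since the week definition used in the study is not stated, and the function names and sample dates are invented for illustration.

```python
from collections import Counter
from datetime import date


def weekly_counts(checkin_dates):
    """Count check-ins per (ISO year, ISO week)."""
    counts = Counter()
    for d in checkin_dates:
        iso_year, iso_week, _weekday = d.isocalendar()
        counts[(iso_year, iso_week)] += 1
    return counts


def busiest_and_quietest(counts):
    """Return the week keys with the maximum and minimum check-in counts."""
    if not counts:
        raise ValueError("no check-ins to compare")
    busiest = max(counts, key=counts.get)
    quietest = min(counts, key=counts.get)
    return busiest, quietest


# Made-up check-in dates; the year is arbitrary.
sample = [date(2017, 4, 3), date(2017, 4, 4), date(2017, 4, 5), date(2017, 1, 30)]
per_week = weekly_counts(sample)
peak_week, low_week = busiest_and_quietest(per_week)
```

Once the peak and low weeks are known, the check-ins from each week can be passed separately to a density estimator to produce one surface per week for comparison.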

Discussion
The analysis shows that data from Weibo are an efficient resource for analyzing the distribution of user activities and preferences in spatio-temporal terms. One benefit of using LBSN data for spatio-temporal analysis is that large-scale information for a megacity such as Shanghai can be extracted and visualized in more detail. Some areas in downtown Shanghai are crowded, while suburban areas have fewer visitors. This study set out to observe the behavioral traits of users by providing evidence that the dynamics of a megacity can be influenced by its various facilities and by the nature of its different venues. We explored the spatio-temporal patterns in the check-ins to show the distribution of users in Shanghai, performing an empirical analysis using graphs, tables, and density maps based on LBSN data. The spatio-temporal patterns were studied from various perspectives, including hours, days, and venue categories. From the temporal perspective, the results confirmed that the frequency of check-ins rises from the middle of the day until late at night and that activity increases markedly on weekends compared with weekdays. From the spatial perspective, the intensity of users was highest in the downtown area, as this is the center of activity for most of the venue classes.
This research used geo-tagged check-in data from an LBSN as a proxy for the general population of Shanghai, as collecting such data is more efficient than time- and labor-intensive questionnaires and surveys and can therefore offer exceptional spatial and temporal coverage. Weibo provides an open geo-database and excludes all of the information related to the privacy of the users. However, this approach has its own limitations. For example, we have no way to measure the exact ratio of LBSN users to the population of Shanghai. As a result, when check-in data are used in the evaluation and planning of a megacity, we can only establish a correlation between check-ins and actual people, and the connection between Weibo check-ins and actual residents may vary across different areas. Although the LBSN provides many attributes compared with traditional census data, it generally does not directly include demographic data such as age, marital status, and ethnicity, although there are ways to extract these kinds of data indirectly, as discussed by Longley and Adnan [68].
The comprehensive spatio-temporal coverage of this research provides results that could be useful in analyzing user activities in a city and may therefore benefit the planning and development of large cities. It also provides a basis for using Weibo data to analyze individual categories, such as travel, food, and educational venues. The study of how different venue types influence the preferences of inhabitants in various urban areas has significant potential for urban planning and for understanding activity preferences among urban residents.

Conclusions
We used check-in data from Weibo to uncover various temporal and spatial patterns across the most famous places in Shanghai. The study covered three aspects of analysis: a temporal analysis to reveal patterns based on time, a check-in venue classification to provide insight into LBSN users in each category, and a spatial analysis that gives a clear view of venues and check-ins through mapping. The findings demonstrated that people tend to use LBSNs more in the evening than in the morning or during the working day. We also observed that LBSNs are used more while visiting entertainment and shopping locations. Check-ins from educational institutions are substantial, suggesting that students are frequent users of LBSNs. Though many of the results are similar to what we expected, we obtained some interesting facts about the use of LBSNs. For example, although our dataset contains more locations in the residential category, the number of check-ins in the entertainment and shopping categories exceeded the number of residential check-ins. Another interesting pattern that we uncovered was that the density extends to suburban areas, mainly because of educational institutions and residential areas, an important finding that has not yet been discussed.
Data from LBSNs can play a strategic role in both the development and the improvement of various aspects of the "smartness" of mega cities. The possibility of analyzing the activities of urban agents has completely modified the representation of the relationship between activities and spaces. This can assist urban planning by providing tools to attain sustainability objectives and to make mega cities more livable and efficient. For example, the factors that increasingly influence cities (such as venue categories) can, if not planned well, affect the objectives of both sustainability and development. The study of user activities requires the availability of big data and information; therefore, using LBSNs to collect data from people residing and moving inside a mega city could benefit the planning of the distribution of different types of venues throughout the city. In this framework, information about the various activities of the city's users and residents can describe the events occurring in the physical space. The current study attempts to address this by using data from Weibo to better explore the activities of urban populations in Shanghai. Further study is required to explore the behavior of the population described here, with a more specific definition to strengthen the relation between the baselines of this study and the effect of various urban functions, such as restaurants, transport, and educational institutions. The results could provide insights into the linkage between urban entropy and urban magnets (types of venues attracting more people) in the city, thereby identifying the areas and aspects that need special attention and a well-planned distribution in the management of the city. These results are based on a dataset in which each venue has a minimum of 100 Weibo check-ins, which is why the results concentrate on specific areas within the city.
The analysis could be improved by using more micro-level data from different LBSNs. Similarly, the categories could be extended by covering more venues and specialized distributions. Future work might also use more diverse datasets and extended category classes to obtain more specific and accurate patterns. In this regard, we are working on analyzing user behavior in the "Food" and "Educational" categories, which were classified in this study.