Strategies of Success for Social Networks: Mermaids and Temporal Evolution

The main goal of this article is to investigate techniques that can quickly lead to successful social systems by boosting network connectivity. This is especially useful when starting new online communities where the aim is to increase the system utilization as much as possible. This aspect is very important nowadays, given the existence of many online social networks available on the web, and the relatively high level of competition. In other words, attracting users’ attention is becoming a major concern, and time is an essential factor when investing money and resources into online social systems. Our study describes an effective technique that deals with this issue by introducing the notion of mermaids, special attractors that alter the normal evolutive behavior of a social system. We analyze how mermaids can boost social networks, and then provide estimations of fundamental parameters that business strategists can take into account in order to obtain successful systems within a constrained budget.


Introduction
Social networks are nowadays among the most important and successful online systems. Since 2010, when Facebook became the most visited site in the US, beating all other players including Google, we have witnessed an enormous growth of the social world, with many social systems arising and many other online systems adding social functionalities. This huge growth has correspondingly brought a lot of competition, making it very hard to build new successful social networks. It is therefore of crucial importance to study how social systems can blossom, maximizing their chances of success.
In this study, we analyze online social networks by viewing them as complex systems. Whereas in a previous paper (cf. [1]) we studied complex systems along their spatial informational axes, this paper tackles the other informative axis, time. We focus on the temporal dynamics of social networks, keeping the spatial axes fixed and investigating how networks evolve as time passes. In particular, the analysis presented here is designed to evaluate online social systems and suitable strategies that allow for their successful development. However, we think that the models are general enough to fit other complex networks as well.
Previous studies ([2,3]) suggested that online interaction is driven by the same needs as face-to-face interaction, and should not be regarded as a separate arena but as an integrated part of modern social life (cf. [3]). Thus, communicative actions taken by members of online communities can be expected to share many features with the web of human acquaintances and romances in the offline social world. Indeed, for many people in contemporary western societies, interaction on the Internet is as real as any other interaction (e.g., see [4]). For these reasons, the formation, the dynamics, and the general evolution over successful social systems by introducing the concept of mermaids, that is to say, special attractors that accelerate the efficiency of the system. Mermaids are actors introduced on top of a social network that are under the control of the network owner and act under specific rules. They can be impersonated by persons specifically hired for that purpose, although (depending on the nature of the social network) they might also be impersonated by specially developed online social bots ([14]), or even by hybrid techniques such as bots supervised by humans ([15]). Note that existing social bots have different purposes than mermaids, namely to increase the number of followers and to spread information (and sometimes also misinformation, cf. [16]).
We analyze the use of mermaids within the business perspective of financial technology (see for instance [17][18][19][20]). This means that we consider the practical scenario of having a predefined budget constraint, and then proceed to study what the best strategies are in order to maximize the successful effects (in other words, to minimize the cost/benefit ratio).

Network Growth Models
In this section, we describe the details of the network growth models investigated in this paper. We start from an empty network and, by repeatedly applying rules at the local level, make the network evolve and (eventually) reach an equilibrium state.
We already know that many real-world systems such as power grids, communication networks, biochemical interactions, as well as social networks can be modeled as graphs (cf. [21]). Using standard graph theory notation, we will consider online social networks as unweighted undirected graphs $G = (V, E)$, where $V$ are the vertices and $E$ the edges. Nodes $u_i \in V$ represent users, while edges $(u_i, u_j) \in E$ represent mutual friendship relations between them. The evolution of a graph $G = (V, E)$ is conceptually represented by a series of graphs $G_1, \dots, G_t$, so that $G_i = (V, E_i)$ is the graph at step $i$. Since $G_1, \dots, G_t$ represent different snapshots of the same graph, we have $E_i \subseteq E$.
Note that obtaining a fully detailed history of a social network remains an open problem. This is due to many reasons such as, for example, the unfeasibility of data gathering, or restrictions when crawling a web site. Indeed, if we want to consider big social network datasets with millions of users, retrieving all the actions users perform could be a very difficult task because of the large number of dataset modifications occurring at the same time (we obviously assume not to have direct access to the website dataset).
These limitations prompted us to take another path to reach our goal. Instead of searching and waiting for a gold dataset that contains the complete dynamics of a social system, we propose to simulate it by using some basic and advanced growth rules that could potentially be applied to all kinds of complex systems. In this framework, we assume that we have a snapshot of a complex system in an equilibrium state and that the growth dynamics is not free but constrained to that dataset snapshot. This means that (unless stated otherwise) during simulations the only edges available to be picked are those present in the reference snapshot graph; any other edge is not allowed in the simulation process. This way, the final graph $G_t$ will always be equal to the reference network.
In this phase of our research, we will not consider link or node removal, therefore $E_1 \subseteq E_2 \subseteq \dots \subseteq E_t$. This choice is motivated by the observation that in online social networks, there are fewer removals than there are users and friendship relations (see for instance [22]).
Simple network dynamics simulations need at least two parameters. The first defines the order in which edges are inserted into the network (the order influences the overall connection pattern), and the second defines how many edges are added at each step. Network connectivity evolves according to the following three rules:
• Random order. Every edge not yet inserted has the same probability $p = \frac{1}{|E| - |E_i|}$ of being selected during the growth process. This rule, as many studies have shown, is far from realistic. However, it is a good candidate for a baseline.
• Aristocratic order. This rule is based on the preferential attachment process (cf. [23,24]), where older nodes have a higher probability of attracting new links. The process selects edges by choosing a source node according to its degree, and a target node chosen at random from the list of available neighbors. By randomly choosing target nodes, low-degree nodes can acquire new links as well. More formally, the probability of selecting the source node is the following: where $\alpha$ is a scale factor that increases or decreases the influence of the degree on the final probability value, and $\deg(u)$ is the degree of node $u$.
• Social order. This rule is inspired by the local clustering of small world networks (also known as triadic closure), and in particular by the observation that two friends of a person are likely to know each other (see [25]). This rule considers it more likely that the edges that close triangles will be selected. Edges that make more than one triadic closure are inserted into the network sooner than others. More formally, the probability of edge $(u, v)$ being selected is the following: where $soc(i, j)$ is the number of triangles closed by edge $(i, j)$ (see for example Figure 1). As for the previous rule, $\alpha$ tunes the effect of triadic closures on the final probability value.
Figure 1. Example of social rule. The figure represents a hypothetical snapshot of graph $G$ at time $t$ during network evolution. Solid lines indicate already existing edges, whereas dashed lines indicate the ones that will be added in the following steps. Edge $(a, c)$ closes three triads, $(a, d)(d, c)$, $(a, b)(b, c)$, and $(a, f)(f, c)$, whereas $(d, f)$ and $(b, e)$ close only two triangles and one triangle, respectively. Therefore, their probabilities of being selected at time $t + 1$ are 0.5, 0.33, and 0.17, respectively.
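The social rule can be illustrated with a minimal Python sketch. It assumes that the selection probability is proportional to soc(u, v) raised to the power α (the exact role of α is an assumption here), and that the rule falls back to a uniform choice when no candidate edge closes a triangle (also an assumption). The function names and the small graph are hypothetical; the graph is one possible layout consistent with the description of Figure 1.

```python
def triangles_closed(adjacency, edge):
    """Number of triangles that adding `edge` would close,
    i.e., the number of common neighbours of its endpoints."""
    u, v = edge
    return len(adjacency.get(u, set()) & adjacency.get(v, set()))


def social_probabilities(existing_edges, candidate_edges, alpha=1.0):
    """Selection probability of each candidate edge under the social rule.

    ASSUMPTION: probability is proportional to soc(u, v) ** alpha.
    """
    adjacency = {}
    for u, v in existing_edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    weights = {e: triangles_closed(adjacency, e) ** alpha for e in candidate_edges}
    total = sum(weights.values())
    if total == 0:
        # ASSUMPTION: uniform fallback when no candidate closes a triangle.
        return {e: 1.0 / len(candidate_edges) for e in candidate_edges}
    return {e: w / total for e, w in weights.items()}


# Hypothetical graph consistent with the Figure 1 description.
existing = [("a", "d"), ("d", "c"), ("a", "b"), ("b", "c"),
            ("a", "f"), ("f", "c"), ("a", "e")]
candidates = [("a", "c"), ("d", "f"), ("b", "e")]
probs = social_probabilities(existing, candidates)
for edge, p in probs.items():
    print(edge, round(p, 2))
```

With α = 1, the three candidate edges close three, two, and one triangle, so their probabilities are 3/6, 2/6, and 1/6, in line with the values given for Figure 1.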

Evolutionary Models: Serial and Parallel
Although the previous three rules are sufficient to define the order in which nodes will be connected, it is also necessary to define how many edges are inserted into the network at every time step. One trivial solution is the serial (also called inertial) setting: we add one edge in every time slot, so that in a network composed of $m$ edges the simulation lasts $m$ simulated time units. This represents the baseline in our experiments (see Section 5.2), and it is crucial for reporting which rules perform best when the system behavior is unfolded.
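As a minimal sketch of the serial setting combined with the random order rule, the snippet below adds one edge of the reference snapshot per time step. Shuffling the reference edges once is equivalent to picking, at each step, any remaining edge with equal probability. The function name `serial_growth` and the toy edge list are hypothetical.

```python
import random


def serial_growth(reference_edges, seed=None):
    """Return the list of edge sets E_1, ..., E_m obtained by adding
    one edge of the reference snapshot per simulated time unit."""
    rng = random.Random(seed)
    order = list(reference_edges)
    rng.shuffle(order)  # uniform random insertion order
    current = set()
    snapshots = []
    for edge in order:
        current.add(edge)
        snapshots.append(frozenset(current))
    return snapshots


reference = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
snapshots = serial_growth(reference, seed=42)
print(len(snapshots), "time units for", len(reference), "edges")
```

Because only edges of the reference snapshot can be chosen and none is removed, the snapshots are nested and the last one coincides with the reference network.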
However, since the previous dynamics might be realistic only in specific situations (for instance in the initial part of a network evolution), we also consider simulations where more than one edge is allowed to be inserted at the same time. We assume that the number of edges added changes as a function of the network efficiency $E_{glob}$ (cf. [26]). This model, which we call parallel (alternatively, accelerated), is described in Algorithm 1.
The algorithm accepts as input: (i) a graph $G = (V, E)$ and (ii) a rule $r_n = \{random \mid aristocratic \mid social\}$. It starts from an empty graph $G'$ that has the same nodes as $G$ and no edges. The algorithm deals with parallel edge creation by selecting at each step a subset $F \subseteq E$ to be added to $G'$. Since each edge can be added to the network only once, $E$ is updated accordingly with the remaining edges. The number of connections selected varies according to the following formula: where $G_{ideal}$ is the ideal network in which all edges exist (the complete graph $K_{|V|}$), $nar_{i-1}$ is the number of edges that have to be inserted into the network, $C$ is a constant factor, and $E(G'_{t-1})$ is the global efficiency of the network $G'$ at step $t - 1$. At the beginning, few edges will be inserted because of low efficiency and, as soon as the network grows and many people are involved in the network, more edges will be chosen and added concurrently. The "1+" term at the beginning of Equation (1) ensures that at least one edge is picked at each step, which is fundamental to allow a minimal growth in the initial phase. The $C$ factor is used to tune the effect of efficiency on the number of chosen edges. We studied the effect of $C$ on the network growth and found that it only expands ($C < 1$) or shrinks ($C > 1$) the time needed to reach the target efficiency, without considerably altering the curve behavior (see Figure 2). For this reason, we decided to use $C = 1$ in all our experiments and simulations.
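The two ingredients of the parallel model can be sketched in Python as follows. `global_efficiency` uses the standard definition of global efficiency (the average inverse shortest-path length over all ordered node pairs, with unreachable pairs contributing zero). `batch_size` is only an assumed form of Equation (1): one plus C times the ratio between the current and the ideal efficiency, multiplied by the number of edges still to be inserted. Both function names, and the capping at the number of remaining edges, are hypothetical choices for illustration.

```python
from collections import deque


def global_efficiency(nodes, edges):
    """Average of 1/d(i, j) over all ordered pairs of distinct nodes;
    disconnected pairs contribute 0."""
    nodes = list(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    adjacency = {u: set() for u in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    total = 0.0
    for source in nodes:
        # Breadth-first search gives shortest paths in unweighted graphs.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(1.0 / d for d in dist.values() if d > 0)
    return total / (n * (n - 1))


def batch_size(efficiency_prev, remaining_edges, c=1.0, efficiency_ideal=1.0):
    """ASSUMED form of Equation (1):
    1 + C * E(G'_{t-1}) / E(G_ideal) * (edges still to insert)."""
    size = 1 + int(c * efficiency_prev / efficiency_ideal * remaining_edges)
    return min(size, remaining_edges)  # never exceed what is left


path = [(0, 1), (1, 2)]
print(global_efficiency(range(3), path))
print(batch_size(0.0, 100), batch_size(0.5, 100))
```

In this sketch the ideal (complete) network has efficiency 1, an empty network yields a batch of exactly one edge, and larger values of C produce larger batches, which matches the qualitative behavior described for C.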

Mermaids
The aim of this section is to introduce the concept of a mermaid: a way to successfully drive network evolution and boost connectivity. The idea is to support the natural evolution of a network by introducing special nodes (the mermaids) that act as "helpers" and are used to quicken the process of network growth. The owner of a social system who wants to quickly reach a successful status can use this kind of artificial booster to modify the natural evolution process of the network and make it grow faster. This is especially important in today's online world, where the importance of social systems is well known and thus competition is fierce.
Mermaids are therefore special members of a social network that are in fact hired by the owners so as to speed up growth and make the network successful. As said in Section 2, mermaids can operationally correspond to real people hired for this purpose, or in some cases even to online bots, although the success of this second option depends on the features of the social system itself. For instance, if the social system is based on sustained personal interactions like chats or personalized images and videos, bots might not be sophisticated enough to sustain such complexity (at least at present). Whether human or artificial, every mermaid pursues the same goal: to boost the network growth of the system, so as to rapidly obtain a successful social environment.
We are therefore interested in practical strategies that can guide the proper use of mermaids. In the following, we define the rules of operation underlying the basic concept of mermaid, and also consider their use in terms of financial technology, that is to say, by considering the impact that mermaids have with respect to limited financial resources. To do so, we define the cost of using mermaids, and then explore which strategy best fits a specific budget situation.

Handling Mermaids
We define mermaids as external nodes (in the sense that they are new to the "normal" network of users) whose goal, as previously said, is to interact with normal users, stimulate overall network utilization (i.e., people's engagement in online social networks), and increase efficiency. Informally, we can therefore distinguish between the "normal" social network (composed of legitimate users) and the extra boosting components that can be added to the network, taking the form of mermaid actors.
Formally, we denote the mermaid nodes as $V_s = \{s_1, s_2, \dots, s_m\}$, so that the new combined graph (the normal graph enriched with mermaids) becomes $G_s = (V \cup V_s, E \cup E_s)$, with total number of nodes $|V \cup V_s| = n + m$ and number of edges $|E \cup E_s|$ (see for instance Figure 3). Mermaids act as special attractors for normal members of the network, and so we parameterize them by their level of attractiveness (how likely normal users are to socially link with a mermaid); the formal definition is provided later in this subsection. Operationally, this parameter can correspond for instance to physical beauty and/or ability of interaction, that is, any quality or skill that makes a mermaid a special node of attraction in the specific social system.
Last but not least, we are interested in the temporal impact that mermaids have on the system, and so we also have to consider the time range in which they are operative.
Summing up, we can define a mermaids' configuration as a tuple $\mu = (m, a, d)$ formed by the following three parameters:
• $m$ specifies the number of mermaids,
• $a$ is the mermaids' ability to attract new edges (i.e., to generate interest in the community),
• $d$ is the operational timespan of mermaids.
As mentioned before, in order to develop strategy guidelines for mermaids we are interested in optimizing the cost/benefit ratio, and therefore in seeing what the best course of action is when dealing with a specific budget. Therefore, we define the cost of a configuration, $C_s(\mu)$, proportional to the previous parameters, that is: In the following, for the sake of conciseness, we will simply write $C_s$ to indicate the cost $C_s(\mu)$ when the underlying configuration $\mu$ is clear from the context.
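The relation between configurations and cost levels can be sketched in Python. The snippet assumes that the cost is the plain product m · a · d with proportionality constant 1, and that the a entry of the tuple takes the mermaid weight values used in the experiments (10 or 20); both are assumptions. The parameter values (6 or 12 mermaids, 10 or 20 time units) are those reported in the experimental section, and the function name is hypothetical.

```python
from collections import defaultdict
from itertools import product


def configuration_cost(m, a, d):
    """ASSUMPTION: cost is the product m * a * d (constant factor 1)."""
    return m * a * d


mermaid_counts = (6, 12)
attractiveness_levels = (10, 20)  # ASSUMPTION: weight values used as `a`
durations = (10, 20)

# Group every configuration (m, a, d) by its cost level.
by_cost = defaultdict(list)
for config in product(mermaid_counts, attractiveness_levels, durations):
    by_cost[configuration_cost(*config)].append(config)

for cost in sorted(by_cost):
    print(cost, by_cost[cost])
```

Under these assumptions the enumeration yields 8 configurations spread over 4 cost levels, and 1200 and 2400 are the only levels shared by more than one configuration, which agrees with the setup described in the experimental section.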
Combinations of these parameters lead to different costs and ideally to different growth behaviors. Another important goal of this paper is to understand how the overall network evolution changes as a function of $\mu$, and in particular to test whether increasing the investment in the mermaids (that is, $C_s$) yields a proportional benefit to the global efficiency. Furthermore, we study which configuration parameters attain the best performance under a specific cost setting (see Section 5.2).
In this context, for the remainder of the paper, we make the following assumptions:
• during the network evolution, edges between mermaids $\{(s_i, s_j) \mid s_i \in V_s, s_j \in V_s\}$ are not allowed,
• mermaids $\{s_1, s_2, \dots, s_m\}$ are active at the beginning of the simulation only, i.e., from time $t_0$ to $t_d$.
The first assumption is just a separation rule that enables us to better factor the impact of mermaids on the social network (links among mermaids are all artificial, and as such they could improve the network statistics without actually being significant from an absolute point of view). In other words, we are interested in the connections among legitimate users, not those among the mermaids.
The second assumption instead pertains to the focus of this study: we focus on using the mermaid boosters in the initial (onset) phase of a social network, given that the initial growth phase is the most critical. Further studies might consider relaxing this constraint and studying the effect of activation in different time periods, so that mermaids act as boosters not only in the initial phase but also as an ongoing way to boost network performance.
Mermaids are artificial components added to the social network and, as such, they are under the complete control of the owner of the network: we are free to decide what kind of social rules they use to connect with the rest of the normal network. Still, mermaids should not behave in strange ways, so as not to arouse suspicion among normal users. We therefore make them behave as normal users by making them use the classic social growth rules described in Section 3: the rules by which edges $\{(s, u) \mid s \in V_s, u \in V\}$ are added are identical to those for normal users, namely random, aristocratic, and social. Given that these rules are only an approximation of normal user behavior, in the following we also investigate all the mixed cases, that is, the combinations of having a certain social rule in action for normal users and another social rule for the mermaids. This way we can investigate whether there are differences in setting a predefined social rule for mermaids, while possibly having another rule that better approximates the behavior of the other users.
In general, mermaids' dynamics evolve independently of users' dynamics. However, there is an exception: a new link in the users' subnetwork could change the likelihood of mermaids' edges being selected, specifically with the social rule. Figure 4 shows some examples. In particular, Figure 4b shows what happens when a new link $(b, c)$ is added to the users' network: edges $(s_1, c)$, $(s_2, c)$, $(s_3, c)$ become more likely to be selected in the following steps because of the triadic closure rule.
Algorithm 2 describes how simulations are made, by combining the agent-based approach of mermaids with the normal evolution model of the network. It uses two sets $E$ and $E_s$ from which the edges are selected, two rules $r_n$ and $r_s$ (that specify which edge to choose in the users' and mermaids' subnetworks), and a configuration $\mu$. The algorithm has a main loop (line 4) in which two distinct phases are executed, each managing the selection of edges for users and mermaids (lines 5 and 9) according to $r_n$ and $r_s$, respectively.
The number of edges selected in the first phase (line 5) is calculated as in Algorithm 1 (using Equation (1)), but for the input graph that now becomes $G = (V \cup V_s, E')$ instead of $G = (V, E)$. In the second phase (line 9), the number of selected edges is set to a constant value equal to $|V_s| \cdot |V| \cdot a$. This means that the total number of links between mermaids and users can be estimated in advance as $|E_s| = |V_s| \cdot |V| \cdot a \cdot d$ (this number is reached at time $t_d$ and does not change afterwards). The effect of mermaids is limited to the first $d$ iterations; after that, the mermaids are deactivated (line 8) and the system evolves by itself.
The attractiveness parameter a quantifies how much a mermaid is able to promote the utilization (i.e., edges creation) of the online community. It is defined as follows: where q(s) is a weight function. In order to meet the requirement that mermaids have a better ability to establish new friendships, we assigned a doubled weight to them compared to normal nodes (see Section 5.2).

Managing Cost
Algorithm 2: Accelerated networks simulation with mermaids
input: E, E_s, r_n, r_s, m, a, d
[...]
F = choose q edges from E_s with method r_s
14 calculate statistics on G'
15 p = number of edges as a function of E_glob(G')
16 end
From a business point of view, every system has to deal with costs. In our context, the cost can be split into two parts: the first accounts for the mermaids' cost (think of mermaids as handled by employees), and the second accounts for the web site cost. More formally, the following formula gives an estimation of the managing cost of setting up a network until it reaches a steady state: where $C_s$ is the cost incurred by using a particular configuration (as defined in Equation (2)), $\beta$ is the cost per time unit of the web site, and $T_{min}$ is the minimum timespan needed by the network to evolve toward a connection pattern with a specific global efficiency (given a configuration which costs $C_s$).
We are now interested in knowing the points at which the function $f$ has minima. For this reason, by calculating the first derivative of Equation (4) and solving $f'(C_s) = 0$, we find that and so the cost is minimum when: We will return to this topic in Section 5.2, where we test how the minimal points of $f$ change versus hypothetical values of $\beta$.
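The minimization step can be sketched as a short derivation. It assumes that Equation (4) has the additive form suggested by its description (mermaids' cost plus web site cost per time unit times the time needed), and that $T_{min}$ can be treated as a differentiable function of $C_s$; both are assumptions made here for illustration.

```latex
% Assumed additive form of Equation (4)
f(C_s) = C_s + \beta \, T_{min}(C_s)

% First-order condition
f'(C_s) = 1 + \beta \, \frac{d T_{min}}{d C_s} = 0
\quad \Longrightarrow \quad
\frac{d T_{min}}{d C_s} = -\frac{1}{\beta}
```

Under these assumptions, the total cost stops decreasing at the point where one extra unit spent on mermaids saves exactly $1/\beta$ time units of web site operation; the stationary point is a minimum if $T_{min}$ is convex in $C_s$.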

Experimental Results
In this section, we report our experimental analysis. The simulation algorithms were implemented in the Python and C programming languages. All the experiments were conducted on three Linux machines equipped with an Intel i5 processor and 8 GB of RAM.

Datasets
We conducted experiments on two real-world datasets: the Communities and VirtualTourist online social networks. Communities (www.communities.com), CM for short, is considered to be the first social network in the world, as it started in 1996. It is similar to many other social networks like Facebook or LinkedIn, where users meet new people, share photos, and chat with friends. Communities is managed by the users themselves, who create customized web pages in which they express passions, loves, and friendships. Every user can keep track of friends in a friends list, and can use guestbooks, blogs, or photo galleries. In Communities, users can establish virtual contacts but, unlike in the real world, these ties can easily be maintained over distance. This produces a network of virtual social ties that connects the entire world. Communities lets members create and join communities in order to easily find groups of people sharing similar interests. Joining a community means being able to chat with other members in the community forums and chat rooms. Users' locations in Communities are widely distributed and span over 185 nations.
VirtualTourist (www.virtualtourist.com), VT for short, is an online travel-oriented community started in 1998, in which users share their own travel experiences, suggest and review hotels, write comments and opinions on forums, find places to visit, and share photos and videos; it is considered to be the ancestor of TripAdvisor (with more emphasis on social networking). It is a community of people who love traveling around the world. In VirtualTourist, users can meet new people and create a network of virtual social friendships as well. [27,28]). Publicly available profiles and friendships were parsed and anonymized. At that time, there were approximately 700 thousand users in VT, 650 thousand of which were singletons (92.4% of the total), i.e., users that had joined the service but had never made a connection with another user. Conversely, 57 thousand users had at least one friend (approximately 7.6% of the total). There were more than 200 thousand social ties at that time. The VT network has a giant component, a group of users who are pairwise connected through paths in the social network, formed by 53,034 nodes (92% of the total nodes with degree greater than zero). The rest of the network is formed by 2077 small (fewer than 14 nodes each) isolated communities (also called the middle region [29]) that are disconnected from the giant component.
In Communities, there were about 30 thousand registered users, 18 thousand of which were singletons (60% of the total) and 12 thousand of which had at least one friend (about 40%). There exist approximately 60 thousand friendship links. Apart from singletons, the vast majority of the nodes of the community (about 12,131, 92.7% of the total) belong to the giant component, whereas the rest belong to the middle region, formed by small communities with fewer than 8 nodes each.
Since social ties are bidirectional in both systems, we mathematically treat those graphs as undirected. Table 1 summarizes the most important network statistical features. It also contains the metrics calculated on randomized versions of the same graphs. We note that both networks have a small average shortest path $L$ (less than 5 hops between two randomly chosen nodes) and a high clustering coefficient $C$ (compared to the randomized versions, fourth and fifth columns). High $E_{glob}$ and $E_{loc}$ have been detected too. These facts are evidence for classifying them as small world ([28]). Both networks are formed by many connected clusters, so the average path length and clustering coefficient are calculated on the largest connected component (LCC), whereas global and local efficiency are calculated on the entire network (the latter two quantities work correctly even for disconnected networks, see [1,30]). Indeed, since the cumulative degree distributions $P_{cum}$ have tails that decay as a power law with exponents equal to 2.5 and 2.7 (see Figure 5), and the maximum degree $k_{max}$ is higher than the average degree $\langle k \rangle$, they could be classified as scale-free networks.
The plots in Figure 6 and the assortativity values $\rho$ in Table 1 suggest that both networks are disassortative (Pearson correlation equal to −0.59 and −0.30, respectively). This means that, on average, users with many connections tend to connect to users with few friends (see the classification in [31]). Many other studies found the same correlation pattern in online social networks, for instance in the YouTube ([32]), pussokram ([2]), or Cyworld ([33]) networks. Being elite in online social networks simply means having many connections and is just a matter of clicks (cf. [34]). However, this assortativity pattern is the opposite of the one found in the real world, where establishing and maintaining friendships requires time and effort and where many other factors might influence the likelihood of being a friend of a person, such as cultural, economic, and geographical circumstances.

Results
We now evaluate our models by simulating the network evolution with respect to suitable configurations $\mu$. We selected the global efficiency as the main statistical feature tracked during the experiments. A configuration is defined as a tuple composed of (i) the number of mermaids $m$ used, (ii) the mermaids' attractiveness $a$, and (iii) the length of time $d$ in which the mermaids are active (starting from $t_0$). We decided to employ 6 or 12 mermaids and to use those special nodes for the first 10 or 20 time units. To estimate the attractiveness as defined in Equation (3), we use a weight function with $q(u) = 1$ for $u \in V$ and $q(s) = 10$ or $20$ for $s \in V_s$, so that these special nodes acquire more links than normal nodes. Table 2 presents the estimated attractiveness values of normal users $a_n$ and mermaids $a_s$, together with the variables used to calculate them. With the previous parameters, we created a set of 8 configurations and 4 cost levels (listed in Table 3).
Table 2. Columns: $|V|$, $m$, $q(s)$, $a_n$, $a_s$, where $|V|$ is the number of nodes, $m$ the number of mermaids, $q(s)$ the weight assigned to mermaids, $a_n$ the attractiveness of normal nodes, and $a_s$ the attractiveness of mermaids.
Table 3. Configurations and cost levels, e.g., (12, 20, 20).
The configurations planned so far do not exhaust all the axes along which our simulations are based. In fact, two more dimensions are needed: the rule that selects edges between normal nodes and the rule that selects edges between mermaids and users (recall from Section 4 that we consider the more general case where the preset social rule of mermaids can differ from the social rule that better approximates real user behavior). Since these dynamics are independent but the names of the rules are the same, we dub the mermaids' rules the Broadcast, Word of Mouth, and Preferential models in order to distinguish them from the users' rules.
As said earlier in Section 3, the simulations considered in this paper (unless stated otherwise) are constrained in the sense that every edge added among users must exist in the original network. We decided to use this approach because other techniques like, for instance, stochastic simulations (Construct and Link Probability Model) are not well suited to describing big social systems (they incur computational issues) and because they usually require setting a high number of initial parameters (which makes the initial setup needed to simulate an existing network highly nontrivial).
We start by looking at the results obtained for the serial analysis (see Section 3.1), so as to understand the effect each rule has on the unfolded network evolution and, subsequently, we consider the more realistic situation in which more than one edge can be added at the same time (equivalently called accelerated, parallel, or simultaneous). Then, in order to evaluate the effectiveness of our methods to detect new instincts in social systems and to verify whether they are valuable as an incentive for network utilization, we test (i) how much faster global efficiency increases when using mermaids and (ii) whether the growth curve is altered by using these special nodes.
We are interested in uncovering all these aspects of online virtual communities by trying to answer the following questions:
Q0 Does each rule behave equally in the inertial (serial) context? What happens in the accelerated context?
Q1 How do the same cost configurations influence efficiency?
Q2 How do parameter variations influence global efficiency?
Q3 How much do we have to invest in special nodes?
Before delving into the details of the answers to the previous questions, we outline a general discussion of the results for all the growth patterns. Regardless of the configuration adopted, we found that an S-shaped curve characterizes all the growth patterns of $E_{glob}$. It is well known that the S-shaped curve is at the heart of many diffusion processes and is characteristic of a chain reaction, in which the number of people who adopt a new behavior follows a logistic-like function (cf. [35]): a slow growth in the initial stage, a rapid growth around the critical mass time, and a rapid flattening of the curve beyond this point. Because of that, our models and rules could be considered good candidates for estimating the real network evolution.
Figure 7 shows the unfolded behavior of the systems for the three proposed rules, namely random, aristocratic, and social. Each curve represents the global efficiency $E_{glob}$ of the temporal networks created by adding one edge at a time. The plots allow for interesting observations. First, we note that up to one sixth of the complete spectrum, each rule produces an indistinguishable behavior, probably due to the weak network structure. After that point, the cumulative effect of drawing edges in different ways starts to appear. The behavior detected is super-linear for the aristocratic rule, meaning that preferential attachment is an effective way to boost network efficiency. Conversely, with the social rule we observe a weak sub-linear increase, suggesting that triadic closure is not the only key ingredient for network evolution. A linear increase is detected for the random rule. To avoid the bias of randomness, we ran 100 simulations and considered the averaged results. Standard deviations are small and are therefore not plotted, in favor of clearer plots.
Even though preferential attachment seems to perform better than the other rules in serial evolution, this does not necessarily hold in other settings such as accelerated (parallel) simulations. In fact, as Figure 8 shows, the random and social rules turn out to be 30% and 60% faster (CM and VT, respectively) than preferential attachment in reaching the maximum E_glob (see Table 4, which reports the minimum time T_min needed to reach the original global efficiency). This is probably due to a combined effect of the topological structure and the rule applied. Online social networks, like social networks in general (cf. [25]), are formed by weak ties that are responsible for keeping subcommunities together and preserving the global reachability among nodes. According to preferential attachment, nodes with high degree are more likely to acquire new links. However, weak ties are not necessarily connected to hubs, so they are not selected at the beginning, which keeps the global efficiency low.
[Plot axis: Step; legend: rnd, ari, soc]
Figure 9. Effect of simultaneous network simulation in the randomized versions of Communities (left panel) and VirtualTourist (right panel). Curves start at simulation time t_0, but the points with low values of E_glob are cropped for graphical clarity. The standard deviation is very small and is therefore not plotted.

Q0: Unfolded serial setting.
To verify whether the network topology affects the overall behavior, we applied the same rules to randomized versions of the networks. Surprisingly, as Figure 9 and Table 4 show, the preferential attachment rule, previously the slowest, is now 11% and 12% faster than the others (CM and VT, respectively).
The explanation of this phenomenon again depends on the specific topological structure of the random networks. Their main characteristic is that global connectivity is not based on weak ties but on scattered edges that connect randomly chosen nodes. As a consequence, the preferential attachment effect is strongly limited by the degree homogeneity of these networks, so the likelihood of selecting long-range edges is higher. These edges bring together distant substructures and yield a fast increase in efficiency. Indeed, comparing Figures 8 and 9, we found that the topological structure strongly influences the overall simulated time: the simulations on artificial random networks are up to two times slower.

Q1: Same cost configurations.
The following set of figures (Figures 10 to 16) shows how configurations with the same cost affect the global efficiency. In particular, we consider the cost levels C_s that have at least two configurations µ, namely 1200 and 2400. Table 3 collects all the possible configurations with a specific cost.
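To make the notion of same-cost configurations concrete, here is a minimal sketch that groups configurations by cost. It rests on two assumptions that are inferred from the configurations quoted in this section rather than stated in it: a configuration is ordered as (number of mermaids, attractiveness, timespan), and its cost is the product of the three values, so that (6,20,10) costs 1200 and (12,20,10) costs 2400. The parameter grid and the function names are hypothetical; the actual configurations are those of Table 3.

```python
from collections import defaultdict
from itertools import product


def group_by_cost(mermaid_counts, attractiveness_levels, timespans):
    """Group configurations (m, a, t) by the assumed cost C_s = m * a * t."""
    groups = defaultdict(list)
    for m, a, t in product(mermaid_counts, attractiveness_levels, timespans):
        groups[m * a * t].append((m, a, t))
    return dict(groups)


def highest_attractiveness(configs):
    """Return every configuration whose attractiveness (2nd entry) is maximal."""
    best = max(a for _, a, _ in configs)
    return [c for c in configs if c[1] == best]


# Hypothetical grid chosen to reproduce the configurations quoted in the text.
groups = group_by_cost([6, 12], [10, 20], [10, 20])

# Keep only the cost levels shared by at least two configurations.
shared = {cost: cfgs for cost, cfgs in groups.items() if len(cfgs) >= 2}
candidates = {cost: highest_attractiveness(cfgs) for cost, cfgs in shared.items()}
```

Under these assumptions, the only cost levels with at least two configurations are 1200 and 2400, with three configurations each.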
A single simulation run needs three parameters: a configuration and two rules. The first rule specifies the dynamics of the users and the second that of the mermaids. To limit the bias due to the randomness of edge selection, we repeated each simulation 100 times and averaged the results. However, the (simulated) timespan needed to reach the target efficiency may vary from run to run, which makes averaging less straightforward. For this reason, we extended the timespan of every simulation to match the longest one, so that we could average the y values at fixed x intervals. When the spread of the timespan values is large (see for instance Figure 13c), and specifically for some C_s of the VirtualTourist dataset, this averaging method can produce averaged curves that look like step functions. We think this issue could be easily resolved by increasing the number of simulations.
Figure 10 shows that one specific configuration performs better than the others, namely the one with the highest value of attractiveness. Surprisingly, this result is quite general: it holds regardless of the cost level, the selected rule, and the chosen mermaids' dynamics. In particular, Figure 10 uses the broadcast model with the random (top panels) and aristocratic (bottom panels) rules, at two cost levels: C_s = 1200 (left panels) and C_s = 2400 (right panels). Configurations (6,20,10) and (12,20,10) outperform the others; in these cases network efficiency starts to increase earlier, regardless of the growing rule of the users' network.
Figure 11. Comparison between same-cost configurations of the Communities online social network. We consider two cost levels, C_s = 1200 (left panels) and C_s = 2400 (right panels), and the random, aristocratic, and social rules. All plots refer to the word of mouth model.
We clearly see that network efficiency increases faster in configurations with a higher value of attractiveness, regardless of the cost level or rule selected.
[Panel (d): aristocratic rule; x-axis: Step; legend: (12,10,20), (12,20,10), (6,20,20)]
Figure 14. Behavior of the network's E_glob at two cost levels, C_s = 1200 (left panels) and C_s = 2400 (right panels), for the VirtualTourist social network with the broadcast model. In total, six configurations are considered. The one with the higher attractiveness is favored because it reaches the efficiency of the original network faster than the others. In all experiments, the configuration that performs best is the one with fewer mermaids and higher attractiveness (or, equivalently, with mermaids that last longer). In accordance with the results of the accelerated analysis with no mermaids (Figure 7), the random and social rules attain the target efficiency in fewer steps than the aristocratic rule.
Accelerated analysis with mermaids for the random, aristocratic, and social rules, preferential model, in the VirtualTourist social network, with cost C_s = 1200 (left panels) and C_s = 2400 (right panels). The configurations with higher attractiveness clearly reach the target efficiency faster, regardless of the users' growing rules.
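The averaging of repeated runs described for Q1 (extend every run to the longest timespan, then average the y values at fixed x positions) can be sketched as follows. How a shorter run is extended is not specified in the text; this sketch assumes that a run which has already reached the target keeps its final value, which is one plausible reading and would also account for the step-like averaged curves when the run lengths are widely spread. The function name and the toy curves are hypothetical and are not results from the experiments.

```python
def average_runs(runs):
    """Average E_glob curves of unequal length at fixed simulation steps.

    Assumption: a run shorter than the longest one is extended by
    repeating its final value until all runs have the same length.
    """
    if not runs or any(len(r) == 0 for r in runs):
        raise ValueError("need at least one non-empty run")
    longest = max(len(r) for r in runs)
    # Pad each run on the right with its last value.
    padded = [list(r) + [r[-1]] * (longest - len(r)) for r in runs]
    # Average column by column, i.e. at each fixed step.
    return [sum(column) / len(padded) for column in zip(*padded)]


# Made-up toy curves: one run reaches the target in 3 steps, the other in 5.
runs = [[0.1, 0.5, 1.0], [0.1, 0.2, 0.4, 0.7, 1.0]]
mean_curve = average_runs(runs)
```

With only two runs, the average stays at intermediate plateaus while the slower run catches up; with many runs of similar length, the plateaus smooth out.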

Q2: Parameters variation.
In the following experiments, we investigated the effects of varying the parameters of the configurations. In particular, we fixed the number of mermaids (m = 6 and m = 12) and compared the performance of the other configurations against the baselines, (6, 10, 10) and (12, 10, 10) respectively. The plots are grouped by network: Figures 17, 18, and 19 show the results for Communities under the different mermaids' dynamics, namely broadcast, word of mouth, and preferential, whereas Figures 20, 21, and 22 refer to VirtualTourist. All the plots in this section clearly show that increasing C_s shortens the time needed to reach the target efficiency. This is a very interesting result because it confirms the effectiveness of investing more in mermaids in order to boost network engagement and, in particular, to lower the threshold after which connectivity spreads all over the network. A further observation is that, even though increasing C_s always triggers a broader spread of connectivity, the benefit is not proportional to C_s. For instance, when C_s is quadrupled, the simulated time shrinks by less than a factor of four. Question Q3 quantifies this benefit. Remarkably, the observed effects are universal in the sense that they hold regardless of the network and the rule considered, suggesting their validity in a wide class of social networks.

Q3: Trade-off between the benefit of investing in mermaids and the cost.
In the plots presented previously, we described how the timespan needed to reach the reference efficiency varies with C_s. Figures 23 and 24 (rightmost panels) show T_min as a function of C_s, which allows a more quantitative description of the benefit of investing in mermaids. The plots clearly show that the relationship is not linear, as one might guess; instead, T_min is inversely proportional to C_s. We think this is probably due to system saturation: the network is not able to respond to high levels of exogenous stimuli from mermaids, resulting in performance comparable to that obtained with lower-cost configurations.
To test whether this finding holds when considering threshold values of efficiency, we investigated the time needed to reach one half and one third of E_glob. Surprisingly, as the figures show (leftmost and centermost panels), the benefit is still inversely correlated to C_s. This means that the efficiency growth behavior is quite regular. In Section 4.2 we introduced Equation (4), which accounts for the cost of running a social network, and we analytically found that f(C_s) is minimum when Equation (5) holds. In real contexts β, i.e., the hypothetical cost per unit time of running a web service, is influenced by many factors, so a unique value might not exist. For this reason, we tried many values of β that meet Equation (5). To obtain those values, we calculated the first discrete derivative of T_min(C_s) for the different threshold values. Table 7 lists all the values of β we examined in our experiments. Table 8 shows the total cost C_t as a function of C_s for the different threshold values. This is a very interesting finding because, once the cost per unit time of a web service is known, our method can estimate the C_s that yields the minimum C_t. For instance, suppose that the cost per unit time β is approximately equal to 90 (with no threshold on E_glob); the configurations that achieve the minimum cost are then those with C_s ∈ [1200, 2400]. Moreover, since many configurations share the same cost level, the one with the higher value of attractiveness is the one that reaches the target efficiency fastest (see question Q1).
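The budget estimate described above can be sketched in a few lines. Equations (4) and (5) are not reproduced in this section, so the sketch assumes an additive total cost, C_t = C_s + β · T_min(C_s); under that assumption the cost is stationary where the derivative of T_min with respect to C_s equals -1/β, which is why the first discrete derivative of T_min(C_s) yields candidate values of β. The function names are hypothetical, and the sample values of T_min are made up for illustration (an exact inverse law); they are not taken from Tables 7 or 8.

```python
def total_cost(c_s, t_min, beta):
    """Assumed additive form: mermaid budget plus running cost until the target."""
    return c_s + beta * t_min


def beta_candidates(costs, t_mins):
    """Values of beta that make the assumed total cost stationary.

    Uses the first discrete derivative of T_min(C_s) between consecutive
    samples: beta = -1 / (delta T_min / delta C_s).
    """
    betas = []
    samples = list(zip(costs, t_mins))
    for (c0, t0), (c1, t1) in zip(samples, samples[1:]):
        slope = (t1 - t0) / (c1 - c0)
        if slope < 0:  # only a decreasing T_min gives a positive beta
            betas.append(-1.0 / slope)
    return betas


def best_budget(costs, t_mins, beta):
    """Return the sampled C_s with the smallest assumed total cost."""
    return min(zip(costs, t_mins), key=lambda p: total_cost(p[0], p[1], beta))[0]


# Made-up samples following T_min = 240000 / C_s (illustrative only).
costs = [600, 1200, 2400, 4800]
t_mins = [400, 200, 100, 50]

betas = beta_candidates(costs, t_mins)
cheap_service = best_budget(costs, t_mins, beta=10)
costly_service = best_budget(costs, t_mins, beta=30)
```

In this toy setting, a higher cost per unit time shifts the best budget toward a larger C_s, because shortening the time to target saves more than the extra investment in mermaids costs.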

Conclusions
In this article we have dealt with the problem of temporal evolution of social networks, trying to shed light on the effective strategies that can trigger a successful system evolution.
To date, many models of network growth have been proposed, especially in the context of social networks. For instance, some researchers observed that new social ties are driven by randomness. Accordingly, some classical models are based on random wiring rules, whereas others are based on preferential attachment (i.e., older nodes acquire new links at a faster rate than newer ones) or on the triadic closure rule (also known as the friend's friend rule: two friends of a person are more likely to know each other than two randomly chosen people).
Even though these classical growth models are well known and are applied in social networks as well as in many other complex network settings, we focused on the fundamental ingredients of a successful network evolution, that is, one that considerably boosts connectivity. In particular, we identified an important strategic tool in a new set of special nodes, called "mermaids", whose aim is to increase network utilization by establishing new links with existing nodes. We then identified the best strategies for using this tool when the cost factor and a predefined budget constraint are also taken into account.
The main questions we raised in this paper are the following: Are mermaids beneficial as a way to widely spread the adoption of new online social systems? How is the global network behavior shaped by employing special nodes? Which achieves the best performance: the serial model (i.e., one edge added at a time) or the simultaneous model (i.e., the number of edges added varies dynamically according to the current efficiency)? How do configurations with the same cost influence network efficiency? How much does it cost to use mermaids?
We systematically simulated two online communities with different sizes and topics, demonstrating the effectiveness of mermaids in driving social evolution and boosting community engagement in general. The simulations were performed as a function of the mermaid configurations, which are composed of three main parameters: the number of mermaids, their attractiveness, and the timespan of their utilization. We found that, at the same cost, the configurations that attain the best results are those with high attractiveness, regardless of the online social network considered. Therefore, the best strategy for using mermaids within a set budget is to focus primarily on the attractiveness parameter and to adjust the number of mermaids and their operational time accordingly.
Several other interesting features could be explored as a natural extension of this paper. For instance, it would be important to investigate whether the identified strategies are universally applicable to any online (and also non-online) social network. Another important direction in the context of social modeling would be to study not only how E_glob varies with different configurations and parameters but also other fundamental features such as local efficiency, assortativity, and centrality, in order to gain deeper insight into boosted network evolution.