Information Transfer between Stock Market Sectors: A Comparison between the USA and China

Information diffusion within financial markets plays a crucial role in the process of price formation and the propagation of sentiment and risk. We perform a comparative analysis of information transfer between industry sectors of the Chinese and the USA stock markets, using daily sector indices for the period from 2000 to 2017. The information flow from one sector to another is measured by the transfer entropy of the daily returns of the two sector indices. We find that the most active sector in information exchange (i.e., the sector with the largest total information inflow and outflow) is the non-bank financial sector in the Chinese market and the technology sector in the USA market. This is consistent with the role of the non-bank sector in corporate financing in China and the impact of technological innovation in the USA. In each market, the most active sector is also the largest information sink, that is, the sector with the largest net information inflow (inflow minus outflow). In contrast, we identify that the main information source is the bank sector in the Chinese market and the energy sector in the USA market. In the case of China, this is due to the importance of net bank lending as a signal of corporate activity, while in the USA it reflects the role of energy pricing in affecting corporate profitability. There are sectors, such as the real estate sector, that could be an information sink in one market but an information source in the other, showing the complex behavior of different markets. Overall, these findings show that stock markets are more synchronized, or ordered, during periods of turmoil than during periods of stability.


Introduction
Complex systems, such as financial markets, are usually composed of many subsystems. In the case of financial markets, the information flows and interactions within the market itself are rarely investigated, even though they are critical in driving the dynamics of the system as a whole. Many methods have been proposed to unveil the relationships among subsystems, including simple correlation analysis [1,2], Granger causality [3], nonparametric approaches such as the thermal optimal path method [4][5][6], and mutual information analysis [7][8][9]. These approaches have their own advantages and limitations. Importantly, while Granger causality is commonly used in economics to identify time-varying unidirectional or bidirectional causality, it is sensitive to the choice of sample period and to complexity in the underlying time series, among other issues [10,11].
In this paper, we use an alternative approach termed transfer entropy to identify the information transfers between industrial sectors in the world's two largest economies: the USA and China. Transfer entropy, as a kind of log-likelihood ratio [12], is a measure that quantifies information flow based on the probability density function (PDF). Compared with correlations or Granger causality, transfer entropy has the advantage that it not only identifies the direction of the information flow but also quantifies the flows between different subsystems. In other words, it is capable of quantifying the strength and direction of the interaction between different subsystems at the same time. This approach has found wide application [13][14][15][16][17][18][19][20][21]. Furthermore, variations and extensions of transfer entropy have been developed that are suitable for different situations [22], such as symbolic transfer entropy [23].
Many studies have applied the concept of transfer entropy to economic systems, including financial time series [18,24,25], stock market indices [26,27], composite indices and their constituent stocks [28,29], and indices of industry sectors of a stock market [30].
Stock price fluctuations reflect both global and local news as well as news within a subsystem. There are also well-known calendar anomalies related to the business cycle and to market participants' sector rotations [31]. In a related work, Oh et al. investigated the information flows among different sectors of the Korean stock market [30]. They measured the amount of information flow and the degree of information flow asymmetry between industry sectors around the subprime crisis and identified the insurance sector as the key information source after the crisis. Although the authors do not provide an economic basis for this finding, it is likely linked to the insurance sector acting as a leading indicator of risk in the economy. In this work, we extend their analysis and perform a comparative study of the information transfer among different industry sectors of the Chinese and the USA stock markets. These two markets are respectively the largest emerging and developed stock markets, associated with the two largest economies in the world.
The rest of this paper is organized as follows. Section 2 describes the method for calculating symbolic transfer entropy and the sector index time series for the Chinese and the USA stock markets. Section 3 presents the empirical results on the information flows between stock market sectors and their relationship with market states. Section 4 concludes this work.

Symbolic Transfer Entropy
Schreiber was the first to use transfer entropy to measure information transfer and detect asymmetry in the interactions among subsystems [13]. He treated the breath rate and heart rate time series of a sleeping human as two subsystems and found that the information flow from the heart to the breath signal is dominant. There are various approaches in the literature for estimating the transfer entropy between two time series. We use the symbolic transfer entropy introduced by Staniek and Lehnertz [23]. Consider two daily closing price time series $\{X_t\}$ and $\{Y_t\}$, $t = 1, 2, \ldots, L$, which have the same length $L$. Closing prices are used to ensure that prices factor in local market news as well as intramarket news from the various sectors. Transfer entropy $T^S_{y \to x}$ assumes that $X_t$ is influenced by the previous $l$ states of the same variable and by the previous $m$ states of variable $Y$. For financial markets, only the day before is important [32]. Hence, we use $l = m = 1$ in this study. The procedure to calculate the symbolic transfer entropy $T^S_{y \to x}$ from time series $\{y_t\}$ to $\{x_t\}$ is briefly described in the following five steps.

First, we adopt the log returns $\{x_t\}$ instead of the original price time series $\{X_t\}$ by

$$x_t = \ln X_t - \ln X_{t-1}, \tag{1}$$

where $X_t$ is the closing price of the index on the $t$th trading day.

Second, the returns are discretized into $q$ nonoverlapping windows of equal length $\Delta$. If there are too many windows, the chance of having particular combinations drops very quickly, making the calculation of probabilities slower and less informative [32]. Hence, a value of $q$ that is too large or too small is not appropriate. Marschinski and Kantz consider $q = 2$ and $q = 3$ in their research [24]; Sandoval uses $q = 24$ and $q = 6$ [32]. We aim at finding the optimal $q$ that maximizes the transfer entropy difference between two time series while minimizing the calculation cost.
In our comparative investigations, the parameter $q$ varies from 2 to 22 in steps of 1. We find that when $q \geq 15$, the difference becomes significantly nonzero. Considering the calculation cost and the strength of transfer entropy, we use $q = 15$ in this work. We obtain the maximum value $x_{\max}$ and the minimum value $x_{\min}$ of the time series $x_t$ under investigation. The length of each interval is $\Delta_x = (x_{\max} - x_{\min})/q$ and the $k$th interval is $[x_{\min} + (k-1)\Delta_x,\ x_{\min} + k\Delta_x)$. We repeat the procedure for $y_t$, whose interval length $\Delta_y$ is usually different from $\Delta_x$.
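The first two steps can be illustrated with a minimal Python sketch. The function names `log_returns` and `symbolize` are hypothetical and are not taken from the original study. The sketch also assumes that the maximum return, which lies on the open right edge of the last interval, is assigned to interval $q$; the text does not state how this boundary case is handled.

```python
import numpy as np

def log_returns(prices):
    """Log returns x_t = ln X_t - ln X_{t-1}; the result has len(prices) - 1 entries."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices))

def symbolize(returns, q=15):
    """Map each return to the index k (1..q) of its equal-width interval."""
    returns = np.asarray(returns, dtype=float)
    x_min, x_max = returns.min(), returns.max()
    if x_max == x_min:
        # Degenerate case: a constant series occupies a single interval.
        return np.ones(returns.shape, dtype=int)
    delta = (x_max - x_min) / q
    k = np.floor((returns - x_min) / delta).astype(int) + 1
    # Assumption: the maximum value joins the last interval q.
    return np.clip(k, 1, q)
```

For example, `symbolize(log_returns(prices), q=15)` would give the symbol series for one sector index, with each series binned on its own range so that $\Delta_x$ and $\Delta_y$ may differ.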
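Third, the log return time series are converted into the symbolic series $\hat{x}$ and $\hat{y}$, described as

$$\hat{x}_t = k \ \text{ if } \ x_{\min} + (k-1)\Delta_x \le x_t < x_{\min} + k\Delta_x, \qquad \hat{y}_t = k \ \text{ if } \ y_{\min} + (k-1)\Delta_y \le y_t < y_{\min} + k\Delta_y, \tag{2}$$

with $k = 1, 2, \ldots, q$.

Fourth, the numbers of elements in the $q$th interval are denoted by $\hat{x}^q_t$ and $\hat{y}^q_t$, respectively, and we then calculate the probabilities $p(\hat{x}_t) = \hat{x}^q_t/(L - 1)$ and $p(\hat{y}_t) = \hat{y}^q_t/(L - 1)$ and the joint probabilities $p(\hat{x}_t, \hat{y}_t)$, $p(\hat{x}_t, \hat{x}_{t+1})$, and $p(\hat{x}_{t+1}, \hat{x}_t, \hat{y}_t)$.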
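Fifth, in information theory, different bases of the logarithm lead to different units of entropy. Base 2 is the most widely used in empirical works on transfer entropy. Therefore, we use base 2 to calculate transfer entropy in our study. The symbolic transfer entropy from time series $\{y_t\}$ to time series $\{x_t\}$ is calculated as

$$T^S_{y \to x} = \sum p(\hat{x}_{t+1}, \hat{x}_t, \hat{y}_t) \log_2 \frac{p(\hat{x}_{t+1} \mid \hat{x}_t, \hat{y}_t)}{p(\hat{x}_{t+1} \mid \hat{x}_t)}, \tag{3}$$

where the joint probability $p(\hat{x}_{t+1}, \hat{x}_t, \hat{y}_t)$ is the probability that the combination of $\hat{x}_{t+1}$, $\hat{x}_t$, and $\hat{y}_t$ occurs, while $p(\hat{x}_{t+1} \mid \hat{x}_t, \hat{y}_t)$ and $p(\hat{x}_{t+1} \mid \hat{x}_t)$ are the conditional probabilities that $\hat{x}_{t+1}$ has a particular value when the values of the previous samples $\hat{x}_t$ and $\hat{y}_t$ are known and when only $\hat{x}_t$ is known, respectively. Since

$$p(\hat{x}_{t+1} \mid \hat{x}_t, \hat{y}_t) = \frac{p(\hat{x}_{t+1}, \hat{x}_t, \hat{y}_t)}{p(\hat{x}_t, \hat{y}_t)} \quad \text{and} \quad p(\hat{x}_{t+1} \mid \hat{x}_t) = \frac{p(\hat{x}_{t+1}, \hat{x}_t)}{p(\hat{x}_t)},$$

we can simplify Equation (3) and obtain

$$T^S_{y \to x} = \sum p(\hat{x}_{t+1}, \hat{x}_t, \hat{y}_t) \log_2 \frac{p(\hat{x}_{t+1}, \hat{x}_t, \hat{y}_t)\, p(\hat{x}_t)}{p(\hat{x}_t, \hat{y}_t)\, p(\hat{x}_{t+1}, \hat{x}_t)}. \tag{4}$$

This expression is used for the estimation of the symbolic transfer entropy.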
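A plug-in estimate of the simplified expression, in which the transfer entropy is written purely in terms of joint and marginal probabilities, can be sketched in Python as follows. The function name `symbolic_transfer_entropy` is hypothetical. As a simplifying assumption, all probabilities are estimated as relative frequencies over the same set of consecutive-day transitions, so that the marginal and joint counts are mutually consistent; this may differ slightly from the normalization used in the study.

```python
from collections import Counter
from math import log2

def symbolic_transfer_entropy(x_sym, y_sym):
    """Estimate T^S_{y->x} in bits for l = m = 1 from two symbol sequences."""
    if len(x_sym) != len(y_sym):
        raise ValueError("both symbol sequences must have the same length")
    # One triple (x_{t+1}, x_t, y_t) per pair of consecutive days.
    triples = list(zip(x_sym[1:], x_sym[:-1], y_sym[:-1]))
    n = len(triples)
    c_xxy = Counter(triples)
    c_xx = Counter((x1, x0) for x1, x0, _ in triples)
    c_xy = Counter((x0, y0) for _, x0, y0 in triples)
    c_x = Counter(x0 for _, x0, _ in triples)
    te = 0.0
    for (x1, x0, y0), c in c_xxy.items():
        # p(x1,x0,y0) p(x0) / (p(x0,y0) p(x1,x0)); the common factor n cancels.
        ratio = c * c_x[x0] / (c_xy[(x0, y0)] * c_xx[(x1, x0)])
        te += (c / n) * log2(ratio)
    return te
```

Only symbol combinations that actually occur contribute to the sum, which avoids taking the logarithm of zero. With $q = 15$ there are up to $15^3$ possible triples, so the estimate is only meaningful for series that are long compared with the number of occupied combinations.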

Data Description
To conduct our analysis, we selected two sets of data from two major stock markets: the Chinese and the USA stock markets. The Chinese stock market is the largest emerging market, while the US stock market is the world's largest developed stock market.
For the Chinese stock market, we retrieved and analyzed the SWS sector indices issued by Shenyin & Wanguo Securities Co., Ltd.

Symbolic Transfer Entropy and Degree of Asymmetric Information Flow of the Whole Samples
As mentioned in Section 1, symbolic transfer entropy can proxy for the strength and direction of the information flow between two time series. Following Oh et al. [30], we use the degree of asymmetric information flow to measure the information effect between two stock sectors, which is defined as

$$\Delta T^S_{i \to j} = T^S_{i \to j} - T^S_{j \to i}.$$

It follows that $\Delta T^S_{j \to i} = -\Delta T^S_{i \to j}$. We show the results for our datasets in four heat maps of $T^S_{i,j}$ and $\Delta T^S_{i,j}$ in Figure 1 (top row for the Chinese sectors and bottom row for the US sectors), in which each cell shows the value of $T^S$ (left plot) or $\Delta T^S$ (right plot) from sector $i$ to sector $j$. The diagonal entries of the matrices $T^S_{i,j}$ and $\Delta T^S_{i,j}$ are zero, which is trivial and follows from the definition of symbolic transfer entropy. We find that the non-bank financial sector (code 790) has roughly the highest $T^S$ values for both inflows and outflows among the Chinese sectors, and the technology sector (code TEL) has roughly the highest $T^S$ values for both inflows and outflows among the US sectors. The non-bank financial sector comprises three Level 2 sectors in the SWS index system: securities, insurance, and diversified financials. This suggests that during the sample period from 2000 to 2017, the non-bank financial sector and the technology sector were respectively the most active in the two stock markets. That is, there was more information exchange between each of these sectors and the other sectors in its own stock market than among the other sectors in the same market.
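Given a matrix of pairwise transfer entropies, the degree of asymmetric information flow is simply the matrix minus its transpose. The short Python sketch below assumes the convention that entry `te[i, j]` holds the flow from sector $i$ to sector $j$; the function name `asymmetric_flow` is hypothetical.

```python
import numpy as np

def asymmetric_flow(te):
    """Return the matrix of Delta T^S, assuming te[i, j] is the flow from i to j."""
    te = np.asarray(te, dtype=float)
    if te.ndim != 2 or te.shape[0] != te.shape[1]:
        raise ValueError("te must be a square matrix")
    # Entry [i, j] is T(i->j) - T(j->i); the result is antisymmetric.
    return te - te.T
```

By construction the result is antisymmetric with a zero diagonal, which matches the property $\Delta T^S_{j \to i} = -\Delta T^S_{i \to j}$ stated in the text.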

Average Inflow and Outflow
For each sector $i$, the average outflow $F_{\mathrm{out},i}$ and inflow $F_{\mathrm{in},i}$ of information can be calculated as follows [30]:

$$F_{\mathrm{out},i} = \frac{1}{N-1} \sum_{j \neq i} T^S_{i \to j}$$

and

$$F_{\mathrm{in},i} = \frac{1}{N-1} \sum_{j \neq i} T^S_{j \to i},$$

where $N$ is the number of sectors in the market and the points with $i = j$ are not included. Figure 2a,c shows the bar charts of the average information inflows and outflows for all the sectors. This figure confirms that the non-bank financial sector (code 790) and the technology sector (code TEL) were the most active sectors in information exchange in the Chinese and US markets, respectively. We also find that, in general, the more information a sector sends out to other sectors, the more information it receives from others. Therefore, the outflow and inflow are positively related to each other. We present in Figure 2b,d the scatter plots of $F_{\mathrm{out},i}$ against $F_{\mathrm{in},i}$, which confirm a significant positive correlation. A least-squares regression of $F_{\mathrm{out},i}$ on $F_{\mathrm{in},i}$ yields a linear relationship for the Chinese market, where the p-values of the two coefficients are respectively $3 \times 10^{-15}$ and $2 \times 10^{-6}$ and the adjusted $R^2$ is 0.908. Similarly, for the USA market we obtain a linear relationship where the p-values of the two coefficients are $6 \times 10^{-4}$ and $5 \times 10^{-8}$, respectively, and the adjusted $R^2$ is 0.548. It is clear from this simple estimation that the linear relationship is stronger for the Chinese stock market. We argue that the linearity reflects the degree to which traders act on the idiosyncratic traits of market sectors. The higher linearity of the Chinese stock market suggests that traders in the Chinese market are more irrational, in that the idiosyncratic traits of market sectors are less reflected in their decision-making process.

We also use the average degree of asymmetric information flow $\Delta F_i$ to measure the net information of sector $i$ being sent to other sectors, which is defined as follows [30]:

$$\Delta F_i = F_{\mathrm{out},i} - F_{\mathrm{in},i}.$$

We illustrate in Figure 3 the average degree of asymmetric information flow $\Delta F$ of the sectors in descending order for the two stock markets.
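The sector-level quantities above can be sketched in Python as follows. The function names are hypothetical, the matrix convention `te[i, j]` for the flow from sector $i$ to sector $j$ is an assumption, and the averages are taken over the other $N - 1$ sectors, which is our reading of the statement that points with $i = j$ are excluded. The fit uses `numpy.polyfit` as a plain ordinary least-squares line and does not reproduce the p-values or adjusted $R^2$ reported in the text.

```python
import numpy as np

def average_flows(te):
    """Return (F_out, F_in, Delta F) per sector, assuming te[i, j] is the flow i -> j."""
    te = np.asarray(te, dtype=float)
    n = te.shape[0]
    off = te.copy()
    np.fill_diagonal(off, 0.0)          # exclude the points with i = j
    f_out = off.sum(axis=1) / (n - 1)   # row i: flows sent by sector i
    f_in = off.sum(axis=0) / (n - 1)    # column i: flows received by sector i
    return f_out, f_in, f_out - f_in

def fit_line(f_in, f_out):
    """Least-squares line f_out = intercept + slope * f_in."""
    slope, intercept = np.polyfit(f_in, f_out, 1)
    return intercept, slope
```

Sorting the third return value in descending order gives the ranking of sectors from the strongest net source to the strongest net sink. Because every flow is an outflow of one sector and an inflow of another, the values of $\Delta F_i$ sum to zero across sectors.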
Among all the 28 Chinese sectors, the bank sector (code 780) has the highest ∆F value, while the ∆F value of the non-bank financial sector is the lowest. This finding suggests that the bank sector has the highest net outflow of information and is thus the most influential sector, while the non-bank financial sector is the most influenced sector. If we regard the Chinese stock market as an information transfer system, the bank sector is a big information source, influencing other sectors, while the non-bank financial sector is a big information sink, influenced by other sectors. Concerning the absolute ∆F value, we find that the value for the biotechnology sector (code 150) is the closest to zero, which indicates that the strengths of its information outflows and inflows are approximately equal and that there is little net information transferred between the biotechnology sector and the whole market.
Among all the 16 US sectors, the energy sector (code E2L) has the highest ∆F value, while the ∆F value of the technology sector (code TEL) is the lowest. This suggests that the energy sector has the highest net outflow of information and is thus the most influential sector, while the technology sector is the most influenced sector. Therefore, the energy sector is a big information source, influencing other sectors, while the technology sector is a big information sink, influenced by other sectors. When we consider the absolute ∆F value, we find that the value for the appliances sector (code M3L) is the closest to zero, which indicates that the strengths of its information outflows and inflows are approximately equal and that there is little net information transferred between the appliances sector and the whole market.
Although the sectors in both markets are similar, they play different roles in the two information transfer processes. For instance, the real estate sector is an information sink in the Chinese market but an information source in the US market. These results highlight the importance of the real estate sector in driving economic output in China and its less significant role in the US.

Yearly Evolution of Symbolic Transfer Entropy and Degree of Asymmetric Information Flow
Economic sectoral relationships are known to be unstable and to change over time. For example, Bernanke (2016) highlighted the changing correlation between the energy and industrial sectors in the US over the last decade. To examine the evolution of information flows over time, we calculated the symbolic transfer entropy matrix $T^S(t)$ and the asymmetric information flow matrix $\Delta T^S(t)$ for each year $t$. The four $T^S(t)$ heat maps for the years 2000, 2003, 2007, and 2011 are illustrated in Figure 4 for the Chinese stock market and in Figure 5 for the US stock market. For the Chinese stock market, the heat maps share some similar patterns. For instance, some relatively bright lines emerge vertically and horizontally, echoing the pattern in Figure 1. However, these heat maps also exhibit remarkable differences. The most significant feature is that the heat maps become brighter over time, indicating that there are more information transfers among different sectors as the stock market develops. The corresponding four heat maps of the asymmetric information flow $\Delta T^S(t)$ are shown in Figure 6. A similar evolution of patterns is observed in the US stock market, as shown in Figures 5 and 7. However, we do not observe a monotonic increase in information flows among the US sectors: the information flows among sectors were smaller in 2011.
To further quantify the evolution of information flows, we calculated the average of the symbolic transfer entropy matrix $T^S(t)$ for each year $t$ as follows:

$$\langle T^S(t) \rangle = \frac{1}{N(N-1)} \sum_{i \neq j} T^S_{i \to j}(t),$$

where $N$ is the number of sectors and the diagonal with $i = j$ is not included, and the average asymmetric information flow $\langle \Delta T^S(t) \rangle$ for each year $t$ is measured as follows:

$$\langle \Delta T^S(t) \rangle = \frac{2}{N(N-1)} \sum_{i > j} \Delta T^S_{i \to j}(t),$$

where the lower triangle (i.e., the part with $i \leq j$) is not included. We note that there are no objective criteria to determine the window size. Windows that are too long result in too few data points and vague evolution paths, while windows that are too short lead to poorer statistics and more noise [33]. The choice of one year is a trade-off.
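The two yearly averages can be sketched in Python as below. The function name `yearly_averages` and the input layout (a dictionary mapping each year to its transfer entropy matrix, with `te[i, j]` the flow from sector $i$ to sector $j$) are assumptions made for illustration. The asymmetric average is taken over the signed entries with $i > j$, following the text literally; whether the original computation applied an absolute value is not stated, so this should be read as one possible interpretation.

```python
import numpy as np

def yearly_averages(te_by_year):
    """Map year -> (mean T^S over i != j, mean Delta T^S over i > j)."""
    out = {}
    for year, te in sorted(te_by_year.items()):
        te = np.asarray(te, dtype=float)
        n = te.shape[0]
        off_diag = ~np.eye(n, dtype=bool)       # mask that drops i = j
        mean_te = te[off_diag].mean()
        delta = te - te.T                       # degree of asymmetric flow
        rows, cols = np.tril_indices(n, k=-1)   # index pairs with i > j
        mean_delta = delta[rows, cols].mean()
        out[year] = (mean_te, mean_delta)
    return out
```

Each yearly matrix would be built from the symbolized returns of that calendar year only, so the one-year window discussed above fixes how many transitions enter every probability estimate.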
The evolutionary trajectories of the average symbolic transfer entropy $\langle T^S(t) \rangle$ and the average asymmetric information flow $\langle \Delta T^S(t) \rangle$ from 2000 to 2017 for the Chinese stock market are presented in Figure 8a,b, respectively, while the results for the US stock market are presented in Figure 8c,d. For the Chinese stock market, we observe two local minima around 2001 and 2016 for $\langle T^S(t) \rangle$ and three local minima around 2001, 2008, and 2016 for $\langle \Delta T^S(t) \rangle$. This observation is of particular interest, because the three periods correspond to key periods of market volatility associated with the market crashes in June 2001 [34], December 2007 [35], June 2009 [35], June 2015 [36], and January 2016 [37]. For the US stock market, we observe four local minima around 2001, 2008, 2011, and 2016 for $\langle T^S(t) \rangle$ and three local minima around 2001, 2011, and 2015 for $\langle \Delta T^S(t) \rangle$, which correspond to the 9/11 terrorist attack in 2001 [38], the subprime mortgage crisis in 2007 [39], the July-August 2011 stock market crash [40], and the 2015-16 stock market selloff beginning in the United States on 18 August 2015. It is documented for other types of networks that the structure of networks usually changes around large market movements (see [41] and the references therein). We conclude that, during periods of market turmoil, both the average information transfer and the average asymmetric information flow are lower than in stable states. This conclusion is not surprising. During bubbles and antibubbles, investors exhibit stronger convergence in decision making. The majority of investors buy stocks during bubbles and sell stocks during antibubbles. Although stock markets have higher volatility during periods of turmoil, investors' actions are more synchronized. In other words, stock markets are more integrated during periods of turmoil than during periods of stability.

Conclusions
In this work, we compared the information transfer between industry sectors in the Chinese and US stock markets based on their symbolic transfer entropy. We used daily returns of key sector indices from 2000 to 2017. The results of this work offer several important insights into information flows between industry sectors. First, we find that the most active sector in information exchange is the non-bank financial sector in the Chinese market and the technology sector in the US market. Second, concerning the net information flow of individual sectors, we find that the main information source is the bank sector in the Chinese market and the energy sector in the US market, while the main information sink is the non-bank financial sector in the Chinese market and the technology sector in the US market. The two information sinks with the largest net information inflow in the two markets are exactly the two most active sectors with the largest information transfer. Third, the same sector may play different roles in the two markets. For example, the real estate sector is an information sink in the Chinese market but an information source in the US market. Thus, the US stock market is expected to react to demand-related news originating from the housing sector, such as building approvals, whereas in China this is not the case, since the markets are driven by supply-side factors such as changes in bank lending.
We also investigated the evolution of the yearly information transfer for both markets. It is found that the local minima of the average symbolic transfer entropy $\langle T^S(t) \rangle$ and the average asymmetric information flow $\langle \Delta T^S(t) \rangle$ correspond to periods of market turmoil. We argue that stock markets are more integrated during periods of turmoil than in stable periods, which results in smaller entropy.
Note that while there have been several studies that use entropy-based techniques [42][43][44][45][46][47] or measures [48][49][50] to predict market fluctuations and crashes, in this study we argue that the average symbolic transfer entropy $\langle T^S(t) \rangle$ and the average asymmetric information flow $\langle \Delta T^S(t) \rangle$ do not have direct predictive power for market crashes. Further research is required to better understand the dynamics of market crashes, which are likely not driven by historical correlations but rather by behavioral factors.