Analysis of Cell Signal Transduction Based on Kullback–Leibler Divergence: Channel Capacity and Conservation of Its Production Rate during Cascade

Kullback–Leibler divergence (KLD) is a type of extended mutual entropy that is used as a measure of information gain when moving from a prior distribution to a posterior distribution. In this study, KLD is applied to the thermodynamic analysis of cell signal transduction cascades and serves as an alternative to mutual entropy. When KLD is minimized, the divergence is given by the ratio of the prior selection probability of the signaling molecule to the posterior selection probability. Moreover, the information gain over the entire channel is shown to be adequately described by the average KLD production rate. Thus, this approach provides a framework for the quantitative analysis of signal transduction and can identify an effective cascade in a signaling network.


Introduction
Kullback–Leibler divergence (KLD) is a type of generalized entropy or information quantity. It was introduced by Solomon Kullback and Richard A. Leibler, who discussed information source coding theory for information transmission efficiency [1]. At present, KLD finds diverse applications, including image analysis [2,3], hydrodynamics [4], clinical laboratory tests including electrocardiograms [5,6], network analysis [7,8] for biological applications [9], cellular biology [10], evaluation of the bioequivalence of drug formulations [11], and experimental design for clinical studies [12]. In this study, KLD is applied to analyze cellular information transmission, that is, signal transduction mediated by cellular biochemical reactions. In particular, the proposed approach using Bayesian statistics [13,14], which is based on KLD, is expected to provide a novel theoretical framework [7].
Previous studies have discussed cell signal transduction through pathways from the viewpoint of its similarity to a thermodynamic process that produces entropy [15]. Luo et al. [16] analyzed heat production during carbohydrate metabolism and estimated the relationship between energy consumption and biological information from a metabolic perspective. Moreover, several studies have applied various concepts of entropy. In genome informatics, extensions of Shannon entropy, such as the local Shannon-Jayne entropy, are used to analyze the correlation between sets of gene expressions [17][18][19][20][21][22]. Teschendorff et al. [20] introduced a stochastic matrix whose components are the normalized probabilities of the gene expressions in individual samples; the signal entropy rate was obtained using this matrix. In the network, the maximum entropy rate is determined by the adjacency matrix of the network [20]. For a multi-cell system, the network approach is useful for understanding biological signal transduction behaviors [23][24][25]. Recently, single-cell entropy was introduced as an analytical basis for the variable phenotypic or genotypic state of a single cell, based on the assembly framework of statistical physics [26]. In another study, using a non-Markovian approach, the mutual entropy between the stimulus and the response in a biological system was considered for a sensory system in which past trajectories are used to add useful information to the present state. In a simulation conducted by Becker et al., E. coli was shown to reliably predict the concentration changes of environmental chemokines for chemotaxis [27].
Previously, the authors reported analyses of biological signal transduction based on information thermodynamics. In those studies, a theoretical framework was developed on the basis of Shannon entropy. The objective of the present study, in contrast, is to evaluate the signal transduction efficiency of individual steps using KLD, based on the actual biochemical reaction kinetics [28,29]. KLD is introduced for the analysis of signal transduction with reference to the fluctuation theorem (FT) [30][31][32][33].

Signal Cascade Model
The signal events can be modeled as a cascade of modification and demodification cycle reactions of intracellular proteins called signaling molecules. Equation (1) presents a signal cascade model [34,35]. Here, the suffixes m and j denote the cascade number and the step number, respectively. In this model, the signaling molecule at step 1 of cascade m, denoted by $X_{m1}$, induces the modification of $X_{m2}$ into $X_{m2}^{*}$ by binding a signal-mediating molecule A such as adenosine triphosphate (ATP). Subsequently, $X_{m2}$ activates $X_{m3}$ in the same manner. In this way, the signaling molecule at the (j − 1)-th step of cascade m, denoted by $X_{m,j-1}$, induces the modification of $X_{mj}$ into $X_{mj}^{*}$. In the opposite orientation of the signal, demodification of $X_{mj}^{*}$ into $X_{mj}$ occurs at the −(j − 1)-th step of cascade m, and the pre-stimulation steady state is subsequently recovered [34]:
In the above model, the subscript m denotes the cascade number. We introduce an a priori (prior) selection probability of the signaling molecules for the analysis. Here, $q_{mj}$ is the selection probability of the inactive molecule $X_{mj}$ used in the j-th step of cascade m (forward direction), and $q_{mj}^{*}$ is the selection probability of the active molecule $X_{mj}^{*}$ used in the −j-th step of cascade m (backward direction), as follows:
Here, $X_m$ indicates the total concentration of signaling molecules in cascade m:
and the total concentration of active and inactive signaling molecules is given by:
The total duration of cascade m, $\tau_m^{0}$, which is the sum over the forward and backward cascades comprising a set of signaling molecules, is determined by:
From Equations (2), (5), (6) and (7), the total duration is expressed using the probabilities $q_{mj}$ and $q_{mj}^{*}$:
The suffix 0 represents the prior state. Here, the forward and backward durations, $\tau_{mj}^{0}$ and $\tau_{-mj}^{0}$, are defined as shown in Figure 1.
Positive and negative values are assigned to $\tau_{mj}^{0}$ and $\tau_{-mj}^{0}$, respectively, according to the direction of the step in cascade m [34,35]. In the above equations, $\tau_{mj}^{0}$ represents the duration corresponding to a positive code length, during which the active molecule $X_{mj}^{*}$ increases in concentration. On the other hand, $\tau_{-mj}^{0}$ represents the duration corresponding to a negative code length, during which the active molecule $X_{mj}^{*}$ decreases in concentration. In this manner, the duration of the individual j-th step can be represented as $\tau_{mj}^{0} - \tau_{-mj}^{0}$.
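The bookkeeping described in this section can be illustrated with a minimal Python sketch. The display equations are not reproduced in this text, so two forms are assumptions made here for illustration: each selection probability is taken as that species' share of the total concentration $X_m$, and the mean duration is taken as the probability-weighted sum of $\tau_{mj}^{0} q_{mj} - \tau_{-mj}^{0} q_{mj}^{*}$ with $\tau_{-mj}^{0} \le 0$. The function names and the numerical values are hypothetical.

```python
from typing import List, Sequence, Tuple


def selection_probabilities(
    inactive: Sequence[float], active: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return (q, q_star): each species' share of the total concentration X_m."""
    if len(inactive) != len(active):
        raise ValueError("need one inactive and one active concentration per step")
    # Assumption: X_m is the sum over all inactive and active molecules.
    total = sum(inactive) + sum(active)
    if total <= 0:
        raise ValueError("total concentration must be positive")
    q = [x / total for x in inactive]
    q_star = [x / total for x in active]
    return q, q_star


def mean_cascade_duration(
    q: Sequence[float],
    q_star: Sequence[float],
    tau_forward: Sequence[float],
    tau_backward: Sequence[float],
) -> float:
    """Assumed weighted sum: sum_j (tau_j * q_j - tau_{-j} * q*_j), tau_{-j} <= 0."""
    return sum(
        tf * qj - tb * qs
        for qj, qs, tf, tb in zip(q, q_star, tau_forward, tau_backward)
    )


# Hypothetical two-step cascade (arbitrary concentration units, minutes).
q, q_star = selection_probabilities(inactive=[6.0, 2.0], active=[1.0, 1.0])
tau = mean_cascade_duration(
    q, q_star, tau_forward=[5.0, 10.0], tau_backward=[-120.0, -180.0]
)
print(q, q_star, tau)
```

Because the backward durations carry a negative sign, each step contributes the positive quantity $\tau_{mj}^{0} - \tau_{-mj}^{0}$ weighted by the corresponding probabilities.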

Figure 1. A common time course of the j-th step for both prior and posterior cascades, showing the concentration of $X_{mj}^{*}$. The suffix 0 is omitted. The vertical axis denotes the concentration of the active signaling molecule. $\tau_{mj}$ and $\tau_{-mj}$ represent the durations of the j-th step and the reverse −j-th step, respectively. The horizontal line $X_{mj}^{*} = X_{mj}^{*\,\mathrm{st}}$ denotes the concentration of $X_{mj}^{*}$ at the steady state [35]. The "//" symbol on the horizontal axis indicates that $-\tau_{-mj}$, or $|\tau_{-mj}|$, is much greater than $\tau_{mj}$.

A Prior Probability Distribution of Signaling Molecules
Here, the author hypothesizes that the selection of signaling molecules is equal a priori. In our previous studies [34,35], Shannon's entropy $H_m$ for the m-th cascade was presented. Using Equations (3), (5) and (6), the entropy $H_m^{0}$ in the a priori (prior) state can be represented as:
To maximize $H_m^{0}$ under the constraints established by Equations (3) and (8), let us introduce a function G with undetermined multipliers $\alpha_m^{0}$ and $\beta_m^{0}$.
As indicated above, Equations (14) and (15) imply an important result: the coefficient $\beta_m^{0}$ is independent of the step number j. Equations (14) and (15) will therefore be used later as the prior probability distribution. From Equations (9), (14) and (15), the author has:
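The step-independence of the multiplier can be checked numerically with a generic maximum-entropy calculation. The sketch below is not the paper's Equations (9)-(15): it assumes a single family of probabilities, a normalization constraint, and a fixed mean duration, for which the entropy maximizer is known to be proportional to exp(−βτ). The function names and numerical values are hypothetical.

```python
import math
from typing import List, Sequence


def gibbs_weights(taus: Sequence[float], beta: float) -> List[float]:
    """Maximum-entropy distribution q_j proportional to exp(-beta * tau_j)."""
    logits = [-beta * t for t in taus]
    shift = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - shift) for v in logits]
    z = sum(weights)
    return [w / z for w in weights]


def solve_beta(
    taus: Sequence[float], mean_tau: float,
    lo: float = -50.0, hi: float = 50.0, iters: int = 200,
) -> float:
    """Bisection for beta such that sum_j tau_j * q_j equals mean_tau."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        q = gibbs_weights(taus, mid)
        mean = sum(t * p for t, p in zip(taus, q))
        # The mean duration decreases monotonically as beta increases.
        if mean > mean_tau:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


taus = [1.0, 2.0, 4.0]  # hypothetical step durations
beta = solve_beta(taus, mean_tau=2.0)
q = gibbs_weights(taus, beta)

# -log q_j is linear in tau_j, so every pair of steps gives the same slope.
slope_01 = (math.log(q[0]) - math.log(q[1])) / (taus[1] - taus[0])
slope_12 = (math.log(q[1]) - math.log(q[2])) / (taus[2] - taus[1])
print(beta, slope_01, slope_12)
```

In this simplified setting, both slopes coincide with β, which mirrors the statement that the coefficient does not depend on the step number j.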

Average Entropy Production Rate in a Signal Cascade
Next, the kinetics were investigated using $q_m(j|j-1)$, the transition probability of the j-th step given the (j − 1)-th step, and $v_m(j|j-1)$, the transition rate of the j-th step in the forward signaling direction. Similarly, $q_m(j-1|j)$ is the transition probability of the (j − 1)-th step given the j-th step, and $v_m(j-1|j)$ is the transition rate of the (j − 1)-th step in the backward signaling direction of a given cascade. The cell system remains at detailed balance around the steady state (homeostasis), as follows:
Therefore, the author has:
Using the kinetic coefficients for the (j − 1)-th step and the reverse −(j − 1)-th step, $k_{m,j-1}$ and $k_{m,-(j-1)}$ in Equation (1), the right side of Equation (18) is given by:
When the change of $X_{m,j-1}^{*}$ is negligible during signal transduction relative to the fluctuation of $X_{mj}^{*}$, we have:
Here, Equation (2) was used. Dividing both sides of this relation by $\tau_{mj} - \tau_{-mj}$ and taking the limit, only the variables $q_{mj}$ and $q_{mj}^{*}$ remain on the right side:
Using Equations (14) and (15), the author has:
In Equations (22) and (23), $|\tau_{-mj}^{0}|$ is sufficiently longer than $\tau_{mj}^{0}$, according to experimental studies (Figure 1) [36][37][38][39][40][41][42][43][44][45][46]. Here, using an arbitrary time parameter t, the average entropy production rates (AEPRs) $\zeta_{mi}$ and $\zeta_{-mi}$ are defined during signal transduction over $\tau_{mj}^{0} - \tau_{-mj}^{0}$ and $|\tau_{-mj}^{0} - \tau_{mj}^{0}|$, respectively, for cascade m and the reverse cascade −m.
The fluctuation theorem (FT) states that the right sides of Equations (22)–(25) are equal to the AEPRs:
Here, $\beta_m^{0}$ has the dimension of an entropy production rate, and the AEPRs are independent of the step number. Subsequently, the AEPRs are redefined using Equation (14), as follows:
Notably, Equations (14), (15) and (28) indicate that the AEPR is constant throughout the signal cascade. Here, the channel capacity is given by the AEPR.
In our previous studies, a simple relation between the selection probability $q_{mj}$ and the duration $\tau_{mj}$ was proposed using an arbitrary parameter, $\zeta$, which is independent of the step number [34,35].
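The detailed-balance condition used in this section can be illustrated with a single modification and demodification cycle. The sketch below is a generic two-state example, not the paper's Equations (17)-(21): it assumes that the upstream active molecule and ATP concentrations are absorbed into an effective first-order forward rate, in the spirit of the approximation that $X_{m,j-1}^{*}$ changes negligibly. The rate values and function name are hypothetical.

```python
import math
from typing import Tuple


def steady_state_fractions(k_forward: float, k_backward: float) -> Tuple[float, float]:
    """Steady-state (inactive, active) fractions of a two-state cycle X <-> X*."""
    if k_forward <= 0 or k_backward <= 0:
        raise ValueError("rate constants must be positive")
    total = k_forward + k_backward
    return k_backward / total, k_forward / total


k_f, k_b = 0.2, 0.05  # hypothetical effective rates, per minute
q_inactive, q_active = steady_state_fractions(k_f, k_b)

# Detailed balance: the forward and backward probability fluxes coincide.
forward_flux = q_inactive * k_f
backward_flux = q_active * k_b

# The log-ratio of the fractions equals the log-ratio of the rate constants.
log_ratio = math.log(q_active / q_inactive)
print(forward_flux, backward_flux, log_ratio)
```

The log-ratio of the two fractions is fixed by the ratio of kinetic coefficients, which is the type of relation that links selection probabilities to reaction kinetics in the derivation.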

Multinomial Distribution with Population Distribution
KLD was used as a measure of the information gain obtained when moving from the prior distribution in Equations (30) and (31) to a posterior distribution. Let the probability $q_{mj}$ be the prior distribution. The uncertainty is then reduced:
We define the information $D_m$ as the KLD of the signal events in cascade m:
The above equation represents the KLD, which is the average value of the information obtained from data I with respect to $p_{mj}$ and $p_{mj}^{*}$. Consequently, KLD is known as the information gain. Maximum likelihood estimation can be regarded as an estimation method that empirically minimizes KLD.
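As a concrete illustration of KLD as information gain, the following minimal sketch evaluates the standard definition, the sum of p log(p/q), for discrete distributions in natural-log units. The function name and the example probabilities are hypothetical and are not taken from the experimental data cited in this paper.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence D(p || q) in nats for discrete distributions."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


prior = [0.25, 0.75]      # hypothetical prior selection probabilities
posterior = [0.5, 0.5]    # hypothetical posterior selection probabilities
gain = kl_divergence(posterior, prior)
print(gain)
```

The divergence vanishes when the posterior equals the prior and is positive otherwise, so it quantifies how far the stimulus moves the selection probabilities.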
The posterior probabilities are defined as:
and
In addition,
Therefore, when signal transduction occurs under a given condition, the probability is transformed from $q_{mj}^{*}$ into $p_{mj}^{*}$ with the minimum KLD under that condition.
To minimize $D_m(p_{mj} \| q_{mj})$ under the constraints established by Equations (34) and (36), a function L with undetermined multipliers $\alpha_m$ and $\beta_m$ was introduced, following Lagrange's method of undetermined multipliers:
$L(p_{m1}, p_{m2}, \cdots, p_{mn};\ p_{m1}^{*}, p_{m2}^{*}, \cdots, p_{mn}^{*};\ X_m)$
Here, the differences between $\alpha_m$ and $\alpha_m^{0}$ and between $\beta_m$ and $\beta_m^{0}$ are indicative of the signaling. Subsequently,
For the minimization of L, the right-hand sides of Equations (38)–(40) are set to zero, as follows:
Therefore, from Equations (41) and (42), the author has:
Accordingly, from Equations (7), (43) and (44), the KLD is given by:
and the author has:
Accordingly, $\beta_m$ is equal to the average KLD production rate, $\delta(p_m \| q_m)$, during signal transduction and is constant throughout the entire signal cascade. Therefore,
The author defines $p_m(j|j-1)$, the transition probability of the j-th step given the (j − 1)-th step, and $v_m(j|j-1)$, the transition rate of the j-th step in the forward signaling direction. In addition, $p_m(j-1|j)$ is the transition probability of the (j − 1)-th step given the j-th step, and $v_m(j-1|j)$ is the transition rate of the (j − 1)-th step in the backward signaling direction of a given cascade.
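The structure of the constrained minimization can be sketched in a simplified setting. The following is not a reconstruction of Equations (37)-(47): it assumes a single family of probabilities $p_j$ with prior $q_j$, a normalization constraint, and a fixed mean duration $\bar{\tau}$, and the sign convention for the multipliers $\alpha$ and $\beta$ is a choice made here.

```latex
\begin{align}
  L &= \sum_j p_j \log\frac{p_j}{q_j}
       + \alpha\Bigl(\sum_j p_j - 1\Bigr)
       + \beta\Bigl(\sum_j \tau_j p_j - \bar{\tau}\Bigr), \\
  \frac{\partial L}{\partial p_j}
    &= \log\frac{p_j}{q_j} + 1 + \alpha + \beta\,\tau_j = 0
  \quad\Longrightarrow\quad
  \log\frac{p_j}{q_j} = -(1+\alpha) - \beta\,\tau_j, \\
  D_{\min} &= \sum_j p_j \log\frac{p_j}{q_j}
            = -(1+\alpha) - \beta\,\bar{\tau}.
\end{align}
```

Under these assumptions, the logarithm of the posterior-to-prior ratio is linear in the step duration with a slope that does not depend on j, and the minimized divergence is linear in the mean duration. This is the sense in which a single multiplier can act as a rate shared by the whole cascade.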
Likewise, following Equations (18)–(20) with the posterior probabilities, dividing both sides by $\tau_{mj} - \tau_{-mj}$ and taking the limit, we have:
Further, Equations (47), (50) and (51) give:
Here, the author used $\tau_{mj} \ll |\tau_{-mj}|$ (Figure 1). The above equation corresponds to the extended FT considering KLD, and the extended channel capacity can be defined as:
Here, K is an arbitrary constant. If entropy units are used, $K = k_B$, Boltzmann's constant. On the other hand, in information science, K is equivalent to $\log_2 e$.
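The role of the constant K can be illustrated with a short unit conversion. The sketch below assumes a quantity already expressed in natural-log units (nats) and shows the two choices of K mentioned above. The function names are hypothetical, and the Boltzmann constant is the exact SI value.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of k_B in J/K


def nats_to_bits(value_nats: float) -> float:
    """Information-science convention: K = log2(e) converts nats to bits."""
    return value_nats * math.log2(math.e)


def nats_to_entropy_units(value_nats: float) -> float:
    """Thermodynamic convention: K = k_B converts nats to J/K."""
    return value_nats * BOLTZMANN_J_PER_K


one_bit_in_nats = math.log(2.0)
print(nats_to_bits(one_bit_in_nats), nats_to_entropy_units(one_bit_in_nats))
```

The choice of K changes only the units in which the channel capacity is reported, not the underlying quantity.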

Conclusions
Recently, theoretical analysis of the transduction capacity of biochemical signaling networks has greatly developed [47]. In this study, KLD, the average KLD production rate, and channel capacity based on the average KLD production rate were shown to be critical quantities that can be attributed to the entire signal transduction step. This KLD is a tool to estimate entropy production from stationary trajectories [48].
In this work, the author deduced simple but important relational formulae, Equations (47)–(53). For the prior probability distribution $q_{mj}$, the author derived Equations (14) and (15) in association with FT and source-coding theory [28]. This method was introduced in our previous study of Tsallis entropy [29]. The theoretical framework of the current study is shown in Figure 2.
In this study, a uniform distribution was not applied as the prior distribution $q_{mj}$ before the stimulus. As reported previously [34,35], the selection probability of the signal molecules, $q_{mj}$, can be described by the simple formulae given by Equations (14) and (15), which express a simple relationship between the logarithm of the probability and the time elapsed between tentative modification and demodification. Subsequent to the stimulus, the selection probabilities of the signal molecules are transformed into the posterior probability distribution $p_{mj}$. The formulation of signal transformation using KLD is likely more intuitive and easier to understand than one using mutual entropy.
The mitogen-activated protein kinase (MAPK) pathway is a multistep signal conversion process in which modification and demodification throughout the cascade can be understood as a repeated cycle reaction. It has been experimentally demonstrated that the demodification process is significantly longer than the modification process, that is, $\tau_{mj} \ll |\tau_{-mj}|$ (Figure 1), with the former requiring a few hours for completion and the latter a few minutes. This asymmetry in the time course kinetics points to an important result pertaining to the conservation of the AEPR and the average KLD production rate with reference to FT. In addition, it should be noted that in Equation (53), the channel capacity of the entire signal cascade is represented by KLD. Moreover, the author introduced the average KLD production rate in Equation (47), and this production rate was found to be constant throughout the cascade, which is an extended form of the AEPR conservation reported previously by the author [35].
In the experimental studies [36][37][38][39][40][41][42][43][44][45][46], the ratio of the modified, active form of the signaling molecules was determined from the immunoblot intensity or other corresponding data. Thus, it is possible to compute, in principle, the channel capacity of the entire signal cascade. In practice, however, the measurement is complicated because in most cases extracellular substances, such as ligands, simultaneously promote the activation of multiple cascades. Therefore, the selection of ligands specific to the given cascade is essential and enables rigorous measurement of the activity of that cascade. In the future, such specific ligands, for example recombinant protein ligands, will be required to quantify the signal cascade accurately.
Thus, KLD and the average KLD production rate may be regarded as critical attributes of the cell signal cascade. The thermodynamic approach in this manuscript can provide a theoretical framework for the quantitative analysis of signal transduction.