Between Waves and Diffusion: Paradoxical Entropy Production in an Exceptional Regime

The entropy production rate is a well-established measure of the extent of irreversibility in a process. One thus usually expects the entropy production rate to approach zero in the reversible limit. Fractional diffusion equations provide a fascinating testbed for that intuition in that they build a bridge connecting the fully irreversible diffusion equation with the fully reversible wave equation through a one-parameter family of processes. The entropy production paradox describes the very non-intuitive increase of the entropy production rate as that bridge is crossed from irreversible diffusion to reversible waves. This paradox has been established for time- and space-fractional diffusion equations on one-dimensional continuous space and for the Shannon, Tsallis, and Renyi entropies. After a brief review of the known results, we generalize it to time-fractional diffusion on a finite chain of points described by a fractional master equation.


Introduction
Physical processes are naturally separated into two broad classes: reversible and irreversible. Time reversal of the former leads to possible physical processes, while for the latter it implies manifestly unphysical outcomes. Mathematical descriptions representing irreversibility are correspondingly not time-reversal invariant: the equations backward in time are different from those evolving forward in time.
Typical examples of irreversible dynamics include diffusion processes, whether describing molecular or turbulent diffusion, heat conduction, or chemical reactions. The diffusion equation is the representative example, where P(x, t) is a probability distribution function (PDF) that describes the probability of finding a particle at a certain position x at time t. For the diffusion equation, solutions running backward in time correspond to a process of "undiffusing": for example, diluted ink, under such a process, would gather itself back into the initial drops of ink added to the mixture. That is not so for reversible processes. Typical examples of reversible dynamics are Hamiltonian mechanics, quantum mechanics, and classical electrodynamics. These are iconically represented by the wave equation, which is unchanged under time reversal. All that happens is the reversal of the propagation direction, which remains physical. If a light wave encounters a mirror, it travels back just the way it came. Furthermore, wave phenomena are characterized by propagation, unlike diffusion, which is characterized by dispersion without propagation. These fundamental differences between the diffusion and the wave equation are also apparent on mathematical grounds and in their solutions. While the former exhibits evolutionary relaxation over all space scales to infinity, the latter has finite-speed propagation. Mathematically, the diffusion equation is parabolic and has one family of characteristics, whereas the wave equation is hyperbolic with two. In addition, the numerical treatments of parabolic and hyperbolic equations differ as well.
As a result, these two prototypical equations seem to be clearly separated and unconnected. It was thus a fascinating undertaking to seek out a way to join these two worlds nonetheless and to study the consequences. The first way to accomplish such a connection was to explore the use of fractional calculus in the context of this equation. The fractional derivative ∂^γ/∂t^γ P(x, t) is actually defined via an integral, with γ = n + β and β ∈ (0, 1). As solving the (fractional) diffusion equation is essentially an initial value problem, Davison and Essex [1] were the first to prove that only the k = n + 1 case works for normal initial value problems. Thus, the fractional derivative is given in the form also known as the Caputo derivative [2]. For the time-fractional diffusion Equation (3), the domain is t ≥ 0 and x ≥ 0. Negative x adds nothing due to spatial symmetry about the origin. This results in an evolution equation that fully contains both the diffusion equation, γ = 1, and the wave equation, γ = 2, as special cases; thus, 1 ≤ γ ≤ 2 represents a bridge between two different worlds [3-6].
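As a concrete illustration, the Caputo derivative of order γ ∈ (1, 2) can be approximated numerically by discretizing its defining integral over the second derivative, D^γ f(t) = (1/Γ(2−γ)) ∫₀ᵗ f''(s) (t−s)^(1−γ) ds. The following sketch (function name and scheme are our own illustrative choices, not taken from the references) integrates the weakly singular kernel exactly on each subinterval:

```python
import math

def caputo_derivative(f, t, gamma, n=2000):
    """Caputo fractional derivative of order gamma in (1, 2) at time t:
    D^gamma f(t) = 1/Gamma(2-gamma) * int_0^t f''(s) (t-s)^(1-gamma) ds.
    f'' is taken by central differences at subinterval midpoints, and the
    singular kernel (t-s)^(1-gamma) is integrated exactly per subinterval.
    f must be smooth in a neighbourhood of [0, t]."""
    h = t / n
    total = 0.0
    for k in range(n):
        s_mid = (k + 0.5) * h                       # midpoint of subinterval k
        fpp = (f(s_mid + h) - 2.0 * f(s_mid) + f(s_mid - h)) / h**2
        # exact integral of (t - s)^(1-gamma) over [k h, (k+1) h]
        a, b = t - k * h, t - (k + 1) * h
        kernel = (a**(2.0 - gamma) - b**(2.0 - gamma)) / (2.0 - gamma)
        total += fpp * kernel
    return total / math.gamma(2.0 - gamma)
```

For f(t) = t², the scheme reproduces the known result D^γ t² = 2 t^(2−γ)/Γ(3−γ) essentially to machine precision, since the central difference is exact for quadratics.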
The obvious probe for exploring this extraordinary bridging regime is something that assesses the differences in the property of reversibility as one traverses between diffusion and waves. Classical (Shannon) entropy, and particularly entropy production, is the obvious first consideration as an appropriate measure: the entropy production rate (EPR) [3,7,8] quantifies how irreversible a process is. The reversibility and irreversibility of the wave equation and the diffusion equation, respectively, manifest themselves in the entropy production occurring during the time evolution. In the diffusion case, one has a positive entropy production, while for wave propagation the entropy production is zero. One thus suspects that, for the transition from the diffusion case γ = 1 to the wave case γ = 2, the EPR will decrease and become zero in the reversible case. However, this expectation has been shown to be erroneous, and the corresponding phenomenon has been dubbed the entropy production paradox for fractional diffusion [3]. The entropy production paradox exhibits remarkable robustness, but that does not mean it is universal [9,10]. A key question is why it exists at all.
In this short note, we will review the original results for the entropy production paradox [3,6] in the time-fractional context that arises from the above equations, taking particular note of how the symmetries of the extraordinary differential Equation (3) lead to the unexpected result: the EPR increases as the solution approaches the reversible limit.
Furthermore, we will also sketch out how this scaling symmetry plays out in a formally different extraordinary differential equation with a space-fractional derivative, with completely different solutions and domains, leading to the same remarkable outcome [6,11]: the EPR increases in the reversible limit. We will also observe in passing that this paradox persists even for generalizations of entropy available in the literature, such as the Tsallis and Renyi entropies [4,6,12-14]. In this context, we want to note that there might be implications of the entropy production paradox for applications in finance [15,16], ecology [17], computational neuroscience [18], and physics [19-21].
We will then widen the scope of the entropy production paradox by leaving the realm of systems continuous in space and investigate the entropy production for time-fractional diffusion discrete in space. We generalize a classical master equation describing diffusion on a chain of points by replacing the time derivative with a time-fractional one, thus opening the full range up to the wave equation on the chain. We find an EPR rich in features. In particular, the scaling symmetry persists even in this finite discrete scenario governed by a master equation, allowing the paradoxical behavior to reveal itself until the finite-size effects dominate. We can thus conclude that this paradoxical behavior is not unique and, moreover, extraordinarily robust.

The Paradox for Time-Fractional Diffusion
The time-fractional problem (3) is fully specified by initial conditions on P(x, 0) and its first time derivative, where δ(x) represents the delta distribution. The solution of Equation (3) is known in terms of the H-function (for details, see [5,22]). Note that, for each γ, a different P(x, t) is obtained. In accordance with [23], the second initial condition in Equation (9) is set to zero to guarantee continuity at γ = 1.

The Transformation Group
Interesting as the H-function is in its own right, it has little to do with the nature of the regime properties in terms of entropy production. This can be seen by considering the similarity group x̃ = λ^a x and t̃ = λ^b t, choosing a/b = γ/2. Equation (3) is invariant under this group, and thus solutions, G(η), can be found in terms of a similarity variable, η = x t^{-γ/2}. We note that G(η) is a solution of an auxiliary equation [3], which is not relevant here.
While the probability distribution P(x, t) is not a similarity solution in its own right, the function G(η) with its scaling property is essential to the form of P(x, t), which must have a time-independent integral over the domain, Σ, in x. With the corresponding domain in η, one finds that the normalization is preserved under the scaling. This suggests the primary form for the PDF, P(x, t) = t^{-γ/2} G(x t^{-γ/2}), which confirms the form (13) for the probability density. A quick observation reveals that this is precisely the form of (10).

Entropy Production Directly from the PDF
Inserting Equation (13) into (6) leads to S(t) = (γ/2) ln t + C(γ), where C(γ) = −∫_Σ G(η) ln G(η) dη decreases monotonically with γ [5]. It follows that, for γ < 2, Ṡ = γ/(2t), which shows decreasing entropy production with time but increasing entropy production with γ, where increasing γ corresponds to the direction of the reversible limit in γ-space. Moreover, it is clear that this property has nothing whatever to do with the form of the PDF in (10). This is the primary form of the entropy production paradox. Instead of a decreasing Ṡ while approaching the reversible wave case, i.e., increasing γ towards 2, the entropy production increases up to a final discontinuous jump at γ = 2 down to Ṡ = 0.
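The scaling argument can be checked numerically: for any PDF of the form (13), the entropy difference between two times is (γ/2) ln(t₂/t₁), independent of the profile G. The sketch below uses a stand-in half-Gaussian profile for G (an assumption for illustration only; the actual profile is the H-function solution):

```python
import numpy as np

def entropy_scaling_pdf(t, gamma, n=20000, xmax=60.0):
    """Shannon entropy S(t) = -int_0^inf P ln P dx for a PDF of the
    scaling form P(x, t) = t**(-gamma/2) * G(x * t**(-gamma/2)).
    G is a stand-in half-Gaussian profile; the time dependence of S(t)
    is independent of this choice."""
    x = np.linspace(1e-8, xmax, n)
    eta = x * t**(-gamma / 2.0)
    G = np.sqrt(2.0 / np.pi) * np.exp(-eta**2 / 2.0)   # normalized on eta >= 0
    P = t**(-gamma / 2.0) * G
    integrand = P * np.log(np.clip(P, 1e-300, None))   # clip avoids log(0)
    # trapezoidal rule by hand (sidesteps np.trapz/np.trapezoid renaming)
    return float(-np.sum((integrand[1:] + integrand[:-1]) / 2.0 * np.diff(x)))

gamma = 1.4
# scaling form (13) implies S(t2) - S(t1) = (gamma/2) * ln(t2/t1)
dS = entropy_scaling_pdf(4.0, gamma) - entropy_scaling_pdf(1.0, gamma)
```

Differentiating the same relation in t reproduces the EPR Ṡ = γ/(2t) without any reference to the detailed shape of G.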

Entropy as Order Function
Perhaps this paradoxical behavior is misleading. That is, one can consider each γ as representing a separate system with its own intrinsic rates. One might then argue that the EPR does not represent the physically meaningful ordering of states in this domain.
One alternative is the entropy itself. In Figure 1, the entropy S is given over γ and time t. For small times, we observe a monotonically decreasing behavior of S for increasing γ; i.e., approaching the reversible case, the entropy decreases. However, after a critical time t_c, a maximum of the entropy within γ ∈ [1, 2) appears. (A detailed discussion can be found in [5].) Thus, we find that comparing the different probability distributions P(x, t) via the entropy production Ṡ or via the entropy S at constant times does not reproduce the standard notion that the entropy production, or the entropy, should decrease towards the reversible case.
This leaves us with entropy production which changes in a paradoxical manner, or entropy which does not even order the states between diffusion and waves because of the maximum. However, perhaps the notion of ordering the respective systems along lines of constant time misses an essential aspect of the matter.
Figure 1. The entropy S(γ, t) is shown over γ and t. For small times, the entropy increases monotonically with decreasing γ. At times t > t_c, a maximum of S at γ > 1 is found for each t, as indicated by the red dots.
Examination of the evolution of the computed PDFs shows that the solutions between diffusion and waves exhibit properties of both dissipation and propagation simultaneously. The peak of the PDF moves in the half space while the width of the peak broadens. As γ grows, the peak tightens and moves more quickly, until it approaches a delta distribution in the γ = 2 limit. In this sense, it is a precursor to propagation, which we call pseudo propagation, as it is not strict propagation in a mathematical sense.
As this pseudo propagation is more rapid for larger γ, this suggests that the different systems in γ operate with different internal clocks. Larger γ means the process has greater "quickness". This property may be captured by the rate of movement of the mode of the PDF. Although the mode is not known analytically, it exists and can be determined numerically, whereas, for instance, the mean or the first moment does not always exist in such problems, as we shall see in the case below.
The time-dependent mode for this distribution can be determined from the mode x̂_{γ,t=1} at t = 1 via x̂(t) = x̂_{γ,t=1} t^{γ/2}. From this, we can determine a corresponding time, τ_γ(x̂) = (x̂/x̂_{γ,t=1})^{2/γ}, that puts each system at a similar evolutionary position as a function of γ. Along lines of constant x̂ instead of constant t, as shown in Figure 2, we observe a monotonically decreasing entropy while crossing from the diffusive to the reversible case. While this does not eliminate the primary paradoxical behavior of the EPR, it allows the entropy itself to operate as an order parameter that has some intuitive content.
Figure 2. The entropy S(γ, τ_γ(x̂)) is shown over γ and x̂. We observe a monotonically decreasing entropy going from γ = 1 to 2. This is emphasized by the red dots indicating the maximum of S, which is always at γ = 1.
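The mode-based internal clock is straightforward to implement: locate the mode of a sampled PDF on a grid and invert the scaling law to obtain τ_γ. In the sketch below, all function names and the peaked toy profile G(η) = 2η e^{−η²} are our own illustrative choices, not the H-function solution:

```python
import numpy as np

def mode_of(P, x):
    """Locate the mode of a sampled PDF numerically (grid argmax)."""
    return x[np.argmax(P)]

def tau_gamma(x_hat, x_hat_at_1, gamma):
    """Internal-clock time at which the mode reaches position x_hat,
    using the scaling x_hat(t) = x_hat(1) * t**(gamma/2) implied by the
    similarity variable eta = x * t**(-gamma/2)."""
    return (x_hat / x_hat_at_1) ** (2.0 / gamma)

def toy_pdf(x, t, gamma):
    """Toy scaling PDF t**(-gamma/2) * G(eta) with an interior peak,
    G(eta) = 2 * eta * exp(-eta**2) (normalized on eta >= 0)."""
    eta = x * t**(-gamma / 2.0)
    return t**(-gamma / 2.0) * 2.0 * eta * np.exp(-eta**2)
```

For any such scaling family, feeding the numerically located mode back into tau_gamma recovers the elapsed time, which is exactly the property used to compare systems of different γ at "similar evolutionary positions".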

The Paradox Is Not Unique
The entropy production paradox for time-fractional diffusion [3] naturally leads to the question as to whether the paradox is unique. How special and singular is this phenomenon?
It turns out that, while the circumstances that permit this paradox to occur are not universal, they are far from unique. This section provides an alternative circumstance that is as removed from the original paradox as one might get without losing all contact with the context of the problem.
We still must have PDFs that bridge the regime between diffusion and waves with a one-parameter family of PDFs. This can be accomplished by beginning with the diffusion equation and letting the order of the space derivative decrease to 1, where α takes values in (1, 2], with α = 1 representing the (half) wave equation and α = 2 representing the diffusion equation. Equation (19) could hardly be more different from (3) in this context. Equation (19) involves an infinite domain, while (3) has a semi-infinite one. Here, space does not have an initial point as time does. Time-fractional derivatives imply a nonlocality, or "memory", in their definition [24,25] through the time integration, while space-fractional ones retain the classical picture of time found in regular differential equations.
A suitable "memory-less" fractional derivative can be found using Fourier transforms.
as used in [6,11-13], but other definitions are also possible [14]. While the space-fractional derivative is defined through transforms for historical reasons, this does not mean that the time-fractional derivative could not be defined alternatively through a suitable transform.
Utilizing the initial condition P(x, 0) = δ(x), the PDF solutions of Equation (19) turn out to be Lévy stable distributions [6,11], P(x, t) = S(x | α, 1, (D_α t)^{1/α}, 0; 1), (21) with D_α = −cos(απ/2). Note that P(x, t) is defined for −∞ < x < ∞ and 0 ≤ t < ∞. As in the case of time-fractional diffusion, the solution functions (21) show a scaling behavior in space x and time t such that they can be written as a product of a function of time only and a function of the similarity variable η = x t^{-1/α} = x t^{-γ/2}. Thus, the entropy can be written in terms of the similarity variable and a scaling function G(η), and the analysis of the resulting EPR proceeds in the same fashion as in the time-fractional case. It leads again to the entropy production paradox: here, α decreases towards the reversible limit, and the EPR again increases. As shown in Figure 3, again an entropy maximum in the bridging regime (and thus the paradox) occurs, and the internal clock question recurs. As indicated above, this scenario is one where, due to the very different nature of the Lévy stable distributions, i.e., their heavy tails and lack of finite moments, only the mode can be used to track the internal clock of the respective system for each α. In Figure 4, the entropy is given as a function of α and the mode position x̂ of the distribution function. It turns out that here again the mode provides an effective strategy to resolve the ordering problem. However, this successful conclusion distracts from the essential difference between the space-fractional and the time-fractional case.

Figure 3. The entropy S(α, t) is shown over α and t. For small times t < t_c, a monotonically increasing entropy when approaching α = 2, i.e., the diffusion limit, is observed. At larger times, a maximum of S at α < 2 is found for each t, as indicated by the red dots.

Figure 4. The entropy S(α, τ_α(−x̂)) is shown over α and x̂. We observe a monotonically increasing entropy going from α = 1 to 2. This is emphasized by the red dots indicating the maximum of S, which is always at α = 2.
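The entropy of Lévy stable PDFs can be explored numerically by inverting the characteristic function. The sketch below treats the symmetric case with characteristic function exp(−|k|^α t) (an assumption for illustration; the solution (21) carries an additional skewness parameter) and computes the Shannon entropy, whose time dependence S(t) = (1/α) ln t + C(α) follows from the scaling form:

```python
import numpy as np

def symmetric_stable_entropy(alpha, t, n=2**14, dx=0.05):
    """Shannon entropy of the symmetric Levy stable PDF with
    characteristic function exp(-|k|**alpha * t), obtained by FFT
    inversion on a centered grid."""
    x = (np.arange(n) - n // 2) * dx
    dk = 2.0 * np.pi / (n * dx)
    k = (np.arange(n) - n // 2) * dk
    phi = np.exp(-np.abs(k)**alpha * t)
    # centered inverse Fourier transform: P(x) = (1/2pi) int phi e^{ikx} dk;
    # with n*dk*dx = 2pi the normalization reduces to 1/dx
    P = np.fft.fftshift(np.fft.ifft(np.fft.ifftshift(phi))).real / dx
    P = np.clip(P, 1e-300, None)          # guard tiny negative FFT ripples
    return float(-np.sum(P * np.log(P)) * dx)
```

The heavy power-law tails make the entropy integral converge slowly, which is why the grid spans a few hundred units of x; the (1/α) ln t growth is then reproduced to a few percent.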

The Paradox Is Robust
This section makes a small but important point. Does the paradox persist under relaxations in the definitions of entropy itself? It turns out that it does. Entropy generalizations such as Tsallis [26][27][28] and Renyi entropies [8], which include the classical entropy as a limiting case, do not affect the paradox [4,6,11].
In particular, the EPRs for the Tsallis entropy, with q ∈ R\{1}, and the Renyi entropy, with q > 0, have been analyzed. Both definitions reduce to the classical entropy in the limit q → 1. While the extensivity parameter q makes the resulting EPRs more complex, both entropies reflect the paradox, demonstrating its robustness.
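For discretized PDFs, both generalized entropies are short formulas; the sketch below (function names are our own) also includes the Shannon entropy, which both recover in the limit q → 1:

```python
import numpy as np

def tsallis_entropy(P, dx, q):
    """Tsallis entropy S_q = (1 - int P**q dx) / (q - 1), q != 1."""
    return (1.0 - np.sum(P**q) * dx) / (q - 1.0)

def renyi_entropy(P, dx, q):
    """Renyi entropy S_q = ln(int P**q dx) / (1 - q), 0 < q, q != 1."""
    return np.log(np.sum(P**q) * dx) / (1.0 - q)

def shannon_entropy(P, dx):
    """Classical entropy, the common q -> 1 limit of both definitions."""
    Ps = np.clip(P, 1e-300, None)          # avoid log(0) where P vanishes
    return -np.sum(P * np.log(Ps)) * dx
```

For q close to 1, both generalized entropies agree with the Shannon value to first order in (q − 1), which is the sense in which the classical entropy is included as a limiting case.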

Time-Fractional Diffusion on a Finite Interval
The above discussion has already shown that the entropy production paradox occurs in a variety of systems and for a variety of entropy definitions. In this section, we want to show that the entropy production paradox occurs not only in systems with continuous space but also for processes in discrete systems (e.g., [29]).
We consider the dynamics on a linear chain of finite length m. In Figure 5, the setting for a chain with four sites is depicted. The figure visualizes that we treat the case in which only nearest-neighbor connections are present. We consider a simple diffusion process on that set with the dynamics of the probability P(i, t) to be in state i at time t given by the master equation (26), where the w_ij denote the transition probabilities from site j to site i. They are set to w_ij = κ for |i − j| = 1 and w_ij = 0 otherwise, and w_ii = −∑_{j≠i} w_ji. We now generalize this master equation by substituting the first-order time derivative on the left-hand side of (26) with the fractional derivative as given in Equation (5), d^γ/dt^γ P(t) = W P(t), (27) given in matrix-vector notation. We note that now the w_ij no longer represent transition probabilities but connectivity strengths indicating how much the values of P(i + 1, t) and P(i − 1, t) (in addition to P(i, t)) influence the fractional derivative of P(i, t). Below, we will refer to the w_ij as generalized transition probabilities.
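The connectivity matrix W is easy to assemble. The following minimal sketch (function name our own) builds it and illustrates two properties that make it a proper generator: columns summing to zero, which yields probability conservation in the classical case γ = 1, and a non-positive spectrum whose single zero mode corresponds to the uniform equilibrium distribution:

```python
import numpy as np

def chain_matrix(m, kappa=1.0):
    """Connectivity matrix W of the nearest-neighbour chain:
    w_ij = kappa for |i - j| = 1, w_ii = -sum_{j != i} w_ji, 0 otherwise."""
    W = np.zeros((m, m))
    for i in range(m - 1):
        W[i, i + 1] = W[i + 1, i] = kappa
    # diagonal from the off-diagonal column sums -> columns sum to zero
    np.fill_diagonal(W, -W.sum(axis=0))
    return W
```

W is symmetric (it is minus κ times the Laplacian of a path graph), so it has a complete orthonormal eigenbasis, which is exactly what the decoupling in the next step relies on.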
Equation (27) represents a set of m coupled linear time-fractional differential equations. They can be decoupled by an appropriate transition to variables based on the eigensystem of the connectivity strength matrix W with elements w_ij, which has m eigenvalues λ_ν, ν = 1, . . . , m, and corresponding eigenvectors e_ν. The probability distribution can now be expressed as a linear combination of the eigenvectors with coefficients a_ν(t). Inserting (31) into (27) then leads to a set of m decoupled fractional differential equations, which follows from the orthogonality of the eigenvectors.
The resulting solution to this fractional differential equation can be determined for known initial values a_ν(0) and ȧ_ν(0) as a_ν(t) = a_ν(0) E_γ(λ_ν t^γ) + ȧ_ν(0) t E_{γ,2}(λ_ν t^γ), where E_{α,β}(z) represents the generalized Mittag-Leffler function. It is defined by its power series, E_{α,β}(z) = ∑_{k=0}^∞ z^k/Γ(αk + β), and encompasses the Mittag-Leffler function E_α(z) = E_{α,1}(z). For given P(0) and Ṗ(0), the corresponding a_ν(0) and ȧ_ν(0) are obtained by projection onto the eigenvectors. Combining the results, we obtain P(i, t) as well as its time derivative Ṗ(i, t). Finally, the entropy production rate Ṡ is determined. In the following figures (Figures 6-10), the EPR is shown for m = 129 with P(i, 0) = δ_{i,(m+1)/2} and Ṗ(i, 0) = η(δ_{i,(m−1)/2} − δ_{i,(m+3)/2}), where η indicates the strength of the initial probability change. In Figures 6-8, η is set to zero and the EPR is shown for γ = 1.3 and γ = 1.5. An apparent feature is that the EPRs for different γ cross each other several times. This alone exemplifies that the entropy production paradox is present in certain cases.
If we focus on the small-time regime as presented in Figure 6, we can see that the γ = 1.5 case has a larger rate than the γ = 1.3 case, and thus the entropy production paradox is recovered. In the intermediate time regime, between 100 and 800, the two rates cross each other several times (cf. Figure 7), and thus we find the paradox but also the intuitive behavior where the EPR is larger for smaller values of γ, indicating larger proximity to the irreversible diffusion process. Finally, for large times, we see in Figure 8 that the EPRs show regular ordering in the final relaxation process towards equilibrium, with a higher production rate for smaller γ.
In Figure 9, the EPR is shown for two different values of η, i.e., η = 0.006 and η = 0.012. Again, the entropy production paradox appears in the small-time regime, while regular behavior is observed for long time periods. Interestingly, the initial probability change η shows only a small influence for short and intermediate times (see Figure 9a,c), as both EPRs apparently behave in the same way as for η = 0. For long time periods, as shown in Figure 9b,d, we observe a stronger influence of η = 0.012 than of η = 0.006, as expected, and additionally we find that, for smaller values of γ, the influence of η is stronger than for larger γ. A further comparison of the EPRs for different initial probability changes η (cf. Figure 10) shows that the influence of η persists longer for γ = 1.3 than for γ = 1.5. Furthermore, it can be seen that, at least for the smaller values of γ, the initial probability change does not automatically lead to a larger or lower EPR. Whether the EPR is raised or reduced depends on time. This is an exciting finding, which needs further investigation.
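The solution scheme of this section can be sketched end to end: build W, diagonalize, evolve each mode with the Mittag-Leffler function, and recombine. All function names are our own, the series evaluation of E_{α,β} is adequate only for the moderate arguments used here, and the sketch treats the case ȧ_ν(0) = 0 (i.e., η = 0):

```python
import numpy as np
from math import gamma as Gamma

def mittag_leffler(a, b, z, kmax=80, tol=1e-15):
    """Generalized Mittag-Leffler function E_{a,b}(z) via its power
    series sum_k z**k / Gamma(a*k + b); a sketch for moderate |z|,
    not a production evaluator."""
    s = 0.0
    for k in range(kmax):
        term = z**k / Gamma(a * k + b)
        s += term
        if abs(term) < tol * max(1.0, abs(s)):
            break
    return s

def chain_pdf(m, gam, t, kappa=1.0):
    """P(i, t) for the time-fractional master equation (27) on an m-site
    chain, with P(i, 0) a delta at the middle site and zero initial time
    derivative; each mode evolves as a_nu(t) = a_nu(0) E_gam(lam_nu t**gam)."""
    W = np.zeros((m, m))
    for i in range(m - 1):
        W[i, i + 1] = W[i + 1, i] = kappa
    np.fill_diagonal(W, -W.sum(axis=0))
    lam, E = np.linalg.eigh(W)              # symmetric -> orthonormal eigenbasis
    P0 = np.zeros(m)
    P0[m // 2] = 1.0                        # delta initial condition, middle site
    a0 = E.T @ P0                           # project onto the eigenvectors
    at = np.array([a0[nu] * mittag_leffler(gam, 1.0, lam[nu] * t**gam)
                   for nu in range(m)])
    return E @ at                           # recombine the modes
```

For γ = 1, E_1(z) = e^z and the scheme reduces to the classical master equation; total probability is conserved exactly because the zero mode of W carries weight 1 while all other modes are orthogonal to the uniform vector.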

Conclusions
The entropy production paradox was first encountered as an increase of the entropy production rate (EPR) as the parameter γ, representing the fractional order of the time derivative in the time-fractional diffusion equation, is moved from γ = 1, the diffusion case, to γ = 2, the wave case. Subsequently, it was realized that this was not unique. Substantively different cases, such as the space-fractional diffusion equation, were found where the EPR increased as the control parameter approached the reversible limit. This unexpected behavior contradicts the expectation that the EPR ought to decrease as the reversible limit is approached. Furthermore, we have shown that this peculiar behavior is robust, remaining valid even for generalized entropies.
Recently, we extended the analysis to time-fractional diffusion on a finite chain of points. This extension promises new insights, as the scaling features present in fractional diffusion on a half-infinite space no longer exist due to the confinement of the probability to the chain. By a numerical analysis of the EPR, we could establish the existence of the entropy production paradox for short time periods, which are here characterized by the time span in which the distribution, initially starting as a δ-distribution in the middle of the chain, has not yet "seen" one of the ends. Thereafter, a complex behavior appears due to the distribution sloshing against the reflecting ends of the chain. The further analysis of this complex behavior is left for future research. The treatment of these systems in terms of fractional entropies may be of interest in future work [30-32].