The Distribution of Cross Sectional Momentum Returns When Underlying Asset Returns Are Student’s t Distributed

In Kwon and Satchell (2018), a theoretical framework was introduced to investigate the distributional properties of cross-sectional momentum returns under the assumption that the vector of asset returns over the ranking and holding periods is multivariate normal. In this paper, the framework is extended to derive the corresponding results when the asset returns are multivariate Student’s t. In particular, we derive the probability density function and the moments of the cross-sectional momentum returns, and examine in detail the special case of two underlying assets to demonstrate that many of the salient features reported in the empirical literature are consistent with the theoretical implications.


Introduction
Empirical investigation of the observed patterns in asset returns has been an active area of research in finance, with momentum, or persistence, in asset returns being one of the more popular examples of this line of research. Of these, perhaps the most prominent is cross-sectional momentum (CSM), which refers to the observation that the set of assets that outperform relative to another set over a prior period tend to continue to outperform over a subsequent period. The existence of CSM is usually tested empirically by sorting the assets according to their returns over a prior "ranking" period, and constructing a portfolio over a subsequent "holding" period by taking a long position in the "winners" and a short position in the "losers". Statistically significant excess returns from following such a strategy would then support the existence of cross-sectional momentum.
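The portfolio construction just described can be sketched in a few lines of Python. This is an illustrative sketch only and not code from the paper: the function name `csm_return` and its arguments are hypothetical, and equal weighting within the winner and loser portfolios is assumed.

```python
from typing import Sequence


def csm_return(ranking: Sequence[float], holding: Sequence[float],
               n_winners: int = 1, n_losers: int = 1) -> float:
    """Equally weighted winners-minus-losers return over the holding period."""
    if len(ranking) != len(holding):
        raise ValueError("ranking and holding must cover the same assets")
    if n_winners < 1 or n_losers < 1 or n_winners + n_losers > len(ranking):
        raise ValueError("need n_winners, n_losers >= 1 and n_winners + n_losers <= n")
    # Asset indices sorted from best to worst ranking-period return.
    order = sorted(range(len(ranking)), key=lambda i: ranking[i], reverse=True)
    winners, losers = order[:n_winners], order[-n_losers:]
    long_leg = sum(holding[i] for i in winners) / n_winners
    short_leg = sum(holding[i] for i in losers) / n_losers
    return long_leg - short_leg


# Hypothetical returns for four assets over the ranking and holding periods.
ranking_returns = [0.05, -0.02, 0.10, 0.01]
holding_returns = [0.03, -0.01, 0.02, 0.00]
print(csm_return(ranking_returns, holding_returns, n_winners=1, n_losers=1))
```

Averaging such holding-period returns over many non-overlapping periods and testing whether the average differs from zero corresponds to the empirical test described above.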
Cross-sectional momentum strategies are popular with practitioners since they tend to generate positive returns, and with academics because their profitability would run contrary to an implication of the efficient market hypothesis, namely that there are no discernible patterns in asset returns. There is an extensive academic literature investigating the properties of CSM returns covering various asset classes, markets, and jurisdictions. The most notable findings are that CSM returns are generally slightly positive, but become highly negative during times of market uncertainty, and that losses during such periods tend to cancel out, or at least significantly reduce, the prior gains.
Various authors, including Fama and French (1992), Jegadeesh and Titman (1993, 2001), Asness (1994), and Israel and Moskowitz (2013), found that momentum strategies are profitable in US equities markets over different time periods dating back to 1927. Analogous results were found for country equity indices by Richards (1997), Asness et al. (1997), Chan et al. (2000), and Hameed and Yuanto (2002), for emerging markets by Rouwenhorst (1998), for exchange rate markets by Okunev and White (2003) and Menkhoff et al. (2012), for commodities by Erb and Harvey (2006), for futures contracts by Moskowitz et al. (2012), and for industries by Sefton and Scowcroft (2004). Similar results were also found by Asness et al. (2013) and Daniel and Moskowitz (2016) for markets in the European Union, Japan, the United Kingdom, and the United States, and across asset classes including fixed income, commodities, foreign exchange, and equity from 1972 through 2013.
Despite the extensive literature on the empirical properties of momentum-based returns, relatively few studies consider the distributional properties of these returns from a theoretical viewpoint, with Kwon and Satchell (2018) being a notable exception that addresses the CSM returns as defined in this paper. Most of the known theoretical results, obtained for example by Lo and MacKinlay (1990), Jegadeesh and Titman (1993), Lewellen (2002), and Moskowitz et al. (2012), are concerned only with the expected values and first order autocorrelations of returns from the so-called weighted relative strength strategy, in which the portfolio over the holding period is constructed from all underlying assets weighted, essentially, in proportion to their absolute or relative returns over the ranking period. We wish to calculate the distribution of CSM returns because it allows us to calculate percentiles, quantiles, and related quantities, and to deduce which moments of the returns exist and their precise form. Such information can be used, for example, to assess the fatness of the tails of the distribution, which is valuable for risk management calculations as well as for understanding the benefits and limitations of portfolio construction.
By assuming that underlying asset returns are Gaussian, the distribution and the moments of the CSM returns were derived in Kwon and Satchell (2018). In this paper, we extend their results to the case where the underlying asset returns are Student's t to derive the probability density function and the moments of the CSM returns. The t distribution arises naturally, for example, in a framework where asset volatility is stochastic, and conventional mean-variance analysis will create returns which are very similar to t-distributed returns. The important distinction between Student's t returns and normal returns is that the distribution of Student's t has an additional parameter which governs the fatness of the tails of the distribution and can be used to assess tail risk. There is a trade-off between realism and complexity; we would like to use a more complex distribution such as the skewed Student's t considered in Theodossiou (1998) and Hansen et al. (2010), but the analytical complexity that results becomes prohibitive.
Although the individual asset returns do not exhibit skewness under the generalization to Student's t, they can be leptokurtic, which is a well-established feature in the empirical literature. Moreover, the CSM returns can, and do, exhibit skewness that depends on the statistical properties of the underlying assets. A detailed analysis of the special case of two underlying assets reveals that many of the salient features of the CSM returns reported in the empirical literature are consistent with the theoretical implications from this framework. This analysis is of interest because Kwon and Satchell (2018) were able to show that non-normality was a consequence of the momentum structure, even when the underlying returns were normal. We therefore wish to assess what impact non-normality in the underlying returns will have on CSM returns. For example, will it exacerbate non-normality or make very little difference? Answers to this question will shed light on applying CSM to universes of assets which are fundamentally non-normal, such as emerging markets.
It should be pointed out that since we work under the assumption that asset returns over the ranking and holding periods are jointly t-distributed, there are limitations in the properties of momentum returns that can be addressed in the theoretical framework of this paper. For example, it is not possible to adequately address properties that depend on certain firm specific, economic, or financial factors such as liquidity, credit spread, market sentiment, business cycle, and information asymmetry since these factors cannot easily be captured in the distributional assumption on asset returns. Theoretical investigation of such properties would require an extension with the ability to incorporate such factors.
Finally, it may be asked what the connection is between our analysis and the extensive linear factor modelling that dominates the asset pricing literature. This literature essentially says that the time t mean of an asset, say the first, is a linear function of factor returns. In the framework of this paper, we can accommodate such modelling by interpreting the asset mean to be conditional on factor returns.
The remainder of this paper is organized as follows. Section 2 introduces the notation and the key results on multivariate normal distributions, and Section 3 provides a mathematically precise definition of CSM returns. Although the expressions for the CSM return density and the associated moments are quite complex in general, they simplify considerably in the case of two assets with one winner and one loser. This special case is examined in detail in Section 4, along with its implications for the empirically observed features reported in the literature. The paper concludes with Section 5.

Notation and Preliminaries
For the convenience of the reader, we introduce in this section the notation that will be used throughout the paper, and present some known results that will be relied upon in subsequent sections.

Notation
For any $x \in \mathbb{R}^n$, we will write $x_i$ for the $i$-th coordinate of $x$, and given $y \in \mathbb{R}^n$ write $x \prec y$ if and only if $x_i < y_i$ for all $1 \le i \le n$. Similarly, given a matrix $M \in \mathbb{R}^{m \times k}$, we will write $M_{i,j}$ for the $(i, j)$-th entry of $M$, and the transpose of a vector or a matrix will be denoted by the superscript $\top$. The vector in $\mathbb{R}^n$ with all entries equal to 1 will be denoted $1_n$, and given a subset $A \subset \mathbb{R}^n$, we will denote by $I_A$ the indicator function on $A$.
Given a random vector, $X$, with values in a region $D_X \subset \mathbb{R}^n$, we will write $f_X(x)$ and $F_X(x)$ for the probability density and the cumulative distribution functions of $X$, respectively. Moreover, given another random vector $Y$, with values in $D_Y \subset \mathbb{R}^m$, we will denote by $f_{X|Y}(x \mid y)$ and $F_{X|Y}(x \mid y)$ the conditional probability density and conditional cumulative distribution functions of $X$ given $Y = y$, respectively.

For any $n \in \mathbb{N}$, let $[n] = \{1, 2, \dots, n\}$ and let $S_n$ be the set of permutations of $[n]$. We will denote the permutation that maps $1 \mapsto i_1$, $2 \mapsto i_2$, $\dots$, $n \mapsto i_n$ by the sequence $(i_1, i_2, \dots, i_n)$, and given any $\tau \in S_n$ write $\tau(i)$ for the image of $i$ under $\tau$ so that if $\tau = (1, 3, 2)$, for example, then $\tau(1) = 1$, $\tau(2) = 3$, and $\tau(3) = 2$. Given a permutation $\tau \in S_n$, we will denote by $P_\tau \in \mathbb{R}^{n \times n}$ the permutation matrix corresponding to $\tau$ and denote by $D_n \in \mathbb{R}^{(n-1) \times n}$ the matrix
$$D_n = \begin{pmatrix} -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & -1 & 1 \end{pmatrix}.$$

The elements of the permutation group $S_n$ act naturally on the set of polynomials, $\mathbb{R}[x_1, \dots, x_n]$, by the rule
$$(\tau p)(x_1, \dots, x_n) = p(x_{\tau(1)}, \dots, x_{\tau(n)})$$
for any polynomial $p \in \mathbb{R}[x_1, \dots, x_n]$ and $\tau \in S_n$. For any $n_1, n_2 \in \mathbb{N}$, let $p_{n_1,2n_2} \in \mathbb{R}[x_1, \dots, x_{n_1+2n_2}]$ be the polynomial $p_{n_1,2n_2}(x_1, \dots, x_{n_1+2n_2}) =$

Denote by $Z(n_1, 2n_2)$ the stabilizer of $p_{n_1,2n_2}$ under the action of $S_{n_1+2n_2}$ so that $Z(n_1, 2n_2) = \{\tau \in S_{n_1+2n_2} \mid \tau p_{n_1,2n_2} = p_{n_1,2n_2}\}$, and let $Q(n_1, 2n_2) = S_{n_1+2n_2}/Z(n_1, 2n_2)$ be the quotient group,¹ with elements of $Q(n_1, 2n_2)$ identified with their coset representatives $\tau \in S_{n_1+2n_2}$. Finally, define $[n]^m = [n] \times \cdots \times [n]$ as the $m$-fold Cartesian product of $[n] = \{1, 2, \dots, n\}$.

Multivariate Normal Distributions
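The density of an $n$-dimensional normal distribution with mean $\mu$ and covariance $\Sigma$ at $x \in \mathbb{R}^n$ will be denoted $\phi_n(x; \mu, \Sigma)$, and the corresponding cumulative distribution function will be denoted $\Phi_n(x; \mu, \Sigma)$. In general, given random variables $X_1, \dots, X_n$, their joint probability density function will be denoted $f_{X_1,\dots,X_n}$, and we will write $F_{X_1,\dots,X_n}$ for the cumulative distribution function.

Theorem 1. Let $n_1, n_2 \in \mathbb{N}$ and suppose $X \sim N_{n_1+n_2}(\mu, \Sigma)$, where
$$X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}, \qquad \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \Sigma_{1,1} & \Sigma_{1,2} \\ \Sigma_{2,1} & \Sigma_{2,2} \end{pmatrix},$$
with $X_i, \mu_i \in \mathbb{R}^{n_i}$ and $\Sigma_{i,j} \in \mathbb{R}^{n_i \times n_j}$ for $1 \le i, j \le 2$, and $\Sigma$ positive definite. Then, the conditional distribution of $X_1$ given $X_2 = x_2$ is normal with mean and covariance
$$\mu_{1|2} = \mu_1 + \Sigma_{1,2} \Sigma_{2,2}^{-1} (x_2 - \mu_2), \qquad \Sigma_{1|2} = \Sigma_{1,1} - \Sigma_{1,2} \Sigma_{2,2}^{-1} \Sigma_{2,1},$$
respectively, and $\phi_{n_1+n_2}(x; \mu, \Sigma)$ decomposes as
$$\phi_{n_1+n_2}(x; \mu, \Sigma) = \phi_{n_1}(x_1; \mu_{1|2}, \Sigma_{1|2}) \, \phi_{n_2}(x_2; \mu_2, \Sigma_{2,2}).$$
Proof. Refer to Muirhead (1982) Theorem 1.2.11.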
Given an $n$-dimensional random vector $X$ and $p = (p_1, \dots, p_m) \in \{1, 2, \dots, n\}^m$, we will denote by $\mu_p(X)$ the $p$-th moment of $X$ so that
$$\mu_p(X) = E\left[ X_{p_1} X_{p_2} \cdots X_{p_m} \right]. \tag{9}$$
Note that the subscripts, $p_i$, in (9) may be repeated so that the above definition is equivalent to the more familiar definition of moments in which the powers of the components of $X$ appear inside the expectation on the right-hand side, viz. $E[X_1^{k_1} \cdots X_n^{k_n}]$, where $k_j$ is the number of indices $p_i$ equal to $j$.
1 Refer to Rotman (1995) Chapter 2 for the details on quotient groups.
An alternative expression for $\mu_p(X)$, where $X$ is multivariate normal, is given in Kan (2008) Proposition 2.
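Corollary 1. Let $X \sim N_1(\mu, \sigma^2)$. Then, for any $m \in \mathbb{N}$, the $m$-th moment of $X$ is given by
$$E[X^m] = \sum_{l=0}^{\lfloor m/2 \rfloor} \binom{m}{2l} (2l-1)!! \, \sigma^{2l} \mu^{m-2l},$$
where $k!!$ denotes the double factorial, with $(-1)!! = 1$. In the special case where $\mu = 0$ and $m$ is even,
$$E[X^m] = (m-1)!! \, \sigma^m.$$
Proof. Follows from Theorem 2, since the inner sum in (10), for which $2l = i$, consists of $\binom{m}{i}(i-1)!!$ identical terms that are all equal to $\sigma^i \mu^{m-i}$.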

Multivariate Student's t Distribution
Asset returns are often assumed to be normally distributed in the academic literature for theoretical convenience, in which case they are completely determined by the location and the scale parameters. However, it is widely reported in the empirical literature that the observed asset returns exhibit excess kurtosis. Student's t-distribution is a distribution from the elliptical family with an additional parameter ν, viz. number of degrees of freedom that controls the kurtosis. The Student's t distribution reduces to the normal in the limit as ν → ∞, and hence provides a convenient framework under which to investigate the impact of excess kurtosis in the underlying asset returns on the distributional properties of the CSM return. This subsection provides a brief summary of the key properties of the Student's t distribution that will be required in the remainder of this paper.
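The t distribution has a long history in mathematical statistics. The univariate probability density function (pdf), $t_1(x; \mu, \sigma^2, \nu)$, of a t-variate with mean $\mu$, scale parameter $\sigma$, and degrees of freedom $\nu$ is given by
$$t_1(x; \mu, \sigma^2, \nu) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right) \sqrt{\nu\pi} \, \sigma} \left( 1 + \frac{(x-\mu)^2}{\nu\sigma^2} \right)^{-\frac{\nu+1}{2}}.$$
From the origins of the t-test, it is clear that we can write the corresponding random variable as $x = \mu + z/\sqrt{Y}$, where $z$ and $Y$ are independent, $z \sim N(0, \sigma^2)$, $Y = g/\nu$, and $g \sim \chi^2(\nu)$.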
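The scale-mixture representation can be checked by simulation. The following is a minimal sketch that assumes the usual form $x = \mu + z/\sqrt{Y}$ with $z \sim N(0, \sigma^2)$ and $Y = g/\nu$, $g \sim \chi^2(\nu)$. The helper names are hypothetical, and the parameter values are chosen for illustration only.

```python
import math
import random


def sample_student_t(mu: float, sigma: float, nu: float, rng: random.Random) -> float:
    """Draw x = mu + z / sqrt(Y) with z ~ N(0, sigma^2), Y = g / nu, g ~ chi^2(nu)."""
    z = rng.gauss(0.0, sigma)
    # chi^2(nu) is a Gamma distribution with shape nu / 2 and scale 2.
    g = rng.gammavariate(nu / 2.0, 2.0)
    return mu + z / math.sqrt(g / nu)


def sample_mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    mean = sum(xs) / len(xs)
    return mean, sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


rng = random.Random(12345)  # fixed seed, for reproducibility only
mu, sigma, nu = 0.05, 0.2, 10.0
draws = [sample_student_t(mu, sigma, nu, rng) for _ in range(100_000)]
mean_hat, var_hat = sample_mean_and_variance(draws)
var_theory = sigma ** 2 * nu / (nu - 2.0)  # finite only for nu > 2
print(mean_hat, var_hat, var_theory)
```

The sample variance should be close to $\sigma^2 \nu/(\nu - 2)$ rather than $\sigma^2$, which is why $\sigma$ is referred to as a scale parameter rather than a standard deviation.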
In extending the definition of the t distribution to the multivariate case, we are faced with a choice. Although the choice $z \sim N_n(0, \Sigma)$ is clear, $Y$ can be defined in various ways. For example:

All three of these choices have a stochastic volatility interpretation corresponding to
• idiosyncratic shocks,
• common factor shocks,
• economy-wide or market-wide shock.
Although we have chosen the final characterization, cross sectional momentum could also be analyzed under other characterizations.
Throughout this paper, the probability density function of the $n$-dimensional Student's t distribution, $St_n(\mu, \Sigma, \nu)$, with $\nu$ degrees of freedom, location $\mu$, and shape matrix $\Sigma$ at $x \in \mathbb{R}^n$ will be denoted $t_n(x; \mu, \Sigma, \nu)$, and we will write $T_n(x; \mu, \Sigma, \nu)$ for the corresponding cumulative distribution function. The next theorem shows that the multivariate Student's t distribution is closed under conditioning, in the sense that the conditional density of a subset given its complement is again Student's t. This property will be crucial in the investigation of the CSM returns in later sections.
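Theorem 3. Let $n_1, n_2 \in \mathbb{N}$ and suppose $X \sim St_{n_1+n_2}(\mu, \Sigma, \nu)$, where
$$X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}, \qquad \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \Sigma_{1,1} & \Sigma_{1,2} \\ \Sigma_{2,1} & \Sigma_{2,2} \end{pmatrix},$$
with $X_i, \mu_i \in \mathbb{R}^{n_i}$ and $\Sigma_{i,j} \in \mathbb{R}^{n_i \times n_j}$ for $1 \le i, j \le 2$, and $\Sigma$ positive definite. Then, the conditional distribution of $X_1$ given $X_2 = x_2$ is Student's t with degrees of freedom $\nu_{X_1|X_2} = \nu + n_2$, and location and shape matrix
$$\mu_{X_1|X_2} = \mu_1 + \Sigma_{1,2} \Sigma_{2,2}^{-1} (x_2 - \mu_2), \qquad \Sigma_{X_1|X_2} = \frac{\nu + (x_2 - \mu_2)^\top \Sigma_{2,2}^{-1} (x_2 - \mu_2)}{\nu + n_2} \left( \Sigma_{1,1} - \Sigma_{1,2} \Sigma_{2,2}^{-1} \Sigma_{2,1} \right),$$
respectively, and $t_{n_1+n_2}(x; \mu, \Sigma, \nu)$ decomposes as
$$t_{n_1+n_2}(x; \mu, \Sigma, \nu) = t_{n_1}(x_1; \mu_{X_1|X_2}, \Sigma_{X_1|X_2}, \nu_{X_1|X_2}) \, t_{n_2}(x_2; \mu_2, \Sigma_{2,2}, \nu).$$
Proof. Refer to Roth (2013) Appendix A.6, or Muirhead (1982) Problems 1.29 and 1.30.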
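The closure under conditioning can be verified numerically in the simplest case $n_1 = n_2 = 1$. The sketch below is illustrative only: the function names are hypothetical, the parameter values are arbitrary, and the conditional parameters are the standard ones for the multivariate Student's t distribution as found, for example, in Roth (2013).

```python
import math


def t1_pdf(x, mu, s2, nu):
    """Univariate Student's t density with location mu, squared scale s2, dof nu."""
    c = math.gamma((nu + 1) / 2) / (math.gamma(nu / 2) * math.sqrt(nu * math.pi * s2))
    return c * (1 + (x - mu) ** 2 / (nu * s2)) ** (-(nu + 1) / 2)


def t2_pdf(x1, x2, mu1, mu2, s11, s12, s22, nu):
    """Bivariate Student's t density with shape matrix [[s11, s12], [s12, s22]]."""
    det = s11 * s22 - s12 ** 2
    d1, d2 = x1 - mu1, x2 - mu2
    # Quadratic form (x - mu)' Sigma^{-1} (x - mu) for a 2 x 2 shape matrix.
    q = (s22 * d1 ** 2 - 2 * s12 * d1 * d2 + s11 * d2 ** 2) / det
    c = math.gamma((nu + 2) / 2) / (math.gamma(nu / 2) * nu * math.pi * math.sqrt(det))
    return c * (1 + q / nu) ** (-(nu + 2) / 2)


def conditional_params(x2, mu1, mu2, s11, s12, s22, nu):
    """Degrees of freedom, location and squared scale of X1 given X2 = x2."""
    q2 = (x2 - mu2) ** 2 / s22
    loc = mu1 + s12 / s22 * (x2 - mu2)
    scale2 = (nu + q2) / (nu + 1) * (s11 - s12 ** 2 / s22)
    return nu + 1, loc, scale2


params = dict(mu1=0.06, mu2=0.04, s11=0.18, s12=0.05, s22=0.14, nu=5.0)
x1, x2 = 0.3, -0.2
nu_c, loc_c, s2_c = conditional_params(x2, **params)
joint = t2_pdf(x1, x2, **params)
product = t1_pdf(x1, loc_c, s2_c, nu_c) * t1_pdf(x2, params["mu2"], params["s22"], params["nu"])
print(joint, product)
```

The two printed values should agree up to floating-point rounding. Note that, unlike the normal case, the conditional scale depends on the conditioning value $x_2$.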
Although a Student's t distribution does not have finite moments of all orders, the next theorem provides an explicit expression for those that do exist.
Proof. Refer to Appendix A.
Setting n = 1 gives the moments of the one-dimensional Student's t distribution.
Corollary 2. Let X ∼ St 1 (µ, σ 2 , ν). Then, for any m ∈ N such that m < ν, the m-th moment of X is given by where k!! is the double factorial defined in Corollary 1.
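For readers who wish to evaluate these moments numerically, the following sketch uses a standard closed form for the even moments of a standard Student's t variable, $E[T^{2l}] = \nu^l \, \Gamma(l + 1/2) \, \Gamma(\nu/2 - l) / (\sqrt{\pi} \, \Gamma(\nu/2))$, combined with a binomial expansion of $(\mu + \sigma T)^m$. It is an equivalent textbook expression and may differ in form from the one stated in Corollary 2. The function name is hypothetical.

```python
import math


def student_t_moment(m: int, mu: float, sigma: float, nu: float) -> float:
    """E[X^m] for X = mu + sigma * T, T standard Student's t with nu dof, m < nu."""
    if not 0 <= m < nu:
        raise ValueError("the m-th moment requires 0 <= m < nu")
    total = 0.0
    for l in range(m // 2 + 1):
        # Even moment E[T^(2l)]; odd moments of T vanish by symmetry.
        even_moment = (nu ** l * math.gamma(l + 0.5) * math.gamma(nu / 2 - l)
                       / (math.sqrt(math.pi) * math.gamma(nu / 2)))
        total += math.comb(m, 2 * l) * mu ** (m - 2 * l) * sigma ** (2 * l) * even_moment
    return total


# Second and fourth moments of a standard t variable with 5 degrees of freedom.
print(student_t_moment(2, 0.0, 1.0, 5.0), student_t_moment(4, 0.0, 1.0, 5.0))
```

The restriction $m < \nu$ is enforced explicitly, since the Gamma function argument $\nu/2 - l$ must remain positive for the moment to exist.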
Proof. Follows from similar arguments to Theorem 4 noting that the indices p i j are all equal in this case.
For an alternative derivation of the moments of Student's t distribution, refer to Kirkby et al. (2019).

Unified Skew t Family of Distributions
Multivariate skew-normal (SN) distributions were introduced in Azzalini and Valle (1985) to generalize normal distributions to those that allow non-zero skewness, and the seemingly disparate distributions related to the multivariate SN distributions were brought together under the umbrella of the so-called unified skew-normal (SUN) family of distributions in Arellano-Valle and Azzalini (2006), where it was shown that the SUN family contains many of these skew-normal variants as special cases. The extension of the normal family to those with non-zero skewness was then extended to the elliptical family of distributions in Arellano-Valle and Genton (2010). In what follows, we only summarize the results on the extension for the multivariate Student's t distributions that will be required in this paper, and refer the reader to Arellano-Valle and Genton (2010) and Jamalizadeh and Balakrishnan (2012) for the details. Given Then, the probability density function, f U (u), of an n 1 -dimensional unified skew t (SUT) distributed random variable, U, associated with (X 1 , X 2 ) given by where The key characteristic of f U (u) is that it is a product of an n 2 -dimensional Student's t density and an n 1 -dimensional cumulative Student's t density with the variable u appearing as the main variable in the former and in the mean and variance parameters of the latter. As will be seen, the densities of cross sectional momentum returns will be a weighted sum of these SUT distributions.

Cross-Sectional Momentum Returns with Student's t Distributed Asset Returns
In this section, we derive the distributional properties of the cross sectional momentum (CSM) returns under the assumption that the underlying asset returns are multivariate Student's t. We begin by recalling the mathematically precise definition of the CSM return from Kwon and Satchell (2018).
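Let $m_+, m_-, n \in \mathbb{N}$ with $m_+, m_- > 0$ and $m_+ + m_- \le n$, and for each $1 \le i \le n$ denote by $r_{t,i}$ the return on asset $i$ at time $t$. Moreover, let $r_t = (r_{t,1}, \dots, r_{t,n})^\top \in \mathbb{R}^n$, and for any $\tau \in S_n$, define
$$r_{\tau,m_\pm,t} = \iota^\top P_\tau r_t = \frac{1}{m_+} \sum_{i=1}^{m_+} r_{t,\tau_i} - \frac{1}{m_-} \sum_{i=n-m_-+1}^{n} r_{t,\tau_i}, \qquad x_{\tau,t} = D_n P_\tau r_t,$$
where $\tau_i = \tau(i)$ and $\iota \in \mathbb{R}^n$ is the vector with the first $m_+$ entries equal to $1/m_+$, the last $m_-$ entries equal to $-1/m_-$, and all remaining entries equal to zero. Note that any given $\tau \in S_n$ defines an ordering, $r_{t,\tau_1} > r_{t,\tau_2} > \cdots > r_{t,\tau_n}$, of the components of $r_t$. Thus, $r_{\tau,m_\pm,t}$ represents the return on a portfolio where the top $m_+$ ranked assets are equally weighted and held long while the bottom $m_-$ assets are equally weighted and held short. The assumption of equal weighting is for notational simplicity only, and not crucial for the general theoretical results. Note also that $x_{\tau,t}$ is defined to allow the ranking of the components of $r_t$ corresponding to $\tau \in S_n$ to be written succinctly as $x_{\tau,t} \prec 0_{n-1}$.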
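Definition 1. The $(m_+, m_-)$-cross sectional momentum return, $r_{m_\pm,t+1}$, is defined by
$$r_{m_\pm,t+1} = \sum_{\tau \in S_n} r_{\tau,m_\pm,t+1} \, I_{\{x \prec 0_{n-1}\}}(x_{\tau,t}), \tag{28}$$
where $I_A$, for any subset $A$, denotes the indicator function on the set $A$.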
For intuition behind the definition of $r_{m_\pm,t+1}$, note that the components of $r_t$, representing asset returns over the ranking period, can be arranged in any of the $n!$ orderings corresponding to the permutations $\tau \in S_n$. For each such ranking $r_{t,\tau_1} > r_{t,\tau_2} > \cdots > r_{t,\tau_n}$, the $m_+$ winner returns over the holding period are $r_{t+1,\tau_1}, \dots, r_{t+1,\tau_{m_+}}$ while the $m_-$ loser returns are $r_{t+1,\tau_{n-m_-+1}}, \dots, r_{t+1,\tau_n}$. Equally weighting the returns in the winner and the loser portfolios gives $r_{\tau,m_\pm,t+1}$, and since the ranking of components of $r_t$ determined by $\tau \in S_n$ is equivalent to the condition $x_{\tau,t} \prec 0_{n-1}$, summing over all possible $r_{\tau,m_\pm,t+1}$ and prefixing by the matching indicator function gives the expression for $r_{m_\pm,t+1}$ in (28).
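For small $n$, the sum over permutations can be evaluated directly. The following sketch is illustrative only: the function name is hypothetical, and it assumes that $x_{\tau,t}$ consists of the successive differences $r_{t,\tau_{i+1}} - r_{t,\tau_i}$, which is our reading of how the ranking condition is encoded. Exactly one permutation has a non-zero indicator when the ranking-period returns are distinct, which occurs with probability one under a continuous distribution.

```python
from itertools import permutations


def csm_return_by_permutations(r_t, r_next, m_plus, m_minus):
    """Evaluate the CSM return as a sum over permutations weighted by indicators."""
    n = len(r_t)
    total = 0.0
    for tau in permutations(range(n)):
        # x[i] = r_t[tau[i+1]] - r_t[tau[i]]; every entry is negative exactly when
        # r_t[tau[0]] > r_t[tau[1]] > ... > r_t[tau[n-1]].
        x = [r_t[tau[i + 1]] - r_t[tau[i]] for i in range(n - 1)]
        indicator = 1.0 if all(xi < 0 for xi in x) else 0.0
        winners = sum(r_next[tau[i]] for i in range(m_plus)) / m_plus
        losers = sum(r_next[tau[i]] for i in range(n - m_minus, n)) / m_minus
        total += indicator * (winners - losers)
    return total


# Hypothetical ranking-period and holding-period returns for four assets.
r_t = [0.05, -0.02, 0.10, 0.01]
r_next = [0.03, -0.01, 0.02, 0.00]
print(csm_return_by_permutations(r_t, r_next, m_plus=1, m_minus=1))
```

The enumeration costs $n!$ evaluations and is only meant to mirror the definition; sorting the ranking-period returns gives the same value far more cheaply.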
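For the remainder of this paper, we make the following assumption on the distribution of $(r_t^\top, r_{t+1}^\top)^\top$.

Assumption 1. The vector of returns, $(r_t^\top, r_{t+1}^\top)^\top$, is multivariate Student's t distributed so that
$$\begin{pmatrix} r_t \\ r_{t+1} \end{pmatrix} \sim St_{2n}\left( \begin{pmatrix} \mu_t \\ \mu_{t+1} \end{pmatrix}, \begin{pmatrix} \Sigma_{t,t} & \Sigma_{t,t+1} \\ \Sigma_{t+1,t} & \Sigma_{t+1,t+1} \end{pmatrix}, \nu \right),$$
with $\nu \in \mathbb{R}_+$, $\mu_u \in \mathbb{R}^n$, and $\Sigma_{u,v} \in \mathbb{R}^{n \times n}$, where $u, v \in \{t, t+1\}$.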
Since the t-distribution is symmetric, there are limitations on the properties of asset returns that can be captured adequately by the above assumption, as already discussed in Section 1. Nevertheless, the assumption is sufficiently general to accommodate linear factor models and econometric models such as vector autoregressive moving average models where the factors and noise terms, respectively, are t-distributed. Moreover, the framework also allows consideration of more general cases where $\mu_t$ and $\mu_{t+1}$ are conditional means linear in factors without requiring the factors themselves to be multivariate t. However, since the analysis in later sections will show that momentum returns are nonlinear in the underlying asset returns, the common practice of regressing momentum returns on the various factors must be interpreted as a best linear prediction rather than a conditional expectation in such cases. We now derive the probability density function, $f_{m_\pm,t+1}(r)$, of the CSM return $r_{m_\pm,t+1}$.
Note that the summands that appear in the pdf of the CSM return in (30) have the characteristic form of the SUT densities given in (21), other than for the omission of the normalization factor that appears in the denominator of (21). It follows that the pdf of the CSM return is a weighted sum of the SUT densities. The next result gives the pdf in the special case where $r_t$ and $r_{t+1}$ are independent, which can be considered as the case where the market is efficient.
Corollary 3. If r t , r t+1 satisfies Assumption 1, and r t and r t+1 are independent, then where The normalization factor is, in fact, the probability of the event 0 ≺ X 1 , where X 1 is as defined in (20).
We next derive the expressions for the non-central moments of the CSM returns. Since t distributions do not have moments of all orders as noted in Theorem 4, the moments of CSM returns will also only exist up to a certain order.

Proof. Refer to Appendix C.

Special Case of Two Assets
In this section, we examine in detail the special case of two assets, and begin by computing the partial moments of one-dimensional Student's t distributions that will be required. To reduce notational burden, we define for η ∈ R, ς ∈ R + , and ν ∈ R + c η, so that from (13) we have Lemma 1. Let η ∈ R, ς ∈ R + and 2 < ν ∈ R + . Then, for m ∈ N + , we have Proof. Refer to Appendix D.
The next theorem will play a key role in the derivation of the non-central moments of the CSM returns.
As will be seen, the quantities that play a key role in the two asset case are the spreads, $r_{t,2} - r_{t,1}$ and $r_{t+1,2} - r_{t+1,1}$, and so we define where $1 \le i, j \le 2$ and $u \in \{t, t+1\}$. Note that $\varsigma_u^2$ is the variance of the spread $r_{u,2} - r_{u,1}$, and $\varrho_{t,t+1}$ is the correlation between $r_{t,2} - r_{t,1}$ and $r_{t+1,2} - r_{t+1,1}$.

Next, we compute the terms $\gamma_\tau(x)$ and $\Upsilon_\tau(x)$ that appear in the expression (33) for the pdf of the CSM return. For $u, v \in \{t, t+1\}$ and $\tau \in S_2$, we have $\iota^\top P_{(1,2)} \mu_u = -\eta_u = -\iota^\top P_{(2,1)} \mu_u$, If we define the sign of permutations in $S_2$ by $\varepsilon((1, 2)) = 1$ and $\varepsilon((2, 1)) = -1$, then the expressions for $\gamma_\tau(x)$ and $\Upsilon_\tau(x)$ can be written succinctly as The next lemma will provide the building blocks for the non-central moments of CSM returns.
Proof. Follows from using the binomial formula to expand the powers of $\gamma_\tau(x)$ and $\Upsilon_\tau(x)$, and applying the definition of $\kappa_m(\eta, \varsigma^2, \nu)$.
Theorem 8. Let $\eta \in \mathbb{R}$, $\varsigma \in \mathbb{R}_+$ and $\nu \in \mathbb{R}_+$. Then, for $m \in \mathbb{N}_+$ such that $m < \nu$, the $m$-th non-central moment, $\mu_m(r_{1_\pm,t+1})$, of $r_{1_\pm,t+1}$ is given as follows: where $\kappa_{\alpha,\beta,\tau}(\eta, \varsigma, \nu)$ is as defined in (49). In particular, the first four non-central moments are Proof. Follows from the general expression (38) for the moments of $r_{1_\pm,t+1}$ and the definition of

We remark that the non-central moments of $r_{1_\pm,t+1}$ given in Theorem 8 are sums indexed by $S_2$, which consists of two elements, and that each term that appears in these sums can be computed recursively using (43), (48), and (49). Since the right-hand side of (48) consists of a finite number of terms and (43) is equivalent to the explicit expressions (44) or (45) depending on the index $m$, these moments of $r_{1_\pm,t+1}$ can be computed without having to make any simplifying approximations. For example, the first moment is given explicitly by which is reassuring since it has the same functional form as the following expression³ obtained for the normally distributed asset return case in Kwon and Satchell (2018) except for the distribution functions being Student's t rather than normal. The numerical calculations in this paper were performed using code written in C++ that relied on the boost library⁴ to compute the functions $t_1$ and $T_1$.

Returning briefly to the linear factor structure discussed in Section 1, we could consider $\mu_{t,1}$ and $\mu_{t,2}$ to be linear combinations of factors, which in a Carhart (1997) model context would consist of size, market, value, and momentum. Thus, if we were to go long asset 1 and short asset 2 in our CSM portfolio, we might expect a larger exposure to the momentum factor for asset 1 and a smaller exposure for asset 2. We could carry out further detailed analysis to accommodate these features but leave this for further research.
3 Rewritten in the notation of this paper.
4 Refer to www.boost.org/doc/libs/1_72_0/boost/math/distributions/students_t.hpp

If we denote by $\mu_{1_\pm,t+1}$ the mean, $\sigma^2_{1_\pm,t+1}$ the variance, $\gamma_{1_\pm,t+1}$ the skewness, and $\kappa_{1_\pm,t+1}$ the excess kurtosis of the CSM return, then these quantities are easily computed from the non-central moments given in Theorem 8. It should be noted that the quantities corresponding to the odd moments, viz. $\mu_{1_\pm,t+1}$ and $\gamma_{1_\pm,t+1}$, are approximately odd as functions of $\varrho_{t,t+1}$, and those associated with the even moments, viz. $\sigma^2_{1_\pm,t+1}$ and $\kappa_{1_\pm,t+1}$, are approximately even as functions of $\varrho_{t,t+1}$. This is because the return from a portfolio formed by taking a long position in the loser and a short position in the winner when $\varrho_{t,t+1} < 0$ would have the same distributional properties as the return from taking the opposite positions when $\varrho_{t,t+1} > 0$.
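The conversion from the first four non-central moments to the mean, variance, skewness, and excess kurtosis uses only the standard identities for central moments. A minimal sketch follows. The function name is hypothetical, and the inputs are assumed to be valid non-central moments of a non-degenerate distribution.

```python
def summary_from_raw_moments(m1, m2, m3, m4):
    """Mean, variance, skewness and excess kurtosis from non-central moments."""
    var = m2 - m1 ** 2
    if var <= 0:
        raise ValueError("raw moments imply a non-positive variance")
    # Third and fourth central moments expressed through non-central moments.
    central3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    central4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    skewness = central3 / var ** 1.5
    excess_kurtosis = central4 / var ** 2 - 3.0
    return m1, var, skewness, excess_kurtosis


# Sanity check with the standard normal, whose raw moments are 0, 1, 0, 3.
print(summary_from_raw_moments(0.0, 1.0, 0.0, 3.0))
```

Because the central moments are differences of terms of similar magnitude, some loss of precision can be expected when the mean is large relative to the standard deviation.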
In the analysis that follows, we assume that the asset returns are stationary in order to reduce the number of parameters. Moreover, we have set $\mu_{t,1} = 6\%$, $\mu_{t,2} = 4\%$, $\sigma^2_{t,1} = 0.18(\nu - 2)/\nu$ and $\sigma^2_{t,2} = 0.14(\nu - 2)/\nu$, where the factor $(\nu - 2)/\nu$ ensures that the asset variances are independent of the number of degrees of freedom. It should be noted that the cross-sectional correlation, $\rho_t$, then determines the variance, $\varsigma^2_t$, of the spread, $r_{t,2} - r_{t,1}$.

The mean of the CSM return is shown in Figure 1 as a function of the degrees of freedom $\nu$, and the spread autocorrelation, $\varrho_{t,t+1}$, for $\rho_t = -0.9$, $\rho_t = 0.0$, and $\rho_t = 0.9$. As expected, $\mu_{1_\pm,t+1}$ is an increasing function of $\varrho_{t,t+1}$. For $\varrho_{t,t+1} > 0$, the mean decreases slightly with $\nu$, while the opposite is the case when $\varrho_{t,t+1} < 0$. In the region $\varrho_{t,t+1} \approx 0$, the mean is slightly positive. Since this is the region corresponding to small autocorrelations in the underlying asset returns, and the situation most commonly observed in practice, it is reassuring that the small positive CSM returns implied by the model are consistent with the findings reported in the empirical literature. Moreover, we see that the degrees of freedom parameter, $\nu$, that controls the kurtosis of asset returns has very little impact on the mean of $r_{1_\pm,t+1}$. Interpreting small $\nu$ as representative of assets from emerging markets, this is consistent with findings from Rouwenhorst (1999) and Bekaert et al. (1997) that, although there is evidence of momentum in emerging markets, it is not significantly different from that observed in developed markets, despite the assets from the respective markets having different distributional properties. The surface flattens out as the cross-sectional correlation increases from −0.9 to 0.9. Since an increase in $\rho_t$ corresponds to a decrease in the variance, $\varsigma^2_t$, of the spread $r_{t,2} - r_{t,1}$, this behavior is consistent with the positive relationship usually associated with risk and return in finance.

The standard deviation of the CSM return as a function of $\nu$ and $\varrho_{t,t+1}$ is shown in Figure 2. Although not clearly evident from the figure, $\sigma_{1_\pm,t+1}$ is a decreasing function of $\nu$, and for a fixed value of $\nu$ the standard deviation is concave in $\varrho_{t,t+1}$ and takes the maximum value at $\varrho_{t,t+1} = 0$. Finally, since the variance of the spread decreases as $\rho_t$ increases, $\sigma_{1_\pm,t+1}$ likewise decreases with increasing $\rho_t$.
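The role of the scaling factor $(\nu - 2)/\nu$ and of the cross-sectional correlation in this parametrisation can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes that the quoted $\sigma^2$ values are squared scale parameters, so that the corresponding variances are obtained by multiplying by $\nu/(\nu - 2)$. The function name and the returned dictionary keys are hypothetical.

```python
import math


def spread_parameters(nu: float, rho: float, var1: float = 0.18, var2: float = 0.14):
    """Squared scales and variances of two t-distributed assets and of r_2 - r_1."""
    if nu <= 2:
        raise ValueError("variances are finite only for nu > 2")
    shrink = (nu - 2.0) / nu
    s1_sq, s2_sq = var1 * shrink, var2 * shrink       # squared scale parameters
    cov = rho * math.sqrt(s1_sq * s2_sq)              # off-diagonal shape entry
    spread_scale_sq = s1_sq + s2_sq - 2.0 * cov       # squared scale of r_2 - r_1
    to_variance = nu / (nu - 2.0)                     # squared scale -> variance
    return {
        "asset_variances": (s1_sq * to_variance, s2_sq * to_variance),
        "spread_scale_sq": spread_scale_sq,
        "spread_variance": spread_scale_sq * to_variance,
    }


for rho in (-0.9, 0.0, 0.9):
    print(rho, spread_parameters(5.0, rho)["spread_variance"])
```

Under these assumptions the asset variances stay at 0.18 and 0.14 for every admissible $\nu$, and the spread variance falls as the cross-sectional correlation rises.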
The skewness of the CSM return in Figure 3 shows that $\gamma_{1_\pm,t+1}$ is negative when $\varrho_{t,t+1} < 0$ and positive when $\varrho_{t,t+1}$ is sufficiently positive. In fact, although it is not clearly evident from the figure, $\gamma_{1_\pm,t+1}$ is negative even for small positive values of $\varrho_{t,t+1}$. As discussed above, the autocorrelations in the asset returns tend to be small in practice, and hence $\varrho_{t,t+1}$ will also be small. The corresponding model implied skewness in the CSM return will then be slightly negative, which is consistent with the observations in the empirical literature.

In contrast to the surfaces for other quantities that flatten to a large extent as $\rho_t$ increases from −0.9 to 0.9, the surface for $\kappa_{1_\pm,t+1}$ in Figure 4 remains relatively unchanged. The excess kurtosis of CSM returns is generally positive, in line with the findings reported in the literature, and increases significantly when $\nu$ is small and $|\varrho_{t,t+1}|$ is high. It should be noted that $\kappa_{1_\pm,t+1}$ is largest when $\nu$ is small. Since the deviation of the Student's t distribution from the normal is greatest when $\nu$ is small, it follows that the extension considered in this paper will be useful in situations where the observed kurtosis in the CSM returns is higher than the value implied under the assumption of normal asset returns. This would be the case, for example, when considering emerging markets.
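The qualitative behavior of the mean and the standard deviation in the two asset case can be cross-checked by simulation. The sketch below is not the method used in the paper, which relies on closed-form moments. It assumes that the pair of spreads is bivariate Student's t with a single common mixing variable, common location $\eta = \mu_2 - \mu_1$ and common scale, and it uses the fact that the two asset CSM return equals the holding-period spread when asset 2 wins the ranking period and its negative otherwise. All function and variable names are hypothetical.

```python
import math
import random


def simulate_two_asset_csm(eta, scale, autocorr, nu, n_draws, seed=0):
    """Simulate the two asset CSM return from bivariate-t spreads (s_t, s_next)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_draws):
        z1 = rng.gauss(0.0, 1.0)
        z2 = autocorr * z1 + math.sqrt(1.0 - autocorr ** 2) * rng.gauss(0.0, 1.0)
        # One common mixing variable for both periods.
        w = math.sqrt(rng.gammavariate(nu / 2.0, 2.0) / nu)
        s_t = eta + scale * z1 / w
        s_next = eta + scale * z2 / w
        # s_t > 0 means asset 2 won the ranking period, so the portfolio is long
        # asset 2 and short asset 1 and earns s_next; otherwise it earns -s_next.
        out.append(s_next if s_t > 0 else -s_next)
    return out


def mean_and_std(xs):
    m = sum(xs) / len(xs)
    return m, math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


nu = 5.0
scale = math.sqrt(0.32 * (nu - 2.0) / nu)  # spread scale for zero cross-sectional correlation
results = {a: mean_and_std(simulate_two_asset_csm(-0.02, scale, a, nu, 50_000, seed=1))
           for a in (-0.8, 0.0, 0.8)}
for a, (m, s) in results.items():
    print(a, m, s)
```

With these illustrative settings, the simulated mean should increase with the spread autocorrelation, and the simulated standard deviation should be largest near zero autocorrelation. Sample skewness and kurtosis are not reported here because their estimates are unreliable for small $\nu$.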

Conclusions
In this paper, the theoretical framework introduced in Kwon and Satchell (2018) was extended to investigate the distributional properties of cross-sectional momentum (CSM) returns under the assumption that the vector of asset returns is multivariate Student's t. The probability density function and the moments of the CSM returns were derived, and investigated in detail for the special case of two assets.
It was found that, in situations where the autocorrelations of the asset returns, and hence of the return spread, are small, the CSM return has a small positive mean, negative skewness, and positive excess kurtosis. These are all consistent with the findings reported in the empirical literature. Moreover, the skewness and the kurtosis both become more pronounced as the number of degrees of freedom in the Student's t distribution decreases and the corresponding asset returns become less normal.
In modeling asset returns that deviate significantly from being normal, such as those for emerging markets, the extension to the Student's t considered in this paper would address some of the limitations of assuming normality. Since the Student's t distribution approaches the normal in the limit as the number of degrees of freedom approaches infinity, the extension also provides a framework under which to analyze the implications and the limitations of the assumption of normality in asset returns for CSM returns.

we have that $(r_{\tau,m_\pm,t+1}, x_{\tau,t}^\top)^\top$ is a linear transformation of $(r_{t+1}^\top, r_t^\top)^\top$, and so it follows from Roth (2013) Equation (4.1) that
$$\begin{pmatrix} r_{\tau,m_\pm,t+1} \\ x_{\tau,t} \end{pmatrix} \sim St_n\left( \begin{pmatrix} \iota^\top P_\tau \mu_{t+1} \\ D_n P_\tau \mu_t \end{pmatrix}, \begin{pmatrix} \iota^\top P_\tau \Sigma_{t+1,t+1} P_\tau^\top \iota & \iota^\top P_\tau \Sigma_{t+1,t} P_\tau^\top D_n^\top \\ D_n P_\tau \Sigma_{t,t+1} P_\tau^\top \iota & D_n P_\tau \Sigma_{t,t} P_\tau^\top D_n^\top \end{pmatrix}, \nu \right).$$
Now, the joint pdf, $f_{r_{\tau,m_\pm,t+1},x_{\tau,t}}(r, x)$, is given by
$$f_{r_{\tau,m_\pm,t+1},x_{\tau,t}}(r, x) = f_{x_{\tau,t}|r_{\tau,m_\pm,t+1}}(x \mid r) \, f_{r_{\tau,m_\pm,t+1}}(r).$$