Abstract
In this paper, we study the hedging effectiveness of crude oil futures on the basis of lower partial moments (LPMs). An improved kernel density estimation method is proposed to estimate the optimal hedge ratio. We contribute to the literature on crude oil price hedging in two ways. First, unlike existing studies, which focus on the univariate kernel density method, we use a bivariate kernel density to calculate the estimated LPMs, and the two bandwidths of the bivariate kernel density are not constrained to be equal, which is our main innovation. Using the criterion of minimizing the mean integrated square error, we derive the conditions that the optimal bandwidths satisfy. In the derivation, we make a local distribution assumption in order to simplify the calculation; this type of local assumption is, theoretically and empirically, far better than the global distribution assumption used in the parametric method. Second, in order to meet the requirement of the bivariate kernel density for independent random variables, we adopt ARCH models to obtain the independent noises related to the returns of crude oil spot and futures. A genetic algorithm is used to tune the parameters that maximize the quasi-likelihood. Empirical results reveal, first, that the hedging strategy based on the improved kernel density estimation method is highly efficient and, second, that it achieves better performance than the hedging strategy based on the traditional parametric method. We also compare the risk control effectiveness of static and time-varying hedge ratios and find that static hedging performs better than time-varying hedging.
Introduction
With expanding economic and business ties between countries and an increasingly tense international situation, the prices of some important energy commodities fluctuate widely and face considerable uncertainty, especially crude oil. In recent days, the international oil price fell sharply as a result of shocks. On the one hand, OPEC, led by Saudi Arabia, and Russia failed to reach an agreement on cutting output, after which Saudi Arabia launched a price war; on the other hand, the global spread of the coronavirus pandemic created panic in the market. On March 9, 2020, for example, the crude oil price fell 24%, the biggest one-day drop since the 1991 Gulf War. In fact, as a global commodity, crude oil affects economic activities and financial markets; examples include gold, oil and equities (Maghyereh et al. 2017), WTI crude oil futures returns and hedge funds (Zhang and Wu 2019), and the global crude oil market and China’s commodity sectors (Meng et al. 2020). Therefore, given the highly volatile crude oil price and its complex risk transmission mechanism, those who need to hedge oil price risk include not only oil producers and refiners but also financial market participants and policy makers.
Hedging is one of the most important functions of futures markets. To hedge the risk of the crude oil price, we have to establish a hedged portfolio, and computational problems arise when spot and futures are combined in a portfolio. The traditional parametric and semiparametric methods usually assume that the joint distribution is known, which is likely to cause misspecification if there is no economic reason to prefer one functional form over another (Backus et al. 1998). For example, Feng et al. (2012) argue that assuming a certain type of distribution can bias the results when studying carbon returns. By contrast, nonparametric kernel density estimation does not require any prior information about the distribution, and its estimators are driven by real data (Li and Racine 2007), so the misspecification problem can be relieved to a large extent. For this reason, kernel density estimation is adopted in this paper to fit the joint distribution of the hedged portfolio. A number of studies address financial problems by means of kernel density estimation. Bouezmarni and Rombouts (2010) adopted the gamma kernel density for positive time series data in order to handle boundary problems and demonstrated its superiority. Harvey and Oryshchenko (2012) used kernel density estimation to describe the probability density functions of stock market indexes. Shi et al. (2017) combined the Bayes discriminant approach based on the multivariate kernel density with the extension discriminant approach to make the discrimination more concrete. Yan and Han (2019) compared the performance of several normal mixture models and kernel density estimators in fitting the behavior of different stock returns.
Since hedging research involves both spot and futures returns, we adopt bivariate kernel density estimation. Unlike the existing literature, which sets the same bandwidth for different variables (Hazelton and Marshall 2009; Gramacki and Gramacki 2017), we allow two different bandwidths for spot and futures and find the optimal values by minimizing the mean integrated square error. In this process, a normal distribution is assumed in order to simplify the calculation, but this assumption is used solely to obtain the optimal bandwidths and is local to some extent; it differs from the global distribution assumption in the traditional parametric method and performs better empirically.
Kernel density estimation requires that the variables be independent of each other, which conflicts with the fact that spot returns and futures returns are highly correlated. We therefore adopt the autoregressive conditional heteroskedasticity (ARCH) model to separate two independent series, called noise terms in the model, from the spot and futures returns, and the density function of the independent noises is estimated by kernel density. The ARCH model was introduced by Engle (1982) to investigate the time-varying volatility of economic data and is widely used in financial markets, especially for pricing financial derivatives and measuring investment risk. Giot and Laurent (2004) compared the performance of a model based on daily realized volatility with a daily ARCH-type model in order to study the volatility of stock and exchange rate returns. Catani and Ahlgren (2017) proposed a bootstrap combined equation-by-equation Lagrange multiplier test for ARCH errors in VAR models in order to overcome the high dimensionality facing multivariate tests. The ARCH model also plays an important role in crude oil market volatility analysis. Cheong (2009) used an ARCH model, which accounts for many crucial volatility facts such as volatility clustering, to discuss the time-varying volatility in some important crude oil markets. Nademi and Nademi (2018) forecast crude oil prices, including OPEC, WTI and Brent, by means of a semiparametric Markov switching AR-ARCH model. We stress that, although the ARCH model is adopted, our purpose is not to study volatility but only to obtain two independent series.
For risk management, an appropriate risk measure is essential, and the one adopted in this paper is the lower partial moment (LPM). The characteristics of the LPM as a risk measure include the following: (1) It measures one-sided risk, focusing on the negative deviation from the target rate of return, that is, downside risk; in addition, by measuring the return characteristics of loss (Brogan and Stidham 2008), the lower partial moment can reflect the difference in investors’ attitudes towards profit and loss. (2) By setting different target rates of return and risk parameters, the LPM can accommodate the heterogeneity of investors. (3) The LPM satisfies subadditivity, monotonicity and transformation invariance as a coherent measure of risk. (4) A decision criterion based on the LPM conforms to the expected utility maximization criterion and the stochastic dominance criterion, and it is not necessary to make special assumptions about the utility function. Owing to these features, the LPM has been the subject of a large number of studies. Demirer and Lien (2003) calculated the optimal hedge ratios and the corresponding hedging performance and compared the results between short and long hedgers. Baghdadabad (2014) extended the n-degree ADRM risk measures within the framework of the n-degree LPM and then put forward a new MV model to evaluate US investors’ indications with respect to portfolio performance. Dai et al. (2017) calculated the optimal hedge ratios by minimizing the LPM. Jasemi et al. (2019) put forward a practical methodology to approximate the first-order LPM in order to deal with computational difficulties. In this paper, we derive the hedging strategy for crude oil futures based on the lower partial moments (LPMs).
The rest of the paper is structured as follows: Section 2 introduces kernel density estimation and derives the equations that the optimal bandwidths satisfy. Section 3 introduces the ARCH model and solves the parameter estimation problem with a genetic algorithm. In Section 4, we incorporate the kernel density into the LPMs and calculate the optimal hedging position. Section 5 presents the empirical analysis, including the comparison between kernel density estimation and the parametric method as well as between static and dynamic hedging. Based on the research results, conclusions and suggestions for investors are provided in Section 6.
An improved kernel density estimation method
There are parametric, semiparametric and nonparametric methods to determine the probability density function of the sample data, and the common nonparametric methods include histogram and kernel density estimation. The concept of histogram estimation is simple, but the result is discontinuous, that is, the density value will suddenly drop to zero at the regional boundary, while the kernel density has the advantage of continuous estimation, and it is an efficient nonparametric density estimation method. The expression of kernel density is as follows:
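A product-form estimator consistent with the symbols defined next can be written as below. This is stated as an assumed standard form; because the kernel is symmetric, the sign of each kernel argument is immaterial.

$$\begin{aligned} {\hat{f}}(x_{1},x_{2})=\frac{1}{nh_{1}h_{2}}\sum _{i=1}^{n}K\left( \frac{x_{1}-X_{1i}}{h_{1}},\frac{x_{2}-X_{2i}}{h_{2}}\right) \end{aligned}$$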
where n is the sample size, and \(h_{1}\) and \(h_{2}\) are the bandwidths or smoothing parameters. In the existing research, \(h_{1}\) and \(h_{2}\) are generally taken to be equal, i.e., \(h_{1}=h_{2}=h\), whereas in this paper we do not assume that they are equal. \(X_{1i}\) and \(X_{2i}\) are the two given sample series, and \(K(\cdot ,\cdot )\) is the kernel function. Many studies have pointed out that the choice of kernel function has little effect on the accuracy of kernel density estimation, and the kernel estimator is asymptotically normal for most samples, so the Gaussian kernel is selected in this paper.
The kernel density estimate superimposes kernel functions centered at the observation points, and its performance depends on the bandwidth selection. If the bandwidth is too small, the estimate, especially in the tails, becomes noisy and its variance tends to increase; if the bandwidth is too large, the distribution characteristics are masked, and over-averaging gives the estimator a large bias. When considering estimation at a single point, a natural measure is the mean square error (MSE), defined as
By standard elementary properties of mean and variance,
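In the notation of Silverman (1986), and assuming the displays take their textbook form with \(x=(x_{1},x_{2})\), the pointwise definition and its decomposition into squared bias and variance can be written as:

$$\begin{aligned} \mathrm {MSE}_{x}({\hat{f}})=E\{{\hat{f}}(x)-f(x)\}^{2}=\{E{\hat{f}}(x)-f(x)\}^{2}+\mathrm {var}\,{\hat{f}}(x) \end{aligned}$$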
The first and most widely used way of placing a measure on the global accuracy of \({\hat{f}}\) is the mean integrated square error (MISE) (Silverman 1986), defined as
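Assuming the textbook bivariate form (an assumption about the display, written with the symbols already introduced), this reads:

$$\begin{aligned} \mathrm {MISE}({\hat{f}})&=E\iint \{{\hat{f}}(x_{1},x_{2})-f(x_{1},x_{2})\}^{2}\,dx_{1}\,dx_{2}\\&=\iint \{E{\hat{f}}(x_{1},x_{2})-f(x_{1},x_{2})\}^{2}\,dx_{1}\,dx_{2}+\iint \mathrm {var}\,{\hat{f}}(x_{1},x_{2})\,dx_{1}\,dx_{2} \end{aligned}$$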
which gives the MISE as the sum of the integrated square bias and the integrated variance.
Let \(y_{1}=X_{1i}\), \(y_{2}=X_{2i}\), \(t_{1}=\frac{y_{1}-x_{1}}{h_{1}}\), \(t_{2}=\frac{y_{2}-x_{2}}{h_{2}}\), and let the kernel function \(K(\cdot ,\cdot )\) be a symmetric function satisfying:
The bias does not depend directly on the sample size n but on the bandwidths (\(h_{1}, h_{2}\)); of course, if the bandwidths are chosen as functions of n, then the bias depends on n through them. The approximate expression of the bias is obtained as follows:
where,
By integrating the result above, we can get the following one:
We now turn to the variance,
The result is obtained by using the approximation for the bias and assuming that \(h_{1}\) and \(h_{2}\) are small and n is large. Further, we have
The expressions of MISE and AMISE can be obtained according to the analysis mentioned above:
Then we can obtain the optimal window widths \(h_{1}^{*}\) and \(h_{2}^{*}\) by solving the following equations:
That is, the optimal window widths satisfy:
The solutions of Eqs. (13) depend on the real density function. Assume that \(\eta _{1}\sim N(0,\sigma _{1}^{2}),\eta _{2}\sim N(0,\sigma _{2}^{2})\), and they are independent of each other. It should be emphasized that the normal assumption here is only a local assumption made in the derivation of the optimal window width, which is substantially different from the global assumption made in the parametric method. The joint density of \(\eta _{1}\) and \(\eta _{2}\) is
We treat this as the true density of the population, and the derivative terms contained in the above two equations can be expressed as follows:
At the same time, for the \({\hat{f}}(x_{1},x_{2})\) in Eq. (1), we adopt Gaussian kernel, and \(k_{1}, k_{2}\) and \(k_{3}\) are calculated as follows:
Then, Eqs. (13) can be simplified as follows:
By solving the equations, we can obtain the new optimal window widths \((h_{1}^{*},h_{2}^{*})\), for which we can estimate the kernel density \({\hat{f}}(x_{1},x_{2})\):
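As a minimal sketch of how an estimate with two separate bandwidths could be evaluated, the function below implements a product Gaussian kernel with independent `h1` and `h2`. The function name and argument layout are illustrative assumptions; the bandwidths are taken as given inputs and are not computed from the optimality conditions of this section.

```python
import math


def bivariate_kde(x1, x2, sample1, sample2, h1, h2):
    """Product Gaussian kernel density estimate at the point (x1, x2).

    sample1 and sample2 are the two observed series (same length);
    h1 and h2 are the bandwidths, which are allowed to differ.
    """
    n = len(sample1)
    total = 0.0
    for obs1, obs2 in zip(sample1, sample2):
        t1 = (x1 - obs1) / h1
        t2 = (x2 - obs2) / h2
        # Bivariate Gaussian kernel with identity covariance, up to 1 / (2 * pi).
        total += math.exp(-0.5 * (t1 * t1 + t2 * t2))
    return total / (2.0 * math.pi * n * h1 * h2)
```

Setting `h1 == h2` recovers the common single-bandwidth estimator, so the two-bandwidth form strictly generalizes it.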
Independent sequences from ARCH Model
Since sample data in finance, insurance and other fields are not independent of each other, it would be a mistake to estimate the kernel density directly from correlated data. Therefore, we use the ARCH model to fit the returns of spot and futures prices and then obtain the independent errors. Based on the independent errors, we estimate the optimal bandwidths for the bivariate kernel density.
The ARCH model is able to describe the time-varying volatility of economic data, and the generalized ARCH model can further depict the clustering of volatility, that is, volatility changes over time and stays relatively high or low during some periods. The ARCH model is used here only to separate independent series and not to study volatility. The basic form of the ARCH model is as follows:
where \({\mathbf {X}}_{t}=\left( \begin{matrix}X_{1t}\\ X_{2t}\end{matrix}\right) ,\varepsilon _{t}=\left( \begin{matrix}\varepsilon _{1t}\\ \varepsilon _{2t}\end{matrix}\right) ,\eta _{t}=\left( \begin{matrix}\eta _{1t}\\ \eta _{2t}\end{matrix}\right) ,\varphi =\left( \begin{matrix} \varphi _{1}\\ \varphi _{2}\end{matrix}\right), \) and \(w_{1},w_{2},A_{11},A_{12},A_{21},A_{22}\) are constant parameters that should be estimated.
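The filtering step that turns returns into noise series can be sketched as follows. The recursion used here is an assumption, since the model display is not reproduced: \(X_{jt}=\varphi _{j}+\varepsilon _{jt}\), \(\varepsilon _{jt}=\sqrt{h_{jt}}\,\eta _{jt}\), and \(h_{jt}=w_{j}+A_{j1}\varepsilon _{1,t-1}^{2}+A_{j2}\varepsilon _{2,t-1}^{2}\). The function name and the zero pre-sample shocks are likewise illustrative choices.

```python
import math


def arch_filter(x1, x2, phi, w, A):
    """Return conditional variances and standardized noises for two series.

    Assumed recursion (hypothetical, see the lead-in):
        h_jt = w[j] + A[j][0] * eps_1,t-1 ** 2 + A[j][1] * eps_2,t-1 ** 2
        eta_jt = (x_jt - phi[j]) / sqrt(h_jt)
    """
    h1, h2, eta1, eta2 = [], [], [], []
    e1_prev = e2_prev = 0.0  # assumption: pre-sample shocks are set to zero
    for obs1, obs2 in zip(x1, x2):
        v1 = w[0] + A[0][0] * e1_prev ** 2 + A[0][1] * e2_prev ** 2
        v2 = w[1] + A[1][0] * e1_prev ** 2 + A[1][1] * e2_prev ** 2
        e1 = obs1 - phi[0]
        e2 = obs2 - phi[1]
        h1.append(v1)
        h2.append(v2)
        eta1.append(e1 / math.sqrt(v1))
        eta2.append(e2 / math.sqrt(v2))
        e1_prev, e2_prev = e1, e2
    return h1, h2, eta1, eta2
```

The returned `eta1` and `eta2` play the role of the noise series whose joint density is then estimated by the kernel method.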
Since the distribution of \(\eta _{t}\) is unknown, the quasi-likelihood estimation method is adopted here. That is, we maximize the following criterion function to obtain the quasi-likelihood estimates of the parameters.
We now derive the concrete form of the criterion function. It is well known that
Let \(\Gamma =\left( \begin{matrix} 1&{}0\\ 0&{}1\end{matrix}\right) \). We have
In this way, the likelihood function can be expressed as:
and
So, it yields
Then, the likelihood function is shown as follows:
In parallel, we know that,
Finally, based on the given data, we can rewrite the likelihood function as follows
where,
To estimate the parameters of the ARCH model, Alzghool and AlZubi (2018) adopted semiparametric methods, including quasi-likelihood and asymptotic quasi-likelihood estimation. For the numerical implementation of model structure choice, an approach based on the genetic algorithm (GA) is proposed. It is a heuristic search algorithm that solves optimization and modeling tasks by random selection, combination and variation of the required parameters, using mechanisms that resemble biological evolution. A distinctive feature of the genetic algorithm is its emphasis on the “crossing” operator, which recombines candidate solutions and plays a role similar to that of crossing in living nature. In this paper, the GA is used to tune the parameters that maximize the quasi-likelihood.
Lower Partial Moments
LPM is associated with downside risk, according to Bawa and Linderberg (1997) and Lien and Tse (2001); its expression is shown as follows:
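A standard way to write the lower partial moment, given here as an assumed form with the order denoted by m (the symbol used in the propositions of this paper) and with F denoting the distribution function of the hedged portfolio return \(r_{p}\), is:

$$\begin{aligned} \mathrm {LPM}_{m}(c;r_{p})=\int _{-\infty }^{c}(c-r)^{m}\,dF(r) \end{aligned}$$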
where c is the target return and m is the power of the shortfall; the higher c is, the higher the return investors expect. The parameter m also represents the risk aversion coefficient: if \(m<1\), investors are risk-seeking, and if \(m>1\), investors are risk-averse. In particular, when \(m=0\), the LPM corresponds to value-at-risk (VaR); when \(m=1\), the LPM is equivalent to conditional value-at-risk (CVaR); when \(c=0\) and \(m=2\), the LPM is similar to the semivariance of Markowitz. In addition, \(r_{p}\) is the hedged portfolio return, \(r_{p}=r_{s}-Hr_{f}\), in which \(r_{s}\) is the spot return, \(r_{f}\) is the futures return and H is the hedged ratio.
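For intuition, the sample analogue of the LPM can be computed directly from a return series. The sketch below is a plain empirical estimator, not the kernel-based estimator developed in this paper; the function names are hypothetical, and the hedged return is formed as spot minus H times futures.

```python
def hedged_returns(r_s, r_f, H):
    """Hedged portfolio returns, spot minus H times futures."""
    return [s - H * f for s, f in zip(r_s, r_f)]


def sample_lpm(returns, c, m):
    """Empirical lower partial moment of order m with target return c.

    Only observations strictly below the target contribute; for m = 0 this
    is the empirical shortfall probability.
    """
    shortfall = sum((c - r) ** m for r in returns if r < c)
    return shortfall / len(returns)
```

With `m = 0` the function counts how often the return falls short of the target, with `m = 1` it averages the size of the shortfall, and with `m = 2` it penalizes large shortfalls quadratically.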
Based on ARCH model, we can express \(r_{s}\) and \(r_{f}\) as follows:
and
Then we incorporate the noise into LPM:
Here, \(D_{1}: c-r_{1}-\sqrt{h_{1}}x_{1}-H(r_{2}+\sqrt{h_{2}}x_{2})\ge 0\), and \(f(x_{1},x_{2})\) is the joint density of \(\eta _{1}\) and \(\eta _{2}\). Whenever the joint distribution of \(r_{s}\) and \(r_{f}\) is known, we can apply numerical methods to find the optimal hedge ratio. Because the true distribution of \(r_{s}\) and \(r_{f}\) is unknown, we adopt an indirect method to estimate the distribution of the hedged portfolio returns for any given c. Specifically, for a given c, we construct the data series for \(\eta _{1}\) and \(\eta _{2}\) from the data of \(r_{s}\) and \(r_{f}\), and then apply nonparametric methods to estimate the distribution of \(\eta _{1}\) and \(\eta _{2}\). The details are as follows.
Minimum LPM Hedged Ratios
Further, we incorporate the calculated kernel density into the LPM. For the calculation of optimal hedge ratios, the traditional approach, called static hedging, finds a constant value by minimizing the risk measure. It originated with Johnson (1960) and Stein (1961), who selected an optimal futures position to minimize the variance of the hedged portfolio. Ghosh (1993) then adopted the error correction model to calculate the constant hedge ratio based on cointegration theory. Although the static hedging strategy has been widely used in the existing literature, it ignores the time-varying characteristic of the (co)variance between the spot and futures returns. Qu et al. (2019) investigated the dynamic hedging performance of China’s CSI 300 index futures, utilizing high-frequency intraday information with RMVHR-based models. We therefore calculate the optimal hedge ratios under static and dynamic hedging, respectively.
Optimal hedged ratios based on the static Hedging
The optimal hedged ratios are calculated based on the whole sample data. Based on Eq. (30), the expression of LPMs is written as follows:
where \(D_{2}: c-r_{1i}-\sqrt{h_{1i}}x_{1}-H(r_{2i}+\sqrt{h_{2i}}x_{2})\ge 0\). Let
Here, \(D_{3}:\frac{c-r_{1i}-H(r_{2i}+\sqrt{h_{2i}}x_{2})}{\sqrt{h_{1i}}}\), then we have
Therefore, the LPMs are expressed by
We can obtain the optimal hedged ratio by calculating \(\frac{\partial L}{\partial H}=0\), that is, the optimal hedged ratio satisfies the following equation:
According to Eq. (31), we have
where \(A=\sqrt{h_{1i}}X_{1i}+u-c+r_{1i}+H(r_{2i}+\sqrt{h_{2i}}x_{2})\).
For the different values of m, we can deduce the condition that the optimal hedge ratio satisfies. The results are shown in the following proposition.
Proposition 1
Suppose a hedger wants to hedge the downside risk measured by LPMs with a static hedging strategy. The optimal hedge ratio \(H^{*}\) then satisfies the following conditions:

when \(m=0\), the optimal hedged ratio \(H^{*}\) is solved from the following equation
$$\begin{aligned}&\sum _{i=1}^n \exp \left\{ -\frac{1}{2} \frac{(aH^{*}+b)^{2}}{h_{1}^{*2}h_{1i}+h_{2}^{*2}H^{*2}h_{2i}}\right\} \nonumber \\&\quad \frac{ah_{1}^{*2}h_{1i}-bH^{*}h_{2}^{*2}h_{2i}}{ (h_{1}^{*2}h_{1i}+H^{*2}h_{2}^{*2}h_{2i})^{\frac{3}{2}}}=0 \end{aligned}$$(35)
where \(a=\sqrt{h_{2i}}X_{2i}+r_{2i}\) and \(b=\sqrt{h_{1i}}X_{1i}-c+r_{1i}\). \(X_{1i},X_{2i}\) are the return series of spot and futures for the given data. \(h_{1}^{*}, h_{2}^{*}\) are the best bandwidths estimated based on Eqs. (17), and \(h_{1i},h_{2i}\) are obtained from Eq. (27).

when \(m=1\), the optimal hedged ratio \(H^{*}\) is solved from the following equation
$$\begin{aligned}&\sum _{i=1}^n \int _{-\infty }^{+\infty } \frac{v}{\sqrt{h_{2i}}}\exp \left\{ -\frac{1}{2}\left( \frac{a-v}{\sqrt{h_{2i}} h_{2}^{*}}\right) ^{2}\right\} \nonumber \\&\quad \Phi \left( \frac{-b-H^{*}v}{\sqrt{h_{1i}}h_{1}^{*}}\right) \,dv=0 \end{aligned}$$(36)
when \(m=2\), the optimal hedged ratio \(H^{*}\) is solved from the following equation
$$\begin{aligned} \begin{aligned}&\sum _{i=1}^n\int _{-\infty }^{+\infty }\sqrt{\frac{2\pi }{h_{2i}}}(bv+H^{*}v^{2})\\&\quad \exp \left\{ -\frac{1}{2}\left( \frac{a-v}{\sqrt{h_{2i}}h_{2}^{*}}\right) ^{2}\right\} \Phi \left( \frac{-b-H^{*}v}{\sqrt{h_{1i}}h_{1}^{*}}\right) \,dv\\&\quad +\sum _{i=1}^n\frac{h_{1}^{*2}h_{2}^{*}h_{1i} \sqrt{2\pi h_{2i}}(ah_{1}^{*2}h_{1i}-bH^{*}h_{2}^{*2}h_{2i})}{(h_{1}^{*2}h_{1i}+H^{*2}h_{2}^{*2}h_{2i})^{\frac{3}{2}}}\\&\quad \exp \left\{ -\frac{1}{2}\frac{(aH^{*}+b)^{2}}{h_{1}^{*2}h_{1i}+h_{2}^{*2}H^{*2}h_{2i}}\right\} =0 \end{aligned} \end{aligned}$$(37)
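The \(m=0\) condition can be checked numerically. The sketch below assumes that the \(m=0\) objective is the average of \(\Phi (-(a_{i}H+b_{i})/s_{i})\) with \(s_{i}^{2}=h_{1}^{*2}h_{1i}+h_{2}^{*2}H^{2}h_{2i}\); under that assumption its derivative in H is proportional to the left-hand side of Eq. (35). The function names and argument order are hypothetical, and `h1s`, `h2s` stand for \(h_{1}^{*}\), \(h_{2}^{*}\).

```python
import math


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def lpm0_kernel(H, a, b, h1i, h2i, h1s, h2s):
    """Assumed kernel-smoothed shortfall probability as a function of H."""
    total = 0.0
    for ai, bi, v1, v2 in zip(a, b, h1i, h2i):
        s = math.sqrt(h1s ** 2 * v1 + h2s ** 2 * H ** 2 * v2)
        total += norm_cdf(-(ai * H + bi) / s)
    return total / len(a)


def lpm0_gradient(H, a, b, h1i, h2i, h1s, h2s):
    """Derivative of lpm0_kernel in H; zero exactly when Eq. (35) holds."""
    total = 0.0
    for ai, bi, v1, v2 in zip(a, b, h1i, h2i):
        s2 = h1s ** 2 * v1 + h2s ** 2 * H ** 2 * v2
        z = (ai * H + bi) / math.sqrt(s2)
        numerator = ai * h1s ** 2 * v1 - bi * H * h2s ** 2 * v2
        total += -norm_pdf(z) * numerator / s2 ** 1.5
    return total / len(a)
```

Any scalar root-finder applied to `lpm0_gradient` over a bracketing interval would then give a candidate for \(H^{*}\); whether the root is a minimum still has to be checked against `lpm0_kernel`.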
Optimal hedged ratios based on the dynamic Hedging
Unlike static hedging, the optimal hedged ratio changes every day according to the market state. The LPM on day k (\(k=1,2,3,\ldots ,n\)) is expressed as follows:
where
and \(D_{3}:\frac{c-r_{1k}-H(r_{2k}+\sqrt{h_{2k}}x_{2})}{\sqrt{h_{1k}}}\). Then we can obtain the optimal hedged ratio \(H_{k}\) from the first-order condition \(\frac{\partial L_{k}}{\partial H_{k}}=0\); that is, the optimal hedged ratio satisfies the following equation:
Here,
and
For the different values of m, we can deduce the condition that the optimal dynamic hedge ratio in day k satisfies. The results are shown in the following proposition.
Proposition 2
Suppose a hedger wants to hedge the downside risk measured by LPMs with a dynamic hedging strategy. The optimal hedge ratio \(H_{k}^{*}\) on day k then satisfies the following conditions:

when \(m=0\), the optimal dynamic hedged ratio \(H_{k}^{*}\) satisfies the following equation
$$\begin{aligned}&\sum _{i=1}^n \exp \left\{ -\frac{1}{2}\frac{(aH_{k}^{*}+b)^{2}}{h_{1}^{*2}h_{1k}+h_{2}^{*2}H_{k}^{*2}h_{2k}}\right\} \nonumber \\&\quad \frac{ah_{1}^{*2}h_{1k}-bH_{k}^{*}h_{2}^{*2}h_{2k}}{ (h_{1}^{*2}h_{1k}+H_{k}^{*2}h_{2}^{*2}h_{2k})^{\frac{3}{2}}}=0 \end{aligned}$$(40)
where \(a=\sqrt{h_{2k}}X_{2i}+r_{2k}\) and \(b=\sqrt{h_{1k}}X_{1i}-c+r_{1k}\).

when \(m=1\), the optimal dynamic hedged ratios satisfy the following equation
$$\begin{aligned}&\sum _{i=1}^n \int _{-\infty }^{+\infty } \frac{v}{\sqrt{h_{2k}}}\exp \left\{ -\frac{1}{2}\left( \frac{a-v}{\sqrt{h_{2k}}h_{2}^{*}} \right) ^{2}\right\} \nonumber \\&\quad \Phi \left( \frac{-b-H_{k}^{*}v}{\sqrt{h_{1k}}h_{1}^{*}}\right) \,dv=0 \end{aligned}$$(41)
when \(m=2\), the optimal dynamic hedged ratios satisfy the following equation
$$\begin{aligned} \begin{aligned}&\sum _{i=1}^n\int _{-\infty }^{+\infty }\sqrt{\frac{2\pi }{h_{2k}}} (bv+H_{k}^{*}v^{2})\exp \left\{ -\frac{1}{2}\left( \frac{a-v}{\sqrt{h_{2k}}h_{2}^{*}} \right) ^{2}\right\} \\&\quad \Phi \left( \frac{-b-H_{k}^{*}v}{\sqrt{h_{1k}}h_{1}^{*}}\right) \,dv\\&\quad +\sum _{i=1}^n\frac{h_{1}^{*2}h_{2}^{*}h_{1k} \sqrt{2\pi h_{2k}}(ah_{1}^{*2}h_{1k}-bH_{k}^{*}h_{2}^{*2}h_{2k})}{(h_{1}^{*2}h_{1k}+H_{k}^{*2}h_{2}^{*2}h_{2k})^{\frac{3}{2}}}\\&\quad \exp \left\{ -\frac{1}{2}\frac{(aH_{k}^{*}+b)^{2}}{h_{1}^{*2}h_{1k} +h_{2}^{*2}H_{k}^{*2}h_{2k}}\right\} =0 \end{aligned} \end{aligned}$$(42)
Empirical Study
In this section, we carry out the following tasks. First, we report descriptive statistics for spot and futures returns. Second, we estimate the relevant parameters of the ARCH model with the genetic algorithm. Third, we calculate the optimal hedged ratios and the corresponding effectiveness for different target returns (c) and risk aversion coefficients (m) of the LPMs, and we make three comparisons: kernel density versus the parametric method under static hedging, static versus dynamic hedging under kernel density, and kernel density versus the parametric method under dynamic hedging. The conclusions are given at the end.
Data
Following the ex ante versus ex post method (Alizadeh et al. 2015; Ghoddusi and Emamzadehfard 2017), we divide the historical daily data of WTI crude oil into two parts for the static hedging study. The first part, used for the in-sample analysis, covers the period between January 2, 2015, and April 7, 2018, while the second part, used for the out-of-sample analysis, covers April 8, 2018, to October 11, 2019. For dynamic hedging, in order to simplify the calculation, we select 100 observations from the sample data mentioned above to carry out the test. The in-sample analysis covers the period between January 2, 2015, and March 16, 2015, while the out-of-sample analysis covers April 8, 2018, to June 4, 2018. The optimal bandwidths calculated for the in-sample and out-of-sample data are \(h_{1}^{*}=0.2405,h_{2}^{*}=0.0881\) and \(h_{1}^{*}=0.1992,h_{2}^{*}=0.0701\), respectively. Figure 1 shows the descriptive statistics of the whole data:
From Fig. 1, we can clearly notice the volatility clustering among the estimators of noise. Further, we test the ARCH effects which are shown in Table 1.
In Table 1, the upper panel gives summary statistics on returns, while the lower panel presents the results of the ARCH effect test. The in-sample and out-of-sample data clearly exhibit nonzero skewness and kurtosis, especially the in-sample futures returns, which have the largest skewness and kurtosis; that is, it is more appropriate to estimate the distribution of returns with kernel density than to assume normality. In addition, the LM(K) statistic indicates the existence of an ARCH effect in spot and futures returns, which justifies our use of the ARCH model to fit the return data and obtain the independent noise series.
Parameter estimation of ARCH model
The genetic algorithm (GA), which has been widely used as a high-efficiency optimization tool, is adopted in this paper to solve the parameter estimation problem of the ARCH model. The GA was first proposed by Holland (1975); it operates directly on the structure object and does not require the objective function to be differentiable or continuous. According to Abdullah et al. (2018), the GA can conduct a multidirectional search within a population of candidate solutions, which allows the seeds of possible success to be spread uniformly over the whole solution space and gives it an advantage in optimization over single-search-point-based algorithms. The genetic algorithm is a stochastic algorithm that evolves randomly generated individuals toward better solutions through an iterative process. Survival of the fittest in this algorithm means searching for the optimal offspring, and the individual generated at the end is taken as the optimal solution of the optimization process. Each individual represents a solution of the optimization problem, and fitness is used as the evaluation index. Fitness represents the survival chance of an individual: the higher the fitness, the higher the probability that the individual enters the next iteration. In practical optimization problems, fitness is usually the value of the objective function. During iteration, new individuals are generated by crossover and mutation operators: the crossover operator generates two different offspring by randomly combining and exchanging elements of a pair of individuals, while the mutation operator adds small random changes to the offspring. The genetic algorithm can be reinitialized after each convergence to ensure that the most suitable individuals are retained in the iteration process while new random individuals are created, so as to reduce the risk of premature convergence.
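The loop described above can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not the configuration used in this study: it fits a univariate ARCH(1) variance \(h_{t}=w+\alpha \varepsilon _{t-1}^{2}\) to simulated data by maximizing a Gaussian quasi-log-likelihood, and the population size, tournament selection, uniform crossover, mutation rate and elitism are all hypothetical choices.

```python
import math
import random


def simulate_arch1(n, w, alpha, seed=1):
    """Simulate n shocks from a univariate ARCH(1) with Gaussian noise."""
    rng = random.Random(seed)
    eps, prev = [], 0.0
    for _ in range(n):
        prev = math.sqrt(w + alpha * prev * prev) * rng.gauss(0.0, 1.0)
        eps.append(prev)
    return eps


def quasi_loglik(params, eps):
    """Gaussian quasi-log-likelihood (constants dropped) of ARCH(1)."""
    w, alpha = params
    total, prev = 0.0, 0.0
    for e in eps:
        h = w + alpha * prev * prev
        total += -0.5 * (math.log(h) + e * e / h)
        prev = e
    return total


def genetic_maximize(fitness, bounds, pop_size=40, generations=60,
                     mutation_rate=0.2, mutation_scale=0.1, seed=0):
    """Maximize fitness over a box; returns (best, best_fitness, history)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        children = [ranked[0][:]]  # elitism: the fittest individual survives
        while len(children) < pop_size:
            # Tournament selection: the fitter of two random individuals.
            p1 = max(rng.sample(ranked, 2), key=fitness)
            p2 = max(rng.sample(ranked, 2), key=fitness)
            # Uniform crossover: each gene comes from either parent.
            child = [g1 if rng.random() < 0.5 else g2 for g1, g2 in zip(p1, p2)]
            # Mutation: small Gaussian perturbation, clipped to the bounds.
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    step = rng.gauss(0.0, mutation_scale * (hi - lo))
                    child[j] = min(max(child[j] + step, lo), hi)
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    return best, fitness(best), history
```

A call such as `genetic_maximize(lambda p: quasi_loglik(p, eps), [(1e-4, 2.0), (0.0, 0.99)])` with `eps = simulate_arch1(200, 0.5, 0.3)` returns the best parameter pair found; the lower bound on `w` keeps the variance strictly positive, and elitism guarantees that the best fitness never decreases across generations.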
The parameters of the ARCH model estimated from the in-sample data are presented in Table 2.
Empirical results of static hedging
Static hedging means that the optimal hedged ratios and effectiveness are calculated from all the sample data taken as a whole. The results based on kernel density estimation are compared with those of the parametric method, which assumes a normal distribution after standardizing the sample according to the central limit theorem. The results of the in-sample test are shown in Tables 3 and 4, and Table 5 shows the results of the out-of-sample test.
From Tables 3 and 4, we first confirm that all the hedging effectiveness values are greater than zero, so the model we construct to solve the hedging problem is effective. The results then differ with the risk aversion coefficient. For the case \(m=0\), compared with kernel density estimation, the parametric method gives relatively smaller hedged ratios and higher effectiveness in most cases. For example, when \(c=0.01\), the position and effectiveness under kernel density estimation are 0.34 and 0.48, while those under the parametric method are 0.21 and 0.51. When \(m=1\), it is difficult to tell which one is better, because the two sets of results are similar. For the case \(c=0.002\), the parametric method gives a relatively smaller position and a relatively higher effectiveness, but the opposite is true for the case \(c=0.005\). In contrast to the previous results, when \(m=2\), kernel density estimation achieves a better performance. When \(c=0.01\), the position is larger while the effectiveness is lower under the parametric method; in addition, for the same efficiency, the positions calculated by the parametric method are generally larger. Next we turn to the out-of-sample results, which are shown in Table 5.
For kernel density estimation, whether \(m=0\), \(m=1\) or \(m=2\), the hedging effectiveness of the out-of-sample test is generally higher than that of the in-sample test; by contrast, for the parametric method, the out-of-sample effectiveness is smaller than the in-sample effectiveness. Combined with the in-sample analysis, this suggests that kernel density estimation represents the real distribution characteristics of financial market data better and achieves a better hedging performance.
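To make the notion of effectiveness concrete, the sketch below uses a common definition, the proportional reduction in LPM relative to the unhedged spot position. This definition is an assumption, since the formula is not restated in this section, and the grid search over a plain sample LPM is only an illustration, not the kernel-based first-order conditions of Proposition 1. All function names are hypothetical.

```python
def sample_lpm(returns, c, m):
    """Empirical lower partial moment of order m with target return c."""
    return sum((c - r) ** m for r in returns if r < c) / len(returns)


def hedging_effectiveness(r_s, r_f, H, c, m):
    """Assumed definition: 1 - LPM(hedged) / LPM(unhedged).

    Requires the unhedged LPM to be strictly positive.
    """
    hedged = [s - H * f for s, f in zip(r_s, r_f)]
    return 1.0 - sample_lpm(hedged, c, m) / sample_lpm(r_s, c, m)


def static_hedge_ratio(r_s, r_f, c, m, grid_max=2.0, steps=200):
    """Grid search for the constant H that minimizes the sample LPM."""
    best_H, best_value = 0.0, float("inf")
    for i in range(steps + 1):
        H = grid_max * i / steps
        value = sample_lpm([s - H * f for s, f in zip(r_s, r_f)], c, m)
        if value < best_value:
            best_H, best_value = H, value
    return best_H
```

Under this definition, a positive value means the hedge lowers downside risk relative to holding spot alone, and a value below zero corresponds to a hedge that makes downside risk worse.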
The empirical results of dynamic hedging
Dynamic hedging means that the calculation of the optimal hedged ratios and effectiveness is based on the single daily data; for the all the observations, we can get n results. Here \(c=0\). At first, we compare static hedging and dynamic hedging under the framework of kernel density. The results of insample and outofsample test as well as the comparison with static hedging are shown in Figures 2 and 3, in which the value represented by the straight line is the result of static hedging with same target return(\(c=0\)) and risk aversion coefficient(m).
From Figs. 2 and 3, we find that for the insample result, considering the optimal hedging ratios, there are half the points above and half the points below the straight line, so there is no particular benefit to using one approach over the other. Then we turn to the effectiveness; it is obvious that the effectiveness obtained by static hedging is higher than most of results of dynamic hedging. The similar conclusion can be acquired from the outofsample test, that is, static hedging strategy achieves better performance. In addition, it is of crucial importance that, whether for optimal hedging ratios or for effectiveness, the results of dynamic hedging are discrete and unstable, what’s more, there are many invalid points that the effectiveness is below the zero. Further, we incorporate the calculated optimal hedged ratios into the portfolios \(r_{p}=r_{s}Hr_{f}\), finding different wealth paths, and there is a descriptive statistic about returns shown in Table 6 and 7.
From Tables 6 and 7, we can see that all the means and most of the medians of the static hedging strategy are larger than those of the dynamic hedging strategy. As for variance, although the values for static hedging are slightly larger in the in-sample test, the opposite holds in the out-of-sample test, that is, the static hedging strategy performs better. In short, we consider static hedging based on the whole sample to be the more appropriate hedging strategy. The above compares static and dynamic hedging under the kernel density framework; more importantly, to demonstrate the superiority of our improved kernel density method, kernel density and the parametric method should also be compared under dynamic hedging. The optimal hedge ratios and effectiveness based on the in-sample data, as well as the out-of-sample results, are shown in Table 8. Because the results contain too many data points to display clearly in figures, we compare them statistically.
For the in-sample results, comparing the means and medians shows that the optimal hedge ratios calculated by kernel density are smaller than those obtained by the parametric method, which implies a lower hedging cost, while the effectiveness calculated by kernel density is higher. For the out-of-sample effectiveness, kernel density also performs better. We therefore conclude that the strategy based on kernel density achieves better hedging performance at lower cost than the parametric method, which again supports the superiority of our improved kernel density method.
Conclusion
The LPMs measure an individual hedger’s downside risk, as opposed to two-sided risk measures. This study proposes an improved kernel density estimation method to estimate the optimal hedge ratio of crude oil futures hedging based on LPMs. Our contribution is twofold: (a) Because spot and futures returns are correlated, we extend the kernel method to the bivariate case. Furthermore, unlike the existing literature, we allow different optimal bandwidths for the spot and futures returns, calculated by minimizing the mean integrated square error. (b) To obtain independent time series, we adopt an ARCH model whose parameters are estimated by a genetic algorithm. This treatment is intended to satisfy the independence requirement of bivariate kernel density estimation. In the empirical analysis, we make three comparisons: kernel density versus the parametric method under static hedging, static hedging versus dynamic hedging under kernel density, and kernel density versus the parametric method under dynamic hedging.
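To make point (a) concrete, the following Python sketch shows a bivariate kernel density estimate in which each coordinate has its own bandwidth. It is only an illustration under stated assumptions: a product Gaussian kernel is assumed, and the bandwidths come from the standard normal reference rule for two dimensions, \(h_{j}=\sigma_{j}n^{-1/6}\), which stands in for the MISE-based conditions derived in this paper. The function names `normal_reference_bandwidths` and `bivariate_kde` are hypothetical.

```python
import numpy as np


def normal_reference_bandwidths(x, y):
    """Normal reference rule for d = 2: h_j = sigma_j * n**(-1/6).

    A stand-in for the paper's MISE-based bandwidth conditions.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    factor = x.size ** (-1.0 / 6.0)
    return float(x.std(ddof=1) * factor), float(y.std(ddof=1) * factor)


def bivariate_kde(x, y, h_x, h_y):
    """Return a density estimate f(u, v) built from a product Gaussian kernel.

    x and y are paired samples (for example, spot and futures noise terms);
    h_x and h_y are separate bandwidths for the two coordinates.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same length")
    if h_x <= 0 or h_y <= 0:
        raise ValueError("bandwidths must be positive")

    def density(u, v):
        zx = (u - x) / h_x
        zy = (v - y) / h_y
        # Product of two standard normal kernels, one per coordinate.
        kernel = np.exp(-0.5 * (zx ** 2 + zy ** 2)) / (2.0 * np.pi)
        return float(kernel.sum() / (x.size * h_x * h_y))

    return density
```

A call such as `bivariate_kde(eps_s, eps_f, *normal_reference_bandwidths(eps_s, eps_f))` would then give a density for the paired noise series, where `eps_s` and `eps_f` are placeholder names for the ARCH-filtered spot and futures noises.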
Empirical results reveal, first, that the hedging strategy based on the kernel density estimation method is highly effective, and second, that it outperforms the hedging strategy based on the traditional parametric (normal) method under both static hedging and dynamic hedging, that is, it yields smaller hedge ratios and higher effectiveness, which supports the superiority and robustness of our improved kernel density method. Moreover, from the comparison of optimal positions, effectiveness and returns, we conclude that the static hedging results are better and more stable because more sample points are incorporated, whereas the dynamic hedging results are inefficient, scattered and unstable.
Finally, when calculating the optimal bandwidths, we assume a normal distribution to simplify the calculation; within kernel density estimation this assumption is local to some extent, unlike the global distribution assumption in the traditional parametric method. How to avoid dependence on distributional assumptions altogether and obtain the optimal bandwidths through simple calculation in higher dimensions remains a challenging and rewarding topic.
References
Abdullah Y, Birdal S, Ufuk Y (2018) Maximum likelihood estimation for the parameters of skew normal distribution using genetic algorithm. Swarm and Evolutionary Comput 38(2):127–138
Alizadeh AH, Huang CY, Dellen SV (2015) A regime switching approach for hedging tanker shipping freight rates. Energy Econom 49(3):44–59
Alzghool R, AlZubi LM (2018) Semiparametric estimation for ARCH models. Alexandria Eng J 57(1):367–373
Engle RF (1982) Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50:987–1007
Backus D, Foresi S, Zin S (1998) Arbitrage opportunities in arbitrage-free models of bond pricing. J Business and Econom Statistics 16:13–24
Baghdadabad MRT (2014) Average drawdown risk reduction and risk tolerances. Res Econom 68(3):264–276
Bawa VS, Lindenberg EB (1977) Capital market equilibrium in a mean-lower partial moment framework. J Financial Econom 5(2):189–200
Bouezmarni T, Rombouts JVK (2010) Nonparametric density estimation for positive time series. Comput Statistics Data Anal 54(2):245–261
Brogan AJ, Stidham S (2008) Non-separation in the mean-lower-partial-moment portfolio optimization problem. Eur J Operat Res 184:701–710
Catani PS, Ahlgren NJC (2017) Combined Lagrange multiplier test for ARCH in vector autoregressive models. Econom Statistics 1:62–84
Cheong CW (2009) Modeling and forecasting crude oil markets using ARCH-type models. Energy Policy 37(6):2346–2355
Dai J, Zhou HG, Zhao SQ (2017) Determining the multiscale hedge ratios of stock index futures using the lower partial moments method. Physica A: Statistical Mech Appl 466(15):502–510
Demirer R, Lien D (2003) Downside risk for short and long hedgers. Int Rev Econom Finance 12:25–44
Feng ZH, Wei YM, Wang K (2012) Estimating risk for the carbon market via extreme value theory: an empirical analysis of the EU ETS. Appl Energy 99:97–108
Ghoddusi H, Emamzadehfard S (2017) Optimal hedging in the US natural gas market: the effect of maturity and cointegration. Energy Econom 63(3):92–105
Ghosh A (1993) Hedging with stock index futures: estimation and forecasting with error correction model. J Futures Market 13(7):743–752
Giot P, Laurent S (2004) Modelling daily Value-at-Risk using realized volatility and ARCH type models. J Empirical Finance 11(3):379–398
Gramacki A, Gramacki J (2017) FFT-based fast bandwidth selector for multivariate kernel density estimation. Comput Statistics & Data Anal 106:27–45
Harvey A, Oryshchenko V (2012) Kernel density estimation for time series data. Int J Forecast 28(1):3–14
Hazelton ML, Marshall JC (2009) Linear boundary kernels for bivariate density estimation. Statistics & Probab Lett 79(8):999–1003
Holland J (1975) Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. University of Michigan Press, Ann Arbor
Jasemi M, Monplaisir L, Jam PA (2019) Development of an efficient method to approximate the risk measure of lower partial moment of the first order. Comput Ind Eng 135:326–332
Johnson LL (1960) The theory of hedging and speculation in commodity futures. Rev Econom Stud 27(3):139–151
Li Q, Racine JS (2007) Nonparametric Econometrics: Theory and Practice. Princeton University Press, Princeton, NJ
Lien D, Tse YK (2001) Hedging downside risk: futures vs. options. Int Rev Econom Finance 10(2):159–169
Maghyereh AI, Awartani B, Tziogkidis P (2017) Volatility spillovers and cross-hedging between gold, oil and equities: Evidence from the Gulf Cooperation Council countries. Energy Econom 68(10):440–453
Meng J, Nie H, Jiang YH (2020) Risk spillover effects from global crude oil market to China’s commodity sectors. Energy 117208. https://doi.org/10.1016/j.energy.2020.117208
Nademi A, Nademi Y (2018) Forecasting crude oil prices by a semiparametric Markov switching model: OPEC, WTI, and Brent cases. Energy Econom 74:757–766
Qu H, Wang TY, Zhang Y, Sun PF (2019) Dynamic hedging using the realized minimum-variance hedge ratio approach: examination of the CSI 300 index futures. Pacific-Basin Finance J 57:101048. https://doi.org/10.1016/j.pacfin.2018.08.002
Shi JL, Zhu SH, Zhou YY, Li RH (2017) Bayesextension discriminant method of two populations based on multivariate kernel density estimation. Procedia Comput Sci 122:780–787
Silverman BW (1986) Density estimation for statistics and data analysis. Chapman and Hall, London
Stein JL (1961) The simultaneous determination of spot and futures prices. Am Econom Rev 51:1012–1025
Yan HH, Han LY (2019) Empirical distributions of stock returns: mixed normal or kernel density? Physica A: Statistical Mech Appl 514:473–486
Zhang YJ, Wu YB (2019) The time-varying spillover effect between WTI crude oil futures returns and hedge funds. Int Rev Econom Finance 16(5):156–169
Acknowledgements
This paper is supported by Funds for International Cooperation and Exchange of the National Natural Science Foundation of China (71720107002); National Natural Science Foundation of China (No. 71501076, 71971086); Guangdong Basic and Applied Basic Research Foundation (No. 2019B151502037); Financial Service Innovation and Risk Management Research Base of Guangzhou; The Raising initial capital for High-level Talents of Central China Normal University (30101190001); Fundamental Research Funds for the Central Universities (CCNU19A06043, CCNU19TD006, CCNU19TS062, No. 2019ZD13); Fundamental Research Funds for the Central Universities (Innovation Funding Projects) (2020CXZZ047); Humanities and Social Science Planning Fund from Ministry of Education (Grant No. 21YJC790148).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Yu, X., Wang, X., Zhang, W. et al. Optimal futures hedging strategies based on an improved kernel density estimation method. Soft Comput 25, 14769–14783 (2021). https://doi.org/10.1007/s00500-021-06185-3
Keywords
 Futures hedging
 Improved kernel density estimation
 ARCH model
 Lower partial moment
 Genetic algorithm
 Crude oil price