How to adjust butterfly 2s5s10s swaps trade for directionality?

Question

I am looking into a 2s5s10s swaps idea using a 50-50 weighting scheme, i.e. 2 times the 5 year minus the 2 year and the 10 year. However, the butterfly spread is correlated with the slope of the curve (2s10s) and also with the 5 year. I want to adjust the weights of my butterfly to remove this directionality while keeping the body at a weight of 2.

Essentially, I think it should be X : 2 : Y but I am having a hard time working out X and Y.

Hi - just to understand, are you saying that you want your butterfly to be flat/hedged to 2s10s steepening/flattening given the 5y correl you observe to 2s10s? — Mehness, Jun 28 '18 at 11:35
basically would be good to know how you want to make money in the trade and what you want to hedge out. Eg are you hedging a 2s10s slope, or do you want to make money on 2s10s steepening/flattening while being overall dv01 neutral given what you expect to happen to 5y etc.. — Mehness, Jun 28 '18 at 11:39
I think it may be simpler than that. Spoke to a trader, they said if you regress the 2s5s10s with the 2s10s and 5y as the independent variables, you'll get a coefficient of b1 for the 2s10s slope and b2 for 5y. Then this is where I get lost: the weight would be 1 - b1 for the 2y, 2 for the 5y, etc... Why is the weight 1 - b1? — VanillaCall, Jun 28 '18 at 23:29
@Doika (for my answer) yes the elements of PC1 are the principal component factor loadings. The only other consideration in my answer is the amount of VaR one is willing to be exposed to via executing the trade. — Attack68, Nov 09 '19 at 22:25

Attack68 · Answer 1 · 2021-10-03T06:37:52.607

Method 1: PCA directionality hedged

Here is one way to do it using PCA and hedging the directionality implied by the first principal component.

You have quoted 3 effective instruments: 2s5s10s, 2s10s and 5Y. All of these can be derived from the underlying 2Y, 5Y, and 10Y. That is:

$$ \begin{bmatrix} 5Y \\ 2s10s \\ 2s5s10s \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 1 \\ -1 & 2 & -1 \end{bmatrix} \begin{bmatrix} 2Y \\ 5Y \\ 10Y \end{bmatrix} , \quad \text{or} \quad P_2 = A P_1$$ where $P_1$ and $P_2$ are your set of prices in the different basis systems.

You can also observe that if you have the covariance matrix of the instruments in $P_1$, say $Q(P_1)$, then the covariance of the instruments of basis $P_2$ can be obtained with: $$Q(P_2) = A Q(P_1) A^T \quad \implies \quad Q(P_1) = A^{-1}Q(P_2)A^{-T}$$ So you can work in both basis systems but I'm going to focus on the default $P_1$ system.
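A minimal sketch of this change of basis, assuming a made-up covariance matrix `Q1` for daily changes in 2Y, 5Y and 10Y (the numbers are illustrative only, not market data):

```python
import numpy as np

# Rows map (2Y, 5Y, 10Y) to (5Y, 2s10s, 2s5s10s), as in the matrix A above.
A = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 1.0],
              [-1.0, 2.0, -1.0]])

# Hypothetical covariance of daily changes in 2Y, 5Y, 10Y (illustrative values).
Q1 = np.array([[4.0, 3.5, 2.5],
               [3.5, 4.0, 3.2],
               [2.5, 3.2, 3.6]])

Q2 = A @ Q1 @ A.T                # covariance in the (5Y, 2s10s, 2s5s10s) basis
A_inv = np.linalg.inv(A)
Q1_back = A_inv @ Q2 @ A_inv.T   # map back to the (2Y, 5Y, 10Y) basis

print(np.round(Q2, 3))
print(np.allclose(Q1_back, Q1))
```

The last diagonal element of `Q2` is the variance of the unadjusted 2s5s10s, which is a quick sanity check on the transformation.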

If you now derive the eigenvalues and eigenvectors of $Q(P_1)$, take the eigenvector corresponding to the highest eigenvalue: this is the first principal component (PC1). To hedge this component so that you have no risk exposure to it, take your underlying trade proposition and divide it element-wise by the elements of PC1:

$$ \begin{bmatrix} 2Y: -1 \\ 5Y: +2 \\ 10Y: -1 \end{bmatrix} \div \begin{bmatrix} PC1_{2Y} : 0.660 \\ PC1_{5Y} : 0.604 \\ PC1_{10Y} : 0.447 \end{bmatrix} = \begin{bmatrix} -1.51 \\ 3.31 \\ -2.24 \end{bmatrix} \propto \begin{bmatrix} -0.91 \\ 2.00 \\ -1.35 \end{bmatrix} $$
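A short sketch of that element-wise division, using the PC1 loadings as quoted to three decimals (so the last digit may differ slightly from figures derived from unrounded loadings). The variable names are my own:

```python
import numpy as np

x = np.array([-1.0, 2.0, -1.0])         # original 2s5s10s risk (2Y, 5Y, 10Y)
pc1 = np.array([0.660, 0.604, 0.447])   # PC1 loadings as quoted above

w = x / pc1                  # element-wise division by the PC1 loadings
w_scaled = 2.0 * w / w[1]    # rescale so the 5Y weight is 2

print(np.round(w, 3))
print(np.round(w_scaled, 3))
print(w @ pc1)               # exposure of the adjusted trade to PC1
```

The dot product is zero because each term reduces to the original weight, and -1 + 2 - 1 = 0. Note this only works for a starting trade whose weights sum to zero.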

PCA Alternative Approach (edit 3-Oct-2021):

A second method for PCA is to consider a formulation that begins with the original trade strategy and modifies it by the minimal risk change needed to satisfy the condition of zero risk to the principal component.

Suppose $\mathbf{p}$ is the vector of principal component values above and $\mathbf{x}$ is the vector of original trade risks, i.e. -1, 2, -1 above. Then we have a minimisation problem that seeks the minimal risk changes, $\mathbf{\delta}$, to $\mathbf{x}$:

$$ \min_{\mathbf{\delta}} \mathbf{\delta^T I \delta} \quad \text{subject to} \quad (\mathbf{x + \delta})^T \mathbf{I p} = 0$$

This quadratic function has an analytic solution (see Karush-Kuhn-Tucker Conditions on Wikipedia):

$$ \begin{bmatrix} 2 & 0 & 0 & p_1 \\ 0 & 2 & 0 & p_2 \\ 0 & 0 & 2 & p_3 \\ p_1 & p_2 & p_3 & 0 \end{bmatrix} \begin{bmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \\ \lambda \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \mathbf{-x^Tp} \end{bmatrix} $$

Solving this linear system, and rescaling the result so that the 5Y weight is 2, we obtain $\mathbf{x+\delta} \propto \begin{bmatrix} -1.1002 \\ 2 \\ -1.0780 \end{bmatrix}$
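The KKT system above can be solved directly. This is a sketch using the PC1 loadings as quoted to three decimals. The final rescaling to a 5Y weight of 2 is my reading of how the quoted weights were normalised, not something stated explicitly:

```python
import numpy as np

x = np.array([-1.0, 2.0, -1.0])       # original trade risks (2Y, 5Y, 10Y)
p = np.array([0.660, 0.604, 0.447])   # PC1 loadings as quoted above

# KKT system: [[2I, p], [p^T, 0]] [delta, lambda]^T = [0, -x^T p]^T
K = np.zeros((4, 4))
K[:3, :3] = 2.0 * np.eye(3)
K[:3, 3] = p
K[3, :3] = p
rhs = np.array([0.0, 0.0, 0.0, -(x @ p)])

sol = np.linalg.solve(K, rhs)
delta = sol[:3]                            # minimal risk changes
hedged = x + delta                         # zero exposure to PC1
hedged_scaled = 2.0 * hedged / hedged[1]   # assumed normalisation: 5Y weight = 2

print(np.round(hedged, 4))
print(np.round(hedged_scaled, 4))
print(hedged @ p)
```

With an identity weighting matrix the solution is just the projection of $\mathbf{x}$ onto the plane orthogonal to $\mathbf{p}$, so the same numbers follow from `x - (x @ p) / (p @ p) * p`.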

As you can see, there are a variety of modifications to the original trade that result in a PC1 risk of zero; which to use is a question of formulation. I have started to prefer this modification since it is more transparent, if more difficult to derive and calculate.

Method 2: Minimising VaR approach

A second way would be to suppose you trade 5Y and seek the combination of 2Y and 10Y positions that minimises your VaR. This then allows you to maximise the absolute 5Y size relative to the target VaR of the trade.

Suppose you have the following risk:

$$S = \begin{bmatrix} 2Y: 0 \\ 5Y: 2 \\ 10Y: 0 \end{bmatrix} $$ and now you evaluate what positions in 2Y and 10Y give the smallest VaR. For the same covariance matrix as I used above to derive the PCA the answer is:

$$ S^* = \begin{bmatrix} 2Y: -1.48 \\ 5Y: 2.00 \\ 10Y: -0.38 \end{bmatrix} $$

This is an optimisation problem solvable with a numeric solver or, more simply, with analytic calculus. I'm not going to cover that here; the link below has it.
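For reference, a sketch of the analytic route, assuming VaR is proportional to the standard deviation $\sqrt{w^T Q w}$ so that minimising variance is equivalent. The covariance matrix `Q` is hypothetical (illustrative values only), so the output will not reproduce the weights quoted above:

```python
import numpy as np

# Hypothetical covariance of daily changes in 2Y, 5Y, 10Y (illustrative values).
Q = np.array([[4.0, 3.5, 2.5],
              [3.5, 4.0, 3.2],
              [2.5, 3.2, 3.6]])

fixed, free = [1], [0, 2]   # 5Y is held fixed; 2Y and 10Y are the hedges
w = np.zeros(3)
w[fixed] = 2.0

# Setting d(w^T Q w)/d(w_free) = 0 gives Q_ff w_free = -Q_fx w_fixed
Q_ff = Q[np.ix_(free, free)]
Q_fx = Q[np.ix_(free, fixed)]
w[free] = -np.linalg.solve(Q_ff, Q_fx @ w[fixed])

variance = w @ Q @ w
print(np.round(w, 3), round(float(variance), 4))
```

At the optimum the hedged portfolio has zero covariance with both hedge instruments, which is why the problem looks very similar to a regression of 5Y on 2Y and 10Y.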

These methods are obviously fundamentally different but each has merit against a specific view; in your position you are more likely to favour the first. The difference arises because the 2Y has a much higher correlation with the 5Y directly, so overweighting it is the better hedge for reducing VaR, whereas under PCA the 10Y moves less, so you need more risk in it to achieve a directionality hedge.

Note: if you want to try this yourself, you can use the $Q(C)$ covariance matrix values for the 2Y, 5Y, and 10Y trades in this link: http://www.tradinginterestrates.com/revised/PCA.xlsb All of this material comes from Darbyshire, Pricing and Trading Interest Rate Derivatives.

Edit

Method 3: Multivariable Least Squares Regression

If we include the third method from @dm63 of multivariable regression of the form:

$$ \mathbf{y} - \mathbf{X \beta} = \mathbf{\epsilon} $$

where $\mathbf{y}$ is the 2s5s10s timeseries, and $\mathbf{X}$ is the 2s10s and 5y timeseries, then your optimal estimators for $\beta_1, \beta_2$ are given by,

$$ \mathbf{\hat{\beta}} = \mathbf{(X^TX)^{-1}X^T y} $$

and as he states the trade weights are given as $(-(1-\beta_1), 2-\beta_2, -(1+\beta_1))$
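A sketch of the regression weights on synthetic data, assuming the conventions 2s10s = 10Y - 2Y and 2s5s10s = 2·5Y - 2Y - 10Y and no intercept. The simulated rate changes and all variable names are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily changes in 2Y, 5Y, 10Y (purely illustrative, not market data).
level = rng.normal(size=250)
slope = rng.normal(size=250)
noise = rng.normal(scale=0.2, size=(250, 3))
rates = np.column_stack([1.0 * level - 0.5 * slope,
                         0.9 * level,
                         0.7 * level + 0.5 * slope]) + noise

r2, r5, r10 = rates.T
y = 2 * r5 - r2 - r10                  # 2s5s10s changes
X = np.column_stack([r10 - r2, r5])    # 2s10s and 5Y changes

beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares estimate of beta
w = np.array([-(1 - beta[0]), 2 - beta[1], -(1 + beta[0])])
w_scaled = 2 * w / w[1]                # rescale so the 5Y weight is 2

residual = rates @ w                   # P&L of the adjusted fly, equal to y - X @ beta
print(np.round(w_scaled, 3))
```

By construction the residual is orthogonal to both regressors in-sample; whether that holds out of sample is an empirical question.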

--------------

As an example I tried all three of these methods on some EUR swap sample data from 2016. Jan-1 to Jun-30 is my in-sample data and Jul-1 to Dec-22 is my out-of-sample back test. Below I have plotted the results. The interesting thing is that the multivariable regression actually has the smallest volatility in this out-of-sample data, but the minimum VaR approach has almost the same volatility. And of course min VaR will, by definition, have the lowest volatility over the sample data from which it was derived.

If you are interested in the code...

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# In-sample daily changes and out-of-sample absolute levels.
df_hist = pd.read_csv('historical_daily_changes.csv', index_col='DATE', parse_dates=True)
df_fore = pd.read_csv('forecast_daily_absolues.csv', index_col='DATE', parse_dates=True)
z = df_fore[['2Y', '5Y', '10Y']].values

# Method 1: PCA directionality weighted trade
x = df_hist[['2Y', '5Y', '10Y']].values
Q = np.cov(x.T)
evals, evecs = np.linalg.eigh(Q)  # Q is symmetric; eigenvalues are returned in ascending order
pc1 = evecs[:, -1]  # eigenvector of the largest eigenvalue
w = np.array([-1, 2, -1]) / pc1
print('Weights for trade using PCA are:', 2*w[0]/w[1], 2, 2*w[2]/w[1])
df_fore['PCA'] = 100 * (w[0]*z[:, 0] + w[1]*z[:, 1] + w[2]*z[:, 2]) * 2/w[1]

# Method 2: Minimum Variance approach
Q_hat = Q[[0, 2], :]  # rows for the 2Y and 10Y hedges
Q_dhat = Q_hat[:, [0, 2]]  # 2x2 covariance of the hedges
w[[0, 2]] = -np.einsum('ij,jk,k->i', np.linalg.inv(Q_dhat), Q_hat, np.array([0, 2, 0]))
w[1] = 2
print('Weights for trade using min VaR are:', 2*w[0]/w[1], w[1], 2*w[2]/w[1])
df_fore['Min VaR'] = 100 * (w[0]*z[:, 0] + w[1]*z[:, 1] + w[2]*z[:, 2]) * 2/w[1]

# Method 3: Multivariable least square regression
x = df_hist[['2Y10Y', '5Y']].values
y = df_hist['2Y5Y10Y'].values  # 1-D so that beta is a flat array
beta = np.linalg.pinv(x) @ y
w = np.array([-(1 - beta[0]), 2 - beta[1], -(1 + beta[0])])
print('Weights for trade using MVLSR are:', 2*w[0]/w[1], 2, 2*w[2]/w[1])
df_fore['MVLSR'] = 100 * (w[0]*z[:, 0] + w[1]*z[:, 1] + w[2]*z[:, 2]) * 2/w[1]

# Plot an out of sample forecast (constant offsets only shift the lines for display)
fig, ax = plt.subplots(1, 1)
ax.plot(df_fore.index, df_fore['2Y5Y10Y'] + 36, 'k-', label='2Y5Y10Y')
ax.plot(df_fore.index, df_fore['MVLSR'] + 6.7, 'r-', label='MVLSR weights')
ax.plot(df_fore.index, df_fore['PCA'] - 2.3, 'g-', label='PCA Weights')
ax.plot(df_fore.index, df_fore['Min VaR'] + 14.9, 'b-', label='Min VaR weights')
ax.legend()
plt.show()

# Summary statistics of the daily changes of each series
print(df_fore[['2Y5Y10Y', 'MVLSR', 'PCA', 'Min VaR']].diff().describe())

+1, was made to learn PCA pretty much first day on a swaptions desk eons back, ubiquitous and used heavily in rates, funny how never really taken up as much elsewhere, defo has its place... — Mehness, Jun 28 '18 at 17:47
This is such a great answer that if I could upvote it multiple times I would =) The only thing I'd add is that quantitative methods won't also be sufficient to neutralize level/slope exposures. For example, for quite a few years before the Fed started hiking again, 2y rates barely moved. In those years, it doesn't matter how much structuring you do – your 2s/x/y fly will behave like an x/y curve... — Helin, Jun 29 '18 at 00:13
@Helin - I guess that's the thing abt PCA, components (and eigenvalues/importance) change with regimes, you could never expect your timeseries and derived components/importance to be stable... but it does have its uses but only if you have a good feel for the regime you're in, so like all things, needs a bit of 'nous' in application :) — Mehness, Jun 29 '18 at 00:53
Are you sure that the minimum variance approach and the multi-variate regression approach aren't doing the same thing? — Jared, Aug 15 '18 at 20:18
What's the justification for the PCA technique?
For instance let's imagine in recent history the main driver of volatility was the curve slope, so that $\text{PC}_1=[-1, 0, 1]$ (unlikely market conditions, but theoretically possible), then your calculation would give weights $w=[\frac{-1}{-1}, \frac{2}{0}, \frac{-1}{1}] = [1, \infty, -1]$. This is not really what one would expect for a butterfly.

Similarly $\text{PC}_1=[-1, 2, -1]$ would yield $w=[1,1,1]$. — Thrastylon, Aug 16 '21 at 09:03
@Thrastylon the justification is that the usual PC1 is market directional, and therefore by hedging you are removing the market directional component of your relative value trade. If PC1 is not representative of market directionality, suppose a curve slope like you say, then it is obviously not practical as per your calculations. It is a subjective choice. — Attack68, Aug 16 '21 at 12:30
How do you justify that you are hedging PC1 with your calculations? It is not obvious to me, hence my asking for a justification. My example was only to illustrate that these calculations can give non-sensical numbers (e.g. infinite weights). — Thrastylon, Sep 28 '21 at 13:29
Because when you multiply adjusted risk vector with the pc weights you get zero, which is the definition of no exposure to that PC. — Attack68, Sep 29 '21 at 05:08
@Attack68 1. which one are you using and what is your comment on applying PCA on relative changes vs levels? 2. Would it make more sense to apply PCA on non overlapping forward rates rather than spot rates? — Pontus Hultkrantz, May 28 '22 at 11:07
@PontusHultkrantz you should not use levels, only on changes (https://quant.stackexchange.com/a/50931/29443). If your trade strategies are forwards it is more direct, and possibly more obvious, to use PCA on forwards. But there is a mathematical relationship between forwards and pars so the analysis should hold either way. — Attack68, May 29 '22 at 12:51
@Attack68: thanks for your answer, def agree on using changes primarily due to stationarity requirement. The strat is priced using par rates so it def makes sense to use them. However I'm speculating that using spot and not fwds will convolute the PCA, by describing the obvious overlap rather than the unknown dynamics, especially if we want to understand curve dynamics using indep factors. Example, 9Y vs 10Y=9Y+1Y, almost perfect linear dependency, so first eigenvector will be~0.5(9Y+10Y), which is really just saying that 9Y and 10Y are linearly dependent, which we already knew. Thoughts? — Pontus Hultkrantz, Jun 01 '22 at 05:05

score 3 · Accepted Answer · answered Jun 29 '18 at 00:16

I think he's saying that if

$$ Fly = b_1(10s - 2s) + b_2(5s) + error $$

and, with the fly defined as in the question,

$$ Fly = -2s + 2(5s) - 10s $$

then, subtracting the first equation from the second and doing some algebra,

$$ -(1-b_1)2s + (2-b_2)5s - (1+b_1)10s = error $$

Hence the weights of the fly that give pure noise, uncorrelated to curve and rate, are as given above. If you want 2 in the middle, you have to rescale.

answered Jun 29 '18 at 00:16

dm63


Thanks, this makes intuitive sense. So the second equation is your original fly and you subtract the first equation, which results in weights such that you're neutralizing the curve and rate components, so you're trading based on the error term. – VanillaCall Jun 29 '18 at 03:15
Follow up question, I've tested this on one fly and notice that even the error term has directionality with the curve and rate. Could something else in the error term still be directional with the curve and rate? – VanillaCall Jul 18 '18 at 00:44
