The following is a demonstration of how to use R to do quadratic programming in order to do mean-variance portfolio optimization under different constraints, e.g., no leverage, no shorting, max concentration, etc.
Taking a step back, it’s probably helpful to realize the point of all of this. In the 1950s, Harry Markowitz introduced what we now call Modern Portfolio Theory (MPT), which is a mathematical formulation for diversification. Intuitively, because some stocks zig when others zag, when we hold a portfolio of these stocks, our portfolio can have some notional return at a lower variance than holding the stocks outright. More specifically, given a basket of stocks, there exists a notion of an efficient frontier. I.e., for any return you choose, there exists a portfolio with the lowest variance and for any variance you fix, there exists a portfolio with the greatest return. Any portfolio you choose that is not on this efficient frontier is considered sub-optimal (for a given return, why would you choose a a higher variance portfolio when a lower one exists).
The question becomes if given a selection of stocks to choose from, how much do we invest in each stock if at all?
In an investments course I took a while back, we worked the solution for the case where we had a basket of three stocks to choose from, in Excel. Obviously, this solution wasn’t really scalable outside of the N=3 case. When asked about extending N to an arbitrary number, the behind-schedule-professor did some handwaving about matrix math. Looking into this later, there does exist a closed-form equation for determining the holdings for an arbitrary basket of stocks. However, the math starts getting more complicated with each constraint you decide to tack on (e.g., no leverage).
The happy medium between “portfolio optimizer in Excel for three stocks” and “hardcore matrix math for an arbitrary number of stocks” is to use a quadratic programming solver. Some context is needed to see why this is the case.
Quadratic Programming
According to wikipedia, quadratic programming attempts to minimize a function of the form \frac{1}{2}x^{T}Qx + c^{T}x subject to one or more constraints of the form Ax \le b (inequality) or Ex = d (equality).
Modern Portfolio Theory
The mathematical formulation of MPT is that for a given risk tolerance q \in [0,\infty), we can find the efficient frontier by minimizing w^{T} \Sigma w - q*R^{T}w.
Where,
- w is a vector of holding weights such that \sum w_i = 1
- \Sigma is the covariance matrix of the returns of the assets
- q \ge 0 is the “risk tolerance”: q = 0 works to minimize portfolio variance and q = \infty works to maximize portfolio return
- R is the vector of expected returns
- w^{T} \Sigma w is the variance of portfolio returns
- R^{T} w is the expected return on the portfolio
My introducing of quadratic programming before mean-variance optimization was clearly setup, but look at the equivalence between \frac{1}{2}x^{T}Qx + c^{T}x and w^{T} \Sigma w - q*R^{T}w.
Quadratic Programming in R
solve.QP, from quadprog, is a good choice for a quadratic programming solver. From the documentation, it minimizes quadratic programming problems of the form -d^{T}b + \frac{1}{2} b^{T}Db with the constraints A^{T}b \ge b_0. Pedantically, note the variable mapping of D = 2\Sigma (this is to offset the \frac{1}{2} in the implied quadratic programming setup) and d = R.
The fun begins when we have to modify A^{T}b \ge b_0 to impose the constraints we’re interested in.
Loading Up the Data
I went to google finance and downloaded historical data for all of the sector SPDRs, e.g., XLY, XLP, XLE, XLF. I’ve named the files in the format of dat.{SYMBOL}.csv
. The R code loads it up, formats it, and then ultimately creates a data frame where each column is the symbol and each row represents an observation (close to close log return).
The data is straight-forward enough, with approximately 13 years worth:
> dim(dat.ret) [1] 3399 9 > head(dat.ret, 3) XLB XLE XLF XLI XLK [1,] 0.010506305 0.02041755 0.014903406 0.017458395 0.023436164 [2,] 0.022546751 -0.00548872 0.006319802 0.013000812 -0.003664126 [3,] -0.008864066 -0.00509339 -0.013105239 0.004987542 0.002749353 XLP XLU XLV XLY [1,] 0.023863921 -0.004367553 0.022126545 0.004309507 [2,] -0.001843998 0.018349139 0.006232977 0.018206972 [3,] -0.005552485 -0.005303294 -0.014473165 -0.009255754 >
Mean-Variance Optimization with Sum of Weights Equal to One
If it wasn’t clear before, we typically fix the q in w^{T} \Sigma w - q*R^{T}w before optimization. By permuting the value of q, we then generate the efficient frontier. As such, for these examples, we’ll set q = 0.5.
solve.QP’s arguments are:
solve.QP(Dmat, dvec, Amat, bvec, meq=0, factorized=FALSE)
Dmat
(covariance) and dvec
(penalized returns) are generated easily enough:
risk.param
Amat and bvec
are part of the inequality (or equality) you can impose, i.e., A^{T}b \ge b_0. meq
is an integer argument that specifies “how many of the first meq constraints are equality statements instead of inequality statements.” The default for meq
is zero.
By construction, you need to think of the constraints in terms of matrix math. E.g., to have all the weights sum up to one, Amat
needs to contain a column of ones and bvec
needs to contain a single value of one. Additionally, since it’s an equality contraint, meq
needs to be one.
In R code:
# Constraints: sum(x_i) = 1 Amat
Having instantiated all the arguments for solve.QP
, it’s relatively straightforward to invoke it. Multiple things are outputted, e.g., constrained solution, unconstrained solution, number of iterations to solve, etc. For our purpose, we’re primarily just interested in the solution.
> qp qp$solution [1] -0.1489193 0.6463653 -1.0117976 0.4107733 -0.4897956 0.2612327 -0.1094819 [8] 0.5496478 0.8919753
Things to note in the solution are that we have negative values (shorting is allowed) and there exists at least one weight whose absolute value is greater than one (leverage is allowed).
Mean-Variance Optimization with Sum of Weights Equal to One and No Shorting
We need to modify Amat
and bvec
to add the constraint of no shorting. In writing, we want to add a diagonal matrix of ones to Amat
and a vector of zeros to bvec
, which works out when doing the matrix multiplication that for each weight, its value must be greater than zero.
# Constraints: sum(x_i) = 1 & x_i >= 0 Amat qp$solution [1] 0.0000000 0.4100454 0.0000000 0.0000000 0.0000000 0.3075880 0.0000000 [8] 0.2823666 0.0000000
Note that with the constraints that all the weights sum up to one and that the weights are positive, we’ve implicitly also constrained the solution to have no leverage.
Mean-Variance Optimization with Sum of Weights Equal to One, No Shorting, and No Heavy Concentration
Looking at the previous solution, note that one of the weights suggests that we put 41% of our portfolio into a single asset. We may not be comfortable with such a heavy allocation, and we might want to impose the additional constraint that no single asset in our portfolio takes up more than 15%. In math and with our existing constraints, that’s the same as saying -x \ge -0.15 which is equivalent to saying x \le 0.15.
# Constraints: sum(x_i) = 1 & x_i >= 0 & x_i <= 0.15 Amat qp$solution [1] 0.1092174 0.1500000 0.0000000 0.1407826 0.0000000 0.1500000 0.1500000 [8] 0.1500000 0.1500000
Turning the Weights into Expected Portfolio Return and Expected Portfolio Volatility
With our weights, we can now calculate the portfolio return as R^{T}w and portfolio volatility as sqrt{w^T \Sigma w}. Doing this, we might note that the values look “small” and not what you expected. Keep in mind that our observations are in daily-space and thus our expected return is expected daily return and expected volatility is expected daily volatility. You will need to annualize it, i.e., R^{T}w * 252 and \sqrt{w^{T} \Sigma w * 252}.
The following is an example of the values of the weights and portfolio statistics while permuting the risk parameter and solving the quadratic programming problem with the constraints that the weights sum to one and there’s no shorting.
> head(ef.w) XLB XLE XLF XLI XLK XLP XLU XLV XLY 1 0 0.7943524 0 0 0 0 0 0.1244543 0.08119329 1.005 0 0.7977194 0 0 0 0 0 0.1210635 0.08121713 1.01 0 0.8010863 0 0 0 0 0 0.1176727 0.08124097 1.015 0 0.8044533 0 0 0 0 0 0.1142819 0.08126480 1.02 0 0.8078203 0 0 0 0 0 0.1108911 0.08128864 1.025 0 0.8111873 0 0 0 0 0 0.1075003 0.08131248 > head(ef.stat) ret sd 1 0.06663665 0.2617945 1.005 0.06679809 0.2624120 1.01 0.06695954 0.2630311 1.015 0.06712098 0.2636519 1.02 0.06728243 0.2642742 1.025 0.06744387 0.2648981 >
Note that as we increase the risk parameter, we’re working to maximize return at the expense of risk. While obvious, it’s worth stating that we’re looking at the efficient frontier. If you plotted ef.stat
in its entirety on a plot whose axis are in return space and risk space, you will get the efficient frontier.
Wrap Up
I’ve demonstrated how to use R and the quadprog package to do quadratic programming. It also happens to coincide that the mean-variance portfolio optimization problem really lends itself to quadratic programming. It’s relatively straightforward to do variable mapping between the two problems. The only potential gotcha is how to state your desired constraints into the form A^{T}b \ge b_{0}, but several examples of constraints were given, for which you can hopefully extrapolate from.
Getting away from the mechanics and talking about the theory, I’ll also offer that there are some serious flaws with the approach demonstrated if you attempt to implement this for your own trading. Specifically, you will most likely want to create return forecasts and risk forecasts instead of using historical values only. You might also want to impose constraints to induce sparsity on what you actually hold, in order to minimize transaction costs. In saying that your portfolio is mean-variance optimal, there’s the assumption that the returns you’re working with is normal, which is definitely not the case. These and additional considerations will need to be handled before you let this run in “production.”
All that being said, however, Markowitz’s mean-variance optimization is the building block for whatever more robust solution you might end up coming with. And, an understanding in both theory and implementation of a mean-variance optimization is needed before you can progress.
Helpful Links
Lecture on Quadratic Programming and Markowitz Model (R. Vanderbei)
Lecture on Linear Programming and a Modified Markowitz Model (R. Vanderbei)
Many thanks! This is exactly I’m waiting for.
Very clear article.
There is a typo in the part about setting the constraints to a maximum allocation of 15% in a single asset. It should say -x >= -0.15 instead of -x <= -0.15.
Good catch Brent. Thanks!
Hi,
This looks fantastic — thanks.
There is a website here (http://economistatlarge.com/portfolio-theory/r-optimized-portfolio) that has a programme so similar that it must have been taken from yours. However, while he still uses colMeans to calculate the expected return, the data he uses are geometric returns, not logarithmic returns. So I think his programme is wrong. I am planning to use this myself, so I thought I would solicit your view!
Thanks for the comments and bringing up the “inspired post” you saw.
I’m going to give the guy benefit of the doubt, especially since it’s apparent that he wrote his own R code, and it’s in a larger context of “Modern Portfolio Theory.”
Additionally, if you’re attempting to do mean-variance optimization, that is, minimize xT %*% Q %*% x – xT %*% R, in R, there’s only so many ways to do it, and quadprog is a natural candidate to do that.
Great blog, could you use solve.QP to directly incorporate a linear transaction cost constraint ?
Yes, you can. But, the challenge is in how you setup the constraints / objective function. Without going into huge details, you will need to modify the objective function directly, i.e., it’s not enough to just look at “x” since x is inclusive of existing position and new trades, so you will need to decompose x into existing position and new trades.
Great post. Learnt a lot about using quadprog in R and MPT through this. I am having one problem though What if the sum of absolute values of the weights was constrained to be below a threshold?
That is |w_1| + |w_2| .. + |w_n| = 0 for 1<=i <=n
2. y_1 + y_2 … + y_n <= 3.
3. w_1 <= y_1
-w_1 <= y_1
w_2 <= y_2
-w_2 <= y_2
…
But how can we do this using quadprog in R?
Summing up the absolute values can be particularly tricky. E.g., say you wanted your “invested portfolio size” to be less than some amount. You can’t just sum up the weights (assume they represented dollars to invest, which is analogous to weighting of portfolio in fractions), since in a dollar neutral, it’d just be zero. In a very hand-wavy way, you need to decompose your weights into “positive weights” and “negative weights” and then do math on that. It goes without saying, you will have to modify the objective function to capture this.
There’s much more nuances than can be obtained in a reply, so I’ll link this writing, which breaks down the math: https://www.wpi.edu/Pubs/E-project/Available/E-project-042707-112035/unrestricted/TurnoverConstraintsMQP.pdf
Hi, great post, just wondering how to constrain a single asset to say 5% or 10%of the portfolio, and leave the others constrained to 1 or -1 weighting (or 0 / 1)
Hi Humph,
There’s actually a difference in how to implement that for when shorting is allowed and when it’s long only, where the latter is much easier.
Dealing with the latter case, if you look underneath the section of “No Heavy Concentration”, you’ll see that I have this line here
bvec <- c(1, rep(0, nrow(Dmat)), rep(-0.15, nrow(Dmat)))
Basically, I'm applying a limit of 0.15 across the entire board. If you wish for any to be zero or one, just specify the vector of limits by hand, e.g.,
c(1,1,0.15,1,0,1,1,0.15)
, where zero implies you don't want to hold a position.Hi – Great infomration. I was also using this website code: (http://economistatlarge.com/portfolio-theory/r-optimized-portfolio) before noticing yours. My question though, how so I know which risk free is being used? If I want to impose my own Risk-free rate, how to adjust the code? Also, if I’m trying to use a Riskt-free rate that is negative, will that be possible. Thanks
The most clear, intuitive explanation of the application of quadratic programming to portfolio optimization I’ve heard. Thank you so much. I’ll be practicing this in r.
That said, I’m revisiting math I’ve not seen in years and am picking up linear algebra on my own.
I want to expand my knowledge to more types of non-linear programming which will help me with more sophisticated types of portfolio constructions, but am having trouble with vernacular, due to lack of some foundational education. Any suggestions on resources, sites, strategies that would help me climb the curve?
Your input would be most appreciated.
Hi – very helpful post!
I’m having trouble finding the tangency portfolio though. any help on that?
Thanks you!
Hi
How could I use quadprog in R but adapt the objective function to include additional parameters which I would like to minimize?
Is there a way to do this
Thank you
BLala
Nice article but I will like to know how the contraint direction is defined eg. whether a constraint is less than or equal to OR greater than or equal to.
Hello,
Could you please let me know how to generate an efficient frontier graph with these results?
In addition, what if I wanted to specify a minimum allocation for each asset as well as as a maximum?
This was exactly what I needed to get an intro to quadratic programming and to successfully run quadprog with my own data. Thank you so much!
Hey, thank you very much for the post… you saved me a couple of hours here! I just one question: if the code to be optimized by R is
(1/2)*x^T*Q*x – d^T*x
and we’re minimizing
x^T * Sigma * x – mu^T*x
shouldn’t we use Dmat = 2*Sigma, instead of Dmat = Sigma as you suggest?
Again, thanks for the post!
Hi,
Thanks for the compliment!
Good point! I’ll respond with a hedged yes/no.
Yes, in that if you were pedantic in interpreting everything in their right units, you could say, “well, I only want half of the variance to contribute” (setting the risk-aversion parameter to 0.5).
No, in that in practice, the risk-aversion parameter is a hyper parameter you search over so the lack of multiplied-by-two gets glossed over. What I mean is that you’d permute your risk-aversion parameter, simulate/test over those permutation over some validation period, and “pick” one for which you were satisfied with the trading characteristics (whatever it may be).
All that being said, I think I’ll go ahead and change it just to cross-my-t’s-and-dont-my-i’s.
Hi, very useful. Would you be able to provide a link to the r code behind this?
Hi CG,
This was more of a one-off and was written several years ago. As such, it’s been lost in the ether.
I’m pretty busy these days, but whenever I find the chance, I might try to recreate it (and honestly reacquaint myself with it).
Best of luck!
Mr. T
Hello,
This post is truly exceptional! Thanks for writing it. in the case of weights being above 1, what if I wanted not to apply a constraint but a linear penalty?
Said differently, I am willing to go above 1 when it comes to my total sum of weights only if it is very beneficial. For that reason a linear penalty would likely do the trick.
Any ideas there?
Hey there, thanks so much for this article, it’s really helpful.
I have another question on argument-setting though. What if my goal was to maximize the expected return given a max-limit on the portfolio variance? How can I set the arguments?
Hello,
Can I get the returns before calculating the covariances for the 9 stocks XLB , …historical data you have got from google finance.
I am a mathematician and I am developing an active set method for portfolio optimization with Sum of Weights Equal to One and No Shorting.
So that I can compare the answer I will get with your answer:
[1] 0.0000000 0.4100454 0.0000000 0.0000000 0.0000000 0.3075880 0.0000000
[8] 0.2823666 0.0000000