Constantly Rebalanced Portfolios, Part One
A decade ago, the Stanford Report was excited about a new hedge fund, Mountain View Analytics, run by the well-pedigreed Thomas Cover. To wit:
Tom Cover has the next-best thing to a time machine: He has an algorithm — a computational procedure — that uses the past to predict the future. It works as well or better than hindsight, outperforming a pretty good investment strategy: diversifying your stock portfolio and hoping that performance of superstars will more than make up for money wasted on losers. [several paragraphs redacted] So who wants to be a millionaire?
Sometime earlier this year, the domain registration on mountainviewanalytics.com expired, the page having not been updated for many years. These posts are intended to explain why constantly rebalanced portfolios (CRPs), the broad class of investment strategies of which Cover’s is one, don’t actually work. In this post, I’ll give the technical and theoretical background, and in the next post, I’ll talk about the myriad ways CRPs come up short in practice.
Why bother? Well, I think that CRPs are an interesting problem for two reasons:
They provide a very illuminating example of the issues involved in bringing academic theory into practice, and the risks of hubris inherent in that process.
They involve some very interesting math. They are featured at the end of the book Prediction, Learning, and Games (which I have considered buying if only for its cool cover art), which suggests they can be thought of as a culmination of learning theory, one of the more interesting topics I’ve had the opportunity to study in grad school.
The idea behind a CRP is that you always maintain a constant portfolio (distribution) of your wealth regardless of how the underlying assets in the portfolio change in value. So, if you’re mixing fifty-fifty between two stocks and one of them doubles in value while the other stays the same, you’d sell a quarter of your expensive stock holdings and use the proceeds to buy the cheaper stock; this maintains your holdings at an even split of wealth. Ideally, this scheme allows you to capture the value from positive swings in price while priming your portfolio for when undervalued assets come back in line. I probably should know a real answer to this, but my guess would be that if all the movements of the assets in your portfolio are completely arbitrary and not reflective of any kind of underlying value, then a CRP is your optimal policy.
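To make the fifty-fifty example concrete, here is a minimal sketch of a single rebalancing step. The function name rebalance and the dollar amounts are made up for illustration, and it ignores transaction costs entirely.

```python
def rebalance(holdings, weights):
    """Return the trade in each asset (positive = buy, negative = sell)
    that restores the target weights. Hypothetical helper, no trading costs."""
    total = sum(holdings)
    targets = [w * total for w in weights]
    return [t - h for t, h in zip(targets, holdings)]

# Start with 50 in each of two stocks; the first doubles, the second is flat.
holdings = [50.0 * 2.0, 50.0 * 1.0]
trades = rebalance(holdings, [0.5, 0.5])
print(trades)  # [-25.0, 25.0]
```

Selling 25 out of 100 is the "quarter of your expensive stock holdings" from the example, and both positions end up at 75.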
Now, okay, let’s get into some math (sorry for the formatting, tumblr is nice for many things but not for this). Let x be a CRP (a row vector of non-negative values; without loss of generality, have it sum to 1 by introducing a “cash” asset if you’d like). Let aN represent a column vector of the returns of the assets on the N-th day. (So if they were stocks, it would be a column vector that looks like [.97, 1.04, 1.01, …].) The product x * aN is then the factor by which the CRP multiplies your wealth on day N.
Then we see:
return from best CRP in hindsight = max over x: (x * a1)(x * a2)(x * a3)…
Now consider the RHS. By monotonicity of log, the same x will also solve:
max over x: log(x * a1) + log(x * a2) + log(x * a3) + …
which is equivalent to
min over x: -log(x * a1) - log(x * a2) - log(x * a3) - …
So this is the insight that transforms the task into an online convex optimization problem: minimizing a convex function (the sum of all those negative logs) over a convex domain (the simplex), where the function changes in an online fashion (you’re not statically optimizing; instead you get a new term every day).
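For a feel of what the "best CRP in hindsight" objective looks like, here is a rough sketch for the two-asset case. It just grid-searches the simplex rather than doing any real convex optimization, and the return data is invented: asset 0 is cash and asset 1 alternately doubles and halves. The function names are my own.

```python
import math

def crp_log_wealth(x, returns):
    """Sum of log(x * aN) over all days: the log of the CRP's final wealth."""
    return sum(math.log(sum(xi * ai for xi, ai in zip(x, a))) for a in returns)

def best_crp_two_assets(returns, steps=1000):
    """Brute-force search over x = (w, 1 - w); only sensible for two assets."""
    best_x, best_val = None, -math.inf
    for k in range(steps + 1):
        w = k / steps
        x = (w, 1.0 - w)
        val = crp_log_wealth(x, returns)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Invented data: cash stays flat, the stock doubles then halves, five times over.
returns = [(1.0, 2.0), (1.0, 0.5)] * 5
best_x, best_log_wealth = best_crp_two_assets(returns)
print(best_x, math.exp(best_log_wealth))
```

On this toy data, holding either asset alone ends with exactly the wealth you started with, while the even split multiplies wealth by 1.5 * 0.75 = 1.125 every two days. That is the kind of gain from rebalancing the theory is after.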
If this sounds cool to you, you’ll probably want to check out Elad Hazan’s recent paper summarizing online convex optimization in this context. But essentially, there are now a large number of algorithms which stay competitive with the best CRP in hindsight. What’s neat about these algorithms is that, to achieve these bounds, you generally act to optimize over what you’ve currently seen, performing gradient-descent-type algorithms over your historical data set. While gradient descent makes lots of sense in offline convex optimization, it’s very neat to me that such algorithms are successful in the online setting, where the shape of the space you’re optimizing over can change so dramatically from time step to time step, because assets could plummet or soar in value and you have no control over that.
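As one example of a gradient-type online update, here is a sketch of the exponentiated gradient rule for portfolios. To be clear, this is my pick for illustration: I am not claiming it is Cover’s algorithm or one of the specific methods in Hazan’s paper, and the step size eta = 0.05 and the return data are made up.

```python
import math

def eg_update(x, a, eta=0.05):
    """One exponentiated-gradient step. The gradient of log(x * a) with
    respect to x is a / (x * a); each weight is scaled by exp(eta * gradient)
    and the result is renormalized so it stays on the simplex."""
    port_return = sum(xi * ai for xi, ai in zip(x, a))
    unnorm = [xi * math.exp(eta * ai / port_return) for xi, ai in zip(x, a)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def run_eg(returns, eta=0.05):
    """Play the online game: commit to x, observe the day's returns, update."""
    n = len(returns[0])
    x = [1.0 / n] * n  # start from the uniform portfolio
    wealth = 1.0
    for a in returns:
        wealth *= sum(xi * ai for xi, ai in zip(x, a))  # x was chosen before seeing a
        x = eg_update(x, a, eta)
    return wealth, x

# Invented data: cash stays flat, the stock doubles then halves, five times over.
returns = [(1.0, 2.0), (1.0, 0.5)] * 5
wealth, x = run_eg(returns)
print(wealth, x)
```

The point of the sketch is only the shape of the loop: the portfolio for each day is fixed using past data alone, and the update nudges weight toward whatever did well relative to the portfolio as a whole.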
So, I hope I’ve given a good illustration of how this should work in theory. Next time: why it doesn’t work in practice.