Dane Van Domelen
2020-04-26
You can install and load stocks from GitHub via the following code:
devtools::install_github("vandomed/stocks")
library("stocks")
The stocks package has a variety of functions for analyzing investments and investment strategies. I use it for a lot of my articles on Seeking Alpha. The package relies heavily on Yahoo! Finance for historical prices and on the quantmod package for downloading that data into R.
There are functions for calculating performance metrics, visualizing the performance of funds and multi-fund portfolios, and backtesting trading strategies. The main functions are:
Function | Purpose |
---|---|
load_prices | Download Historical Prices |
load_gains | Download Historical Gains |
plot_growth | Plot Investment Growth |
calc_metrics | Calculate Performance Metrics |
calc_metrics_overtime | Calculate Performance Metrics over Time |
calc_metrics_2funds | Calculate Performance Metrics for Two-Fund Portfolios |
calc_metrics_3funds | Calculate Performance Metrics for Three-Fund Portfolios |
plot_metrics | Plot One Performance Metric (Sorted Bar Plot) or One vs. Another (Scatterplot) |
plot_metrics_overtime | Plot One Performance Metric vs. Time or One vs. Another over Time |
plot_metrics_2funds | Plot One Performance Metric vs. Another for Two-Fund Portfolios |
plot_metrics_3funds | Plot One Performance Metric vs. Another for Three-Fund Portfolios |
Stocks and bonds are obviously the primary building blocks for a retirement portfolio, and I think the ETFs SPY and TLT pair together nicely for an effective two-fund strategy. Let’s look at the performance of these funds separately and together.
We can use `load_gains` to download historical daily gains for SPY and TLT over their mutual lifetimes:
library("stocks")
gains <- load_gains(c("SPY", "TLT"), to = "2018-12-31")
head(gains)
#>            Date      SPY      TLT
#> 2395 2002-07-31  0.00242  0.01239
#> 2396 2002-08-01 -0.02611  0.00569
#> 2397 2002-08-02 -0.02241  0.01024
#> 2398 2002-08-05 -0.03480  0.00441
#> 2399 2002-08-06  0.03366 -0.00855
#> 2400 2002-08-07  0.01744  0.00240
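For readers unfamiliar with the term, a daily gain here is presumably the proportional change in price from one trading day to the next. The sketch below shows that calculation in base R on made-up prices; the `prices` vector is hypothetical, and whether `load_gains` works from adjusted closing prices is an assumption on my part rather than something stated above.

```r
# Hypothetical closing prices for one fund on four consecutive trading days
prices <- c(100, 102, 99.96, 104.958)

# Daily gain = today's price / yesterday's price - 1
daily_gains <- prices[-1] / prices[-length(prices)] - 1

# Gains are proportions, so 0.02 means +2%
round(daily_gains, 4)
```

Multiplying `1 + daily_gains` together recovers the overall growth factor, which is why gains rather than raw prices are the natural input for the metrics that follow.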
We can call (or pipe into) `calc_metrics` to calculate some performance metrics. `calc_metrics` returns a normal data frame, but I’ll call `knitr::kable` to print it as a neat-looking table:
metrics <- calc_metrics(gains)
knitr::kable(metrics)
Fund | CAGR (%) | Max drawdown (%) | Mean (%) | SD (%) | Sharpe ratio | Annualized alpha (%) | Beta | Correlation |
---|---|---|---|---|---|---|---|---|
SPY | 8.49 | 55.2 | 0.039 | 1.168 | 0.034 | 0.0 | 1.000 | 1.000 |
TLT | 6.31 | 26.6 | 0.028 | 0.844 | 0.033 | 10.4 | -0.292 | -0.404 |
We see here that SPY has achieved stronger growth (8.5% vs. 6.3%), but with a much worse max drawdown (55.2% vs. 26.6%). The two funds’ Sharpe ratios (a measure of risk-adjusted returns) are nearly identical (0.034 for SPY vs. 0.033 for TLT).
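To make these metrics concrete, here is a minimal base R sketch of how CAGR, max drawdown, and a daily Sharpe ratio can be computed from a vector of daily gains. The helper names (`cagr`, `max_drawdown`, `sharpe`), the 252-trading-days-per-year convention, and the zero risk-free rate are my assumptions for illustration; they are not taken from the package's source, and the data are simulated.

```r
# Illustrative helpers (hypothetical names, not the stocks package's internals)

# Compound annual growth rate, assuming 252 trading days per year
cagr <- function(gains, days_per_year = 252) {
  prod(1 + gains)^(days_per_year / length(gains)) - 1
}

# Max drawdown: largest peak-to-trough decline in the growth of $1
max_drawdown <- function(gains) {
  balance <- c(1, cumprod(1 + gains))  # include the starting balance as a peak
  max(1 - balance / cummax(balance))
}

# Daily Sharpe ratio, assuming a risk-free rate of 0
sharpe <- function(gains) mean(gains) / sd(gains)

# Simulated daily gains for five years (not real fund data)
set.seed(1)
g <- rnorm(252 * 5, mean = 0.0004, sd = 0.01)

round(c(cagr = cagr(g), mdd = max_drawdown(g), sharpe = sharpe(g)), 3)
```

Under these assumptions, the Sharpe ratios in the table are roughly what you get by dividing each fund's mean daily gain by its SD (for example, 0.028 / 0.844 is about 0.033 for TLT).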
Without getting too far ahead of myself, TLT’s positive alpha (about 0.039% per day, or 10.4% annualized as shown in the table) and negative beta (-0.292) are precisely why it pairs so well with SPY. This isn’t unique to TLT; all bond funds should generate alpha (otherwise, don’t invest!), and they’re often negatively correlated with equities.
For a visual comparison of the returns and volatility of these two ETFs, we can plot mean vs. SD using `plot_metrics`.
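Alpha and beta are commonly estimated as the intercept and slope from regressing a fund's gains on a benchmark's gains. The sketch below does this with `lm` on simulated data; the variable names are hypothetical, and annualizing alpha by compounding over 252 trading days is an assumed convention, although it does appear consistent with the table (a daily alpha near 0.039% compounds to roughly 10.4%).

```r
# Simulated daily gains (hypothetical data, not SPY or TLT)
set.seed(2)
benchmark <- rnorm(1000, mean = 0.0004, sd = 0.012)
fund <- 0.0003 - 0.3 * benchmark + rnorm(1000, sd = 0.007)

# Regress fund gains on benchmark gains: intercept = alpha, slope = beta
fit <- lm(fund ~ benchmark)
alpha_daily <- unname(coef(fit)[1])
beta <- unname(coef(fit)[2])

# Annualize the daily alpha by compounding (assumed convention)
alpha_annual <- (1 + alpha_daily)^252 - 1

round(c(alpha_daily_pct = 100 * alpha_daily,
        alpha_annual_pct = 100 * alpha_annual,
        beta = beta), 3)
```

Because the intercept equals the fund's mean gain minus beta times the benchmark's mean gain, a fund with negative beta and a positive mean gain will show positive alpha whenever the benchmark's mean gain is positive.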
plot_metrics(metrics, mean ~ sd)
No surprise, the S&P 500 ETF had more growth, but also higher volatility.
(Side note: You could achieve the same plot by specifying `gains` rather than `metrics`, or by simply specifying the `tickers` input.)
Negative correlation works wonders for a two-fund portfolio, so let’s look at how consistently TLT achieves negative correlation with SPY, using `calc_metrics_overtime` and `plot_metrics_overtime`. For illustrative purposes, I’ll include the full 3-step process: load historical gains, calculate the correlation over time, and generate the plot.
c("SPY", "TLT") %>%
  load_gains(to = "2018-12-31") %>%
  calc_metrics_overtime("r") %>%
  plot_metrics_overtime(r ~ .)
While the tendency is certainly for negative correlation, there’s a lot of variability, and in some years the correlation was actually slightly positive.
As you can see, the default behavior is to calculate the requested metric on a per-year basis. You can also request per-month calculations or rolling windows of a certain width (see `?calc_metrics_overtime`).
And the Pearson correlation is just one of many metrics you can plot (see `?calc_metrics` for the full list).
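If you are curious what a per-year calculation involves, the following base R sketch splits a data frame of daily gains by calendar year and computes the Pearson correlation within each year. The data are simulated and the object names (`gains_sim`, `r_by_year`) are hypothetical; this only illustrates the idea and is not how `calc_metrics_overtime` is actually implemented.

```r
# Simulated daily gains for two hypothetical funds, A and B
# (every calendar day is used for simplicity; real data would skip non-trading days)
set.seed(3)
dates <- seq(as.Date("2015-01-01"), as.Date("2017-12-31"), by = "day")
a <- rnorm(length(dates), sd = 0.01)
b <- -0.4 * a + rnorm(length(dates), sd = 0.008)
gains_sim <- data.frame(Date = dates, A = a, B = b)

# Group rows by calendar year, then compute the correlation within each group
years <- format(gains_sim$Date, "%Y")
r_by_year <- sapply(split(gains_sim, years), function(d) cor(d$A, d$B))

round(r_by_year, 3)
```

Swapping the grouping variable for `format(gains_sim$Date, "%Y-%m")` would give a per-month version of the same calculation.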
Everyone loves piping these days, but for typical use cases I would actually recommend skipping directly to `plot_metrics_overtime`. If you specify `tickers`, it will download the data it needs on the fly. This code is much shorter and produces the same figure as above:
plot_metrics_overtime(formula = r ~ ., tickers = "TLT")
A 50% SPY, 50% TLT portfolio should generate much better risk-adjusted returns than SPY (and perhaps TLT) alone, but a 50% bond allocation is pretty high, so raw returns will probably be lower.
To look at this, we can add a column to `gains` and then call `calc_metrics`, requesting a few particular metrics:
gains$`50-50` <- gains$SPY * 0.5 + gains$TLT * 0.5
calc_metrics(gains, c("cagr", "mdd", "sharpe", "sortino")) %>%
  knitr::kable()
Fund | CAGR (%) | Max drawdown (%) | Sharpe ratio | Sortino ratio |
---|---|---|---|---|
SPY | 8.49 | 55.2 | 0.034 | 0.042 |
TLT | 6.31 | 26.6 | 0.033 | 0.050 |
50-50 | 8.37 | 23.0 | 0.059 | 0.082 |
Indeed, while the 50-50 portfolio achieved slightly lower raw returns than SPY alone, its max drawdown was far better, and its Sharpe and Sortino ratios indicated much better risk-adjusted growth compared to the individual ETFs.
Which allocation is optimal will likely depend on what metric you want to maximize. In terms of raw growth, roughly 75% SPY is optimal, but the curve is pretty flat: the CAGR is roughly the same from 60-100% SPY.
plot_metrics_2funds(gains = gains,
                    formula = cagr ~ allocation,
                    tickers = c("SPY", "TLT"),
                    from = "2010-01-01")
In terms of risk-adjusted growth, the Sharpe ratio curve is somewhat more interesting. The maximum Sharpe ratio occurs around 40% SPY, and the Sharpe ratio gets much worse as you approach 60% SPY and higher.
plot_metrics_2funds(gains = gains,
                    formula = sharpe ~ allocation,
                    tickers = c("SPY", "TLT"),
                    from = "2010-01-01")
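The idea behind these allocation curves can be sketched in base R: step through a range of weights, build each blended gains series, and compute the metric of interest. Everything below is illustrative, with simulated data and hypothetical names (`stock`, `bond`, `results`); it assumes daily rebalancing, 252 trading days per year, and a zero risk-free rate, and it is not the package's own implementation.

```r
# Simulated daily gains for a stock-like and a bond-like fund (hypothetical data)
set.seed(4)
n <- 252 * 8
stock <- rnorm(n, mean = 0.0005, sd = 0.011)
bond <- -0.3 * stock + rnorm(n, mean = 0.0005, sd = 0.008)

# Stock allocations from 0% to 100% in 5% steps
alloc <- seq(0, 1, by = 0.05)

results <- as.data.frame(t(sapply(alloc, function(w) {
  g <- w * stock + (1 - w) * bond  # blended daily gains (daily rebalancing)
  c(allocation = 100 * w,
    cagr = 100 * (prod(1 + g)^(252 / n) - 1),
    sharpe = mean(g) / sd(g))
})))

# Allocation with the highest Sharpe ratio in this simulated sample
results[which.max(results$sharpe), ]
```

Plotting `results$cagr` or `results$sharpe` against `results$allocation` gives the same kind of curve as the figures here, just for made-up data.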
We can gain additional insight by plotting two metrics against each other, across all possible allocations. A common strategy is to plot the mean vs. standard deviation as a function of the allocation:
plot_metrics_2funds(gains = gains,
                    formula = mean ~ sd,
                    tickers = c("SPY", "TLT"),
                    from = "2010-01-01")
This plot yields an interesting finding: starting at 100% TLT, increasing the allocation to SPY simultaneously reduces volatility and increases returns. In other words, you’d be crazy not to ride the curve up and to the left, adding at least a 30% SPY allocation.
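The reason adding SPY to TLT can reduce volatility follows from the standard formula for the variance of a two-asset portfolio. As a sketch, let $w$ be the SPY weight, $\sigma_S$ and $\sigma_B$ the daily SDs of SPY and TLT, and $\rho$ their correlation (these symbols are my notation, not the package's), and assume daily rebalancing:

```latex
\sigma_p^2 = w^2 \sigma_S^2 + (1 - w)^2 \sigma_B^2 + 2 w (1 - w) \rho \, \sigma_S \sigma_B

w^{*} = \frac{\sigma_B^2 - \rho \, \sigma_S \sigma_B}{\sigma_S^2 + \sigma_B^2 - 2 \rho \, \sigma_S \sigma_B}
```

Here $w^{*}$ is the weight that minimizes $\sigma_p^2$. It is positive whenever $\rho < \sigma_B / \sigma_S$, which always holds when the correlation is negative. Plugging in the full-period values from the metrics table ($\sigma_S = 1.168$, $\sigma_B = 0.844$, $\rho = -0.404$) gives $w^{*} \approx 1.111 / 2.873 \approx 0.39$, or about 39% SPY. The plot above covers a different date range, so its minimum-volatility point will not match this number exactly.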
A big caveat is that this is all based on historical data. There’s no guarantee that 30% SPY, 70% TLT will have lower volatility or greater returns than TLT going forward.
I think three-fund portfolios are the sweet spot in terms of balancing complexity and performance. With two funds, you’re relying on a single source of alpha generation; with more than three funds, it’s hard to visualize, and thus hard to understand whether the constituent funds actually complement each other.
I won’t go into full detail about it here, but three asset classes that I think work really well together are large-cap stocks, long-term bonds, and junk bonds. To visualize such a strategy, implemented via Vanguard mutual funds:
plot_metrics_3funds(formula = mean ~ sd,
                    tickers = c("VFINX", "VBLTX", "VWEHX"),
                    from = "2010-01-01")
100% VFINX maximizes expected returns, but also volatility. If you wanted to take on no more than one-half of the S&P’s volatility while maximizing returns, you could add an allocation to VBLTX (move from 100% VFINX to the left along the upper black curve). If you’re very conservative and want to target something like 0.4% volatility, a VWEHX allocation eventually becomes helpful (get off the black curve before it veers downward and to the right).
Mean vs. SD is the standard way of visualizing portfolios, but Sharpe ratio vs. SD is more useful for understanding how risk-adjusted performance varies with allocation. If we plot Sharpe ratio vs. SD, the benefit of adding exposure to bonds becomes clearer:
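A three-fund plot like this one amounts to evaluating many weight combinations that sum to 100%. The base R sketch below builds such a grid in 10% steps and computes the mean and SD of daily gains for each combination. The funds `A`, `B`, and `C` are simulated stand-ins (not VFINX, VBLTX, or VWEHX), the step size is arbitrary, and this is only an illustration of the idea rather than the package's method.

```r
# Simulated daily gains for three hypothetical funds
set.seed(5)
n <- 1000
g <- cbind(A = rnorm(n, 0.0005, 0.010),
           B = rnorm(n, 0.0003, 0.006),
           C = rnorm(n, 0.0002, 0.003))

# All allocations in 10% steps whose weights sum to 1
steps <- seq(0, 1, by = 0.1)
grid <- expand.grid(wA = steps, wB = steps)
grid <- grid[grid$wA + grid$wB <= 1 + 1e-9, ]   # tolerance for rounding error
grid$wC <- pmax(0, 1 - grid$wA - grid$wB)

# Daily gains for every allocation: one column per row of the grid
port <- g %*% t(as.matrix(grid[, c("wA", "wB", "wC")]))

grid$mean <- 100 * colMeans(port)       # mean daily gain (%)
grid$sd <- 100 * apply(port, 2, sd)     # SD of daily gains (%)

# Highest-mean allocation with no more than half of fund A's volatility
ok <- grid[grid$sd <= 0.5 * 100 * sd(g[, "A"]), ]
ok[which.max(ok$mean), ]
```

The last two lines mirror the "no more than one-half of the volatility" question in the text: filter the grid by SD, then pick the allocation with the highest mean.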
plot_metrics_3funds(formula = sharpe ~ sd,
                    tickers = c("VFINX", "VBLTX", "VWEHX"),
                    from = "2010-01-01")
Groovy! By the way, if you want to see individual data points on the plot (i.e., what allocation each data point corresponds to), you can just set `plotly = TRUE` when you call `plot_metrics_3funds` or any of the other plotting functions.
You can find me on Twitter at @DaneVanDomelen, and of course feel free to make feature requests and collaborate directly on GitHub.
Version | Updates |
---|---|
1.0 | Original |
1.2-1.4 | Added functions, bug fixes, etc. |
2.0 | Switched to ggplot, added piping support, simplified functions for calculating metrics |
Ryan, Jeffrey A., and Joshua M. Ulrich. 2017. Quantmod: Quantitative Financial Modelling Framework. https://CRAN.R-project.org/package=quantmod.