

We’re kicking off a new, multi-part series here. We’re going to be looking at several different investment strategies using Monte Carlo Simulation techniques. Our goals with this series are to:

Demystify the Monte Carlo simulation technique.
Objectively evaluate the performance of different strategies against each other.
Learn.

I’m going to drop in our disclaimer right here just to make sure there is no confusion:

Disclaimer: While we have a passion for providing entertaining, informational, and possibly useful articles about personal finance, we’re just random people on the internet with no formal credentials or expertise. Talk to a licensed professional advisor if you need advice.

What Is A Monte Carlo Simulation?

Monte Carlo simulations show how a system responds by repeatedly and randomly sampling a model of that system. By observing how the system responds to a range of inputs, we can make better decisions in real life. We would like to learn how well different investment strategies perform across a range of simulated outcomes so we can make decisions about what to do in the future. Check out Wikipedia and Investopedia for more detail on Monte Carlo simulations.

While this post/series is not a comprehensive overview of the topic, a brief introduction is useful. Remember the “Normal” distribution from your first statistics class? If not, that’s OK. My first statistics class was traumatic too. The Normal distribution describes a range of values, and how likely each one is, that we see in many systems. Here is a distribution that represents the US Stock Market’s annual returns:

1000 samples of annual US stock market returns from a distribution with mean of TBD and standard deviation of TBD.

For Monte Carlo Simulation, the distribution is at the heart of everything. It is our representation of the system. The underlying distribution tells us how often we expect to see a given result. It also rests on a fundamental assumption: that the general shape of past returns gives us clues about how the future will look. So how do we use it?

We iteratively and randomly sample points from the distribution. In our case, this provides a hypothetical sequence of returns for that asset class. If we’re simulating a 30 year retirement, we need 30 points for each asset. We’re using a distribution rather than actual historical sequences like cFIREsim does. Therefore, we can simulate as many sequences as we like rather than being limited to the ones history happened to produce. Let me be clear: that’s not a knock against cFIREsim. It’s actually one of my favorite tools and an inspiration for a lot of our work here.
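The sampling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mean and standard deviation below are hypothetical placeholders (this post does not give the real values), and `simulate_run` is a name made up for the example.

```python
import numpy as np

# Hypothetical parameters for illustration only; the post does not state
# the actual mean and standard deviation of the return distribution.
ASSUMED_MEAN = 0.07
ASSUMED_STD = 0.18


def simulate_run(n_years, mean=ASSUMED_MEAN, std=ASSUMED_STD, rng=None):
    """Draw one hypothetical sequence of annual returns from a Normal distribution."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(loc=mean, scale=std, size=n_years)


rng = np.random.default_rng(seed=42)  # fixed seed so the example is repeatable
run = simulate_run(30, rng=rng)       # one 30 year "run" for one asset class

# The first 5 simulated annual returns of this run
print(run[:5].round(3))
```

Calling `simulate_run` again with the same `rng` yields a fresh, independent sequence, which is what makes it possible to generate as many runs as we want.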

What does a Monte Carlo Simulation Look Like?

Next, let’s look deeper at the first 5 points sampled from this type of distribution. We will illustrate how we can start to build up a sequence of returns. Remember, these 5 points are randomly drawn from the same distribution. Think of them as the first 5 years of a single “run” representing one potential retirement reality. Below, each panel shows a new point being randomly generated from the underlying distribution and added to the prior sequence.

Five successively chosen points from an underlying distribution.

Next, we can extend the sequence to 30 points (or any number) to represent a single retirement “run.” The next plot shows three such runs. Remember, we drew 3 sequences of 30 points from the same underlying distribution, and that distribution represents the annual performance of the US stock market. Therefore, you can think of this as three potential retirement experiences.

Three simulated “runs” of randomly generated sequences of returns.

When you make thousands of such multi-decade “runs”, you start to see the range of potential outcomes from this portfolio over time. And that’s the foundation of our Monte Carlo simulation. We iteratively sample from a distribution fit to the historical return data. We then simulate thousands of 20 year, 30 year, or 40 year (or more for those in the FIRE community) return sequences. Finally, let’s put the whole thing together and illustrate our 3 runs from above against a fuller population of simulation data.

In this plot, we simulated 1000 runs and then kept the 10th to 90th percentile range of those runs within each year. We’re essentially eliminating the least likely 20% of returns (the lowest 10% and the highest 10%) from the summary. This reduced population forms the grey band in the graph. Overlaid on top of that are the three runs from above. Notice how many individual points are well outside of the grey band. That’s important: any individual run can have some pretty extreme values (March of 2020, anyone?), but when you look at typical values they’re frequently less extreme. Are those extremes possible? Yes! But they’re also less likely to occur.

Three 30 year simulated runs highlighted against the 10th-90th percentiles (grey band) of a 1000 run Monte Carlo Simulation.
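For anyone who wants to see the mechanics behind a plot like this, here is a minimal sketch. It assumes a Normal distribution with hypothetical mean and standard deviation (placeholders, not the values behind the figure) and simply treats the first three simulated runs as the highlighted ones.

```python
import numpy as np

# Hypothetical parameters for illustration only
ASSUMED_MEAN, ASSUMED_STD = 0.07, 0.18
N_RUNS, N_YEARS = 1000, 30

rng = np.random.default_rng(seed=0)

# Each row is one simulated run; each column is one year
runs = rng.normal(ASSUMED_MEAN, ASSUMED_STD, size=(N_RUNS, N_YEARS))

# 10th and 90th percentile across all runs, computed separately for each year
p10, p90 = np.percentile(runs, [10, 90], axis=0)

# Treat the first three runs as the highlighted ones
highlighted = runs[:3]
outside_band = (highlighted < p10) | (highlighted > p90)

# Number of years (out of 30) each highlighted run falls outside the band
print(outside_band.sum(axis=1))
```

By construction, roughly 20% of all simulated points land outside a 10th to 90th percentile band, so it is normal for any single run to poke outside it several times over 30 years.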

If you had two distributions, one that represents the annual performance of the US stock market, and another that represents the annual performance of the US bond market, you could start to build a model of their respective performance over time. From there, we can start to compare how well different portfolios perform…but we’ll dig into that another time. For now, let’s look at one caveat of many simulations: the shape of the underlying distribution(s).

Pitfalls of the Normal Distribution

In many systems, the normal distribution is a good fit for the underlying data. Stock market performance is not one of them. Here’s a great discussion on the topic. The key phrase is “fat tails”. Over time, people observed that the stock market sees big movements more frequently than the normal distribution would suggest. This results in errors: the model’s output differs from historical performance. I need to pause for the obligatory quote from legendary statistician and 20th-century Renaissance Man, George Box:

“Essentially all models are wrong, but some are useful.”
George Box

Of course, we would like the models to be as right as possible, especially if we’re going to use them.
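To make “fat tails” a bit more concrete, here is a small sketch comparing how often large moves show up under a Normal distribution versus a fat-tailed one. The Student's t distribution with 4 degrees of freedom is purely my stand-in for “fat-tailed” here; it is an assumption for illustration, not a claim about the true shape of stock market returns.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
N = 100_000
DF = 4  # assumed degrees of freedom for the fat-tailed stand-in

normal_draws = rng.standard_normal(N)

# A Student's t variable has variance DF / (DF - 2), so rescale it to
# unit standard deviation to make the two sets of draws comparable.
fat_draws = rng.standard_t(DF, size=N) / np.sqrt(DF / (DF - 2))


def tail_share(draws, k=3.0):
    """Fraction of draws more than k standard deviations away from zero."""
    return np.mean(np.abs(draws) > k)


normal_share = tail_share(normal_draws)
fat_share = tail_share(fat_draws)

print(f"Normal:     {normal_share:.4%} of draws beyond 3 standard deviations")
print(f"Fat-tailed: {fat_share:.4%} of draws beyond 3 standard deviations")
```

Both sets of draws have the same mean and standard deviation, yet the fat-tailed one produces noticeably more extreme values. A model built on the Normal would understate how often those big moves happen.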

Metalogs – An Answer to the Normality Problem

Meta what?

“Metalogs”

They’re flexible distributions that more accurately reflect the underlying data than many of the classic distributions we’re used to (e.g., the Normal). Check them out here. They were invented by Tom Keelin, who could be the 21st century’s Renaissance Man. By making distributions that can generate continuous samples from the underlying source data, Tom enabled us to reduce the bias in our original models. He helped us to fatten up our models’ tails when working with stock market return data, and his invention can be useful for modelling in any discipline. (Have I sung his praises enough yet?)

Here’s a picture to help illustrate the differences between the actual data and two simulations. We can derive 1802 annual return data points from Dr. Shiller’s dataset, called “Actuals” going forward. First, I calculated the mean and standard deviation of the Actuals and used those statistics to generate 1802 simulated returns using the Normal distribution. Then, I fit a Metalog to the original data (a 13-term metalog had the lowest standard error) and simulated 1802 more annual returns using that Metalog. Here’s a box plot showing how the three distributions compare. (Despite the name, the box plot is credited to John Tukey, not George Box.)

Visually, you can see the Actual Returns and Metalog Simulation both have longer whiskers and more outliers than the Normal Simulation.
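For the curious, here is a rough sketch of what fitting and sampling a metalog can look like. It is not the code behind the plot above. It uses only the first 4 terms of the unbounded metalog quantile function (the fit in this post used 13), fits them by ordinary least squares, and runs on synthetic stand-in data rather than Dr. Shiller's dataset. The function names are made up for this example.

```python
import numpy as np


def metalog_basis(y):
    """First four basis terms of the unbounded metalog quantile function."""
    logit = np.log(y / (1 - y))
    return np.column_stack([np.ones_like(y), logit, (y - 0.5) * logit, y - 0.5])


def fit_metalog(data):
    """Least-squares fit of a 4-term metalog to the sorted data."""
    x = np.sort(np.asarray(data, dtype=float))
    n = x.size
    # Cumulative probabilities strictly inside (0, 1) for each sorted point
    y = (np.arange(1, n + 1) - 0.5) / n
    coeffs, *_ = np.linalg.lstsq(metalog_basis(y), x, rcond=None)
    return coeffs


def sample_metalog(coeffs, size, rng):
    """Inverse-transform sampling: push uniform draws through the quantile function."""
    u = np.clip(rng.uniform(size=size), 1e-12, 1 - 1e-12)  # avoid log(0)
    return metalog_basis(u) @ coeffs


rng = np.random.default_rng(seed=7)

# Synthetic stand-in for the "Actuals" (hypothetical, fat-tailed on purpose)
actuals = 0.08 + 0.12 * rng.standard_t(4, size=1802)

coeffs = fit_metalog(actuals)
metalog_sim = sample_metalog(coeffs, 1802, rng)
normal_sim = rng.normal(actuals.mean(), actuals.std(), size=1802)

# Compare the tails of the three samples via their 1st and 99th percentiles
for name, sample in [("Actuals", actuals), ("Normal", normal_sim), ("Metalog", metalog_sim)]:
    print(name, np.percentile(sample, [1, 99]).round(3))
```

A more careful implementation would also confirm that the fitted quantile function is strictly increasing (the metalog “feasibility” condition) and would compare several term counts, as was done for the 13-term fit above.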

Wrap Up

That will do it for this first introduction to the topic of portfolio evaluation. It’s a fascinating problem. Inevitably, we will make mistakes along the way. I’m excited to dig into this topic and learn more about it. Hopefully, you have a better understanding of how we’re approaching this idea of portfolio evaluation. In subsequent posts, I will lay out some sample scenarios and start simulating!


The Portfolio Series – Part 1: Monte Carlo Simulation
